Moved local import hack behind if __name__ == "__main__"
These scripts aren't really intended to be used as python libraries.
Still, it's useful to import them for debugging and to get access to
their juicy internals.
This seems like a more fitting name now that this script has evolved
into more of a general purpose high-level CSV tool.
Unfortunately this does conflict with the standard csv module in Python,
breaking every script that imports csv (which is most of them).
Fortunately, Python is flexible enough to let us remove the current
directory before imports with a bit of an ugly hack:
# prevent local imports
__import__('sys').path.pop(0)
These scripts are intended to be standalone anyways, so this is probably
a good pattern to adopt.
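As a rough sketch, this is how the hack might look once combined with the
if __name__ == "__main__" guard mentioned above. The csv import is only an
illustrative stand-in for any standard module shadowed by a local script:

```python
# prevent local imports, but only when run as a script, so the file
# can still be imported for debugging without mutating sys.path
if __name__ == "__main__":
    __import__('sys').path.pop(0)

# with the script's directory gone from sys.path, this should resolve
# to the standard library csv module, not a local csv.py
import csv
```

The guard matters because popping sys.path on import would silently change
the import behavior of whatever imported the script.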
These work by keeping a set of all seen mroots as we descend down the
mroot chain. Simple, but it works.
The downside of this approach is that the mroot set grows unbounded, but
it's unlikely we'll ever have enough mroots in a system for this to
really matter.
This fixes scripts like dbgbmap.py getting stuck on intentional mroot
cycles created for testing. It's not a problem for a foreground script
to get stuck in an infinite loop, since you can just kill it, but a
background script getting stuck at 100% CPU is a bit more annoying.
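A minimal sketch of the seen-set idea, not the actual script code. The
names walk_mroots and next_mroot are hypothetical, and mroots are assumed
to be hashable values such as block addresses:

```python
def walk_mroots(start, next_mroot):
    """Yield mroots along the chain, stopping at the first repeat.

    next_mroot is a hypothetical callable returning the next mroot in
    the chain, or None at the end of the chain.
    """
    seen = set()
    mroot = start
    while mroot is not None and mroot not in seen:
        # remember every mroot we visit, this set only ever grows
        seen.add(mroot)
        yield mroot
        mroot = next_mroot(mroot)

# an intentional cycle, 0 -> 1 -> 2 -> 0
chain = {0: 1, 1: 2, 2: 0}
print(list(walk_mroots(0, chain.get)))
```

Without the seen set, the cyclic chain above would never terminate.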
This matches the style used in C, which is good for consistency:
a_really_long_function_name(
        double_indent_after_first_newline(
            single_indent_nested_newlines))
We were already doing this for multiline control-flow statements, simply
because I'm not sure how else you could indent this without making
things really confusing:
if a_really_long_function_name(
        double_indent_after_first_newline(
            single_indent_nested_newlines)):
    do_the_thing()
This was the only real difference style-wise between the Python code and
C code, so now both should be following roughly the same style (80 cols,
double-indent multiline exprs, prefix multiline binary ops, etc).
Now that most scripts show relevant cksums, it makes sense for
dbgblock.py to just always show a cksum as well. It's not like this has
any noticeable impact on the script's runtime.
Example:
$ ./scripts/dbgblock.py disk -b4096 0
block 0x0, size 4096, cksum e6e3ad25
00000000: 01 00 00 00 80 03 00 08 6c 69 74 74 6c 65 66 73 ........littlefs
00000010: 80 04 00 02 00 00 80 05 00 02 fa 01 80 09 00 04 ................
...
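A rough sketch of how a header line plus hex dump like this could be
produced. This is not dbgblock.py's implementation: dump_block is a
hypothetical helper, and zlib.crc32 is only a stand-in checksum, so the
cksum it prints should not be expected to match the one shown above:

```python
import zlib

def dump_block(data, block=0, width=16):
    """Return a header line with a checksum, then xxd-style dump lines."""
    # stand-in checksum, an assumption, not necessarily what littlefs uses
    cksum = zlib.crc32(data) & 0xffffffff
    lines = ['block 0x%x, size %d, cksum %08x' % (block, len(data), cksum)]
    for off in range(0, len(data), width):
        chunk = data[off:off+width]
        hex_ = ' '.join('%02x' % b for b in chunk)
        # show printable ascii as-is, everything else as '.'
        text = ''.join(chr(b) if 0x20 <= b < 0x7f else '.' for b in chunk)
        lines.append('%08x: %-*s %s' % (off, 3*width-1, hex_, text))
    return lines

for line in dump_block(bytes.fromhex('0100000080030008') + b'littlefs'):
    print(line)
```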
I think this makes a bit more sense.
I think the original reasoning for -x/--cksum was to match -x/--device
in dbgrbyd.py, but that flag no longer exists. This could go all the way
back to matching --xsum at some point, but I'm not sure.
Common hash related utils, sha256sum, md5sum, etc, use -c/--check to
validate their hash, so that's sort of prior art?
So now these should be invoked like so:
$ ./scripts/dbglfs.py -b4096x256 disk
The motivation for this change is to better match other filesystem
tooling. Some prior art:
- mkfs.btrfs
  - -n/--nodesize => node size in bytes, power of 2 >= sector
  - -s/--sectorsize => sector size in bytes, power of 2
- zfs create
  - -b => block size in bytes
- mkfs.xfs
  - -b => block size in bytes, power of 2 >= sector
  - -s => sector size in bytes, power of 2 >= 512
- mkfs.ext[234]
  - -b => block size in bytes, power of 2 >= 1024
- mkfs.ntfs
  - -c/--cluster-size => cluster size in bytes, power of 2 >= sector
  - -s/--sector-size => sector size in bytes, power of 2 >= 256
- mkfs.fat
  - -s => cluster size in sectors, power of 2
  - -S => sector size in bytes, power of 2 >= 512
Why care so much about the flag naming for internal scripts? The
intention is for external tooling to eventually use the same set of
flags. And maybe even create publicly consumable versions of the dbg
scripts. It's important that if/when this happens flags stay consistent.
Everyone familiar with the ssh -p/scp -P situation knows how annoying
this can be.
It's especially important for littlefs's -b/--block-size flag, since
this will likely end up used everywhere. Unlike other filesystems,
littlefs can't mount without knowing the block-size, so any tool that
mounts littlefs is going to need the -b/--block-size flag.
---
The original motivation for -B was to avoid conflicts with the -b/--by
flag that was already in use in all of the measurement scripts. But
these are internal, and not really littlefs-related, so I don't think
that's a good reason any more. Worst case we can just make the --by flag
-B, or just not have a short form (--by is only 4 letters after all).
Somehow we ended up with no scripts needing both -b/--block-size and
-b/--by so far.
A few other tweaks were needed to resolve conflicts/inconsistencies.
Here are all the flag changes:
- -B/--block-size -> -b/--block-size
- -M/--mleaf-weight -> -m/--mleaf-weight
- -b/--btree -> -B/--btree
- -C/--block-cycles -> -c/--block-cycles (in tracebd.py)
- -c/--coalesce -> -S/--coalesce (in tracebd.py)
- -m/--mdirs -> -M/--mdirs (in dbgbmap.py)
- -b/--btrees -> -B/--btrees (in dbgbmap.py)
- -d/--datas -> -D/--datas (in dbgbmap.py)
This is useful for debugging checksum mismatches on disk.
And since dbgblock.py has some relatively flexible options for slicing
the disk, this can be used to find the checksum of any on-disk data
pretty easily.
- Tried to do the rescaling a bit better with truncating divisions, so
there shouldn't be weird cross-pixel updates when things aren't well
aligned.
- Adopted optional -B<block_size>x<block_count> flag for explicitly
specifying the block-device geometry in a way that is compatible with
other scripts. Should adopt this in more places.
- Adopted optional <block>.<off> argument for start of range. This
should match dbgblock.py.
- Adopted '-' for noop/zero-wear.
- Renamed a few internal things.
- Dropped subscript chars for wear, this didn't really add anything and
can be accomplished by specifying the --wear-chars explicitly.
Also changed dbgblock.py to match, this mostly affects the --off/-n/--size
flags. For example, these are all the same:
./scripts/dbgblock.py disk -B4096 --off=10 --size=5
./scripts/dbgblock.py disk -B4096 --off=10 -n5
./scripts/dbgblock.py disk -B4096 --off=10,15
./scripts/dbgblock.py disk -B4096 -n10,15
./scripts/dbgblock.py disk -B4096 0.10 -n5
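One way to read the equivalences above, as a sketch rather than
dbgblock.py's actual parsing. The resolve and parse_range helpers are
hypothetical, and the rule that a two-value -n/--size means an explicit
start,stop range is inferred only from these examples:

```python
def parse_range(s):
    """Parse 'a' or 'a,b' into a tuple of ints."""
    return tuple(int(x, 0) for x in s.split(','))

def resolve(block='0', off=None, size=None):
    """Resolve the arguments into (block, start, stop) in bytes."""
    # <block>.<off> gives the block and optionally the start offset
    block, _, boff = block.partition('.')
    start = int(boff, 0) if boff else 0
    stop = None
    if off is not None:
        off = parse_range(off)
        start = off[0]
        if len(off) > 1:
            stop = off[1]
    if size is not None:
        size = parse_range(size)
        if len(size) > 1:
            # assumed: two values are an explicit start,stop range
            start, stop = size
        else:
            stop = start + size[0]
    return int(block, 0), start, stop

print(resolve(off='10', size='5'))
print(resolve(off='10,15'))
print(resolve(size='10,15'))
print(resolve(block='0.10', size='5'))
```

Under these assumptions every call resolves to the same range, bytes 10
up to 15 of block 0.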
Also also adopted block-device geometry argument across scripts, where
the -B flag can optionally be a full <block_size>x<block_count> geometry:
./scripts/tracebd.py disk -B4096x256
Though this is mostly unused outside of tracebd.py right now. It will be
useful for anything that formats littlefs (littlefs-fuse?), and allowing
the format everywhere is a nice convenience.
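A minimal sketch of accepting this geometry with argparse, using the -B
spelling from this passage. The geometry helper is hypothetical and, as an
assumption, only handles decimal values, since a hex size like 0x1000
would be ambiguous with the x separator in this simple form:

```python
import argparse

def geometry(s):
    """Parse <block_size>[x<block_count>] into a (size, count) tuple."""
    size, _, count = s.partition('x')
    # decimal only in this sketch, count is None when omitted
    return int(size), (int(count) if count else None)

parser = argparse.ArgumentParser()
parser.add_argument('disk')
parser.add_argument('-B', '--block-size', type=geometry)

args = parser.parse_args(['disk', '-B4096x256'])
block_size, block_count = args.block_size
print(block_size, block_count)
```

Returning a tuple from the type callable keeps the plain -B4096 form
working, with the block count left as None.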
I had never noticed xxd has no header until comparing its output against
dbgblock.py. Turns out these headers aren't really all that useful, and
even sometimes wrong in dbglfs.py.