You forget one script, running in the background, hogging a whole
core, and suddenly watch's default 2 second sleep time makes a lot more
sense...
One of the main motivators for watch.py _was_ for shorter sleep times,
short enough to render realtime animations (watch is limited to 0.1
seconds for some reason?), but this doesn't mean it needs to be the
default. This can still be accomplished by explicitly specifying
-s/--sleep, and we probably don't want the default to hog all the CPU.
The use case for fast sleeps has been mostly replaced by -k/--keep-open
anyways.
For tailpipe.py and tracebd.py it's a bit less clear, but we probably
don't need to be spamming open calls 10 times a second.
Unfortunately the import sys in the argparse block was hiding missing
sys imports.
The mistake was assuming the import sys would limit the name's scope
to that if block, but Python's lack of block scoping strikes again...
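A minimal illustration of the gotcha (nothing littlefs-specific here):

```python
# Imports in Python bind names at function or module scope, never
# block scope, so an import tucked inside an if block leaks out and
# masks a missing top-level import.
if True:  # stands in for the argparse-only block
    import sys

# sys is still bound here, outside the if block
print('sys' in globals())  # True
```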
Moved local import hack behind if __name__ == "__main__"
These scripts aren't really intended to be used as python libraries.
Still, it's useful to import them for debugging and to get access to
their juicy internals.
This seems like a more fitting name now that this script has evolved
into more of a general purpose high-level CSV tool.
Unfortunately this does conflict with the standard csv module in Python,
breaking every script that imports csv (which is most of them).
Fortunately, Python is flexible enough to let us remove the current
directory before imports with a bit of an ugly hack:
# prevent local imports
__import__('sys').path.pop(0)
These scripts are intended to be standalone anyways, so this is probably
a good pattern to adopt.
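A quick sketch of what the hack buys us; the local csv.py, file layout,
and names here are made up purely for illustration:

```python
import os
import subprocess
import sys
import tempfile

# A local csv.py would normally shadow the stdlib csv module, since
# sys.path[0] is the script's own directory. Popping it first makes
# `import csv` resolve to the standard library instead.
with tempfile.TemporaryDirectory() as d:
    # a local module that would shadow the stdlib
    with open(os.path.join(d, 'csv.py'), 'w') as f:
        f.write('SHADOW = True\n')
    # a script using the hack
    script = os.path.join(d, 'main.py')
    with open(script, 'w') as f:
        f.write(
            "__import__('sys').path.pop(0)  # prevent local imports\n"
            "import csv\n"
            "print(hasattr(csv, 'SHADOW'))\n")
    out = subprocess.run([sys.executable, script],
        capture_output=True, text=True).stdout.strip()
    print(out)  # False: we got the stdlib csv, not the local shadow
```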
These work by keeping a set of all seen mroots as we descend down the
mroot chain. Simple, but it works.
The downside of this approach is that the mroot set grows unbounded, but
it's unlikely we'll ever have enough mroots in a system for this to
really matter.
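The approach can be sketched with a hypothetical next_mroot callback;
the real scripts walk actual mroot structures on disk:

```python
def follow_mroots(mroot, next_mroot):
    # Keep a set of all seen mroots as we descend; next_mroot is a
    # hypothetical callback returning the next mroot in the chain,
    # or None at the end.
    seen = set()
    while mroot is not None:
        if mroot in seen:
            # an intentional (or buggy) mroot cycle, stop descending
            break
        seen.add(mroot)
        yield mroot
        mroot = next_mroot(mroot)

# a chain with a cycle: 0 -> 1 -> 2 -> 1 -> ...
chain = {0: 1, 1: 2, 2: 1}
print(list(follow_mroots(0, chain.get)))  # [0, 1, 2]
```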
This fixes scripts like dbgbmap.py getting stuck on intentional mroot
cycles created for testing. It's not a problem for a foreground script
to get stuck in an infinite loop, since you can just kill it, but a
background script getting stuck at 100% CPU is a bit more annoying.
This matches the style used in C, which is good for consistency:
a_really_long_function_name(
        double_indent_after_first_newline(
            single_indent_nested_newlines))
We were already doing this for multiline control-flow statements, simply
because I'm not sure how else you could indent this without making
things really confusing:
if a_really_long_function_name(
        double_indent_after_first_newline(
            single_indent_nested_newlines)):
    do_the_thing()
This was the only real difference style-wise between the Python code and
C code, so now both should be following roughly the same style (80 cols,
double-indent multiline exprs, prefix multiline binary ops, etc).
Mainly to avoid conflicts with match results, this frees up the single
letter variable m for other purposes.
Choosing a two letter alias was surprisingly difficult, but mt is nice
in that it somewhat matches it (for itertools) and ft (for functools).
This lets you view the first n lines of output instead of the last n
lines, as though the output was piped through head.
This is how the standard watch command works, and can be more useful
when most of the information is at the top, such as in our dbg*.py
scripts (watch.py was originally used as a sort of inotifywait-esque
build runner, which is the main reason it's different).
To make this work, RingIO (renamed from LinesIO) now uses terminal
height as a part of its canvas rendering. This has the added benefit of
more rigorously enforcing the canvas boundaries, but risks breaking when
not associated with a terminal. But that raises the question, does
RingIO even make sense without a terminal?
Worst case you can bypass all of this with -z/--cat.
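The head-vs-tail behavior boils down to which end of the line buffer
gets truncated; a sketch with hypothetical helpers (the real RingIO is
more involved and terminal-aware):

```python
import collections

def tail_lines(lines, n):
    # keep only the last n lines, like the default ring-buffer mode
    return list(collections.deque(lines, maxlen=n))

def head_lines(lines, n):
    # keep only the first n lines, like watch(1) / piping through head
    return lines[:n]

lines = ['line%d' % i for i in range(10)]
print(head_lines(lines, 3))  # ['line0', 'line1', 'line2']
print(tail_lines(lines, 3))  # ['line7', 'line8', 'line9']
```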
So now these should be invoked like so:
$ ./scripts/dbglfs.py -b4096x256 disk
The motivation for this change is to better match other filesystem
tooling. Some prior art:
- mkfs.btrfs
  - -n/--nodesize => node size in bytes, power of 2 >= sector
  - -s/--sectorsize => sector size in bytes, power of 2
- zfs create
  - -b => block size in bytes
- mkfs.xfs
  - -b => block size in bytes, power of 2 >= sector
  - -s => sector size in bytes, power of 2 >= 512
- mkfs.ext[234]
  - -b => block size in bytes, power of 2 >= 1024
- mkfs.ntfs
  - -c/--cluster-size => cluster size in bytes, power of 2 >= sector
  - -s/--sector-size => sector size in bytes, power of 2 >= 256
- mkfs.fat
  - -s => cluster size in sectors, power of 2
  - -S => sector size in bytes, power of 2 >= 512
Why care so much about the flag naming for internal scripts? The
intention is for external tooling to eventually use the same set of
flags. And maybe even create publicly consumable versions of the dbg
scripts. It's important that if/when this happens, flags stay
consistent.
Everyone familiar with the ssh -p/scp -P situation knows how annoying
this can be.
It's especially important for littlefs's -b/--block-size flag, since
this will likely end up used everywhere. Unlike other filesystems,
littlefs can't mount without knowing the block-size, so any tool that
mounts littlefs is going to need the -b/--block-size flag.
---
The original motivation for -B was to avoid conflicts with the -b/--by
flag that was already in use in all of the measurement scripts. But
these are internal, and not really littlefs-related, so I don't think
that's a good reason any more. Worst case we can just make the --by flag
-B, or just not have a short form (--by is only 4 letters after all).
Somehow we ended up with no scripts needing both -b/--block-size and
-b/--by so far.
Some other conflict/inconsistency tweaks were needed; here are all
the flag changes:
- -B/--block-size -> -b/--block-size
- -M/--mleaf-weight -> -m/--mleaf-weight
- -b/--btree -> -B/--btree
- -C/--block-cycles -> -c/--block-cycles (in tracebd.py)
- -c/--coalesce -> -S/--coalesce (in tracebd.py)
- -m/--mdirs -> -M/--mdirs (in dbgbmap.py)
- -b/--btrees -> -B/--btrees (in dbgbmap.py)
- -d/--datas -> -D/--datas (in dbgbmap.py)
Also limited block_size/block_count updates to only happen when the
configured value is None. This matches dbgbmap.py.
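The None-guarded update is simple but worth pinning down; a sketch with
a hypothetical config dict (the real scripts track these fields their
own way):

```python
def update_geometry(config, found_block_size, found_block_count):
    # values discovered on-disk only fill in unconfigured (None)
    # fields; explicit configuration always wins
    if config.get('block_size') is None:
        config['block_size'] = found_block_size
    if config.get('block_count') is None:
        config['block_count'] = found_block_count
    return config

# explicit block_size is kept, missing block_count is filled in
print(update_geometry(
    {'block_size': 512, 'block_count': None}, 4096, 256))
```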
Basically just a cleanup of some bugs after the rework related to
matching dbgbmap.py. Unfortunately these scripts have too much surface
area and no tests...
dbgbmap.py parses littlefs's mtree/btrees and displays the status of
every block in use:
$ ./scripts/dbgbmap.py disk -B4096x256 -Z -H8 -W64
bd 4096x256, 7.8% mdir, 10.2% btree, 78.1% data
mmddbbddddddmmddddmmdd--bbbbddddddddddddddbbdddd--ddddddmmdddddd
mmddddbbddbbddddddddddddddddbbddddbbddddddmmddbbdddddddddddddddd
bbdddddddddddd--ddddddddddddddddbbddddmmmmddddddddddddmmmmdddddd
ddddddddddbbdddddddddd--ddddddddddddddmmddddddddddddddddddddmmdd
ddddddbbddddddddbb--ddddddddddddddddddddbb--mmmmddbbdddddddddddd
ddddddddddddddddddddbbddbbdddddddddddddddddddddddddddddddddddddd
dddddddddd--ddddbbddddddddmmbbdd--ddddddddddddddbbmmddddbbdddddd
ddmmddddddddddmmddddddddmmddddbbbbdddddddd--ddbbddddddmmdd--ddbb
(ok, it looks a bit better with colors)
dbgbmap.py matches the layout and has the same options as tracebd.py,
allowing the combination of both to provide valuable insight into what
exactly littlefs is doing.
This required a bit of tweaking of tracebd.py to get right, mostly
around conflicting order-based arguments. This also reworks the internal
Bmap class to be more resilient to out-of-window ops, and adds an
optional informative header.
In the hack where we wait for multiple updates to fill out a full
braille/dots line we store the current pixels in a temporary array.
Unfortunately, in some cases, this is the array we modify with
updates...
A copy fixes this.
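The bug reduces to ordinary list aliasing; a minimal illustration, not
the actual tracebd.py code:

```python
# stashing a reference to the current pixel row aliases it, so later
# updates leak into the stash
row = [0, 0, 0, 0]
stash = row
row[1] = 1
print(stash)  # [0, 1, 0, 0], the update leaked in

# a copy decouples the stash from future updates
stash = row.copy()
row[2] = 1
print(stash)  # still [0, 1, 0, 0]
```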
- Tried to do the rescaling a bit better with truncating divisions, so
there shouldn't be weird cross-pixel updates when things aren't well
aligned.
- Adopted optional -B<block_size>x<block_count> flag for explicitly
specifying the block-device geometry in a way that is compatible with
other scripts. Should adopt this more places.
- Adopted optional <block>.<off> argument for start of range. This
should match dbgblock.py.
- Adopted '-' for noop/zero-wear.
- Renamed a few internal things.
- Dropped subscript chars for wear, this didn't really add anything and
can be accomplished by specifying the --wear-chars explicitly.
Also changed dbgblock.py to match, this mostly affects the --off/-n/--size
flags. For example, these are all the same:
./scripts/dbgblock.py disk -B4096 --off=10 --size=5
./scripts/dbgblock.py disk -B4096 --off=10 -n5
./scripts/dbgblock.py disk -B4096 --off=10,15
./scripts/dbgblock.py disk -B4096 -n10,15
./scripts/dbgblock.py disk -B4096 0.10 -n5
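The flag equivalences above suggest a normalization step roughly like
this; parse_span and its exact semantics are my guess, not
dbgblock.py's actual code:

```python
def parse_span(off=None, size=None):
    # normalize the --off/-n/--size flavors into (start, stop);
    # either flag may be a single value or a start,stop range
    start, stop = None, None
    if off is not None:
        parts = [int(p) for p in off.split(',')]
        start = parts[0]
        if len(parts) > 1:
            stop = parts[1]
    if size is not None:
        parts = [int(p) for p in size.split(',')]
        if len(parts) > 1:
            start, stop = parts[0], parts[1]
        else:
            stop = (start or 0) + parts[0]
    return start, stop

# these are all (10, 15), matching the dbgblock.py examples
print(parse_span(off='10', size='5'))
print(parse_span(off='10,15'))
print(parse_span(size='10,15'))
```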
Also also adopted block-device geometry argument across scripts, where
the -B flag can optionally be a full <block_size>x<block_count> geometry:
./scripts/tracebd.py disk -B4096x256
Though this is mostly unused outside of tracebd.py right now. It will be
useful for anything that formats littlefs (littlefs-fuse?) and allowing
the format everywhere is a bit of a nice convenience.
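Parsing the optional geometry might look something like this sketch;
parse_geometry is a hypothetical name, not the scripts' actual API:

```python
import re

def parse_geometry(s):
    # parse an optional <block_size>x<block_count> geometry, where
    # block_count may be omitted (e.g. "4096" or "4096x256")
    m = re.match(r'^([0-9]+)(?:x([0-9]+))?$', s)
    if not m:
        raise ValueError('invalid geometry: %r' % s)
    block_size = int(m.group(1))
    block_count = int(m.group(2)) if m.group(2) else None
    return block_size, block_count

print(parse_geometry('4096x256'))  # (4096, 256)
print(parse_geometry('4096'))      # (4096, None)
```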
Yes, erases are the more costly operation that we should highlight. But,
aside from broken code, you can never prog more than you erase.
This makes it more useful to prioritize progs over erases, so erases
without an overlaying prog show up as a relatively unique blue,
indicating regions of memory that have been erased but not progged.
Too many erased-but-not-progged regions indicate a potentially
wasteful algorithm.
Based loosely on Linux's perf tool, perfbd.py uses trace output with
backtraces to aggregate and show the block device usage of all functions
in a program, propagating block devices operation cost up the backtrace
for each operation.
This, combined with --trace-period and --trace-freq for
sampling/filtering trace events, allows the bench-runner to very
efficiently record the general cost of block device operations with
very little overhead.
Adopted this as the default side-effect of make bench, replacing
cycle-based performance measurements which are less important for
littlefs.
This provides 2 things:
1. perf integration with the bench/test runners - This is a bit tricky
with perf as it doesn't have its own way to combine perf measurements
across multiple processes. perf.py works around this by writing
everything to a zip file, using flock to synchronize. As a plus, free
compression!
2. Parsing and presentation of perf results in a format consistent with
the other CSV-based tools. This actually ran into a surprising number of
issues:
- We need to process raw events to get the information we want; this
ends up being a lot of data (~16MiB at 100Hz uncompressed), so we
parallelize the parsing of each decompressed perf file.
- perf reports raw addresses post-ASLR. It does provide sym+off, which
is very useful, but to find the source of static functions we need to
reverse the ASLR by finding the delta that produces the best
symbol<->addr matches.
- This isn't related to perf, but decoding dwarf line-numbers is
really complicated. You basically need to write a tiny VM.
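The ASLR reversal can be sketched as a voting problem; find_aslr_delta
and the sample format below are illustrative, not perf.py's actual
code:

```python
import collections

def find_aslr_delta(samples, symtab):
    # Each perf sample gives a raw runtime address plus sym+off; the
    # symbol table gives each symbol's static address. The most common
    # (raw - static) difference is the load delta. symtab is a
    # hypothetical {name: addr} mapping.
    votes = collections.Counter(
        addr - (symtab[sym] + off)
        for addr, sym, off in samples
        if sym in symtab)
    delta, _ = votes.most_common(1)[0]
    return delta

symtab = {'lfs_mount': 0x1000, 'lfs_read': 0x2000}
samples = [
    (0x555555555010, 'lfs_mount', 0x10),
    (0x555555556004, 'lfs_read', 0x4),
]
print(hex(find_aslr_delta(samples, symtab)))  # 0x555555554000
```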
This also turns on perf measurement by default for the bench-runner, but at a
low frequency (100 Hz). This can be decreased or removed in the future
if it causes any slowdown.
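The zip-plus-flock scheme from point 1 can be sketched like so;
append_to_zip is a hypothetical helper, not perf.py's actual API:

```python
import fcntl
import os
import tempfile
import zipfile

def append_to_zip(path, name, data):
    # serialize concurrent writers with an exclusive flock on the zip
    # itself, then append one member; ZipFile in 'a' mode appends
    # without rewriting existing members, and deflate compresses
    with open(path, 'ab') as lockf:
        fcntl.flock(lockf, fcntl.LOCK_EX)
        try:
            with zipfile.ZipFile(path, 'a',
                    compression=zipfile.ZIP_DEFLATED) as z:
                z.writestr(name, data)
        finally:
            fcntl.flock(lockf, fcntl.LOCK_UN)

# two "processes" appending their results to one shared zip
d = tempfile.mkdtemp()
path = os.path.join(d, 'perf.zip')
append_to_zip(path, 'a.csv', 'x,y\n1,2\n')
append_to_zip(path, 'b.csv', 'x,y\n3,4\n')
with zipfile.ZipFile(path) as z:
    print(z.namelist())  # ['a.csv', 'b.csv']
```

Note flock is Unix-only; a portable implementation would need a
different locking primitive.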
- Changed multi-field flags to action=append instead of comma-separated.
- Dropped short-names for geometries/powerlosses
- Renamed -Pexponential -> -Plog
- Allowed omitting the 0 for -W0/-H0/-n0 and made -j0 consistent
- Better handling of --xlim/--ylim
Instead of trying to align to block-boundaries tracebd.py now just
aliases to whatever dimensions are provided.
Also reworked how scripts handle default sizing. Now using reasonable
defaults with 0 being a placeholder for automatic sizing. The addition
of -z/--cat makes it possible to pipe directly to stdout.
Also added support for dots/braille output which can capture more
detail, though care needs to be taken to not rely on accurate coloring.
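Braille output packs a 2x4 pixel cell into a single character; a sketch
of the encoding (the actual rendering code is more involved):

```python
def braille(pixels):
    # pixels is a 4-row x 2-col grid of booleans; the braille block
    # starts at U+2800 with one bit per dot, in this (x, y) order
    bits = [(0, 0, 0x01), (0, 1, 0x02), (0, 2, 0x04),
            (1, 0, 0x08), (1, 1, 0x10), (1, 2, 0x20),
            (0, 3, 0x40), (1, 3, 0x80)]
    c = 0x2800
    for x, y, bit in bits:
        if pixels[y][x]:
            c |= bit
    return chr(c)

# left column fully set, right column empty
print(braille([[1, 0], [1, 0], [1, 0], [1, 0]]))  # '⡇'
```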
These are really just different flavors of test.py and test_runner.c
without support for power-loss testing, but with support for measuring
the cumulative number of bytes read, programmed, and erased.
Note that the existing define parameterization should work perfectly
fine for running benchmarks across various dimensions:
./scripts/bench.py \
    runners/bench_runner \
    bench_file_read \
    -gnor \
    -DSIZE='range(0,131072,1024)'
Also added a couple basic benchmarks as a starting point.
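The -D define above could be parsed with something like this sketch;
the real runners have their own define machinery, and the restricted
eval here is just for illustration:

```python
def parse_define(s):
    # split a -DNAME=EXPR define and evaluate the expression,
    # allowing either a single integer or a Python range
    name, _, expr = s.partition('=')
    v = eval(expr, {'range': range}, {})
    return name, [v] if isinstance(v, int) else list(v)

name, values = parse_define("SIZE=range(0,131072,1024)")
print(name, len(values), values[:3])  # SIZE 128 [0, 1024, 2048]
```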
- Added the littlefs license note to the scripts.
- Adopted parse_intermixed_args everywhere for more consistent arg
handling.
- Removed argparse's implicit help text formatting as it does not
work with parse_intermixed_args and breaks sometimes.
- Used string concatenation for argparse everywhere; backslashed
line continuations only work with argparse because it strips
redundant whitespace.
- Consistent argparse formatting.
- Consistent openio mode handling.
- Consistent color argument handling.
- Adopted functools.lru_cache in tracebd.py.
- Moved unicode printing behind --subscripts in tracebd.py, making all
scripts ascii by default.
- Renamed pretty_asserts.py -> prettyasserts.py.
- Renamed struct.py -> struct_.py, the original name conflicts with
Python's built in struct module in horrible ways.
These are just some minor quality of life improvements
- Added a "make build-test" alias
- Made test runner a positional arg for test.py since it is almost
always required. This shortens the command line invocation most of the
time.
- Added --context to test.py
- Renamed --output in test.py to --stdout, note this still merges
stderr. Maybe at some point these should be split, but it's not really
worth it for now.
- Reworked the test_id parsing code a bit.
- Changed the test runner --step to take a range such as -s0,12,2
- Changed tracebd.py --block and --off to take ranges
Based on a handful of local hacky variations, this sort of trace
rendering is surprisingly useful for getting an understanding of how
different filesystem operations interact with the underlying
block-device.
At some point it would probably be good to reimplement this in a
compiled language. Parsing and tracking the trace output quickly
becomes a bottleneck with the amount of trace output the tests
generate.
Note also that since tracebd.py runs on trace output, it can also be
used to debug logged block-device operations post-run.