This adopts the Attr rework for the --add-xticklabel and
--add-yticklabel flags.
Sort of.
These require a bit of special behavior to make work, but should at
least be externally consistent with the other Attr flags.
Instead of assigning to by-field groups, --add-xticklabel/yticklabel
assign to the relevant x/y coord:
$ ./scripts/plotmpl.py \
--add-xticklabel='0=zero' \
--add-yticklabel='100=one-hundred'
The real power comes from our % modifiers. As a special case,
--add-xticklabel/yticklabel can reference the special x/y field, which
represents the current x/y coord:
$ ./scripts/plotmpl.py --y2 --yticks=5 --add-yticklabel='%(y)d KiB'
Combined with format specifiers, this allows for quite a bit:
$ ./scripts/plotmpl.py --y2 --yticks=5 --add-yticklabel='0x%(y)04x'
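Under the hood this is presumably just Python %-formatting with a
mapping; a rough sketch of what these labels expand to (not the actual
Attr internals):
y = 4096
'%(y)d KiB' % {'y': y}     # -> '4096 KiB'
'0x%(y)04x' % {'y': y}     # -> '0x1000'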
---
Note that plot.py only shows the min/max x/yticks, so it only accepts
indexed --add-xticklabel/yticklabels, and will error if the assigning
variant is used.
Unifying these complicated attr-assigning flags across all the scripts
is the main benefit of the new internal Attr system.
The only tricky bit is that we now need to keep track of all input
fields in case % modifiers reference them, whereas previously we could
discard non-data fields.
Tricky but doable.
Updated flags:
- -L/--label -> -L/--add-label
- --colors -> -C/--add-color
- --formats -> -F/--add-format
- --chars -> -*/--add-char/--chars
- --line-chars -> -_/--add-line-char/--line-chars
I've also tweaked Attr to accept glob matches when figuring out group
assignments. This is useful for matching slightly different, but
similarly named results in our benchmark scripts.
There's probably a clever way to do this by injecting new by fields with
csv.py, but just adding globbing is simpler and makes attr assignment
even more flexible.
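The globbing is presumably just fnmatch-style patterns applied to the
group keys; a minimal sketch of the idea (not the actual Attr code):
import fnmatch
def attr_for(key, attrs):
    # return the first attr whose glob pattern matches the group key
    for pattern, attr in attrs:
        if fnmatch.fnmatch(key, pattern):
            return attr
    return None
# e.g. one color for every bench_rbyd_* result
attr_for('bench_rbyd_attr_append', [('bench_rbyd_*', 'blue')])  # -> 'blue'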
Like treemap.py, but outputting an svg file, which is quite a bit more
useful.
Things svg is _not_:
- A simple vector graphics format
Things svg _is_:
- A surprisingly powerful high-level graphics language.
I might have to use svgs as an output format more often. It's
surprisingly easy to generate graphics without worrying about low-level
rendering details.
---
Aside from the extra flags for svg details like font, padding,
background colors, etc, the main difference between treemap.py and
treemapd3.py is the addition of the --nested mode, which renders a
containing tile for each recursive group (each -b/--by field).
There's no way --nested would've worked in treemap.py. Its main benefit
is the extra labels per subgroup, and labels are already hard enough to
read in treemap.py.
Other than that, treemapd3.py is mostly the same as treemap.py, but with
a resolution that's actually readable.
Based on the d3 javascript library (https://d3js.org), treemap.py
renders hierarchical data as ascii art:
$ ./scripts/treemap.py lfs.code.csv \
-bfunction -fsize --chars=asdf -W60 -H8
total 65454, avg 369 +-366.8σ, min 3, max 4990
aaaassssddddddaaaadddddssddfffaaadfffaassaassfasssdfdfsddfad
aaaassssddddddaaaadddddssddfffaaadfffaassdfaafasssdfdfsddfsf
aaaassssddddddaaaafffffssddfffsssdaaaddffdfaadfaaasdfafaasfa
aaaassssddddddaaaafffffaaaddddsssaassddffdfaaffssfssfsfadffa
aaaassssffffffssssfffffaaaddddsssaassssffddffffssfdffsadfsad
aaaassssffffffssssaaaaasssffffddfaassssaaassdaaddadffsadadad
aaaassssffffffssssaaaaasssffffddfddffddssassdfassadffsadaffa
aaaassssffffffssssaaaaasssffffddfddffddssassdfaddsdadasfsada
(Normally this is also colored, but you know.)
I've been playing around with d3 to try to better visualize code costs
in littlefs, and it's been quite neat. I figured it would be useful to
directly integrate a similar treemap renderer into our result scripts.
That being said, this ascii rendering is probably too difficult to parse
for any non-trivial data. I'm also working on an svg-based renderer, so
treemap.py is really just for in-terminal previews and an exercise to
understand the underlying algorithms, similar to plot.py/plotmpl.py.
You forget one script, running in the background, hogging a whole
core, and suddenly watch's default 2 second sleep time makes a lot more
sense...
One of the main motivators for watch.py _was_ shorter sleep times,
short enough to render realtime animations (watch is limited to 0.1
seconds for some reason?), but this doesn't mean it needs to be the
default. This can still be accomplished by explicitly specifying
-s/--sleep, and we probably don't want the default to hog all the CPU.
The use case for fast sleeps has been mostly replaced by -k/--keep-open
anyways.
For tailpipe.py and tracebd.py it's a bit less clear, but we probably
don't need to be spamming open calls 10 times a second.
That was confusing.
The -L/--label flag is already tricky enough to get right. Allowing
-L/--label to filter datasets is counter-intuitive and just makes it
harder to debug things.
Unfortunately the import sys in the argparse block was hiding missing
sys imports elsewhere in these scripts.
The mistake was assuming the import would be scoped to that if block,
but Python has no block scope and resolves globals at runtime, so the
stray import quietly satisfied every later reference...
Moved local import hack behind if __name__ == "__main__"
These scripts aren't really intended to be used as python libraries.
Still, it's useful to import them for debugging and to get access to
their juicy internals.
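The resulting pattern looks roughly like this (a sketch, the exact
contents of the block vary per script):
if __name__ == "__main__":
    # prevent local imports, but only when run as a script
    __import__('sys').path.pop(0)
    import sys
    import argparse
    # ... argument parsing, then call into main() ...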
This is now inconsistent with csv.py, and I don't really want to add a
full expr parser to every script that might want to rename fields.
Field renaming (or any expr really!) can be accomplished with
intermediate calls to csv.py anyways. No reason to make these scripts
more complicated than they need to be.
This seems like a more fitting name now that this script has evolved
into more of a general purpose high-level CSV tool.
Unfortunately this does conflict with the standard csv module in Python,
breaking every script that imports csv (which is most of them).
Fortunately, Python is flexible enough to let us remove the script's
directory from the import path with a bit of an ugly hack:
# prevent local imports
__import__('sys').path.pop(0)
These scripts are intended to be standalone anyways, so this is probably
a good pattern to adopt.
This matches the style used in C, which is good for consistency:
a_really_long_function_name(
        double_indent_after_first_newline(
            single_indent_nested_newlines))
We were already doing this for multiline control-flow statements, simply
because I'm not sure how else you could indent this without making
things really confusing:
if a_really_long_function_name(
        double_indent_after_first_newline(
            single_indent_nested_newlines)):
    do_the_thing()
This was the only real difference style-wise between the Python code and
C code, so now both should be following roughly the same style (80 cols,
double-indent multiline exprs, prefix multiline binary ops, etc).
Mainly to avoid conflicts with match results named m, this frees up the
single-letter variable m for other purposes.
Choosing a two letter alias was surprisingly difficult, but mt is nice
in that it somewhat matches it (for itertools) and ft (for functools).
This lets you view the first n lines of output instead of the last n
lines, as though the output was piped through head.
This is how the standard watch command works, and can be more useful
when most of the information is at the top, such as in our dbg*.py
scripts (watch.py was originally used as a sort of inotifywait-esque
build runner, which is the main reason it's different).
To make this work, RingIO (renamed from LinesIO) now uses terminal
height as part of its canvas rendering. This has the added benefit of
more rigorously enforcing the canvas boundaries, but risks breaking when
not associated with a terminal. Then again, does RingIO even make sense
without a terminal?
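For what it's worth, a minimal sketch of the head-vs-tail behavior,
assuming a deque-backed ring (RingIO's actual canvas handling is more
involved):
import collections
import shutil
height = shutil.get_terminal_size().lines - 1
def ringed(lines, head=False):
    if head:
        # head-like: keep the first lines that fit in the terminal
        return lines[:height]
    else:
        # default: keep the last lines, like tail
        return list(collections.deque(lines, maxlen=height))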
Worst case you can bypass all of this with -z/--cat.
Since we were only registering our inotify reader after the previous
operation completed, it was easy to miss modifications that happened
faster than our scripts. Because our scripts are in Python, this
happened quite often and made it hard to trust that scripts run with
--keep-open were showing the current state, sort of defeating the
purpose of --keep-open...
I think previously this race condition wasn't avoided because of the
potential to loop indefinitely if --keep-open referenced a file that the
script itself modified, but it's up to the user to avoid this if it is
an issue.
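The fix boils down to registering the watch before kicking off the
operation, roughly (using inotify_simple, with illustrative names):
from inotify_simple import INotify, flags
def keep_open(paths, run):
    while True:
        # register the watch *before* running, so modifications that
        # land while we're still working aren't missed
        inotify = INotify()
        for path in paths:
            inotify.add_watch(path, flags.MODIFY | flags.CLOSE_WRITE)
        run()
        inotify.read()   # block until something changes
        inotify.close()  # and don't leak the fd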
---
Also while fixing this, I noticed our use of the inotify_simple library
was leaking file descriptors everywhere! I just wasn't closing any
inotify objects at all. A bit concerning since scripts with --keep-open
can be quite long lived...
Previously, any labeling was _technically_ possible, but tricky to get
right and usually required repeated renderings.
The old scheme evolved out of the way colors/formats were provided: a
cycled, order-significant list that gets zipped with the datasets. This
works ok for somewhat arbitrary formatting, such as colors/formats, but
falls apart for labels, where it turns out to be somewhat important what
exactly you are labeling.
The new scheme makes the label's relationship to its dataset explicit,
at the cost of being a bit more verbose:
$ ./scripts/plotmpl.py bench.csv -obench.svg \
-Linorder=0,4096,avg,bench_readed \
-Lreversed=1,4096,avg,bench_readed \
-Lrandom=2,4096,avg,bench_readed
This could also be adopted in the CSV manipulation scripts (code.py,
stack.py, summary.py, etc), but I don't think it would actually see that
much use. You can always awk the output to change names and it would add
more complexity to a set of scripts that are probably already way
over-designed.
Note there's a bit of subtlety here: field _types_ are still inferred,
but the intention of the fields, i.e. whether a field contains data vs a
row name or other properties, must be unambiguous in the scripts.
There is still a _tiny_ bit of inference. For most scripts only one
of --by or --fields is strictly needed, since this makes the purpose of
the other fields unambiguous.
The reason for this change is partly to make the scripts a bit more
reliable, but also because it simplifies the data parsing/inference a
bit.
Oh, and this also changes field inference to use the csv.DictReader's
fieldnames field instead of only inspecting the returned dicts. This
should also save a bit of O(n) overhead when parsing CSV files.
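In other words, something like this (a sketch, the file name is made
up):
import csv
with open('results.csv') as f:
    reader = csv.DictReader(f)
    # find the available fields from the header directly...
    fields = reader.fieldnames
    # ...instead of inspecting every returned dict
    for row in reader:
        pass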
The whitespace sensitivity of field args was starting to be a problem,
mostly for advanced plotmpl.py usage (which tbf might be appropriately
described as "super hacky" in how it uses CLI parameters):
./scripts/plotmpl.py \
    -Dcase=" \
        bench_rbyd_attr_append, \
        bench_rbyd_attr_remove, \
        bench_rbyd_attr_fetch, \
        ..."
This may present problems when parsing CSV files with whitespace, in
theory, maybe. But given the scope of these scripts for littlefs...
just don't do that. Thanks.
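Presumably the fix is then just stripping whitespace when splitting
field args; something like:
def split_fields(arg):
    # strip whitespace around comma-separated field args so multi-line
    # quoted args like the above parse as intended (a sketch)
    return [f.strip() for f in arg.split(',') if f.strip()]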
With the quantity of data being output by bench.py now, filtering ASAP
while parsing CSV files is a valuable optimization. And thanks to how
CSV files are structured, we can even avoid ever loading the full
contents into RAM.
This does end up with us filtering for defines redundantly in a few
places, but this is well worth the saved overhead from early filtering.
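The early filtering then looks roughly like this, streaming rows
through a generator instead of collecting them (names illustrative):
import csv
def filtered_rows(path, defines):
    with open(path) as f:
        for row in csv.DictReader(f):
            # drop rows that don't match the requested defines before
            # they ever accumulate in RAM
            if all(row.get(k) == v for k, v in defines.items()):
                yield row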
Also tried to clean up plot.py/plotmpl.py's data folding path, though
that may have been wasted effort.
These benchmarks are now more useful for seeing how these B-trees perform.
In plot.py/plotmpl.py:
- Added --legend as another alias for -l, --legend-right.
- Allowed omitting datasets from the legend by using empty strings in
--labels.
- Do not sum multiple data points on the same x coordinate. This was a
bad idea that risks invalid results going unnoticed.
As a plus, multiple data points on the same x coordinate can now be
abused for a cheap representation of measurement error.
- Added both uattr (limited to 256) and id (limited to 65535) benchmarks
covering the main rbyd operations
- Fixed issue where --defines gets passed to the test/bench runners when
querying id-specific information. After changing the test/bench
runners to prioritize explicit defines, this caused problems for
recorded benchmark results and debug-related things.
- In plot.py/plotmpl.py, made --by/-x/-y in subplots behave somewhat
reasonably, contributing to a global dataset and the figure's legend,
colors, etc, but only shown in the specified subplot. This is useful
mainly for showing different -y values on different subplots.
- In plot.py/plotmpl.py, added --labels to allow explicit configuration
of legend labels, much like --colors/--formats/--chars/etc. This
removes one of the main reasons you'd otherwise need to hand-modify
benchmark results.
- Fixed prettyasserts.py parsing when '->' is in expr
- Made prettyasserts.py failures not crash (yay dynamic typing)
- Fixed the initial state of the emubd disk file to match the internal
state in RAM
- Fixed true/false getting changed to True/False in test.py/bench.py
defines
- Fixed accidental substring matching in plot.py's --by comparison
- Fixed a missed LFS_BLOCK_CYCLES in test_superblocks.toml
- Changed test.py/bench.py -v to only show commands being run
Including the test output is still possible with test.py -v -O-, making
the implicit inclusion redundant and noisy.
- Added license comments to bench_runner/test_runner
Note that plotmpl.py tries to share many arguments with plot.py,
allowing plot.py to act as a sort of draft mode for previewing plots
before creating an svg.
Based loosely on Linux's perf tool, perfbd.py uses trace output with
backtraces to aggregate and show the block device usage of all functions
in a program, propagating block device operation cost up the backtrace
for each operation.
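The propagation itself is conceptually simple, roughly (a sketch, not
perfbd.py's actual bookkeeping):
import collections
costs = collections.defaultdict(int)
def propagate(backtrace, cost):
    # charge the operation's cost to every function in the backtrace,
    # deduplicated so recursive calls aren't counted twice
    for function in set(backtrace):
        costs[function] += cost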
This, combined with --trace-period and --trace-freq for
sampling/filtering trace events, allows the bench-runner to record the
general cost of block device operations with very little overhead.
Adopted this as the default side-effect of make bench, replacing
cycle-based performance measurements which are less important for
littlefs.
This provides 2 things:
1. perf integration with the bench/test runners - This is a bit tricky
with perf as it doesn't have its own way to combine perf measurements
across multiple processes. perf.py works around this by writing
everything to a zip file, using flock to synchronize (see the sketch
after this list). As a plus, free compression!
2. Parsing and presentation of perf results in a format consistent with
the other CSV-based tools. This actually ran into a surprising number of
issues:
- We need to process raw events to get the information we want; this
ends up being a lot of data (~16MiB at 100Hz uncompressed), so we
parallelize the parsing of each decompressed perf file.
- perf reports raw addresses post-ASLR. It does provide sym+off which
is very useful, but to find the source of static functions we need to
reverse the ASLR by finding the delta that produces the best
symbol<->addr matches.
- This isn't related to perf, but decoding dwarf line-numbers is
really complicated. You basically need to write a tiny VM.
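For (1), the flock-synchronized zip writes boil down to something like
this (a sketch, perf.py's actual locking may differ):
import fcntl
import zipfile
def append_perf(zip_path, name, data):
    # take an exclusive lock so concurrent test/bench processes don't
    # corrupt the shared zip, then append this process's perf data
    with open(zip_path + '.lock', 'w') as lock:
        fcntl.flock(lock, fcntl.LOCK_EX)
        with zipfile.ZipFile(zip_path, 'a',
                compression=zipfile.ZIP_DEFLATED) as z:
            z.writestr(name, data)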
This also turns on perf measurement by default for the bench-runner, but at a
low frequency (100 Hz). This can be decreased or removed in the future
if it causes any slowdown.
The main change is requiring field names for -b/-f/-s/-S. This is a bit
more powerful, and supports hidden extra fields, but can require a bit
more typing in some cases.
- Changed multi-field flags to action=append instead of comma-separated.
- Dropped short-names for geometries/powerlosses
- Renamed -Pexponential -> -Plog
- Allowed omitting the 0 for -W0/-H0/-n0 and made -j0 consistent
- Better handling of --xlim/--ylim
Instead of trying to align to block boundaries, tracebd.py now just
aliases to whatever dimensions are provided.
Also reworked how scripts handle default sizing. Now using reasonable
defaults with 0 being a placeholder for automatic sizing. The addition
of -z/--cat makes it possible to pipe directly to stdout.
Also added support for dots/braille output which can capture more
detail, though care needs to be taken to not rely on accurate coloring.
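The dots/braille trick is the usual one of packing a 2x4 block of
pixels into a single braille character; a sketch, not necessarily
tracebd.py's exact code:
# braille chars are U+2800 plus a bit per dot; (x, y) -> dot bit
DOTS = [(0,0,0x01), (0,1,0x02), (0,2,0x04), (1,0,0x08),
    (1,1,0x10), (1,2,0x20), (0,3,0x40), (1,3,0x80)]
def braille(block):
    # block[x][y] is truthy if the pixel at column x, row y is set
    return chr(0x2800 + sum(bit for x, y, bit in DOTS if block[x][y]))
braille([[1,0,0,0], [0,0,0,0]])  # -> '⠁', just dot 1
Since eight pixels share one character cell but can only take a single
color, coloring ends up approximate, hence the caveat.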
Now both scripts also fall back to guessing what fields to use based on
which fields can be converted to integers. This is more fallible, and
doesn't work for tests/benchmarks, but in those cases explicit fields
can be used (which is what would be needed without guessing anyways).