Commit Graph

22 Commits

Author SHA1 Message Date
Christopher Haster
b8d3a5ef46 Fixed inotify race conditions and fd leak in scripts
Since we were only registering our inotify reader after the previous
operation completed, it was easy to miss modifications that happened
faster than our scripts. Since our scripts are in Python, this happened
quite often and made it hard to trust the current state of scripts
with --keep-open, sort of defeating the purpose of --keep-open...

I think previously this race condition wasn't avoided because of the
potential to loop indefinitely if --keep-open referenced a file that the
script itself modified, but it's up to the user to avoid this if it is
an issue.

---

Also while fixing this, I noticed our use of the inotify_simple library
was leaking file descriptors everywhere! I just wasn't closing any
inotify objects at all. A bit concerning since scripts with --keep-open
can be quite long lived...
2023-12-06 22:23:38 -06:00
Christopher Haster
c3d7cbfb09 Changed how labels work in plot.py/plotmpl.py to actually be useable
Previously, any labeling was _technically_ possible, but tricky to get
right and usually required repeated renderings.

It evolved out of the way colors/formats were provided: a cycled
order-significant list that gets zipped with the datasets. This works
ok for somewhat arbitrary formatting, such as colors/formats, but falls
apart for labels, where it turns out to be somewhat important what
exactly you are labeling.

The new scheme makes the label's relationship explicit, at the cost of
being a bit more verbose:

  $ ./scripts/plotmpl.py bench.csv -obench.svg \
        -Linorder=0,4096,avg,bench_readed \
        -Lreversed=1,4096,avg,bench_readed \
        -Lrandom=2,4096,avg,bench_readed

This could also be adopted in the CSV manipulation scripts (code.py,
stack.py, summary.py, etc), but I don't think it would actually see that
much use. You can always awk the output to change names and it would add
more complexity to a set of scripts that are probably already way
over-designed.
2023-11-05 19:55:10 -06:00
Christopher Haster
1e4d4cfdcf Tried to write errors to stderr consistently in scripts 2023-11-05 15:55:07 -06:00
Christopher Haster
d0a6ef0c89 Changed scripts to not infer field purposes from CSV values
Note there's a bit of subtlety here, field _types_ are still infered,
but the intention of the fields, i.e. if the field contains data vs
row name/other properties, must be unambiguous in the scripts.

There is still a _tiny_ bit of inference. For most scripts only one
of --by or --fields is strictly needed, since this makes the purpose of
the other fields unambiguous.

The reason for this change is so the scripts are a bit more reliable,
but also because this simplifies the data parsing/inference a bit.

Oh, and this also changes field inference to use the csv.DictReader's
fieldnames field instead of only inspecting the returned dicts. This
should also save a bit of O(n) overhead when parsing CSV files.
2023-11-04 15:24:18 -05:00
Christopher Haster
0f93fa3057 Tweaked script field arg parsing to strip whitespace almost everywhere
The whitespace sensitivity of field args was starting to be a problem,
mostly for advanced plotmpl.py usage (which tbf might be appropriately
described as "super hacky" in how it uses CLI parameters):

  ./scripts/plotmpl.py \
      -Dcase=" \
          bench_rbyd_attr_append, \
          bench_rbyd_attr_remove, \
          bench_rbyd_attr_fetch, \
          ..."

This may present problems when parsing CSV files with whitespace, in
theory, maybe. But given the scope of these scripts for littlefs...
just don't do that. Thanks.
2023-11-03 15:03:46 -05:00
Christopher Haster
616b4e1c9e Tweaked scripts that consume .csv files to filter defines early
With the quantity of data being output by bench.py now, filtering ASAP
while parsing CSV files is a valuable optimization. And thanks to how
CSV files are structured, we can even avoid ever loading the full
contents into RAM.

This does end up with use filtering for defines redundantly in a few
places, but this is well worth the saved overhead from early filtering.

Also tried to clean up the plot.py/plotmpl.py's data folding path,
though that may have been wasted effort.
2023-11-03 14:30:22 -05:00
Christopher Haster
e7bf5ad82f Added scripts/crc32c.py
This seems like a useful script to have.
2023-09-15 18:42:48 -05:00
Christopher Haster
27fc481ec2 Generalized btree benchmarks, amortized benchmarks, plot.py/plotmpl.py tweaks
These benchmarks are now more useful for seeing how these B-trees perform.

In plot.py/plotmpl.py:

- Added --legend as another alias for -l, --legend-right.

- Allowed omitting of datasets from the legend by using empty strings
  in --labels.

- Do not sum multiple data points on the same x coordinate. This was a
  bad idea that risks invalid results going unnoticed.

  As a plus multiple data points on the same x coordinate can be abused for
  a cheap representation of measurement error.
2023-03-19 01:21:31 -05:00
Christopher Haster
0eccd6515f In plot.py/plotmpl.py, allowed escaped commas in certain comma-separated fields 2023-02-12 17:14:42 -06:00
Christopher Haster
9a8e1d93c6 Added some rbyd benchmarks, fixed/tweaked some related scripts
- Added both uattr (limited to 256) and id (limited to 65535) benchmarks
  covering the main rbyd operations

- Fixed issue where --defines gets passed to the test/bench runners when
  querying id-specific information. After changing the test/bench
  runners to prioritize explicit defines, this causes problems for
  recorded benchmark results and debug related things.

- In plot.py/plotmpl.py, made --by/-x/-y in subplots behave somewhat
  reasonably, contributing to a global dataset and the figure's legend,
  colors, etc, but only shown in the specified subplot. This is useful
  mainly for showing different -y values on different subplots.

- In plot.py/plotmpl.py, added --labels to allow explicit configuration
  of legend labels, much like --colors/--formats/--chars/etc. This
  removes one of the main annoying needs for modifying benchmark results.
2023-02-12 17:14:42 -06:00
Christopher Haster
1f37eb5563 Adopted --subplot* in plot.py
As well as --legend* and --*ticklabels. Mostly for close feature parity, making
it easier to move plots between plot.py and plotmpl.py.
2022-12-16 16:47:42 -06:00
Christopher Haster
1a07c2ce0d A number of small script fixes/tweaks from usage
- Fixed prettyasserts.py parsing when '->' is in expr

- Made prettyasserts.py failures not crash (yay dynamic typing)

- Fixed the initial state of the emubd disk file to match the internal
  state in RAM

- Fixed true/false getting changed to True/False in test.py/bench.py
  defines

- Fixed accidental substring matching in plot.py's --by comparison

- Fixed a missed LFS_BLOCk_CYCLES in test_superblocks.toml that was
  missed

- Changed test.py/bench.py -v to only show commands being run

  Including the test output is still possible with test.py -v -O-, making
  the implicit inclusion redundant and noisy.

- Added license comments to bench_runner/test_runner
2022-11-15 13:42:07 -06:00
Christopher Haster
6fce9e5156 Changed plotmpl.py/plot.py to not treat missing values as discontinuities 2022-11-15 13:38:13 -06:00
Christopher Haster
559e174660 Added plotmpl.py for creating svg/png plots with matplotlib
Note that plotmpl.py tries to share many arguments with plot.py,
allowing plot.py to act as a sort of draft mode for previewing plots
before creating an svg.
2022-11-15 13:38:13 -06:00
Christopher Haster
b2a2cc9a19 Added teepipe.py and watch.py 2022-11-15 13:38:13 -06:00
Christopher Haster
3a33c3795b Added perfbd.py and block device performance sampling in bench-runner
Based loosely on Linux's perf tool, perfbd.py uses trace output with
backtraces to aggregate and show the block device usage of all functions
in a program, propagating block devices operation cost up the backtrace
for each operation.

This combined with --trace-period and --trace-freq for
sampling/filtering trace events allow the bench-runner to very
efficiently record the general cost of block device operations with very
little overhead.

Adopted this as the default side-effect of make bench, replacing
cycle-based performance measurements which are less important for
littlefs.
2022-11-15 13:38:13 -06:00
Christopher Haster
490e1c4616 Added perf.py a wrapper around Linux's perf tool for perf sampling
This provides 2 things:

1. perf integration with the bench/test runners - This is a bit tricky
   with perf as it doesn't have its own way to combine perf measurements
   across multiple processes. perf.py works around this by writing
   everything to a zip file, using flock to synchronize. As a plus, free
   compression!

2. Parsing and presentation of perf results in a format consistent with
   the other CSV-based tools. This actually ran into a surprising number of
   issues:

   - We need to process raw events to get the information we want, this
     ends up being a lot of data (~16MiB at 100Hz uncompressed), so we
     paralellize the parsing of each decompressed perf file.

   - perf reports raw addresses post-ASLR. It does provide sym+off which
     is very useful, but to find the source of static functions we need to
     reverse the ASLR by finding the delta the produces the best
     symbol<->addr matches.

   - This isn't related to perf, but decoding dwarf line-numbers is
     really complicated. You basically need to write a tiny VM.

This also turns on perf measurement by default for the bench-runner, but at a
low frequency (100 Hz). This can be decreased or removed in the future
if it causes any slowdown.
2022-11-15 13:38:13 -06:00
Christopher Haster
ca66993812 Tweaked scripts to share more code, added coverage calls/hits
The main change is requiring field names for -b/-f/-s/-S, this
is a bit more powerful, and supports hidden extra fields, but
can require a bit more typing in some cases.
2022-11-15 13:38:13 -06:00
Christopher Haster
9507e6243c Several tweaks to script flags
- Changed multi-field flags to action=append instead of comma-separated.
- Dropped short-names for geometries/powerlosses
- Renamed -Pexponential -> -Plog
- Allowed omitting the 0 for -W0/-H0/-n0 and made -j0 consistent
- Better handling of --xlim/--ylim
2022-11-15 13:38:13 -06:00
Christopher Haster
42d889e141 Reworked/simplified tracebd.py a bit
Instead of trying to align to block-boundaries tracebd.py now just
aliases to whatever dimensions are provided.

Also reworked how scripts handle default sizing. Now using reasonable
defaults with 0 being a placeholder for automatic sizing. The addition
of -z/--cat makes it possible to pipe directly to stdout.

Also added support for dots/braille output which can capture more
detail, though care needs to be taken to not rely on accurate coloring.
2022-11-15 13:38:13 -06:00
Christopher Haster
fb58148df2 Consistent handling of by/field arguments for plot.py and summary.py
Now both scripts also fallback to guessing what fields to use based on
what fields can be converted to integers. This is more falible, and
doesn't work for tests/benchmarks, but in those cases explicit fields
can be used (which is what would be needed without guessing anyways).
2022-11-15 13:38:13 -06:00
Christopher Haster
7591d9cf74 Added plot.py for in-terminal plotting 2022-11-15 13:38:05 -06:00