Commit Graph

18 Commits

Author SHA1 Message Date
Christopher Haster
07244fb2d4 In test/bench.py, added "internal" flag
This marks internal tests/benches (case.in="lfs.c") with an otherwise-unused
flag that is printed during --summary/--list-*. This just helps identify which
tests/benches are internal.
2023-06-01 17:40:48 -05:00
Christopher Haster
82027f3d90 Changed bench/test.py to error if explicit suite/case can't be found
Previously no matches would noop. While this is consistent with an empty
test suite, which contains no tests but shouldn't really error, it made
it easy to miss when a typo caused tests to be skipped.

Also added a bit of color to script-level errors in test/bench.py
2023-06-01 17:16:21 -05:00
Christopher Haster
9b033987ef Renamed --gdb-case => --gdb-permutation for correctness
2023-03-19 01:21:27 -05:00
Christopher Haster
83eba5268d Added support for globs in test.py/bench.py, better -b/-B
This reworks test.py/bench.py a bit to map arguments to ids as a first
step instead of deferring as much as possible. This is a better design
and avoids the hackiness around -b/-B. As a plus, test_id globbing is
easy to add.
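
As a rough illustration of mapping arguments to ids as a first step, here
is a minimal sketch. The resolve_ids helper and the ids in known_ids are
hypothetical, and Python's fnmatch module stands in for whatever matching
test.py/bench.py actually use:

```python
import fnmatch

def resolve_ids(args, known_ids):
    """Expand each argument, which may contain glob characters, into the
    known ids it matches, preserving order and dropping duplicates."""
    resolved = []
    for arg in args:
        for id_ in known_ids:
            # fnmatchcase keeps matching case-sensitive on every platform
            if fnmatch.fnmatchcase(id_, arg) and id_ not in resolved:
                resolved.append(id_)
    return resolved

# hypothetical ids, for illustration only
known_ids = ['test_dirs_many', 'test_dirs_rename', 'test_files_simple']
selected = resolve_ids(['test_dirs_*', 'test_files_simple'], known_ids)
```

Once every argument has been resolved to concrete ids, later steps only
need to deal with one representation.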
2023-03-17 15:15:53 -05:00
Christopher Haster
59a57cb767 Reworked test_runner/bench_runner to evaluate define permutations lazily
I wondered if walking in Python 2's footsteps was going to run into the
same issues and sure enough, memory-backed iterators became unwieldy.

The motivation for this change is that large ranges in tests, such as
iterators over seeds or permutations, became prohibitively expensive to
compile. This meant more iteration moving into tests with more steps to
reproduce failures. This sort of defeats the purpose of the test
framework.

The solution here is to move test permutation generation out of test.py
and into the test runner itself. This allows defines to generate their
values programmatically.

This does conflict with the test framework's support of sets of explicit
permutations, but this is fixed by also moving these "permutation sets"
down into the test runner.

I guess it turns out the closer your representation matches your
implementation the better everything works.

Additionally the define caching layer got a bit of tweaking. We can't
precalculate the defines because of mutual recursion, but we can
precalculate which define/permutation each define id maps to. This is
necessary as otherwise figuring out each define's define-specific
permutation would be prohibitively expensive.
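
The general idea of evaluating permutations lazily can be sketched in
Python as below. This is not the runner's code (the runners are written in
C); the helper names and the SEED/SIZE defines are assumptions chosen for
illustration. Each permutation is computed from its index on demand, so a
large range never has to be expanded in memory:

```python
def permutation_count(defines):
    """Total number of permutations across all defines."""
    count = 1
    for values in defines.values():
        count *= len(values)
    return count

def permutation_at(defines, index):
    """Compute the index-th permutation without building the others.

    `defines` maps a define name to a sequence with cheap len() and
    indexing, such as a range.
    """
    perm = {}
    for name, values in defines.items():
        # peel off one mixed-radix digit per define
        index, i = divmod(index, len(values))
        perm[name] = values[i]
    return perm

# hypothetical defines, 128,000,000 permutations in total
defines = {'SEED': range(1000000), 'SIZE': range(0, 131072, 1024)}
total = permutation_count(defines)
first = permutation_at(defines, 0)
last = permutation_at(defines, total - 1)
```

Because a permutation is fully determined by its index, a failing case
can be reproduced from the index alone.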
2023-03-17 15:06:56 -05:00
Christopher Haster
a20625be7c Allowed empty suites in test.py/bench.py
This happens when you need to comment out an entire suite due to
temporary changes.
2023-03-17 14:20:09 -05:00
Christopher Haster
9a8e1d93c6 Added some rbyd benchmarks, fixed/tweaked some related scripts
- Added both uattr (limited to 256) and id (limited to 65535) benchmarks
  covering the main rbyd operations

- Fixed issue where --defines gets passed to the test/bench runners when
  querying id-specific information. After changing the test/bench
  runners to prioritize explicit defines, this causes problems for
  recorded benchmark results and debug related things.

- In plot.py/plotmpl.py, made --by/-x/-y in subplots behave somewhat
  reasonably, contributing to a global dataset and the figure's legend,
  colors, etc, but only shown in the specified subplot. This is useful
  mainly for showing different -y values on different subplots.

- In plot.py/plotmpl.py, added --labels to allow explicit configuration
  of legend labels, much like --colors/--formats/--chars/etc. This
  removes one of the main annoying needs for modifying benchmark results.
2023-02-12 17:14:42 -06:00
Christopher Haster
801cf278ef Tweaked/fixed a number of small runner things after a bit of use
- Added support for negative numbers in the leb16 encoding with an
  optional 'w' prefix.

- Changed prettyasserts.py rule to .a.c => .c, allowing other .a.c files
  in the future.

- Updated .gitignore with missing generated files (tags, .csv).

- Removed suite-namespacing of test symbols, these are no longer needed.

- Changed test define overrides to have higher priority than explicit
  defines encoded in test ids. So:

    ./runners/bench_runner bench_dir_open:0f1g12gg2b8c8dgg4e0 -DREAD_SIZE=16

  Behaves as expected.

  Otherwise it's not easy to experiment with known failing test cases.

- Fixed issue where the -b flag ignored explicit test/bench ids.
2022-12-17 12:35:44 -06:00
Christopher Haster
397aa27181 Removed unnecessarily heavy RAM usage from logs in bench/test.py
For long-running processes (testing with >1pls) these logs can grow into
multiple gigabytes. Humorously, we never access more than the last n
lines, as requested by --context. Piping the stdout with --stdout does
not use additional RAM.
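
One common way to keep only the last n lines in Python is a bounded deque.
The sketch below is an assumption about the technique, not a copy of what
bench/test.py do; tail_lines is a hypothetical helper and the StringIO
merely stands in for a process's output stream:

```python
import collections
import io

def tail_lines(stream, context):
    """Return the last `context` lines of `stream` while holding at most
    `context` lines in memory at any time."""
    last = collections.deque(maxlen=context)
    for line in stream:
        # older lines fall off the front once the deque is full
        last.append(line.rstrip('\n'))
    return list(last)

log = io.StringIO(''.join('line %d\n' % i for i in range(100000)))
tail = tail_lines(log, 5)
```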
2022-12-06 23:07:28 -06:00
Christopher Haster
eba5553314 Fixed hidden orphans by separating deorphan search into two passes
This happens in rare situations where there is a failed mdir relocation,
interrupted by a power-loss, containing the destination of a directory
rename operation, where the directory being renamed preceded the
relocating mdir in the mdir tail-list. This requires a previous directory
rename to have created a cycle at some point.

If this happens, it's possible for the half-orphan to contain the only
reference to the renamed directory. Since half-orphans contain outdated
state when viewed through the mdir tail-list, the renamed directory
appears to be a full-orphan until we fix the relocating half-orphan.
This causes littlefs to incorrectly remove the renamed directory from
the mdir tail-list, causing catastrophic problems down the line.

The source of the problem is that the two different types of orphans
really operate on two different levels of abstraction: half-orphans fix
failed mdir commits, while full-orphans fix directory removes/renames.
Conflating the two leads to situations where we attempt to fix assumed
problems about the directory tree before we have fixed problems with the
mdir state.

The fix here is to separate out the deorphan search into two passes: the
first to fix half-orphans and correct any mdir-commits, restoring the
mdirs and gstate to a known good state, and the second to fix failed
removes/renames.

---

This was found with the -Plinear heuristic powerloss testing, which now
runs on more geometries. The failing case was:

  test_relocations_reentrant_renames:112gg261dk1e3f3:123456789abcdefg1h1i1j1k1
  l1m1n1o1p1q1r1s1t1u1v1g2h2i2j2k2l2m2n2o2p2q2r2s2t2

Also fixed/tweaked some parts of the test framework as a part of finding
this bug:

- Fixed off-by-one in exhaustive powerloss state encoding.

- Added --gdb-powerloss-before and --gdb-powerloss-after to help debug
  state changes through a failing powerloss. Maybe this should be
  expanded to any arbitrary powerloss number in the future.

- Added lfs_emubd_crc and lfs_emubd_bdcrc to get block/bd crcs for quick
  state comparisons while debugging.

- Fixed bd read/prog/erase counts not being copied during exhaustive
  powerloss testing.

- Fixed small typo in lfs_emubd trace.
2022-11-28 12:51:18 -06:00
Christopher Haster
bcc88f52f4 A couple Makefile-related tweaks
- Changed --(tool)-tool to --(tool)-path in scripts, this seems to be
  a more common name for this sort of flag.

- Changed BUILDDIR to not have implicit slash, makes Makefile internals
  a bit more readable.

- Fixed some outdated names hidden in less-often used ifdefs.
2022-11-17 10:26:26 -06:00
Christopher Haster
1a07c2ce0d A number of small script fixes/tweaks from usage
- Fixed prettyasserts.py parsing when '->' is in expr

- Made prettyasserts.py failures not crash (yay dynamic typing)

- Fixed the initial state of the emubd disk file to match the internal
  state in RAM

- Fixed true/false getting changed to True/False in test.py/bench.py
  defines

- Fixed accidental substring matching in plot.py's --by comparison

- Fixed a missed LFS_BLOCK_CYCLES in test_superblocks.toml

- Changed test.py/bench.py -v to only show commands being run

  Including the test output is still possible with test.py -v -O-, making
  the implicit inclusion redundant and noisy.

- Added license comments to bench_runner/test_runner
2022-11-15 13:42:07 -06:00
Christopher Haster
b2a2cc9a19 Added teepipe.py and watch.py
2022-11-15 13:38:13 -06:00
Christopher Haster
3a33c3795b Added perfbd.py and block device performance sampling in bench-runner
Based loosely on Linux's perf tool, perfbd.py uses trace output with
backtraces to aggregate and show the block device usage of all functions
in a program, propagating block device operation costs up the backtrace
for each operation.
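
Propagating costs up the backtrace can be sketched as follows. The event
tuples and function names are made up for illustration, and aggregate is a
hypothetical helper rather than perfbd.py's actual implementation:

```python
import collections

def aggregate(events):
    """Sum block device costs per function, charging each operation to
    every function in its backtrace."""
    totals = collections.defaultdict(collections.Counter)
    for op, size, backtrace in events:
        # use a set so a recursive function is charged once per event
        for func in set(backtrace):
            totals[func][op] += size
    return totals

# (operation, bytes, backtrace with the innermost frame first)
events = [
    ('read', 16, ['bd_read', 'fetch', 'mount']),
    ('prog', 32, ['bd_prog', 'commit', 'mkdir']),
    ('read', 8, ['bd_read', 'fetch', 'mkdir']),
]
totals = aggregate(events)
```

Here `totals['fetch']` covers both reads, while `totals['mkdir']` covers
one read and one prog, which is the inclusive view that perf-style tools
typically report.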

This, combined with --trace-period and --trace-freq for
sampling/filtering trace events, allows the bench-runner to very
efficiently record the general cost of block device operations with very
little overhead.

Adopted this as the default side-effect of make bench, replacing
cycle-based performance measurements which are less important for
littlefs.
2022-11-15 13:38:13 -06:00
Christopher Haster
490e1c4616 Added perf.py, a wrapper around Linux's perf tool for perf sampling
This provides 2 things:

1. perf integration with the bench/test runners - This is a bit tricky
   with perf as it doesn't have its own way to combine perf measurements
   across multiple processes. perf.py works around this by writing
   everything to a zip file, using flock to synchronize. As a plus, free
   compression!

2. Parsing and presentation of perf results in a format consistent with
   the other CSV-based tools. This actually ran into a surprising number of
   issues:

   - We need to process raw events to get the information we want, this
     ends up being a lot of data (~16MiB at 100Hz uncompressed), so we
     parallelize the parsing of each decompressed perf file.

   - perf reports raw addresses post-ASLR. It does provide sym+off which
     is very useful, but to find the source of static functions we need to
     reverse the ASLR by finding the delta that produces the best
     symbol<->addr matches.

   - This isn't related to perf, but decoding dwarf line-numbers is
     really complicated. You basically need to write a tiny VM.
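
One plausible way to find the ASLR delta mentioned above is to let each
sample vote for the offset implied by its sym+off and pick the most common
one. This is only a sketch under that assumption; find_aslr_delta, the
symbol table, and the sample addresses are all hypothetical and may differ
from how perf.py scores matches:

```python
import collections

def find_aslr_delta(symbols, samples):
    """Guess the ASLR offset as the most common difference between a
    sampled runtime address and the static address of its sym+off."""
    votes = collections.Counter()
    for sym, off, addr in samples:
        if sym in symbols:
            votes[addr - (symbols[sym] + off)] += 1
    if not votes:
        return None
    delta, _ = votes.most_common(1)[0]
    return delta

# static symbol addresses, as they would appear in the binary
symbols = {'main': 0x1000, 'helper': 0x1200}
# (symbol, offset, runtime address) triples from sampled events
samples = [
    ('main', 0x10, 0x555555555010),
    ('helper', 0x4, 0x555555555204),
    ('unknown', 0x0, 0x7fff00000000),  # not in the symbol table, ignored
    ('main', 0x20, 0x555555556020),    # outlier, outvoted
]
delta = find_aslr_delta(symbols, samples)
```

Subtracting the delta from any runtime address then gives a static
address that can be looked up in the binary's debug info.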

This also turns on perf measurement by default for the bench-runner, but at a
low frequency (100 Hz). This can be decreased or removed in the future
if it causes any slowdown.
2022-11-15 13:38:13 -06:00
Christopher Haster
296c5afea7 Renamed bench_read/prog/erased -> bench_readed/proged/erased
Yes, this isn't really correct English anymore, but these names avoid the
read/read ambiguity.
2022-11-15 13:38:13 -06:00
Christopher Haster
9507e6243c Several tweaks to script flags
- Changed multi-field flags to action=append instead of comma-separated.
- Dropped short-names for geometries/powerlosses
- Renamed -Pexponential -> -Plog
- Allowed omitting the 0 for -W0/-H0/-n0 and made -j0 consistent
- Better handling of --xlim/--ylim
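
For reference, the action=append style from the first item looks like this
with Python's argparse. The parser is a standalone sketch, with --by
borrowed from plot.py's flags purely as an example name:

```python
import argparse

parser = argparse.ArgumentParser()
# each occurrence of the flag appends one value to a list
parser.add_argument('--by', action='append')
args = parser.parse_args(['--by', 'suite', '--by', 'case'])
```

After parsing, `args.by` is a list with one entry per occurrence of the
flag, so values never need to be split on commas.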
2022-11-15 13:38:13 -06:00
Christopher Haster
4fe0738ff4 Added bench.py and bench_runner.c for benchmarking
These are really just different flavors of test.py and test_runner.c
without support for power-loss testing, but with support for measuring
the cumulative number of bytes read, programmed, and erased.

Note that the existing define parameterization should work perfectly
fine for running benchmarks across various dimensions:

./scripts/bench.py \
    runners/bench_runner \
    bench_file_read \
    -gnor \
    -DSIZE='range(0,131072,1024)'

Also added a couple basic benchmarks as a starting point.
2022-11-15 13:33:34 -06:00