21 Commits

Author SHA1 Message Date
Christopher Haster
71930a5c01 scripts: Tweaked openio comment
Dang, this touched like every single script.
2025-04-16 15:23:06 -05:00
Christopher Haster
313696ecf9 scripts: Fixed openio issue where some scripts didn't import os
This only failed if "-" was used as an argument (for stdin/stdout), so
the issue was pretty hard to spot.

openio is a heavily copy-pasted function, so it makes sense to just add
the import os to openio directly. Otherwise this mistake will likely
happen again in the future.
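
For illustration, a minimal sketch of what a self-contained openio might
look like. The exact signature and body are assumptions, not copied from
the scripts. The point is only that the imports live inside the function,
so a copy-pasted openio keeps working for "-":

```python
def openio(path, mode='r', buffering=-1):
    # import here so the function survives being copy-pasted into a
    # script that forgot to import os/sys at the top
    import os
    import sys
    # allow '-' for stdin/stdout
    if path == '-':
        if 'r' in mode:
            return os.fdopen(os.dup(sys.stdin.fileno()), mode, buffering)
        else:
            return os.fdopen(os.dup(sys.stdout.fileno()), mode, buffering)
    else:
        return open(path, mode, buffering)
```

Regular paths never touch os, which is why the missing import only showed
up with "-".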
2025-03-12 21:18:51 -05:00
Christopher Haster
62cc4dbb14 scripts: Disabled local import hack on import
Moved local import hack behind if __name__ == "__main__"

These scripts aren't really intended to be used as python libraries.
Still, it's useful to import them for debugging and to get access to
their juicy internals.
2025-01-28 14:41:30 -06:00
Christopher Haster
a3ac512cc1 scripts: Adopted Parser class in prettyasserts.py
This ended up being a pretty in-depth rework of prettyasserts.py to
adopt the shared Parser class. But now prettyasserts.py should be both
more robust and faster.

The tricky parts:

- The Parser class eagerly munches whitespace by default. This is
  usually a good thing, but for prettyasserts.py we need to keep track
  of the whitespace somehow in order to write it to the output file.

  The solution here is a little bit hacky. Instead of complicating the
  Parser class, we implicitly add a regex group for whitespace when
  compiling our lexer.

  Unfortunately this does make last-minute patching of the lexer a bit
  messy (for things like -p/--prefix, etc), thanks to Python's
  re.Pattern class not being extendable. To work around this, the Lexer
  class keeps track of the original patterns to allow recompilation.

- Since we no longer tokenize in a separate pass, we can't use the
  None token to match any unmatched tokens.

  Fortunately this can be worked around with sufficiently ugly regex.
  See the 'STUFF' rule.

  It's a good thing Python has negative lookaheads.

  On the flip side, this means we no longer need to explicitly specify
  all possible tokens when multiple tokens overlap.

- Unlike stack.py/csv.py, prettyasserts.py needs multi-token lookahead.

  Fortunately this has a pretty straightforward solution with the
  addition of an optional stack to the Parser class.

  We can even have a bit of fun with Python's with statements (though I
  do wish with statements could have else clauses, so we wouldn't need
  double nesting to catch parser exceptions).
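
A rough sketch of the whitespace-group idea from the first point. The
Lexer name matches the commit, but the methods, rule names, and the 'ws'
group name are hypothetical, and the real class may look quite different:

```python
import re

class Lexer:
    """Hypothetical lexer that records the whitespace before each token."""
    def __init__(self, patterns):
        # keep the original patterns around, re.Pattern can't be extended
        self.patterns = dict(patterns)
        self._compile()

    def _compile(self):
        # implicitly add a whitespace group in front of every rule
        self.regex = re.compile(
                r'(?P<ws>\s*)(?:%s)' % '|'.join(
                    '(?P<%s>%s)' % (name, pattern)
                    for name, pattern in self.patterns.items()))

    def extend(self, name, pattern):
        # last-minute patching means recompiling from the sources
        self.patterns[name] = pattern
        self._compile()

    def lex(self, text):
        pos = 0
        while pos < len(text):
            m = self.regex.match(text, pos)
            if not m or m.end() == pos:
                break
            # yield (rule name, leading whitespace, token text)
            yield m.lastgroup, m.group('ws'), m.group(m.lastgroup)
            pos = m.end()
```

Since every token carries its leading whitespace, the output can be
reconstructed byte-for-byte by writing the whitespace and token text back
out in order.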

---

In addition to adopting the new Parser class, I also made sure to
eliminate intermediate string allocation through heavy use of Python's
io.StringIO class.

This, plus Parser's cheap shallow chomp/slice operations, gives
prettyasserts.py a much needed speed boost.
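
As a small illustration of the io.StringIO pattern (the render function
and its (whitespace, token) input are made up for this example, not taken
from prettyasserts.py):

```python
import io

def render(tokens):
    """Write (whitespace, token) pairs into one buffer."""
    out = io.StringIO()
    for ws, tok in tokens:
        # write pieces directly instead of building temporary strings
        out.write(ws)
        out.write(tok)
    return out.getvalue()
```

Each piece is written straight to the buffer, so nothing is concatenated
until the final getvalue().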

(Honestly, the original prettyasserts.py was pretty naive, with the
assumption that it wouldn't be the bottleneck during compilation. This
turned out to be wrong.)

These changes cut total compile time in ~half:

                                          real      user      sys
  before (time make test-runner -j): 0m56.202s 2m31.853s 0m2.827s
  after  (time make test-runner -j): 0m26.836s 1m51.213s 0m2.338s

Keep in mind this includes both prettyasserts.py and gcc -Os (and other
Makefile stuff).
2024-12-17 15:34:44 -06:00
Christopher Haster
eeab0c41e8 scripts: Reverted to lh type preference in prettyasserts.py
This was flipped in b5e264b.

Inferring the type from the right-hand side is tempting here, but the
right-hand side is often a constant, which gets a bit funky in C.

Consider:

  assert(lfs->cfg->read != NULL);

  gcc: warning: ISO C forbids initialization between function pointer
  and ‘void *’ [-Wpedantic]

  assert(err < 0ULL);

  gcc: warning: comparison of unsigned expression in ‘< 0’ is always
  false [-Wtype-limits]

Preferring the left-hand type should hopefully avoid these issues most of
the time.
2024-12-17 15:34:44 -06:00
Christopher Haster
7cfcc1af1d scripts: Renamed summary.py -> csv.py
This seems like a more fitting name now that this script has evolved
into more of a general purpose high-level CSV tool.

Unfortunately this does conflict with the standard csv module in Python,
breaking every script that imports csv (which is most of them).
Fortunately, Python is flexible enough to let us remove the current
directory before imports with a bit of an ugly hack:

  # prevent local imports
  __import__('sys').path.pop(0)

These scripts are intended to be standalone anyways, so this is probably
a good pattern to adopt.
2024-11-09 12:31:16 -06:00
Christopher Haster
007ac97bec scripts: Adopted double-indent on multiline expressions
This matches the style used in C, which is good for consistency:

  a_really_long_function_name(
          double_indent_after_first_newline(
              single_indent_nested_newlines))

We were already doing this for multiline control-flow statements, simply
because I'm not sure how else you could indent this without making
things really confusing:

  if a_really_long_function_name(
          double_indent_after_first_newline(
              single_indent_nested_newlines)):
      do_the_thing()

This was the only real difference style-wise between the Python code and
C code, so now both should be following roughly the same style (80 cols,
double-indent multiline exprs, prefix multiline binary ops, etc).
2024-11-06 15:31:17 -06:00
Christopher Haster
096f968cbb Dropped prettyasserts.py --no-arrows shorthand form
This probably doesn't deserve a shorthand form as you can usually just
ignore the arrow parsing. This frees up -A for potential future use.
2024-06-20 13:04:12 -05:00
Christopher Haster
4d76551d6b Fixed parse errors in prettyasserts.py caused by ternary operators
Because of course ternary operators would cause problems.

The two problems:

  LFS_ASSERT((exists) ? !err : err == LFS_ERR_NOENT);
  lfsr_file_sync(&lfs, &file) => (zombie) ? 0 : LFS_ERR_NOENT;

We could work around these with parentheses, but with different assert
parsers floating around this issue is likely to crop up again in the
future.

Fortunately this just required separate "sep" vs "term" rules and a bit
more strict parsing.
2024-05-22 18:50:54 -05:00
Christopher Haster
4208aa21e2 Extended prettyasserts.py to support prefixed memcmp/strcmp
The move to lfs_memcmp/lfs_strcmp highlighted an interesting hole in
prettyasserts.py: the lack of support for custom memcmp/strcmp symbols.

Rather than just adding more flags for an increasing number of symbols,
I've added -p/--prefix and -P/--prefix-insensitive to generate relevant
symbols based on a prefix. In littlefs's case, we use -Plfs_, which
matches both lfs_memcmp and LFS_ASSERT (and LFS_MEMCMP and lfs_assert
but ignore those):

  $ ./scripts/prettyasserts.py -Plfs_ lfs.t.c -o lfs.t.a.c

Don't worry, you can still provide explicit symbols, but only via
long-form flags. This gets a bit noisy:

  $ ./scripts/prettyasserts.py \
      --assert=LFS_ASSERT \
      --unreachable=LFS_UNREACHABLE \
      --memcmp=lfs_memcmp \
      --strcmp=lfs_strcmp \
      lfs.t.c -o lfs.t.a.c

This commit also finally gives prettyasserts.py's symbols actual
word boundaries, instead of the big error-prone hack of sorting by size.
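
A sketch of how prefix-based symbols and word boundaries could fit
together. Both helper names are hypothetical, and how prettyasserts.py
actually builds its patterns is an assumption here:

```python
import re

def prefixed_symbols(prefix, names, insensitive=False):
    """Generate symbols such as lfs_memcmp from a prefix."""
    symbols = []
    for name in names:
        symbols.append(prefix + name)
        if insensitive:
            # also accept all-lowercase and all-uppercase spellings
            symbols.append(prefix.lower() + name.lower())
            symbols.append(prefix.upper() + name.upper())
    # drop duplicates while keeping order
    return list(dict.fromkeys(symbols))

def symbol_pattern(symbols):
    # \b gives real word boundaries, so symbol order no longer matters
    return re.compile(
            r'\b(?:%s)\b' % '|'.join(re.escape(s) for s in symbols))
```

With word boundaries, lfs_memcmp no longer matches inside a longer
identifier such as my_lfs_memcmp, which is the sort of mistake
sorting-by-size was trying to paper over.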
2024-05-22 15:43:46 -05:00
Christopher Haster
7d95a2ff29 Added ability to disable default patterns in prettyasserts.py
- -n/--no-defaults - disable default patterns

The default patterns can be brought back explicitly with:

- -a/--assert      - enable assert pattern
- -u/--unreachable - enable unreachable pattern
- -A/--arrow       - enable arrow patterns

Technically the default configuration is equivalent to the following:

  $ ./scripts/prettyasserts.py \
      -a assert \
      -a __builtin_assert \
      -u unreachable \
      -u __builtin_unreachable \
      -A \
      input.a.c -o output.c

This isn't really useful for littlefs, but may be useful elsewhere.
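
A hedged sketch of how these flags might be wired up with argparse. The
flag names come from the list above. The function name, the dest names,
and the choice to add explicit patterns on top of the defaults are
assumptions:

```python
import argparse

DEFAULT_ASSERTS = ['assert', '__builtin_assert']
DEFAULT_UNREACHABLES = ['unreachable', '__builtin_unreachable']

def parse_patterns(argv):
    parser = argparse.ArgumentParser(allow_abbrev=False)
    parser.add_argument('-n', '--no-defaults', action='store_true',
            help="Disable default patterns.")
    # 'assert' is a Python keyword, so store it under another name
    parser.add_argument('-a', '--assert', dest='asserts',
            action='append', default=[],
            help="Enable an assert pattern.")
    parser.add_argument('-u', '--unreachable', dest='unreachables',
            action='append', default=[],
            help="Enable an unreachable pattern.")
    parser.add_argument('-A', '--arrow', action='store_true',
            help="Enable arrow patterns.")
    args = parser.parse_args(argv)
    if not args.no_defaults:
        # assumed behavior: defaults are implied unless -n is given
        args.asserts = DEFAULT_ASSERTS + args.asserts
        args.unreachables = DEFAULT_UNREACHABLES + args.unreachables
        args.arrow = True
    return args
```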
2024-02-14 12:22:19 -06:00
Christopher Haster
738dd86339 Extended prettyasserts.py to support unreachable statements
The main benefit is control over error reporting and avoiding the dive
into stdlib layers when debugging thanks to __builtin_trap().

This changes -p/--pattern -> -a/--assert

And adds -u/--unreachable
2024-02-14 01:59:03 -06:00
Christopher Haster
1422a61d16 Made generated prettyasserts more debuggable
The main star of the show is the adoption of __builtin_trap() for
aborting on assert failure. I discovered this GCC/Clang extension
recently and it integrates much, _much_ better with GDB.

With stdlib's abort(), GDB drops you off in several layers of internal
stdlib functions, which is a pain to navigate out of to get to where the
assert actually happened. With __builtin_trap(), GDB stops immediately,
making debugging quick and easy.

This is great! The pain of debugging needs to come from understanding
the error, not just getting to it.

---

Also tweaked a few things with the internal print functions to make
reading the generated source easier, though I realize this is a rare
thing to do.
2024-02-14 01:14:36 -06:00
Christopher Haster
b5e264bec4 Fixed issue with pointer comparisons in prettyasserts.py
We end up passing intmax_t pointers around, but without a cast. This
results in a warning. Adding a cast fixes the warning. This is in the
printing logic, not the actual comparison, so hiding warnings with this
cast is not a concern here.

I also flipped the type we compare with to use the right-hand side. The
pretty-assert code already treats the right-hand as the "expected" value
(I wonder if this is an English language quirk), so I think it makes
sense to use the right-hand side as the "expected" type.
2024-02-03 18:14:51 -06:00
Christopher Haster
e7bf5ad82f Added scripts/crc32c.py
This seems like a useful script to have.
2023-09-15 18:42:48 -05:00
Christopher Haster
37dcee8868 Fixed prettyasserts.py getting confused by escaped-newlines
This is just a messy part of the C grammar.

Also fixed >> and << confusing certain assert expressions, which isn't
surprising.
2023-02-12 12:06:04 -06:00
Christopher Haster
1a07c2ce0d A number of small script fixes/tweaks from usage
- Fixed prettyasserts.py parsing when '->' is in expr

- Made prettyasserts.py failures not crash (yay dynamic typing)

- Fixed the initial state of the emubd disk file to match the internal
  state in RAM

- Fixed true/false getting changed to True/False in test.py/bench.py
  defines

- Fixed accidental substring matching in plot.py's --by comparison

- Fixed a missed LFS_BLOCk_CYCLES in test_superblocks.toml

- Changed test.py/bench.py -v to only show commands being run

  Including the test output is still possible with test.py -v -O-, making
  the implicit inclusion redundant and noisy.

- Added license comments to bench_runner/test_runner
2022-11-15 13:42:07 -06:00
Christopher Haster
b2a2cc9a19 Added teepipe.py and watch.py 2022-11-15 13:38:13 -06:00
Christopher Haster
3a33c3795b Added perfbd.py and block device performance sampling in bench-runner
Based loosely on Linux's perf tool, perfbd.py uses trace output with
backtraces to aggregate and show the block device usage of all functions
in a program, propagating block device operation cost up the backtrace
for each operation.
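
The propagation can be sketched roughly like this, assuming each trace
event has already been parsed into a (backtrace, cost) pair. The function
name and the event format are hypothetical:

```python
import collections

def aggregate(events):
    """Sum block device cost per function.

    events is an iterable of (backtrace, cost), where backtrace lists
    the functions active when the operation happened.
    """
    totals = collections.Counter()
    for backtrace, cost in events:
        # charge every caller, but only once per event so recursive
        # functions aren't double counted
        for func in set(backtrace):
            totals[func] += cost
    return totals
```

A function near the root of the call graph ends up with the combined cost
of everything beneath it, similar to perf's inclusive (children) view.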

This combined with --trace-period and --trace-freq for
sampling/filtering trace events allows the bench-runner to very
efficiently record the general cost of block device operations with very
little overhead.

Adopted this as the default side-effect of make bench, replacing
cycle-based performance measurements which are less important for
littlefs.
2022-11-15 13:38:13 -06:00
Christopher Haster
490e1c4616 Added perf.py a wrapper around Linux's perf tool for perf sampling
This provides 2 things:

1. perf integration with the bench/test runners - This is a bit tricky
   with perf as it doesn't have its own way to combine perf measurements
   across multiple processes. perf.py works around this by writing
   everything to a zip file, using flock to synchronize. As a plus, free
   compression!

2. Parsing and presentation of perf results in a format consistent with
   the other CSV-based tools. This actually ran into a surprising number of
   issues:

   - We need to process raw events to get the information we want. This
     ends up being a lot of data (~16MiB at 100Hz uncompressed), so we
     parallelize the parsing of each decompressed perf file.

   - perf reports raw addresses post-ASLR. It does provide sym+off which
     is very useful, but to find the source of static functions we need to
     reverse the ASLR by finding the delta that produces the best
     symbol<->addr matches.

   - This isn't related to perf, but decoding dwarf line-numbers is
     really complicated. You basically need to write a tiny VM.
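
One way to picture the ASLR reversal is a simple vote over candidate
deltas. This is only a sketch under assumptions, perf.py may score
matches differently, and the function name and input shapes are made up:

```python
import collections

def guess_aslr_delta(samples, symtab):
    """Guess the load offset from perf samples.

    samples is an iterable of (sym, off, runtime_addr) and symtab maps
    each symbol to its static address in the binary.
    """
    votes = collections.Counter()
    for sym, off, addr in samples:
        if sym in symtab:
            # each known symbol votes for the delta that would explain it
            votes[addr - (symtab[sym] + off)] += 1
    if not votes:
        return None
    delta, _ = votes.most_common(1)[0]
    return delta
```

Once the delta is known, subtracting it from any raw address gives a
static address that can be looked up directly, including addresses inside
static functions.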

This also turns on perf measurement by default for the bench-runner, but at a
low frequency (100 Hz). This can be decreased or removed in the future
if it causes any slowdown.
2022-11-15 13:38:13 -06:00
Christopher Haster
20ec0be875 Cleaned up a number of small tweaks in the scripts
- Added the littlefs license note to the scripts.

- Adopted parse_intermixed_args everywhere for more consistent arg
  handling.

- Removed argparse's implicit help text formatting as it does not
  work with parse_intermixed_args and breaks sometimes.

- Used string concatenation for argparse everywhere. Backslashed
  line continuations only work with argparse because it strips
  redundant whitespace.

- Consistent argparse formatting.

- Consistent openio mode handling.

- Consistent color argument handling.

- Adopted functools.lru_cache in tracebd.py.

- Moved unicode printing behind --subscripts in tracebd.py, making all
  scripts ascii by default.

- Renamed pretty_asserts.py -> prettyasserts.py.

- Renamed struct.py -> struct_.py, the original name conflicts with
  Python's built in struct module in horrible ways.
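
For reference, a minimal sketch of the parse_intermixed_args and
string-concatenation points. The parser, its arguments, and the help text
are invented for illustration and don't match any particular script:

```python
import argparse

def build_parser():
    parser = argparse.ArgumentParser(
            description="Hypothetical script for illustration.",
            allow_abbrev=False)
    parser.add_argument('paths', nargs='*',
            help="Input files.")
    parser.add_argument('-v', '--verbose', action='store_true',
            # adjacent string literals concatenate, no backslashes needed
            help="Show a bit more of what is going on, spread over "
                "more than one source line.")
    return parser

def parse(argv):
    # intermixed parsing allows flags between positional arguments
    return build_parser().parse_intermixed_args(argv)
```

With plain parse_args, a positional that follows a flag can end up
rejected as an unrecognized argument. parse_intermixed_args collects all
of the positionals regardless of where the flags fall.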
2022-11-15 13:31:11 -06:00