67 Commits

Author SHA1 Message Date
Christopher Haster
651c3e1eb4 scripts: Renamed Attr -> CsvAttr
Mainly to avoid confusion with littlefs's attrs, uattrs, rattrs, etc.

This risked things getting _really_ confusing as the scripts evolve.
2025-05-15 18:48:46 -05:00
Christopher Haster
aebe5b1d1b scripts: plot[mpl].py: Added --x/ylim-stddev for data-dependent limits
This adds --xlim-stddev and --ylim-stddev as alternatives to -X/--xlim
and -Y/--ylim that define the plot limits in terms of standard
deviations from the mean, instead of in absolute values.

So want to only plot data within +-1 standard deviation? Use:

  $ ./scripts/plot.py --ylim-stddev=-1,+1

Want to ignore outliers >3 standard deviations? Use:

  $ ./scripts/plot.py --ylim-stddev=3

This is very useful for plotting the amortized/per-byte benchmarks,
which have a tendency to run off towards infinity near zero.

Before, we could truncate data explicitly with -Y/--ylim, but this was
getting very tedious and doesn't work well when you don't know what the
data is going to look like beforehand.
2025-05-15 18:23:09 -05:00
Christopher Haster
48daeed509 scripts: Fixed rounding-towards-zero issue in si/si2 prefixes
This should be floor (rounds towards -inf), not int (rounds towards
zero), otherwise sub-integer results get funky:

- floor si(0.00001) => 10u
- int   si(0.00001) => 0.01m

- floor si(0.000001) => 1u
- int   si(0.000001) => m (???)
2025-05-15 16:08:04 -05:00
Christopher Haster
c04f36ead4 scripts: plot[mpl].py: Adopted -s/--sort and -S for legend sorting
Before this, the only option for ordering the legend was by specifying
explicit -L/--add-label labels. This works for the most part, but
doesn't cover the case where you don't know the parameterization of the
input data.

And we already have -s/-S flags in other csv scripts, so it makes sense
to adopt them in plot.py/plotmpl.py to allow sorting by one or more
explicit fields.

Note that -s/-S can be combined with explicit -L/--add-labels to order
datasets with the same sort field:

  $ ./scripts/plot.py bench.csv \
          -bBLOCK_SIZE \
          -xn \
          -ybench_readed \
          -ybench_proged \
          -ybench_erased \
          --legend \
          -sBLOCK_SIZE \
          -L'*,bench_readed=bs=%(BLOCK_SIZE)s' \
          -L'*,bench_proged=' \
          -L'*,bench_erased='

---

Unfortunately this conflicted with -s/--sleep, which is a common flag in
the ascii-art scripts. This was bound to conflict with -s/--sort
eventually, so a came up with some alternatives:

- -s/--sleep -> -~/--sleep
- -S/--coalesce -> -+/--coalesce

But I'll admit I'm not the happiest about these...
2025-05-15 15:51:49 -05:00
Christopher Haster
7526b469b9 scripts: Adopted globs in all field matchers (-D/--define, -c/--compare)
Globs in CLI attrs (-L'*=bs=%(bs)s' for example), have been remarkably
useful. It makes sense to extend this to the other flags that match
against CSV fields, though this does add complexity to a large number of
smaller scripts.

- -D/--define can now use globs when filtering:

    $ ./scripts/code.py lfs.o -Dfunction='lfsr_file_*'

  -D/--define already accepted a comma-separated list of options, so
  extending this to globs makes sense.

  Note this differs from test.py/bench.py's -D/--define. Globbing in
  test.py/bench.py wouldn't really work since -D/--define is generative,
  not matching. But there's already other differences such as integer
  parsing, range, etc. It's not worth making these perfectly consistent
  as they are really two different tools that just happen to look the
  same.

- -c/--compare now matches with globs when finding the compare entry:

    $ ./scripts/code.py lfs.o -c'lfs*_file_sync'

  This is quite a bit less useful that -D/--define, but makes sense for
  consistency.

  Note -c/--compare just chooses the first match. It doesn't really make
  sense to compare against multiple entries.

This raised the question of globs in the field specifiers themselves
(-f'bench_*' for example), but I'm rejecting this for now as I need to
draw the complexity/scope _somewhere_, and I'm worried it's already way
over on the too-complex side.

So, for now, field names must always be specified explicitly. Globbing
field names would add too much complexity. Especially considering how
many flags accept field names in these scripts.
2025-05-15 14:28:57 -05:00
Christopher Haster
48c1a016a0 scripts: Fixed missing tuple unpack in glob-all CLI attrs
This was broken:

  $ ./scripts/plotmpl.py -L'*=bs=%(bs)s'

There may be a better way to organize this logic, but spamming if
statements works well enough.
2025-05-15 13:47:09 -05:00
Christopher Haster
71930a5c01 scripts: Tweaked openio comment
Dang, this touched like every single script.
2025-04-16 15:23:06 -05:00
Christopher Haster
57c77b1b72 scripts: Fixed most flickering issues in RingIO
Two new tricks:

1. Hide the cursor while redrawing the ring buffer.

2. Build up the entire redraw in RAM first, and render everything in a
   single write call.

These _mostly_ get rid of the cursor flickering issues in rapidly
updating scripts.
2025-04-16 15:23:05 -05:00
Christopher Haster
cffa9ec67e scripts: Adopted ring name for stdout substitution 2025-04-16 15:22:51 -05:00
Christopher Haster
5952431660 scripts: Consistently use color='auto' default in main 2025-04-16 15:22:50 -05:00
Christopher Haster
fc5bfdae14 scripts: Adopted -n/--lines in most ascii art scripts
The notable exception being plot.py, where line-level history doesn't
really make sense.

These scripts all default to height=1, and -n/--lines can be useful for
viewing changes over time.

In theory you could achieve something similar to this with tailpipe.py,
but you would lose the header info, which is useful.

---

Note, as a point of simplicity, we do _not_ show sub-char history like
we used to in tracebd.py. That was way too complicated for what it was
worth.
2025-04-16 15:22:46 -05:00
Christopher Haster
8e3760c5b8 scripts: Tweaked punescape to expect dict-like attrs
This simplifies attrs a bit, and scripts can always override
__getitem__ if they want to provide lazy attr generation.

The original intention of accepting functions was to make lazy attr
generation easier, but while tinkering around with the idea I realized
the actual attr mapping/generation would be complicated enough that
you'd probably want a full class anyways.

All of our scripts are only using dict attrs anyways. And lazy attr
generation is probably a premature optimization for the same reason
everyone's ok with Python's slices being O(n).
2025-04-16 15:22:45 -05:00
Christopher Haster
06bb34fd99 scripts: Adopted Attr class changes in all scripts
Mainly the addition of Attr.getall, Attr.get, and changing
Attr.__getitem__ to raise KeyError (just like a normal dict).
2025-04-16 15:22:44 -05:00
Christopher Haster
cd039f6227 scripts: Adopted height-relative negative values for -n/--lines
This mirrors how -H/--height and -W/--width work, with -n-1 using the
terminal height - 1 for the output.

This is very useful for carving out space for the shell prompt and other
things, without sacrificing automatic sizing.
2025-04-16 15:22:42 -05:00
Christopher Haster
d5c0e142f0 scripts: Reworked tracebd.py, needs cleanup
It's a mess but it's working. Still a number of TODOs to cleanup...

This adopts all of the changes in dbgbmap.py/dbgbmapd3.py, block
grouping, nested curves, Canvas, Attrs, etc:

- Like dbgbmap.py, we now group by block first before applying space
  filling curves, using nested space filling curves to render byte-level
  operations.

  Python's ft.lru_cache really shines here.

  The previous behavior is still available via -u/--contiguous

- Adopted most features in dbgbmap.py, so --to-scale, -t/--tiny, custom
  --title strings, etc.

- Adopted Attrs so now chars/coloring can be customized with
  -./--add-char, -,/--add-wear-char, -C/--add-color,
  -G/--add-wear-color.

- Renamed -R/--reset -> --volatile, which is a much better name.

- Wear is now colored cyan -> white -> read, which is a bit more
  visually interesting. And we're not using cyan in any scripts yet.

In addition to the new stuff, there were a few simplifications:

- We no longer support sub-char -n/--lines with -:/--dots or
  -⣿/--braille. Too complicated, required Canvas state hacks to get
  working, and wasn't super useful.

  We probably want to avoid doing too much cleverness with -:/--dots and
  -⣿/--braille since we can't color sub-chars.

- Dropped -@/--blocks byte-level range stuff. This was just not worth
  the amount of complexity it added. -@/--blocks is now limited to
  simple block ranges. High-level scripts should stick to high-level
  options.

- No fancy/complicated Bmap class. The bmap object is just a dict of
  TraceBlocks which contain RangeSets for relevant operations.

  Actually the new RangeSet class deserves a mention but this commit
  message is probably already too long.

  RangeSet is a decently efficient set of, well, ranges, that can be
  merged and queried. In a lower-level language it should be implemented
  as a binary tree, but in Python we're just using a sorted list because
  we're probably not going to be able to beat O(n) list operations.

- Wear is tracked at the block level, no reason to overcomplicate this.

- We no longer resize based on new info. Instead we either expect a
  -b/--block-size argument or wait until first bd init call.

  We can probably drop the block size in BD_TRACE statements now, but
  that's a TODO item.

- Instead of one amalgamated regex, we use string searches to figure out
  the bd op and then smaller regexes to parse. Lesson learned here:
  Python's string search is very fast (compared to regex).

- We do _not_ support labels on blocks like we do in treemap.py/
  codemap.py. It's less useful here and would just be more hassle.

I also tried to reorganize main a bit to mirror the simple two-main
approach in dbgbmap.py and other ascii-rendering scripts, but it's a bit
difficult here since trace info is very stateful. Building up main
functions in the main main function seemed to work well enough:

  main -+-> main_ -> trace__ (main thread)
        '-> draw_ -> draw__ (daemon thread)

---

You may note some weirdness going on with flags. That's me trying to
avoid upcoming flag conflicts.

I think we want -n/--lines in more scripts, now that it's relatively
self-contained, but this conflicts with -n/--namespace-depth in
codemap[d3].py, and risks conflict with -N/--notes in csv.py which may
end up with namespace-related functionality in the future.

I ended up hijacking -_, but this conflicted with -_/--add-line-char in
plot.py, but that's ok because we also want a common "secondary char"
flag for wear in tracebd.py... Long story short I ended up moving a
bunch of flags around:

- added                   -n/--lines
- -n/--namespace-depth -> -_/--namespace-depth
- -N/--notes           -> -N/--notes
- -./--add-char        -> -./--add-char
- -_/--add-line-char   -> -,/--add-line-char
- added                   -,/--add-wear-char
- -C/--color           -> -C/--add-color
- added                -> -G/--add-wear-color

Worth it? Dunno.
2025-04-16 15:22:38 -05:00
Christopher Haster
5f06558cbe scripts: Added dbgbmapd3.py for bmap -> svg rendering
Like codemapd3.py this include an interactive UI for viewing the
underlying filesystem graph, including:

- mode-tree - Shows all reachable blocks from a given block
- mode-branches - Shows immediate children of a given block
- mode-references - Shows parents of a given block
- mode-redund - Shows sibling blocks in redund groups (This is
  currently just mdir pairs, but the plan is to add more)

This is _not_ a full filesystem explorer, so we don't embed all block
data/metadata in the svg. That's probably a project for another time.
However we do include interesting bits such as trunk addresses,
checksums, etc.

An example:

  # create an filesystem image
  $ make test-runner -j
  $ ./scripts/test.py -B test_files_many -a -ddisk -O- \
          -DBLOCK_SIZE=1024 \
          -DCHUNK=10 \
          -DSIZE=2050 \
          -DN=128 \
          -DBLOCK_RECYCLES=1
  ... snip ...
  done: 2/2 passed, 0/2 failed, 164pls!, in 0.16s

  # generate bmap svg
  $ ./scripts/dbgbmapd3.py disk -b1024 -otest.svg \
          -W1400 -H750 -Z --dark
  updated test.svg, littlefs v0.0 1024x1024 0x{26e,26f}.d8 w64.128, cksu
  m 41ea791e

And open test.svg in a browser of your choice.

Here's what the current colors mean:

- yellow => mdirs
- blue   => btree nodes
- green  => data blocks
- red    => corrupt/conflict issue
- gray   => unused blocks

But like codemapd3.py the output is decently customizable. See -h/--help
for more info.

And, just like codemapd3.py, this is based on ideas from d3 and
brendangregg's flamegraphs:

- d3 - https://d3js.org
- brendangregg's flamegraphs - https://github.com/brendangregg/FlameGraph

Note we don't actually use d3... the name might be a bit confusing...

---

One interesting change from the previous dbgbmap.py is the addition of
"corrupt" (bad checksum) and "conflict" (multiple parents) blocks, which
can help find bugs.

You may find the "conflict" block reporting a bit strange. Yes it's
useful for finding block allocation failures, but won't naturally formed
dags in file btrees also be reported as "conflicts"?

Yes, but the long-term plan is to move away from dags and make littlefs
a pure tree (for block allocator and error correction reasons). This
hasn't been implemented yet, so for now dags will result in false
positives.

---

Implementation wise, this script was pretty straightforward given prior
dbglfs.py and codemapd3.py work.

However there was an interesting case of https://xkcd.com/1425:

- Traverse the filesystem and build a graph - easy
- Tile a rectangle with n nice looking rectangles - uhhh

I toyed around with an analytical approach (something like block width =
sqrt(canvas_width*canvas_height/n) * block_aspect_ratio), but ended up
settling on an algorithm that divides the number of columns by 2 until
we hit our target aspect ratio.

This algorithm seems to work quite well, runs in only O(log n), and
perfectly tiles the grid for powers-of-two. Honestly the result is
better than I was expecting.
2025-04-16 15:22:17 -05:00
Christopher Haster
460d8870ec scripts: Adopted -c for --cat shortform
This avoids conflicts with -z/--depth, which will likely be an issue for
dbgbmap.py.
2025-04-16 15:21:55 -05:00
Christopher Haster
cb0119681d scripts: Adopted -k/--keep-open in treemap.py/codemap.py
It probably makes sense to adopt this in all ascii-art scripts that
operate on files.
2025-04-16 15:21:49 -05:00
Christopher Haster
6c36f76ffd scripts: plot.py: Tweaked -H/--height to allow negative carve-outs
This finally solves the how-do-I-make-space-for-shell-prompts problem:

- plot.py -H0  => use full terminal height
- plot.py -H-1 => use height-1, making space for shell prompts
- plot.py -H   => automatic based on other flags

While also allowing other carveouts in case your prompt takes up more
than 1 line.

Unfortunately this does make -H (no arg) subtly different from -H0, but
sometimes you can't have everything.
2025-04-16 15:21:45 -05:00
Christopher Haster
5e8344a1c6 scripts: Renamed report -> main_ in perf.py/perfbd.py
This is a useful convention. It indicates these functions don't do
anything a main function wouldn't normally do.
2025-04-16 15:21:42 -05:00
Christopher Haster
59703f3b16 scripts: plot.py: Reorganized main -> main_ to isolate -k/--keep-open logic
This simplifies plot.py's -k/--keep-open logic into a self-contained
loop that just calls main_ on an update.

This is a compromise on getting rid of -k/--keep-open completely, since
we _could_ just rely on watch.py. But plot.py knowing which argument is
the file to watch is convenient.

The eventual plan is to adopt this small bit of copy-pastable-code in
the other ascii-art scripts (treemap.py, dbgbmap.py, etc).
2025-04-16 15:21:40 -05:00
Christopher Haster
9b03933f2d scripts: tracebd.py/dbgbmap.py: Made off range a subparser of block range
So:

- before: ./scripts/dbgbmap.py disk -b4096 -@0 -n16,32
- after:  ./scripts/dbgbmap.py disk -b4096 -@'0 -n16,32'

This is mainly to avoid the naming conflict between -n/--size and
-n/--lines, while also separating out the namespaces a bit.

It's probably not the most intuitive CLI UI, but --off and -n/--size are
probably infrequent arguments at this level of script anyways.
2025-04-16 15:21:37 -05:00
Christopher Haster
0dbd1561ae scripts: Fixed some issues with -k/--keep-open
- Fixed a NameError in watch.py caused by an outdated variable name
  (renamed paths -> keep_open_paths). Yay for dynamic typing.

- Fixed fieldnames is None issue when csv file is empty.
2025-04-16 15:21:27 -05:00
Christopher Haster
f3889d8932 scripts: Adopted Canvas class in plot.py
This should have no noticeable impact on plot.py, but shared classes
have proven helpful for maintaining these scripts.

Unfortunately, this did require some tweaking of the Canvas class to get
things working.

Now, instead of storing things in an internal high-resolution grid,
the Canvas class only keeps track of the most recent character, with
bitmasked ints storing sub-char info.

This makes it so sub-char draws overwrite full characters, which is
necessary for plot.py's axis/data overlap to work.
2025-03-12 21:23:16 -05:00
Christopher Haster
313696ecf9 scripts: Fixed openio issue where some scripts didn't import os
This only failed if "-" was used as an argument (for stdin/stdout), so
the issue was pretty hard to spot.

openio is a heavily copy-pasted function, so it makes sense to just add
the import os to openio directly. Otherwise this mistake will likely
happen again in the future.
2025-03-12 21:18:51 -05:00
Christopher Haster
b2646148c1 scripts: Tweaked -./-p/-P flags in ascii scripts
- -*/--add-char/--chars -> -./--add-char/--chars
- -./--points -> -p/--points
- -!/--points-and-lines -> -P/--points-and-lines

Also fixed an issue in plot.py/Attr where non-list default were failing
to concatenate.
2025-03-12 21:18:15 -05:00
Christopher Haster
ff803dfc0f scripts: Added vestigial -:/--dots flag to plot.py
This just makes it easier to jump between plot.py and plotmpl.py.
2025-03-12 21:13:40 -05:00
Christopher Haster
4ea710f62c scripts: Adopted % modifiers in all attr arguments
This turns out to be extremely useful, for the sole purpose of being
able to specify colors/formats/etc in csv fields (-C'%(fields)s' for
example, or -C'#%(field)06x' for a cooler example).

This is a bit tricky for --chars, but doable with a psplit helper
function.

Also fixed a bug in plot.py where we weren't using dataattrs_ correctly.
2025-03-12 21:12:12 -05:00
Christopher Haster
c60301719a scripts: Adopted dat tweak in other scripts
This just makes dat behave similarly to Python's getattr, etc:

- dat("bogus")       -> raises ValueError
- dat("bogus", 1234) -> returns 1234

This replaces try_dat, which is easy to forget about when copy-pasting
between scripts.

Though all of this wouldn't be necessary if only we could catch
exceptions in expressions...
2025-03-12 21:12:12 -05:00
Christopher Haster
861dc3bd6a scripts: csv.py: Added --help-mods to help explain % modifiers
I guess in addition to its other utilities, csv.py is now also turning
into a sort of man database for some of the more complicated APIs in the
scripts:

  ./csv.py --help
  ./csv.py --help-exprs
  ./csv.py --help-mods

It's a bit minimal, but better than nothing.

Also dropped the %c modifier because this never actually worked.
2025-03-12 19:10:17 -05:00
Christopher Haster
d90a8e87c4 scripts: Removed clearly unused isinf condition in dat parser 2025-03-11 18:50:06 -05:00
Christopher Haster
5f2ea77c42 scripts: plot[mpl].py: Reworked --add-xticklabel/yticklabel
This adopts the Attr rework for the --add-xticklabel and
--add-yticklabel flags.

Sort of.

These require a bit of special behavior to make work, but should at
least be externally consistent with the other Attr flags.

Instead of assigning to by-field groups, --add-xticklabel/yticklabel
assign to the relevant x/y coord:

  $ ./scripts/plotmpl.py \
          --add-xticklabel='0=zero' \
          --add-yticklabel='100=one-hundred'

The real power comes from our % modifiers. As a special case,
--add-xticklabel/yticklabel can reference the special x/y field, which
represents the current x/y coord:

  $ ./scripts/plotmpl.py --y2 --yticks=5 --add-yticklabel='%(y)d KiB'

Combined with format specifiers, this allows for quite a bit:

  $ ./scripts/plotmpl.py --y2 --yticks=5 --add-yticklabel='0x%(y)04x'

---

Note that plot.py only shows the min/max x/yticks, so plot.py only
accepts indexed --add-xticklabel/yticklabels, and will error if the
assigning variant is used.
2025-03-11 18:22:18 -05:00
Christopher Haster
86f3bad2a4 scripts: Adopted Attr rework in plot.py/plotmpl.py
Unifying these complicated attr-assigning flags across all the scripts
is the main benefit of the new internal Attr system.

The only tricky bit is we need to somehow keep track of all input fields
in case % modifiers reference fields, when we could previously discard
non-data fields.

Tricky but doable.

Updated flags:

- -L/--label -> -L/--add-label
- --colors -> -C/--add-color
- --formats -> -F/--add-format
- --chars -> -*/--add-char/--chars
- --line-chars -> -_/--add-line-char/--line-chars

I've also tweaked Attr to accept glob matches when figuring out group
assignments. This is useful for matching slightly different, but
similarly named results in our benchmark scripts.

There's probably a clever way to do this by injecting new by fields with
csv.py, but just adding globbing is simpler and makes attr assignment
even more flexible.
2025-03-11 18:09:18 -05:00
Christopher Haster
2135c6a003 scripts: Added treemapd3.py
Like treemap.py, but outputting an svg file, which is quite a bit more
useful.

Things svg is _not_:

- A simple vector graphics format

Things svg _is_:

- A surprisingly powerful high-level graphics language.

I might have to use svgs as an output format more often. It's
surprisingly easy to generate graphics without worrying about low-level
rendering details.

---

Aside from the extra flags for svg details like font, padding,
background colors, etc, the main difference between treemap.py and
treemapd3.py is the addition of the --nested mode, which renders a
containing tile for each recursive group (each -b/--by field).

There's no way --nested would've worked in treemap.py. The main benefit
is the extra labels per subgroup, which are already hard enough to read
in treemap.py.

Other than that, treemapd3.py is mostly the same as treemap.py, but with
a resolution that's actually readable.
2025-03-11 14:11:07 -05:00
Christopher Haster
d6c909e724 scripts: Added treemap.py
Based on the d3 javascript library (https://d3js.org), treemap.py
renders heirarchical data as ascii art:

  $ ./scripts/treemap.py lfs.code.csv \
          -bfunction -fsize --chars=asdf -W60 -H8
  total 65454, avg 369 +-366.8σ, min 3, max 4990
  aaaassssddddddaaaadddddssddfffaaadfffaassaassfasssdfdfsddfad
  aaaassssddddddaaaadddddssddfffaaadfffaassdfaafasssdfdfsddfsf
  aaaassssddddddaaaafffffssddfffsssdaaaddffdfaadfaaasdfafaasfa
  aaaassssddddddaaaafffffaaaddddsssaassddffdfaaffssfssfsfadffa
  aaaassssffffffssssfffffaaaddddsssaassssffddffffssfdffsadfsad
  aaaassssffffffssssaaaaasssffffddfaassssaaassdaaddadffsadadad
  aaaassssffffffssssaaaaasssffffddfddffddssassdfassadffsadaffa
  aaaassssffffffssssaaaaasssffffddfddffddssassdfaddsdadasfsada

(Normally this is also colored, but you know.)

I've been playing around with d3 to try to better visualize code costs
in littlefs, and it's been quite neat. I figured it would be useful to
directly integrate a similar treemap renderer into our result scripts.

That being said, this ascii rendering is probably too difficult to parse
for any non-trivial data. I'm also working on an svg-based renderer, so
treemap.py is really just for in-terminal previews and an exercise to
understand the underlying algorithms, similar to plot.py/plotmpl.py.
2025-03-11 14:10:21 -05:00
Christopher Haster
2b1738e6d1 scripts: Increased default sleep time to 2 seconds
You forget one script, running in the background, hogging a whole
core, and suddenly watch's default 2 second sleep time makes a lot more
sense...

One of the main motivators for watch.py _was_ for shorter sleep times,
short enough to render realtime animations (watch is limited to 0.1
seconds for some reason?), but this doesn't mean it needs to be the
default. This can still be accomplished by explicitly specifying
-s/--sleep, and we probably don't want the default to hog all the CPU.

The use case for fast sleeps has been mostly replaced by -k/--keep-open
anyways.

For tailpipe.py and tracebd.py it's a bit less clear, but we probably
don't need to be spamming open calls 10 times a second.
2025-02-13 15:51:36 -06:00
Christopher Haster
eae7665977 scripts: Adopted % for escape codes
This is what git --format does, and it's a clever way sidestep the
escape-hell that is bash sometimes.
2025-02-12 18:41:32 -06:00
Christopher Haster
995d942349 scripts: plot.py/plotmpl.py: Stop dropping unlabeled datasets
That was confusing.

The -L/--label flag is already tricky enough to get right. Allowing
-L/--label to filter datasets is counter-intuitive and just makes it
harder to debug things.
2025-02-11 02:50:38 -06:00
Christopher Haster
361cd3fec0 scripts: Added missing sys imports
Unfortunately the import sys in the argparse block was hiding missing
sys imports.

The mistake was assuming the import sys in Python would limit the scope
to that if block, but Python's late binding strikes again...
2025-01-28 14:41:45 -06:00
Christopher Haster
62cc4dbb14 scripts: Disabled local import hack on import
Moved local import hack behind if __name__ == "__main__"

These scripts aren't really intended to be used as python libraries.
Still, it's useful to import them for debugging and to get access to
their juicy internals.
2025-01-28 14:41:30 -06:00
Christopher Haster
02881faf6f scripts: Dropped field renames from plot.py/plotmpl.py/amor.py/avg.py
This is now inconsistent with csv.py, and I don't really want to add a
full expr parser to every script that might want to rename fields.

Field renaming (or any expr really!) can be accomplished with
intermediate calls to csv.py anyways. No reason to make these scripts
more complicated than they need to be.
2024-11-16 16:17:15 -06:00
Christopher Haster
7cfcc1af1d scripts: Renamed summary.py -> csv.py
This seems like a more fitting name now that this script has evolved
into more of a general purpose high-level CSV tool.

Unfortunately this does conflict with the standard csv module in Python,
breaking every script that imports csv (which is most of them).
Fortunately, Python is flexible enough to let us remove the current
directory before imports with a bit of an ugly hack:

  # prevent local imports
  __import__('sys').path.pop(0)

These scripts are intended to be standalone anyways, so this is probably
a good pattern to adopt.
2024-11-09 12:31:16 -06:00
Christopher Haster
007ac97bec scripts: Adopted double-indent on multiline expressions
This matches the style used in C, which is good for consistency:

  a_really_long_function_name(
          double_indent_after_first_newline(
              single_indent_nested_newlines))

We were already doing this for multiline control-flow statements, simply
because I'm not sure how else you could indent this without making
things really confusing:

  if a_really_long_function_name(
          double_indent_after_first_newline(
              single_indent_nested_newlines)):
      do_the_thing()

This was the only real difference style-wise between the Python code and
C code, so now both should be following roughly the same style (80 cols,
double-indent multiline exprs, prefix multiline binary ops, etc).
2024-11-06 15:31:17 -06:00
Christopher Haster
48c2e7784b scripts: Renamed import math alias m -> mt
Mainly to avoid conflicts with match results m, this frees up the single
letter variables m for other purposes.

Choosing a two letter alias was surprisingly difficult, but mt is nice
in that it somewhat matches it (for itertools) and ft (for functools).
2024-11-05 01:58:40 -06:00
Christopher Haster
a231bbac6e Added -^/--head to watch.py and other scripts, RingIO tweaks
This lets you view the first n lines of output instead of the last n
lines, as though the output was piped through head.

This is how the standard watch command works, and can be more useful
when most of the information is at the top, such as in our dbg*.py
scripts (watch.py was originally used as a sort of inotifywait-esque
build runner, which is the main reason it's different).

To make this work, RingIO (renamed from LinesIO) now uses terminal
height as a part of its canvas rendering. This has the added benefit of
more rigorously enforcing the canvas boundaries, but risks breaking when
not associated with a terminal. But that raises the question, does
RingIO even make sense without a terminal?

Worst case you can bypass all of this with -z/--cat.
2024-06-02 00:37:48 -05:00
Christopher Haster
b8d3a5ef46 Fixed inotify race conditions and fd leak in scripts
Since we were only registering our inotify reader after the previous
operation completed, it was easy to miss modifications that happened
faster than our scripts. Since our scripts are in Python, this happened
quite often and made it hard to trust the current state of scripts
with --keep-open, sort of defeating the purpose of --keep-open...

I think previously this race condition wasn't avoided because of the
potential to loop indefinitely if --keep-open referenced a file that the
script itself modified, but it's up to the user to avoid this if it is
an issue.

---

Also while fixing this, I noticed our use of the inotify_simple library
was leaking file descriptors everywhere! I just wasn't closing any
inotify objects at all. A bit concerning since scripts with --keep-open
can be quite long lived...
2023-12-06 22:23:38 -06:00
Christopher Haster
c3d7cbfb09 Changed how labels work in plot.py/plotmpl.py to actually be useable
Previously, any labeling was _technically_ possible, but tricky to get
right and usually required repeated renderings.

It evolved out of the way colors/formats were provided: a cycled
order-significant list that gets zipped with the datasets. This works
ok for somewhat arbitrary formatting, such as colors/formats, but falls
apart for labels, where it turns out to be somewhat important what
exactly you are labeling.

The new scheme makes the label's relationship explicit, at the cost of
being a bit more verbose:

  $ ./scripts/plotmpl.py bench.csv -obench.svg \
        -Linorder=0,4096,avg,bench_readed \
        -Lreversed=1,4096,avg,bench_readed \
        -Lrandom=2,4096,avg,bench_readed

This could also be adopted in the CSV manipulation scripts (code.py,
stack.py, summary.py, etc), but I don't think it would actually see that
much use. You can always awk the output to change names and it would add
more complexity to a set of scripts that are probably already way
over-designed.
2023-11-05 19:55:10 -06:00
Christopher Haster
1e4d4cfdcf Tried to write errors to stderr consistently in scripts 2023-11-05 15:55:07 -06:00
Christopher Haster
d0a6ef0c89 Changed scripts to not infer field purposes from CSV values
Note there's a bit of subtlety here, field _types_ are still infered,
but the intention of the fields, i.e. if the field contains data vs
row name/other properties, must be unambiguous in the scripts.

There is still a _tiny_ bit of inference. For most scripts only one
of --by or --fields is strictly needed, since this makes the purpose of
the other fields unambiguous.

The reason for this change is so the scripts are a bit more reliable,
but also because this simplifies the data parsing/inference a bit.

Oh, and this also changes field inference to use the csv.DictReader's
fieldnames field instead of only inspecting the returned dicts. This
should also save a bit of O(n) overhead when parsing CSV files.
2023-11-04 15:24:18 -05:00
Christopher Haster
0f93fa3057 Tweaked script field arg parsing to strip whitespace almost everywhere
The whitespace sensitivity of field args was starting to be a problem,
mostly for advanced plotmpl.py usage (which tbf might be appropriately
described as "super hacky" in how it uses CLI parameters):

  ./scripts/plotmpl.py \
      -Dcase=" \
          bench_rbyd_attr_append, \
          bench_rbyd_attr_remove, \
          bench_rbyd_attr_fetch, \
          ..."

This may present problems when parsing CSV files with whitespace, in
theory, maybe. But given the scope of these scripts for littlefs...
just don't do that. Thanks.
2023-11-03 15:03:46 -05:00