Commit Graph

23 Commits

Author SHA1 Message Date
Christopher Haster
0dbd1561ae scripts: Fixed some issues with -k/--keep-open
- Fixed a NameError in watch.py caused by an outdated variable name
  (renamed paths -> keep_open_paths). Yay for dynamic typing.

- Fixed fieldnames is None issue when csv file is empty.
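
For reference, a minimal sketch of the empty-csv case, not the actual
script code. Python's csv.DictReader leaves fieldnames as None when there
is no header row to read, so callers need a fallback. The read_rows helper
is a hypothetical name used only for illustration:

```python
import csv
import io

def read_rows(f):
    """Return (fieldnames, rows), treating an empty csv as having no fields."""
    reader = csv.DictReader(f)
    # DictReader.fieldnames is None when the file has no header row
    fieldnames = reader.fieldnames or []
    return fieldnames, list(reader)

# an empty file yields no fields and no rows instead of a None to trip over
fields, rows = read_rows(io.StringIO(''))
print(fields, rows)
```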
2025-04-16 15:21:27 -05:00
Christopher Haster
f3889d8932 scripts: Adopted Canvas class in plot.py
This should have no noticeable impact on plot.py, but shared classes
have proven helpful for maintaining these scripts.

Unfortunately, this did require some tweaking of the Canvas class to get
things working.

Now, instead of storing things in an internal high-resolution grid,
the Canvas class only keeps track of the most recent character, with
bitmasked ints storing sub-char info.

This makes it so sub-char draws overwrite full characters, which is
necessary for plot.py's axis/data overlap to work.
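
A rough sketch of the idea, assuming braille dots as the sub-char
encoding. This is not the real Canvas class: the class layout, method
names, and the DOTS table are all assumptions made for illustration. Each
cell holds either a str (a full character) or an int (a bitmask of
sub-char dots):

```python
class Canvas:
    """Hypothetical char grid, each cell is a str (full char) or an int
    bitmask of braille dots (sub-char info)."""
    # braille dot bits indexed by [sub-row][sub-column]
    DOTS = [[0x01, 0x08], [0x02, 0x10], [0x04, 0x20], [0x40, 0x80]]

    def __init__(self, width, height):
        self.width = width
        self.height = height
        self.grid = [[' ']*width for _ in range(height)]

    def char(self, x, y, c):
        # a full char always replaces whatever was in the cell
        self.grid[y][x] = c

    def point(self, x, y):
        # x,y are in sub-char coordinates, 2x4 dots per cell
        cx, cy = x // 2, y // 4
        cell = self.grid[cy][cx]
        # a sub-char draw overwrites a full char instead of merging with it
        mask = cell if isinstance(cell, int) else 0
        self.grid[cy][cx] = mask | self.DOTS[y % 4][x % 2]

    def draw(self):
        # render bitmasks as braille chars, leave full chars as they are
        return [''.join(chr(0x2800 + c) if isinstance(c, int) else c
                    for c in row)
                for row in self.grid]
```

With this layout, drawing a data point on top of an axis character
replaces the axis character in that cell, and later points in the same
cell accumulate in the bitmask.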
2025-03-12 21:23:16 -05:00
Christopher Haster
4df90dfa0a scripts: Added --squarify-ratio to treemap[d3].py/codemap[d3].py
Might as well. The internal algorithm already supports this.
2025-03-12 21:23:09 -05:00
Christopher Haster
313696ecf9 scripts: Fixed openio issue where some scripts didn't import os
This only failed if "-" was used as an argument (for stdin/stdout), so
the issue was pretty hard to spot.

openio is a heavily copy-pasted function, so it makes sense to just add
the import os to openio directly. Otherwise this mistake will likely
happen again in the future.
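
A sketch of what a self-contained openio could look like. The body here
is a reconstruction for illustration and may differ from the function in
the scripts. Note it still assumes sys is imported at module level:

```python
import sys

def openio(path, mode='r', buffering=-1):
    # importing here keeps the function self-contained when copy-pasted
    import os
    if path == '-':
        # dup the fd so closing the returned file leaves stdin/stdout open
        if 'r' in mode:
            return os.fdopen(os.dup(sys.stdin.fileno()), mode, buffering)
        else:
            return os.fdopen(os.dup(sys.stdout.fileno()), mode, buffering)
    else:
        return open(path, mode, buffering)
```

The os module is only touched on the "-" branch, which is why a missing
import can go unnoticed until someone pipes through stdin/stdout.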
2025-03-12 21:18:51 -05:00
Christopher Haster
b2646148c1 scripts: Tweaked -./-p/-P flags in ascii scripts
- -*/--add-char/--chars -> -./--add-char/--chars
- -./--points -> -p/--points
- -!/--points-and-lines -> -P/--points-and-lines

Also fixed an issue in plot.py/Attr where non-list defaults were failing
to concatenate.
2025-03-12 21:18:15 -05:00
Christopher Haster
5b5745bca9 scripts: treemap.py/codemap.py: Tweaked -L to imply -l
And added the optional --no-label to explicitly opt out.

This is a bit more consistent with treemapd3.py/codemapd3.py's handling
of labels, while still keeping the no-label default. It also makes it
easier to temporarily hide labels when editing commands.
2025-03-12 21:14:37 -05:00
Christopher Haster
59ffbde3ad scripts: treemap.py/codemap.py: Use parts of name for char defaults
So by default, instead of just using "." for tiles, we use interesting
parts of the tile's name:

- For treemap.py, we use the first character of the last by-field (so
  "lfs.c,lfsr_file_write,1234" -> "1").

- For codemap.py, we use the first character of the non-subsystem part
  of the function name (so "lfsr_file_write" -> "w").
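
The two defaults above can be sketched roughly like so. The function
names are hypothetical, and treating everything before the last
underscore as the subsystem is an assumption that happens to match the
example, not necessarily how codemap.py splits names:

```python
def treemap_char(name):
    # first character of the last comma-separated by-field
    return name.split(',')[-1][0]

def codemap_char(function):
    # assumes the subsystem is everything up to the last underscore
    return function.rsplit('_', 1)[-1][0]

print(treemap_char('lfs.c,lfsr_file_write,1234'))  # 1
print(codemap_char('lfsr_file_write'))             # w
```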

The nice thing about this is the resulting treemap is somewhat
understandable even without colors:

  $ ./scripts/codemap.py lfs.o lfs_util.o lfs.ci lfs_util.ci -W60 -H8
  code 35528 stack 2440 ctx 636
  ffffffoooffaaaaaaaaaaaacccccccccttttccccrrrrpgffmmrraifmmcss
  ffffffwwwttaaaaaaaaaaaacccccccccttttccccrprrpcscmmoommrrcepp
  ffffffwwwttaaaaaaaaalllcccccccccttttccccrpppccscmmsrmmrrrrss
  ccccssrrfclaaaaanneeasscccccccccgpppccccrpppsgsummstmmrrlfgf
  ccccssrrfccaaaaanneeaaaccccccsaagpppcccccrrrfrrcccrrfiiilucs
  ccccssrrtfcfffffaapplcccccccclssgnnllllcrrffrrrccccifssscmcm
  ccccssrrtrdfffffaapppapcccfffllsgnnllllcrrrffrrcccorfsssicnu

Ok, so maybe the word "somewhat" is doing a lot of heavy lifting...
2025-03-12 21:12:12 -05:00
Christopher Haster
3d53f5393d scripts: Added codemap.py
Like codemapd3.py, but with an ascii renderer.

This is basically just codemapd3.py and treemap.py smooshed together.
It's not the cleanest, but it gets the job done. codemap.py is not
the most critical of scripts.

Unfortunately callgraph and stack/ctx info are difficult (impossible?)
to render usefully in ascii, but we can at least do the script calling,
parsing, namespacing, etc, necessary to create the code cost tilemap.
2025-03-12 21:12:12 -05:00
Christopher Haster
4ea710f62c scripts: Adopted % modifiers in all attr arguments
This turns out to be extremely useful, for the sole purpose of being
able to specify colors/formats/etc in csv fields (-C'%(fields)s' for
example, or -C'#%(field)06x' for a cooler example).
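
As a rough illustration, these modifiers line up with Python's
printf-style formatting against a dict. The sketch below is not the
scripts' implementation: fmt_attr, the coerce helper, and the example
row are made up, and the int coercion is an assumption needed so that
numeric specifiers like %x work on csv strings:

```python
def fmt_attr(template, fields):
    """Expand a %-style template against one csv row."""
    def coerce(v):
        # csv fields arrive as strings, numeric specifiers need numbers
        try:
            return int(v, 0)
        except ValueError:
            return v
    return template % {k: coerce(v) for k, v in fields.items()}

row = {'name': 'lfsr_file_write', 'color': '0xff8800'}
print(fmt_attr('#%(color)06x', row))  # #ff8800
print(fmt_attr('%(name)s', row))      # lfsr_file_write
```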

This is a bit tricky for --chars, but doable with a psplit helper
function.

Also fixed a bug in plot.py where we weren't using dataattrs_ correctly.
2025-03-12 21:12:12 -05:00
Christopher Haster
3d355d7783 scripts: Added -t/--tiny to treemap.py
Even though I think this makes less sense for the ascii-rendering
scripts, it's useful to have this flag around when jumping between
treemap.py and treemapd3.py.

And it might actually make sense sometimes now that -t/--tiny does not
override --to-scale.
2025-03-12 21:12:12 -05:00
Christopher Haster
c60301719a scripts: Adopted dat tweak in other scripts
This just makes dat behave similarly to Python's getattr, etc:

- dat("bogus")       -> raises ValueError
- dat("bogus", 1234) -> returns 1234

This replaces try_dat, which is easy to forget about when copy-pasting
between scripts.

Though all of this wouldn't be necessary if only we could catch
exceptions in expressions...
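
A minimal sketch of the getattr-like behavior, assuming dat parses ints
(with base prefixes) and then floats. The real parser in the scripts may
accept more or less than this:

```python
def dat(x, *default):
    """Parse a number, optionally falling back to a default like getattr."""
    try:
        return int(x, 0)
    except ValueError:
        pass
    try:
        return float(x)
    except ValueError:
        # only swallow the error if the caller provided a default
        if default:
            return default[0]
        raise

print(dat('0x10'))           # 16
print(dat('bogus', 1234))    # 1234
```

Using *default instead of default=None means None itself can be passed as
an explicit default, the same trick getattr uses.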
2025-03-12 21:12:12 -05:00
Christopher Haster
e780fd40f7 scripts: Added codemapd3.py
Inspired heavily by d3 and brendangregg's flamegraphs, codemapd3.py is
intended to be a powerful high-level code exploring tool.

It's a visual tool, so probably best explained visually:

  $ CFLAGS='-DLFS_NO_LOG -DLFS_NO_ASSERT' make -j
  $ ./scripts/codemapd3.py \
          lfs.o lfs_util.o \
          lfs.ci lfs_util.ci \
          -otest.svg -W1500 -H700 --dark
  updated test.svg, code 35528 stack 2440 ctx 636

And open test.svg in a browser of your choice.

(TODO add a make rule for this)

---

Features include:

- Rendering of code cost in a treemap organized by subsystem (based on
  underscore-separated namespaces), making it relatively easy to see
  where the bulk of our code cost comes from.

- Rendering of the deepest stack/ctx cost as a set of tiles, making it
  relatively easy to see where the bulk of our stack cost comes from.

- Interactive (on mouseover) rendering of callgraph info, showing
  dependencies and relevant stack/ctx costs per-function.

  This currently includes 4 modes:

  1. mode-callgraph - This shows the full callgraph, including all
     children's children, which is effectively all dependencies of that
     function, i.e. the total code cost necessary for that _specific_
     function to work.

  2. mode-deepest - This shows the deepest/hot path of calls from that
     function, which is every child that contributes to the function's
     stack cost.

  3. mode-callees - This shows all functions the current function
     immediately calls.

  4. mode-callers - This shows all functions that call the current
     function.

  And yes, cycles are handled correctly: We show the deepest
  non-cyclical path, but display the measured stack usage as infinite.
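
  One way to read the cycle handling, as a hedged sketch rather than the
  actual codemapd3.py logic: walk the callgraph while skipping back
  edges to find the deepest non-cyclical path, and separately remember
  whether a cycle was seen so the limit can be reported as infinite. The
  frames and calls dicts and both function names are hypothetical:

```python
import math

def deepest(name, frames, calls, seen=()):
    """Return (cost, path, cyclic) for the deepest non-cyclical chain."""
    cost, path, cyclic = 0, [], False
    for callee in calls.get(name, []):
        if callee in seen + (name,):
            # back edge: skip it for the path, but remember the cycle
            cyclic = True
            continue
        c, p, cy = deepest(callee, frames, calls, seen + (name,))
        cyclic = cyclic or cy
        if c > cost:
            cost, path = c, p
    return frames[name] + cost, [name] + path, cyclic

def stack_limit(name, frames, calls):
    """Return (limit, path), with an infinite limit if recursion is reachable."""
    cost, path, cyclic = deepest(name, frames, calls)
    return (math.inf if cyclic else cost), path
```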

For more details see ./scripts/codemapd3.py --help.

---

One particularly neat feature I'm happy about is -t/--tiny, which scales
the resulting image such that 1 pixel ~= 1 byte. This should be useful
for comparing littlefs to other filesystems in a way that is visually
interesting.

- d3 - https://d3js.org
- brendangregg's flamegraphs - https://github.com/brendangregg/FlameGraph
2025-03-12 21:08:23 -05:00
Christopher Haster
fb03e27baf scripts: Added --no-stats to treemap.py/treemapd3.py
The previous behavior of -N/--no-header still rendering a header when
--title is also provided was confusing. I think this is a better API,
at the minor cost of needing to pass one more flag if you don't want
stats in the header.
2025-03-12 20:07:27 -05:00
Christopher Haster
861dc3bd6a scripts: csv.py: Added --help-mods to help explain % modifiers
I guess in addition to its other utilities, csv.py is now also turning
into a sort of man database for some of the more complicated APIs in the
scripts:

  ./csv.py --help
  ./csv.py --help-exprs
  ./csv.py --help-mods

It's a bit minimal, but better than nothing.

Also dropped the %c modifier because this never actually worked.
2025-03-12 19:10:17 -05:00
Christopher Haster
d90a8e87c4 scripts: Removed clearly unused isinf condition in dat parser
2025-03-11 18:50:06 -05:00
Christopher Haster
5f2ea77c42 scripts: plot[mpl].py: Reworked --add-xticklabel/yticklabel
This adopts the Attr rework for the --add-xticklabel and
--add-yticklabel flags.

Sort of.

These require a bit of special behavior to make work, but should at
least be externally consistent with the other Attr flags.

Instead of assigning to by-field groups, --add-xticklabel/yticklabel
assign to the relevant x/y coord:

  $ ./scripts/plotmpl.py \
          --add-xticklabel='0=zero' \
          --add-yticklabel='100=one-hundred'

The real power comes from our % modifiers. As a special case,
--add-xticklabel/yticklabel can reference the special x/y field, which
represents the current x/y coord:

  $ ./scripts/plotmpl.py --y2 --yticks=5 --add-yticklabel='%(y)d KiB'

Combined with format specifiers, this allows for quite a bit:

  $ ./scripts/plotmpl.py --y2 --yticks=5 --add-yticklabel='0x%(y)04x'

---

Note that plot.py only shows the min/max x/yticks, so plot.py only
accepts indexed --add-xticklabel/yticklabels, and will error if the
assigning variant is used.
2025-03-11 18:22:18 -05:00
Christopher Haster
86f3bad2a4 scripts: Adopted Attr rework in plot.py/plotmpl.py
Unifying these complicated attr-assigning flags across all the scripts
is the main benefit of the new internal Attr system.

The only tricky bit is we need to somehow keep track of all input fields
in case % modifiers reference fields, when we could previously discard
non-data fields.

Tricky but doable.

Updated flags:

- -L/--label -> -L/--add-label
- --colors -> -C/--add-color
- --formats -> -F/--add-format
- --chars -> -*/--add-char/--chars
- --line-chars -> -_/--add-line-char/--line-chars

I've also tweaked Attr to accept glob matches when figuring out group
assignments. This is useful for matching slightly different, but
similarly named results in our benchmark scripts.

There's probably a clever way to do this by injecting new by fields with
csv.py, but just adding globbing is simpler and makes attr assignment
even more flexible.
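
A sketch of glob-based group matching using Python's fnmatch. Matching
comma-separated patterns against a prefix of the by-field key is an
assumption here, and match_group and the example names are hypothetical:

```python
import fnmatch

def match_group(pattern, key):
    """Check a comma-separated group pattern against a by-field key."""
    pats = pattern.split(',')
    # each pattern must glob-match the by-field in the same position
    return (len(pats) <= len(key)
            and all(fnmatch.fnmatchcase(k, p) for p, k in zip(pats, key)))

print(match_group('bench_*', ('bench_read',)))                # True
print(match_group('lfs.c,lfsr_*', ('lfs.c', 'lfsr_format')))  # True
print(match_group('lfs.c', ('lfs_util.c', 'lfs_crc')))        # False
```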
2025-03-11 18:09:18 -05:00
Christopher Haster
8b04e35ea5 scripts: Tweaked how Attr handles indexed attrs
No more special indexed attrs at the top-level, now all attrs are
indexed, even if assigned to a specific group.

This just makes it so group-specific cycles are possible:

  $ ./scripts/treemap.py -Clfs.c=red -Clfs.c=green
2025-03-11 18:07:09 -05:00
Christopher Haster
baa1a1b3a8 scripts: treemap[d3].py: Implemented more flexible labeling/coloring system
Now, instead of specifying a specific field or comma-separated set of
order-defined constants, -L/--add-label, -C/--add-color, and
-./--add-char/--chars accept a by-field group assignment similar to
-L/--label in plotmpl.py.

I also reworked our % modifiers to behave a bit more like printf
modifiers with optional field targets.

It gets a bit complicated, but this ends up extremely flexible:

- Assign to a specific group:

    $ ./scripts/treemap.py -Clfs.c,lfsr_format=orange

- Note this is hierarchical, with more specific groups taking priority:

    $ ./scripts/treemap.py -Clfs.c=blue -Clfs.c,lfsr_format=orange

- We can still get the order-assigned behavior by specifying multiple
  options, but note there is no longer a comma ambiguity! This is useful
  if you want to specify a palette and don't care which dataset gets
  which attr:

    $ ./scripts/treemap.py -Cred -Cgreen -Cblue

- Mix and match:

    $ ./scripts/treemap.py -Cred -Cgreen -Cblue -Clfsr_format=orange

- And with the new % modifiers, we can still use labels stored in a
  field:

    $ ./scripts/treemap.py -L'%(label_field)s'

- -./--add-char/--chars in treemap.py is a bit of a special case. Since
  it only accepts single characters, we can still accept multiple
  options with a single flag without having to worry about ambiguities:

    $ ./scripts/treemap.py -.asdf

  Well, unless you want to include a literal '='. This is possible, but
  a bit messy:

    $ ./scripts/treemap.py -.as -.=== -.df

  Yes that is 3 equal signs... One for argparse, one for the assignment,
  one for the '=' literal.

  This one is minor, but nice for terseness.
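
To make the priority rules concrete, here is a rough sketch of the
lookup. It is not the real Attr class: parse_attrs and lookup are
hypothetical names, group matching is assumed to be an exact prefix of
the by-field key, and the literal '=' special case is not handled:

```python
def parse_attrs(args):
    """Split attr args into order-assigned values and group assignments."""
    ordered, grouped = [], {}
    for arg in args:
        group, sep, value = arg.partition('=')
        if sep:
            # repeated assignments to one group form a per-group cycle
            grouped.setdefault(tuple(group.split(',')), []).append(value)
        else:
            ordered.append(arg)
    return ordered, grouped

def lookup(ordered, grouped, key, i):
    """Find the attr for the i-th dataset with the given by-field key."""
    # try the most specific prefix of the key first
    for n in range(len(key), 0, -1):
        values = grouped.get(tuple(key[:n]))
        if values:
            return values[i % len(values)]
    # otherwise fall back to cycling through the order-assigned attrs
    return ordered[i % len(ordered)] if ordered else None

ordered, grouped = parse_attrs(
        ['red', 'green', 'blue', 'lfs.c=blue', 'lfs.c,lfsr_format=orange'])
print(lookup(ordered, grouped, ('lfs.c', 'lfsr_format'), 0))   # orange
print(lookup(ordered, grouped, ('lfs.c', 'lfsr_mount'), 0))    # blue
print(lookup(ordered, grouped, ('lfs_util.c', 'lfs_crc'), 1))  # green
```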
2025-03-11 17:29:58 -05:00
Christopher Haster
6a6b74d631 scripts: treemap[d3].py: Show redundant datasets as redundant tiles
A painful lesson learned from plot[mpl].py: we should never implicitly
sum results in a late-stage rendering script. It just makes it way too
easy to accidentally render incorrect/misleading data, while being
difficult to notice.

We should always render redundant results as redundant results.

If the redundant results are an error, this hopefully makes the problem
more obvious to the user. And if the user really does want summed
results, they can always use csv.py as an intermediate step:

  $ ./scripts/treemap.py \
          <(./scripts/csv.py lfs.code.csv -bfile -fsize -q -o-) \
          -fsize
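
Conceptually, the intermediate step is an explicit group-by-and-sum over
the by-fields. A minimal sketch of that operation, not csv.py's actual
implementation (sum_by and the example rows are made up):

```python
import collections

def sum_by(rows, by, field):
    """Explicitly sum an integer field over rows sharing the same by-fields."""
    totals = collections.defaultdict(int)
    for row in rows:
        totals[tuple(row[k] for k in by)] += int(row[field])
    return dict(totals)

rows = [
    {'file': 'lfs.c', 'function': 'f', 'size': '10'},
    {'file': 'lfs.c', 'function': 'g', 'size': '5'},
    {'file': 'lfs_util.c', 'function': 'h', 'size': '1'},
]
print(sum_by(rows, ['file'], 'size'))
```

Keeping this as a separate, visible step means the renderer never has to
guess whether duplicate keys are intentional.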
2025-03-11 15:19:14 -05:00
Christopher Haster
1c92b7e892 scripts: treemap[d3].py: Squared --squarify, added --rectify
This adds --rectify for a parent-aspect-ratio-preserving --squarify
variant, reverting squarify to try to match the aspect ratio of a
square (1:1).

I can see arguments for both of these. On one hand --squarify makes the
squarest squares, which according to Mark Bruls et al's paper on the
topic is easier to visually compare. On the other hand --rectify may be
more visually pleasing and fit into parent tiles better.

d3 allows for any ratio, but at the moment I'm not seeing a strong
reason for the extra parameter.
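
For context, a compact sketch of the 1:1 squarify heuristic as described
in the Bruls et al paper. This is not the code in treemap[d3].py, and
the function names and tuple layout are assumptions. It greedily grows a
row of tiles along the shorter side while the worst aspect ratio in the
row keeps improving:

```python
def worst(row, side):
    """Worst aspect ratio in a row of areas laid along a side of given length."""
    s = sum(row)
    return max(max(side*side*r/(s*s), s*s/(side*side*r)) for r in row)

def squarify(areas, x, y, w, h):
    """Lay out areas (sorted descending, summing to w*h) as (x, y, w, h) tiles."""
    tiles = []
    areas = list(areas)
    while areas:
        side = min(w, h)
        # grow the row while adding a tile does not worsen the aspect ratio
        row = [areas.pop(0)]
        while areas and worst(row + [areas[0]], side) <= worst(row, side):
            row.append(areas.pop(0))
        thick = sum(row) / side
        off = 0
        for r in row:
            length = r / thick
            if w >= h:
                # vertical strip on the left, tiles stacked top to bottom
                tiles.append((x, y+off, thick, length))
            else:
                # horizontal strip on the top, tiles placed left to right
                tiles.append((x+off, y, length, thick))
            off += length
        # shrink the remaining rectangle by the strip we just placed
        if w >= h:
            x, w = x + thick, w - thick
        else:
            y, h = y + thick, h - thick
    return tiles

for tile in squarify([6, 6, 4, 3, 2, 2, 1], 0, 0, 6, 4):
    print(tile)
```

A ratio-targeting variant would presumably compare against a ratio other
than 1 inside worst, but that is speculation and not shown here.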
2025-03-11 15:19:04 -05:00
Christopher Haster
2135c6a003 scripts: Added treemapd3.py
Like treemap.py, but outputting an svg file, which is quite a bit more
useful.

Things svg is _not_:

- A simple vector graphics format

Things svg _is_:

- A surprisingly powerful high-level graphics language.

I might have to use svgs as an output format more often. It's
surprisingly easy to generate graphics without worrying about low-level
rendering details.

---

Aside from the extra flags for svg details like font, padding,
background colors, etc, the main difference between treemap.py and
treemapd3.py is the addition of the --nested mode, which renders a
containing tile for each recursive group (each -b/--by field).

There's no way --nested would've worked in treemap.py. The main benefit
is the extra labels per subgroup, which are already hard enough to read
in treemap.py.

Other than that, treemapd3.py is mostly the same as treemap.py, but with
a resolution that's actually readable.
2025-03-11 14:11:07 -05:00
Christopher Haster
d6c909e724 scripts: Added treemap.py
Based on the d3 javascript library (https://d3js.org), treemap.py
renders hierarchical data as ascii art:

  $ ./scripts/treemap.py lfs.code.csv \
          -bfunction -fsize --chars=asdf -W60 -H8
  total 65454, avg 369 +-366.8σ, min 3, max 4990
  aaaassssddddddaaaadddddssddfffaaadfffaassaassfasssdfdfsddfad
  aaaassssddddddaaaadddddssddfffaaadfffaassdfaafasssdfdfsddfsf
  aaaassssddddddaaaafffffssddfffsssdaaaddffdfaadfaaasdfafaasfa
  aaaassssddddddaaaafffffaaaddddsssaassddffdfaaffssfssfsfadffa
  aaaassssffffffssssfffffaaaddddsssaassssffddffffssfdffsadfsad
  aaaassssffffffssssaaaaasssffffddfaassssaaassdaaddadffsadadad
  aaaassssffffffssssaaaaasssffffddfddffddssassdfassadffsadaffa
  aaaassssffffffssssaaaaasssffffddfddffddssassdfaddsdadasfsada

(Normally this is also colored, but you know.)

I've been playing around with d3 to try to better visualize code costs
in littlefs, and it's been quite neat. I figured it would be useful to
directly integrate a similar treemap renderer into our result scripts.

That being said, this ascii rendering is probably too difficult to parse
for any non-trivial data. I'm also working on an svg-based renderer, so
treemap.py is really just for in-terminal previews and an exercise to
understand the underlying algorithms, similar to plot.py/plotmpl.py.
2025-03-11 14:10:21 -05:00