This adopts the Attr rework for the --add-xticklabel and
--add-yticklabel flags.
Sort of.
These require a bit of special behavior to make work, but should at
least be externally consistent with the other Attr flags.
Instead of assigning to by-field groups, --add-xticklabel/yticklabel
assign to the relevant x/y coord:
$ ./scripts/plotmpl.py \
--add-xticklabel='0=zero' \
--add-yticklabel='100=one-hundred'
The real power comes from our % modifiers. As a special case,
--add-xticklabel/yticklabel can reference the special x/y field, which
represents the current x/y coord:
$ ./scripts/plotmpl.py --y2 --yticks=5 --add-yticklabel='%(y)d KiB'
Combined with format specifiers, this allows for quite a bit:
$ ./scripts/plotmpl.py --y2 --yticks=5 --add-yticklabel='0x%(y)04x'
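For anyone unfamiliar with the syntax, these % modifiers follow Python's printf-style formatting with named fields. A minimal sketch of the idea, assuming the tick's coordinate is exposed under the name y (the ticklabel helper is hypothetical, not the actual script code):

```python
# Hypothetical helper: render a tick label from a printf-style template,
# exposing the tick's coordinate as the named field "y".
def ticklabel(template, y):
    return template % {'y': y}

# the two templates from the examples above, applied to a few ticks
kib_labels = [ticklabel('%(y)d KiB', y) for y in (0, 256, 512)]
hex_labels = [ticklabel('0x%(y)04x', y) for y in (0, 256, 512)]

print(kib_labels)
print(hex_labels)
```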
---
Note that plot.py only shows the min/max x/yticks, so plot.py only
accepts indexed --add-xticklabel/yticklabels, and will error if the
assigning variant is used.
Unifying these complicated attr-assigning flags across all the scripts
is the main benefit of the new internal Attr system.
The only tricky bit is we need to somehow keep track of all input fields
in case % modifiers reference fields, when we could previously discard
non-data fields.
Tricky but doable.
Updated flags:
- -L/--label -> -L/--add-label
- --colors -> -C/--add-color
- --formats -> -F/--add-format
- --chars -> -*/--add-char/--chars
- --line-chars -> -_/--add-line-char/--line-chars
I've also tweaked Attr to accept glob matches when figuring out group
assignments. This is useful for matching slightly different, but
similarly named results in our benchmark scripts.
There's probably a clever way to do this by injecting new by fields with
csv.py, but just adding globbing is simpler and makes attr assignment
even more flexible.
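A rough sketch of what glob-based group matching can look like, using Python's fnmatch module. The match_attr helper and the first-match-wins rule are assumptions for illustration, not necessarily how Attr resolves matches:

```python
import fnmatch

# Hypothetical helper: return the value of the first assignment whose
# glob pattern matches the group name, or None if nothing matches.
def match_attr(assignments, name):
    for pattern, value in assignments:
        # fnmatchcase is case-sensitive on every platform
        if fnmatch.fnmatchcase(name, pattern):
            return value
    return None

# made-up assignments, as if given by -Clfs_*=blue -Clfsr_*=orange
assignments = [('lfs_*', 'blue'), ('lfsr_*', 'orange')]

print(match_attr(assignments, 'lfs_mount'))
print(match_attr(assignments, 'lfsr_format'))
```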
No more special indexed attrs at the top level; now all attrs are
indexed, even if assigned to a specific group.
This just makes it so group-specific cycles are possible:
$ ./scripts/treemap.py -Clfs.c=red -Clfs.c=green
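Roughly, repeated assignments to the same group build up a list that datasets in that group cycle through. A minimal sketch, where the dataset names are made up and the data structures are assumptions rather than the real Attr internals:

```python
import itertools

# as if given by -Clfs.c=red -Clfs.c=green
assignments = [('lfs.c', 'red'), ('lfs.c', 'green')]

# collect every value assigned to a group, in order
cycles = {}
for group, color in assignments:
    cycles.setdefault(group, []).append(color)

# three hypothetical datasets that all belong to the lfs.c group
datasets = [
    ('lfs.c', 'lfs_mount'),
    ('lfs.c', 'lfs_format'),
    ('lfs.c', 'lfs_file_read'),
]

# each group cycles through its own colors independently
iters = {group: itertools.cycle(cs) for group, cs in cycles.items()}
colors = [next(iters[group]) for group, _ in datasets]

print(colors)
```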
Now, instead of specifying a specific field or comma-separated set of
order-defined constants, -L/--add-label, -C/--add-color, and
-./--add-char/--chars accept a by-field group assignment similar to
-L/--label in plotmpl.py.
I also reworked our % modifiers to behave a bit more like printf
modifiers with optional field targets.
It gets a bit complicated, but this ends up extremely flexible:
- Assign to a specific group:
$ ./scripts/treemap.py -Clfs.c,lfsr_format=orange
- Note this is hierarchical, with more specific groups taking priority:
$ ./scripts/treemap.py -Clfs.c=blue -Clfs.c,lfsr_format=orange
- We can still get the order-assigned behavior by specifying multiple
options, but note there is no longer a comma ambiguity! This is useful
if you want to specify a palette and don't care which dataset gets
which attr:
$ ./scripts/treemap.py -Cred -Cgreen -Cblue
- Mix and match:
$ ./scripts/treemap.py -Cred -Cgreen -Cblue -Clfsr_format=orange
- And with the new % modifiers, we can still use labels stored in a
field:
$ ./scripts/treemap.py -L'%(label_field)s'
- -./--add-char/--chars in treemap.py is a bit of a special case. Since
it only accepts single characters, we can still accept multiple
options with a single flag without having to worry about ambiguities:
$ ./scripts/treemap.py -.asdf
Well, unless you want to include a literal '='. This is possible, but
a bit messy:
$ ./scripts/treemap.py -.as -.=== -.df
Yes that is 3 equal signs... One for argparse, one for the assignment,
one for the '=' literal.
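The "more specific groups take priority" rule can be sketched as a longest-match lookup. This assumes, as a simplification, that a group matches when it is a prefix of the dataset's by-field values; pick_color is a hypothetical name and the real matching may differ:

```python
# Hypothetical helper: pick the color of the most specific matching group.
# assignments: list of (group_tuple, color)
# key: tuple of the dataset's by-field values
def pick_color(assignments, key):
    best = None
    for group, color in assignments:
        # simplifying assumption: a group matches if it prefixes the key
        if key[:len(group)] == group:
            # longer (more specific) groups win over shorter ones
            if best is None or len(group) > len(best[0]):
                best = (group, color)
    return best[1] if best else None

# as if given by -Clfs.c=blue -Clfs.c,lfsr_format=orange
assignments = [
    (('lfs.c',), 'blue'),
    (('lfs.c', 'lfsr_format'), 'orange'),
]

print(pick_color(assignments, ('lfs.c', 'lfsr_format')))
print(pick_color(assignments, ('lfs.c', 'lfsr_mount')))
```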
This one is minor, but nice for terseness.
A painful lesson learned from plot[mpl].py: we should never implicitly
sum results in a late-stage rendering script. It just makes it way too
easy to accidentally render incorrect/misleading data, while being
difficult to notice.
We should always render redundant results as redundant results.
If the redundant results are an error, this hopefully makes the problem
more obvious to the user. And if the user really does want summed
results, they can always use csv.py as an intermediate step:
$ ./scripts/treemap.py \
<(./scripts/csv.py lfs.code.csv -bfile -fsize -q -o-) \
-fsize
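For reference, the kind of explicit by-field summing that intermediate step is meant to provide can be sketched in plain Python. This is not csv.py's implementation, and the rows below are made up for illustration:

```python
from collections import defaultdict

# made-up results with redundant entries per file
rows = [
    {'file': 'lfs.c', 'function': 'lfs_mount', 'size': 100},
    {'file': 'lfs.c', 'function': 'lfs_format', 'size': 50},
    {'file': 'lfs_util.c', 'function': 'lfs_crc', 'size': 20},
]

# Hypothetical helper: sum one field, grouped by another field.
def sum_by(rows, by, field):
    totals = defaultdict(int)
    for row in rows:
        totals[row[by]] += row[field]
    return dict(totals)

summed = sum_by(rows, 'file', 'size')
print(summed)
```

The point is that the summing is something the user asks for, not something the renderer does silently.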
This adds --rectify for a parent-aspect-ratio-preserving --squarify
variant, reverting squarify to try to match the aspect ratio of a
square (1:1).
I can see arguments for both of these. On one hand --squarify makes the
squarest squares, which according to Mark Bruls et al.'s paper on the
topic are easier to visually compare. On the other hand --rectify may be
more visually pleasing and fit into parent tiles better.
d3 allows for any ratio, but at the moment I'm not seeing a strong
reason for the extra parameter.
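The difference between the two modes comes down to which aspect ratio a tile is scored against. A minimal sketch of that scoring, assuming a simple "how far from the target ratio" metric (ratio_error is a hypothetical helper, not the metric used in the script):

```python
# Hypothetical helper: how far a w x h tile is from a target aspect
# ratio, where 1.0 means a perfect match and larger is worse.
def ratio_error(w, h, target):
    r = (w / h) / target
    return max(r, 1 / r)

parent_w, parent_h = 60, 8

# --squarify aims for 1:1, --rectify aims for the parent's ratio
squarify_target = 1.0
rectify_target = parent_w / parent_h

# a tile shaped like the parent scores perfectly under --rectify,
# a square tile scores perfectly under --squarify
print(ratio_error(15, 2, squarify_target), ratio_error(15, 2, rectify_target))
print(ratio_error(4, 4, squarify_target), ratio_error(4, 4, rectify_target))
```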
Like treemap.py, but outputting an svg file, which is quite a bit more
useful.
Things svg is _not_:
- A simple vector graphics format
Things svg _is_:
- A surprisingly powerful high-level graphics language.
I might have to use svgs as an output format more often. It's
surprisingly easy to generate graphics without worrying about low-level
rendering details.
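To give a sense of how little is needed, here is a minimal sketch that emits one rect per tile, with a title element acting as a hover tooltip. The tile layout, colors, and the render_svg name are made up; this is not the script's actual output code:

```python
import xml.etree.ElementTree as ET
from xml.sax.saxutils import escape, quoteattr

# Hypothetical helper: render already-laid-out tiles as an svg string.
def render_svg(tiles, width, height):
    out = ['<svg xmlns="http://www.w3.org/2000/svg" '
           'viewBox="0 0 %d %d">' % (width, height)]
    for t in tiles:
        out.append(
            '<rect x="%d" y="%d" width="%d" height="%d" fill=%s>'
            '<title>%s</title></rect>' % (
                t['x'], t['y'], t['w'], t['h'],
                # quoteattr adds the surrounding quotes itself
                quoteattr(t['color']), escape(t['label'])))
    out.append('</svg>')
    return '\n'.join(out)

# made-up tiles, positions are assumed to come from a layout step
tiles = [
    {'x': 0, 'y': 0, 'w': 60, 'h': 50, 'color': 'red',
     'label': 'lfs_mount'},
    {'x': 60, 'y': 0, 'w': 40, 'h': 50, 'color': 'green',
     'label': 'lfs_format'},
]

svg = render_svg(tiles, 100, 50)
# parse the result back to confirm it is well-formed xml
root = ET.fromstring(svg)
print(svg)
```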
---
Aside from the extra flags for svg details like font, padding,
background colors, etc., the main difference between treemap.py and
treemapd3.py is the addition of the --nested mode, which renders a
containing tile for each recursive group (each -b/--by field).
There's no way --nested would've worked in treemap.py. Its main benefit
is the extra label on each subgroup, and labels are already hard enough
to read in treemap.py.
Other than that, treemapd3.py is mostly the same as treemap.py, but with
a resolution that's actually readable.
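The recursive grouping behind --nested can be sketched as one level of nesting per by field. The nest helper and the rows are hypothetical, and leaves keep every result as-is rather than summing:

```python
# Hypothetical helper: group rows recursively, one level per by field.
# Leaves are lists of raw values, so redundant results stay redundant.
def nest(rows, by, field):
    if not by:
        return [row[field] for row in rows]
    groups = {}
    for row in rows:
        groups.setdefault(row[by[0]], []).append(row)
    return {key: nest(group, by[1:], field)
            for key, group in groups.items()}

# made-up results
rows = [
    {'file': 'lfs.c', 'function': 'lfs_mount', 'size': 100},
    {'file': 'lfs.c', 'function': 'lfs_format', 'size': 50},
    {'file': 'lfs_util.c', 'function': 'lfs_crc', 'size': 20},
]

# as if given by -bfile -bfunction -fsize
tree = nest(rows, ['file', 'function'], 'size')
print(tree)
```

Each dict level would then get its own containing tile, with the leaves laid out inside it.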
Based on the d3 javascript library (https://d3js.org), treemap.py
renders hierarchical data as ascii art:
$ ./scripts/treemap.py lfs.code.csv \
-bfunction -fsize --chars=asdf -W60 -H8
total 65454, avg 369 +-366.8σ, min 3, max 4990
aaaassssddddddaaaadddddssddfffaaadfffaassaassfasssdfdfsddfad
aaaassssddddddaaaadddddssddfffaaadfffaassdfaafasssdfdfsddfsf
aaaassssddddddaaaafffffssddfffsssdaaaddffdfaadfaaasdfafaasfa
aaaassssddddddaaaafffffaaaddddsssaassddffdfaaffssfssfsfadffa
aaaassssffffffssssfffffaaaddddsssaassssffddffffssfdffsadfsad
aaaassssffffffssssaaaaasssffffddfaassssaaassdaaddadffsadadad
aaaassssffffffssssaaaaasssffffddfddffddssassdfassadffsadaffa
aaaassssffffffssssaaaaasssffffddfddffddssassdfaddsdadasfsada
(Normally this is also colored, but you know.)
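To illustrate the character-cycling part of the output above, here is a one-dimensional sketch that splits a single row proportionally between sizes. Note this is a plain slice layout, not the squarify algorithm the script uses, and slice_row is a hypothetical name:

```python
# Hypothetical helper: split one row of `width` characters between
# sizes, proportionally, cycling through the available chars.
def slice_row(sizes, chars, width):
    total = sum(sizes)
    row = []
    acc = 0
    x0 = 0
    for i, size in enumerate(sizes):
        acc += size
        # round the cumulative boundary so the widths always add up
        x1 = round(acc * width / total)
        row.append(chars[i % len(chars)] * (x1 - x0))
        x0 = x1
    return ''.join(row)

row = slice_row([4, 2, 2], 'asdf', 8)
print(row)
```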
I've been playing around with d3 to try to better visualize code costs
in littlefs, and it's been quite neat. I figured it would be useful to
directly integrate a similar treemap renderer into our result scripts.
That being said, this ascii rendering is probably too difficult to parse
for any non-trivial data. I'm also working on an svg-based renderer, so
treemap.py is really just for in-terminal previews and an exercise to
understand the underlying algorithms, similar to plot.py/plotmpl.py.