55 Commits

Author SHA1 Message Date
Christopher Haster
2f20f53e90 scripts: csv.py: Reverted define filtering to before expr eval
It's just too unintuitive to filter after exprs.

Note this is consistent with how exprs/mods are evaluated. Exprs/mods
can't reference other exprs/mods because csv.py is only single-pass, so
allowing defines to reference exprs/mods is surprising.

And the solution to needing this sort of post-expr/mod reference is
the same for defines: You can always chain multiple csv.py calls.

The reason defines were changed to evaluate after expr eval was because
this seemed inconsistent with other result scripts, but this is not
actually the case. Other result scripts simply don't have exprs/mods, so
filtering in fold is the same as filtering during collection. Note that
even in fold, filtering is done _before_ the actual fold/sum operation.

---

Also fixed a recursive-define regression when folding. Counter-
intuitively, we _don't_ want to recursively apply define filters. If we
do the results will just end up too confusing to be useful.
2025-03-12 19:10:17 -05:00
Christopher Haster
e851c654c5 scripts: Fixed typo hiding zero-sized results in table renderer
This should either have checked diff_result==None, or we should be
mapping diff_result=None => diff_result_=None. To be safe I've done
both.

This was a nasty typo and I only noticed because ctx.py stopped printing
"cycle detected" for our linked-lists (which are expected to be cyclic).
2025-03-12 19:10:17 -05:00
Christopher Haster
7789714560 scripts: Adopted single folding pass, fixing perf[bd].py -r/--hot issue
There's an ordering issue with hotifying and folding when we have
multiple foldable results with children. This was hard to notice since
most of the recursive scripts have unique results, but it _is_ an issue
for perf.py/perfbd.py, which rely on result folding to merge samples.

The fix is to fold _before_ hotifying.

We could fold multiple times to avoid changing the behavior of the
result scripts, but instead I've just moved the folding in the table
renderer up into the relevant main functions. This means 1. we only fold
once, and 2. folding affects outputted csv/json files.

I'm a bit on the fence about this behavior change, but it is a bit more
consistent with how -r/--hot, -z/--depth, etc, affect both table and
csv/json results consistently.

Maybe we should move towards the table render always reflecting the
csv/json results? Most csv/json usage is with -q/--quiet anyways...

---

This does create a new risk in that the table renderer can hide results
if they aren't folded first.

To hopefully avoid this I've added an assert in the table renderer if it
notices results being hidden.
2025-03-12 19:10:17 -05:00
Christopher Haster
b2768becaa scripts: Added -l/--labels to csv.py
This gives csv.py access to a hidden feature in our table renderer used
by some of the other scripts: fields that affect by-field grouping, but
aren't actually printed.

For example, this prevents summing same named functions in different
files, but only shows the function name in the table render:

  $ ./scripts/csv.py lfs.code.csv -bfile -bfunction -lfunction
  function                                size
  lfs_alloc                                398
  lfs_alloc_discard                         31
  lfs_alloc_findfree                        77
  ...

This is especially useful when enumerating results. For example, this
prevents any summing without extra table noise:

  $ ./scripts/csv.py lfs.code.csv -i -bfunction -fsize -lfunction
  function                                size
  lfs_alloc                                398
  lfs_alloc_discard                         31
  lfs_alloc_findfree                        77
  ...

I also tweaked the -b/--by field defaults to account for
enumerate/label fields a bit better.
2025-03-12 19:10:17 -05:00
Christopher Haster
748815bb46 scripts: Disentangled -r/--hot and -i/--enumerate
This removes most of the special behavior around how -r/--hot and
-i/--enumerate interact. This does mean -r/--hot risks folding results
if -i/--enumerate is not specified, but this is _technically_ a valid
operation.

For most of the recursive result scripts, I've replaced the "i" field
with separate "z" and "i" fields for depth and field number, which I
think is a bit more informative/useful.

I've also added a default-hidden "off" field to structs.py/ctx.py, since
we have that info available. I considered replacing "i" with this, but
decided against it since non-zero offsets for union members would risk
being confusing/mistake prone.
2025-03-12 19:10:17 -05:00
Christopher Haster
ac30a20d12 scripts: Reworked to support optional json input/output
Guh

This may have been more work than I expected. The goal was to allow
passing recursive results (callgraph info, structs, etc) between
scripts, which is simply not possible with csv files.

Unfortunately, this raised a number of questions: What happens if a
script receives recursive results? -d/--diff with recursive results?
How to prevent folding of ordered results (structs, hot, etc) in piped
scripts? etc.

And this ended up as a significant rewrite of most of the result
scripts' internals.

Key changes:

- Most result scripts now support -O/--output-json in addition to
  -o/--json, with -O/--output-json including any recursive results in
  the "children" field.

- Most result scripts now support both csv and json as input to relevant
  flags: -u/--use, -d/--diff, -p/--percent. This is accomplished by
  looking for a '[' as the first character to decide if an input file is
  json or csv.

  Technically this breaks if your json has leading whitespace, but why
  would you ever keep whitespace around in json? The human-editability
  of json was already ruined the moment comments were disallowed.

- csv.py requires all fields to be explicitly defined, so added
  -i/--enumerate, -Z/--children, and -N/--notes. At least we can provide
  some reasonable defaults so you shouldn't usually need to type out the
  whole field.

- Notably, the rendering scripts (plot.py, treemapd3.py, etc) and
  test/bench scripts do _not_ support json. csv.py can always convert
  to/from json when needed.

- The table renderer now supports diffing recursive results, which is
  nice for seeing how the hot path changed in stack.py/perf.py/etc.

- Moved the -r/--hot logic up into main, so it also affects the
  outputted results. Note it is impossible for -z/--depth to _not_
  affect the outputted results.

- We now sort in one pass, which is in theory more efficient.

- Renamed -t/--hot -> -r/--hot and -R/--reverse-hot, matching -s/-S.

- Fixed an issue with -S/--reverse-sort where only the short form was
  actually reversed (I misunderstood what argparse passes to Action
  classes).

- csv.py now supports json input/output, which is funny.
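
A rough sketch of the first-character check described above. Note that
load_results is a hypothetical helper for illustration, not the scripts'
actual code, and the sample data is made up:

```python
import csv
import io
import json

def load_results(text):
    # a leading '[' means json, anything else is treated as csv; leading
    # whitespace before the '[' would defeat this check
    if text[:1] == '[':
        return json.loads(text)
    return list(csv.DictReader(io.StringIO(text)))

from_json = load_results('[{"function": "lfs_alloc", "size": 398}]')
from_csv = load_results('function,size\nlfs_alloc,398\n')
```

Note the json path keeps numeric types while the csv path returns
strings, so a real implementation would still need to parse field values.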
2025-03-12 19:09:43 -05:00
Christopher Haster
361cd3fec0 scripts: Added missing sys imports
Unfortunately the import sys in the argparse block was hiding missing
sys imports.

The mistake was assuming the import sys in Python would limit the scope
to that if block, but Python's late binding strikes again...
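
A minimal illustration of the scoping behavior, assuming a module-level
if block like the argparse one (uses_sys is a hypothetical function):

```python
if True:  # stands in for the argparse block under `if __name__ == "__main__":`
    import sys

def uses_sys():
    # no import in this function; the name is looked up in module globals
    # at call time, so the import above silently satisfies it
    return sys.argv
```

Python has no block scope for if statements, so the import binds sys for
the whole module. Importing the same file as a library would skip the
block and uses_sys would raise NameError.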
2025-01-28 14:41:45 -06:00
Christopher Haster
0adec7f15c scripts: Replaced __builtins__ with builtins
Apparently __builtins__ is a CPython implementation detail, and behaves
differently when executed vs imported???

import builtins is the correct way to go about this.
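
For example, a sketch of a script-level helper that shadows a builtin but
still needs the original (this abs wrapper is hypothetical, only here to
show the pattern):

```python
import builtins

def abs(x):
    # hypothetical shadowing helper: passes None through, otherwise
    # defers to the real builtin via the builtins module
    return builtins.abs(x) if x is not None else None
```

In CPython, __builtins__ is the builtins module in __main__ but that
module's __dict__ in imported modules, so builtins.abs works in both
cases where __builtins__.abs would not.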
2025-01-28 14:41:45 -06:00
Christopher Haster
62cc4dbb14 scripts: Disabled local import hack on import
Moved local import hack behind if __name__ == "__main__"

These scripts aren't really intended to be used as python libraries.
Still, it's useful to import them for debugging and to get access to
their juicy internals.
2025-01-28 14:41:30 -06:00
Christopher Haster
1d8d0785fc scripts: More flags to control table renderer, -Q/--small-table, etc
Instead of trying to be too clever, this just adds a bunch of small
flags to control parts of table rendering:

- --no-header - Don't show the header.
- --small-header - Don't show by field names.
- --no-total - Don't show the total.
- -Q/--small-table - Equivalent to --small-header + --no-total.

Note that -Q/--small-table replaces the previous -Y/--summary +
-c/--compare hack, while also allowing a similar table style for
non-compare results.
2024-12-18 14:03:35 -06:00
Christopher Haster
4c87d59c7b scripts: Simplified result->file mapping, dropped collect_dwarf_files
This reverts per-result source file mapping, and tears out a bunch of
messy dwarf parsing code. Results from the same .o file are now mapped
to the same source file.

This was just way too much complexity for slightly better result->file
mapping, which risked losing results accidentally mapped to the wrong
file.

---

I was originally going to revert all the way back to relying strictly on
the .o name and --build-dir (490e1c4) (this is the simplest solution),
but after poking around in dwarf-info a bit, I realized we do have
access to the original source file in DW_TAG_compile_unit's
DW_AT_comp_dir + DW_AT_name.

This is much simpler/more robust than parsing objdump --dwarf=rawline,
and avoids needing --build-dir in a bunch of scripts.

---

This also reverts stack.py to rely only on the .ci files. These seem as
reliable as DW_TAG_compile_unit while simplifying things significantly.

Symbol mapping used to be a problem, but this was fixed by using the
symbol in the title field instead of the label field (which strips some
optimization suffixes?)
2024-12-17 15:34:39 -06:00
Christopher Haster
6a6ed0f741 scripts: Dropped cycle detection from table renderer
Now that cycle detection is always done at result collection time, we
don't need this in the table renderer itself.

This had a tendency to cause problems for non-function scripts (ctx.py,
structs.py).
2024-12-16 19:26:21 -06:00
Christopher Haster
dd389f23ee scripts: Switched to sorted sets for result notes
God, I wish Python had an OrderedSet.

This is a fix for duplicate "cycle detected" notes when using -t/--hot.
This mix of merging both _hot_notes and _notes in the HotResult class is
tricky when the underlying container is a list.

The order is unlikely to be guaranteed anyways, when different results
with different notes are folded.

And if we ever want more control over the order of notes in result
scripts we can always change this back later.
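
Roughly, the idea is something like the following sketch (merge_notes
and the "weak symbol" note are hypothetical, not taken from the scripts):

```python
def merge_notes(*note_lists):
    # a set drops duplicates, sorting gives a deterministic order
    return sorted(set().union(*note_lists))

notes = merge_notes(['cycle detected'], ['cycle detected', 'weak symbol'])
```

The tradeoff is that insertion order is lost, which matches the caveat
above about note order.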
2024-12-16 19:22:14 -06:00
Christopher Haster
3e03c2ee7f scripts: Adopted better input file handling in result scripts
- Error on no/insufficient files.

  Instead of just returning no results. This is more useful when
  debugging complicated bash scripts.

- Use elf magic to allow any file order in perfbd.py/stack.py.

  This was already implemented in stack.py, now also adopted in
  perfbd.py.

  Elf files always start with the magic string "\x7fELF", so we can use
  this to figure out the types of input files without needing to rely on
  argument order.

  This is just one less thing to worry about when invoking these
  scripts.
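
A minimal sketch of the magic check, assuming hypothetical helpers is_elf
and sort_inputs and two throwaway demo files:

```python
import os
import tempfile

def is_elf(path):
    # elf files always start with the 4-byte magic "\x7fELF"
    with open(path, 'rb') as f:
        return f.read(4) == b'\x7fELF'

def sort_inputs(paths):
    # hypothetical helper: split inputs into elf files and everything else,
    # regardless of the order they were passed in
    elfs = [p for p in paths if is_elf(p)]
    others = [p for p in paths if not is_elf(p)]
    return elfs, others

# demo with two throwaway files
with tempfile.TemporaryDirectory() as d:
    elf_path = os.path.join(d, 'lfs.elf')
    csv_path = os.path.join(d, 'lfs.perf.csv')
    with open(elf_path, 'wb') as f:
        f.write(b'\x7fELF' + b'\0'*12)
    with open(csv_path, 'wb') as f:
        f.write(b'function,cycles\n')
    elfs, others = sort_inputs([csv_path, elf_path])
    elfs_found = [os.path.basename(p) for p in elfs]
    others_found = [os.path.basename(p) for p in others]
```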
2024-12-16 19:13:22 -06:00
Christopher Haster
ac79c88c6f scripts: Improved cycle detection notes in scripts
- Prevented childrenof memoization from hiding the source of a
  detected cycle.

- Deduplicated multiple cycle detected notes.

- Fixed note rendering when last column does not have a notes list.
  Currently this only happens when entry is None (no results).
2024-12-16 18:01:46 -06:00
Christopher Haster
faf4d09c34 scripts: Added __repr__ to RInt and friends
Just a minor quality of life feature to help debugging these scripts.
2024-12-16 18:01:46 -06:00
Christopher Haster
8526cd9cf1 scripts: Prevented i/children/notes result field collisions
Without this, naming a column i/children/notes in csv.py could cause
things to break. Unlikely for children/notes, but very likely for i,
especially when benchmarking.

Unfortunately namedtuple makes this tricky. I _want_ to just rename
these to _i/_children/_notes and call the problem solved, but namedtuple
reserves all underscore-prefixed fields for its own use.

As a workaround, the table renderer now looks for _i/_children/_notes at
the _class_ level, as an optional name of which namedtuple field to use.
This way Result types can stay lightweight namedtuples while including
extra table rendering info without risk of conflicts.

This also makes the HotResult type a bit more funky, but that's not a
big deal.
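
A sketch of what the class-level lookup might look like. The Result
types, field names, and children_of helper here are hypothetical:

```python
import collections as co

class BenchResult(co.namedtuple('BenchResult', ['name', 'i', 'time'])):
    # here "i" is an ordinary user column, so no _i/_children is declared
    __slots__ = ()

class StructResult(co.namedtuple('StructResult',
        ['name', 'size', 'n', 'kids'])):
    __slots__ = ()
    # class-level hints naming which fields carry renderer metadata
    _i = 'n'
    _children = 'kids'

def children_of(r):
    # look up the optional class-level name, then the field it points to
    k = getattr(type(r), '_children', None)
    return getattr(r, k) if k is not None else []
```

Since the hints live on the class and not in the field list, a user
column named "i" or "children" no longer collides with them.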
2024-12-15 16:36:14 -06:00
Christopher Haster
183ede1b83 scripts: Option for result scripts to force children ordering
This extends the recursive part of the table renderer to sort children
by the optional "i" field, if available.

Note this only affects children entries. The top-level entries are
strictly ordered by the relevant "by" fields. I just haven't seen a use
case for this yet, and not sorting "i" at the top-level reduces the
number of things that can go wrong for scripts without children.

---

This also rewrites -t/--hot to take advantage of children ordering by
injecting a totally-not-hacky HotResult subclass.

Now -t/--hot should be strictly ordered by the call depth! Though note
entries that share "by" fields are still merged...

This also gives us a way to introduce the "cycle detected" note and
respect -z/--depth, so overall a big improvement for -t/--hot.
2024-12-15 16:35:52 -06:00
Christopher Haster
e6ed785a27 scripts: Removed padding from tail notes in tables
We don't really need padding for the notes on the last column of tables,
which is where row-level notes end up.

This may seem minor, but not padding here avoids quite a bit of
unnecessary line wrapping in small terminals.
2024-12-15 16:35:29 -06:00
Christopher Haster
512cf5ad4b scripts: Adopted ctx.py-related changes in other result scripts
- Adopted higher-level collect data structures:

  - high-level DwarfEntry/DwarfInfo class
  - high-level SymInfo class
  - high-level LineInfo class

  Note these had to be moved out of function scope due to pickling
  issues in perf.py/perfbd.py. These were only function-local to
  minimize scope leak so this fortunately was an easy change.

- Adopted better list-default patterns in Result types:

    def __new__(..., children=None):
        return Result(..., children if children is not None else [])

  A classic python footgun.

- Adopted notes rendering, though this is only used by ctx.py at the
  moment.

- Reverted to sorting children entries, for now.

  Unfortunately there's no easy way to sort the result entries in
  perf.py/perfbd.py before folding. Folding is going to make a mess
  of more complicated children anyways, so another solution is
  needed...

And some other shared miscellany.
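
For reference, a self-contained version of the list-default pattern
mentioned above, using a hypothetical three-field Result:

```python
import collections as co

class Result(co.namedtuple('Result', ['name', 'size', 'children'])):
    __slots__ = ()
    def __new__(cls, name, size, children=None):
        # a fresh list per instance; a children=[] default would be
        # created once and shared by every Result
        return super().__new__(cls, name, size,
                children if children is not None else [])

a = Result('a', 1)
b = Result('b', 2)
a.children.append(b)
```

With a shared children=[] default, appending to a.children would also
show up in b.children.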
2024-12-15 15:41:11 -06:00
Christopher Haster
25814ed5cb scripts: Fixed failed subprocess stderr, unconditionally forward
It looks like the failure case in our scripts' subprocess stderr
handling was not tested well during a fix to stderr blocking (a735bcd).

This code was attempting to print stderr only if an error occurred, but
with stderr=None this just results in a NoneType TypeError.

In retrospect, completely hiding stderr is kind of shitty if a
subprocess fails, but it doesn't seem possible to read from both stdout
and stderr with Python's APIs without getting stuck when the stderr's
buffer is full.

It might be possible to work around this with either multithreading,
select calls, or a temp file, but I'm not sure slightly less verbose
scripts are worth the added complexity in every single subprocess call.

For now just reverting to unconditionally forwarding stderr from the
child process. This is the simplest/most robust option.
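
A minimal sketch of the forwarding approach, assuming a hypothetical
run_lines helper (the scripts' actual subprocess code may differ):

```python
import subprocess as sp
import sys

def run_lines(cmd):
    # stdout is piped and parsed, stderr=None means the child inherits
    # our stderr, so its errors are forwarded unconditionally
    with sp.Popen(cmd,
            stdout=sp.PIPE,
            stderr=None,
            universal_newlines=True) as proc:
        lines = [line.rstrip('\n') for line in proc.stdout]
    # leaving the with block waits for the child to exit
    if proc.returncode != 0:
        raise sp.CalledProcessError(proc.returncode, cmd)
    return lines

lines = run_lines([sys.executable, '-c', 'print("ok")'])
```

Because only one pipe is ever read, there is no second buffer that can
fill up and block the child.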
2024-12-14 15:08:39 -06:00
Christopher Haster
b58266c3b0 scripts: Small refactor to adopt collect_thing pattern everywhere
- stack.py:collect -> collect + collect_cov
- perf.py:collect_syms_and_lines -> collect_syms + collect_dwarf_lines
- perfbd.py:collect_syms_and_lines -> collect_syms + collect_dwarf_lines

This should hopefully lead to both better readability and better code
reuse.

Note collect_dwarf_lines is a bit different than collect_dwarf_files in
code.py/data.py/etc, but the extra complexity of collect_dwarf_lines is
probably not worth sharing here.
2024-12-14 15:08:04 -06:00
Christopher Haster
e00db216c1 scripts: Consistent table renderer, cycle detection optional
The fact that our scripts' table renderer was slightly different for
recursive scripts (stack.py, perf.py) and non-recursive scripts
(code.py, structs.py) was a ticking time bomb, one innocent edit away
from breaking half the scripts.

This makes the table renderer consistent across all scripts, allowing for
easy copy-pasting when editing at the cost of some unused code in
scripts.

One hiccup with this though is the difference in cycle detection
behavior between scripts:

- stack.py:

    lfsr_bd_sync
    '-> lfsr_bd_prog
        '-> lfsr_bd_sync  <-- cycle!

- structs.py:

    lfsr_bshrub_t
    '-> u
        '-> bsprout
            '-> u  <-- not a cycle!

To solve this the table renderer now accepts a simple detect_cycles
flag, which can be set per-script.
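
As a very rough sketch of how such a flag could work: render and the two
dicts below are hypothetical, and keying children by name is a
simplification of how results are actually stored.

```python
def render(name, children, depth, detect_cycles=True, seen=()):
    # hypothetical renderer: returns indented lines for name and children
    if detect_cycles and name in seen:
        return [name + '  <-- cycle detected']
    lines = [name]
    if depth > 1:
        for child in children.get(name, []):
            lines.extend('    ' + line
                    for line in render(child, children, depth-1,
                        detect_cycles, seen + (name,)))
    return lines

# stack.py-like call graph, a repeated name is a real cycle
calls = {'lfsr_bd_sync': ['lfsr_bd_prog'], 'lfsr_bd_prog': ['lfsr_bd_sync']}
# structs.py-like members, a repeated name is just a reused member name
members = {'lfsr_bshrub_t': ['u'], 'u': ['bsprout'], 'bsprout': ['u']}

stack_lines = render('lfsr_bd_sync', calls, 5)
struct_lines = render('lfsr_bshrub_t', members, 4, detect_cycles=False)
```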
2024-12-14 12:25:15 -06:00
Christopher Haster
ef3accc07c scripts: Tweaked -p/--percent to accept the csv file for diffing
This makes the -p/--percent flag a bit more consistent with -d/--diff
and -c/--compare, both of which change the printing strategy based on
additional context.
2024-11-16 18:01:27 -06:00
Christopher Haster
9a2b561a76 scripts: Adopted -c/--compare in make summary-diff
This showcases the sort of high-level result printing where -c/--compare
is useful:

  $ make summary-diff
              code             data           stack          structs
  BEFORE     57057                0            3056             1476
  AFTER      68864 (+20.7%)       0 (+0.0%)    3744 (+22.5%)    1520 (+3.0%)

There was one hiccup though: how to hide the name of the first field.

It may seem minor, but the missing field name really does help
readability when you're staring at a wall of CLI output.

It's a bit of a hack, but this can now be controlled with -Y/--summary,
which has the sole purpose of disabling the first field name if mixed
with -c/--compare.

-c/--compare is already a weird case for the summary row anyways...
2024-11-16 18:01:15 -06:00
Christopher Haster
29eff6f3e8 scripts: Added -c/--compare for comparing specific result rows
Example:

  $ ./scripts/csv.py lfs.code.csv \
          -bfunction -fsize \
          -clfsr_rbyd_appendrattr
  function                                size
  lfsr_rbyd_appendrattr                   3598
  lfsr_mdir_commit                        5176 (+43.9%)
  lfsr_btree_commit__.constprop.0         3955 (+9.9%)
  lfsr_file_flush_                        2729 (-24.2%)
  lfsr_file_carve                         2503 (-30.4%)
  lfsr_mountinited                        2357 (-34.5%)
  ... snip ...

I don't think this is immediately useful for our code/stack/etc
measurement scripts, but it's certainly useful in csv.py for comparing
results at a high level.

And by useful I mean it replaces a 40-line long awk script that has
outgrown its original purpose...
2024-11-16 17:59:22 -06:00
Christopher Haster
2fa968dd3f scripts: csv.py: Fixed divide-by-zero, return +-inf
This may make some mathematician mad, but these are informative scripts.
Returning +-inf is much more useful than erroring when dealing with
several hundred rows of results.

And hey, if it's good enough for IEEE 754, it's good enough for us :)

Also fixed a division operator mismatch in RFrac that was causing
problems.
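
A sketch of the +-inf behavior, using a hypothetical safe_div helper.
Returning nan for 0/0 is my assumption, borrowed from IEEE 754, and not
something this commit specifies:

```python
import math as mt

def safe_div(a, b):
    # follow IEEE 754 instead of raising ZeroDivisionError
    if b == 0:
        if a == 0:
            # assumption: 0/0 is nan, as in IEEE 754
            return mt.nan
        # the sign of the result follows the numerator
        return mt.copysign(mt.inf, a)
    return a / b
```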
2024-11-16 16:47:48 -06:00
Christopher Haster
5dc9eabbf7 scripts: csv.py: Fixed use of __div__ vs __truediv__
Not sure if this is an old habit from Python 2, or just because it looks
nicer next to __mul__, __mod__, etc, but in Python 3 this should be
__truediv__ (or __floordiv__), not __div__.
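
To illustrate, with two hypothetical wrapper classes (not the scripts'
RInt/RFloat types), Python 3 never calls __div__:

```python
class OldStyle:
    def __init__(self, x):
        self.x = x
    def __div__(self, other):
        # Python 2 name, ignored by the / operator in Python 3
        return OldStyle(self.x / other.x)

class NewStyle:
    def __init__(self, x):
        self.x = x
    def __truediv__(self, other):
        # called for a / b
        return NewStyle(self.x / other.x)
    def __floordiv__(self, other):
        # called for a // b
        return NewStyle(self.x // other.x)

try:
    OldStyle(6) / OldStyle(3)
    old_works = True
except TypeError:
    old_works = False
```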
2024-11-16 16:38:36 -06:00
Christopher Haster
0ac326d9cb scripts: Reduced table name widths to 8 chars minimum
I still think the 24 (23+1) char minimum is a good default for 2 column
output such as help text, especially if you don't have automatic width
detection. But our result scripts need to be a bit more flexible.

Consider:

  $ make summary
                              code     data    stack  structs
  TOTAL                      68864        0     3744     1520

Vs:

  $ make summary
              code     data    stack  structs
  TOTAL      68864        0     3744     1520

Up until now we were just kind of working around this with cut -c 25- in
our Makefile, but now that our result scripts automatically scale the
table widths, they should really just default to whatever is the most
useful.
2024-11-16 13:39:42 -06:00
Christopher Haster
434479f101 scripts: Adopted csv.py-related result-type tweaks in all scripts
- RInt/RFloat now accepts implicitly castable types (mainly
  RInt(RFloat(x)) and RFloat(RInt(x))).

- RInt/RFloat/RFrac are now "truthy", implementing __bool__.

- More operator support for RInt/RFloat/RFrac:

  - __pos__ => +a
  - __neg__ => -a
  - __abs__ => abs(a)
  - __div__ => a/b
  - __mod__ => a%b

  These work in Python, but are mainly used to implement expr eval in
  csv.py.
2024-11-16 13:37:15 -06:00
Christopher Haster
7cfcc1af1d scripts: Renamed summary.py -> csv.py
This seems like a more fitting name now that this script has evolved
into more of a general purpose high-level CSV tool.

Unfortunately this does conflict with the standard csv module in Python,
breaking every script that imports csv (which is most of them).
Fortunately, Python is flexible enough to let us remove the current
directory before imports with a bit of an ugly hack:

  # prevent local imports
  __import__('sys').path.pop(0)

These scripts are intended to be standalone anyways, so this is probably
a good pattern to adopt.
2024-11-09 12:31:16 -06:00
Christopher Haster
007ac97bec scripts: Adopted double-indent on multiline expressions
This matches the style used in C, which is good for consistency:

  a_really_long_function_name(
          double_indent_after_first_newline(
              single_indent_nested_newlines))

We were already doing this for multiline control-flow statements, simply
because I'm not sure how else you could indent this without making
things really confusing:

  if a_really_long_function_name(
          double_indent_after_first_newline(
              single_indent_nested_newlines)):
      do_the_thing()

This was the only real difference style-wise between the Python code and
C code, so now both should be following roughly the same style (80 cols,
double-indent multiline exprs, prefix multiline binary ops, etc).
2024-11-06 15:31:17 -06:00
Christopher Haster
48c2e7784b scripts: Renamed import math alias m -> mt
Mainly to avoid conflicts with match results m, this frees up the
single-letter variable m for other purposes.

Choosing a two letter alias was surprisingly difficult, but mt is nice
in that it somewhat matches it (for itertools) and ft (for functools).
2024-11-05 01:58:40 -06:00
Christopher Haster
c0a9af1e9a scripts: Moved recursive entry generation before table rendering
This fixes an issue where mixing recursive renderers (-t/--hot or
-z/--depth) with defines (-Dfunction=lfsr_mount) would not account for
children entry widths. An unexpected side-effect of no longer filtering
the children entries.

We could continue to try to estimate the width without table rendering,
but it would basically need two full recursive passes at this point...
Instead, I've just moved the recursive stuff before table rendering,
which should remove any issues with width calculation while also
deduplicating the recursive passes.

It's invasive for a small change, but probably worthwhile long term.

The downside is this does mean our recursive scripts now build the full
table (including all recursive calls!) before they start printing. When
mixed with unbounded recursive depth (-z0 or --depth=0) this can get
quite large and cause quite a slow start.

But I guess that was the tradeoff in adopting this sort of intermediate
table rendering... At least it does make the code simpler and less bug
prone...
2024-11-04 18:18:58 -06:00
Christopher Haster
d324333903 scripts: Fixed names/lines falling out of sync in diff table renderers
As a convenience, -d/--diff in our measurement scripts hides entries
that are unchanged by default.

Unfortunately this was broken during a recent refactor that ended up
filtering the line info but not the actual names.

Instead of reverting the broken part of the refactor, I've just moved the
filtering up to where we calculate the names. Hopefully this fixes the
bug while also simplifying this messy chunk of logic a bit.
2024-11-04 18:04:58 -06:00
Christopher Haster
a735bcd667 Fixed hanging scripts trying to parse stderr
code.py, specifically, was getting messed up by inconsequential GCC
objdump errors on Clang -g3 generated binaries.

Now stderr from child processes is just redirected to /dev/null when
-v/--verbose is not provided.

If we actually depended on redirecting stderr->stdout these scripts
would have been broken when -v/--verbose was provided anyways. Not
really sure what the original code was trying to do...
2024-06-20 13:04:07 -05:00
Christopher Haster
54d77da2f5 Dropped csv field prefixes in scripts
The original idea was to allow merging a whole bunch of different csv
results into a single lfs.csv file, but this never really happened. It's
much easier to operate on smaller context-specific csv files, where the
field prefix:

- Doesn't really add much information
- Requires more typing
- Is confusing in how it doesn't match the table field names.

We can always use summary.py -fcode_size=size to add prefixes when
necessary anyways.
2024-06-02 19:19:46 -05:00
Christopher Haster
169952dec0 Tweaked scripts to render new entry ratios as +∞%
We already rely on this symbol in these scripts, so we might as well use it to
display the mathematically correct ratio for new entries.

This has the added benefit of ordering new entries vs extremely big
changes correctly:

  $ ./scripts/code.py -u test.after.csv -d test.before.csv
  function (1 added, 0 removed)      osize    nsize    dsize
  test_a                                 -       49      +49 (+∞%)
  test_b                                19      719     +700 (+3684.2%)
  test_c                                91      191     +100 (+109.9%)
  TOTAL                                110      959     +849 (+771.8%)
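
A sketch of why the infinite ratio sorts correctly, using the numbers
from the table above. The ratio and fmt helpers are hypothetical, and
treating a missing old entry as 0 is an assumption:

```python
import math as mt

def ratio(old, new):
    # relative change; a new entry (old == 0) gets an infinite ratio
    if old == 0:
        return mt.inf if new > 0 else -mt.inf if new < 0 else 0.0
    return (new - old) / old

def fmt(r):
    if mt.isinf(r):
        return '+∞%' if r > 0 else '-∞%'
    return '%+.1f%%' % (100*r)

# (old size, new size), with the missing old entry treated as 0
sizes = {'test_c': (91, 191), 'test_a': (0, 49), 'test_b': (19, 719)}
order = sorted(sizes, key=lambda k: ratio(*sizes[k]), reverse=True)
```

Since float inf compares greater than any finite ratio, no special case
is needed in the sort itself.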
2024-06-02 19:19:46 -05:00
Christopher Haster
06bfed7a8b Interspersed percent/notes in measurement scripts
This is a bit more complicated, but make testmarks really showed how
confusing this could get.

Now, instead of:

  suite                             passed    time
  test_alloc                       304/304     1.6 (100.0%)
  test_badblocks                 6880/6880  1323.3 (100.0%)
  ... snip ...
  test_rbyd                  385878/385878   592.7 (100.0%)
  test_relocations               7899/7899   318.8 (100.0%)
  TOTAL                      548206/548206  6229.7 (100.0%)

Percents/notes are interspersed next to their relevant fields:

  suite                             passed             time
  test_alloc                       304/304 (100.0%)     1.6
  test_badblocks                 6880/6880 (100.0%)  1323.3
  ... snip ...
  test_rbyd                  385878/385878 (100.0%)   592.7
  test_relocations               7899/7899 (100.0%)   318.8
  TOTAL                      548206/548206 (100.0%)  6229.7

Note this has no effect on scripts with only a single field (code.py, etc).

But it does make multi-field diffs a bit more readable:

  $ ./scripts/stack.py -u after.stack.csv -d before.stack.csv -p
  function                       frame             limit
  lfsr_bd_sync                       8 (+100.0%)     216 (+100.0%)
  lfsr_bd_flush                     40 (+25.0%)      208 (+4.0%)
  ... snip ...
  lfsr_file_flush                   32 (+0.0%)      2424 (-0.3%)
  lfsr_file_flush_                 216 (-3.6%)      2392 (-0.3%)
  TOTAL                           9008 (+0.4%)      2600 (-0.3%)
2024-06-02 19:19:38 -05:00
Christopher Haster
37f738cc71 Changed RFrac equality in scripts
Now, fractions are considered equal if they have the same ratio:

- 6/6 == 12/12 => True
- 3/6 == 3/12  => False
- 1/6 == 2/12  => True

It's interesting to note this implementation is actually more
numerically stable than float comparison, though that wasn't really the
goal.

The main reason for this is to allow other fields to take over when
sorting multi-field fractional data: cov (lines + branches), testmarks
(passed + time), etc. Before, sorting would usually stop after
mismatched fraction fields, which wasn't all that useful.
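
One way to get this behavior is cross-multiplication, sketched below
with a hypothetical Frac class. This is not necessarily how RFrac is
implemented, and __lt__ assumes positive denominators:

```python
class Frac:
    def __init__(self, a, b):
        self.a, self.b = a, b
    def __eq__(self, other):
        # a/b == c/d  <=>  a*d == c*b, exact with integers
        return self.a * other.b == other.a * self.b
    def __lt__(self, other):
        # assumes positive denominators
        return self.a * other.b < other.a * self.b

# (passed, time) rows, equal ratios let the time field decide the order
rows = [(Frac(6, 6), 2.0), (Frac(12, 12), 1.0), (Frac(3, 6), 0.5)]
times = [t for _, t in sorted(rows)]
```

The integer comparison avoids the rounding a float division would
introduce, which is presumably the stability noted above.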
2024-05-28 15:19:11 -05:00
Christopher Haster
a9f6b6e903 Renamed internal script result types * -> R*
So Int -> RInt, Frac -> RFrac, etc. This just helps distinguish these
types from builtin types, which could be confusing.
2024-05-18 13:00:15 -05:00
Christopher Haster
03ea2e6ac5 Tweaked cov.py, summary.py, to render fraction percents as notes
This matches how diff percentages are rendered, and simplifies the
internal table rendering by making Frac less of a special case. It also
allows for other type notes in the future.

One concern is how all the notes are shoved to the side, which may make
it a bit harder to find related percentages. If this becomes annoying we
should probably look into interspersing all notes (including diff
percentages) between the relevant columns.

Before:

  function                                   lines            branches
  lfsr_rbyd_appendattr             230/231   99.6%     172/192   89.6%
  lfsr_rbyd_p_recolor                33/34   97.1%       11/12   91.7%
  lfs_alloc                          40/42   95.2%       21/24   87.5%
  lfsr_rbyd_appendcompaction         54/57   94.7%       39/42   92.9%
  ...

After:

  function                           lines    branches
  lfsr_rbyd_appendattr             230/231     172/192 (99.6%, 89.6%)
  lfsr_rbyd_p_recolor                33/34       11/12 (97.1%, 91.7%)
  lfs_alloc                          40/42       21/24 (95.2%, 87.5%)
  lfsr_rbyd_appendcompaction         54/57       39/42 (94.7%, 92.9%)
  ...
2024-05-18 13:00:15 -05:00
Christopher Haster
1d88fa9864 In scripts -d/--diff, show either all percentages or none
Previously, with -d/--diff, we would only show non-zero percentages. But
this was ambiguous/confusing when dealing with multiple results
(stack.py, summary.py, etc).

To help with this, I've switched to showing all percentages unless all
percentages are zero (no change). This matches the -d/--diff row-hiding
logic, so by default all rows should show all percentages.

Note -p/--percent did not change, as it already showed all percentages
all of the time.
2024-05-18 13:00:15 -05:00
Christopher Haster
6d81b0f509 Changed --context short flag to -C in scripts
This matches diff and grep, and avoids lower-case conflicts in
test.py/bench.py.
2023-11-06 01:59:03 -06:00
Christopher Haster
1e4d4cfdcf Tried to write errors to stderr consistently in scripts
2023-11-05 15:55:07 -06:00
Christopher Haster
d0a6ef0c89 Changed scripts to not infer field purposes from CSV values
Note there's a bit of subtlety here, field _types_ are still inferred,
but the intention of the fields, i.e. if the field contains data vs
row name/other properties, must be unambiguous in the scripts.

There is still a _tiny_ bit of inference. For most scripts only one
of --by or --fields is strictly needed, since this makes the purpose of
the other fields unambiguous.

The reason for this change is so the scripts are a bit more reliable,
but also because this simplifies the data parsing/inference a bit.

Oh, and this also changes field inference to use the csv.DictReader's
fieldnames field instead of only inspecting the returned dicts. This
should also save a bit of O(n) overhead when parsing CSV files.
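
A rough sketch of header-based inference with csv.DictReader's
fieldnames. The infer_fields helper is hypothetical and much simpler
than what the scripts do:

```python
import csv
import io

def infer_fields(f, by=None, fields=None):
    reader = csv.DictReader(f)
    # fieldnames comes from the header row, no data rows are scanned
    names = reader.fieldnames or []
    if by is not None and fields is None:
        fields = [k for k in names if k not in by]
    elif fields is not None and by is None:
        by = [k for k in names if k not in fields]
    return by, fields, reader

by, fields, reader = infer_fields(
        io.StringIO('function,size\nlfs_alloc,398\n'),
        by=['function'])
rows = list(reader)
```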
2023-11-04 15:24:18 -05:00
Christopher Haster
0f93fa3057 Tweaked script field arg parsing to strip whitespace almost everywhere
The whitespace sensitivity of field args was starting to be a problem,
mostly for advanced plotmpl.py usage (which tbf might be appropriately
described as "super hacky" in how it uses CLI parameters):

  ./scripts/plotmpl.py \
      -Dcase=" \
          bench_rbyd_attr_append, \
          bench_rbyd_attr_remove, \
          bench_rbyd_attr_fetch, \
          ..."

This may present problems when parsing CSV files with whitespace, in
theory, maybe. But given the scope of these scripts for littlefs...
just don't do that. Thanks.
2023-11-03 15:03:46 -05:00
Christopher Haster
616b4e1c9e Tweaked scripts that consume .csv files to filter defines early
With the quantity of data being output by bench.py now, filtering ASAP
while parsing CSV files is a valuable optimization. And thanks to how
CSV files are structured, we can even avoid ever loading the full
contents into RAM.

This does end up with us filtering for defines redundantly in a few
places, but this is well worth the saved overhead from early filtering.

Also tried to clean up the plot.py/plotmpl.py's data folding path,
though that may have been wasted effort.
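
The early filtering could look something like this sketch, where
read_filtered, the n/time fields, and the sample values are all
hypothetical:

```python
import csv
import io

def read_filtered(f, defines):
    # csv.DictReader yields one row at a time, so rejected rows are
    # dropped before anything accumulates in memory
    for row in csv.DictReader(f):
        if all(row.get(k) in vs for k, vs in defines.items()):
            yield row

data = io.StringIO(
        'case,n,time\n'
        'bench_rbyd_attr_append,1,0.5\n'
        'bench_rbyd_attr_fetch,1,0.2\n'
        'bench_rbyd_attr_append,2,0.7\n')
total = sum(float(r['time'])
        for r in read_filtered(data, {'case': {'bench_rbyd_attr_append'}}))
```

Since both the reader and the filter are lazy, only the current row is
held in memory at any point.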
2023-11-03 14:30:22 -05:00
Christopher Haster
e7bf5ad82f Added scripts/crc32c.py
This seems like a useful script to have.
2023-09-15 18:42:48 -05:00
Christopher Haster
c4b3e9d826 A couple of script changes after CI integration
- Renamed struct_.py -> structs.py again.

- Removed lfs.csv, instead preferring script specific csv files.

- Added *-diff make rules for quick comparison against a previous
  result, results are now implicitly written on each run.

  For example, `make code` creates lfs.code.csv and prints the summary, which
  can be followed by `make code-diff` to compare changes against the saved
  lfs.code.csv without overwriting.

- Added nargs=? support for -s and -S, which now use a per-result _sort
  attribute to decide the sort order if fields are unspecified.
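
A sketch of the nargs=? behavior with argparse. CodeResult, its _sort
value, and sort_fields are hypothetical stand-ins for the per-result
types:

```python
import argparse

class CodeResult:
    # hypothetical per-result default sort fields
    _sort = ['size']

parser = argparse.ArgumentParser()
# -s alone stores const, -s<field> stores the field, no -s stores None
parser.add_argument('-s', '--sort', nargs='?', const=True, default=None)

def sort_fields(args, Result):
    if args.sort is None:
        return []
    if args.sort is True:
        # no field given, fall back to the result type's default
        return Result._sort
    return [args.sort]
```

One caveat of nargs='?' is that a bare -s followed by a positional
argument will consume that argument as the sort field.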
2022-12-06 23:09:07 -06:00