90 Commits

Author SHA1 Message Date
Christopher Haster
d5b28df33a scripts: Fixed excessive rounding when writing floats to csv/json files
This adds __csv__ methods to all Csv* classes to indicate how to write
csv/json output, and adopts Python's default float repr. As a plus, this
also lets us use "inf" for infinity in csv/json files, avoiding
potential unicode issues.

Before this we were reusing __str__ for both table rendering and
csv/json writing, which rounded to a single decimal digit! This made
float output pretty much useless outside of trivial cases.

---

Note Python apparently does some of its own rounding (1/10 -> 0.1?), so
the result may still not be round-trippable, but this is probably fine
for our somewhat hack-infested csv scripts.
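
A minimal sketch of the __str__/__csv__ split described above. The
CsvFloat layout and the unicode infinity in the table path are
assumptions for illustration, not the actual class from the scripts:

```python
import collections as co
import math

# hypothetical stand-in for the scripts' CsvFloat type
class CsvFloat(co.namedtuple('CsvFloat', ['a'])):
    __slots__ = ()

    def __str__(self):
        # table rendering: short and human-friendly (assumed format)
        if math.isinf(self.a):
            return '∞' if self.a > 0 else '-∞'
        return '%.1f' % self.a

    def __csv__(self):
        # csv/json writing: Python's default float repr, so infinity
        # becomes the plain-ascii "inf"
        return repr(self.a)
```

Here str(CsvFloat(0.123456)) still renders as 0.1 for tables, while
CsvFloat(0.123456).__csv__() keeps all of the digits.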
2025-05-15 15:44:30 -05:00
Christopher Haster
7526b469b9 scripts: Adopted globs in all field matchers (-D/--define, -c/--compare)
Globs in CLI attrs (-L'*=bs=%(bs)s' for example), have been remarkably
useful. It makes sense to extend this to the other flags that match
against CSV fields, though this does add complexity to a large number of
smaller scripts.

- -D/--define can now use globs when filtering:

    $ ./scripts/code.py lfs.o -Dfunction='lfsr_file_*'

  -D/--define already accepted a comma-separated list of options, so
  extending this to globs makes sense.

  Note this differs from test.py/bench.py's -D/--define. Globbing in
  test.py/bench.py wouldn't really work since -D/--define is generative,
  not matching. But there's already other differences such as integer
  parsing, range, etc. It's not worth making these perfectly consistent
  as they are really two different tools that just happen to look the
  same.

- -c/--compare now matches with globs when finding the compare entry:

    $ ./scripts/code.py lfs.o -c'lfs*_file_sync'

  This is quite a bit less useful than -D/--define, but makes sense for
  consistency.

  Note -c/--compare just chooses the first match. It doesn't really make
  sense to compare against multiple entries.
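
The two matching rules above can be sketched with Python's fnmatch
module. The helper names are hypothetical, and whether the scripts use
fnmatch at all is an assumption:

```python
import fnmatch

def define_matches(value, patterns):
    """True if value matches any glob in a comma-separated list,
    as described for -D/--define."""
    return any(fnmatch.fnmatchcase(value, p)
            for p in patterns.split(','))

def find_compare(names, pattern):
    """First name matching the glob, or None, as described for
    -c/--compare."""
    return next(
            (n for n in names if fnmatch.fnmatchcase(n, pattern)),
            None)
```

With names in the order lfs_alloc, lfsr_file_sync, lfs_file_sync, the
pattern 'lfs*_file_sync' picks lfsr_file_sync, since only the first
match is used.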

This raised the question of globs in the field specifiers themselves
(-f'bench_*' for example), but I'm rejecting this for now as I need to
draw the complexity/scope _somewhere_, and I'm worried it's already way
over on the too-complex side.

So, for now, field names must always be specified explicitly. Globbing
field names would add too much complexity. Especially considering how
many flags accept field names in these scripts.
2025-05-15 14:28:57 -05:00
Christopher Haster
55ea13b994 scripts: Reverted del to resolve shadowed builtins
I don't know how I completely missed that this doesn't actually work!

Using del _does_ work in Python's repl, but it makes sense the repl may
differ from actual function execution in this case.

The problem is Python still thinks the relevant builtin is a local
variable after deletion, raising an UnboundLocalError instead of
performing a global lookup. In theory this would work if the variable
could be made global, but since global/nonlocal statements are lifted,
Python complains with "SyntaxError: name 'list' is parameter and
global".

And that's A-Ok! Intentionally shadowing language builtins already puts
this code deep into ugly hacks territory.
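
A small reproduction of the failure, next to the builtins-based
workaround. The function names are made up for illustration:

```python
import builtins

def with_del(list):
    list_ = list
    del list
    try:
        # `list` is still compiled as a local name, so this is not a
        # global/builtin lookup
        return list(list_)
    except UnboundLocalError as e:
        return e

def with_builtins(list):
    # keep the argument under a new name, then rebind the shadowed
    # name to the real builtin
    list_, list = list, builtins.list
    return list(list_)
```

Calling with_del((1, 2)) returns the UnboundLocalError, while
with_builtins((1, 2)) returns a list as expected.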
2025-05-15 14:10:42 -05:00
Christopher Haster
71930a5c01 scripts: Tweaked openio comment
Dang, this touched like every single script.
2025-04-16 15:23:06 -05:00
Christopher Haster
c63ed79c5f scripts: Prefer .a for single entry namedtuples
- CsvInt.x -> CsvInt.a
- CsvFloat.x -> CsvFloat.a
- Rev.x -> Rev.a

This matches CsvFrac.a (paired with CsvFrac.b), and avoids confusion
with x/y variables such as Tile.x and Tile.y.

The other contender was .v, since these are cs*v* related types, but
sticking with .a gets the point across that the name really doesn't have
any meaning.

There's also some irony that we're forcing namedtuples to have
meaningless names, but it is useful to have a quick accessor for the
internal value.
2025-04-16 15:23:03 -05:00
Christopher Haster
98b16a9013 scripts: Renamed RInt (and friends) -> CsvInt (and friends)
This prefix was extremely arbitrary anyways.

The prefix Csv* has slightly more meaning than R*, since these scripts
interact with .csv files quite a bit, and it avoids confusion with
rbyd-related things such as Rattr, Ralt, etc.
2025-04-16 15:23:02 -05:00
Christopher Haster
613fa0f27a scripts: Reverted to -p/--percent not providing a path
So now the result scripts always require -d/--diff to diff:

- before: ./scripts/csv.py a.csv -pb.csv
- after:  ./scripts/csv.py a.csv -db.csv -p

For a couple reasons:

- Easier to toggle
- Simpler internally to only have one diff path flag
- The previous behavior was a bit unintuitive
2025-04-16 15:23:00 -05:00
Christopher Haster
270230a833 scripts: Adopted del to resolve shadowed builtins
So:

  all_ = all; del all

Instead of:

  import builtins
  all_, all = all, builtins.all

The del exposes the globally scoped builtin we accidentally shadow.

This requires less magic, and no module imports, though tbh I'm
surprised it works.

It also works in the case where you change a builtin globally, but
that's a bit too crazy even for me...
2025-04-16 15:22:08 -05:00
Christopher Haster
3a290c41ab scripts: Reverted -o/-O to include all by-fields by default
For the same reason we output all field fields by default: Because
machines can process more information than humans can.

Worst case, by fields can still be limited via explicit -b/--by flags.
2025-03-12 21:26:11 -05:00
Christopher Haster
313696ecf9 scripts: Fixed openio issue where some scripts didn't import os
This only failed if "-" was used as an argument (for stdin/stdout), so
the issue was pretty hard to spot.

openio is a heavily copy-pasted function, so it makes sense to just add
the import os to openio directly. Otherwise this mistake will likely
happen again in the future.
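
A sketch of what an openio with a function-local import might look
like. The signature and the fd duplication are assumptions, not a copy
of the actual function:

```python
import sys

def openio(path, mode='r', buffering=-1):
    # imported here so the function still works when copy-pasted into
    # a script that doesn't import os at the top
    import os
    if path == '-':
        # "-" means stdin/stdout, dup the fd so closing the returned
        # file doesn't close the real stdin/stdout
        if 'r' in mode:
            return os.fdopen(
                    os.dup(sys.stdin.fileno()), mode, buffering)
        else:
            return os.fdopen(
                    os.dup(sys.stdout.fileno()), mode, buffering)
    else:
        return open(path, mode, buffering)
```

Only the "-" branch touches os, which is why a missing import can go
unnoticed for so long.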
2025-03-12 21:18:51 -05:00
Christopher Haster
92ac2a757e scripts: Adopted json -> is_json tweak, avoiding name conflict
This was a humorous name conflict that went unnoticed only because we
lazily import json in read_csv.
2025-03-12 21:12:12 -05:00
Christopher Haster
0d134a2830 scripts: Re-added -q/--quiet to result scripts
I forgot that this is still useful for erroring scripts, such as
stack.py when checking for recursion.

Technically this is possible with -o/dev/null, but that's both
unnecessarily complicated and includes the csv encoding cost for no
reason.
2025-03-12 20:02:19 -05:00
Christopher Haster
675a805164 scripts: Added -! as a short-form for --everything
-!/--everything has been useful enough to warrant a short form flag,
and -! is unlikely to conflict with other flags while also getting the
point across that this is a bit of an unusual option.
2025-03-12 20:02:12 -05:00
Christopher Haster
b0976379d7 scripts: Added -i/--internal to ctx.py/structs.py, re-limiting structs.py
This adds -i/--internal to ctx.py and structs.py, which has proven
useful for introspection/debugging. Being able to view the ctx/args of
internal functions is nice, even if they don't actually contribute to
the high-level cost.

This also reverts structs.py to limit to .h files by default, to match
ctx.py, once again relying on dwarf file info. This has been a bit
unreliable in the past, but there's not much else that determines if a
struct is part of the "public interface" in C.

But that's what ctx.py is for.

---

Also fixed an issue where structs appearing in multiple files would have
their sizes added together, which ends up with some pretty confusing
results (sizeof(uint32_t) => 8?).
2025-03-12 20:00:56 -05:00
Christopher Haster
9e22167a31 scripts: Re-adopted result prefixes
Now that I'm looking into some higher-level scripts, being able to merge
results without first renaming everything is useful.

This gives most scripts an implicit prefix for field fields, but _not_
by fields, allowing easy merging of results from different scripts:

  $ ./scripts/stack.py lfs.ci -o-
  function,stack_frame,stack_limit
  lfs_alloc,288,1328
  lfs_alloc_discard,8,8
  lfs_alloc_findfree,16,32
  ...

At least now these have better support in scripts with the addition of
the --prefix flag (this was tricky for csv.py), which allows explicit
control over field field prefixes:

  $ ./scripts/stack.py lfs.ci -o- --prefix=
  function,frame,limit
  lfs_alloc,288,1328
  lfs_alloc_discard,8,8
  lfs_alloc_findfree,16,32
  ...

  $ ./scripts/stack.py lfs.ci -o- --prefix=wonky_
  function,wonky_frame,wonky_limit
  lfs_alloc,288,1328
  lfs_alloc_discard,8,8
  lfs_alloc_findfree,16,32
  ...
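
The prefixing rule itself is simple. A sketch, assuming results are
plain dicts and using a hypothetical helper name:

```python
def prefix_fields(row, by, prefix):
    """Prefix every field field in row, leaving by fields untouched."""
    return {(k if k in by else prefix + k): v
            for k, v in row.items()}
```

So a row with function, frame, and limit, with by=['function'] and
prefix='stack_', ends up with function, stack_frame, and stack_limit.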
2025-03-12 19:10:17 -05:00
Christopher Haster
aae03be54b scripts: Fixed diff result sorting
This was a bit broken when r was None. Which is unusual, but happens
when rendering added/removed diff results.
2025-03-12 19:10:17 -05:00
Christopher Haster
299e2604c6 scripts: Changed -o/-O to an exclusive operation
So:

  $ ./scripts/code.py lfs.o -o- -q

Becomes:

  $ ./scripts/code.py lfs.o -o-

The original intention of -o/-O _not_ being exclusive (aka table is
still rendered unless disabled with -q/--quiet), was to allow results to
be written to csv files and rendered to tables in a single pass.

But this was never useful. Heck, we're not even using this in our
Makefile right now because it would make the rule dependencies more
complicated than it's worth. Even for long-running result scripts
(perf.py, perfbd.py, etc), most of the work is building that csv file,
the cost of rendering a table in a second pass is negligible.

In every case I've used -o/-O, I've also wanted -q/--quiet, and almost
always forget this on the first run. So might as well make the expected
behavior the actual behavior.

---

As a plus, this let us simplify some of the scripts a bit, by replacing
visibility filters with -o/-O dependent by-fields.
2025-03-12 19:10:17 -05:00
Christopher Haster
e71aca65d9 scripts: Adopted default visibility in scripts with complex fields
This makes it so scripts with complex fields will still output all
fields to output csv/json files, while only showing a user-friendly
subset unless -f/--field is explicitly provided.

While internal fields are often too much information to show by default,
csv/json files are expected to go to other scripts, not humans. So more
information is more useful up until you actually hit a performance
bottleneck.

And if you _do_ somehow manage to hit a performance bottleneck, you can
always limit the output with explicit -f/--field flags.
2025-03-12 19:10:17 -05:00
Christopher Haster
051bf66f9a scripts: Tried to handle -d/--diff results consistently
With this, we apply the same result modifiers (exprs/defines/hot/etc) to
both the input results and -d/--diff results. So if both start with the
same format, diffing/hotifying/etc should work as expected.

This is really the only way I can see -d/--diff results working with
result modifiers in a way that makes sense.

The downside of this is that you can't save results with some complex
operation applied, and then diff while applying the same operation,
since most of the newer operations (hotify) are _not_ idempotent.

Fortunately the two alternatives are not unreasonable:

1. Save results _without_ the operation applied, since the operation
   will be applied to both the input and diff results.

   This is a bit asymmetric, but should work.

2. Apply the operation to the input and then pipe to csv.py for diffing.

This used to "just work" when we did _not_ apply operations to output
csv/json, but this was really just equivalent to option 1.

I think the moral of the story is you can solve any problem with enough
chained csv.py calls.
2025-03-12 19:10:17 -05:00
Christopher Haster
2f20f53e90 scripts: csv.py: Reverted define filtering to before expr eval
It's just too unintuitive to filter after exprs.

Note this is consistent with how exprs/mods are evaluated. Exprs/mods
can't reference other exprs/mods because csv.py is only single-pass, so
allowing defines to reference exprs/mods is surprising.

And the solution to needing these sort of post-expr/mod references is
the same for defines: You can always chain multiple csv.py calls.

The reason defines were changed to evaluate after expr eval was because
this seemed inconsistent with other result scripts, but this is not
actually the case. Other result scripts simply don't have exprs/mods, so
filtering in fold is the same as filtering during collection. Note that
even in fold, filtering is done _before_ the actual fold/sum operation.

---

Also fixed a recursive-define regression when folding. Counter-
intuitively, we _don't_ want to recursively apply define filters. If we
do the results will just end up too confusing to be useful.
2025-03-12 19:10:17 -05:00
Christopher Haster
e851c654c5 scripts: Fixed typo hiding zero-sized results in table renderer
This should either have checked diff_result==None, or we should be
mapping diff_result=None => diff_result_=None. To be safe I've done
both.

This was a nasty typo and I only noticed because ctx.py stopped printing
"cycle detected" for our linked-lists (which are expected to be cyclic).
2025-03-12 19:10:17 -05:00
Christopher Haster
5811b11131 scripts: csv.py: Replaced -l/--label with -I/-B/-F for hidden fields
It felt weird that adding hidden fields required changing existing
flags unrelated to the field you actually want to affect, and the
upper/lower flag thing seems to work well for -s/-S sooo...

- Replaced -l/--label with -B/--hidden-by for by fields that can
  be hidden from the table renderer.

- Added -F/--hidden-field as a similar thing for field fields.

- Better integrated -i/--enumerate into by fields, now these actually
  maintain related order. And of course added a matching
  -I/--hidden-enumerate flag.

The only downside is this is eating a lot of flag names.. But one of the
nice things about limiting this complexity to csv.py is it avoids these
flag names cluttering up the other result scripts.

---

The -F/--hidden-fields flag I'm not so sure about, since field exprs
can't really reference each other (single pass). But it does provide
symmetry with -B/--hidden-by, and reserves the name in case hidden field
fields are more useful in the future.

Unfortunately it _is_ annoyingly inconsistent with other hidden fields
(-S/--sort, -D/--define, etc) in that it does end up in output csvs...

But this script is already feeling way over-engineered as is.
2025-03-12 19:10:17 -05:00
Christopher Haster
7789714560 scripts: Adopted single folding pass, fixing perf[bd].py -r/--hot issue
There's an ordering issue with hotifying and folding when we have
multiple foldable results with children. This was hard to notice since
most of the recursive scripts have unique results, but it _is_ an issue
for perf.py/perfbd.py, which rely on result folding to merge samples.

The fix is to fold _before_ hotifying.

We could fold multiple times to avoid changing the behavior of the
result scripts, but instead I've just moved the folding in the table
renderer up into the relevant main functions. This means 1. we only fold
once, and 2. folding affects outputted csv/json files.

I'm a bit on the fence about this behavior change, but it is a bit more
consistent with how -r/--hot, -z/--depth, etc, affect both table and
csv/json results consistently.

Maybe we should move towards the table render always reflecting the
csv/json results? Most csv/json usage is with -q/--quiet anyways...

---

This does create a new risk in that the table renderer can hide results
if they aren't folded first.

To hopefully avoid this I've added an assert in the table renderer if it
notices results being hidden.
2025-03-12 19:10:17 -05:00
Christopher Haster
b2768becaa scripts: Added -l/--labels to csv.py
This gives csv.py access to a hidden feature in our table renderer used
by some of the other scripts: fields that affect by-field grouping, but
aren't actually printed.

For example, this prevents summing same named functions in different
files, but only shows the function name in the table render:

  $ ./scripts/csv.py lfs.code.csv -bfile -bfunction -lfunction
  function                                size
  lfs_alloc                                398
  lfs_alloc_discard                         31
  lfs_alloc_findfree                        77
  ...

This is especially useful when enumerating results. For example, this
prevents any summing without extra table noise:

  $ ./scripts/csv.py lfs.code.csv -i -bfunction -fsize -lfunction
  function                                size
  lfs_alloc                                398
  lfs_alloc_discard                         31
  lfs_alloc_findfree                        77
  ...

I also tweaked -b/--by field defaults a bit to account for
enumerate/label fields a bit better.
2025-03-12 19:10:17 -05:00
Christopher Haster
748815bb46 scripts: Disentangled -r/--hot and -i/--enumerate
This removes most of the special behavior around how -r/--hot and
-i/--enumerate interact. This does mean -r/--hot risks folding results
if -i/--enumerate is not specified, but this is _technically_ a valid
operation.

For most of the recursive result scripts, I've replaced the "i" field
with separate "z" and "i" fields for depth and field number, which I
think is a bit more informative/useful.

I've also added a default-hidden "off" field to structs.py/ctx.py, since
we have that info available. I considered replacing "i" with this, but
decided against it since non-zero offsets for union members would risk
being confusing/mistake prone.
2025-03-12 19:10:17 -05:00
Christopher Haster
ac30a20d12 scripts: Reworked to support optional json input/output
Guh

This may have been more work than I expected. The goal was to allow
passing recursive results (callgraph info, structs, etc) between
scripts, which is simply not possible with csv files.

Unfortunately, this raised a number of questions: What happens if a
script receives recursive results? -d/--diff with recursive results?
How to prevent folding of ordered results (structs, hot, etc) in piped
scripts? etc.

And ended up with a significant rewrite of most of the result scripts'
internals.

Key changes:

- Most result scripts now support -O/--output-json in addition to
  -o/--json, with -O/--output-json including any recursive results in
  the "children" field.

- Most result scripts now support both csv and json as input to relevant
  flags: -u/--use, -d/--diff, -p/--percent. This is accomplished by
  looking for a '[' as the first character to decide if an input file is
  json or csv.

  Technically this breaks if your json has leading whitespace, but why
  would you ever keep whitespace around in json? The human-editability
  of json was already ruined the moment comments were disallowed.

- csv.py requires all fields to be explicitly defined, so added
  -i/--enumerate, -Z/--children, and -N/--notes. At least we can provide
  some reasonable defaults so you shouldn't usually need to type out the
  whole field.

- Notably, the rendering scripts (plot.py, treemapd3.py, etc) and
  test/bench scripts do _not_ support json. csv.py can always convert
  to/from json when needed.

- The table renderer now supports diffing recursive results, which is
  nice for seeing how the hot path changed in stack.py/perf.py/etc.

- Moved the -r/--hot logic up into main, so it also affects the
  outputted results. Note it is impossible for -z/--depth to _not_
  affect the outputted results.

- We now sort in one pass, which is in theory more efficient.

- Renamed -t/--hot -> -r/--hot and -R/--reverse-hot, matching -s/-S.

- Fixed an issue with -S/--reverse-sort where only the short form was
  actually reversed (I misunderstood what argparse passes to Action
  classes).

- csv.py now supports json input/output, which is funny.
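
The json/csv detection described above can be sketched as follows. The
read_results name and the use of csv.DictReader are assumptions for
illustration:

```python
import csv
import io
import json

def read_results(text):
    """Parse results as json if the first character is '[',
    otherwise as csv."""
    if text.startswith('['):
        return json.loads(text)
    return list(csv.DictReader(io.StringIO(text)))
```

Note that json keeps numbers as numbers while csv fields come back as
strings, and that json with leading whitespace would be treated as csv
here.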
2025-03-12 19:09:43 -05:00
Christopher Haster
361cd3fec0 scripts: Added missing sys imports
Unfortunately the import sys in the argparse block was hiding missing
sys imports.

The mistake was assuming an import sys inside an if block would be
scoped to that block, but Python's late binding strikes again...
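
A small demonstration of how an import inside an if block can hide a
missing import. The module source and the RUN_AS_MAIN flag are made up,
standing in for the if __name__ == "__main__" check:

```python
import types

SOURCE = '''
if RUN_AS_MAIN:
    import sys

def python_major():
    # relies on a module-level sys that is only bound if the
    # if block above ran
    return sys.version_info.major
'''

def load(run_as_main):
    """Execute SOURCE as a fresh module with the given flag."""
    mod = types.ModuleType('demo')
    mod.RUN_AS_MAIN = run_as_main
    exec(SOURCE, mod.__dict__)
    return mod
```

load(True).python_major() works because the import binds sys for the
whole module, while load(False).python_major() raises a NameError.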
2025-01-28 14:41:45 -06:00
Christopher Haster
0adec7f15c scripts: Replaced __builtins__ with builtins
Apparently __builtins__ is a CPython implementation detail, and behaves
differently when executed vs imported???

import builtins is the correct way to go about this.
2025-01-28 14:41:45 -06:00
Christopher Haster
62cc4dbb14 scripts: Disabled local import hack on import
Moved local import hack behind if __name__ == "__main__"

These scripts aren't really intended to be used as python libraries.
Still, it's useful to import them for debugging and to get access to
their juicy internals.
2025-01-28 14:41:30 -06:00
Christopher Haster
1d8d0785fc scripts: More flags to control table renderer, -Q/--small-table, etc
Instead of trying to be too clever, this just adds a bunch of small
flags to control parts of table rendering:

- --no-header - Don't show the header.
- --small-header - Don't show by field names.
- --no-total - Don't show the total.
- -Q/--small-table - Equivalent to --small-header + --no-total.

Note that -Q/--small-table replaces the previous -Y/--summary +
-c/--compare hack, while also allowing a similar table style for
non-compare results.
2024-12-18 14:03:35 -06:00
Christopher Haster
4c87d59c7b scripts: Simplified result->file mapping, dropped collect_dwarf_files
This reverts per-result source file mapping, and tears out a bunch of
messy dwarf parsing code. Results from the same .o file are now mapped
to the same source file.

This was just way too much complexity for slightly better result->file
mapping, which risked losing results accidentally mapped to the wrong
file.

---

I was originally going to revert all the way back to relying strictly on
the .o name and --build-dir (490e1c4) (this is the simplest solution),
but after poking around in dwarf-info a bit, I realized we do have
access to the original source file in DW_TAG_compile_unit's
DW_AT_comp_dir + DW_AT_name.

This is much simpler/more robust than parsing objdump --dwarf=rawline,
and avoids needing --build-dir in a bunch of scripts.

---

This also reverts stack.py to rely only on the .ci files. These seem as
reliable as DW_TAG_compile_unit while simplifying things significantly.

Symbol mapping used to be a problem, but this was fixed by using the
symbol in the title field instead of the label field (which strips some
optimization suffixes?)
2024-12-17 15:34:39 -06:00
Christopher Haster
6a6ed0f741 scripts: Dropped cycle detection from table renderer
Now that cycle detection is always done at result collection time, we
don't need this in the table renderer itself.

This had a tendency to cause problems for non-function scripts (ctx.py,
structs.py).
2024-12-16 19:26:21 -06:00
Christopher Haster
dd389f23ee scripts: Switched to sorted sets for result notes
God, I wish Python had an OrderedSet.

This is a fix for duplicate "cycle detected" notes when using -t/--hot.
This mix of merging both _hot_notes and _notes in the HotResult class is
tricky when the underlying container is a list.

The order is unlikely to be guaranteed anyways, when different results
with different notes are folded.

And if we ever want more control over the order of notes in result
scripts we can always change this back later.
2024-12-16 19:22:14 -06:00
Christopher Haster
3e03c2ee7f scripts: Adopted better input file handling in result scripts
- Error on no/insufficient files.

  Instead of just returning no results. This is more useful when
  debugging complicated bash scripts.

- Use elf magic to allow any file order in perfbd.py/stack.py.

  This was already implemented in stack.py, now also adopted in
  perfbd.py.

  Elf files always start with the magic string "\x7fELF", so we can use
  this to figure out the types of input files without needing to rely on
  argument order.

  This is just one less thing to worry about when invoking these
  scripts.
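
A sketch of the magic check. The helper names are hypothetical, and how
the scripts actually sort their inputs is an assumption:

```python
def is_elf(path):
    """True if the file starts with the elf magic string."""
    with open(path, 'rb') as f:
        return f.read(4) == b'\x7fELF'

def split_inputs(paths):
    """Split paths into (elf files, everything else), regardless of
    the order they were given in."""
    elfs = [p for p in paths if is_elf(p)]
    others = [p for p in paths if not is_elf(p)]
    return elfs, others
```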
2024-12-16 19:13:22 -06:00
Christopher Haster
4325a06277 scripts: Fixed incorrect files on recursive results
It's been a while since I've been hurt by Python's late-binding
variables. In this case the scope-creep of the "file" variable hid that
we didn't actually know which recursive result belonged to which file.
Instead we were just assigning whatever the most recent top-level result
was.

This is fixed by looking up the correct file in childrenof. Though this
unfortunately does add quite a bit of noise.
2024-12-16 19:12:46 -06:00
Christopher Haster
ac79c88c6f scripts: Improved cycle detection notes in scripts
- Prevented childrenof memoization from hiding the source of a
  detected cycle.

- Deduplicated multiple cycle detected notes.

- Fixed note rendering when last column does not have a notes list.
  Currently this only happens when entry is None (no results).
2024-12-16 18:01:46 -06:00
Christopher Haster
02ccbdfed2 scripts: Enabled symbol->dwarf mapping via address
We have symbol->addr info and dwarf->addr info (DW_AT_low_pc), so why
not use this to map symbols to dwarf entries?

This should hopefully be more reliable than the current name based
heuristic, but only works for functions (DW_TAG_subprogram).

Note that we still have to fuzzy match due to thumb-bit weirdness (small
rant below).

---

Ok. Why in Thumb does the symbol table include the thumb bit, but the
dwarf info does not?? Would it really have been that hard to add the
thumb bit to DW_AT_low_pc so symbols and dwarf entries match?

So, because of Thumb, we can't expect either the address or name to
match exactly. The best we can do is binary search and expect the symbol
to point somewhere _within_ the dwarf's DW_AT_low_pc/DW_AT_high_pc
range.

Also why does DW_AT_high_pc store the _size_ of the function?? Why isn't
it, idunno, the _high_pc_? I get that the size takes up less space when
leb128 encoding, but surely there could have been a better name?
2024-12-16 18:01:46 -06:00
Christopher Haster
eb09865868 scripts: Resolve DW_AT_abstract_origin during dwarf collection
Sometimes I feel like dwarf-info is designed to be as error-prone as
possible.

In this case, DW_AT_abstract_origin indicates that one dwarf entry
should inherit the attributes of another. If you don't know this, it's
easy to miss relevant dwarf entries due to missing name fields, etc.

Expanding DW_AT_abstract_origin lazily would be tricky due to how our
DwarfInfo class is structured, so instead I am just expanding
DW_AT_abstract_origins during collect_dwarf_info.

Note this doesn't handle recursive DW_AT_abstract_origins, but there is
at least an assert.

---

It does seem like DW_AT_abstract_origin is intended to be limited to
"Inline instances of inline subprograms" and "Out-of-line instances of
inline subprograms" according to the DWARF5 spec, but it's unclear if
this is a rule or suggestion...

This hasn't been an issue for existing scripts, but is needed for some
ongoing stack.py rework. Otherwise we don't find "out-of-line instances
of inline subprograms" (optimized functions?) correctly.
2024-12-16 18:01:46 -06:00
Christopher Haster
19cd428a3c scripts: Added DwarfEntry.info to help find recursive tags
Long story short: DW_TAG_lexical_blocks are annoying.

In order to search the full tree of children of a given dwarf entry, we
need a recursive function somewhere. We might as well make this function
a part of the DwarfEntry class so we can share it with other scripts.

Note this is roughly the same as collect_dwarf_info, but limited to
the children of a given dwarf entry.

This is useful for ongoing stack.py rework.
2024-12-16 18:01:46 -06:00
Christopher Haster
faf4d09c34 scripts: Added __repr__ to RInt and friends
Just a minor quality of life feature to help debugging these scripts.
2024-12-16 18:01:46 -06:00
Christopher Haster
eb7fff8843 scripts: Include all entries in collect_dwarf_info
Note this only affects the top-level entries. Dwarf-info contains a
hierarchical structure, but for some scripts we just don't care. Finding
DW_TAG_variables in nested DW_TAG_lexical_blocks for example.

This is useful for ongoing stack.py rework.
2024-12-16 18:01:46 -06:00
Christopher Haster
308b4b6080 scripts: Made dwarf tags explicit in ctx.py/structs.py
This will make ctx.py/structs.py more likely to error on unknown tags,
which is preferable to silently reporting incorrect numbers.
2024-12-16 18:01:46 -06:00
Christopher Haster
b90b2953ea scripts: Some minor regex cleanup
Just trying to make regex in scripts a bit more consistent. Though regex
being regex this may be fruitless.
2024-12-16 18:01:46 -06:00
Christopher Haster
28d89eb009 scripts: Adopted simpler+faster heuristic for symbol->dwarf mapping
After tinkering around with the scripts for a bit, I've started to
realize difflib is kinda... really slow...

I don't think this is strictly difflib's fault. It's a pure python
library (proof of concept?), may be prioritizing quality over speed, and
I may be throwing too much data at it.

difflib does have quick_ratio() and real_quick_ratio() for faster
comparisons, but while looking into these for correctness, I realized
there's a simpler heuristic we can use since GCC's optimized names seem
strictly additive: Choose the name that matches with the smallest prefix
and suffix.

So comparing, say, lfsr_rbyd_lookup to __lfsr_rbyd_lookup.constprop.0:

    lfsr_rbyd_lookup
  __lfsr_rbyd_lookup.constprop.0
   |'------.-------''----.-----'
   '-------|-----.   .---'
           v     v   v
  key: (matches, 2, 12)

Note we prioritize the prefix, since it seems GCC's optimized names are
strictly suffixes. We also now fail to match if the dwarf name is not
a substring, instead of just finding the most similar looking symbol.

This results in both faster and more robust symbol->dwarf mapping:

  before: time code.py -Y: 0.393s
  after:  time code.py -Y: 0.152s

  (this is WITH the fast dict lookup on exact matches!)

This also drops difflib from the scripts. So one less dependency to
worry about.
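
A sketch of the heuristic. The function names are hypothetical, and the
exact-match dict lookup mentioned above is left out:

```python
def match_key(name, sym):
    """Return (prefix length, suffix length) if name is a substring of
    sym, otherwise None."""
    i = sym.find(name)
    if i < 0:
        return None
    return (i, len(sym) - i - len(name))

def best_symbol(name, symbols):
    """Choose the symbol containing name with the smallest prefix,
    then the smallest suffix, or None if nothing matches."""
    best = None
    for sym in symbols:
        key = match_key(name, sym)
        if key is not None and (best is None or key < best[0]):
            best = (key, sym)
    return best[1] if best is not None else None
```

For the example above, match_key('lfsr_rbyd_lookup',
'__lfsr_rbyd_lookup.constprop.0') gives (2, 12).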
2024-12-16 18:01:33 -06:00
Christopher Haster
e77010265e scripts: Replaced nm with objdump in code.py/data.py
There is an argument for preferring nm for code size measurements due to
portability. But I'm not sure this really holds up these days with
objdump being so prevalent.

We already depend on objdump for ctx/structs/perf and other dwarf info,
so using objdump -t to get symbol information means one less tool to
depend on/pass around when cross-compiling.

As a minor benefit this also gives us more control over which sections
to include, instead of relying on nm's predefined t/r/d/b section types.

---

Note code.py/data.py did _not_ require objdump before this. They did use
objdump to map symbols to source files, but would just guess if
objdump wasn't available.
2024-12-15 16:39:04 -06:00
Christopher Haster
8526cd9cf1 scripts: Prevented i/children/notes result field collisions
Without this, naming a column i/children/notes in csv.py could cause
things to break. Unlikely for children/notes, but very likely for i,
especially when benchmarking.

Unfortunately namedtuple makes this tricky. I _want_ to just rename
these to _i/_children/_notes and call the problem solved, but namedtuple
reserves all underscore-prefixed fields for its own use.

As a workaround, the table renderer now looks for _i/_children/_notes at
the _class_ level, as an optional name of which namedtuple field to use.
This way Result types can stay lightweight namedtuples while including
extra table rendering info without risk of conflicts.

This also makes the HotResult type a bit more funky, but that's not a
big deal.
2024-12-15 16:36:14 -06:00
Christopher Haster
183ede1b83 scripts: Option for result scripts to force children ordering
This extends the recursive part of the table renderer to sort children
by the optional "i" field, if available.

Note this only affects children entries. The top-level entries are
strictly ordered by the relevant "by" fields. I just haven't seen a use
case for this yet, and not sorting "i" at the top-level reduces the
number of things that can go wrong for scripts without children.

---

This also rewrites -t/--hot to take advantage of children ordering by
injecting a totally-not-hacky HotResult subclass.

Now -t/--hot should be strictly ordered by the call depth! Though note
entries that share "by" fields are still merged...

This also gives us a way to introduce the "cycle detected" note and
respect -z/--depth, so overall a big improvement for -t/--hot.
2024-12-15 16:35:52 -06:00
Christopher Haster
e6ed785a27 scripts: Removed padding from tail notes in tables
We don't really need padding for the notes on the last column of tables,
which is where row-level notes end up.

This may seem minor, but not padding here avoids quite a bit of
unnecessary line wrapping in small terminals.
2024-12-15 16:35:29 -06:00
Christopher Haster
512cf5ad4b scripts: Adopted ctx.py-related changes in other result scripts
- Adopted higher-level collect data structures:

  - high-level DwarfEntry/DwarfInfo class
  - high-level SymInfo class
  - high-level LineInfo class

  Note these had to be moved out of function scope due to pickling
  issues in perf.py/perfbd.py. These were only function-local to
  minimize scope leak so this fortunately was an easy change.

- Adopted better list-default patterns in Result types:

    def __new__(..., children=None):
        return Result(..., children if children is not None else [])

  A classic python footgun.

- Adopted notes rendering, though this is only used by ctx.py at the
  moment.

- Reverted to sorting children entries, for now.

  Unfortunately there's no easy way to sort the result entries in
  perf.py/perfbd.py before folding. Folding is going to make a mess
  of more complicated children anyways, so another solution is
  needed...

And some other shared miscellany.
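
For reference, a runnable sketch of the list-default footgun and the
None-default pattern mentioned above. The two-field Result types are
simplified stand-ins, not the scripts' actual Result types:

```python
import collections as co

class SharedResult(co.namedtuple('SharedResult', ['name', 'children'])):
    __slots__ = ()
    def __new__(cls, name, children=[]):
        # footgun: the same default list is shared by every instance
        return super().__new__(cls, name, children)

class Result(co.namedtuple('Result', ['name', 'children'])):
    __slots__ = ()
    def __new__(cls, name, children=None):
        # a fresh list per instance unless one is passed explicitly
        return super().__new__(cls, name,
                children if children is not None else [])
```

Appending to SharedResult('a').children also changes
SharedResult('b').children, while each Result gets its own list.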
2024-12-15 15:41:11 -06:00
Christopher Haster
55d01f69f9 scripts: Adopted ctx.py-related changes in structs.py
- Dropped --internal flag, structs.py includes all structs now.

  No reason to limit structs.py to public structs if ctx.py exists.

- Added struct/union/enum prefixes to results (enums were missing in
  ctx.py).

- Only sort children layers if explicitly requested. This should
  preserve field order, which is nice.

- Adopt more advanced FileInfo/DwarfInfo classes.

- Adopted table renderer changes (notes rendering).
2024-12-15 15:10:49 -06:00