Apparently __builtins__ is a CPython implementation detail, and behaves
differently when executed vs imported???
import builtins is the correct way to go about this.
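A quick illustration of the difference (in CPython, __builtins__ is the builtins module when a file is executed as __main__, but a plain dict when the same file is imported):

```python
# The builtins module has a stable, documented interface either way,
# unlike __builtins__, which is a CPython implementation detail:
import builtins

assert builtins.len is len
assert builtins.print is print
```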
Moved local import hack behind if __name__ == "__main__"
These scripts aren't really intended to be used as python libraries.
Still, it's useful to import them for debugging and to get access to
their juicy internals.
Instead of trying to be too clever, this just adds a bunch of small
flags to control parts of table rendering:
- --no-header - Don't show the header.
- --small-header - Don't show the by-field names.
- --no-total - Don't show the total.
- -Q/--small-table - Equivalent to --small-header + --no-total.
Note that -Q/--small-table replaces the previous -Y/--summary +
-c/--compare hack, while also allowing a similar table style for
non-compare results.
This reverts per-result source file mapping, and tears out a bunch of
messy dwarf parsing code. Results from the same .o file are now mapped
to the same source file.
This was just way too much complexity for slightly better result->file
mapping, which risked losing results accidentally mapped to the wrong
file.
---
I was originally going to revert all the way back to relying strictly on
the .o name and --build-dir (490e1c4) (this is the simplest solution),
but after poking around in dwarf-info a bit, I realized we do have
access to the original source file in DW_TAG_compile_unit's
DW_AT_comp_dir + DW_AT_name.
This is much simpler/more robust than parsing objdump --dwarf=rawline,
and avoids needing --build-dir in a bunch of scripts.
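A rough sketch of the idea, assuming the common binutils text format for objdump --dwarf=info (the source_file helper here is illustrative, not the script's actual code):

```python
import os.path

# Illustrative sketch: recover the original source file from the first
# DW_AT_name/DW_AT_comp_dir pair in `objdump --dwarf=info` output. The
# compile unit is the first DIE, so first-match is usually enough.
# Splitting on the last ':' handles both direct and indirect strings.
def source_file(dwarf_dump):
    name = comp_dir = None
    for line in dwarf_dump.splitlines():
        if 'DW_AT_name' in line and name is None:
            name = line.split(':')[-1].strip()
        elif 'DW_AT_comp_dir' in line and comp_dir is None:
            comp_dir = line.split(':')[-1].strip()
    if name is None:
        return None
    # DW_AT_name may already be an absolute path
    return name if os.path.isabs(name) else os.path.join(comp_dir or '', name)
```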
---
This also reverts stack.py to rely only on the .ci files. These seem as
reliable as DW_TAG_compile_unit while simplifying things significantly.
Symbol mapping used to be a problem, but this was fixed by using the
symbol in the title field instead of the label field (which strips some
optimization suffixes?)
If we're not using these results, no reason to collect all of the
children.
Note that we still need to recurse for other measurements (limit, struct
size, etc).
This has a measurable, but small, impact on runtime:
stack.py -z0 -Y: 0.202s
stack.py -z1 -Y: 0.162s (~-19.8%)
ctx.py -z0 -Y: 0.112s
ctx.py -z1 -Y: 0.098s (~-12.5%)
Now that cycle detection is always done at result collection time, we
don't need this in the table renderer itself.
This had a tendency to cause problems for non-function scripts (ctx.py,
structs.py).
God, I wish Python had an OrderedSet.
This is a fix for duplicate "cycle detected" notes when using -t/--hot.
Merging both _hot_notes and _notes in the HotResult class is
tricky when the underlying container is a list.
The order is unlikely to be guaranteed anyways, when different results
with different notes are folded.
And if we ever want more control over the order of notes in result
scripts we can always change this back later.
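For what it's worth, insertion-ordered dicts get close enough; a sketch of order-preserving note dedup (merge_notes is a hypothetical helper, not the actual HotResult code):

```python
# dicts preserve insertion order (guaranteed since Python 3.7), so
# dict.fromkeys acts as a poor man's OrderedSet when merging notes:
def merge_notes(*note_lists):
    return list(dict.fromkeys(
        note for notes in note_lists for note in notes))

# duplicate "cycle detected" notes collapse, first-seen order kept
print(merge_notes(
    ['cycle detected', 'exceeded limit'],
    ['cycle detected']))
# -> ['cycle detected', 'exceeded limit']
```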
- Error on no/insufficient files.
Instead of just returning no results. This is more useful when
debugging complicated bash scripts.
- Use elf magic to allow any file order in perfbd.py/stack.py.
This was already implemented in stack.py, now also adopted in
perfbd.py.
Elf files always start with the magic string "\x7fELF", so we can use
this to figure out the types of input files without needing to rely on
argument order.
This is just one less thing to worry about when invoking these
scripts.
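The detection itself is tiny; a sketch (is_elf is a hypothetical name):

```python
# ELF files always start with the 4-byte magic b'\x7fELF', so the file
# type can be sniffed from content instead of relying on argument order:
def is_elf(path):
    with open(path, 'rb') as f:
        return f.read(4) == b'\x7fELF'
```

Non-ELF inputs simply fail the check and can be treated as text.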
It's been a while since I've been hurt by Python's late-binding
variables. In this case the scope-creep of the "file" variable hid that
we didn't actually know which recursive result belonged to which file.
Instead we were just assigning whatever the most recent top-level result
was.
This is fixed by looking up the correct file in childrenof. Though this
unfortunately does add quite a bit of noise.
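The footgun in miniature (file names are illustrative):

```python
# Python closures capture variables, not values, so every lambda built
# in the loop ends up seeing the *last* value of "file":
hooks = []
for file in ['lfs.c', 'lfs_util.c']:
    hooks.append(lambda: file)
assert [h() for h in hooks] == ['lfs_util.c', 'lfs_util.c']  # oops

# Binding the current value as a default argument avoids this:
hooks = []
for file in ['lfs.c', 'lfs_util.c']:
    hooks.append(lambda file=file: file)
assert [h() for h in hooks] == ['lfs.c', 'lfs_util.c']
```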
See previous commit for the issues with stack.py's current approach. I'm
convinced dwarf-info simply does not contain enough info to figure out
stack usage.
There is one last idea, which is to parse the disassembly. In theory
you only need to understand calls, branches (for control-flow), and
push/pop instructions to figure out the worst-case stack usage. But this
would be ISA-specific and error-prone, so it probably shouldn't
_replace_ the -fcallgraph-info=su based stack.py.
So, out of ideas, reverting.
---
It's worth noting this isn't a trivial revert. There's a couple
interesting changes in stack.py:
- We now use .o files to map callgraph nodes to relevant symbol names.
This should be a bit more robust than relying only on the names in the
.ci files, and guarantees function names line up with other
symbol-based scripts (code.py, ctx.py, etc).
This also lets us warn on missing callgraph nodes, in case the
callgraph info is incomplete.
- Callgraph parsing should be quite a bit more robust now. Added a small
(and reusable?) Parser class.
- Moved cycle detection into result collection.
This should let us drop cycle detection from the table renderer
eventually.
- Prevented childrenof memoization from hiding the source of a
detected cycle.
- Deduplicated multiple cycle detected notes.
- Fixed note rendering when last column does not have a notes list.
Currently this only happens when entry is None (no results).
We have symbol->addr info and dwarf->addr info (DW_AT_low_pc), so why
not use this to map symbols to dwarf entries?
This should hopefully be more reliable than the current name based
heuristic, but only works for functions (DW_TAG_subprogram).
Note that we still have to fuzzy match due to thumb-bit weirdness (small
rant below).
---
Ok. Why in Thumb does the symbol table include the thumb bit, but the
dwarf info does not?? Would it really have been that hard to add the
thumb bit to DW_AT_low_pc so symbols and dwarf entries match?
So, because of Thumb, we can't expect either the address or name to
match exactly. The best we can do is binary search and expect the symbol
to point somewhere _within_ the dwarf's DW_AT_low_pc/DW_AT_high_pc
range.
Also why does DW_AT_high_pc store the _size_ of the function?? Why isn't
it, idunno, the _high_pc_? I get that the size takes up less space when
leb128 encoding, but surely there could have been a better name?
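A sketch of the resulting lookup (sym_to_dwarf and the (low_pc, size, entry) tuples are illustrative, not the script's actual structures):

```python
import bisect

# Binary search dwarf ranges for the entry containing a symbol address.
# DW_AT_high_pc stores the *size*, so the range is [low, low + size).
# Thumb symbols set bit 0 while dwarf addresses don't, so strip it.
def sym_to_dwarf(ranges, addr):
    # ranges: list of (low_pc, size, entry), sorted by low_pc
    addr &= ~1
    lows = [low for low, _, _ in ranges]
    i = bisect.bisect_right(lows, addr) - 1
    if i >= 0:
        low, size, entry = ranges[i]
        if low <= addr < low + size:
            return entry
    return None

ranges = [(0x8000, 0x40, 'lfs_mount'), (0x8040, 0x20, 'lfs_format')]
print(sym_to_dwarf(ranges, 0x8001))  # thumb-bit symbol -> lfs_mount
```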
Sometimes I feel like dwarf-info is designed to be as error-prone as
possible.
In this case, DW_AT_abstract_origin indicates that one dwarf entry
should inherit the attributes of another. If you don't know this, it's
easy to miss relevant dwarf entries due to missing name fields, etc.
Expanding DW_AT_abstract_origin lazily would be tricky due to how our
DwarfInfo class is structured, so instead I am just expanding
DW_AT_abstract_origins during collect_dwarf_info.
Note this doesn't handle recursive DW_AT_abstract_origins, but there is
at least an assert.
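Roughly what the expansion looks like, modeling entries as offset -> attribute dicts (a simplification of the actual DwarfInfo structure):

```python
# Expand DW_AT_abstract_origin eagerly: copy any attribute the
# referencing entry is missing from its origin. Recursive origins are
# not handled, just asserted against.
def expand_origins(entries):
    for entry in entries.values():
        origin_off = entry.get('DW_AT_abstract_origin')
        if origin_off is None:
            continue
        origin = entries[origin_off]
        assert 'DW_AT_abstract_origin' not in origin, (
            'recursive DW_AT_abstract_origin?')
        for k, v in origin.items():
            entry.setdefault(k, v)

entries = {
    0x10: {'DW_AT_name': 'lfsr_rbyd_lookup'},
    0x20: {'DW_AT_abstract_origin': 0x10, 'DW_AT_low_pc': 0x8000},
}
expand_origins(entries)
print(entries[0x20]['DW_AT_name'])  # -> lfsr_rbyd_lookup
```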
---
It does seem like DW_AT_abstract_origin is intended to be limited to
"Inline instances of inline subprograms" and "Out-of-line instances of
inline subprograms" according to the DWARF5 spec, but it's unclear if
this is a rule or suggestion...
This hasn't been an issue for existing scripts, but is needed for some
ongoing stack.py rework. Otherwise we don't find "out-of-line instances
of inline subprograms" (optimized functions?) correctly.
Long story short: DW_TAG_lexical_blocks are annoying.
In order to search the full tree of children of a given dwarf entry, we
need a recursive function somewhere. We might as well make this function
a part of the DwarfEntry class so we can share it with other scripts.
Note this is roughly the same as collect_dwarf_info, but limited to
the children of a given dwarf entry.
This is useful for ongoing stack.py rework.
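The shape of the thing, sketched (this DwarfEntry is a stand-in for the real class):

```python
# A recursive, depth-first walk over all nested children, so scripts
# can find e.g. DW_TAG_variables buried in nested DW_TAG_lexical_blocks:
class DwarfEntry:
    def __init__(self, tag, children=None):
        self.tag = tag
        self.children = children if children is not None else []

    def walk(self):
        for child in self.children:
            yield child
            yield from child.walk()

func = DwarfEntry('DW_TAG_subprogram', [
    DwarfEntry('DW_TAG_lexical_block', [
        DwarfEntry('DW_TAG_variable')])])
print([e.tag for e in func.walk()])
# -> ['DW_TAG_lexical_block', 'DW_TAG_variable']
```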
We have this info, might as well expose it for scripts to use.
Unfortunately this extra info did make tuple unpacking a bit of a mess,
especially in scripts that don't use this extra info, so I've added a
small Sym class similar to DwarfEntry in collect_dwarf_info.
This is useful for some ongoing stack.py rework.
Note this only affects the top-level entries. Dwarf-info contains a
hierarchical structure, but for some scripts we just don't care. Finding
DW_TAG_variables in nested DW_TAG_lexical_blocks for example.
This is useful for ongoing stack.py rework.
After tinkering around with the scripts for a bit, I've started to
realize difflib is kinda... really slow...
I don't think this is strictly difflib's fault. It's a pure python
library (proof of concept?), may be prioritizing quality over speed, and
I may be throwing too much data at it.
difflib does have quick_ratio() and real_quick_ratio() for faster
comparisons, but while looking into these for correctness, I realized
there's a simpler heuristic we can use since GCC's optimized names seem
strictly additive: Choose the name that matches with the smallest prefix
and suffix.
So comparing, say, lfsr_rbyd_lookup to __lfsr_rbyd_lookup.constprop.0:
  lfsr_rbyd_lookup
__lfsr_rbyd_lookup.constprop.0
^^                ^----------^
prefix=2          suffix=12

key: (matches, 2, 12)
Note we prioritize the prefix, since it seems GCC's optimized names are
strictly suffixes. We also now fail to match if the dwarf name is not a
substring, instead of just finding the most similar-looking symbol.
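In code, the heuristic boils down to something like this (best_symbol is illustrative; the real mapping also keeps a dict for the fast path on exact matches):

```python
# The dwarf name must be a substring of the symbol; among matches,
# prefer the smallest prefix, then the smallest suffix. str.find
# already returns the first (smallest-prefix) occurrence in a symbol.
def best_symbol(name, syms):
    best = None
    for sym in syms:
        i = sym.find(name)
        if i == -1:
            continue  # not a substring -> no match at all
        key = (i, len(sym) - i - len(name))  # (prefix, suffix) lengths
        if best is None or key < best[0]:
            best = (key, sym)
    return best[1] if best else None

print(best_symbol('lfsr_rbyd_lookup',
    ['__lfsr_rbyd_lookup.constprop.0',   # key (2, 12)
     'lfsr_rbyd_lookup.isra.0']))        # key (0, 7), wins on prefix
# -> lfsr_rbyd_lookup.isra.0
```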
This results in both faster and more robust symbol->dwarf mapping:
before: time code.py -Y: 0.393s
after: time code.py -Y: 0.152s
(this is WITH the fast dict lookup on exact matches!)
This also drops difflib from the scripts. So one less dependency to
worry about.
There is an argument for preferring nm for code size measurements due to
portability. But I'm not sure this really holds up these days with
objdump being so prevalent.
We already depend on objdump for ctx/structs/perf and other dwarf info,
so using objdump -t to get symbol information means one less tool to
depend on/pass around when cross-compiling.
As a minor benefit this also gives us more control over which sections
to include, instead of relying on nm's predefined t/r/d/b section types.
---
Note code.py/data.py did _not_ require objdump before this. They did use
objdump to map symbols to source files, but would just guess if
objdump wasn't available.
Without this, naming a column i/children/notes in csv.py could cause
things to break. Unlikely for children/notes, but very likely for i,
especially when benchmarking.
Unfortunately namedtuple makes this tricky. I _want_ to just rename
these to _i/_children/_notes and call the problem solved, but namedtuple
reserves all underscore-prefixed fields for its own use.
As a workaround, the table renderer now looks for _i/_children/_notes at
the _class_ level, as an optional name of which namedtuple field to use.
This way Result types can stay lightweight namedtuples while including
extra table rendering info without risk of conflicts.
This also makes the HotResult type a bit more funky, but that's not a
big deal.
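A minimal sketch of the indirection (the field and class names here are made up for illustration):

```python
import collections

# namedtuple rejects underscore-prefixed *fields*, but underscore
# *class* attributes are fine, so the renderer checks the class for
# _i/_children/_notes naming which ordinary field to use:
class Result(collections.namedtuple('Result', ['function', 'size', 'i_'])):
    _i = 'i_'  # which field holds the table-ordering index

def result_index(result):
    field = getattr(type(result), '_i', None)
    return getattr(result, field) if field is not None else None

r = Result('lfsr_file_write', 596, 3)
print(result_index(r))  # -> 3
```

This way a csv.py column named "i" stays an ordinary field, and only classes that opt in via _i get special treatment.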
This extends the recursive part of the table renderer to sort children
by the optional "i" field, if available.
Note this only affects children entries. The top-level entries are
strictly ordered by the relevant "by" fields. I just haven't seen a use
case for this yet, and not sorting "i" at the top-level reduces the
number of things that can go wrong for scripts without children.
---
This also rewrites -t/--hot to take advantage of children ordering by
injecting a totally-not-hacky HotResult subclass.
Now -t/--hot should be strictly ordered by the call depth! Though note
entries that share "by" fields are still merged...
This also gives us a way to introduce the "cycle detected" note and
respect -z/--depth, so overall a big improvement for -t/--hot.
We don't really need padding for the notes on the last column of tables,
which is where row-level notes end up.
This may seem minor, but not padding here avoids quite a bit of
unnecessary line wrapping in small terminals.
- Adopted higher-level collect data structures:
- high-level DwarfEntry/DwarfInfo class
- high-level SymInfo class
- high-level LineInfo class
Note these had to be moved out of function scope due to pickling
issues in perf.py/perfbd.py. These were only function-local to
minimize scope leak so this fortunately was an easy change.
- Adopted better list-default patterns in Result types:
      def __new__(..., children=None):
          return Result(..., children if children is not None else [])
  A classic python footgun.
- Adopted notes rendering, though this is only used by ctx.py at the
moment.
- Reverted to sorting children entries, for now.
Unfortunately there's no easy way to sort the result entries in
perf.py/perfbd.py before folding. Folding is going to make a mess
of more complicated children anyways, so another solution is
needed...
And some other shared miscellany.
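That "classic python footgun" in the list-default pattern above is worth spelling out, since it bites silently:

```python
# A mutable default is created once at definition time and shared
# across every call:
def bad_result(children=[]):
    children.append('child')
    return children

assert bad_result() == ['child']
assert bad_result() == ['child', 'child']  # shared state!

# The None-default pattern gives each call a fresh list:
def good_result(children=None):
    children = children if children is not None else []
    children.append('child')
    return children

assert good_result() == ['child']
assert good_result() == ['child']
```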
- Dropped --internal flag, structs.py includes all structs now.
No reason to limit structs.py to public structs if ctx.py exists.
- Added struct/union/enum prefixes to results (enums were missing in
ctx.py).
- Only sort children layers if explicitly requested. This should
preserve field order, which is nice.
- Adopt more advanced FileInfo/DwarfInfo classes.
- Adopted table renderer changes (notes rendering).
- Sorting struct fields by name? Eh, that's not a big deal.
- Sorting function params by name? Okay, that's really annoying.
This compromises by sorting only the top-level results by name, and
leaving recursive results in the order returned by collect by default.
Recursive results should usually have a well-defined order.
This should be extendable to the other result scripts as well.
This is a bit more readable and better matches the names used in the C
code (lfs_config vs struct lfs_config).
The downside is we now have fields with spaces in them, which may cause
problems for naive parsers.
ctx.py reports functions' "contexts", i.e. the sum of the size of all
function parameters and indirect structs, recursively dereferencing
pointers when possible.
The idea is this should give us a rough lower bound on the amount of
state that needs to be allocated to call the function:
$ ./scripts/ctx.py lfs.o lfs_util.o -Dfunction=lfsr_file_write -z3 -s
function size
lfsr_file_write 596
|-> lfs 436
| '-> lfs_t 432
|-> file 152
| '-> lfsr_file_t 148
|-> buffer 4
'-> size 4
TOTAL 596
---
The long story short is that structs.py, while very useful for
introspection, has not been useful as a general metric.
Sure it can give you a rough idea of the impact of small changes to
struct sizes, but it's not uncommon for larger changes to add/remove
structs that have no real impact on the user-facing RAM usage. There are
some structs we care about (lfs_t) and some we don't (lfsr_data_t).
Internal-only structs should already be measured by stack.py.
Which raises the question, how do we know which structs we care about?
The idea here is to look at function parameters and chase pointers. This
gives a complicated, but I think reasonable, heuristic. Fortunately
dwarf-info gives us all the necessary info.
Some notes:
- This does _not_ include buffer sizes. Buffer sizes are user
configurable, so it's sort of up to the user to account for these.
- We include structs once if we find a cycle (lfsr_file_t.o for
example). Can't really do any better and this at least provides a
lower bound for complex data-structures.
- We sum all params/fields, but find the max of all functions. Note this
prevents common types (lfs_t for example) from being counted more than
once.
- We only include global functions (based on the symbol flag). In theory
the context of all internal functions should end up in stack.py.
This can be overridden with --everything.
Note this doesn't replace structs.py. structs.py is still useful for
looking at all structs in the system. ctx.py should just be more useful
for comparing builds at a high level.