Commit Graph

125 Commits

Author SHA1 Message Date
Christopher Haster
5b0a6d4747 Reworked scripts to move field details into classes
These scripts can't easily share the common logic, but separating
field details from the print/merge/csv logic should make the common
part of these scripts much easier to create/modify going forward.

This also tweaked the behavior of summary.py slightly.
2022-06-06 01:35:16 -05:00
Christopher Haster
4a7e94fb15 Reimplemented coverage.py, using only gcov and with line+branch coverage
This also adds coverage support to the new test framework, which due to
reduction in scope, no longer needs aggregation and can be much
simpler. Really all we need to do is pass --coverage to GCC, which
builds its .gcda files during testing in a multi-process-safe manner.

The addition of branch coverage leverages information that was available
in both lcov and gcov.

This was made easier with the addition of the --json-format option to gcov
in GCC 9.0, however the lax backwards compatibility for gcov's
intermediary options is a bit concerning. Hopefully --json-format
sticks around for a while.
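
As a rough illustration of consuming gcov's JSON output, the sketch below tallies line and branch coverage from an already-decoded document. The field names (`files`, `lines`, `count`, `branches`) follow gcov's JSON intermediate format as understood here and should be checked against the installed GCC version. The `summarize` helper and the sample data are hypothetical and are not taken from coverage.py.

```python
import json

def summarize(gcov_json):
    """Return {file: (lines_hit, lines_total, branches_hit, branches_total)}."""
    results = {}
    for f in gcov_json.get("files", []):
        lines_hit = lines_total = branches_hit = branches_total = 0
        for line in f.get("lines", []):
            lines_total += 1
            if line.get("count", 0) > 0:
                lines_hit += 1
            # each line may carry zero or more branch records
            for branch in line.get("branches", []):
                branches_total += 1
                if branch.get("count", 0) > 0:
                    branches_hit += 1
        results[f["file"]] = (lines_hit, lines_total,
                              branches_hit, branches_total)
    return results

# Hypothetical sample shaped like gcov's JSON output (assumption)
sample = json.loads("""
{"files": [{"file": "lfs.c", "lines": [
    {"line_number": 1, "count": 3, "branches": [{"count": 3}, {"count": 0}]},
    {"line_number": 2, "count": 0, "branches": []}
]}]}
""")
coverage = summarize(sample)
```

In practice the input would come from running gcov with --json-format on the .gcda files and decoding the result, rather than from an inline string.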
2022-06-06 01:35:14 -05:00
Christopher Haster
2b11f2b426 Tweaked generation of .cgi files, error code for recursion in stack.py
GCC is a bit annoying here, it can't generate .cgi files without
generating the related .o files, though I suppose the alternative risks
duplicating a large amount of compilation work (littlefs is really
a small project).

Previously we rebuilt the .o files anytime we needed .cgi files
(callgraph info used for stack.py). This changes it so we always
build .cgi files as a side-effect of compilation. This is similar
to the .d file generation, though may be annoying if the system
cc doesn't support -fcallgraph-info.
2022-06-06 01:35:12 -05:00
Christopher Haster
1616115662 Fix test.py hang on ctrl-C, cleanup TODOs
A small mistake in test.py's control flow meant the failing test job
would successfully kill all other test jobs, but then humorously start
up a new process to continue testing.
2022-06-06 01:35:09 -05:00
Christopher Haster
4a42326797 Moved test suites into custom linker section
This simplifies the interaction between code generation and the
test-runner.

In theory it also reduces compilation dependencies, but internal tests
make this difficult.
2022-06-06 01:35:07 -05:00
Christopher Haster
0781f50edb Ported tests to new framework
This mostly required names for each test case, declarations of
previously-implicit variables since the new test framework is more
conservative with what it declares (the small extra effort to add
declarations is well worth the simplicity and improved readability),
and tweaks to work with not-really-constant defines.

Also renamed test_ -> test, replacing the old ./scripts/test.py,
unfortunately git seems to have had a hard time with this.
2022-06-06 01:35:03 -05:00
Christopher Haster
d679fbb389 In ./scripts/test.py, readded external commands, tweaked subprocesses
- Added --exec for wrapping the test-runner with external commands, such as
  Qemu or Valgrind.

- Added --valgrind, which just aliases --exec=valgrind with a few extra
  flags useful during testing.

- Dropped the "valgrind" type for tests. These aren't separate tests
  that run in the test-runner, and I don't see a need for disabling
  Valgrind for any tests. This can be added back later if needed.

- Readded support for dropping directly into gdb after a test failure,
  either at the assert failure, entry point of test case, or entry point
  of the test runner with --gdb, --gdb-case, or --gdb-main.

- Added --isolate for running each test permutation in its own process,
  this is required for associating Valgrind errors with the right test
  case.

- Fixed an issue where explicit test identifiers conflicted with
  per-stage test identifiers generated as a part of --by-suite and
  --by-case.
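
A minimal sketch of how wrapping the test-runner with an external command could work is shown below. The `runner_command` helper, its parameters, and the specific Valgrind flags are assumptions for illustration; the actual test.py may build its command line differently.

```python
import shlex

def runner_command(runner, runner_args, exec_prefix=None, valgrind=False):
    """Build the argv for one test-runner invocation."""
    prefix = []
    if valgrind:
        # hypothetical flag choice; the real script may use other flags
        prefix = ["valgrind", "--leak-check=full", "--error-exitcode=4"]
    elif exec_prefix:
        # split the user-provided wrapper the way a shell would
        prefix = shlex.split(exec_prefix)
    return prefix + [runner] + list(runner_args)

# hypothetical runner path and test identifier
plain = runner_command("./runner", ["test_dirs"])
wrapped = runner_command("./runner", ["test_dirs"],
                         exec_prefix="qemu-arm -L /usr/arm-linux-gnueabi")
checked = runner_command("./runner", ["test_dirs"], valgrind=True)
```

Treating --valgrind as a prefix alias keeps a single code path for launching the runner, which matches the description of it as an alias for --exec.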
2022-06-06 01:35:03 -05:00
Christopher Haster
5a572ced3c Reworked how test defines are implemented to support recursion
Previously test defines were implemented using layers of index-mapped
uintmax_t arrays. This worked well for lookup, but limited defines to
constants computed at compile-time. Since test defines themselves are
actually calculated at _run-time_ (yeah, they have deviated quite
a bit from the original, compile-time evaluated defines, which makes
the name make less sense), this means defines can't depend on other
defines. Which was limiting since a lot of test defines relied on
defines generated from the geometry being tested.

This new implementation uses callbacks for the per-case defines. This
means they can easily contain full C statements, which can depend on
other test defines. This does mean you can create infinitely-recursive
defines, but the test-runner will just break at run-time so don't do that.
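
The idea of callback-based defines can be sketched in Python, with the caveat that the real implementation is generated C. The `Defines` class and the define names below are hypothetical. Unlike the test-runner described above, this sketch detects self-referential defines and raises an error instead of simply breaking.

```python
class Defines:
    def __init__(self, callbacks):
        self.callbacks = callbacks   # name -> function(defines) -> int
        self.evaluating = set()      # names currently being evaluated

    def __getitem__(self, name):
        if name in self.evaluating:
            raise RecursionError("define %r depends on itself" % name)
        self.evaluating.add(name)
        try:
            # a callback may look up other defines through `self`
            return self.callbacks[name](self)
        finally:
            self.evaluating.discard(name)

defines = Defines({
    "BLOCK_SIZE":  lambda d: 512,
    "BLOCK_COUNT": lambda d: 1024,
    # depends on two other defines, evaluated at lookup time
    "DISK_SIZE":   lambda d: d["BLOCK_SIZE"] * d["BLOCK_COUNT"],
    # infinitely-recursive define, here caught by the guard above
    "LOOP":        lambda d: d["LOOP"] + 1,
})
disk_size = defines["DISK_SIZE"]
```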

One concern is that there might be a performance hit for evaluating all
defines through callbacks, but if there is it is well below the noise
floor:

- constants: 43.55s
- callbacks: 42.05s
2022-06-06 01:35:03 -05:00
Christopher Haster
be0e6ad5eb More progress toward test-runner feature parity
- Added internal tests, which can run tests inside other source files,
  allowing access to "private" functions and data

  Note this required some special handling: test configurations are
  defined and later undefined so they do not pollute the namespace of the
  source file, since it can end up with test cases from different
  suites/configuration namespaces.

- Removed unnecessary/unused permutation argument to generated test
  functions.

- Some cleanup to progress output of test.py.
2022-06-06 01:35:01 -05:00
Christopher Haster
4962829017 Continued progress toward feature parity with new test-runner
- Expanded test defines to allow for lists of configurations

  These are useful for changing multi-dimensional test configurations
  without leading to extremely large and less useful configuration
  combinations.

- Made warnings more visible during test parsing

- Add lfs_testbd.h to implicit test includes

- Fixed issue with not closing files in ./scripts/explode_asserts.py

- Add `make test_runner` and `make test_list` build rules for
  convenience
2022-06-06 01:35:00 -05:00
Christopher Haster
5ee4b052ae Misc test-runner improvements
- Added --disk/--trace/--output options for information-heavy debugging

- Renamed --skip/--count/--every to --start/--stop/--step.

  This matches common terms for ranges, and frees --skip for being used
  to skip test cases in the future.

- Better handling of SIGTERM, now all tests are killed, reported as
  failures, and testing is halted regardless of -k.

  This is a compromise, you throw away the rest of the tests, which
  is normally what -k is for, but prevents annoying-to-terminate
  processes when debugging, which is a very interactive process.
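
The --start/--stop/--step naming maps directly onto slice semantics. A minimal sketch, assuming the options behave like a Python slice over the list of test permutations; the `select` and `partition` helpers are hypothetical:

```python
def select(perms, start=0, stop=None, step=1):
    """Pick the permutations a single runner process should execute."""
    return perms[start:stop:step]

def partition(perms, jobs):
    """Split work across `jobs` processes: job i takes every jobs-th item."""
    return [select(perms, start=i, step=jobs) for i in range(jobs)]

# ten hypothetical permutations spread over three jobs
parts = partition(list(range(10)), 3)
```

Interleaving with a step, rather than handing each job one contiguous chunk, tends to spread slow test cases across jobs more evenly.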
2022-06-06 01:35:00 -05:00
Christopher Haster
5812d2b5cf Reworked how multi-layered defines work in the test-runner
In the test-runner, defines are parameterized constants (limited
to integers) that are generated from the test suite tomls resulting
in many permutations of each test.

In order to make this efficient, these defines are implemented as
multi-layered lookup tables, using per-layer/per-scope indirect
mappings. This lets the test-runner and test suites define their
own defines with compile-time indexes independently. It also makes
building of the lookup tables very efficient, since they can be
incrementally populated as we expand the test permutations.

The four current define layers and when we need to build them:

layer                           defines         predefine_map   define_map
user-provided overrides         per-run         per-run         per-suite
per-permutation defines         per-perm        per-case        per-perm
per-geometry defines            per-perm        compile-time    -
default defines                 compile-time    compile-time    -
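
The layered lookup can be sketched in Python as follows. Each layer stores only the defines it knows about, plus an indirect map from a global define index to its local index. The `Layer`, `build_map`, and `lookup` names and the example values are hypothetical, and the real tables live in C.

```python
class Layer:
    def __init__(self, names, values):
        self.names = names     # local define names
        self.values = values   # local values, same order as names

def build_map(layer, global_names):
    """Indirect map: global index -> local index, or None if undefined."""
    local = {name: i for i, name in enumerate(layer.names)}
    return [local.get(name) for name in global_names]

def lookup(layers, maps, index):
    """Return the value from the first (highest-priority) layer defining it."""
    for layer, m in zip(layers, maps):
        if m[index] is not None:
            return layer.values[m[index]]
    raise KeyError(index)

global_names = ["BLOCK_SIZE", "BLOCK_COUNT", "N"]
layers = [
    Layer(["N"], [5]),                                # user-provided overrides
    Layer(["N", "BLOCK_COUNT"], [1, 64]),             # per-permutation defines
    Layer(["BLOCK_SIZE", "BLOCK_COUNT"], [512, 1024]) # default defines
]
maps = [build_map(layer, global_names) for layer in layers]
values = [lookup(layers, maps, i) for i in range(len(global_names))]
```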
2022-06-06 01:35:00 -05:00
Christopher Haster
64436933e2 Putting together rewritten test.py script
2022-06-06 01:34:57 -05:00
Christopher Haster
92a600a980 Added trace and persist flags to test_runner
2022-04-19 02:12:24 -05:00
Christopher Haster
9281ce26a7 More test_runner progress
- Added filtering based on suite, case, perm, type, geometry
- Added --skip, --count, and --every (will be used for parallelism)
- Implemented --list-defines
- Better helptext for flags with arguments
- Other minor tweaks
2022-04-18 15:15:57 -05:00
Christopher Haster
4b0aa6272e Some more minor improvements to the test_runner
- Indirect index map instead of bitmap+sparse array
- test_define_t and test_type_t
- Added back conditional filtering
- Added suite-level defines and filtering
2022-04-18 00:09:01 -05:00
Christopher Haster
d683f1c76c Reintroduced test-defines into the new test_runner
This moves defines entirely into the runtime of the test_runner,
simplifying things and reducing the amount of generated code that needs
to be built, at the cost of limiting test-defines to uintmax_t types.

This is implemented using a set of index-based scopes (created by
test.py) that allow different layers to override defines from other
layers, accessible through the global `test_define` function.

layers:
1. command-line overrides
2. per-case defines
3. per-geometry defines
2022-04-17 21:45:47 -05:00
Christopher Haster
56a990336b Created new test_runner.c and test_.py
This is to try a different design for testing, the goals are to make the
test infrastructure a bit simpler, with clear stages for building and
running, and faster, by avoiding rebuilding lfs.c n-times.
2022-04-16 13:50:34 -05:00
Christopher Haster
c60c977c25 Merge pull request #658 from littlefs-project/no-recursion
Restructure littlefs to not use recursion, measure stack usage
2022-04-10 23:23:39 -05:00
Christopher Haster
554e4b1444 Fixed Popen deadlock issue in test.py
As noted in Python's subprocess library:

> This will deadlock when using stdout=PIPE and/or stderr=PIPE and the
> child process generates enough output to a pipe such that it blocks
> waiting for the OS pipe buffer to accept more data.

Curiously, this only became a problem when updating to Ubuntu 20.04
in CI (python3.6 -> python3.8).
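
For reference, the deadlock pattern and one common way around it look roughly like this. It is a generic sketch based on the subprocess documentation's recommendation to use communicate(), and is not necessarily the fix applied in test.py.

```python
import subprocess
import sys

# Child that writes more than a typical OS pipe buffer (often 64 KiB)
child = [sys.executable, "-c",
         "import sys; sys.stdout.write('x' * 200000)"]

proc = subprocess.Popen(child, stdout=subprocess.PIPE)
# Calling proc.wait() here could block forever: the child would be stuck
# writing to a full pipe that nobody reads. communicate() reads the pipe
# until EOF and only then waits for the process to exit.
stdout, _ = proc.communicate()
```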
2022-03-20 03:44:39 -05:00
Christopher Haster
fe8f3d4f18 Changed ./scripts/structs.py to organize by header file
Avoids redundant counting of structs shared in multiple .c files, which
is very common. This is different from the other scripts,
code.py/data.py/stack.py, but this difference makes sense as struct
declarations have a very different lifetime.
2022-03-20 03:41:37 -05:00
Christopher Haster
8475c8064d Limit ./scripts/structs.py to report structs in local .h files
This requires parsing an additional section of the dwarfinfo (--dwarf=rawlines)
to get the declaration file info.

---

Interpreting the results of ./scripts/structs.py reporting is a bit more
complicated than other scripts, structs aren't used in a consistent
manner so the cost of a large struct depends on the context in which it
is used.

But that being said, there really isn't much reason to report
internal-only structs. These structs really only exist for type-checking
in internal algorithms, and their cost will end up reflected in other RAM
measurements, either stack, heap, or other.
2022-03-20 03:39:23 -05:00
Christopher Haster
e4adefd1d7 Fixed spurious encoding error
Using errors=replace in python utf-8 decoding makes these scripts more
resilient to underlying errors, rather than just throwing an unhelpfully
generic decode error.
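
A small example of the difference, using a made-up byte string containing one invalid UTF-8 byte:

```python
raw = b"lfs_mount \xff ok"   # hypothetical tool output with a stray byte

# Strict decoding (the default) raises on the invalid byte
try:
    raw.decode("utf-8")
    strict_failed = False
except UnicodeDecodeError:
    strict_failed = True

# errors="replace" substitutes U+FFFD and keeps the rest of the text
text = raw.decode("utf-8", errors="replace")
```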
2022-03-20 03:28:26 -05:00
Christopher Haster
7ea2b515aa A few more tweaks to scripts
- Changed `make summary` to show a one line summary
- Added `make lfs.csv` rule, which is useful for finding more info with
  other scripts
- Fixed small issue in ./scripts/summary.py
- Added *.ci (callgraph) and *.csv (script output) to CI
2022-03-20 03:28:26 -05:00
Christopher Haster
55b3c538d5 Added ./scripts/summary.py
A full summary of static measurements (code size, stack usage, etc) can now
be found with:

    make summary

This is done through the combination of a new ./scripts/summary.py
script and the ability of existing scripts to merge into existing csv
files, allowing multiple results to be merged either in a pipeline, or
in parallel with a single ./scripts/summary.py call.

The ./scripts/summary.py script can also be used to quickly compare
different builds or configurations. This is a proper implementation
of a similar but hacky shell script that has already been very useful
for making optimization decisions:

    $ ./scripts/structs.py new.csv -d old.csv --summary
    name (2 added, 0 removed)               code             stack            structs
    TOTAL                                  28648 (-2.7%)      2448               1012

Also some other small tweaks to scripts:

- Removed state saving diff rules. This isn't the most useful way to
  handle comparing changes.

- Added short flags for --summary (-Y) and --files (-F), since these
  are quite often used.
2022-03-20 03:28:26 -05:00
Christopher Haster
eb8be9f351 Some improvements to size scripts
- Added -L/--depth argument to show dependencies for scripts/stack.py,
  this replaces calls.py
- Additional internal restructuring to avoid repeated code
- Removed incorrect diff percentage when there is no actual size
- Consistent percentage rendering in test.py
2022-03-20 03:28:21 -05:00
Christopher Haster
50ad2adc96 Added make *-diff rules, quick commands to compare sizes
This required a patch to the --diff flag for the scripts to ignore
a missing file. This enables the useful one liner for making comparisons
with potentially missing previous versions:

    ./scripts/code.py lfs.o -d lfs.o.code.csv -o lfs.o.code.csv

    function (0 added, 0 removed)            old     new    diff
    TOTAL                                  25476   25476      +0

One downside, these previous files are easy to delete as a part of make
clean, which limits their usefulness for comparing configuration
changes...
2022-03-11 14:40:54 -06:00
Christopher Haster
0a2ff3b6ff Added scripts/structs.py for getting sizes of structs
Note this does include internal structs, so this should probably
be limited to informative purposes.
2022-03-11 14:40:54 -06:00
Christopher Haster
d7582efec8 Changed script's CSV formats to allow for merging different measurements
- size  -> code_size
- size  -> data_size
- frame -> stack_frame
- limit -> stack_limit
- hits  -> coverage_hits
- count -> coverage_count
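
With distinct column names, results from different scripts can share one row per function. A minimal sketch of such a merge, assuming each CSV is keyed by `file` and `name` columns (an assumption; only the measurement column names above come from this change):

```python
import csv
import io

def merge(csv_texts):
    """Merge measurement columns from several CSVs, keyed by (file, name)."""
    merged = {}
    for text in csv_texts:
        for row in csv.DictReader(io.StringIO(text)):
            key = (row.pop("file"), row.pop("name"))
            # remaining columns are measurements; distinct names never clash
            merged.setdefault(key, {}).update(row)
    return merged

# hypothetical script outputs
code_csv = "file,name,code_size\nlfs.c,lfs_mount,412\n"
stack_csv = "file,name,stack_frame,stack_limit\nlfs.c,lfs_mount,96,480\n"
merged = merge([code_csv, stack_csv])
```

Had both scripts kept a generic `size` column, the second update would have silently overwritten the first.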
2022-03-11 14:40:54 -06:00
Christopher Haster
f4c7af76f8 Added scripts/stack.py for viewing stack usage
Note this detects loops (recursion), and renders this as infinity.
Currently littlefs does have a single recursive function and you can see
how this infects the full call graph. Eventually this should be removed.
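
The effect of recursion on the results can be seen in a small sketch of a worst-case stack computation over a call graph. The `stack_limit` helper, the function names, and the frame sizes are hypothetical; stack.py itself may compute this differently.

```python
import math

def stack_limit(frames, calls, name, active=None):
    """Worst-case stack usage from `name`; infinity if a loop is reachable."""
    active = active if active is not None else set()
    if name in active:
        # already on the current call path: recursion
        return math.inf
    active.add(name)
    deepest = max((stack_limit(frames, calls, callee, active)
                   for callee in calls.get(name, [])), default=0)
    active.discard(name)
    return frames[name] + deepest

frames = {"main": 16, "read": 32, "util": 8, "walk": 24}
calls = {"main": ["read", "walk"], "read": ["util"], "walk": ["walk"]}

limits = {name: stack_limit(frames, calls, name) for name in frames}
```

Only `walk` is recursive, but every caller of `walk` also ends up with an infinite limit, which is how a single recursive function spreads through the call graph.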
2022-03-11 14:40:54 -06:00
Christopher Haster
20c58dcbaa Added coverage-sort to scripts/coverage.py
scripts/coverage.py was missed originally because it's not run as often
as the others. Since it requires run-time info, it's usually only used
in CI.
2022-03-11 14:39:38 -06:00
Christopher Haster
f5286abe7a Added scripts/calls.py for viewing the callgraph directly
2022-03-11 14:39:36 -06:00
Christopher Haster
2cdabe810d Split out scripts/code.py into scripts/code.py and scripts/data.py
This is to avoid unexpected script behavior even though data.py should
always return 0 bytes for littlefs. Maybe a check for this should be
added to CI?
2022-03-11 14:39:36 -06:00
Christopher Haster
b045436c23 Added size-sort options to scripts/code.py
Now with -s/--sort and -S/--reverse-sort for sorting the functions by
size.

You may wonder why add reverse-sort, since its utility doesn't seem
worth the cost to implement (these are just helper scripts after all),
the reason is that reverse-sort is quite useful on the command-line,
where scrollback may be truncated, and you only care about the larger
entries.

Outside of the command-line, normal sort is preferred.

Fortunately the difference is just the sign in the sort key.
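
A minimal sketch of that sign flip, assuming normal sort lists the largest functions first and reverse sort lists them last (so they remain visible at the bottom of a truncated scrollback). The `sort_by_size` helper and the sizes are hypothetical.

```python
def sort_by_size(sizes, reverse_sort=False):
    """Sort (name, size) pairs; only the sign of the key differs."""
    sign = +1 if reverse_sort else -1
    # the name acts as a tie-breaker for equal sizes
    return sorted(sizes.items(), key=lambda kv: (sign * kv[1], kv[0]))

sizes = {"lfs_mount": 412, "lfs_format": 300, "lfs_dir_read": 500}
largest_first = sort_by_size(sizes)
largest_last = sort_by_size(sizes, reverse_sort=True)
```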

Note this conflicts with the short --summary flag, so that has been
removed.
2022-03-11 14:36:23 -06:00
mikee47
4977fa0c0e Fix spelling errors
2022-01-29 09:52:00 +00:00
YAMAMOTO Takashi
3bee4d9a19 scripts/test.py: Fix infinite busy loops on macOS
I confirmed that the same number of tests are run
with "make test" on:

    * Ubuntu with and without this change
    * macOS with this change

>   ====== results ======
>   tests passed 817/817 (100.00%)
>   tests failed 0/817 (0.00%)
2021-02-22 14:42:10 +09:00
Christopher Haster
bca64d76cf Merge branch 'devel' into ci-revamp
Needed to bring in new "error-asserts" configuration
2021-01-18 12:23:25 -06:00
Christopher Haster
21488d9e06 Fixed incorrect documentation in test.py
The argparse documented an outdated format, and was off by 1.

Found by sender6
2021-01-18 11:41:51 -06:00
Christopher Haster
104d65113d Reduced build sources to just the core littlefs
Currently this is just lfs.c and lfs_util.c. Previously this included
the block devices, but this meant all of the scripts needed to
explicitly deselect the block devices to avoid reporting build
size/coverage info on them.

Note that test.py still explicitly adds the block devices for compiling
tests, which is their main purpose. Humorously this means the block
devices will probably be compiled into most builds in this repo anyways.
2021-01-10 04:03:16 -06:00
Christopher Haster
9d6546071b Fixed a recompilation issue in CI, tweaked coverage.py a bit more
This was lost in the Travis -> GitHub transition, in serializing some of
the jobs, I missed that we need to clean between tests with different
geometry configurations. Otherwise we end up running outdated binaries,
which explains some of the weird test behavior we were seeing.

Also tweaked a few script things:
- Better subprocess error reporting (dump stderr on failure)
- Fixed a BUILDDIR rule issue in test.py
- Changed test-not-run status to None instead of undefined
2021-01-10 03:21:28 -06:00
Christopher Haster
b84fb6bcc5 Added BUILDDIR, a bit of script reworking
Now littlefs's Makefile can work with a custom build directory
for compilation output. Just set the BUILDDIR variable and the Makefile
will take care of the rest.

    make BUILDDIR=build size

This makes it very easy to compare builds with different compile-time
configurations or different cross-compilers.

This meant most of code.py's build isolation is no longer needed,
so revisited the scripts and cleaned/tweaked a number of things.

Also brought code.py in line with coverage.py, fixing some of the
inconsistencies that were created while developing these scripts.

One change to note was removing the inline measuring logic, I realized
this feature is unnecessary thanks to GCC's -fkeep-static-functions and
-fno-inline flags.
2021-01-10 03:21:21 -06:00
Christopher Haster
887f3660ed Switched to lcov for coverage collection, greatly simplified coverage.py
Since we already have fairly complicated scripts, I figured it wouldn't
be too hard to use the gcov tools and directly parse their output. Boy
was I wrong.

The gcov intermediary format is a bit of a mess. In version 5.4, a
text-based intermediary format is written to a single .gcov file per
executable. This changed sometime before version 7.5, when it started
writing separate .gcov files per .o files. And in version 9 this
intermediary format has been entirely replaced with an incompatible json
format!

Ironically, this means the internal-only .gcda/.gcno binary format has
actually been more stable than the intermediary format.

Also there's no way to avoid temporary .gcov files generated in the
project root, which risks messing with how test.py runs parallel tests.
Fortunately this looks like it will be fixed in gcov version 9.

---

Ended up switching to lcov, which was the right way to go. lcov handles
all of the gcov parsing, provides an easily parsable output, and even
provides a set of higher-level commands to manage coverage collection
from different runs.

Since this is all provided by lcov, was able to simplify coverage.py
quite a bit. Now it just parses the .info files output by lcov.
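
For context, lcov's .info tracefiles are line-oriented text, which is what makes them easy to parse. A minimal sketch that collects per-line hit counts from the `SF:`, `DA:`, and `end_of_record` records follows. The `parse_info` helper and the sample are illustrative and are not the code in coverage.py.

```python
def parse_info(text):
    """Return {source_file: {line_number: hits}} from lcov tracefile text."""
    results = {}
    current = None
    for line in text.splitlines():
        line = line.strip()
        if line.startswith("SF:"):
            # start of a record for one source file
            current = results.setdefault(line[3:], {})
        elif line.startswith("DA:") and current is not None:
            # DA:<line number>,<hit count>[,<checksum>]
            fields = line[3:].split(",")
            lineno, hits = int(fields[0]), int(fields[1])
            current[lineno] = current.get(lineno, 0) + hits
        elif line == "end_of_record":
            current = None
    return results

# hypothetical tracefile contents
sample = """\
TN:
SF:lfs.c
DA:10,4
DA:11,0
end_of_record
"""
hits = parse_info(sample)
```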
2021-01-10 02:21:33 -06:00
Christopher Haster
eeeceb9e30 Added coverage.py, and optional coverage info to test.py
Now coverage information can be collected if you provide the --coverage
to test.py. Internally this uses GCC's gcov instrumentation along with a
new script, coverage.py, to parse *.gcov files.

The main use for this is finding coverage info during CI runs. There's a
risk that the instrumentation may make it more difficult to debug, so I
decided to not make coverage collection enabled by default.
2021-01-10 02:12:45 -06:00
Christopher Haster
b2235e956d Added GitHub workflows to run tests
Mostly taken from .travis.yml, biggest changes were around how to get
the status updates to work.

We can't use a token on PRs the same way we could in Travis, so instead
we use a second workflow that checks every pull request for "status"
artifacts, and create the actual statuses in the "workflow_run" event,
where we have full access to repo secrets.
2021-01-09 23:42:49 -06:00
Christopher Haster
d804c2d3b7 Added scripts/code_size.py, for more in-depth code-size reporting
Inspired by Linux's Bloat-O-Meter, code_size.py wraps nm to provide
function-level code size, and supports detailed comparison between
different builds.
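
The general approach can be sketched as parsing `nm --size-sort` style output (hex size, symbol type, name) and diffing two builds. The helpers and sample text below are hypothetical, and the real script's parsing and filtering may differ.

```python
import re

def parse_nm(output, types="tT"):
    """Map function name -> size from 'size type name' lines (hex sizes)."""
    sizes = {}
    for line in output.splitlines():
        m = re.match(r"^([0-9a-fA-F]+)\s+(\S)\s+(\S+)$", line.strip())
        # keep only text (code) symbols by default
        if m and m.group(2) in types:
            sizes[m.group(3)] = int(m.group(1), 16)
    return sizes

def diff_sizes(old, new):
    """Return {name: new - old} for every function whose size changed."""
    return {name: new.get(name, 0) - old.get(name, 0)
            for name in sorted(set(old) | set(new))
            if new.get(name, 0) != old.get(name, 0)}

# hypothetical nm output from two builds
old_out = "00000010 T lfs_mount\n00000020 t lfs_alloc\n00000004 D lfs_data\n"
new_out = "00000018 T lfs_mount\n00000020 t lfs_alloc\n"
old_sizes = parse_nm(old_out)
new_sizes = parse_nm(new_out)
changes = diff_sizes(old_sizes, new_sizes)
```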

One difference is that code_size.py invokes littlefs's build system
similarly to test.py, creating a duplicate build in the "sizes"
directory. This makes it easy to monitor a cross-compiled build size
while simultaneously testing on the host machine.
2020-12-19 18:49:57 -06:00
Christopher Haster
0ea2871e24 Fixed typo in scripts/readtree.py
Not sure how this went unnoticed, I guess this is the first bug that
needed in-depth inspection after the last-minute argument cleanup
in the debug scripts.
2020-11-22 15:05:22 -06:00
Christopher Haster
5137e4b0ba Last minute tweaks to debug scripts
- Standardized littlefs debug statements to use hex prefixes and
  brackets for printing pairs.

- Removed the entry behavior for readtree and made -t the default.
  This is because 1. the CTZ skip-list parsing was broken, which is not
  surprising, and 2. the entry parsing was more complicated than useful.
  This functionality may be better implemented as a proper filesystem
  read script, complete with directory tree dumping.

- Changed test.py's --gdb argument to take [init, main, assert],
  this matches the names of the stages in C's startup.

- Added printing of tail to all mdir dumps in readtree/readmdir.

- Added a print for if any mdirs are corrupted in readtree.

- Added debug script side-effects to .gitignore.
2020-03-29 21:19:33 -05:00
Christopher Haster
a7dfae4526 Minor tweaks to debugging scripts, fixed explode_asserts.py off-by-1
- Changed readmdir.py to print the metadata pair and revision count,
  which is useful when debugging commit issues.
- Added truncated data view to readtree.py by default. This does mean
  readtree.py must read all files on the filesystem to show the
  truncated data, hopefully this does not end up being a problem.
- Made overall representation hopefully more readable, including moving
  superblock under the root dir, userattrs under files, fixing a gstate
  rendering issue.
- Added rendering of soft-tails as dotted-arrows, hopefully this isn't
  too noisy.
- Fixed explode_asserts.py off-by-1 in #line mapping caused by a strip
  call in the assert generation eating newlines. The script matches
  line numbers between the original+modified files by emitting assert
  statements that use the same number of lines. An off-by-1 here causes
  the entire file to map lines incorrectly, which can be very annoying.
2020-02-22 23:50:03 -06:00
Christopher Haster
d04b077506 Fixed minor things to get CI passing again
- Added caching to Travis install dirs, because otherwise
  pip3 install fails randomly
- Increased size of littlefs-fuse disk because test script has
  a larger footprint now
- Skip a couple of reentrant tests under byte-level writes because
  the tests just take too long and cause Travis to bail due to no
  output for 10m
- Fixed various Valgrind errors
  - Suppressed uninit checks for tests where LFS_BLOCK_ERASE_VALUE == -1.
    In this case rambd goes uninitialized, which is fine for rambd's
    purposes. Note I couldn't figure out how to limit this suppression
    to only the malloc in rambd, this doesn't seem possible with Valgrind.
  - Fixed memory leaks in exhaustion tests
  - Fixed off-by-1 string null-terminator issue in paths tests
- Fixed lfs_file_sync issue revealed by fixing memory leaks
  in exhaustion tests. Getting ENOSPC during a file write puts the file
  in a bad state where littlefs doesn't know how to write it out safely.
  In this case, lfs_file_sync and lfs_file_close return 0 without
  writing out state so that device-side resources can still be cleaned
  up. To recover from ENOSPC, the file needs to be reopened and the
  writes recreated. Not sure if there is a better way to handle this.
- Added some quality-of-life improvements to Valgrind testing
  - Fit Valgrind messages into truncated output when not in verbose mode
  - Turned on origin tracking
2020-02-18 18:05:03 -06:00
Christopher Haster
f4b17b379c Added test.py support for tmpfs-backed disks
RAM-backed testing is faster than file-backed testing. This is why
test.py uses rambd by default.

So why add support for tmpfs-backed disks if we can already run tests in
RAM? For reentrant testing.

Under reentrant testing we simulate power-loss by forcefully exiting the
test program at specific times. To make this power-loss meaningful, we need to
persist the disk across these power-losses. However, it's interesting to
note this persistence doesn't need to be actually backed by the
filesystem.

It may be possible to rearchitect the tests to simulate power-loss a
different way, by say, using coroutines or setjmp/longjmp to leave
behind ongoing filesystem operations without terminating the program
completely. But at this point, I think it's best to work with what we
have.

And simply putting the test disks into a tmpfs mount-point seems to
work just fine.

Note this does force serialization of the tests, which isn't required
otherwise. Currently they are only serialized due to limitations in
test.py. If a future change wants to parallelize the tests, it may need
to rework RAM-backed reentrant tests.
2020-02-12 10:48:54 -06:00