104 Commits

Author SHA1 Message Date
Christopher Haster
bce8f45a64 scripts: Tried to better document ansi color codes 2025-05-25 13:00:11 -05:00
Christopher Haster
6d9c077261 Reordered LFSR_TAG_NAMELIMIT/FILELIMIT
Not sure why, but this just seems more intuitive/correct. Maybe because
LFSR_TAG_NAME is always the first tag in a file's attr set:

  LFSR_TAG_NAMELIMIT    0x0039  v--- ---- --11 1--1
  LFSR_TAG_FILELIMIT    0x003a  v--- ---- --11 1-1-

Seeing as several parts of the codebase still use the previous order,
it seems reasonable to switch back to that.

No code changes.
2025-05-24 21:51:06 -05:00
Christopher Haster
651c3e1eb4 scripts: Renamed Attr -> CsvAttr
Mainly to avoid confusion with littlefs's attrs, uattrs, rattrs, etc.

This risked things getting _really_ confusing as the scripts evolve.
2025-05-15 18:48:46 -05:00
Christopher Haster
c04f36ead4 scripts: plot[mpl].py: Adopted -s/--sort and -S for legend sorting
Before this, the only option for ordering the legend was by specifying
explicit -L/--add-label labels. This works for the most part, but
doesn't cover the case where you don't know the parameterization of the
input data.

And we already have -s/-S flags in other csv scripts, so it makes sense
to adopt them in plot.py/plotmpl.py to allow sorting by one or more
explicit fields.

Note that -s/-S can be combined with explicit -L/--add-labels to order
datasets with the same sort field:

  $ ./scripts/plot.py bench.csv \
          -bBLOCK_SIZE \
          -xn \
          -ybench_readed \
          -ybench_proged \
          -ybench_erased \
          --legend \
          -sBLOCK_SIZE \
          -L'*,bench_readed=bs=%(BLOCK_SIZE)s' \
          -L'*,bench_proged=' \
          -L'*,bench_erased='

---

Unfortunately this conflicted with -s/--sleep, which is a common flag in
the ascii-art scripts. This was bound to conflict with -s/--sort
eventually, so I came up with some alternatives:

- -s/--sleep -> -~/--sleep
- -S/--coalesce -> -+/--coalesce

But I'll admit I'm not the happiest about these...
2025-05-15 15:51:49 -05:00
Christopher Haster
55ea13b994 scripts: Reverted del to resolve shadowed builtins
I don't know how I completely missed that this doesn't actually work!

Using del _does_ work in Python's repl, but it makes sense the repl may
differ from actual function execution in this case.

The problem is Python still thinks the relevant builtin is a local
variable after deletion, raising an UnboundLocalError instead of
performing a global lookup. In theory this would work if the variable
could be made global, but since global/nonlocal statements are lifted,
Python complains with "SyntaxError: name 'list' is parameter and
global".

And that's A-Ok! Intentionally shadowing language builtins already puts
this code deep into ugly hacks territory.
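
A minimal sketch of the two failure modes described above. This is not the
script code; the function names are made up for illustration, and the exact
error text may vary between Python versions:

```python
def shadowing_del(list):
    # `list` is a parameter, so the compiler treats every use of the name
    # in this function as a local lookup
    del list
    try:
        # this does NOT fall back to the builtin after the del
        return list((1, 2, 3))
    except UnboundLocalError as e:
        return e


def try_global():
    # declaring a parameter global is rejected at compile time, so we
    # compile from a string to be able to catch the error
    src = "def f(list):\n    global list\n"
    try:
        compile(src, "<sketch>", "exec")
    except SyntaxError as e:
        return e
    return None


if __name__ == "__main__":
    print(repr(shadowing_del([1])))
    print(repr(try_global()))
```

Both calls return the exception instead of raising it, which makes the
behavior easy to poke at outside of the repl.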
2025-05-15 14:10:42 -05:00
Christopher Haster
48c1a016a0 scripts: Fixed missing tuple unpack in glob-all CLI attrs
This was broken:

  $ ./scripts/plotmpl.py -L'*=bs=%(bs)s'

There may be a better way to organize this logic, but spamming if
statements works well enough.
2025-05-15 13:47:09 -05:00
Christopher Haster
4a50c5c9ce scripts: dbgbmap[d3].py: Adopted slightly different row prioritization
This still forces the block_rows_ <= height invariant, but also prevents
ceiling errors from introducing blank rows.

I guess the simplest solution is the best one, eh?
2025-04-30 02:30:31 -05:00
Christopher Haster
de7564e448 Added phase bits to cksum tags
This carves out two more bits in cksum tags to store the "phase" of the
rbyd block (maybe the name is too fancy, this is just the lowest 2 bits
of the block address):

  LFSR_TAG_CKSUM        0x300p  v-11 ---- ---- -pqq
                                                ^ ^
                                                | '-- phase bits
                                                '---- perturb bit

The intention here is to catch mrootanchors that are "out-of-phase",
i.e. they've been shifted by a small number of blocks.

This can happen if we find the wrong mrootanchor (after, say, a magic
scan), and risks filesystem corruption:

                formatted
  .-----------------'-----------------.
                          mounted
           .-----------------'-----------------.
  .--------+--------+--------+--------+ ...
  |(erased)| mroot  |
  |        | anchor |                   ...
  |        |        |
  '--------+--------+--------+--------+ ...

Including the lower 2 bits of the block address in cksum tags avoids
this, for up to a 3 block shift (the maximum number of redund
mrootanchors).
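
As a rough sketch of the idea (in Python for illustration, not the lfs.c
implementation; cksum_tag and in_phase are hypothetical names, and the bit
positions are read off the encoding diagram above):

```python
LFSR_TAG_CKSUM = 0x3000  # v-11 ---- ---- -pqq


def cksum_tag(block, perturb=False):
    # bit 2 is the perturb bit, bits 0-1 are the phase bits, which are
    # just the lowest 2 bits of the block address
    return LFSR_TAG_CKSUM | (int(perturb) << 2) | (block & 0x3)


def in_phase(tag, block):
    # compare the phase bits in the tag against the block we actually
    # read the tag from
    return (tag & 0x3) == (block & 0x3)


if __name__ == "__main__":
    tag = cksum_tag(1)
    for shift in range(5):
        print(shift, in_phase(tag, 1 + shift))
```

Shifts of 1-3 blocks are caught. A shift of exactly 4 blocks wraps around and
goes unnoticed, which is fine if at most 3-block shifts are possible.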

---

Note that cksum tags really are the only place we could put these bits.
Anywhere else and they would interfere with the canonical cksum, which
would break error correction. By definition these need to be different
per block.

We include these phase bits in every cksum tag (because it's easier),
but these don't really say much about mdirs that are not the
mrootanchor. Non-anchor mdirs can have arbitrary block addresses,
therefore arbitrary phase bits.

You _might_ be able to do something interesting if you sort the rbyd
addresses and use the index as the phase bits, but that would add quite
a bit of code for questionable benefit...

You could argue this adds noise to our cksums, but:

1. 2 bits seems like a really small amount of noise
2. our cksums are just crc32cs
3. the phase bits humorously never change when you rewrite a block

---

As with any feature this adds code, but only a small amount. I think
it's worth the extra protection:

           code          stack          ctx
  before: 35792           2368          636
  after:  35824 (+0.1%)   2368 (+0.0%)  636 (+0.0%)

Also added test_mount_incompat_out_of_phase to test this.

The dbg scripts _don't_ error (block mismatch seems likely when
debugging), but dbgrbyd.py at least adds phase mismatch notes in
-l/--log mode.
2025-04-30 00:57:17 -05:00
Christopher Haster
f2e6b60f36 Reworked grm encoding a bit
This drops the leading count/mode byte, and instead uses mid=0 to
terminate grms. This shaves off 1 byte from grmdeltas.

Previously, we needed the count/mode byte for a couple reasons:

- We needed to know the number of grm entries somehow, and there wasn't
  always an obvious sentinel value. mid=-1, for example, is
  unrepresentable with our unsigned leb128 encoding.

  But now that development has settled, we can use mid=0.0 to figure out
  the end-of-queue. mid=0.0 should always map to the root bookmark,
  which doesn't make sense to delete, so it makes for a reasonable null
  terminator here.

- It provided a route for future grm extensions, which could use the >2
  count/mode encodings.

  But I think we can use additional grm tag encodings for this.

  There's only one gdelta tag so far, but the current plan for future
  gdelta tags is to carve out the bottom 2 bits for redund like we do
  with the struct tags:

    LFSR_TAG_GDELTA        0x01tt  v--- ---1 -ttt ttrr
    LFSR_TAG_GRMDELTA      0x0100  v--- ---1 ---- ----
    LFSR_TAG_GBMAPDELTA    0x0104  v--- ---1 ---- -1rr
    LFSR_TAG_GDDTREEDELTA  0x0108  v--- ---1 ---- 1-rr
    LFSR_TAG_GPTREEDELTA   0x010c  v--- ---1 ---- 11rr
    ...

  Decoding is a bit more complicated for gstate, since we will need to
  xor those bits if mutable, but this avoids needing a full byte just
  for redund in every auxiliary tree.

  Long story short, we can leverage the lower 2 bits of the grm tag for
  future extensions using the same mechanism.
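
A hedged sketch of what decoding a mid=0-terminated grm could look like. This
is an illustration only: decode_grm is a hypothetical helper, mids are treated
as plain unsigned leb128 integers, and treating end-of-data as a second
terminator is an assumption, not something this commit states:

```python
def fromleb128(data, j=0):
    # decode one unsigned leb128 starting at index j, returns (word, size)
    word = 0
    d = 0
    while j + d < len(data):
        b = data[j + d]
        word |= (b & 0x7f) << (7 * d)
        d += 1
        if not b & 0x80:
            break
    return word, d


def decode_grm(data):
    # collect mids until we hit mid=0 (the null terminator) or run out
    # of data (assumed to also mean end-of-queue)
    mids = []
    j = 0
    while j < len(data):
        mid, d = fromleb128(data, j)
        j += d
        if mid == 0:
            break
        mids.append(mid)
    return mids


if __name__ == "__main__":
    print(decode_grm(bytes([0x05, 0x81, 0x01, 0x00, 0x00])))
```

Note an all-zero buffer decodes to an empty queue without needing a separate
count byte.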

This may seem like a lot of effort for only a handful of bytes, but keep
in mind each gdelta lives in more-or-less every mdir in the filesystem.

Also saves a bit of code/ctx:

           code          stack          ctx
  before: 35772           2368          640
  after:  35768 (-0.0%)   2368 (+0.0%)  636 (-0.6%)
2025-04-30 00:53:33 -05:00
Christopher Haster
dc2d58d28e scripts: dbgbmap[d3].py: Prioritize rows at low resolution
This prevents some pretty unintuitive behavior with dbgbmap.py -H2 (the
default) in the terminal.

Consider before:

  bd 4096x256, 7.8% mdir, 0.4% btree, 0.0% data
  mm--------b-----mm--mm--mm--mmmmmmm--mm--mmmm-----------------------

Vs after:

  bd 4096x256, 7.8% mdir, 0.4% btree, 0.0% data
  m-----------------------------------b-mmmmmmmm----------------------

Compared to the original bmap (-H5):

  bd 4096x256, 7.8% mdir, 0.4% btree, 0.0% data
  mm------------------------------------------------------------------
  --------------------------------------------------------------------
  ----------b-----mm--mm--mm--mmmmmmm--mm--mmmm-----------------------
  --------------------------------------------------------------------

What's happening is dbgbmap.py is prioritizing aspect ratio over pixel
boundaries, so it's happy drawing a 4-row bmap to a 1-row Canvas. But of
course we can't see subpixels, so the result is quite confusing.

Prioritizing rows while tiling avoids this.
2025-04-30 00:44:26 -05:00
Christopher Haster
1f4d7b3b7e scripts: dbgmtree.py: Dropped Mtree.lookupnext
I was toying with making this look more like the mtree API in lfs.c (so
no lookupleaf/namelookupleaf, only lookup/namelookup), but dropped the
idea:

- It would be tedious

- The Mtree class's lookupleaf/namelookupleaf are also helpful for
  returning inner btree nodes when printing debug info

- Not embedding mids in the Mdir class would complicate things

It's ok for these classes to not match littlefs's internal API
_exactly_. The goal is easy access for debug info, not to port the
filesystem to Python.

At least dropped Mtree.lookupnext, because that function really makes no
sense.
2025-04-30 00:44:16 -05:00
Christopher Haster
677c078b50 Added LFSR_TAG_BNAME/MNAME, stop btree lookups at first tag
Now that we don't have to worry about name tag conflicts as much, we
can add name tags for things that aren't files.

This adds LFSR_TAG_BNAME for branch names, and LFSR_TAG_MNAME for mtree
names. Note that the upper 4 bits of the subtype match LFSR_TAG_BRANCH
and LFSR_TAG_MDIR respectively:

  LFSR_TAG_BNAME        0x0200  v--- --1- ---- ----
  LFSR_TAG_MNAME        0x0220  v--- --1- --1- ----

  LFSR_TAG_BRANCH       0x030r  v--- --11 ---- --rr
  LFSR_TAG_MDIR         0x0324  v--- --11 --1- -1rr

The encoding is somewhat arbitrary, but I figured reserving ~31 types
for files is probably going to be plenty for littlefs. POSIX seems to
do just fine with only ~7 all these years, and I think custom attributes
will be more enticing for "niche" file types (symlinks, compressed
files, etc), given the easy backwards compatibility.

---

In addition to the debugging benefits, the new name tags let us stop
btree lookups on the first non-bname/branch tag. Previously we always
had to fetch the first struct tag as well to check if it was a branch.

In theory this saves one rbyd lookup, but in practice it's a bit muddy.

The problem is that there's two ways to use named btrees:

1. As buckets: mtree -> mdir -> mid
2. As a table: ddtree -> ddid

The only named btree we _currently_ have is the mtree. And the mtree
operates in bucket mode, with each mdir acting more-or-less as an
extension to the btree. So we end up needing to do the second tag lookup
anyways, and all we've done is complicate the code.

But we will _eventually_ need the table mode for the ddtree, where we
care if the ddname is an exact match.

And returning the first tag is arguably the more "correct" internal API,
vs arbitrarily the first struct tag.

But then again this change is pretty pricey...

           code          stack          ctx
  before: 35732           2440          640
  after:  35888 (+0.4%)   2480 (+1.6%)  640 (+0.0%)

---

It's worth noting the new BNAME/MNAME tags don't _require_ the btree
lookup changes (which is why we can get away with not touching the dbg
scripts). The previous algorithm of always checking for branch tags
still works.

Maybe there's an argument for conditionally using the previous API when
compiling without the ddtree, but that sounds horrendously messy...
2025-04-30 00:25:30 -05:00
Christopher Haster
5eb194c215 scripts: dbgbmap[d3].py: Limited block conflicts to mismatched types
Block conflict detection was originally implemented with non-dags in
mind. But now that dags are allowed, we shouldn't treat them as errors!

Instead, we only report blocks as conflicts if multiple references have
mismatching types.

This should still be very useful for debugging the upcoming bmap work.
2025-04-29 16:25:45 -05:00
Christopher Haster
d308ec8322 Reworked tag encoding a little bit
Mainly to make room for some future planned stuff:

- Moved the mroot's redund bits from LFSR_TAG_GEOMETRY to
  LFSR_TAG_MAGIC:

    LFSR_TAG_MAGIC        0x003r  v--- ---- --11 --rr

  This has the benefit of living in a fixed location (off=0x5), which
  may make mounting/debugging easier. It also makes LFSR_TAG_GEOMETRY
  less of a special case (LFSR_TAG_MAGIC is already a _very_ special
  case).

  Unfortunately, this does get in the way of our previous magic=0x3
  encoding. To compensate (and to avoid conflicts with LFSR_TAG_NULL),
  I've added the 0x3_ prefix. This has the funny side-effect of
  rendering redunds 0-3 as ascii 0-3 (0x30-0x33), which is a complete
  accident but may actually be useful when debugging.

  Currently all config tags fit in the 0x3_ prefix, which is nice for
  debugging but not a hard requirement.

- Flipped LFSR_TAG_FILELIMIT/NAMELIMIT:

    LFSR_TAG_FILELIMIT    0x0039  v--- ---- --11 1--1
    LFSR_TAG_NAMELIMIT    0x003a  v--- ---- --11 1-1-

  The file limit is a _bit_ more fundamental. It's effectively the
  required integer size for the filesystem.

  These may also be followed by LFSR_TAG_ATTRLIMIT based on how future
  attr revisits go.

- Rearranged struct tags so that LFSR_TAG_BRANCH = 0x300:

    LFSR_TAG_BRANCH       0x030r  v--- --11 ---- --rr
    LFSR_TAG_DATA         0x0304  v--- --11 ---- -1--
    LFSR_TAG_BLOCK        0x0308  v--- --11 ---- 1err
    LFSR_TAG_DDKEY*       0x0310  v--- --11 ---1 ----
    LFSR_TAG_DID          0x0314  v--- --11 ---1 -1--
    LFSR_TAG_BSHRUB       0x0318  v--- --11 ---1 1---
    LFSR_TAG_BTREE        0x031c  v--- --11 ---1 11rr
    LFSR_TAG_MROOT        0x032r  v--- --11 --1- --rr
    LFSR_TAG_MDIR         0x0324  v--- --11 --1- -1rr
    LFSR_TAG_MTREE        0x032c  v--- --11 --1- 11rr

    *Planned

  LFSR_TAG_BRANCH is a very special tag when it comes to bshrub/btree
  traversal, so I think it deserves the subtype=0 slot.

  This also just makes everything fit together better, and makes room
  for the future planned ddkey tag.

Code changes minimal:

           code          stack          ctx
  before: 35728           2440          640
  after:  35732 (+0.0%)   2440 (+0.0%)  640 (+0.0%)
2025-04-29 16:25:00 -05:00
Christopher Haster
7dd473df82 Tweaked LFSR_TAG_STICKYNOTE encoding 0x205 -> 0x203
Now that LFS_TYPE_STICKYNOTE is a real type users can interact with, it
makes sense to group it with REG/DIR. This also has the side-effect of
making these contiguous.

---

LFSR_TAG_BOOKMARKs, however, are still hidden from the user. This
unfortunately means there will be a bit of a jump if we ever add
LFS_TYPE_SYMLINK in the future, but I'm starting to wonder if that's the
best way to approach symlinks in littlefs...

If instead LFS_TYPE_SYMLINKS were implied via custom attribute, you
could avoid the headache that comes with adding a new tag encoding, and
allow perfect compatibility with non-symlink drivers. Win win.

This seems like a better approach for _all_ of the theoretical future
types (compressed files, device files, etc), and avoids the risk of
oversaturating the type space.

---

This had a surprising impact on code for just a minor encoding tweak. I
guess the contiguousness pushed the compiler to use tables/ranges for
more things? Or maybe 3 vs 5 is just an easier constant to encode?

           code          stack          ctx
  before: 35952           2440          640
  after:  35928 (-0.1%)   2440 (+0.0%)  640 (+0.0%)
2025-04-24 14:35:52 -05:00
Christopher Haster
a73f221317 scripts: Fixed issue where rbyd lookups rejected shrub tags
This was caused by including the shrub bit in the tag comparison in
Rbyd.lookup.

Fixed by adding an extra key mask (0xfff). Note this is already how
lfsr_rbyd_lookup works in lfs.c.
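
Roughly, the fix amounts to comparing masked keys instead of raw tags. A
sketch, not the actual Rbyd.lookup code; tag_key is a hypothetical helper and
the 0x1000 shrub bit is an assumption inferred from dbgtag.py's output:

```python
SHRUB = 0x1000     # assumed position of the shrub bit
KEY_MASK = 0xfff   # the extra key mask


def tag_key(tag):
    # strip the shrub bit (and anything above it) before comparing
    return tag & KEY_MASK


if __name__ == "__main__":
    found = SHRUB | 0x202
    wanted = 0x202
    print(found == wanted, tag_key(found) == tag_key(wanted))
```

Comparing raw tags rejects the shrub variant, comparing keys accepts it.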
2025-04-23 23:19:37 -05:00
Christopher Haster
6d97398efc scripts: dbglfs.py: Fixed a couple mid=-1 issues
- Fixed Mtree.lookupleaf accepting mbid=0, which caused dbglfs.py to
  double print all files with mbid=-1

- Fixed grm mids not being mapped to mbid=-1 and related orphan false
  positives
2025-04-23 23:19:05 -05:00
Christopher Haster
8f1ccf089e Adopted lookupleaf, reworked internal btree APIs
This was a surprising side-effect of the script rework: Realizing the
internal btree/rbyd lookup APIs were awkwardly inconsistent and could be
improved with a couple tweaks:

- Adopted lookupleaf name for functions that return leaf rbyds/mdirs.

  There's an argument this should be called lookupnextleaf, since it
  returns the next bid, unlike lookup, but I'm going to ignore that
  argument because:

  1. A non-next lookupleaf doesn't really make sense for trees where
     you don't have to fetch the leaf (the mtree)

  2. It would be a bit too verbose

- Adopted commitleaf name for functions that accept leaf rbyds.

  This makes the lfsr_bshrub_commit -> lfsr_btree_commit__ mess a bit
  more readable.

- Strictly limited lookup and lookupnext to return rattrs, even in
  complex trees like the mtree.

  Most use cases will probably stick to the lookupleaf variants, but at
  least the behavior will be consistent.

- Strictly limited lookup to expect a known bid/rid.

  This only really matters for lfsr_btree/bshrub_lookup, which as a
  quirk of their implementation _can_ lookup both bid + rattr at the
  same time. But I don't think we'll need this functionality, and
  limiting the behavior may allow for future optimizations.

  Note there is no lfsr_file_lookup. File btrees currently only ever
  have a single leaf rattr, so this API doesn't really make sense.

Internal API changes:

- lfsr_btree_lookupnext_ -> lfsr_btree_lookupleaf
- lfsr_btree_lookupnext  -> lfsr_btree_lookupnext
- lfsr_btree_lookup      -> lfsr_btree_lookup
- added                     lfsr_btree_namelookupleaf
- lfsr_btree_namelookup  -> lfsr_btree_namelookup
- lfsr_btree_commit__    -> lfsr_btree_commit_
- lfsr_btree_commit_     -> lfsr_btree_commitleaf
- lfsr_btree_commit      -> lfsr_btree_commit

- added                     lfsr_bshrub_lookupleaf
- lfsr_bshrub_lookupnext -> lfsr_bshrub_lookupnext
- lfsr_bshrub_lookup     -> lfsr_bshrub_lookup
- lfsr_bshrub_commit_    -> lfsr_bshrub_commitleaf
- lfsr_bshrub_commit     -> lfsr_bshrub_commit

- lfsr_mtree_lookup      -> lfsr_mtree_lookupleaf
- added                     lfsr_mtree_lookupnext
- added                     lfsr_mtree_lookup
- added                     lfsr_mtree_namelookupleaf
- lfsr_mtree_namelookup  -> lfsr_mtree_namelookup

- added                     lfsr_file_lookupleaf
- lfsr_file_lookupnext   -> lfsr_file_lookupnext
- added                     lfsr_file_commitleaf
- lfsr_file_commit       -> lfsr_file_commit

Also added lookupnext to Mdir/Mtree in the dbg scripts.

Unfortunately this did add both code and stack, but only because of the
optional mdir returns in the mtree lookups:

           code          stack          ctx
  before: 35520           2440          636
  after:  35548 (+0.1%)   2472 (+1.3%)  636 (+0.0%)
2025-04-20 15:53:18 -05:00
Christopher Haster
3ca6670dcd Always log mbid=-1 for mroots and inlined mdirs
So mbid=0 now implies the mdir is not inlined.

Downsides:

- A bit more work to calculate
- May lose information due to masking everything when mtree.weight==0
- Risk of confusion when in-lfs.c state doesn't match (mbid=-1 is
  implied by mtree.weight==0)

Upsides:

- Includes more information about the topology of the mtree
- Avoids multiple dbgmbids for the same physical mdir

Also added lfsr_dbgmbid and lfsr_dbgmrid to help make logging
easier/more consistent.

And updated dbg scripts.
2025-04-20 15:53:18 -05:00
Christopher Haster
04d3002f3a Adopted ceiling division in mbits formula
So now:
               (block_size)
  mbits = nlog2(----------) = nlog2(block_size) - 3
               (     8    )

Instead of:

               (     (block_size))
  mbits = nlog2(floor(----------)) = nlog2(block_size & ~0x7) - 3
               (     (     8    ))

This makes the post-log - 3 formula simpler, which we probably want to
prefer as it avoids a division. And ceiling is arguably more intuitive
corner case behavior.
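
A quick way to sanity check both identities, sketched in Python. This assumes
nlog2(x) means ceil(log2(x)), which may not match the lfs.c helper exactly for
edge cases, and the function names are only for illustration:

```python
def nlog2(x):
    # assumed definition: ceil(log2(x)) for x >= 1
    return (x - 1).bit_length()


def mbits_ceil(block_size):
    # new formula, ceiling division before the log
    return nlog2(-(-block_size // 8))


def mbits_floor(block_size):
    # previous formula, floor division before the log
    return nlog2(block_size // 8)


def mbits_postlog(block_size):
    # division-free equivalent of mbits_ceil
    return nlog2(block_size) - 3


if __name__ == "__main__":
    for bs in [512, 4096, 4097, 5000]:
        print(bs, mbits_floor(bs), mbits_ceil(bs), mbits_postlog(bs))
```

The two only differ for block sizes that sit just above a power of two, e.g.
block_size=4097 gives 9 with floor and 10 with ceiling.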

This may seem like a minor detail, but because mbits is purely
block_size derived and not configurable, any quirks here will become
a permanent compatibility requirement.

And hey, it saves a couple bytes (I'm not really sure why, the division
should've been optimized to a shift):

           code          stack          ctx
  before: 35528           2440          636
  after:  35520 (-0.0%)   2440 (+0.0%)  636 (+0.0%)
2025-04-20 15:53:18 -05:00
Christopher Haster
bd70270e11 scripts: Added -w/--word-bits to bound dbgleb128/dbgle32 parsing
This is limited to dbgle32.py, dbgleb128.py, and dbgtag.py for now.

This more closely matches how littlefs behaves, in that we read a
bounded number of bytes before leb128 decoding. This minimizes bugs
related to leb128 overflow and avoids reading inherently undecodable
data.

The previous unbounded behavior is still available with -w0.

Note this gives dbgle32.py much more flexibility in that it can now
decode other integer widths. Uh, ignore the name for now. At least it's
self documenting that the default is 32-bits...

---

Also fixed a bug in fromleb128 where size was reported incorrectly on
offset + truncated leb128.
2025-04-16 15:23:12 -05:00
Christopher Haster
0cea8b96fb scripts: Fixed O(n^2) slicing in Rbyd.fetch
Do you see the O(n^2) behavior in this loop?

  j = 0
  while j < len(data):
      word, d = fromleb(data[j:])
      j += d

The slice, data[j:], creates an O(n) copy every iteration of the loop.

A bit tricky. Or at least I found it tricky to notice. Maybe because
array indexing being cheap is baked into my brain...

Long story short, this repeated slicing resulted in O(n^2) behavior in
Rbyd.fetch and probably some other functions. Even though we don't care
_too_ much about performance in these scripts, having Rbyd.fetch run in
O(n^2) isn't great.

Tweaking all from* functions to take an optional index solves this, at
least on paper.
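
For reference, the index-taking variant looks something like this. A sketch
only: the name fromleb128 and the (word, size) return shape follow the loop
above, but the real from* functions may differ in details:

```python
def fromleb128(data, j=0):
    # decode one unsigned leb128 starting at index j, without slicing,
    # returns (word, size)
    word = 0
    d = 0
    while j + d < len(data):
        b = data[j + d]
        word |= (b & 0x7f) << (7 * d)
        d += 1
        if not b & 0x80:
            break
    return word, d


def decode_all(data):
    # same loop as before, but passing the index instead of data[j:]
    words = []
    j = 0
    while j < len(data):
        word, d = fromleb128(data, j)
        words.append(word)
        j += d
    return words


if __name__ == "__main__":
    print(decode_all(bytes([0x01, 0x80, 0x01, 0x7f])))
```

Each byte is now touched once, so the loop is O(n) overall.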

---

In practice I didn't actually find any measurable performance gain. I
guess array slicing in Python is optimized enough that the constant
factor takes over?

(Maybe it's being helped by us limiting Rbyd.fetch to block_size in most
scripts? I haven't tested NAND block sizes yet...)

Still, it's good to at least know this isn't a bottleneck.
2025-04-16 15:23:11 -05:00
Christopher Haster
b5c3b97ae1 scripts: Reworked dbgtag.py, added -i/--input, included hex in output
This just gives dbgtag.py a few more bells and whistles that may be
useful:

- Can now parse multiple tags from hex:

    $ ./scripts/dbgtag.py -x 71 01 01 01 12 02 02 02
    71 01 01 01    altrgt 0x101 w1 -1
    12 02 02 02    shrubdir w2 2

  Note this _does_ skip attached data, which risks some confusion but
  not skipping attached data will probably end up printing a bunch of
  garbage for most use cases:

    $ ./scripts/dbgtag.py -x 01 01 01 04 02 02 02 02 03 03 03 03
    01 01 01 04    gdelta 0x01 w1 4
    03 03 03 03    struct 0x03 w3 3

- Included hex in output. This is helpful for learning about the tag
  encoding and also helps identify tags when parsing multiple tags.

  I considered also including offsets, which might help with
  understanding attached data, but decided it would be too noisy. At
  some point you should probably jump to dbgrbyd.py anyways...

- Added -i/--input to read tags from a file. This is roughly the same as
  -x/--hex, but allows piping from other scripts:

    $ ./scripts/dbgcat.py disk -b4096 0 -n4,8 | ./scripts/dbgtag.py -i-
    80 03 00 08    magic 8

  Note this reads the entire file in before processing. We'd need to fit
  everything into RAM anyways to figure out padding.
2025-04-16 15:23:10 -05:00
Christopher Haster
a5747bb2b2 scripts: dbgmtree.py: Fixed minor mtree rendering/traversal issues
- Added TreeArt __bool__ and __len__.

  This was causing a crash in _treeartfrommtreertree when rtree was
  empty.

  The code was not updated in the set -> TreeArt class transition, and
  went unnoticed because it's unlikely to be hit unless the filesystem
  is corrupt.

  Fortunately(?) realtime rendering creates a bunch of transiently
  corrupt filesystem images.

- Tweaked lookupleaf to not include mroots in their own paths.

  This matches the behavior of leaf mdirs, and is intentionally
  different from btree's lookupleaf which needs to lookup the leaf rattr
  to terminate.

- Tweaked leaves to not remove the last path entry if it is an mdir.

  This hid the previous lookupleaf inconsistency. We only remove the
  last rbyd from the path because it is redundant, and for mdirs/mroots
  it should never be redundant.

  I ended up just replacing the corrupt check with an explicit check
  that the rbyd is redundant. This should be more precise and avoid
  issues like this in the future.

  Also adopted explicit redundant checks in Btree.leaves and
  Lfs.File.leaves.
2025-04-16 15:23:08 -05:00
Christopher Haster
57c77b1b72 scripts: Fixed most flickering issues in RingIO
Two new tricks:

1. Hide the cursor while redrawing the ring buffer.

2. Build up the entire redraw in RAM first, and render everything in a
   single write call.

These _mostly_ get rid of the cursor flickering issues in rapidly
updating scripts.
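
The two tricks, sketched in isolation. This is not the RingIO code itself
(build_redraw is a hypothetical helper, and the cursor repositioning a real
ring buffer needs is omitted); it only shows the escape codes and the single
write:

```python
import io
import sys


def build_redraw(lines):
    # build the entire redraw in RAM first
    buf = io.StringIO()
    buf.write('\x1b[?25l')      # hide the cursor
    for line in lines:
        buf.write('\x1b[K')     # clear to end of line
        buf.write(line)
        buf.write('\n')
    buf.write('\x1b[?25h')      # show the cursor again
    return buf.getvalue()


if __name__ == "__main__":
    # render everything in a single write call
    sys.stdout.write(build_redraw(['hello', 'world']))
    sys.stdout.flush()
```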
2025-04-16 15:23:05 -05:00
Christopher Haster
a5e59b2190 scripts: maps: Reverted all padding for status strings
After all, who doesn't love a good bit of flickering.

I think I was trying to be too clever, so reverting.

Printing these with no padding is the simplest solution, provides the
best information density, and worst case you can always add -s1 to limit
the update frequency if flickering is hurting readability.
2025-04-16 15:22:59 -05:00
Christopher Haster
27152ec597 scripts: maps: Adopted persistent padding for status strings
This automatically minimizes the status strings without flickering, all
it took was a bit of ~*global state*~.

---

If I'm remembering correctly, this was actually how tracebd.py used to
work before dbgbmap.py was added. The idea was dropped with dbgbmap.py
since dbgbmap.py relied on watch.py for real-time rendering and couldn't
persist state.

But now dbgbmap.py has its own -k/--keep-open flag, so that's not a
problem.
2025-04-16 15:22:58 -05:00
Christopher Haster
97c2287177 scripts: maps: Assume percentages never hit 100.0%
This isn't true. Especially for dbgbmap.py, 100% is very possible in
filesystems with small files. But by limiting padding to 99.9%, we avoid
the annoying wasted space caused by the rare but occasional 100.0%.
2025-04-16 15:22:57 -05:00
Christopher Haster
eb4c4c612e scripts: Dropped --padding from ascii art scripts
No one is realistically ever going to use this.

Ascii art is just too low resolution, trying to pad anything just wastes
terminal space. So we might as well not support --padding and save on
the additional corner cases.

Worst case, in the future we can always find this commit and revert
things.
2025-04-16 15:22:56 -05:00
Christopher Haster
5e817be9cc scripts: maps: Cleaned up comments and junk
This took a bit of a messy route, but these scripts should be good to go
now.
2025-04-16 15:22:54 -05:00
Christopher Haster
50f652d44f scripts: maps: Cleaned up/moved header generation before rendering
Should've probably been two commits, but:

1. Cleaned up tracebd.py's header generation to be consistent with
   dbgbmap.py and other scripts.

   Percentage fields are now consistently floats in all scripts,
   allowing user-specified precision when punescaping.

2. Moved header generation up to where we still have the disk open (in
   dbgbmap[d3].py), to avoid issues with lazy Lfs attrs trying to access
   the disk after it's been closed.

   Found while testing with --title='cksum %(cksum)08x'. Lfs tries to
   validate the gcksum last minute and things break.
2025-04-16 15:22:53 -05:00
Christopher Haster
f0b8d34230 scripts: maps: Fixed divide-by-zero when packing blocks into small maps
This can be hit when dealing with very small maps, which is common since
we're rendering to the terminal. Not crashing here at least allows the
header/usage string to be shown.
2025-04-16 15:22:52 -05:00
Christopher Haster
cffa9ec67e scripts: Adopted ring name for stdout substitution 2025-04-16 15:22:51 -05:00
Christopher Haster
5952431660 scripts: Consistently use color='auto' default in main 2025-04-16 15:22:50 -05:00
Christopher Haster
61ce23ce7e scripts: maps: Fixed some aspect ratio issues, limited scope
Replacing -R/--aspect-ratio, --to-ratio now calculates the width/height
_before_ adding decoration such as headers, stack info, etc.

I was toying around with generalizing -R/--aspect-ratio to include
decorations, but when Wolfram Alpha spit out this mess for the post-header
formula:

      header*r - sqrt(4*v*r + padding^2*r)
  w = ------------------------------------
                        2

I decided maybe a generalized -R/--aspect-ratio is a _bit_ too
complicated for what are supposed to be small standalone Python
scripts...

---

Also fixed the scaling formula, which should've taken the sqrt _after_
multiplying by the aspect ratio:

  w = sqrt(v*r)

I only noticed while trying to solve for the more complicated
post-decoration formula, the difference is pretty minor.
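
The corrected pre-decoration formula falls out of w/h = r and w*h = v. A
sketch, assuming v is the number of cells to fit and r is width/height;
size_for_ratio is a hypothetical name and the rounding choices here are mine:

```python
import math


def size_for_ratio(v, r):
    # w/h = r and w*h = v  =>  w = sqrt(v*r)
    w = max(1, math.ceil(math.sqrt(v * r)))
    # round the height up so all v cells still fit
    h = max(1, math.ceil(v / w))
    return w, h


if __name__ == "__main__":
    print(size_for_ratio(256, 4))
    print(size_for_ratio(100, 1))
```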
2025-04-16 15:22:48 -05:00
Christopher Haster
fc5bfdae14 scripts: Adopted -n/--lines in most ascii art scripts
The notable exception being plot.py, where line-level history doesn't
really make sense.

These scripts all default to height=1, and -n/--lines can be useful for
viewing changes over time.

In theory you could achieve something similar to this with tailpipe.py,
but you would lose the header info, which is useful.

---

Note, as a point of simplicity, we do _not_ show sub-char history like
we used to in tracebd.py. That was way too complicated for what it was
worth.
2025-04-16 15:22:46 -05:00
Christopher Haster
8e3760c5b8 scripts: Tweaked punescape to expect dict-like attrs
This simplifies attrs a bit, and scripts can always override
__getitem__ if they want to provide lazy attr generation.

The original intention of accepting functions was to make lazy attr
generation easier, but while tinkering around with the idea I realized
the actual attr mapping/generation would be complicated enough that
you'd probably want a full class anyways.

All of our scripts are only using dict attrs anyways. And lazy attr
generation is probably a premature optimization for the same reason
everyone's ok with Python's slices being O(n).
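
If a script ever did want lazy attrs, overriding __getitem__ might look
something like this. A sketch under assumptions: LazyAttrs is hypothetical,
and plain %-formatting stands in for punescape, whose internals aren't shown
here:

```python
class LazyAttrs:
    # dict-like attrs that only compute a value on first lookup
    def __init__(self, generators):
        self.generators = generators
        self.cache = {}
        self.calls = 0

    def __getitem__(self, key):
        if key not in self.cache:
            self.calls += 1
            # raises KeyError for unknown attrs, just like a normal dict
            self.cache[key] = self.generators[key]()
        return self.cache[key]


def format_title(fmt, attrs):
    # %-formatting accepts any mapping-like object with __getitem__
    return fmt % attrs


if __name__ == "__main__":
    attrs = LazyAttrs({
        'block_size': lambda: 4096,
        'block_count': lambda: 256,
    })
    print(format_title('bd %(block_size)sx%(block_count)s', attrs))
```

A plain dict works with the same format_title, which is the common case.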
2025-04-16 15:22:45 -05:00
Christopher Haster
06bb34fd99 scripts: Adopted Attr class changes in all scripts
Mainly the addition of Attr.getall, Attr.get, and changing
Attr.__getitem__ to raise KeyError (just like a normal dict).
2025-04-16 15:22:44 -05:00
Christopher Haster
b715e9a749 scripts: Prefer 1;30-37m ansi codes over 90-97m
Reading Wikipedia:

> Later terminals added the ability to directly specify the "bright"
> colors with 90–97 and 100–107.

So if we want to stick to one pattern, we should probably go with
brightness as a separate modifier.

This shouldn't noticeably change any script, unless your terminal
interprets 90-97m colors differently from 1;30-37m, in which case things
should be more consistent now.
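
In other words, something like this (a sketch, color is a hypothetical helper
and not a function in the scripts):

```python
def color(text, n, bright=False):
    # n is the base color index 0-7, brightness is a separate modifier
    # (1;30-37m) instead of the later 90-97m range
    code = '1;%d' % (30 + n) if bright else '%d' % (30 + n)
    return '\x1b[%sm%s\x1b[m' % (code, text)


if __name__ == "__main__":
    print(color('normal red', 1))
    print(color('bright red', 1, bright=True))
```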
2025-04-16 15:22:43 -05:00
Christopher Haster
cd039f6227 scripts: Adopted height-relative negative values for -n/--lines
This mirrors how -H/--height and -W/--width work, with -n-1 using the
terminal height - 1 for the output.

This is very useful for carving out space for the shell prompt and other
things, without sacrificing automatic sizing.
2025-04-16 15:22:42 -05:00
Christopher Haster
2fb115b84b scripts: Gave explicit chars priority over braille/dots
This allows for combining braille/dots with custom chars for specific
elements:

  $ ./scripts/codemap.py lfs.o -H16 -: -.lfsr_rbyd_appendrattr=A

Note this is already how plot.py works, letting braille/dots take
priority in the new scripts/reworks was just an oversight.
2025-04-16 15:22:41 -05:00
Christopher Haster
edc6c7ec99 scripts: dbgbmap[d3].py: Reverted percentages to entire bmap
So percentages now include unused blocks, instead of being derived from
only blocks in use.

This is a bit inconsistent with tracebd.py, where we show ops as
percentages of all ops, but it's more useful:

- mdir+btree+data gives you the total usage, which is useful if you want
  to know how full the disk is. You can't get this info from in-use
  percentages.

  Note that the total field is sticking around, so you can show the total
  usage directly if you provide your own title string:

    $ ./scripts/dbgbmap.py disk \
        --title="bd %(block_size)sx%(block_count)s, %(total_percent)s"

- You can derive the in-use percentages from total percentages if you
  need them: in-use-mdir = mdir/(mdir+btree+data).

  Maybe this should be added to the --title fields, but I can't think of
  a good name at the moment...
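
  To make the arithmetic concrete, here is a small sketch with made-up
  block counts. The function and key names are hypothetical and do not
  match the actual --title fields:

```python
def usage_percentages(mdir, btree, data, block_count):
    in_use = mdir + btree + data
    return {
        # percentages of the entire bmap, including unused blocks
        'mdir_percent': 100*mdir / block_count,
        'total_percent': 100*in_use / block_count,
        # in-use percentage derived from the totals:
        # in-use-mdir = mdir/(mdir+btree+data)
        'in_use_mdir_percent': 100*mdir / in_use if in_use else 0.0,
    }

# 2 mdir blocks, 3 btree blocks, 15 data blocks on a 100 block disk
p = usage_percentages(2, 3, 15, 100)
```

  Here mdirs take up 2% of the disk, but 10% of the blocks in use.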

Attempting to make tracebd.py consistent with dbgbmap.py doesn't really
make sense either: Showing op percentages of total bmap will usually be
an extremely small number.

At least dbgbmap.py is consistent with tracebd.py's --wear percentage,
which is out of all erase state in the bmap.
2025-04-16 15:22:40 -05:00
Christopher Haster
465fdd1fca scripts: dbgbmap.py tweaks
- Create a grid with dashes even in -%/--usage mode.

  This was surprisingly annoying since it breaks the existing
  1 block = 1 char assumption.

- Derive percentages from in-use blocks, not all blocks. This matches
  behavior of tracebd.py's percentages (% read/prog/erase).

  Though not tracebd.py's percent wear...

- Added mdir/btree/data counts/percentages to dbgbmapd3.py, for use in
  custom --title strings and the newly added --title-usage.

  Because why not. Unlike dbgbmap.py, performance is not a concern at
  all, and the consistency between these two scripts helps
  maintainability.

  Case in point: also fixed a typo from copying the block_count
  inference between scripts.
2025-04-16 15:22:39 -05:00
Christopher Haster
d5c0e142f0 scripts: Reworked tracebd.py, needs cleanup
It's a mess but it's working. Still a number of TODOs to clean up...

This adopts all of the changes in dbgbmap.py/dbgbmapd3.py, block
grouping, nested curves, Canvas, Attrs, etc:

- Like dbgbmap.py, we now group by block first before applying space
  filling curves, using nested space filling curves to render byte-level
  operations.

  Python's ft.lru_cache really shines here.

  The previous behavior is still available via -u/--contiguous

- Adopted most features in dbgbmap.py, so --to-scale, -t/--tiny, custom
  --title strings, etc.

- Adopted Attrs so now chars/coloring can be customized with
  -./--add-char, -,/--add-wear-char, -C/--add-color,
  -G/--add-wear-color.

- Renamed -R/--reset -> --volatile, which is a much better name.

- Wear is now colored cyan -> white -> red, which is a bit more
  visually interesting. And we're not using cyan in any scripts yet.

In addition to the new stuff, there were a few simplifications:

- We no longer support sub-char -n/--lines with -:/--dots or
  -⣿/--braille. Too complicated, required Canvas state hacks to get
  working, and wasn't super useful.

  We probably want to avoid doing too much cleverness with -:/--dots and
  -⣿/--braille since we can't color sub-chars.

- Dropped -@/--blocks byte-level range stuff. This was just not worth
  the amount of complexity it added. -@/--blocks is now limited to
  simple block ranges. High-level scripts should stick to high-level
  options.

- No fancy/complicated Bmap class. The bmap object is just a dict of
  TraceBlocks which contain RangeSets for relevant operations.

  Actually the new RangeSet class deserves a mention but this commit
  message is probably already too long.

  RangeSet is a decently efficient set of, well, ranges, that can be
  merged and queried. In a lower-level language it should be implemented
  as a binary tree, but in Python we're just using a sorted list because
  we're probably not going to be able to beat O(n) list operations.

- Wear is tracked at the block level, no reason to overcomplicate this.

- We no longer resize based on new info. Instead we either expect a
  -b/--block-size argument or wait until first bd init call.

  We can probably drop the block size in BD_TRACE statements now, but
  that's a TODO item.

- Instead of one amalgamated regex, we use string searches to figure out
  the bd op and then smaller regexes to parse. Lesson learned here:
  Python's string search is very fast (compared to regex).

- We do _not_ support labels on blocks like we do in treemap.py/
  codemap.py. It's less useful here and would just be more hassle.
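
For the curious, the sorted-list RangeSet idea mentioned above can be
sketched roughly like this. This is an illustrative reimplementation, not
the class from tracebd.py; the method names and the half-open
[start, stop) convention are assumptions:

```python
import bisect

class RangeSet:
    """Sketch of a set of half-open [start, stop) ranges in a sorted list."""
    def __init__(self):
        # sorted, non-overlapping, non-adjacent (start, stop) tuples
        self.ranges = []

    def add(self, start, stop):
        if start >= stop:
            return
        rs, merged, i = self.ranges, [], 0
        # keep ranges that end strictly before the new range
        while i < len(rs) and rs[i][1] < start:
            merged.append(rs[i])
            i += 1
        # absorb any overlapping or adjacent ranges, O(n) worst case
        while i < len(rs) and rs[i][0] <= stop:
            start = min(start, rs[i][0])
            stop = max(stop, rs[i][1])
            i += 1
        merged.append((start, stop))
        merged.extend(rs[i:])
        self.ranges = merged

    def __contains__(self, x):
        # find the last range with start <= x
        i = bisect.bisect_right(self.ranges, (x, float('inf'))) - 1
        return i >= 0 and x < self.ranges[i][1]

rs = RangeSet()
rs.add(0, 4)
rs.add(8, 12)
rs.add(4, 8)    # bridges the two ranges above
rs.add(20, 24)
```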

I also tried to reorganize main a bit to mirror the simple two-main
approach in dbgbmap.py and other ascii-rendering scripts, but it's a bit
difficult here since trace info is very stateful. Building up main
functions in the main main function seemed to work well enough:

  main -+-> main_ -> trace__ (main thread)
        '-> draw_ -> draw__ (daemon thread)

---

You may note some weirdness going on with flags. That's me trying to
avoid upcoming flag conflicts.

I think we want -n/--lines in more scripts, now that it's relatively
self-contained, but this conflicts with -n/--namespace-depth in
codemap[d3].py, and risks conflict with -N/--notes in csv.py which may
end up with namespace-related functionality in the future.

I ended up hijacking -_, but this conflicted with -_/--add-line-char in
plot.py, but that's ok because we also want a common "secondary char"
flag for wear in tracebd.py... Long story short I ended up moving a
bunch of flags around:

- added                   -n/--lines
- -n/--namespace-depth -> -_/--namespace-depth
- -N/--notes           -> -N/--notes
- -./--add-char        -> -./--add-char
- -_/--add-line-char   -> -,/--add-line-char
- added                   -,/--add-wear-char
- -C/--color           -> -C/--add-color
- added                   -G/--add-wear-color

Worth it? Dunno.
2025-04-16 15:22:38 -05:00
Christopher Haster
3ff25a4fdf scripts: dbgbmap[d3].py: Disabled gcksum checking by default
By default, we don't actually do anything if we find an invalid gcksum,
so there's no reason to calculate it every time.

Though this performance improvement may not be very noticeable:

  dbgbmap.py w/  crc32c lib w/  no_ck --no-ckdata: 0m0.221s
  dbgbmap.py w/  crc32c lib w/o no_ck --no-ckdata: 0m0.269s
  dbgbmap.py w/o crc32c lib w/  no_ck --no-ckdata: 0m0.388s
  dbgbmap.py w/o crc32c lib w/o no_ck --no-ckdata: 0m0.490s
  dbgbmap.old.py:                                  0m0.231s

Note that there's no point in adopting this in dbgbmapd3.py: 1. svg
rendering dominates (probably, I haven't measured this), and 2. we
default to showing the littlefs mount string instead of mdir/btree/data
percentages.
2025-04-16 15:22:36 -05:00
Christopher Haster
3820be180d scripts: Adopted crc32c lib when available
Jumping from a simple Python implementation to the fully hardware
accelerated crc32c library basically deletes any crc32c related
bottlenecks:

  crc32c.py disk (1MiB) w/  crc32c lib: 0m0.027s
  crc32c.py disk (1MiB) w/o crc32c lib: 0m0.844s

This uses the same try-import trick we use for inotify_simple, so we get
the speed improvement without losing portability.
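
The try-import trick looks roughly like the following sketch. The
fallback here is a simple bitwise crc32c, which may differ from the
script's actual implementation; crc32c_lib is just a local alias:

```python
# prefer the hardware-accelerated crc32c library if it is installed
try:
    import crc32c as crc32c_lib
except ModuleNotFoundError:
    crc32c_lib = None

def crc32c(data, crc=0):
    if crc32c_lib is not None:
        return crc32c_lib.crc32c(data, crc)
    # slow but portable bitwise fallback (reflected poly 0x82f63b78)
    crc ^= 0xffffffff
    for b in data:
        crc ^= b
        for _ in range(8):
            crc = (crc >> 1) ^ ((crc & 1) * 0x82f63b78)
    return crc ^ 0xffffffff

cksum = crc32c(b'123456789')
```

Both paths accept a previous crc, so checksums can be computed
incrementally either way.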

---

In dbgbmap.py:

  dbgbmap.py w/  crc32c lib:             0m0.273s
  dbgbmap.py w/o crc32c lib:             0m0.697s
  dbgbmap.py w/  crc32c lib --no-ckdata: 0m0.269s
  dbgbmap.py w/o crc32c lib --no-ckdata: 0m0.490s
  dbgbmap.old.py:                        0m0.231s

The bulk of the runtime is still in Rbyd.fetch, but this is now
dominated by leb128 decoding, which makes sense. We do ~twice as many
fetches in the new dbgbmap.py in order to calculate the gcksum (which
we then ignore...).
2025-04-16 15:22:34 -05:00
Christopher Haster
b5242d02ac scripts: dbgbmap[d3].py: Moved rbyd/bptr checks behind --no-ckmeta/ckdata
Checking every data block for errors really slows down dbgbmap.py, which
is unfortunate for realtime rendering.

To be fair, the real issue is our naive crc32c impl, but the mindset of
these scripts is if you want speed you really shouldn't be using Python
and should rewrite the script in Rust/C/something (see prettyasserts for
example). You _could_ speed things up with a table-based crc32c, but at
that point you should probably just find C-bindings for crc32c (maybe
optional like inotify?... actually that's not a bad idea...).

At least --no-ckmeta/--no-ckdata allow for the previous behavior of not
checking for relevant errors for a bit of speed.

---

Note that --no-ckmeta currently doesn't really do anything. I toyed with
adding a non-fetching Rbyd.fetchtrunk method, but this seems out of
scope for these scripts.
2025-04-16 15:22:34 -05:00
Christopher Haster
6ea18e6579 scripts: Tweaked bd.read to behave like an actual bd_read callback
This better matches what you would expect from a function called
bd.read, at least in the context of littlefs, while also decreasing the
state (seek) we have to worry about.

Note that bd.readblock already behaved mostly like this, and is
preferred by every class except for Bptr.
2025-04-16 15:22:32 -05:00
Christopher Haster
b2911fbbe7 scripts: Removed item/iter magic methods from fs object classes
So no more __getitem__, __contains__, or __iter__ for Rbyd, Btree, Mdir,
Mtree, Lfs.File, etc.

These were way too error-prone, especially when accidental unpacking
triggered unintended disk traversal and weird error states. We didn't
even use the implicit behavior because we preferred the full name for
heavy disk operations.

The motivation for this was Python not catching this bug, which is a bit
silly:

  rid, rattr, *path_ = rbyd
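
A toy reproduction of why this goes unnoticed. Both classes below are
stand-ins invented for this sketch, not the real Rbyd:

```python
class IterableRbyd:
    """Stand-in with the old implicit iteration behavior."""
    def __init__(self, rattrs):
        self.rattrs = rattrs
        self.traversals = 0

    def __iter__(self):
        # in the real scripts this would be an expensive disk traversal
        self.traversals += 1
        return iter(self.rattrs)

class PlainRbyd:
    """Stand-in without any item/iter magic methods."""
    def __init__(self, rattrs):
        self.rattrs = rattrs

# unpacking the object itself silently iterates it
rbyd = IterableRbyd([0, 'name', 'a', 'b'])
rid, rattr, *path_ = rbyd

# without __iter__, the same mistake fails loudly
try:
    rid_, rattr_, *path__ = PlainRbyd([0, 'name', 'a', 'b'])
    caught = False
except TypeError:
    caught = True
```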
2025-04-16 15:22:28 -05:00
Christopher Haster
81b1a3cb71 scripts: dbgbmap.py: Dropped parents, siblings, and traverse path
We don't use these in dbgbmap.py, so no reason to calculate them. (We do
use them in dbgbmapd3.py, but that's a different script.)
2025-04-16 15:22:27 -05:00