Commit Graph

115 Commits

Author SHA1 Message Date
Christopher Haster
a5747bb2b2 scripts: dbgmtree.py: Fixed minor mtree rendering/traversal issues
- Added TreeArt __bool__ and __len__.

  This was causing a crash in _treeartfrommtreertree when rtree was
  empty.

  The code was not updated in the set -> TreeArt class transition, and
  went unnoticed because it's unlikely to be hit unless the filesystem
  is corrupt.

  Fortunately(?) realtime rendering creates a bunch of transiently
  corrupt filesystem images.

- Tweaked lookupleaf to not include mroots in their own paths.

  This matches the behavior of leaf mdirs, and is intentionally
  different from btree's lookupleaf which needs to lookup the leaf rattr
  to terminate.

- Tweaked leaves to not remove the last path entry if it is an mdir.

  This hid the previous lookupleaf inconsistency. We only remove the
  last rbyd from the path because it is redundant, and for mdirs/mroots
  it should never be redundant.

  I ended up just replacing the corrupt check with an explicit check
  that the rbyd is redundant. This should be more precise and avoid
  issues like this in the future.

  Also adopted explicit redundant checks in Btree.leaves and
  Lfs.File.leaves.
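A minimal sketch of why the set -> class transition needed these methods
(everything beyond the TreeArt name is hypothetical): a plain class
instance is always truthy, so emptiness checks that worked on a set
silently change meaning after the transition.

```python
class TreeArtWithout:
    # no __bool__/__len__: an empty instance is still truthy
    def __init__(self, branches=()):
        self.branches = list(branches)

class TreeArt:
    def __init__(self, branches=()):
        self.branches = list(branches)
    # restore set-like emptiness semantics
    def __len__(self):
        return len(self.branches)
    def __bool__(self):
        return bool(self.branches)

# code that previously skipped empty sets now falls through with the
# bare class, which is how an empty rtree could reach a crash later
assert bool(TreeArtWithout())   # truthy despite being empty!
assert not bool(TreeArt())      # empty again behaves like an empty set
```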
2025-04-16 15:23:08 -05:00
Christopher Haster
b715e9a749 scripts: Prefer 1;30-37m ansi codes over 90-97m
Reading Wikipedia:

> Later terminals added the ability to directly specify the "bright"
> colors with 90–97 and 100–107.

So if we want to stick to one pattern, we should probably go with
brightness as a separate modifier.

This shouldn't noticeably change any script, unless your terminal
interprets 90-97m colors differently from 1;30-37m, in which case things
should be more consistent now.
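For concreteness, both escape strings below ask for bright red; actual
rendering depends on your terminal. The scripts now prefer the first
spelling.

```python
# 1;31m: "bright" as a separate modifier on the base 30-37m color
BRIGHT_RED_MODIFIER = '\x1b[1;31m'
# 91m: the later direct bright-color extension (90-97m)
BRIGHT_RED_DIRECT = '\x1b[91m'
RESET = '\x1b[0m'

print('%shello%s' % (BRIGHT_RED_MODIFIER, RESET))
print('%shello%s' % (BRIGHT_RED_DIRECT, RESET))
```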
2025-04-16 15:22:43 -05:00
Christopher Haster
3820be180d scripts: Adopted crc32c lib when available
Jumping from a simple Python implementation to the fully hardware
accelerated crc32c library basically deletes any crc32c related
bottlenecks:

  crc32c.py disk (1MiB) w/  crc32c lib: 0m0.027s
  crc32c.py disk (1MiB) w/o crc32c lib: 0m0.844s

This uses the same try-import trick we use for inotify_simple, so we get
the speed improvement without losing portability.
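The try-import pattern looks roughly like this (the wrapper name is
hypothetical; the pure-Python fallback is a bit-at-a-time loop over the
reflected crc32c polynomial 0x82f63b78):

```python
try:
    # hardware-accelerated when the crc32c package is installed
    import crc32c as _crc32c
    def crc32c(data, crc=0):
        return _crc32c.crc32c(data, crc)
except ImportError:
    # portable pure-Python fallback, bit-at-a-time
    def crc32c(data, crc=0):
        crc ^= 0xffffffff
        for b in data:
            crc ^= b
            for _ in range(8):
                crc = (crc >> 1) ^ (0x82f63b78 if crc & 1 else 0)
        return crc ^ 0xffffffff

# known crc32c check value
assert crc32c(b'123456789') == 0xe3069283
```

Callers never see which implementation they got, so the scripts stay
portable while picking up the speedup when available.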

---

In dbgbmap.py:

  dbgbmap.py w/  crc32c lib:             0m0.273s
  dbgbmap.py w/o crc32c lib:             0m0.697s
  dbgbmap.py w/  crc32c lib --no-ckdata: 0m0.269s
  dbgbmap.py w/o crc32c lib --no-ckdata: 0m0.490s
  dbgbmap.old.py:                        0m0.231s

The bulk of the runtime is still in Rbyd.fetch, but this is now
dominated by leb128 decoding, which makes sense. We do ~twice as many
fetches in the new dbgbmap.py in order to calculate the gcksum (which
we then ignore...).
2025-04-16 15:22:34 -05:00
Christopher Haster
6ea18e6579 scripts: Tweaked bd.read to behave like an actual bd_read callback
This better matches what you would expect from a function called
bd.read, at least in the context of littlefs, while also decreasing the
state (seek) we have to worry about.

Note that bd.readblock already behaved mostly like this, and is
preferred by every class except for Bptr.
2025-04-16 15:22:32 -05:00
Christopher Haster
b2911fbbe7 scripts: Removed item/iter magic methods from fs object classes
So no more __getitem__, __contains__, or __iter__ for Rbyd, Btree, Mdir,
Mtree, Lfs.File, etc.

These were way too error-prone, especially when accidental unpacking
triggered unintended disk traversal and weird error states. We didn't
even use the implicit behavior because we preferred the full name for
heavy disk operations.

The motivation for this was Python not catching this bug, which is a bit
silly:

  rid, rattr, *path_ = rbyd
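With the magic methods removed, this class of bug now fails loudly (toy
sketch, Rbyd reduced to a stub):

```python
class Rbyd:
    # no __iter__/__getitem__, so unpacking raises a TypeError
    # instead of silently traversing the disk
    pass

try:
    rid, rattr, *path_ = Rbyd()
    assert False, 'unreachable'
except TypeError:
    pass  # Python now catches the bad unpack immediately
```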
2025-04-16 15:22:28 -05:00
Christopher Haster
33120bf930 scripts: Reworked dbgbmap.py
This is a rework of dbgbmap.py to match dbgbmapd3.py, adopt the new
Rbyd/Lfs class abstractions, as well as Canvas, -k/--keep-open, etc.

Some of the main changes:

- dbgbmap.py now reports corrupt/conflict blocks, which can be useful
  for debugging.

  Note though that you will probably get false positives if running with
  -k/--keep-open while something is writing to the disk. littlefs is
  powerloss safe, not multi-write safe! Very different problem!

- dbgbmap.py now groups by blocks before mapping to the space filling
  curve. This matches dbgbmapd3.py and I think is more intuitive now
  that we have a bmap tiling algorithm.

  -%/--usage still works, but is rendered as a second space filling
  curve _inside_ the block tile. Different blocks can end up with
  slightly different sizes due to rounding, but it's not the end of the
  world.

  I wasn't originally going to keep it around, but ended up caving, so
  you can still get the original byte-level curve via -u/--contiguous.

- Like the other ascii rendering scripts, dbgbmap.py now supports
  -k/--keep-open and friends as a thin main wrapper. This just makes it
  a bit easier to watch a realtime bmap without needing to use watch.py.

- --mtree-only is supported, but filtering via --mdirs/--btrees/--data
  is _not_ supported. This was too much complexity for a minor feature,
  and doesn't cover other niche blocks like corrupted/conflict or parity
  in the future.

- Things are more customizable thanks to the Attr class. For example,
  you can now use the littlefs mount string as the title via
  --title-littlefs.

- Support for --to-scale and -t/--tiny mode, if you want to scale based
  on block_size.

One of the bigger differences dbgbmapd3.py -> dbgbmap.py is that
dbgbmap.py still supports -%/--usage. Should we backport -%/--usage to
dbgbmapd3.py? Uhhhh...

This ends up a funny example of raster graphics vs vector graphics. A
pixel-level space filling curve is easy with raster graphics, but with
an svg you'd need some sort of pixel -> path wrapping algorithm...

So no -%/--usage in dbgbmapd3.py for now.

Also just ripped out all of the -@/--blocks byte-level range stuff. Way
too complicated for what it was worth. -@/--blocks is limited to simple
block ranges now. High-level scripts should stick to high-level options.

One last thing to note is the adoption of "if '%' in label__" checks
before applying punescape. I wasn't sure if we should support punescape
in dbgbmap.py, since it's quite a bit less useful here, and may be
costly due to the lazy attr generation. Adding this simple check avoids
the cost and consistency question, so I adopted it in all scripts.
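The guard is just a cheap containment check in front of the expensive
call (the punescape stub here is a hypothetical stand-in):

```python
def punescape(s):
    # stand-in for the real %-escape expansion
    return s.replace('%%', '%')

def render_label(label):
    # skip the (potentially costly) escape pass, and the lazy attr
    # generation behind it, when no '%' can possibly appear
    if '%' in label:
        label = punescape(label)
    return label

assert render_label('plain') == 'plain'
assert render_label('100%%') == '100%'
```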
2025-04-16 15:22:24 -05:00
Christopher Haster
202636cccd scripts: Tweaked corrupt rbyd coloring to include addresses
This matches the coloring in dbglfs.py for other erroneous conditions,
and also matches how we color hidden items when shown.

Also fixed some minor bugs in grm printing.
2025-04-16 15:22:23 -05:00
Christopher Haster
e5b430cb8c scripts: Adopted -q/--quiet in most debug scripts
This can be useful when you just want to check for errors.

The only exception being dbgblock.py/dbgcat.py, since these don't really
have a concept of an error.
2025-04-16 15:22:22 -05:00
Christopher Haster
5682fd6163 scripts: dbglfs.py: Added --ckmeta/--ckdata
For more aggressive checking of filesystem state. These should match the
behavior of LFS_M_CKMETA/CKDATA in lfs.c.

Also tweaked dbgbmapd3.py (and eventually dbgmap.py) to match, though we
don't need new flags there since we're already checking every block in
the filesystem.
2025-04-16 15:22:21 -05:00
Christopher Haster
97e2786545 scripts: Synced dbgbmapd3.py Lfs class changes
- Added Lfs.traverse for full filesystem traversal
- Added Rbyd.shrub flag so we can tell if an Rbyd is a shrub
- Removed redundant leaves from paths in leaf iters
2025-04-16 15:22:19 -05:00
Christopher Haster
5f06558cbe scripts: Added dbgbmapd3.py for bmap -> svg rendering
Like codemapd3.py this includes an interactive UI for viewing the
underlying filesystem graph, including:

- mode-tree - Shows all reachable blocks from a given block
- mode-branches - Shows immediate children of a given block
- mode-references - Shows parents of a given block
- mode-redund - Shows sibling blocks in redund groups (This is
  currently just mdir pairs, but the plan is to add more)

This is _not_ a full filesystem explorer, so we don't embed all block
data/metadata in the svg. That's probably a project for another time.
However we do include interesting bits such as trunk addresses,
checksums, etc.

An example:

  # create a filesystem image
  $ make test-runner -j
  $ ./scripts/test.py -B test_files_many -a -ddisk -O- \
          -DBLOCK_SIZE=1024 \
          -DCHUNK=10 \
          -DSIZE=2050 \
          -DN=128 \
          -DBLOCK_RECYCLES=1
  ... snip ...
  done: 2/2 passed, 0/2 failed, 164pls!, in 0.16s

  # generate bmap svg
  $ ./scripts/dbgbmapd3.py disk -b1024 -otest.svg \
          -W1400 -H750 -Z --dark
  updated test.svg, littlefs v0.0 1024x1024 0x{26e,26f}.d8 w64.128,
  cksum 41ea791e

And open test.svg in a browser of your choice.

Here's what the current colors mean:

- yellow => mdirs
- blue   => btree nodes
- green  => data blocks
- red    => corrupt/conflict issue
- gray   => unused blocks

But like codemapd3.py the output is decently customizable. See -h/--help
for more info.

And, just like codemapd3.py, this is based on ideas from d3 and
brendangregg's flamegraphs:

- d3 - https://d3js.org
- brendangregg's flamegraphs - https://github.com/brendangregg/FlameGraph

Note we don't actually use d3... the name might be a bit confusing...

---

One interesting change from the previous dbgbmap.py is the addition of
"corrupt" (bad checksum) and "conflict" (multiple parents) blocks, which
can help find bugs.

You may find the "conflict" block reporting a bit strange. Yes it's
useful for finding block allocation failures, but won't naturally formed
dags in file btrees also be reported as "conflicts"?

Yes, but the long-term plan is to move away from dags and make littlefs
a pure tree (for block allocator and error correction reasons). This
hasn't been implemented yet, so for now dags will result in false
positives.

---

Implementation wise, this script was pretty straightforward given prior
dbglfs.py and codemapd3.py work.

However there was an interesting case of https://xkcd.com/1425:

- Traverse the filesystem and build a graph - easy
- Tile a rectangle with n nice looking rectangles - uhhh

I toyed around with an analytical approach (something like block width =
sqrt(canvas_width*canvas_height/n) * block_aspect_ratio), but ended up
settling on an algorithm that divides the number of columns by 2 until
we hit our target aspect ratio.

This algorithm seems to work quite well, runs in only O(log n), and
perfectly tiles the grid for powers-of-two. Honestly the result is
better than I was expecting.
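The halving idea can be sketched like this (the exact stopping rule is
my assumption from the description above):

```python
import math

def tile(n, width, height, target_aspect=1.0):
    # start with all n blocks in one row, then halve the column count
    # until the per-block aspect ratio reaches the target; only
    # O(log n) iterations, and powers-of-two tile the grid perfectly
    cols = max(n, 1)
    while cols > 1:
        rows = math.ceil(n / cols)
        if (width / cols) / (height / rows) >= target_aspect:
            break
        cols = math.ceil(cols / 2)
    return cols, math.ceil(n / cols)

# 16 blocks on a 4:1 canvas -> an 8x2 grid of square tiles
assert tile(16, 400, 100) == (8, 2)
```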
2025-04-16 15:22:17 -05:00
Christopher Haster
27370dec66 scripts: Tweaked mdir/shrub address printing
This fixes an issue where shrub trunks were never printed even with
-i/--internal.

While only showing mdir/shrub/btree/bptr addresses on block changes is
nice in theory, it results in shrub trunks never being printed because
the mdir -> shrub block doesn't change.

Checking for changes in block type as well avoids this.
2025-04-16 15:22:16 -05:00
Christopher Haster
f550fa9a80 scripts: Changed most tree renderers to be pseudo-standalone
I'm trying to avoid having classes with different implementations across
scripts, as it makes updating things error-prone, but at the same time
copying all the tree renderers to all dbg scripts would be a bit much.

Monkey-patching the TreeArt class in relevant scripts seems like a
reasonable compromise.
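The compromise looks something like this (method names hypothetical):
the copied TreeArt base stays script-agnostic, and each script bolts on
only the renderer it needs.

```python
class TreeArt:
    # shared, script-agnostic base (copied verbatim between scripts)
    def __init__(self):
        self.branches = []

# in dbgmtree.py (hypothetical): attach the mtree-specific renderer
# without forking the whole class
def _from_mtree(cls, mtree):
    art = cls()
    art.branches.extend(mtree)
    return art
TreeArt.from_mtree = classmethod(_from_mtree)

art = TreeArt.from_mtree(['mroot', 'mdir'])
assert art.branches == ['mroot', 'mdir']
```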
2025-04-16 15:22:15 -05:00
Christopher Haster
682f12a953 scripts: Moved tree renderers out into their own class
These are pretty script specific, so probably shouldn't be in the
abstract littlefs classes. This also avoids the tree renderers getting
copied into scripts that don't need them (mtree -> dbglfs.py, dbgbmap.py
in the future, etc).

This also makes TreeArt consistent with JumpArt and LifetimeArt.
2025-04-16 15:22:14 -05:00
Christopher Haster
002c2ea1e6 scripts: Tried to simplify optional path returns
So, instead of trying to be clever with python's tuple globbing, just
rely on lazy tuple unpacking and a whole bunch of if statements.

This is more verbose, but less magical. And generally, the less magic
there is, the easier things are to read.

This also drops the always-tupled lookup_ variants, which were
cluttering up the various namespaces.
2025-04-16 15:22:12 -05:00
Christopher Haster
82f4fd3c0f scripts: Dropped list/tuple distinction in Rbyd.fetch
Also tweaked how we fetch shrubs, adding Rbyd.fetchshrub and
Btree.fetchshrub instead of overloading the bd argument.

Oh, and also added --trunk to dbgmtree.py and dbglfs.py. Actually
_using_ --trunk isn't advised, since it will probably just result in a
corrupted filesystem, but these scripts are for accessing things that
aren't normally allowed anyways.

The reason for dropping the list/tuple distinction is because it was a
big ugly hack, unpythonic, and likely to catch users (and myself) by
surprise. Now, Rbyd.fetch and friends always require separate
block/trunk arguments, and the exercise of deciding which trunk to use
is left up to the caller.
2025-04-16 15:22:11 -05:00
Christopher Haster
8324786121 scripts: Reverted skipped branches in -t/--tree render
The inconsistency between inner/non-inner (-i/--inner) views was a bit
too confusing.

At least now the bptr rendering in dbglfs.py matches this behavior, showing
the bptr tag -> bptr jump even when not showing inner nodes.

If the point of these renderers is to show all jumps necessary to reach
a given piece of data, hiding bptr jumps only sometimes is somewhat
counterproductive...
2025-04-16 15:22:07 -05:00
Christopher Haster
97b6489883 scripts: Reworked dbglfs.py, adopted Lfs, Config, Gstate, etc
I'm starting to regret these reworks. They've been a big time sink. But
at least these should be much easier to extend with the future planned
auxiliary trees?

New classes:

- Bptr - A representation of littlefs's data-only block pointers.

  Extra fun is the lazily checked Bptr.__bool__ method, which should
  prevent slowing down scripts that don't actually verify checksums.

- Config - The set of littlefs config entries.

- Gstate - The set of littlefs gstate.

  I may have had too much fun with Config and Gstate. Not only do these
  provide lookup functions for config/gstate, but known config/gstate
  get lazily parsed classes that can provide easy access to the relevant
  metadata.

  These even abuse Python's __subclasses__, so all you need to do to add
  a new known config/gstate is extend the relevant Config.Config/
  Gstate.Gstate class.

  The __subclasses__ API is a weird but powerful one.

- Lfs - The big one, a high-level abstraction of littlefs itself.

  Contains subclasses for known files: Lfs.Reg, Lfs.Dir, Lfs.Stickynote,
  etc, which can be accessed by path, did+name, mid, etc. It even
  supports iterating over orphaned files, though it's expensive (but
  incredibly valuable for debugging!).

  Note that all file types can currently have attached bshrubs/btrees.
  In the existing implementation only reg files should actually end up
  with bshrubs/btrees, but the whole point of these scripts is to debug
  things that _shouldn't_ happen.

  I intentionally gave up on providing depth bounds in Lfs. Too
  complicated for something so high-level.
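The __subclasses__ trick for known config/gstate can be sketched like
this (tag values and class names hypothetical):

```python
class Gstate:
    tag = None  # subclasses set this to their known gstate tag

    def __init__(self, data):
        self.data = data

    @classmethod
    def parse(cls, tag, data):
        # __subclasses__ gives us a free registry: defining a new
        # Gstate subclass is all that's needed to teach parse about it
        for sub in cls.__subclasses__():
            if sub.tag == tag:
                return sub(data)
        return cls(data)  # unknown gstate stays raw

class Grm(Gstate):
    tag = 0x100  # hypothetical tag value

assert type(Gstate.parse(0x100, b'...')) is Grm
assert type(Gstate.parse(0x999, b'...')) is Gstate
```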

One noteworthy change is not recursing into directories by default. This
hopefully avoids overloading new users and matches the behavior of most
other Linux/Unix tools.

This adopts -r/--recurse/--file-depth for controlling how far to recurse
down directories, and -z/--depth/--tree-depth for controlling how far to
recurse down tree structures (mostly files). I like this API. It's
consistent with -z/--depth in the other dbg scripts, and -r/--recurse is
probably intuitive for most Linux/Unix users.

To make this work we did need to change -r/--raw -> -x/--raw. But --raw
is already a bit of a weird name for what really means "include a hex
dump".

Note that -z/--depth/--tree-depth does _not_ imply --files. Right now
only files can contain tree structures, but this will change when we get
around to adding the auxiliary trees.

This also adds the ability to specify a file path to use as the root
directory, though we need the leading slash to disambiguate file paths
and mroot addresses.

---

Also tagrepr has been tweaked to include the global/delta names,
toggleable with the optional global_ kwarg.

Rattr now has its own lazy parsers for did + name. A more organized
codebase would probably have a separate Name type, but it just wasn't
worth the hassle.

And the abstraction classes have all been tweaked to require the
explicit Rbyd.repr() function for a CLI-friendly representation. Relying
on __str__ hurt readability and debugging, especially since Python
prefers __str__ over __repr__ when printing things.
2025-04-16 15:22:06 -05:00
Christopher Haster
cc20610488 scripts: Skip branches in -t/--tree render, fixed color-repro issues
The main difference between -t/--tree and -R/--tree-rbyd is that only
the latter shows all internal jumps (unconditional alt->alt), so it
makes sense to also hide internal branches (rbyd->rbyd).

Note that we already hide the rbyd->block branches in dbglfs.py.

Also added color-ignoring comparison operators to our internal
TreeBranch struct. This fixes an issue where our non-inner branch
merging logic could end up with identical branches with different
colors, resulting in different colorings per run. Not the end of the
world, but something we want to avoid.
2025-04-16 15:22:05 -05:00
Christopher Haster
45d9ea1f62 scripts: dbgmtree.py: Tightened dynamic mrid width
This requires an additional traversal of the mtree just to precalculate
the mrid width (mbits provides an upper-bound, but the actual number of
mrids in any given mdir may be much less), but it makes the output look
nicer.
2025-04-16 15:22:04 -05:00
Christopher Haster
582f92d073 scripts: Reworked dbgmtree.py, adopted Mtree, Mdir, etc
This is where the high-level structure of littlefs starts to reveal
itself.

This is also where a lot of really annoying Mtree vs Btree API questions
come to a head, like should Mtree.lookup return an Mdir or an Rattr?
What about Btree.lookup? What gets included in the returned path in all
of these? Well, at least this is an interesting exercise in rethinking
littlefs's internal APIs...

New classes:

- Mid - A representation of littlefs's metadata ids. I've just gone
  ahead and included the block_size-dependent mbits as a field in every
  Mid instance to try to make Mid operations easier.

  It's not like we care about one extra word of storage in Python.

- Mdir - Again, we intentionally _don't_ inherit Rbyd to try to reduce
  type errors, though Mdirs really are just Rbyds in this design.

- Mtree - The skeleton of littlefs. Tricky bits include traversing the
  mroot chain and handling mroot-inlined mdirs. Note mroots are included
  in the mdir/mid iteration methods.

Getting the tree renderers all working again was a real pain in the ass.
2025-04-16 15:22:03 -05:00
Christopher Haster
73127470f9 scripts: Adopted rbydaddr/tagrepr changes across scripts
Just some minor tweaks:

- rbydaddr: Return list instead of tuple, note we rely on the type
  distinction in Rbyd.fetch now.

- tagrepr: Rename w -> weight.
2025-04-16 15:21:59 -05:00
Christopher Haster
68f0534dd0 rbyd: Dropped special altn/alta encoding
altas, and to a lesser extent altns, are just too problematic for our
rbyd-append algorithm.

Main issue is these break our "narrowing" invariant, where each alt only
ever decreases the bounds.

I wanted to use altas to simplify lfsr_rbyd_appendcompaction, but
decided it wasn't worth it. Handling them correctly would require adding
a number of special cases to lfsr_rbyd_appendrat, adding complexity to
an already incredibly complex function.

---

Fortunately, we don't really need altns/altas on-disk, but we _do_ need
a way to mark alts as unreachable internally in order to know when we
can collapse alts when recoloring (at this point bounds information is
lost).

I was originally going to use the alt's sign bit for this, but it turns
out we already have this information thanks to setting jump=0 to assert
that an alt is unreachable. So no explicit flag needed!

This ends up saving a surprising amount of code for what is only a
couple lines of changes:

           code          stack          ctx
  before: 38512           2624          640
  after:  38440 (-0.2%)   2624 (+0.0%)  640 (+0.0%)
2025-02-08 14:53:47 -06:00
Christopher Haster
1c5adf71b3 Implemented self-validating global-checksums (gcksums)
This was quite a puzzle.

The problem: How do we detect corrupt mdirs?

Seems like a simple question, but we can't just rely on mdir cksums. Our
mdirs are independently updateable logs, and logs have this annoying
tendency to "rollback" to previously valid states when corrupted.

Rollback issues aren't littlefs-specific, but what _is_ littlefs-
specific is that when one mdir rolls back, it can disagree with other
mdirs, resulting in wildly incorrect filesystem state.

To solve this, or at least protect against disagreeable mdirs, we need
to somehow include the state of all other mdirs in each mdir commit.

---

The first thought: Why not use gstate?

We already have a system for storing distributed state. If we add the
xor of all of our mdir cksums, we can rebuild it during mount and verify
that nothing changed:

   .--------.   .--------.   .--------.   .--------.
  .| mdir 0 |  .| mdir 1 |  .| mdir 2 |  .| mdir 3 |
  ||        |  ||        |  ||        |  ||        |
  || gdelta |  || gdelta |  || gdelta |  || gdelta |
  |'-----|--'  |'-----|--'  |'-----|--'  |'-----|--'
  '------|-'   '------|-'   '------|-'   '------|-'
  '--.------'  '--.------'  '--.------'  '--.------'
   cksum |      cksum |      cksum |      cksum |
     |   |        v   |        v   |        v   |
     '---------> xor -------> xor -------> xor -------> gcksum
         |            v            v            v         =?
         '---------> xor -------> xor -------> xor ---> gcksum

Unfortunately it's not that easy. Consider what this looks like
mathematically (g is our gcksum, c_i is an mdir cksum, d_i is a
gcksumdelta, and +/-/sum is xor):

  g = sum(c_i) = sum(d_i)

If we solve for a new gcksumdelta, d_i:

  d_i = g' - g
  d_i = g + c_i - g
  d_i = c_i

The gcksum cancels itself out! We're left with an equation that depends
only on the current mdir, which doesn't help us at all.
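This cancellation is easy to demonstrate with plain xor (toy cksums):

```python
from functools import reduce
from operator import xor

cksums = [0x11, 0x22, 0x33, 0x44]   # c_i, one cksum per mdir
g = reduce(xor, cksums)             # g = sum(c_i)

# commit to mdir 0: its cksum changes c_0 -> c_0'
c0_new = 0x55
g_new = g ^ cksums[0] ^ c0_new

# the delta we'd have to store depends only on mdir 0 -- no other
# mdir's state gets mixed in, so a rollback of mdir 0 goes undetected
d0 = g_new ^ g
assert d0 == cksums[0] ^ c0_new
```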

Next thought: What if we permute the gcksum with a function t before
distributing it over our gcksumdeltas?

   .--------.   .--------.   .--------.   .--------.
  .| mdir 0 |  .| mdir 1 |  .| mdir 2 |  .| mdir 3 |
  ||        |  ||        |  ||        |  ||        |
  || gdelta |  || gdelta |  || gdelta |  || gdelta |
  |'-----|--'  |'-----|--'  |'-----|--'  |'-----|--'
  '------|-'   '------|-'   '------|-'   '------|-'
  '--.------'  '--.------'  '--.------'  '--.------'
   cksum |      cksum |      cksum |      cksum |
     |   |        v   |        v   |        v   |
     '---------> xor -------> xor -------> xor -------> gcksum
         |            |            |            |   .--t--'
         |            |            |            |   '-> t(gcksum)
         |            v            v            v          =?
         '---------> xor -------> xor -------> xor ---> t(gcksum)

In math terms:

  t(g) = t(sum(c_i)) = sum(d_i)

In order for this to work, t needs to be non-linear. If t is linear, the
same thing happens:

  d_i = t(g') - t(g)
  d_i = t(g + c_i) - t(g)
  d_i = t(g) + t(c_i) - t(g)
  d_i = t(c_i)

This was quite funny/frustrating (funnistrating?) during development,
because it means a lot of seemingly obvious functions don't work!

- t(g) = g              - Doesn't work
- t(g) = crc32c(g)      - Doesn't work because crc32cs are linear
- t(g) = g^2 in GF(2^n) - g^2 is linear in GF(2^n)!?

Fortunately, powers coprime with 2 finally give us a non-linear function
in GF(2^n), so t(g) = g^3 works:

  d_i = g'^3 - g^3
  d_i = (g + c_i)^3 - g^3
  d_i = (g^2 + gc_i + gc_i + c_i^2)(g + c_i) - g^3
  d_i = (g^2 + c_i^2)(g + c_i) - g^3
  d_i = g^3 + gc_i^2 + g^2c_i + c_i^3 - g^3
  d_i = gc_i^2 + g^2c_i + c_i^3

---

Bleh, now we need to implement finite-field operations? Well, not
entirely!

Note that our algorithm never uses division. This means we don't need a
full finite-field (+, -, *, /), but can get away with a finite-ring (+,
-, *). And conveniently for us, our crc32c polynomial defines a ring
epimorphic to a 31-bit finite-field.

All we need to do is define crc32c multiplication as polynomial
multiplication mod our crc32c polynomial:

  crc32cmul(a, b) = pmod(pmul(a, b), P)

And since crc32c is more-or-less just pmod(x, P), this lets us take
advantage of any crc32c hardware/tables that may be available.
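A sketch of carry-less multiplication mod the (non-reflected) crc32c
polynomial, plus a check that squaring really is linear while cubing is
not:

```python
CRC32C_P = 0x11edc6f41  # Castagnoli polynomial, 33 bits

def pmul(a, b):
    # carry-less (GF(2) polynomial) multiplication
    r = 0
    while b:
        if b & 1:
            r ^= a
        a <<= 1
        b >>= 1
    return r

def pmod(a, p):
    # polynomial remainder
    while a.bit_length() >= p.bit_length():
        a ^= p << (a.bit_length() - p.bit_length())
    return a

def crc32cmul(a, b):
    return pmod(pmul(a, b), CRC32C_P)

sq = lambda g: crc32cmul(g, g)
cube = lambda g: crc32cmul(sq(g), g)

# squaring is linear in GF(2^n) (the Frobenius endomorphism)...
assert sq(5 ^ 9) == sq(5) ^ sq(9)
# ...but g^3 is not, which is exactly why t(g) = g^3 works
assert cube(5 ^ 9) != cube(5) ^ cube(9)
```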

---

Bunch of notes:

- Our 2^n-bit crc-ring maps to a 2^n-1-bit finite-field because our crc
  polynomial is defined as P(x) = Q(x)(x + 1), where Q(x) is a 2^n-1-bit
  irreducible polynomial.

  This is a common crc construction as it provides optimal odd-bit/2-bit
  error detection, so it shouldn't be too difficult to adapt to other
  crc sizes.

- t(g) = g^3 is not the only function that works, but it turns out to be
  a pretty good one:

  - 3 and 2^(2^n-1)-1 are coprime, which means our function t(g) = g^3
    provides a one-to-one mapping in the underlying fields of all crc
    rings of size 2^(2^n).

    We know 3 and 2^(2^n-1)-1 are coprime because 2^(2^n-1)-1 =
    2^(2^n)-1 (a Fermat number) - 2^(2^n-1) (a power-of-2), and 3
    divides Fermat numbers >=3 (A023394) and is not 2.

  - Our delta, when viewed as a polynomial in g: d(g) = gc^2 + g^2c +
    c^3, has degree 2, which implies there are at most 2 solutions or
    1-bit of information loss in the underlying field.

    This is optimal since the original definition already had 2
    solutions before we even chose a function:

      d(g) = t(g + c) - t(g)
      d(g) = t(g + c) - t((g + c) - c)
      d(g) = t((g + c) + c) - t(g + c)
      d(g) = d(g + c)

  Though note the mapping of our crc-ring to the underlying field
  already represents 1-bit of information loss.

- If you're using a cryptographic hash or other non-crc, you should
  probably just use an equal sized finite-field.

  Though note changing from a 2^n-1-bit field to a 2^n-bit field does
  change the math a bit, with t(g) = g^7 being a better non-linear
  function:

  - 7 is the smallest odd-number coprime with 2^n-1, a Fermat number,
    which makes t(g) = g^7 a one-to-one mapping.

    3 humorously divides all 2^n-1 Fermat numbers.

  - Expanding delta with t(g) = g^7 gives us a 6 degree polynomial,
    which implies at most 6 solutions or ~3-bits of information loss.

    This isn't actually the best you can do, some exhaustive searching
    over small fields (<=2^16) suggests t(g) = g^(2^(n-1)-1) _might_ be
    optimal, but that's a heck of a lot more multiplications.

- Because our crc32cs preserve parity/are epimorphic to parity bits,
  addition (xor) and multiplication (crc32cmul) also preserve parity,
  which can be used to show our entire gcksum system preserves parity.

  This is quite neat, and means we are guaranteed to detect any odd
  number of bit-errors across the entire filesystem.

- Another idea was to use two different addition operations: xor and
  overflowing addition (or mod a prime).

  This probably would have worked, but lacks the rigor of the above
  solution.

- You might think an RS-like construction would help here, where g =
  sum(c_ia^i), but this suffers from the same problem:

    d_i = g' - g
    d_i = g + c_ia^i - g
    d_i = c_ia^i

  Nothing here depends on anything outside of the current mdir.

- Another question is should we be using an RS-like construction anyways
  to include location information in our gcksum?

  Maybe in another system, but I don't think it's necessary in littlefs.

  While our mdirs are independently updateable, they aren't _entirely_
  independent. The location of each mdir is stored in either the mtree
  or a parent mdir, so it always gets mixed into the gcksum somewhere.

  The only exception being the mrootanchor which is always at the fixed
  blocks 0x{0,1}.

- This does _not_ catch "global-rollback" issues, where the most recent
  commit in the entire filesystem is corrupted, revealing an older, but
  still valid, filesystem state.

  But as far as I am aware this is just a fundamental limitation of
  powerloss-resilient filesystems, short of doing destructive
  operations.

  At the very least, exposing the gcksum would allow the user to store
  it externally and prevent this issue.

---

Implementation details:

- Our gcksumdelta depends on the rbyd's cksum, so there's a catch-22 if
  we include it in the rbyd itself.

  We can avoid this by including it in the commit tags (actually the
  separate canonical cksum makes this easier than it would have been
  earlier), but this does mean LFSR_TAG_GCKSUMDELTA is not an
  LFSR_TAG_GDELTA subtype. Unfortunate but not a dealbreaker.

- Reading/writing the gcksumdelta gets a bit annoying with it not being
  in the rbyd. For now I've extended the low-level lfsr_rbyd_fetch_/
  lfsr_rbyd_appendcksum_ to accept an optional gcksumdelta pointer,
  which is a bit awkward, but I don't know of a better solution.

- Unlike the grm, _every_ mdir commit involves the gcksum, which means
  we either need to propagate the gcksumdelta up the mroot chain
  correctly, or somehow keep track of partially flushed gcksumdeltas.

  To make this work I modified the low-level lfsr_mdir_commit__
  functions to accept start_rid=-2 to indicate when gcksumdeltas should
  be flushed.

  It's a bit of a hack, but I think it might make sense to extend this
  to all gdeltas eventually.

The gcksum cost both code and RAM, but I think it's well worth it for
removing an entire category of filesystem corruption:

           code          stack          ctx
  before: 37796           2608          620
  after:  38428 (+1.7%)   2640 (+1.2%)  644 (+3.9%)
2025-02-08 14:53:30 -06:00
Christopher Haster
b6ab323eb1 Dropped the q-bit (previous-perturb) from cksum tags
Now that we perturb commit cksums with the odd-parity zero, the q-bit no
longer serves a purpose other than extra debug info. But this is a
double-edged sword, because redundant info just means another thing that
can go wrong.

For example, should we assert? If the q-bit doesn't reflect the
previous-perturb state it's a bug, but the only thing that would break
would be the q-bit itself. And if we don't assert what's the point of
keeping the q-bit around?

Dropping the q-bit avoids answering this question and saves a bit of
code:

           code          stack          ctx
  before: 37772           2608          620
  after:  37768 (-0.0%)   2608 (+0.0%)  620 (+0.0%)
2025-01-28 14:41:45 -06:00
Christopher Haster
66bf005bb8 Renamed LFSR_TAG_ORPHAN -> LFSR_TAG_STICKYNOTE
I've been unhappy with LFSR_TAG_ORPHAN for a while now. While it's true
these represent orphaned files, they also represent zombied files. And
as long as a reference to the file exists in-RAM, I find it hard to say
these files are truly "orphaned".

We're also just using the term "orphan" for too many things.

Really this tag just represents an mid reservation. The term stickynote
works well enough for this, and fits in with the other internal tag,
LFSR_TAG_BOOKMARK.
2025-01-28 14:41:45 -06:00
Christopher Haster
62cc4dbb14 scripts: Disabled local import hack on import
Moved local import hack behind if __name__ == "__main__"

These scripts aren't really intended to be used as python libraries.
Still, it's useful to import them for debugging and to get access to
their juicy internals.
2025-01-28 14:41:30 -06:00
Christopher Haster
7cfcc1af1d scripts: Renamed summary.py -> csv.py
This seems like a more fitting name now that this script has evolved
into more of a general purpose high-level CSV tool.

Unfortunately this does conflict with the standard csv module in Python,
breaking every script that imports csv (which is most of them).
Fortunately, Python is flexible enough to let us remove the current
directory before imports with a bit of an ugly hack:

  # prevent local imports
  __import__('sys').path.pop(0)

These scripts are intended to be standalone anyways, so this is probably
a good pattern to adopt.
2024-11-09 12:31:16 -06:00
Christopher Haster
a0ab7bda26 scripts: Avoid rereading shrub blocks
This extends Rbyd.fetch to accept another rbyd, in which case we inherit
the RAM-backed block without rereading it from disk. This avoids an
issue where shrubs can become corrupted if the disk is being
simultaneously written and debugged.

Normally we can detect the checksum mismatch and toss out the rbyd
during fetch, but shrub pointers don't include a checksum since they
assume the containing rbyd has already been checksummed.

It's interesting to note this even avoids the memory copy thanks to
Python's reference counting.
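A minimal sketch of the idea (class and argument names are illustrative, not the actual script internals):

```python
import io

class Rbyd:
    def __init__(self, block, data, trunk=0):
        self.block = block
        self.data = data    # RAM-backed copy of the block
        self.trunk = trunk

    @classmethod
    def fetch(cls, f, block_size, block, trunk=0, rbyd=None):
        # if we were given a containing rbyd, inherit its RAM-backed
        # block instead of rereading possibly-changed data from disk;
        # Python's reference counting means this doesn't even copy
        if rbyd is not None and rbyd.block == block:
            return cls(block, rbyd.data, trunk)
        f.seek(block * block_size)
        return cls(block, f.read(block_size), trunk)

disk = io.BytesIO(bytes(256))
mdir = Rbyd.fetch(disk, 128, 1)
shrub = Rbyd.fetch(disk, 128, 1, trunk=42, rbyd=mdir)
assert shrub.data is mdir.data  # same object, no reread, no copy
```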
2024-11-08 02:24:56 -06:00
Christopher Haster
0260f0bcee scripts: Added better branch cksum checks
If we're fetching branches anyways, we might as well check that the
checksums match. This helps protect against infinite loops in B-tree
branches.

Also fixed an issue where we weren't xoring perturb state on finding an
explicit trunk.

Note this is equivalent to LFS_M_CKFETCHES in lfs.c.

---

This doesn't mean we always need LFS_M_CKFETCHES. Our dbg scripts just
need to be a little bit tougher because 1. running tests with -j creates
wildly corrupted and entangled littlefs images, and 2. Rbyd.fetch is
almost too forgiving in choosing the nearest trunk.
2024-11-08 02:20:19 -06:00
Christopher Haster
e3fdc3dbd7 scripts: Added simple mroot cycle detectors to dbg scripts
These work by keeping a set of all seen mroots as we descend down the
mroot chain. Simple, but it works.

The downside of this approach is that the mroot set grows unbounded, but
it's unlikely we'll ever have enough mroots in a system for this to
really matter.

This fixes scripts like dbgbmap.py getting stuck on intentional mroot
cycles created for testing. It's not a problem for a foreground script
to get stuck in an infinite loop, since you can just kill it, but a
background script getting stuck at 100% CPU is a bit more annoying.
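A minimal sketch of the detector (the chain is modeled as a dict here; real mroots are block pairs on disk):

```python
def follow_mroots(next_mroot, root):
    # walk the mroot chain, keeping a set of all seen mroots; if we
    # ever revisit one, the chain contains a cycle and we bail instead
    # of spinning forever
    seen = set()
    chain = []
    mroot = root
    while mroot is not None:
        if mroot in seen:
            raise RuntimeError('mroot cycle detected: %r' % (mroot,))
        seen.add(mroot)
        chain.append(mroot)
        mroot = next_mroot.get(mroot)
    return chain

# a well-formed chain terminates
assert follow_mroots({(0, 1): (2, 3), (2, 3): None}, (0, 1)) \
        == [(0, 1), (2, 3)]
```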
2024-11-07 11:46:39 -06:00
Christopher Haster
007ac97bec scripts: Adopted double-indent on multiline expressions
This matches the style used in C, which is good for consistency:

  a_really_long_function_name(
          double_indent_after_first_newline(
              single_indent_nested_newlines))

We were already doing this for multiline control-flow statements, simply
because I'm not sure how else you could indent this without making
things really confusing:

  if a_really_long_function_name(
          double_indent_after_first_newline(
              single_indent_nested_newlines)):
      do_the_thing()

This was the only real difference style-wise between the Python code and
C code, so now both should be following roughly the same style (80 cols,
double-indent multiline exprs, prefix multiline binary ops, etc).
2024-11-06 15:31:17 -06:00
Christopher Haster
48c2e7784b scripts: Renamed import math alias m -> mt
Mainly to avoid conflicts with match results m, this frees up the single
letter variables m for other purposes.

Choosing a two letter alias was surprisingly difficult, but mt is nice
in that it somewhat matches it (for itertools) and ft (for functools).
2024-11-05 01:58:40 -06:00
Christopher Haster
4d8bfeae71 attrs: Reduced UATTR/SATTR range down to 7-bits
It would be nice to have a full 8-bit range for both user attrs and
system attrs, for both backwards compatibility and maximizing the
available attr space, but I think it just doesn't make sense from an API
perspective.

Sure we could finagle the user/sys bit into a flags argument, or provide
separate lfsr_getuattr/getsattr functions, but asking users to use a
9-bit int for higher-level operations (dynamic attrs, iteration, etc) is
a bit much...

So this reduces the two attr ranges down to 7-bits, requiring 8-bits
total to store all possible attr types in the current system:

  TAG_ATTR      0x0400  v--- -1-a -aaa aaaa
  TAG_UATTR     0x04aa  v--- -1-- -aaa aaaa
  TAG_SATTR     0x05aa  v--- -1-1 -aaa aaaa

This really just affects scripts, since we haven't actually implemented
attributes yet.

Worst case we still have the 9-bit encoding space carved out, so we can
always add an additional set of attrs in the future if we start running
into attr pressure.

Or, you know, just turn on the subtype leb128 encoding the 8th subtype
bit is reserved for. Then you'd only be limited by internal driver
details, probably 24-bits per attr range if we make tags 32-bits
internally. Though this would probably come with quite a code cost...
2024-08-22 00:59:09 -05:00
Christopher Haster
c00e0b2af6 Fixed explicit trunks messing with canonical checksums
Updating the canonical checksum should only depend on if the tag is a
trunkish tag (not a checksum tag), and not if the tag is in the current
trunk. The trunk parameter to lfsr_rbyd_fetch should have no effect on
the canonical checksum.

Fixed in both lfsr_rbyd_fetch and scripts.

Curiously no code changes:

           code          stack
  before: 36416           2616
  after:  36416 (+0.0%)   2616 (+0.0%)
2024-08-20 12:03:48 -05:00
Christopher Haster
1044c9d2b7 Adopted odd-parity-zero rbyd perturb scheme
I've been scratching my head over our rbyd perturb scheme. It's gotten
rather clunky with needing to xor valid bits and whatnot.

But it's tricky with needing erased-state to be included in parity bits,
while at the same time excluded from our canonical checksum. If only
there was some way to flip the checksums parity without changing its
value...

Enter the crc32c odd-parity zero: 0xfca42daf!

This bends the definition of zero a bit, but it is one of two numbers in
our crc32c-ring with a very interesting property:

  crc32c(m) == crc32c(m xor 0xfca42daf) xor 0xfca42daf  // odd-p zero
  crc32c(m) == crc32c(m xor 0x00000000) xor 0x00000000  // even-p zero

Recall that crc32c's polynomial, 0x11edc6f41, is composed of two
polynomials: 0x3, the parity polynomial, and 0xf5b4253f, a maximally
sized irreducible polynomial. Because our polynomial breaks down into
two smaller polynomials, our crc32c space turns out to not be a field,
but rather a ring containing two smaller sub-fields. Because these
sub-fields are defined by their polynomials, one is the 31-bit crc
defined by the polynomial 0xf5b4253f, while the other is the current
parity.

We can move in the parity sub-field without changing our position in the
31-bit crc sub-field by xoring with a number that is one in the parity
sub-field, but zero in the 31-bit crc sub-field.

This number happens to be 0xf5b4253f (0xfca42daf bit-reversed)!

(crcs being bit-reversed will never not be annoying)

So long story short, xoring any crc32c with 0xfca42daf will change its
parity but not its value.
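This property is easy to check with a plain bitwise crc32c (a minimal sketch; the raw update below omits the usual init/xorout so the math stays linear):

```python
def crc32c(data, crc=0):
    # bitwise reflected crc32c (polynomial 0x11edc6f41), no init/xorout
    for b in data:
        crc ^= b
        for _ in range(8):
            crc = (crc >> 1) ^ (0x82f63b78 if crc & 1 else 0)
    return crc

ODD_PARITY_ZERO = 0xfca42daf

# xoring with the odd-parity zero flips the crc's parity...
cksum = crc32c(b'littlefs')
assert bin(cksum ^ ODD_PARITY_ZERO).count('1') % 2 \
        != bin(cksum).count('1') % 2

# ...but not its "value": perturb, keep checksumming, then xor back
# out, and we end up with the same canonical cksum
more = b'more tags'
assert crc32c(more, cksum ^ ODD_PARITY_ZERO) ^ ODD_PARITY_ZERO \
        == crc32c(more, cksum)
```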

---

And that's basically our new perturb scheme. If we need to perturb, xor
with 0xfca42daf to change the parity, and after calculating/validating
the checksum, xor with 0xfca42daf to get our canonical checksum.

Isn't that neat!

There was one small hiccup: At first I assumed you could continue
including the valid bits in the checksum, which would have been nice for
bulk checksumming. But this doesn't work because while valid bits cancel
out so the parity doesn't change, changing valid bits _does_ change the
underlying 31-bit crc, poisoning our checksum and making everything a
mess.

So we still need to mask out valid bits, which is a bit annoying.

But then I stumbled on the funny realization that by masking our valid
bits, we accidentally end up with a fully functional parity scheme.
Because valid bits _don't_ include the previous valid bit, we can figure
out the parity for not only the entire commit, but also each individual
tag:

  80 03 00 08 6c 69 74 74 6c 65 66 73 80
  ^'----------------.---------------' ^
  |                 |                 |
  v       +       parity      =       v'

Or more simply:

  80 03 00 08 6c 69 74 74 6c 65 66 73 80
  '----------------.----------------' ^
                   |                  |
                 parity       =       v'

Double neat!

Some other notes:

- By keeping the commit checksum perturbed, but not the canonical
  checksum, the perturb state is self-validating. We no longer need to
  explicitly check the previous-perturb-bit (q) to avoid the perturb
  hole we ran into previously.

  I'm still keeping the previous-perturb-bit (q) around, since it's
  useful for debugging. We still need to know the perturb state
  internally at all times in order to xor out the canonical checksum
  correctly anyways.

- Thanks to all of our perturb iterations, we now know how to remove the
  valid bits from the checksum easily:

    cksum ^= 0x00000080 & (tag >> 8)

  This makes the whole omitting-valid-bits thing less of a pain point.

- It wasn't actually worth it to perturb the checksum when building
  commits, vs manually flipping each valid bit, as this would have made
  our internal appendattr API really weird.

  At least the perturbed checksum made fetch a bit simpler.

Not sure exactly how to draw this with our perturb scheme diagrams,
maybe something like this?

  .---+---+---+---. \   \   \   \
  |v|    tag      | |   |   |   |
  +---+---+---+---+ |   |   |   |
  |     commit    | |   |   |   |
  |               | +-. |   |   |
  +---+---+---+---+ / | |   |   |
  |v|qp-------------->p>p-->p   .
  +---+---+---+---+   | .   .   .
  |     cksum     |   | .   .   .
  +---+---+---+---+   | .   .   .
  |    padding    |   | .   .   .
  |               |   | .   .   .
  +---+---+---+---+   | |   |   |
  |v------------------' |   |   |
  +---+---+---+---+     |   |   |
  |     commit    |     +-. |   +- rbyd
  |               |     | | |   |  cksum
  +---+---+---+---+     / | +-. /
  |v----------------------' | |
  +-------+---+---+         / |
  |     cksum ----------------'
  +---+---+---+---+
  |    padding    |
  |               |
  +---+---+---+---+
  |     erased    |
  |               |
  .               .
  .               .

---

Code changes were minimal, saving a tiny bit of code:

           code          stack
  before: 36368           2664
  after:  36352 (-0.0%)   2672 (+0.3%)

There was a stack bump in lfsr_bd_readtag, but as far as I can tell it's
just compiler noise? I poked around a bit but couldn't figure out why it
changed...
2024-08-16 01:03:43 -05:00
Christopher Haster
fb73f78c91 Updated comments to prefer "canonical checksum" for rbyd checksums
I think this describes the goal of the non-perturbed rbyd checksums
decently. At the very least it's less wrong that "data checksum", and
calling it the "metadata checksum" would just be confusing. (Would our
commit checksum be the "metametadata checksum" then?)
2024-07-31 12:29:13 -05:00
Christopher Haster
c739e18f6f Renamed LFSR_TAG_NOISE -> LFSR_TAG_NOTE
Sort of like SHT_NOTE in elf files, but with no defined format. Using
LFSR_TAG_NOTE for additional noise/nonces is still encouraged, but it
can also be used to add debug info.
2024-06-20 13:04:20 -05:00
Christopher Haster
ae0e3348fe Added -l/--list to dbgtag.py
Inspired by errno's/dbgerr.py's -l/--list, this gives a quick and easy
list of the current tag encodings, which can be very useful:

  $ ./scripts/dbgtag.py -l
  LFSR_TAG_NULL       0x0000  v--- ---- ---- ----
  LFSR_TAG_CONFIG     0x00tt  v--- ---- -ttt tttt
  LFSR_TAG_MAGIC      0x0003  v--- ---- ---- --11
  LFSR_TAG_VERSION    0x0004  v--- ---- ---- -1--
  ... snip ...

We already need to keep dbgtag.py in-sync or risk a bad debugging
experience, so we might as well let it tell us all the information it
currently knows.

Also yay for self-inspecting code, I don't know if it's bad that I'm
becoming a fan of parsing information out of comments...
2024-06-20 13:02:08 -05:00
Christopher Haster
898f916778 Fixed pl hole in perturb logic
Turns out there's very _very_ small powerloss hole in our current
perturb logic.

We rely on tag valid bits to validate perturb bits, but these
intentionally don't end up in the commit checksum. This means there will
always be a powerloss hole when we write the last valid bit. If we lose
power after writing that bit, suddenly the remaining commit and any
following commits may appear as valid.

Now, this is really unlikely considering we need to lose power exactly
when we write the cksum tag's valid bit, and our nonce helps protect
against this. But a hole is a hole.

The solution here is to include the _current_ perturb bit (q) in the
commit's cksum tag, alongside the _next_ perturb bit (p). This will be
included in the commit's checksum, but _not_ in the canonical checksum,
allowing the commit's checksum to validate the current perturb state
without ruining our erased-state agnostic checksums:

  .---+---+---+---. . . .---+---+---+---. \   \   \   \
  |v|    tag      |     |v|    tag      | |   |   |   |
  +---+---+---+---+     +---+---+---+---+ |   |   |   |
  |     commit    |     |     commit    | |   |   |   |
  |               |     |               | +-. |   |   |
  +---+---+---+---+     +---+---+---+---+ / | |   |   |
  |v|qp-------------.   |v|qp| tag      |   | .   .   .
  +---+---+---+---+ |   +---+---+---+---+   | .   .   .
  |     cksum     | |   |     cksum     |   | .   .   .
  +---+---+---+---+ |   +---+---+---+---+   | .   .   .
  |    padding    | |   |    padding    |   | .   .   .
  |               | |   |               |   | .   .   .
  +---+---+---+---+ | . +---+---+---+---+   | |   |   |
  |     erased    | +-> |v------------------' |   |   |
  |               | |   +---+---+---+---+     |   |   |
  .               . |   |     commit    |     +-. |   +- rbyd
  .               . |   |.----------------.   | | |   |  cksum
                    |   +| -+---+---+---+ |   / | +-. /
                    +-> |v|qp| tag      | '-----' | |
                    |   +- ^ ---+---+---+         / |
                    '------'  cksum ----------------'
                        +---+---+---+---+
                        |    padding    |
                        |               |
                        +---+---+---+---+
                        |     erased    |
                        |               |
                        .               .
                        .               .

(Ok maybe this diagram needs work...)

This adds another thing that needs to be checked during rbyd fetch, and
note, we _do_ need to explicitly check this, but it solves the problem.
If power is lost after v, q would be invalid, and if power is lost after
q, our cksum would be invalid.

Note this would have also been an issue for the previous cksum + parity
perturb scheme.

Code changes:

           code          stack
  before: 33570           2592
  after:  33598 (+0.1%)   2592 (+0.0%)
2024-06-07 19:41:47 -05:00
Christopher Haster
8a4f6fcf68 Adopted a simpler rbyd perturb scheme
The previous cksum + parity scheme worked, but needing to calculate both
cksum + parity on slightly different sets of metadata felt overly
complicated. After taking a step back, I've realized the problem is that
we're trying to force perturb effects to be implicit via the parity. If we
instead actually implement perturb effects explicitly, things get quite
a bit simpler...

This does add a bit more logic to the read path, but I don't think it's
worse than the mess we needed to parse separate cksum + parity.

Now, the perturb bit has the explicit behavior of inverting all tag
valid bits in the following commit. Which is conveniently the same as
xoring the crc32c with 00000080 before parsing each tag:

  .---+---+---+---. . . .---+---+---+---. \   \   \   \
  |v|    tag      |     |v|    tag      | |   |   |   |
  +---+---+---+---+     +---+---+---+---+ |   |   |   |
  |     commit    |     |     commit    | |   |   |   |
  |               |     |               | +-. |   |   |
  +---+---+---+---+     +---+---+---+---+ / | |   |   |
  |v|p--------------.   |v|p|  tag      |   | .   .   .
  +---+---+---+---+ |   +---+---+---+---+   | .   .   .
  |     cksum     | |   |     cksum     |   | .   .   .
  +---+---+---+---+ |   +---+---+---+---+   | .   .   .
  |    padding    | |   |    padding    |   | .   .   .
  |               | |   |               |   | .   .   .
  +---+---+---+---+ | . +---+---+---+---+   | |   |   |
  |     erased    | +-> |v------------------' |   |   |
  |               | |   +---+---+---+---+     |   |   |
  .               . |   |     commit    |     +-. |   +- rbyd
  .               . |   |               |     | | |   |  cksum
                    |   +---+---+---+---+     / | +-. /
                    '-> |v----------------------' | |
                        +---+---+---+---+         / |
                        |     cksum ----------------'
                        +---+---+---+---+
                        |    padding    |
                        |               |
                        +---+---+---+---+
                        |     erased    |
                        |               |
                        .               .
                        .               .

With this scheme, we don't need to calculate a separate parity, because
each valid bit effectively validates the current state of the perturb
bit.

We also don't need extra logic to omit valid bits from the cksum,
because flipping all valid bits effectively makes perturb=0 the
canonical metadata encoding and cksum.
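The valid-bit/crc equivalence above can be sketched with a bitwise crc32c (raw update, no init/xorout; the tag bytes here are made up):

```python
def crc32c(data, crc=0):
    # bitwise reflected crc32c, no init/xorout
    for b in data:
        crc ^= b
        for _ in range(8):
            crc = (crc >> 1) ^ (0x82f63b78 if crc & 1 else 0)
    return crc

tag = bytes([0x30, 0x01, 0x00, 0x04])         # hypothetical tag, valid bit 0
flipped = bytes([tag[0] ^ 0x80]) + tag[1:]    # same tag, valid bit inverted

# inverting a tag's valid bit == xoring the running crc32c with
# 0x00000080 before parsing the tag
cksum = crc32c(b'previous commit')
assert crc32c(flipped, cksum) == crc32c(tag, cksum ^ 0x00000080)

# and the valid bit a reader expects is just the parity of everything
# checksummed so far (raw crc32c preserves message parity)
assert bin(cksum).count('1') % 2 \
        == sum(bin(b).count('1') for b in b'previous commit') % 2
```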

---

I also considered only inverting the first valid bit, which would have
the additional benefit of allowing entire commits to be crc32ced at
once, but since we don't actually track when we've started a commit
this turned out to be quite a bit more complicated than I thought.

We need some way to validate the first valid bit, otherwise it could be
flipped by a failed prog and we'd never notice. This is fine, we can
store a copy of the previous perturb bit in the next cksum tag, but it
does mean we need to track the perturb bit for the duration of the
commit. So we'd end up needing to track both start-of-commit and the
perturb bit state, which starts getting difficult to fit into our rbyd
struct...

It's easier and simpler to just flip every valid bit. As a plus this
means every valid bit contributes to validating the perturb bit.

---

Also renamed LFSR_TAG_PERTURB -> LFSR_TAG_NOISE just to avoid confusion.
Though not sure if this tag should stick around...

The end result is a nice bit of code/stack savings, which is what we'd
expect with a simpler scheme:

           code          stack
  before: 33746           2600
  after:  33570 (-0.5%)   2592 (-0.3%)
2024-06-07 18:24:13 -05:00
Christopher Haster
bb3ef46cdf Fixed B-tree rendering in dbgmtree.py, unexpected arg
Rbyd.btree_btree renders B-trees, not rbyds, so the rbyd arg doesn't
make sense here. This was just an accidental refactor mistake.
2024-05-31 16:36:33 -05:00
Christopher Haster
2e8012681b Tweaked dbg script headers to match the mount info log
The main difference being rendering the weight with a single letter "w"
prefix:

  $ ./scripts/dbglfs.py disk -b4096
  littlefs v0.0 4096x256 0x{1,0}.8b w2.512, rev eb7f2a0d
  ...

This lets us add valuable weight info without too much noise.

Adopting this in the dbg scripts is nice for consistency.
2024-05-24 14:56:11 -05:00
Christopher Haster
56b18dfd9a Reworked revision count logic a bit, block_cycles -> block_recycles
The original goal here was to restore all of the revision count/
wear-leveling features that were intentionally ignored during
refactoring, but over time a few other ideas to better leverage our
revision count bits crept in, so this is sort of the amalgamation of
that...

Note! None of these changes affect reading. mdir fetch strictly needs
only to look at the revision count as a big 32-bit counter to determine
which block is the most recent.

The interesting thing about the original definition of the revision
count, a simple 32-bit counter, is that it actually only needs 2-bits to
work. Well, three states really: 1. most recent, 2. less recent, 3.
future most recent. This means the remaining bits are sort of up for
grabs to other things.

Previously, we've used the extra revision count bits as a heuristic for
wear-leveling. Here we reintroduce that, a bit more rigorously, while
also carving out space for a nonce to help with commit collisions.

Here's the new revision count breakdown:

  vvvvrrrr rrrrrrnn nnnnnnnn nnnnnnnn
  '-.''----.----''---------.--------'
    '------|---------------|---------- 4-bit relocation revision
           '---------------|---------- recycle-bits recycle counter
                           '---------- pseudorandom nonce

- 4-bit relocation revision

  We technically only need 2-bits to tell which block is the most
  recent, but I've bumped it up to 4-bits just to be safe and to make
  it a bit more readable in hex form.

- recycle-bits recycle counter

  A user configurable counter, this counter tracks how many times a
  metadata block has been erased. When it overflows we return the block
  to the allocator to participate in block-level wear-leveling again.
  This implements our copy-on-bounded-write strategy.

- pseudorandom nonce

  The remaining bits we fill with a pseudorandom nonce derived from the
  filesystem's prng. Note this prng isn't the greatest (it's just the
  xor of all mdir cksums), but it gets the job done. It should also be
  reproducible, which can be a good thing.

  Suggested by ithinuel, the addition of a nonce should help with the
  commit collision issue caused by noop erases. It doesn't completely
  solve things, since we're only using crc32c cksums not collision
  resistant cryptographic hashes, but we still have the existing
  valid/perturb bit system to fall back on.

When we allocate a new mdir, we want to zero the recycle counter. This
is where our relocation revision is useful for indicating which block is
the most recent:

  initial state: 10101010 10101010 10101010 10101010
                 '-.'
                  +1     zero           random
                   v .----'----..---------'--------.
  lfsr_rev_init: 10110000 00000011 01110010 11101111

When we increment, we increment the recycle counter and xor in a new nonce:

  initial state: 10110000 00000011 01110010 11101111
                 '--------.----''---------.--------'
                         +1              xor <-- random
                          v               v
  lfsr_rev_init: 10110000 00000111 01010100 01000000

And when the recycle counter overflows, we relocate the mdir.

If we aren't wear-leveling, we just increment the relocation revision to
maximize the nonce.
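A rough Python sketch of this layout (field widths per the diagram above; the helper names are illustrative, not the actual lfs.c API):

```python
import random

RELOC_BITS = 4

def rev_init(rev, recycle_bits, prng):
    # bump the 4-bit relocation revision, zero the recycle counter,
    # and fill the remaining bits with a pseudorandom nonce
    nonce_bits = 32 - RELOC_BITS - recycle_bits
    reloc = ((rev >> (32 - RELOC_BITS)) + 1) & 0xf
    return (reloc << (32 - RELOC_BITS)) | prng.getrandbits(nonce_bits)

def rev_inc(rev, recycle_bits, prng):
    # increment the recycle counter and xor in a fresh nonce; overflow
    # means the mdir should be returned to the allocator (relocated)
    nonce_bits = 32 - RELOC_BITS - recycle_bits
    recycle = ((rev >> nonce_bits) & ((1 << recycle_bits) - 1)) + 1
    overflow = bool(recycle >> recycle_bits)
    rev = ((rev & ~(((1 << recycle_bits) - 1) << nonce_bits))
            | ((recycle & ((1 << recycle_bits) - 1)) << nonce_bits))
    rev ^= prng.getrandbits(nonce_bits)
    return rev & 0xffffffff, overflow

prng = random.Random(42)
rev = rev_init(0xaaaaaaaa, 10, prng)
assert rev >> 28 == 0xb  # relocation revision bumped 0xa -> 0xb
rev, overflow = rev_inc(rev, 10, prng)
assert (rev >> 18) & 0x3ff == 1 and not overflow
```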

---

Some other notes:

- Renamed block_cycles -> block_recycles.

  This is intended to help avoid confusing block_cycles with the actual
  physical number of erase cycles supported by the device.

  I've noticed this happening a few times, and it's unfortunately
  equivalent to disabling wear-leveling completely. This can be improved
  with better documentation, but also changing the name doesn't hurt.

- We now relocate both blocks in the mdir at the same time.

  Previously we only relocated one block in the mdir per recycle. This
  was necessary to keep our threaded linked-list in sync, but the
  threaded linked-list is now no more!

  Relocating both blocks is simpler, updates the mtree less often,
  compatible with metadata redundancy, and avoids aliasing issues that
  were a problem when relocating one block.

  Note that block_recycles is internally multiplied by 2 so each block
  sees the correct number of erase cycles.

- block_recycles is now rounded down to a power-of-2.

  This makes the counter logic easier to work with and takes up less RAM
  in lfs_t. This is a rough heuristic anyways.

- Moved the lfs->seed updates into lfsr_mountinited + lfsr_mdir_commit.

  This avoids readonly operations affecting the seed and should help
  reproducibility.

- Changed rev count in dbg scripts to render as hex, similar to cksums.

  Now that we're using most of the bits in the revision count, the decimal
  version is, uh, not helpful...

Code changes:

           code          stack
  before: 33342           2640
  after:  33434 (+0.3%)   2640 (+0.0%)
2024-05-22 18:49:05 -05:00
Christopher Haster
11c948678f Renamed size_limit -> file_limit
This limits the maximum size of a file, which also implies the
maximum integer size required to mount.

The exact name is a bit of a toss-up. I originally went with size_limit
to avoid confusion around if file_limit reflected the file size or the
number of files, but since this ends up mapping to lfs_off_t and _not_
lfs_size_t, I think size_limit may be a bit of a bad choice.
2024-05-18 13:00:15 -05:00
Christopher Haster
8a75a68d8b Made rbyd cksums erased-state agnostic
Long story short, rbyd checksums are now fully reproducible. If you
write the same set of tags to any block, you will end up with the same
checksum.

This is actually a bit tricky with littlefs's constraints.

---

The main problem boils down to erased-state. littlefs has a fairly
flexible model for erased-state, and this brings some challenges. In
littlefs, storage goes through 2 states:

1. Erase - Prepare storage for progging. Reads after an erase may return
   arbitrary, but consistent, values.

2. Prog - Program storage with data. Storage must be erased and no progs
   attempted. Reads after a prog must return the new data.

Note in this model erased-state may not be all 0xffs, though it likely
will be for flash. This allows littlefs to support a wide range of
other storage devices: SD, RAM, NVRAM, encryption, ECC, etc.

But this model also means erased-state may be different from block to
block, and even different on later erases of the same block.

And if that wasn't enough of a challenge, _erased-state can contain
perfectly valid commits_. Usually you can expect arbitrary valid cksums
to be rare, but thanks to SD, RAM, etc, modeling erase as a noop, valid
cksums in erased-state are actually very common.

So how do we manage erased-state in our rbyds?

First we need some way to detect it, since we can't prog if we're not
erased. This is accomplished by the forward-looking erased-state cksum
(ecksum):

  .---+---+---+---.     \
  |     commit    |     |
  |               |     |
  |               |     |
  +---+---+---+---+     +-.
  |     ecksum -------. | | <-- ecksum - cksum of erased state
  +---+---+---+---+   | / |
  |     cksum --------|---' <-- cksum - cksum of commit,
  +---+---+---+---+   |                 including ecksum
  |    padding    |   |
  |               |   |
  +---+---+---+---+ \ |
  |     erased    | +-'
  |               | /
  .               .
  .               .

You may have already noticed the start of our problems. The ecksum
contains the erased-state, which is different per-block, and our rbyd
cksum contains the ecksum. We need to include the ecksum so we know if
it's valid, but this means our rbyd cksum changes block to block.

Solving this is simple enough: Stop the rbyd's canonical cksum before
the ecksum, but include the ecksum in the actual cksum we write to disk.

Future commits will need to start from the canonical cksum, so the old
ecksum won't be included in new commits, but this shouldn't be a
problem:

  .---+---+---+---. . . \ . \ . . . . .---+---+---+---.     \   \
  |     commit    |     |   |         |     commit    |     |   |
  |               |     |   +- rbyd   |               |     |   |
  |               |     |   |  cksum  |               |     |   |
  +---+---+---+---+     +-. /         +---+---+---+---+     |   |
  |     ecksum -------. | |           |     ecksum    |     .   .
  +---+---+---+---+   | / |           +---+---+---+---+     .   .
  |     cksum --------|---'           |     cksum     |     .   .
  +---+---+---+---+   |               +---+---+---+---+     .   .
  |    padding    |   |               |    padding    |     .   .
  |               |   |               |               |     .   .
  +---+---+---+---+ \ | . . . . . . . +---+---+---+---+     |   |
  |     erased    | +-'               |     commit    |     |   |
  |               | /                 |               |     |   +- rbyd
  .               .                   |               |     |   |  cksum
  .               .                   +---+---+---+---+     +-. /
                                      |     ecksum -------. | |
                                      +---+---+---+---+   | / |
                                      |     cksum ------------'
                                      +---+---+---+---+   |
                                      |    padding    |   |
                                      |               |   |
                                      +---+---+---+---+ \ |
                                      |     erased    | +-'
                                      |               | /
                                      .               .
                                      .               .
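A rough sketch of this split (names and layout are illustrative, not the actual on-disk format): the canonical cksum stops before the ecksum, so it's identical across blocks, while the on-disk cksum continues over the ecksum and so varies with the erased-state:

```python
def crc32c(data, crc=0):
    # bitwise reflected crc32c, no init/xorout
    for b in data:
        crc ^= b
        for _ in range(8):
            crc = (crc >> 1) ^ (0x82f63b78 if crc & 1 else 0)
    return crc

def commit_cksums(tags, erased):
    # canonical cksum: covers only the committed tags, reproducible
    canonical = crc32c(tags)
    # ecksum: cksum of the (block-specific) erased-state
    ecksum = crc32c(erased)
    # on-disk cksum: continues the canonical cksum over the ecksum
    disk = crc32c(ecksum.to_bytes(4, 'little'), canonical)
    return canonical, disk

# same tags, different erased-state: the canonical cksum matches, the
# on-disk cksum doesn't
c1, d1 = commit_cksums(b'some tags', b'\xff\xff\xff\xff')
c2, d2 = commit_cksums(b'some tags', b'\x00\x00\x00\x00')
assert c1 == c2 and d1 != d2
```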

The second challenge is the pesky possibility of existing valid commits.
We need some way to ensure that erased-state following a commit does not
accidentally contain a valid old commit.

This is where our tag's valid bits come into play: The valid bit of each
tag must match the parity of all preceding tags (equivalent to the
parity of the crc32c), and we can use some perturb bits in the cksum tag
to make sure any tags in our erased-state do _not_ match:

  .---+---+---+---. \ . . . . . .---+---+---+---. \   \   \
  |v|    tag      | |           |v|    tag      | |   |   |
  +---+---+---+---+ |           +---+---+---+---+ |   |   |
  |     commit    | |           |     commit    | |   |   |
  |               | |           |               | |   |   |
  +---+---+---+---+ +-----.     +---+---+---+---+ +-. |   |
  |v|p|  tag      | |     |     |v|p|  tag      | | | |   |
  +---+---+---+---+ /     |     +---+---+---+---+ / | |   |
  |     cksum     |       |     |     cksum     |   | .   .
  +---+---+---+---+       |     +---+---+---+---+   | .   .
  |    padding    |       |     |    padding    |   | .   .
  |               |       |     |               |   | .   .
  +---+---+---+---+ . . . | . . +---+---+---+---+   | |   |
  |v---------------- != --'     |v------------------' |   |
  |     erased    |             +---+---+---+---+     |   |
  .               .             |     commit    |     |   |
  .               .             |               |     |   |
                                +---+---+---+---+     +-. +-.
                                |v|p|  tag      |     | | | |
                                +---+---+---+---+     / | / |
                                |     cksum ----------------'
                                +---+---+---+---+       |
                                |    padding    |       |
                                |               |       |
                                +---+---+---+---+       |
                                |v---------------- != --'
                                |     erased    |
                                .               .
                                .               .

New problem! The rbyd cksum contains the valid bits, which contain the
perturb bits, which depends on the erased-state!

And you can't just derive the valid bits from the rbyd's canonical
cksum. This avoids erased-state poisoning, sure, but then nothing in the
new commit depends on the perturb bits! The catch-22 here is that we
need the valid bits to both depend on, and ignore, the erased-state
poisoned perturb bits.

As far as I can tell, the only way around this is to make the rbyd's
canonical cksum not include the parity bits. Which is annoying; masking
out bits is not great for bulk cksum calculation...

But this does solve our problem:

  .---+---+---+---. \ . . . . . .---+---+---+---. \   \   \   \
  |v|    tag      | |           |v|    tag      | |   |   o   o
  +---+---+---+---+ |           +---+---+---+---+ |   |   |   |
  |     commit    | |           |     commit    | |   |   |   |
  |               | |           |               | |   |   |   |
  +---+---+---+---+ +-----.     +---+---+---+---+ +-. |   |   |
  |v|p|  tag      | |     |     |v|p|  tag      | | | |   .   .
  +---+---+---+---+ /     |     +---+---+---+---+ / | |   .   .
  |     cksum     |       |     |     cksum     |   | .   .   .
  +---+---+---+---+       |     +---+---+---+---+   | .   .   .
  |    padding    |       |     |    padding    |   | .   .   .
  |               |       |     |               |   | .   .   .
  +---+---+---+---+ . . . | . . +---+---+---+---+   | |   |   |
  |v---------------- != --'     |v------------------' |   o   o
  |     erased    |             +---+---+---+---+     |   |   |
  .               .             |     commit    |     |   |   +- rbyd
  .               .             |               |     |   |   |  cksum
                                +---+---+---+---+     +-. +-. /
                                |v|p|  tag      |     | | o |
                                +---+---+---+---+     / | / |
                                |     cksum ----------------'
                                +---+---+---+---+       |
                                |    padding    |       |
                                |               |       |
                                +---+---+---+---+       |
                                |v---------------- != --'
                                |     erased    |
                                .               .
                                .               .

Note that because each commit's cksum derives from the canonical cksum,
the valid bits and commit cksums no longer contain the same data, so our
parity(m) = parity(crc32c(m)) trick no longer works.

However, our crc32c still does tell us a bit about each tag's parity,
so with a couple well-placed xors we can at least avoid needing two
parallel calculations:

  cksum' = crc32c(cksum, m)
  valid' = parity(cksum' xor cksum) xor valid

This also means our commit cksums don't include any information about
the valid bits, since we mask these out before cksum calculation. Which
is a bit concerning, but as far as I can tell not a real problem.
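As a sanity check, the xor trick can be sketched in Python with a plain
bitwise crc32c that has no per-call init/xorout, mirroring how commit
cksums roll forward (this is an illustration, not the actual
implementation):

```python
# A minimal sketch of the parity bookkeeping above. Because the crc32c
# polynomial is divisible by x+1, every shift step of the crc preserves
# parity, so parity(cksum' ^ cksum) recovers parity(m), and one xor per
# tag tracks the running valid bit without a second cksum calculation.

def crc32c(crc, data):
    # bitwise reflected crc32c update, no per-call init/xorout
    for b in data:
        crc ^= b
        for _ in range(8):
            crc = (crc >> 1) ^ (0x82f63b78 if crc & 1 else 0)
    return crc

def parity(x):
    # parity of all bits in an int or bytes
    if isinstance(x, bytes):
        x = int.from_bytes(x, 'little')
    return bin(x).count('1') & 1

# track the valid bit alongside the rolling commit cksum
cksum, valid = 0, 0
for m in [b'hello', b'\xff\xff\xff', b'rbyd tag']:
    cksum_ = crc32c(cksum, m)
    assert parity(cksum_ ^ cksum) == parity(m)
    valid = parity(cksum_ ^ cksum) ^ valid
    cksum = cksum_
```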

---

An alternative design would be to just keep track of two cksums: A
commit cksum and a canonical cksum.

This would be much simpler, but would also require storing two cksums in
RAM in our lfsr_rbyd_t struct. A bit annoying for our 4-byte crc32cs,
and a bit more than a bit annoying for hypothetical 32-byte sha256s.

It's also not entirely clear how you would update both crc32cs
efficiently. There is a way to xor out the initial state before each
tag, but I think it would still require O(n) cycles of crc32c
calculation...

As it is, the extra bit needed to keep track of commit parity is easy
enough to sneak into some unused sign bits in our lfsr_rbyd_t struct.

---

I've also gone ahead and mixed in the current commit parity into our
cksum's perturb bits, so the commit cksum at least contains _some_
information about the previous parity.

But it's not entirely clear this actually adds anything. Our perturb
bits aren't _required_ to reflect the commit parity, so a very unlucky
power-loss could in theory still make a cksum valid for the wrong
parity.

At least this situation will be caught by later valid bits...

I've also carved out a tag encoding, LFSR_TAG_PERTURB, solely for adding
more perturb bits to commit cksums:

  LFSR_TAG_CKSUM          0x3cpp  v-11 cccc -ppp pppp

  LFSR_TAG_CKSUM          0x30pp  v-11 ---- -ppp pppp
  LFSR_TAG_PERTURB        0x3100  v-11 ---1 ---- ----
  LFSR_TAG_ECKSUM         0x3200  v-11 --1- ---- ----
  LFSR_TAG_GCKSUMDELTA+   0x3300  v-11 --11 ---- ----

  + Planned

This allows for more than 7 perturb bits, and could even mix in the
entire previous commit cksum, if we ever think that is worth the RAM
tradeoff.
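For illustration, decoding the cksum-class subtypes under this encoding
might look something like the following sketch (the helper name is just
for this example):

```python
# under the new encoding, the cksum-class subtype lives in bits 9-8,
# and for the cksum tag the perturb bits live in the low 7 bits
def cksum_subtype(tag):
    assert tag & 0x3000 == 0x3000  # cksum-class tag (v-11 ...)
    return (tag >> 8) & 0x3

assert cksum_subtype(0x3055) == 0  # CKSUM, perturb bits 0x55
assert cksum_subtype(0x3100) == 1  # PERTURB
assert cksum_subtype(0x3200) == 2  # ECKSUM
assert cksum_subtype(0x3300) == 3  # GCKSUMDELTA (planned)
```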

LFSR_TAG_PERTURB also has the advantage that it is validated by the
cksum tag's valid bit before being included in the commit cksum, which
indirectly includes the current commit parity. We may eventually want to
use this instead of the cksum tag's perturb bits for this reason, but
right now I'm not sure this tiny bit of extra safety is worth the
minimum 5-byte per commit overhead...

Note if you want perturb bits that are also included in the rbyd's
canonical cksum, you can just use an LFSR_TAG_SHRUBDATA tag. Or any
unreferenced shrub tag really.

---

All of these changes required a decent amount of code, I think mostly
just to keep track of the parity bit. But the isolation of rbyd cksums
from erased-state is necessary for several future-planned features:

           code          stack
  before: 33564           2816
  after:  33916 (+1.0%)   2824 (+0.3%)
2024-05-04 17:25:01 -05:00
Christopher Haster
c4fcc78814 Tweaked file types/name tag encoding to be a bit less quirky
The intention behind the quirky encoding was to leverage bit 1 to
indicate if the underlying file type would be backed by the common file
B-tree data structure. Looking forward, there may be several of these
types, compressed files, contiguous files, etc, that for all intents and
purposes are just normal files interpreted differently.

But trying to leverage too many bits like this is probably going to give
us a sparse, awkward, and confusing tag encoding, so I've reverted to a
hopefully more normal encoding:

  LFSR_TAG_NAME           0x02tt  v--- --1- -ttt tttt

  LFSR_TAG_NAME           0x0200  v--- --1- ---- ----
  LFSR_TAG_REG            0x0201  v--- --1- ---- ---1
  LFSR_TAG_DIR            0x0202  v--- --1- ---- --1-
  LFSR_TAG_SYMLINK*       0x0203  v--- --1- ---- --11
  LFSR_TAG_BOOKMARK       0x0204  v--- --1- ---- -1--
  LFSR_TAG_ORPHAN         0x0205  v--- --1- ---- -1-1
  LFSR_TAG_COMPR*         0x0206  v--- --1- ---- -11-
  LFSR_TAG_CONTIG*        0x0207  v--- --1- ---- -111

  * Hypothetical

Note the carve-out for the hypothetical symlink tag. Symlinks are
actually incredibly low in the priority list, but they are also
the only current hypothetical file type that would need to be exposed to
users. Grouping these up makes sense.

This will get a bit messy if we ever end up with a 4th user-facing type,
but there isn't one in POSIX at least (ignoring non-fs types: socket,
fifo, character, block, etc).

The gap also helps line things up so reg/orphan are a single bit flip,
and the non-user facing types all share a bit.
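These alignment claims are easy to check mechanically (tag values taken
from the table above):

```python
# check the bit-alignment claims of the new name tag encoding
LFSR_TAG_REG      = 0x0201
LFSR_TAG_DIR      = 0x0202
LFSR_TAG_BOOKMARK = 0x0204
LFSR_TAG_ORPHAN   = 0x0205

def bitdist(a, b):
    # number of differing bits between two tag encodings
    return bin(a ^ b).count('1')

# reg <-> orphan is a single bit flip
assert bitdist(LFSR_TAG_REG, LFSR_TAG_ORPHAN) == 1
# the non-user-facing types share the 0x4 bit
assert all(t & 0x4 for t in [LFSR_TAG_BOOKMARK, LFSR_TAG_ORPHAN])
```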

This had no impact on code size:

           code          stack
  before: 33564           2816
  after:  33564 (+0.0%)   2816 (+0.0%)
2024-05-04 17:24:48 -05:00
Christopher Haster
6e5d314c20 Tweaked struct tag encoding so b*/m* tags are earlier
These b*/m* struct tags have a common pattern that would be good to
emphasize in the encoding. The later struct tags get a bit more messy as
they leave space for future possible extensions.

New encoding:

  LFSR_TAG_STRUCT         0x03tt  v--- --11 -ttt ttrr

  LFSR_TAG_DATA           0x0300  v--- --11 ---- ----
  LFSR_TAG_BLOCK          0x0304  v--- --11 ---- -1rr
  LFSR_TAG_BSHRUB         0x0308  v--- --11 ---- 1---
  LFSR_TAG_BTREE          0x030c  v--- --11 ---- 11rr
  LFSR_TAG_MROOT          0x0310  v--- --11 ---1 --rr
  LFSR_TAG_MDIR           0x0314  v--- --11 ---1 -1rr
  LFSR_TAG_MSHRUB*        0x0318  v--- --11 ---1 1---
  LFSR_TAG_MTREE          0x031c  v--- --11 ---1 11rr
  LFSR_TAG_DID            0x0320  v--- --11 --1- ----
  LFSR_TAG_BRANCH         0x032c  v--- --11 --1- 11rr

  * Hypothetical

Note that all shrubs currently end with 1---, and all btrees, including
the awkward branch tag, end with 11rr.
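A quick mechanical check of these low-nibble patterns (tag values from
the table above):

```python
# shrubs end with 1--- and btrees end with 11rr in the new encoding
SHRUBS = [0x0308, 0x0318]          # BSHRUB, MSHRUB (hypothetical)
BTREES = [0x030c, 0x031c, 0x032c]  # BTREE, MTREE, BRANCH

assert all(tag & 0xf == 0x8 for tag in SHRUBS)  # 1---
assert all(tag & 0xc == 0xc for tag in BTREES)  # 11rr
```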

This had no impact on code size:

           code          stack
  before: 33564           2816
  after:  33564 (+0.0%)   2816 (+0.0%)
2024-05-04 17:24:33 -05:00
Christopher Haster
5fa85583cd Dropped block-level erased-state checksums for RAM-tracked erased-state
Unfortunately block-level erased-state checksums (becksums) don't really
work as intended.

An invalid becksum _does_ signal that a prog has been attempted, but a
valid becksum does _not_ prove that a prog has _not_ been attempted.

Rbyd ecksums work, but only thanks to a combination of prioritizing
valid commits and the use of perturb bits to force erased-state changes.
It _is_ possible to end up with an ecksum collision, but only if you
1. lose power before completing a commit, and 2. end up with a
non-trivial crc32c collision. If this does happen, at the very least the
resulting commit will likely end up corrupted and thrown away later.

Block-level becksums, at least as originally designed, don't have either
of these protections. To make matters worse, the blocks these becksums
reference contain only raw user data. Write 0xffs into a file and you
will likely end up with a becksum collision!

This is a problem for a couple of reasons:

1. Progging multiple times to erased-state is likely to result in
   corrupted data, though this is also likely to get caught with
   validating writes.

   Worst case, the resulting data looks valid, but with weakened data
   retention.

2. Because becksums are stored in the copy-on-write metadata of the
   file, attempting to open a file twice for writing (or more advanced
   copy-on-write operations in the future) can lead to a situation where
   a prog is attempted on _already committed_ data.

   This is very bad and breaks copy-on-write guarantees.

---

So clearly becksums are not fit for purpose and should be dropped. What
can we replace them with?

The first option, implemented here, is RAM-tracked erased state. Give
each lfsr_file_t its own eblock/eoff fields to track the last known good
erased-state. And before each prog, clear eblock/eoff so we never
accidentally prog to the same erased-state twice.

It's interesting to note we don't currently clear eblock/eoff in all
file handles; this is ok only because we don't currently share
eblock/eoff across file handles. Each eblock/eoff is exclusive to the
lfsr_file_t and does not appear anywhere else in the system.
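A hypothetical Python sketch of this scheme (File, prog, and the disk
dict here are simplified stand-ins for the real lfsr_file_t
implementation):

```python
# RAM-tracked erased-state: each file handle remembers the last known
# good erased-state in eblock/eoff, and clears it before every prog so
# the same erased-state can never be progged twice
class File:
    def __init__(self):
        # last known good erased-state, exclusive to this handle
        self.eblock = None
        self.eoff = None

    def prog(self, block, off, data, disk):
        # clear eblock/eoff first, so even a power-loss mid-prog can't
        # leave us believing this erased-state is still usable
        self.eblock, self.eoff = None, None
        disk[block][off:off+len(data)] = data
        # on success, the erased-state resumes after the new data
        self.eblock, self.eoff = block, off + len(data)

disk = {0: bytearray(b'\xff'*16)}
f = File()
f.prog(0, 0, b'abcd', disk)
assert (f.eblock, f.eoff) == (0, 4)
assert disk[0][:4] == b'abcd'
```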

The main downside of this approach is that, well, the RAM-tracked
erased-state is only tracked in RAM. Block-level erased-state
effectively does not persist across reboots. I've considered adding some
sort of per-file erased-state tracking to the mdir that would need to be
cleared before use, but such a mechanism ends up quite complicated.

At the moment, I think the best second option is to put erased-state
tracking in the future-planned bmap. This would let you opt-in to
on-disk tracking of all erased-state in the system.

One nice thing about RAM-tracked erased-state is that it's not on disk,
so it's not really a compatibility concern and won't get in the way of
additional future erased-state tracking.

---

Benchmarking becksums vs RAM-tracking has been quite interesting. While
in theory becksums can track much more erased-state, it's quite unlikely
anything but the most recent erased-state actually ends up used. The end
result is no real measurable performance loss, and actually a minor
speedup because we don't need to calculate becksums on every block
write.

There are some pathological cases, such as multiple write heads, but
these are out-of-scope right now (note! multiple explicit file handles
currently handle this case beautifully because we don't share
eblock/eoff!)

Becksums were also relatively complicated, and needed extra scaffolding
to pass around/propagate as secondary tags alongside the primary bptr.
So trading these for RAM-tracking also gives us a nice bit of code/stack
savings, albeit at a 2-word RAM cost in lfsr_file_t:

           code          stack          structs
  before: 33888           2864             1096
  after:  33564 (-1.0%)   2816 (-1.7%)     1104 (+0.7%)

  lfsr_file_t before: 104
  lfsr_file_t after:  112 (+7.7%)
2024-05-04 17:22:56 -05:00
Christopher Haster
81ccfbccd0 Dropped -x/--device from dbg*.py scripts
This hasn't really proven useful.

At one point showing the cksums in dbgrbyd.py was useful, but this is
now possible and easier with dbgblock.py -x/--cksum.
2024-04-28 13:21:46 -05:00