Files
littlefs/scripts/dbgtag.py
Christopher Haster 1c5adf71b3 Implemented self-validating global-checksums (gcksums)
This was quite a puzzle.

The problem: How do we detect corrupt mdirs?

Seems like a simple question, but we can't just rely on mdir cksums. Our
mdirs are independently updateable logs, and logs have this annoying
tendency to "rollback" to previously valid states when corrupted.

Rollback issues aren't littlefs-specific, but what _is_ littlefs-
specific is that when one mdir rolls back, it can disagree with other
mdirs, resulting in wildly incorrect filesystem state.

To solve this, or at least protect against disagreeable mdirs, we need
to somehow include the state of all other mdirs in each mdir commit.

---

The first thought: Why not use gstate?

We already have a system for storing distributed state. If we add the
xor of all of our mdir cksums, we can rebuild it during mount and verify
that nothing changed:

   .--------.   .--------.   .--------.   .--------.
  .| mdir 0 |  .| mdir 1 |  .| mdir 2 |  .| mdir 3 |
  ||        |  ||        |  ||        |  ||        |
  || gdelta |  || gdelta |  || gdelta |  || gdelta |
  |'-----|--'  |'-----|--'  |'-----|--'  |'-----|--'
  '------|-'   '------|-'   '------|-'   '------|-'
  '--.------'  '--.------'  '--.------'  '--.------'
   cksum |      cksum |      cksum |      cksum |
     |   |        v   |        v   |        v   |
     '---------> xor -------> xor -------> xor -------> gcksum
         |            v            v            v         =?
         '---------> xor -------> xor -------> xor ---> gcksum

Unfortunately it's not that easy. Consider what this looks like
mathematically (g is our gcksum, c_i is an mdir cksum, d_i is a
gcksumdelta, and +/-/sum is xor):

  g = sum(c_i) = sum(d_i)

If we solve for a new gcksumdelta, d_i:

  d_i = g' - g
  d_i = g + c_i - g
  d_i = c_i

The gcksum cancels itself out! We're left with an equation that depends
only on the current mdir, which doesn't help us at all.

Next thought: What if we permute the gcksum with a function t before
distributing it over our gcksumdeltas?

   .--------.   .--------.   .--------.   .--------.
  .| mdir 0 |  .| mdir 1 |  .| mdir 2 |  .| mdir 3 |
  ||        |  ||        |  ||        |  ||        |
  || gdelta |  || gdelta |  || gdelta |  || gdelta |
  |'-----|--'  |'-----|--'  |'-----|--'  |'-----|--'
  '------|-'   '------|-'   '------|-'   '------|-'
  '--.------'  '--.------'  '--.------'  '--.------'
   cksum |      cksum |      cksum |      cksum |
     |   |        v   |        v   |        v   |
     '---------> xor -------> xor -------> xor -------> gcksum
         |            |            |            |   .--t--'
         |            |            |            |   '-> t(gcksum)
         |            v            v            v          =?
         '---------> xor -------> xor -------> xor ---> t(gcksum)

In math terms:

  t(g) = t(sum(c_i)) = sum(d_i)

In order for this to work, t needs to be non-linear. If t is linear, the
same thing happens:

  d_i = t(g') - t(g)
  d_i = t(g + c_i) - t(g)
  d_i = t(g) + t(c_i) - t(g)
  d_i = t(c_i)

This was quite funny/frustrating (funnistrating?) during development,
because it means a lot of seemingly obvious functions don't work!

- t(g) = g              - Doesn't work
- t(g) = crc32c(g)      - Doesn't work because crc32cs are linear
- t(g) = g^2 in GF(2^n) - g^2 is linear in GF(2^n)!?

Fortunately, powers coprime with 2 finally give us a non-linear function
in GF(2^n), so t(g) = g^3 works:

  d_i = g'^3 - g^3
  d_i = (g + c_i)^3 - g^3
  d_i = (g^2 + gc_i + gc_i + c_i^2)(g + c_i) - g^3
  d_i = (g^2 + c_i^2)(g + c_i) - g^3
  d_i = g^3 + gc_i^2 + g^2c_i + c_i^3 - g^3
  d_i = gc_i^2 + g^2c_i + c_i^3

---

Bleh, now we need to implement finite-field operations? Well, not
entirely!

Note that our algorithm never uses division. This means we don't need a
full finite-field (+, -, *, /), but can get away with a finite-ring (+,
-, *). And conveniently for us, our crc32c polynomial defines a ring
epimorphic to a 31-bit finite-field.

All we need to do is define crc32c multiplication as polynomial
multiplication mod our crc32c polynomial:

  crc32cmul(a, b) = pmod(pmul(a, b), P)

And since crc32c is more-or-less just pmod(x, P), this lets us take
advantage of any crc32c hardware/tables that may be available.

---

Bunch of notes:

- Our 2^n-bit crc-ring maps to a 2^n-1-bit finite-field because our crc
  polynomial is defined as P(x) = Q(x)(x + 1), where Q(x) is a 2^n-1-bit
  irreducible polynomial.

  This is a common crc construction as it provides optimal odd-bit/2-bit
  error detection, so it shouldn't be too difficult to adapt to other
  crc sizes.

- t(g) = g^3 is not the only function that works, but it turns out to be
  a pretty good one:

  - 3 and 2^(2^n-1)-1 are coprime, which means our function t(g) = g^3
    provides a one-to-one mapping in the underlying fields of all crc
    rings of size 2^(2^n).

    We know 3 and 2^(2^n-1)-1 are coprime because 2^(2^n-1)-1 =
    2^(2^n)-1 (a Fermat number) - 2^(2^n-1) (a power-of-2), and 3
    divides Fermat numbers >=3 (A023394) and is not 2.

  - Our delta, when viewed as a polynomial in g: d(g) = gc^2 + g^2c +
    c^3, has degree 2, which implies there are at most 2 solutions or
    1-bit of information loss in the underlying field.

    This is optimal since the original definition already had 2
    solutions before we even chose a function:

      d(g) = t(g + c) - t(g)
      d(g) = t(g + c) - t((g + c) - c)
      d(g) = t((g + c) + c) - t(g + c)
      d(g) = d(g + c)

  Though note the mapping of our crc-ring to the underlying field
  already represents 1-bit of information loss.

- If you're using a cryptographic hash or other non-crc, you should
  probably just use an equal sized finite-field.

  Though note changing from a 2^n-1-bit field to a 2^n-bit field does
  change the math a bit, with t(g) = g^7 being a better non-linear
  function:

  - 7 is the smallest odd-number coprime with 2^n-1, a Fermat number,
    which makes t(g) = g^7 a one-to-one mapping.

    3 humorously divides all 2^n-1 Fermat numbers.

  - Expanding delta with t(g) = g^7 gives us a 6 degree polynomial,
    which implies at most 6 solutions or ~3-bits of information loss.

    This isn't actually the best you can do, some exhaustive searching
    over small fields (<=2^16) suggests t(g) = g^(2^(n-1)-1) _might_ be
    optimal, but that's a heck of a lot more multiplications.

- Because our crc32cs preserve parity/are epimorphic to parity bits,
  addition (xor) and multiplication (crc32cmul) also preserve parity,
  which can be used to show our entire gcksum system preserves parity.

  This is quite neat, and means we are guaranteed to detect any odd
  number of bit-errors across the entire filesystem.

- Another idea was to use two different addition operations: xor and
  overflowing addition (or mod a prime).

  This probably would have worked, but lacks the rigor of the above
  solution.

- You might think an RS-like construction would help here, where g =
  sum(c_ia^i), but this suffers from the same problem:

    d_i = g' - g
    d_i = g + c_ia^i - g
    d_i = c_ia^i

  Nothing here depends on anything outside of the current mdir.

- Another question is should we be using an RS-like construction anyways
  to include location information in our gcksum?

  Maybe in another system, but I don't think it's necessary in littlefs.

  While our mdir are independently updateable, they aren't _entirely_
  independent. The location of each mdir is stored in either the mtree
  or a parent mdir, so it always gets mixed into the gcksum somewhere.

  The only exception being the mrootanchor which is always at the fixed
  blocks 0x{0,1}.

- This does _not_ catch "global-rollback" issues, where the most recent
  commit in the entire filesystem is corrupted, revealing an older, but
  still valid, filesystem state.

  But as far as I am aware this is just a fundamental limitation of
  powerloss-resilient filesystems, short of doing destructive
  operations.

  At the very least, exposing the gcksum would allow the user to store
  it externally and prevent this issue.

---

Implementation details:

- Our gcksumdelta depends on the rbyd's cksum, so there's a catch-22 if
  we include it in the rbyd itself.

  We can avoid this by including it in the commit tags (actually the
  separate canonical cksum makes this easier than it would have been
  earlier), but this does mean LFSR_TAG_GCKSUMDELTA is not an
  LFSR_TAG_GDELTA subtype. Unfortunate but not a dealbreaker.

- Reading/writing the gcksumdelta gets a bit annoying with it not being
  in the rbyd. For now I've extended the low-level lfsr_rbyd_fetch_/
  lfsr_rbyd_appendcksum_ to accept an optional gcksumdelta pointer,
  which is a bit awkward, but I don't know of a better solution.

- Unlike the grm, _every_ mdir commit involves the gcksum, which means
  we either need to propagate the gcksumdelta up the mroot chain
  correctly, or somehow keep track of partially flushed gcksumdeltas.

  To make this work I modified the low-level lfsr_mdir_commit__
  functions to accept start_rid=-2 to indicate when gcksumdeltas should
  be flushed.

  It's a bit of a hack, but I think it might make sense to extend this
  to all gdeltas eventually.

The gcksum cost both code and RAM, but I think it's well worth it for
removing an entire category of filesystem corruption:

           code          stack          ctx
  before: 37796           2608          620
  after:  38428 (+1.7%)   2640 (+1.2%)  644 (+3.9%)
2025-02-08 14:53:30 -06:00

369 lines
13 KiB
Python
Executable File

#!/usr/bin/env python3
# prevent local imports
if __name__ == "__main__":
__import__('sys').path.pop(0)
import io
import os
import struct
import sys
TAG_NULL = 0x0000 ## 0x0000 v--- ---- ---- ----
TAG_CONFIG = 0x0000 ## 0x00tt v--- ---- -ttt tttt
TAG_MAGIC = 0x0003 # 0x0003 v--- ---- ---- --11
TAG_VERSION = 0x0004 # 0x0004 v--- ---- ---- -1--
TAG_RCOMPAT = 0x0005 # 0x0005 v--- ---- ---- -1-1
TAG_WCOMPAT = 0x0006 # 0x0006 v--- ---- ---- -11-
TAG_OCOMPAT = 0x0007 # 0x0007 v--- ---- ---- -111
TAG_GEOMETRY = 0x0009 # 0x0008 v--- ---- ---- 1-rr
TAG_NAMELIMIT = 0x000c # 0x000c v--- ---- ---- 11--
TAG_FILELIMIT = 0x000d # 0x000d v--- ---- ---- 11-1
TAG_GDELTA = 0x0100 ## 0x01tt v--- ---1 -ttt tttt
TAG_GRMDELTA = 0x0100 # 0x0100 v--- ---1 ---- ----
TAG_NAME = 0x0200 ## 0x02tt v--- --1- -ttt tttt
TAG_REG = 0x0201 # 0x0201 v--- --1- ---- ---1
TAG_DIR = 0x0202 # 0x0202 v--- --1- ---- --1-
TAG_BOOKMARK = 0x0204 # 0x0204 v--- --1- ---- -1--
TAG_STICKYNOTE = 0x0205 # 0x0205 v--- --1- ---- -1-1
TAG_STRUCT = 0x0300 ## 0x03tt v--- --11 -ttt tttt
TAG_DATA = 0x0300 # 0x0300 v--- --11 ---- ----
TAG_BLOCK = 0x0304 # 0x0304 v--- --11 ---- -1rr
TAG_BSHRUB = 0x0308 # 0x0308 v--- --11 ---- 1---
TAG_BTREE = 0x030c # 0x030c v--- --11 ---- 11rr
TAG_MROOT = 0x0311 # 0x0310 v--- --11 ---1 --rr
TAG_MDIR = 0x0315 # 0x0314 v--- --11 ---1 -1rr
TAG_MTREE = 0x031c # 0x031c v--- --11 ---1 11rr
TAG_DID = 0x0320 # 0x0320 v--- --11 --1- ----
TAG_BRANCH = 0x032c # 0x032c v--- --11 --1- 11rr
TAG_ATTR = 0x0400 ## 0x04aa v--- -1-a -aaa aaaa
TAG_UATTR = 0x0400 # 0x04aa v--- -1-- -aaa aaaa
TAG_SATTR = 0x0500 # 0x05aa v--- -1-1 -aaa aaaa
TAG_SHRUB = 0x1000 ## 0x1kkk v--1 kkkk -kkk kkkk
TAG_ALT = 0x4000 ## 0x4kkk v1cd kkkk -kkk kkkk
TAG_B = 0x0000
TAG_R = 0x2000
TAG_LE = 0x0000
TAG_GT = 0x1000
TAG_CKSUM = 0x3000 ## 0x300p v-11 ---- ---- ---p
TAG_P = 0x0001
TAG_NOTE = 0x3100 ## 0x3100 v-11 ---1 ---- ----
TAG_ECKSUM = 0x3200 ## 0x3200 v-11 --1- ---- ----
TAG_GCKSUMDELTA = 0x3300 ## 0x3300 v-11 --11 ---- ----
# some ways of block geometry representations
# 512 -> 512
# 512x16 -> (512, 16)
# 0x200x10 -> (512, 16)
def bdgeom(s):
s = s.strip()
b = 10
if s.startswith('0x') or s.startswith('0X'):
s = s[2:]
b = 16
elif s.startswith('0o') or s.startswith('0O'):
s = s[2:]
b = 8
elif s.startswith('0b') or s.startswith('0B'):
s = s[2:]
b = 2
if 'x' in s:
s, s_ = s.split('x', 1)
return (int(s, b), int(s_, b))
else:
return int(s, b)
# parse some rbyd addr encodings
# 0xa -> (0xa,)
# 0xa.c -> ((0xa, 0xc),)
# 0x{a,b} -> (0xa, 0xb)
# 0x{a,b}.c -> ((0xa, 0xc), (0xb, 0xc))
def rbydaddr(s):
s = s.strip()
b = 10
if s.startswith('0x') or s.startswith('0X'):
s = s[2:]
b = 16
elif s.startswith('0o') or s.startswith('0O'):
s = s[2:]
b = 8
elif s.startswith('0b') or s.startswith('0B'):
s = s[2:]
b = 2
trunk = None
if '.' in s:
s, s_ = s.split('.', 1)
trunk = int(s_, b)
if s.startswith('{') and '}' in s:
ss = s[1:s.find('}')].split(',')
else:
ss = [s]
addr = []
for s in ss:
if trunk is not None:
addr.append((int(s, b), trunk))
else:
addr.append(int(s, b))
return tuple(addr)
def fromleb128(data):
word = 0
for i, b in enumerate(data):
word |= ((b & 0x7f) << 7*i)
word &= 0xffffffff
if not b & 0x80:
return word, i+1
return word, len(data)
def tagrepr(tag, w=None, size=None, off=None):
if (tag & 0x6fff) == TAG_NULL:
return '%snull%s%s' % (
'shrub' if tag & TAG_SHRUB else '',
' w%d' % w if w else '',
' %d' % size if size else '')
elif (tag & 0x6f00) == TAG_CONFIG:
return '%s%s%s%s' % (
'shrub' if tag & TAG_SHRUB else '',
'magic' if (tag & 0xfff) == TAG_MAGIC
else 'version' if (tag & 0xfff) == TAG_VERSION
else 'rcompat' if (tag & 0xfff) == TAG_RCOMPAT
else 'wcompat' if (tag & 0xfff) == TAG_WCOMPAT
else 'ocompat' if (tag & 0xfff) == TAG_OCOMPAT
else 'geometry' if (tag & 0xfff) == TAG_GEOMETRY
else 'namelimit' if (tag & 0xfff) == TAG_NAMELIMIT
else 'filelimit' if (tag & 0xfff) == TAG_FILELIMIT
else 'config 0x%02x' % (tag & 0xff),
' w%d' % w if w else '',
' %s' % size if size is not None else '')
elif (tag & 0x6f00) == TAG_GDELTA:
return '%s%s%s%s' % (
'shrub' if tag & TAG_SHRUB else '',
'grmdelta' if (tag & 0xfff) == TAG_GRMDELTA
else 'gdelta 0x%02x' % (tag & 0xff),
' w%d' % w if w else '',
' %s' % size if size is not None else '')
elif (tag & 0x6f00) == TAG_NAME:
return '%s%s%s%s' % (
'shrub' if tag & TAG_SHRUB else '',
'name' if (tag & 0xfff) == TAG_NAME
else 'reg' if (tag & 0xfff) == TAG_REG
else 'dir' if (tag & 0xfff) == TAG_DIR
else 'bookmark' if (tag & 0xfff) == TAG_BOOKMARK
else 'stickynote' if (tag & 0xfff) == TAG_STICKYNOTE
else 'name 0x%02x' % (tag & 0xff),
' w%d' % w if w else '',
' %s' % size if size is not None else '')
elif (tag & 0x6f00) == TAG_STRUCT:
return '%s%s%s%s' % (
'shrub' if tag & TAG_SHRUB else '',
'data' if (tag & 0xfff) == TAG_DATA
else 'block' if (tag & 0xfff) == TAG_BLOCK
else 'bshrub' if (tag & 0xfff) == TAG_BSHRUB
else 'btree' if (tag & 0xfff) == TAG_BTREE
else 'mroot' if (tag & 0xfff) == TAG_MROOT
else 'mdir' if (tag & 0xfff) == TAG_MDIR
else 'mtree' if (tag & 0xfff) == TAG_MTREE
else 'did' if (tag & 0xfff) == TAG_DID
else 'branch' if (tag & 0xfff) == TAG_BRANCH
else 'struct 0x%02x' % (tag & 0xff),
' w%d' % w if w else '',
' %s' % size if size is not None else '')
elif (tag & 0x6e00) == TAG_ATTR:
return '%s%sattr 0x%02x%s%s' % (
'shrub' if tag & TAG_SHRUB else '',
's' if tag & 0x100 else 'u',
((tag & 0x100) >> 1) ^ (tag & 0xff),
' w%d' % w if w else '',
' %s' % size if size is not None else '')
elif tag & TAG_ALT:
return 'alt%s%s%s%s%s' % (
'r' if tag & TAG_R else 'b',
'a' if tag & 0x0fff == 0 and tag & TAG_GT
else 'n' if tag & 0x0fff == 0
else 'gt' if tag & TAG_GT
else 'le',
' 0x%x' % (tag & 0x0fff) if tag & 0x0fff != 0 else '',
' w%d' % w if w is not None else '',
' 0x%x' % (0xffffffff & (off-size))
if size and off is not None
else ' -%d' % size if size
else '')
elif (tag & 0x7f00) == TAG_CKSUM:
return 'cksum%s%s%s%s' % (
'p' if not tag & 0xfe and tag & TAG_P else '',
' 0x%02x' % (tag & 0xff) if tag & 0xfe else '',
' w%d' % w if w else '',
' %s' % size if size is not None else '')
elif (tag & 0x7f00) == TAG_NOTE:
return 'note%s%s%s' % (
' 0x%02x' % (tag & 0xff) if tag & 0xff else '',
' w%d' % w if w else '',
' %s' % size if size is not None else '')
elif (tag & 0x7f00) == TAG_ECKSUM:
return 'ecksum%s%s%s' % (
' 0x%02x' % (tag & 0xff) if tag & 0xff else '',
' w%d' % w if w else '',
' %s' % size if size is not None else '')
elif (tag & 0x7f00) == TAG_GCKSUMDELTA:
return 'gcksumdelta%s%s%s' % (
' 0x%02x' % (tag & 0xff) if tag & 0xff else '',
' w%d' % w if w else '',
' %s' % size if size is not None else '')
else:
return '0x%04x%s%s' % (
tag,
' w%d' % w if w is not None else '',
' %d' % size if size is not None else '')
def list_tags():
# here be magic, parse our script's source to get to the tag comments
import inspect
import re
tags = []
tag_pattern = re.compile(
'^(?P<name>TAG_[^ ]*) *= *(?P<tag>[^ #]+) *#+ *(?P<comment>.*)$')
for line in inspect.getsourcelines(
inspect.getmodule(inspect.currentframe()))[0]:
m = tag_pattern.match(line)
if m:
tags.append(m.groups())
# find widths for alignment
w = [0]
for n, t, c in tags:
w[0] = max(w[0], len('LFSR_'+n))
# print
for n, t, c in tags:
print('%-*s %s' % (
w[0], 'LFSR_'+n,
c))
def dbg_tag(data):
if isinstance(data, int):
tag = data
weight = None
size = None
else:
data = data.ljust(2, b'\0')
tag = (data[0] << 8) | data[1]
weight, d = fromleb128(data[2:]) if 2 < len(data) else (None, 2)
size, d_ = fromleb128(data[2+d:]) if d < len(data) else (None, d)
print(tagrepr(tag, weight, size))
def main(tags, *,
list=False,
block_size=None,
block_count=None,
off=None,
**args):
import builtins
list_, list = list, builtins.list
# list all known tags
if list_:
list_tags()
# try to decode these tags
else:
# interpret as a sequence of hex bytes
if args.get('hex'):
dbg_tag(bytes(int(tag, 16) for tag in tags))
# interpret as strings
elif args.get('string'):
for tag in tags:
dbg_tag(tag.encode('utf8'))
# default to interpreting as ints
elif block_size is None and off is None:
for tag in tags:
dbg_tag(int(tag, 0))
# if either block_size or off provided interpret as a block device
else:
disk, *blocks = tags
blocks = [rbydaddr(block) for block in blocks]
# is bd geometry specified?
if isinstance(block_size, tuple):
block_size, block_count_ = block_size
if block_count is None:
block_count = block_count_
# flatten block, default to block 0
if not blocks:
blocks = [(0,)]
blocks = [block for blocks_ in blocks for block in blocks_]
with open(disk, 'rb') as f:
# if block_size is omitted, assume the block device is one
# big block
if block_size is None:
f.seek(0, os.SEEK_END)
block_size = f.tell()
# blocks may also encode offsets
blocks, offs = (
[block[0] if isinstance(block, tuple) else block
for block in blocks],
[off if off is not None
else block[1] if isinstance(block, tuple)
else None
for block in blocks])
# read each tag
for block, off in zip(blocks, offs):
f.seek((block * block_size) + (off or 0))
# read maximum tag size
data = f.read(2+5+5)
dbg_tag(data)
if __name__ == "__main__":
import argparse
import sys
parser = argparse.ArgumentParser(
description="Decode littlefs tags.",
allow_abbrev=False)
parser.add_argument(
'tags',
nargs='*',
help="Tags to decode.")
parser.add_argument(
'-l', '--list',
action='store_true',
help="List all known tags.")
parser.add_argument(
'-x', '--hex',
action='store_true',
help="Interpret as a sequence of hex bytes.")
parser.add_argument(
'-s', '--string',
action='store_true',
help="Interpret as strings.")
parser.add_argument(
'-b', '--block-size',
type=bdgeom,
help="Block size/geometry in bytes.")
parser.add_argument(
'--block-count',
type=lambda x: int(x, 0),
help="Block count in blocks.")
parser.add_argument(
'--off',
type=lambda x: int(x, 0),
help="Use this offset.")
sys.exit(main(**{k: v
for k, v in vars(parser.parse_intermixed_args()).items()
if v is not None}))