Implemented self-validating global-checksums (gcksums)

This was quite a puzzle.

The problem: How do we detect corrupt mdirs?

Seems like a simple question, but we can't just rely on mdir cksums. Our
mdirs are independently updateable logs, and logs have this annoying
tendency to "rollback" to previously valid states when corrupted.

Rollback issues aren't littlefs-specific, but what _is_ littlefs-
specific is that when one mdir rolls back, it can disagree with other
mdirs, resulting in wildly incorrect filesystem state.

To solve this, or at least protect against disagreeable mdirs, we need
to somehow include the state of all other mdirs in each mdir commit.

---

The first thought: Why not use gstate?

We already have a system for storing distributed state. If we add the
xor of all of our mdir cksums, we can rebuild it during mount and verify
that nothing changed:

   .--------.   .--------.   .--------.   .--------.
  .| mdir 0 |  .| mdir 1 |  .| mdir 2 |  .| mdir 3 |
  ||        |  ||        |  ||        |  ||        |
  || gdelta |  || gdelta |  || gdelta |  || gdelta |
  |'-----|--'  |'-----|--'  |'-----|--'  |'-----|--'
  '------|-'   '------|-'   '------|-'   '------|-'
  '--.------'  '--.------'  '--.------'  '--.------'
   cksum |      cksum |      cksum |      cksum |
     |   |        v   |        v   |        v   |
     '---------> xor -------> xor -------> xor -------> gcksum
         |            v            v            v         =?
         '---------> xor -------> xor -------> xor ---> gcksum

Unfortunately it's not that easy. Consider what this looks like
mathematically (g is our gcksum, c_i is an mdir cksum, d_i is a
gcksumdelta, and +/-/sum is xor):

  g = sum(c_i) = sum(d_i)

If we solve for a new gcksumdelta, d_i:

  d_i = g' - g
  d_i = g + c_i - g
  d_i = c_i

The gcksum cancels itself out! We're left with an equation that depends
only on the current mdir, which doesn't help us at all.

Next thought: What if we permute the gcksum with a function t before
distributing it over our gcksumdeltas?

   .--------.   .--------.   .--------.   .--------.
  .| mdir 0 |  .| mdir 1 |  .| mdir 2 |  .| mdir 3 |
  ||        |  ||        |  ||        |  ||        |
  || gdelta |  || gdelta |  || gdelta |  || gdelta |
  |'-----|--'  |'-----|--'  |'-----|--'  |'-----|--'
  '------|-'   '------|-'   '------|-'   '------|-'
  '--.------'  '--.------'  '--.------'  '--.------'
   cksum |      cksum |      cksum |      cksum |
     |   |        v   |        v   |        v   |
     '---------> xor -------> xor -------> xor -------> gcksum
         |            |            |            |   .--t--'
         |            |            |            |   '-> t(gcksum)
         |            v            v            v          =?
         '---------> xor -------> xor -------> xor ---> t(gcksum)

In math terms:

  t(g) = t(sum(c_i)) = sum(d_i)

In order for this to work, t needs to be non-linear. If t is linear, the
same thing happens:

  d_i = t(g') - t(g)
  d_i = t(g + c_i) - t(g)
  d_i = t(g) + t(c_i) - t(g)
  d_i = t(c_i)

This was quite funny/frustrating (funnistrating?) during development,
because it means a lot of seemingly obvious functions don't work!

- t(g) = g              - Doesn't work
- t(g) = crc32c(g)      - Doesn't work because crc32cs are linear
- t(g) = g^2 in GF(2^n) - g^2 is linear in GF(2^n)!?
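
A quick way to see the problem is to compute the delta for a few
different values of g and check whether it ever changes. This is only a
sketch, not code from this commit: crc32c, crc32c_mul, the t_* functions,
and delta_ignores_g are stand-ins written for illustration (bit-at-a-time,
reflected crc32c polynomial 0x82f63b78), and feeding g to crc32c as 4
little-endian bytes is an assumption:

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define CRC32C_POLY 0x82f63b78u // reflected crc32c polynomial

// bit-at-a-time crc32c, same loop as the LFS_SMALLER_CRC32C path
static uint32_t crc32c(uint32_t crc, const void *buffer, size_t size) {
    const uint8_t *data = buffer;
    crc ^= 0xffffffff;
    for (size_t i = 0; i < size; i++) {
        crc ^= data[i];
        for (int j = 0; j < 8; j++) {
            crc = (crc >> 1) ^ ((crc & 1) ? CRC32C_POLY : 0);
        }
    }
    return crc ^ 0xffffffff;
}

// carry-less multiply, all adds replaced with xors
static uint64_t pmul(uint32_t a, uint32_t b) {
    uint64_t r = 0;
    uint64_t a_ = a;
    while (b) {
        if (b & 1) {
            r ^= a_;
        }
        a_ <<= 1;
        b >>= 1;
    }
    return r;
}

// polynomial multiplication mod the crc32c polynomial
static uint32_t crc32c_mul(uint32_t a, uint32_t b) {
    uint64_t r = pmul(a, b) << 1;
    for (int i = 0; i < 32; i++) {
        r = (r >> 1) ^ ((r & 1) ? CRC32C_POLY : 0);
    }
    return (uint32_t)r;
}

// candidate functions t
typedef uint32_t (*perm_fn)(uint32_t);

static uint32_t t_identity(uint32_t g) { return g; }
static uint32_t t_square(uint32_t g) { return crc32c_mul(g, g); }
static uint32_t t_cube(uint32_t g) {
    return crc32c_mul(crc32c_mul(g, g), g);
}
static uint32_t t_crc32c(uint32_t g) {
    // assumption: g is hashed as 4 little-endian bytes
    uint8_t buf[4] = {
        (uint8_t)(g >> 0), (uint8_t)(g >> 8),
        (uint8_t)(g >> 16), (uint8_t)(g >> 24)};
    return crc32c(0, buf, 4);
}

// d = t(g + c) - t(g), where +/- is xor
static uint32_t delta(perm_fn t, uint32_t g, uint32_t c) {
    return t(g ^ c) ^ t(g);
}

// returns 1 if the delta never depends on g for a handful of samples
static int delta_ignores_g(perm_fn t) {
    static const uint32_t gs[] = {0x80000000u, 0x12345678u, 0xdeadbeefu};
    static const uint32_t cs[] = {0x40000000u, 0x0badf00du, 0xffffffffu};
    for (size_t i = 0; i < 3; i++) {
        for (size_t j = 0; j < 3; j++) {
            if (delta(t, gs[j], cs[i]) != delta(t, 0, cs[i])) {
                return 0;
            }
        }
    }
    return 1;
}
```

With these definitions, delta_ignores_g should return 1 for t_identity,
t_crc32c, and t_square, and 0 for t_cube, which is the only candidate
here where the rest of the filesystem leaks into the delta.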

Fortunately, odd powers greater than 1 finally give us a non-linear
function in GF(2^n), so t(g) = g^3 works:

  d_i = g'^3 - g^3
  d_i = (g + c_i)^3 - g^3
  d_i = (g^2 + gc_i + gc_i + c_i^2)(g + c_i) - g^3
  d_i = (g^2 + c_i^2)(g + c_i) - g^3
  d_i = g^3 + gc_i^2 + g^2c_i + c_i^3 - g^3
  d_i = gc_i^2 + g^2c_i + c_i^3
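
The expansion above can be spot-checked numerically. The following is a
minimal sketch under a few assumptions: crc32c_mul mirrors the
bit-at-a-time variant of multiplication mod the crc32c polynomial
(reflected bit order, so 0x80000000 is the polynomial 1 and 0x40000000 is
x), and cube_delta/cube_delta_expanded are hypothetical helper names that
don't exist in littlefs:

```c
#include <assert.h>
#include <stdint.h>

#define CRC32C_POLY 0x82f63b78u // reflected crc32c polynomial

// carry-less multiply, all adds replaced with xors
static uint64_t pmul(uint32_t a, uint32_t b) {
    uint64_t r = 0;
    uint64_t a_ = a;
    while (b) {
        if (b & 1) {
            r ^= a_;
        }
        a_ <<= 1;
        b >>= 1;
    }
    return r;
}

// polynomial multiplication mod the crc32c polynomial
static uint32_t crc32c_mul(uint32_t a, uint32_t b) {
    uint64_t r = pmul(a, b) << 1;
    for (int i = 0; i < 32; i++) {
        r = (r >> 1) ^ ((r & 1) ? CRC32C_POLY : 0);
    }
    return (uint32_t)r;
}

static uint32_t cube(uint32_t g) {
    return crc32c_mul(crc32c_mul(g, g), g);
}

// d = (g + c)^3 - g^3, computed directly
static uint32_t cube_delta(uint32_t g, uint32_t c) {
    return cube(g ^ c) ^ cube(g);
}

// d = gc^2 + g^2c + c^3, the expanded form
static uint32_t cube_delta_expanded(uint32_t g, uint32_t c) {
    return crc32c_mul(g, crc32c_mul(c, c))
            ^ crc32c_mul(crc32c_mul(g, g), c)
            ^ cube(c);
}
```

For c = x (0x40000000), g = 0 gives a delta of x^3 (0x10000000), while
g = 1 (0x80000000) gives x^3 + x^2 + x (0x70000000), so the delta now
depends on the state of the other mdirs.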

---

Bleh, now we need to implement finite-field operations? Well, not
entirely!

Note that our algorithm never uses division. This means we don't need a
full finite-field (+, -, *, /), but can get away with a finite-ring (+,
-, *). And conveniently for us, our crc32c polynomial defines a ring
epimorphic to a 31-bit finite-field.

All we need to do is define crc32c multiplication as polynomial
multiplication mod our crc32c polynomial:

  crc32cmul(a, b) = pmod(pmul(a, b), P)

And since crc32c is more-or-less just pmod(x, P), this lets us take
advantage of any crc32c hardware/tables that may be available.
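
To make the pmod/crc32c relationship a bit more concrete, here is a
sketch that checks the ring properties we rely on, and that a "raw"
crc32c of a 32-bit word is the same thing as multiplying that word by
x^32 mod P. The assumptions: crc32c_raw_le32 is a hypothetical helper
that drops the usual 0xffffffff init/final xors and feeds the word as 4
little-endian bytes, CRC32C_ONE is the polynomial 1 in reflected bit
order, and crc32c_mul_check is illustrative only:

```c
#include <assert.h>
#include <stdint.h>

#define CRC32C_POLY 0x82f63b78u // reflected crc32c polynomial
#define CRC32C_ONE  0x80000000u // the polynomial 1, reflected

// carry-less multiply, all adds replaced with xors
static uint64_t pmul(uint32_t a, uint32_t b) {
    uint64_t r = 0;
    uint64_t a_ = a;
    while (b) {
        if (b & 1) {
            r ^= a_;
        }
        a_ <<= 1;
        b >>= 1;
    }
    return r;
}

// crc32cmul(a, b) = pmod(pmul(a, b), P), one bit at a time
static uint32_t crc32c_mul(uint32_t a, uint32_t b) {
    // the product of two reflected 32-bit polynomials is only 63 bits,
    // so shift by 1 to line it up as a reflected 64-bit polynomial
    uint64_t r = pmul(a, b) << 1;
    for (int i = 0; i < 32; i++) {
        r = (r >> 1) ^ ((r & 1) ? CRC32C_POLY : 0);
    }
    return (uint32_t)r;
}

// crc32c of a word without the init/final xors
static uint32_t crc32c_raw_le32(uint32_t w) {
    uint32_t crc = 0;
    for (int i = 0; i < 4; i++) {
        crc ^= (w >> (8*i)) & 0xff;
        for (int j = 0; j < 8; j++) {
            crc = (crc >> 1) ^ ((crc & 1) ? CRC32C_POLY : 0);
        }
    }
    return crc;
}

// spot-check the ring properties for one triple of values
static void crc32c_mul_check(uint32_t a, uint32_t b, uint32_t c) {
    // multiplicative identity
    assert(crc32c_mul(a, CRC32C_ONE) == a);
    // commutative and associative
    assert(crc32c_mul(a, b) == crc32c_mul(b, a));
    assert(crc32c_mul(crc32c_mul(a, b), c)
            == crc32c_mul(a, crc32c_mul(b, c)));
    // distributes over xor
    assert(crc32c_mul(a, b ^ c)
            == (crc32c_mul(a, b) ^ crc32c_mul(a, c)));
    // x^32 mod P is just the low bits of P, so the raw crc32c of a word
    // is that word times P's low bits
    assert(crc32c_raw_le32(a) == crc32c_mul(a, CRC32C_POLY));
}
```

Calling crc32c_mul_check(0x12345678, 0xdeadbeef, 0x0badf00d) should pass
every assert. The last assert is the reason the table-driven crc32c
loops can be reused for the modulo step.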

---

Bunch of notes:

- Our 2^n-bit crc-ring maps to a 2^n-1-bit finite-field because our crc
  polynomial is defined as P(x) = Q(x)(x + 1), where Q(x) is a 2^n-1-bit
  irreducible polynomial.

  This is a common crc construction as it provides optimal odd-bit/2-bit
  error detection, so it shouldn't be too difficult to adapt to other
  crc sizes.

- t(g) = g^3 is not the only function that works, but it turns out to be
  a pretty good one:

  - 3 and 2^(2^n-1)-1 are coprime, which means our function t(g) = g^3
    provides a one-to-one mapping in the underlying fields of all crc
    rings of size 2^(2^n).

    We know 3 and 2^(2^n-1)-1 are coprime because 2^(2^n-1)-1 =
    2^(2^n)-1 (a product of Fermat numbers) - 2^(2^n-1) (a power-of-2),
    and 3, the first Fermat number, divides 2^(2^n)-1 (A023394) but not
    a power-of-2.

  - Our delta, when viewed as a polynomial in g: d(g) = gc^2 + g^2c +
    c^3, has degree 2, which implies there are at most 2 solutions or
    1-bit of information loss in the underlying field.

    This is optimal since the original definition already had 2
    solutions before we even chose a function:

      d(g) = t(g + c) - t(g)
      d(g) = t(g + c) - t((g + c) - c)
      d(g) = t((g + c) + c) - t(g + c)
      d(g) = d(g + c)

  Though note the mapping of our crc-ring to the underlying field
  already represents 1-bit of information loss.

- If you're using a cryptographic hash or other non-crc, you should
  probably just use an equal sized finite-field.

  Though note changing from a 2^n-1-bit field to a 2^n-bit field does
  change the math a bit, with t(g) = g^7 being a better non-linear
  function:

  - 7 is the smallest odd-number >1 coprime with 2^(2^n)-1, a product of
    Fermat numbers, which makes t(g) = g^7 a one-to-one mapping.

    3 humorously divides 2^(2^n)-1 for all n>=1, and 5 for all n>=2.

  - Expanding delta with t(g) = g^7 gives us a 6 degree polynomial,
    which implies at most 6 solutions or ~3-bits of information loss.

    This isn't actually the best you can do; some exhaustive searching
    over small fields (<=2^16) suggests t(g) = g^(2^(n-1)-1) _might_ be
    optimal, but that's a heck of a lot more multiplications.

- Because our crc32cs preserve parity/are epimorphic to parity bits,
  addition (xor) and multiplication (crc32cmul) also preserve parity,
  which can be used to show our entire gcksum system preserves parity.

  This is quite neat, and means we are guaranteed to detect any odd
  number of bit-errors across the entire filesystem.

- Another idea was to use two different addition operations: xor and
  overflowing addition (or mod a prime).

  This probably would have worked, but lacks the rigor of the above
  solution.

- You might think an RS-like construction would help here, where g =
  sum(c_ia^i), but this suffers from the same problem:

    d_i = g' - g
    d_i = g + c_ia^i - g
    d_i = c_ia^i

  Nothing here depends on anything outside of the current mdir.

- Another question is should we be using an RS-like construction anyways
  to include location information in our gcksum?

  Maybe in another system, but I don't think it's necessary in littlefs.

  While our mdirs are independently updateable, they aren't _entirely_
  independent. The location of each mdir is stored in either the mtree
  or a parent mdir, so it always gets mixed into the gcksum somewhere.

  The only exception being the mrootanchor which is always at the fixed
  blocks 0x{0,1}.

- This does _not_ catch "global-rollback" issues, where the most recent
  commit in the entire filesystem is corrupted, revealing an older, but
  still valid, filesystem state.

  But as far as I am aware this is just a fundamental limitation of
  powerloss-resilient filesystems, short of doing destructive
  operations.

  At the very least, exposing the gcksum would allow the user to store
  it externally and prevent this issue.
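
A few of the notes above are easy to spot-check in code: the (x + 1)
factor in P means the crc-ring has zero divisors (so it isn't a field),
multiplication preserves parity, and the delta has the unavoidable
d(g) = d(g + c) symmetry. This is a sketch, not code from this commit:
parity32, crc32c_q, and gcksum_delta are hypothetical helpers, and
0x11edc6f41 is the crc32c polynomial written in normal (non-reflected)
bit order with the x^32 term included:

```c
#include <assert.h>
#include <stdint.h>

#define CRC32C_POLY 0x82f63b78u // reflected crc32c polynomial

// carry-less multiply, all adds replaced with xors
static uint64_t pmul(uint32_t a, uint32_t b) {
    uint64_t r = 0;
    uint64_t a_ = a;
    while (b) {
        if (b & 1) {
            r ^= a_;
        }
        a_ <<= 1;
        b >>= 1;
    }
    return r;
}

// polynomial multiplication mod the crc32c polynomial
static uint32_t crc32c_mul(uint32_t a, uint32_t b) {
    uint64_t r = pmul(a, b) << 1;
    for (int i = 0; i < 32; i++) {
        r = (r >> 1) ^ ((r & 1) ? CRC32C_POLY : 0);
    }
    return (uint32_t)r;
}

static uint32_t gcksum_cube(uint32_t g) {
    return crc32c_mul(crc32c_mul(g, g), g);
}

// parity of a crc, i.e. the polynomial evaluated at x = 1
static unsigned parity32(uint32_t a) {
    unsigned p = 0;
    while (a) {
        p ^= a & 1;
        a >>= 1;
    }
    return p;
}

// Q(x) = P(x)/(x + 1) via synthetic division, returned in the same
// reflected bit order as our crcs (bit 31-k holds the x^k coefficient)
static uint32_t crc32c_q(void) {
    uint64_t p = 0x11edc6f41; // P(x), bit k holds the x^k coefficient
    uint32_t q = 0;
    unsigned q_k = 1; // q_31 = p_32 = 1
    for (int k = 31; k >= 0; k--) {
        if (q_k) {
            q |= 1u << (31 - k);
        }
        // q_(k-1) = p_k + q_k
        q_k ^= (unsigned)((p >> k) & 1);
    }
    // q_k now holds the remainder, zero since (x + 1) divides P
    assert(q_k == 0);
    return q;
}

// d(g) = t(g + c) - t(g) with t(g) = g^3
static uint32_t gcksum_delta(uint32_t g, uint32_t c) {
    return gcksum_cube(g ^ c) ^ gcksum_cube(g);
}
```

With these, crc32c_mul(0xc0000000, crc32c_q()) should be 0 even though
both operands are non-zero (0xc0000000 is x + 1), the parity of
crc32c_mul(a, b) should equal the and of the parities of a and b, and
gcksum_delta(g, c) should equal gcksum_delta(g ^ c, c).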

---

Implementation details:

- Our gcksumdelta depends on the rbyd's cksum, so there's a catch-22 if
  we include it in the rbyd itself.

  We can avoid this by including it in the commit tags (actually the
  separate canonical cksum makes this easier than it would have been
  earlier), but this does mean LFSR_TAG_GCKSUMDELTA is not an
  LFSR_TAG_GDELTA subtype. Unfortunate but not a dealbreaker.

- Reading/writing the gcksumdelta gets a bit annoying with it not being
  in the rbyd. For now I've extended the low-level lfsr_rbyd_fetch_/
  lfsr_rbyd_appendcksum_ to accept an optional gcksumdelta pointer,
  which is a bit awkward, but I don't know of a better solution.

- Unlike the grm, _every_ mdir commit involves the gcksum, which means
  we either need to propagate the gcksumdelta up the mroot chain
  correctly, or somehow keep track of partially flushed gcksumdeltas.

  To make this work I modified the low-level lfsr_mdir_commit__
  functions to accept start_rid=-2 to indicate when gcksumdeltas should
  be flushed.

  It's a bit of a hack, but I think it might make sense to extend this
  to all gdeltas eventually.
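
To show how the pieces fit together, here is a toy model of the
bookkeeping. It is a deliberately simplified sketch, not the real
implementation: toy_mdir_t, toy_fs_t, toy_commit, and toy_mount are
made-up names, each "mdir" is reduced to one cksum and one gcksumdelta,
and relocations/pending gcksum_d state are ignored entirely:

```c
#include <assert.h>
#include <stdint.h>

#define CRC32C_POLY 0x82f63b78u // reflected crc32c polynomial
#define TOY_MDIRS 2

// carry-less multiply, all adds replaced with xors
static uint64_t pmul(uint32_t a, uint32_t b) {
    uint64_t r = 0;
    uint64_t a_ = a;
    while (b) {
        if (b & 1) {
            r ^= a_;
        }
        a_ <<= 1;
        b >>= 1;
    }
    return r;
}

// polynomial multiplication mod the crc32c polynomial
static uint32_t crc32c_mul(uint32_t a, uint32_t b) {
    uint64_t r = pmul(a, b) << 1;
    for (int i = 0; i < 32; i++) {
        r = (r >> 1) ^ ((r & 1) ? CRC32C_POLY : 0);
    }
    return (uint32_t)r;
}

static uint32_t gcksum_cube(uint32_t g) {
    return crc32c_mul(crc32c_mul(g, g), g);
}

// toy on-disk state, one cksum + gcksumdelta per mdir
typedef struct toy_mdir {
    uint32_t cksum;
    uint32_t gcksumdelta;
} toy_mdir_t;

typedef struct toy_fs {
    toy_mdir_t mdirs[TOY_MDIRS];
    uint32_t gcksum;
} toy_fs_t;

// commit a new cksum to mdir i, folding the change in gcksum^3 into
// that mdir's gcksumdelta
static void toy_commit(toy_fs_t *fs, int i, uint32_t cksum) {
    uint32_t gcksum_p = fs->gcksum;
    // xor out the old cksum, xor in the new cksum
    fs->gcksum ^= fs->mdirs[i].cksum ^ cksum;
    fs->mdirs[i].cksum = cksum;
    fs->mdirs[i].gcksumdelta ^= gcksum_cube(gcksum_p)
            ^ gcksum_cube(fs->gcksum);
}

// rebuild the gcksum from all mdirs and compare its cube against the
// xor of all gcksumdeltas, returns 0 if consistent, -1 if not
static int toy_mount(toy_fs_t *fs) {
    uint32_t gcksum = 0;
    uint32_t gcksum_d = 0;
    for (int i = 0; i < TOY_MDIRS; i++) {
        gcksum ^= fs->mdirs[i].cksum;
        gcksum_d ^= fs->mdirs[i].gcksumdelta;
    }
    fs->gcksum = gcksum;
    return (gcksum_cube(gcksum) == gcksum_d) ? 0 : -1;
}
```

In this model, committing to mdir 0, then mdir 1, then mdir 0 again, and
afterwards restoring only mdir 1 to its older cksum/gcksumdelta pair
should make toy_mount fail. Restoring only the most recent commit
(mdir 0) still mounts fine, which is the "global-rollback" limitation
mentioned earlier.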

The gcksum costs both code and RAM, but I think it's well worth it for
removing an entire category of filesystem corruption:

           code          stack          ctx
  before: 37796           2608          620
  after:  38428 (+1.7%)   2640 (+1.2%)  644 (+3.9%)
Christopher Haster
2025-01-12 16:01:39 -06:00
parent 0eee57017d
commit 1c5adf71b3
11 changed files with 684 additions and 168 deletions

273
lfs.c

@@ -1158,6 +1158,7 @@ enum lfsr_tag {
LFSR_TAG_P = 0x0001,
LFSR_TAG_NOTE = 0x3100,
LFSR_TAG_ECKSUM = 0x3200,
LFSR_TAG_GCKSUMDELTA = 0x3300,
// in-device only tags, these should never get written to disk
LFSR_TAG_INTERNAL = 0x0800,
@@ -2725,7 +2726,8 @@ static int lfsr_rbyd_ckecksum(lfs_t *lfs, const lfsr_rbyd_t *rbyd,
}
// fetch an rbyd
static int lfsr_rbyd_fetch(lfs_t *lfs, lfsr_rbyd_t *rbyd,
static int lfsr_rbyd_fetch_(lfs_t *lfs,
lfsr_rbyd_t *rbyd, uint32_t *gcksumdelta,
lfs_block_t block, lfs_size_t trunk) {
// set up some initial state
rbyd->blocks[0] = block;
@@ -2752,8 +2754,11 @@ static int lfsr_rbyd_fetch(lfs_t *lfs, lfsr_rbyd_t *rbyd,
lfsr_rid_t weight_ = 0;
// assume unerased until proven otherwise
lfsr_data_t ecksum = LFSR_DATA_NULL();
lfsr_data_t ecksum_ = LFSR_DATA_NULL();
lfsr_ecksum_t ecksum = {.cksize=-1};
lfsr_ecksum_t ecksum_ = {.cksize=-1};
// also find gcksumdelta, though this is only used by mdirs
uint32_t gcksumdelta_ = 0;
// scan tags, checking valid bits, cksums, etc
while (off < lfs->cfg->block_size
@@ -2793,7 +2798,33 @@ static int lfsr_rbyd_fetch(lfs_t *lfs, lfsr_rbyd_t *rbyd,
// found an ecksum? save for later
if (tag == LFSR_TAG_ECKSUM) {
ecksum_ = LFSR_DATA_DISK(block, off_, size);
err = lfsr_data_readecksum(lfs,
&LFSR_DATA_DISK(block, off_,
// note this size is to make the hint do
// what we want
lfs->cfg->block_size - off_),
&ecksum_);
if (err) {
if (err == LFS_ERR_CORRUPT) {
break;
}
return err;
}
// found gcksumdelta? save for later
} else if (tag == LFSR_TAG_GCKSUMDELTA) {
err = lfsr_data_readle32(lfs,
&LFSR_DATA_DISK(block, off_,
// note this size is to make the hint do
// what we want
lfs->cfg->block_size - off_),
&gcksumdelta_);
if (err) {
if (err == LFS_ERR_CORRUPT) {
break;
}
return err;
}
}
// is an end-of-commit cksum
@@ -2824,13 +2855,17 @@ static int lfsr_rbyd_fetch(lfs_t *lfs, lfsr_rbyd_t *rbyd,
rbyd->trunk = (LFSR_RBYD_ISSHRUB & rbyd->trunk) | trunk_;
rbyd->weight = weight;
ecksum = ecksum_;
ecksum_.cksize = -1;
if (gcksumdelta) {
*gcksumdelta = gcksumdelta_;
}
gcksumdelta_ = 0;
// revert to canonical checksum and perturb if necessary
cksum_ = cksum
^ ((lfsr_rbyd_isperturb(rbyd))
? LFS_CRC32C_ODDZERO
: 0);
ecksum_ = LFSR_DATA_NULL();
}
}
@@ -2888,18 +2923,9 @@ static int lfsr_rbyd_fetch(lfs_t *lfs, lfsr_rbyd_t *rbyd,
// did we end on a valid commit? we may have erased-state
bool erased = false;
if (lfsr_data_size(ecksum) != 0) {
// read the erased-state checksum
lfsr_ecksum_t ecksum__;
err = lfsr_data_readecksum(lfs, &ecksum,
&ecksum__);
if (err && err != LFS_ERR_CORRUPT) {
return err;
}
if (err != LFS_ERR_CORRUPT) {
if (ecksum.cksize != -1) {
// check the erased-state checksum
err = lfsr_rbyd_ckecksum(lfs, rbyd, &ecksum__);
err = lfsr_rbyd_ckecksum(lfs, rbyd, &ecksum);
if (err && err != LFS_ERR_CORRUPT) {
return err;
}
@@ -2907,7 +2933,6 @@ static int lfsr_rbyd_fetch(lfs_t *lfs, lfsr_rbyd_t *rbyd,
// found valid erased-state?
erased = (err != LFS_ERR_CORRUPT);
}
}
// used eoff=-1 to indicate when there is no erased-state
if (!erased) {
@@ -2917,6 +2942,11 @@ static int lfsr_rbyd_fetch(lfs_t *lfs, lfsr_rbyd_t *rbyd,
return 0;
}
static int lfsr_rbyd_fetch(lfs_t *lfs, lfsr_rbyd_t *rbyd,
lfs_block_t block, lfs_size_t trunk) {
return lfsr_rbyd_fetch_(lfs, rbyd, NULL, block, trunk);
}
// a more aggressive fetch when checksum is known
static int lfsr_rbyd_fetchck(lfs_t *lfs, lfsr_rbyd_t *rbyd,
lfs_block_t block, lfs_size_t trunk,
@@ -3937,7 +3967,11 @@ leaf:;
return 0;
}
static int lfsr_rbyd_appendcksum(lfs_t *lfs, lfsr_rbyd_t *rbyd) {
// needed in lfsr_rbyd_appendcksum
static uint32_t lfsr_gcksum_cube(uint32_t gcksum);
static int lfsr_rbyd_appendcksum_(lfs_t *lfs,
lfsr_rbyd_t *rbyd, uint32_t *gcksumdelta) {
// begin appending
int err = lfsr_rbyd_appendinit(lfs, rbyd);
if (err) {
@@ -3947,6 +3981,28 @@ static int lfsr_rbyd_appendcksum(lfs_t *lfs, lfsr_rbyd_t *rbyd) {
// save the canonical checksum
uint32_t cksum = rbyd->cksum;
// append gcksumdelta?
//
// the only requirement for gcksumdelta is we append after
// calculating the canonical checksum, it's a bit more convenient to
// append before the ecksum because of end-of-commit calculations
if (gcksumdelta) {
// figure out changes to our gcksumdelta
uint32_t gcksumdelta_ = *gcksumdelta
^ lfsr_gcksum_cube(lfs->gcksum_p)
^ lfsr_gcksum_cube(lfs->gcksum)
^ lfs->gcksum_d;
*gcksumdelta = gcksumdelta_;
uint8_t gcksumdelta_buf[LFSR_LE32_DSIZE];
err = lfsr_rbyd_appendrat_(lfs, rbyd, LFSR_RAT(
LFSR_TAG_GCKSUMDELTA, 0, LFSR_DATA_LE32(
gcksumdelta_, gcksumdelta_buf)));
if (err) {
return err;
}
}
// align to the next prog unit
//
// this gets a bit complicated as we have two types of cksums:
@@ -4081,6 +4137,10 @@ static int lfsr_rbyd_appendcksum(lfs_t *lfs, lfsr_rbyd_t *rbyd) {
return 0;
}
static int lfsr_rbyd_appendcksum(lfs_t *lfs, lfsr_rbyd_t *rbyd) {
return lfsr_rbyd_appendcksum_(lfs, rbyd, NULL);
}
static int lfsr_rbyd_appendrats(lfs_t *lfs, lfsr_rbyd_t *rbyd,
lfsr_srid_t rid, lfsr_srid_t start_rid, lfsr_srid_t end_rid,
const lfsr_rat_t *rats, lfs_size_t rat_count) {
@@ -6808,6 +6868,14 @@ static inline void lfsr_gdelta_xor(
}
// gcksum (global checksum) things
// cubing the gcksum prevents trivial gcksumdeltas
static uint32_t lfsr_gcksum_cube(uint32_t gcksum) {
return lfs_crc32c_mul(lfs_crc32c_mul(gcksum, gcksum), gcksum);
}
// grm (global remove) things
static inline uint8_t lfsr_grm_count_(const lfsr_grm_t *grm) {
return (grm->mids[0] >= 0) + (grm->mids[1] >= 0);
@@ -6895,6 +6963,8 @@ static int lfsr_data_readgrm(lfs_t *lfs, lfsr_data_t *data,
// some mdir-related gstate things we need
static void lfsr_fs_flushgdelta(lfs_t *lfs) {
// zero any pending gdeltas
lfs->gcksum_d = 0;
lfs_memset(lfs->grm_d, 0, LFSR_GRM_DSIZE);
}
@@ -6911,6 +6981,8 @@ static void lfsr_fs_preparegdelta(lfs_t *lfs) {
static void lfsr_fs_revertgdelta(lfs_t *lfs) {
// revert gstate to on-disk state
lfs->gcksum = lfs->gcksum_p;
int err = lfsr_data_readgrm(lfs,
&LFSR_DATA_BUF(lfs->grm_p, LFSR_GRM_DSIZE),
&lfs->grm);
@@ -6921,11 +6993,15 @@ static void lfsr_fs_revertgdelta(lfs_t *lfs) {
static void lfsr_fs_commitgdelta(lfs_t *lfs) {
// commit any pending gdeltas
lfs->gcksum_p = lfs->gcksum;
lfsr_data_fromgrm(&lfs->grm, lfs->grm_p);
}
// append and consume any pending gstate
static int lfsr_rbyd_appendgdelta(lfs_t *lfs, lfsr_rbyd_t *rbyd) {
// gcksums are a special case and handled directly in
// lfsr_mdir_commit__/lfsr_rbyd_appendcksum_
// need grm delta?
if (!lfsr_gdelta_iszero(lfs->grm_d, LFSR_GRM_DSIZE)) {
// make sure to xor any existing delta
@@ -6964,6 +7040,9 @@ static int lfsr_rbyd_appendgdelta(lfs_t *lfs, lfsr_rbyd_t *rbyd) {
}
static int lfsr_fs_consumegdelta(lfs_t *lfs, const lfsr_mdir_t *mdir) {
// consume any gcksum deltas
lfs->gcksum_d ^= mdir->gcksumdelta;
// consume any grm deltas
lfsr_data_t data;
int err = lfsr_rbyd_lookup(lfs, &mdir->rbyd, -1, LFSR_TAG_GRMDELTA,
@@ -7065,7 +7144,9 @@ static int lfsr_mdir_fetch(lfs_t *lfs, lfsr_mdir_t *mdir,
// try to fetch rbyds in the order of most recent to least recent
for (int i = 0; i < 2; i++) {
int err = lfsr_rbyd_fetch(lfs, &mdir->rbyd, blocks[0], 0);
int err = lfsr_rbyd_fetch_(lfs,
&mdir->rbyd, &mdir->gcksumdelta,
blocks[0], 0);
if (err && err != LFS_ERR_CORRUPT) {
return err;
}
@@ -7265,6 +7346,7 @@ static int lfsr_mtree_lookup(lfs_t *lfs, lfsr_smid_t mid,
if (lfsr_mtree_isnull(&lfs->mtree)) {
mdir_->mid = mid;
mdir_->rbyd = lfs->mroot.rbyd;
mdir_->gcksumdelta = lfs->mroot.gcksumdelta;
return 0;
// looking up direct mdir?
@@ -7308,6 +7390,8 @@ static int lfsr_mdir_alloc__(lfs_t *lfs, lfsr_mdir_t *mdir,
lfsr_smid_t mid, bool partial) {
// assign the mid
mdir->mid = mid;
// default to zero gcksumdelta
mdir->gcksumdelta = 0;
if (!partial) {
// allocate one block without an erase
@@ -7362,6 +7446,8 @@ static int lfsr_mdir_swap__(lfs_t *lfs, lfsr_mdir_t *mdir_,
const lfsr_mdir_t *mdir, bool force) {
// assign the mid
mdir_->mid = mdir->mid;
// reset to zero gcksumdelta, upper layers should handle this
mdir_->gcksumdelta = 0;
// first thing we need to do is read our current revision count
uint32_t rev;
@@ -7686,22 +7772,38 @@ static int lfsr_mdir_commit__(lfs_t *lfs, lfsr_mdir_t *mdir,
}
// append any gstate?
if (start_rid == -1) {
if (start_rid <= -1) {
int err = lfsr_rbyd_appendgdelta(lfs, &mdir->rbyd);
if (err) {
return err;
}
}
// TODO should lfsr_rbyd_appendcksum_ revert cksum on failure?
// save cksum in case we fail
uint32_t cksum = mdir->rbyd.cksum;
// xor our new cksum
lfs->gcksum ^= mdir->rbyd.cksum;
// finalize commit
int err = lfsr_rbyd_appendcksum(lfs, &mdir->rbyd);
int err = lfsr_rbyd_appendcksum_(lfs, &mdir->rbyd,
// include gcksumdelta if we're not relocating
(start_rid <= -2) ? &mdir->gcksumdelta : NULL);
if (err) {
// undo cksum xor on failure
lfs->gcksum ^= cksum;
return err;
}
// success? flush gstate?
if (start_rid == -1) {
if (start_rid <= -1) {
// TODO this is a hack
// we only flush gcksumdelta if rid == -2
uint32_t gcksum_d = lfs->gcksum_d;
lfsr_fs_flushgdelta(lfs);
if (start_rid > -2) {
lfs->gcksum_d = gcksum_d;
}
}
return 0;
@@ -7719,7 +7821,7 @@ static lfs_ssize_t lfsr_mdir_estimate__(lfs_t *lfs, const lfsr_mdir_t *mdir,
// calculate dsize by starting from the outside ids and working inwards,
// this naturally gives us a split rid
lfsr_srid_t a_rid = start_rid;
lfsr_srid_t a_rid = lfs_smax(start_rid, -1);
lfsr_srid_t b_rid = lfs_min(mdir->rbyd.weight, end_rid);
lfs_size_t a_dsize = 0;
lfs_size_t b_dsize = 0;
@@ -7827,7 +7929,7 @@ static lfs_ssize_t lfsr_mdir_estimate__(lfs_t *lfs, const lfsr_mdir_t *mdir,
}
}
if (a_rid == -1) {
if (a_rid <= -1) {
mdir_dsize += dsize_;
} else {
a_dsize += dsize_;
@@ -7858,8 +7960,14 @@ static int lfsr_mdir_compact__(lfs_t *lfs, lfsr_mdir_t *mdir_,
// (btree), not the staged state (btree_), this is important,
// we can't trust btree_ after a failed commit
// assume we keep any gcksumdelta, this will get fixed the first time
// we commit anything
if (start_rid == -2) {
mdir_->gcksumdelta = mdir->gcksumdelta;
}
// copy over tags in the rbyd in order
lfsr_srid_t rid = start_rid;
lfsr_srid_t rid = lfs_smax(start_rid, -1);
lfsr_tag_t tag = 0;
while (true) {
lfsr_rid_t weight;
@@ -8075,8 +8183,14 @@ relocate:;
}
compact:;
// don't copy over gcksum if relocating
lfsr_srid_t start_rid_ = start_rid;
if (relocated && !overcompacted) {
start_rid_ = lfs_smax(start_rid_, -1);
}
// compact our mdir
err = lfsr_mdir_compact__(lfs, &mdir_, mdir, start_rid, end_rid);
err = lfsr_mdir_compact__(lfs, &mdir_, mdir, start_rid_, end_rid);
if (err) {
LFS_ASSERT(err != LFS_ERR_RANGE);
// bad prog? try another block
@@ -8090,7 +8204,7 @@ compact:;
//
// upper layers should make sure this can't fail by limiting the
// maximum commit size
err = lfsr_mdir_commit__(lfs, &mdir_, start_rid, end_rid,
err = lfsr_mdir_commit__(lfs, &mdir_, start_rid_, end_rid,
mid, rats, rat_count);
if (err) {
LFS_ASSERT(err != LFS_ERR_RANGE);
@@ -8101,6 +8215,10 @@ compact:;
return err;
}
// consume gcksumdelta if relocated
if (relocated && !overcompacted) {
lfs->gcksum_d ^= mdir->gcksumdelta;
}
// update mdir
*mdir = mdir_;
return 0;
@@ -8196,6 +8314,9 @@ static int lfsr_mdir_commit(lfs_t *lfs, lfsr_mdir_t *mdir,
// setup any pending gdeltas
lfsr_fs_preparegdelta(lfs);
// xor our old cksum
lfs->gcksum ^= mdir->rbyd.cksum;
// create a copy
lfsr_mdir_t mdir_[2];
mdir_[0] = *mdir;
@@ -8218,7 +8339,7 @@ static int lfsr_mdir_commit(lfs_t *lfs, lfsr_mdir_t *mdir,
// attempt to commit/compact the mdir normally
lfsr_srid_t split_rid;
int err = lfsr_mdir_commit_(lfs, &mdir_[0], -1, -1, &split_rid,
int err = lfsr_mdir_commit_(lfs, &mdir_[0], -2, -1, &split_rid,
mdir->mid, rats, rat_count);
if (err && err != LFS_ERR_RANGE
&& err != LFS_ERR_NOENT) {
@@ -8229,6 +8350,7 @@ static int lfsr_mdir_commit(lfs_t *lfs, lfsr_mdir_t *mdir,
lfsr_mdir_t mroot_ = lfs->mroot;
if (!err && lfsr_mdir_cmp(mdir, &lfs->mroot) == 0) {
mroot_.rbyd = mdir_[0].rbyd;
mroot_.gcksumdelta = mdir_[0].gcksumdelta;
}
// handle possible mtree updates, this gets a bit messy
@@ -8328,6 +8450,7 @@ static int lfsr_mdir_commit(lfs_t *lfs, lfsr_mdir_t *mdir,
mdir_[0].mid >> lfs->mdir_bits,
mdir_[0].rbyd.blocks[0], mdir_[0].rbyd.blocks[1]);
mdir_[0].rbyd = mdir_[1].rbyd;
mdir_[0].gcksumdelta = mdir_[1].gcksumdelta;
goto relocated;
// other sibling reduced to zero
@@ -8509,6 +8632,18 @@ static int lfsr_mdir_commit(lfs_t *lfs, lfsr_mdir_t *mdir,
// mtree should never go to zero since we always have a root bookmark
LFS_ASSERT(lfsr_mtree_weight_(&mtree_) > 0);
// make sure mtree/mroot changes are on-disk before committing
// metadata
err = lfsr_bd_sync(lfs);
if (err) {
goto failed;
}
// xor mroot's cksum if we haven't already
if (lfsr_mdir_cmp(mdir, &lfs->mroot) != 0) {
lfs->gcksum ^= lfs->mroot.rbyd.cksum;
}
// mark any copies of our mroot as unerased
lfs->mroot.rbyd.eoff = -1;
for (lfsr_omdir_t *o = lfs->omdirs; o; o = o->next) {
@@ -8517,19 +8652,12 @@ static int lfsr_mdir_commit(lfs_t *lfs, lfsr_mdir_t *mdir,
}
}
// make sure mtree/mroot changes are on-disk before committing
// metadata
err = lfsr_bd_sync(lfs);
if (err) {
goto failed;
}
// commit new mtree into our mroot
//
// note end_rid=0 here will delete any files leftover from a split
// in our mroot
uint8_t mtree_buf[LFS_MAX(LFSR_MPTR_DSIZE, LFSR_BTREE_DSIZE)];
err = lfsr_mdir_commit_(lfs, &mroot_, -1, 0, NULL,
err = lfsr_mdir_commit_(lfs, &mroot_, -2, 0, NULL,
-1, LFSR_RATS(
(lfsr_mtree_ismptr(&mtree_))
? LFSR_RAT(
@@ -8580,9 +8708,12 @@ static int lfsr_mdir_commit(lfs_t *lfs, lfsr_mdir_t *mdir,
goto failed;
}
// xor mrootchild's cksum
lfs->gcksum ^= mrootparent_.rbyd.cksum;
// commit mrootchild
uint8_t mrootchild_buf[LFSR_MPTR_DSIZE];
err = lfsr_mdir_commit_(lfs, &mrootparent_, -1, -1, NULL,
err = lfsr_mdir_commit_(lfs, &mrootparent_, -2, -1, NULL,
-1, LFSR_RATS(
LFSR_RAT(
LFSR_TAG_MROOT, 0,
@@ -8630,7 +8761,7 @@ static int lfsr_mdir_commit(lfs_t *lfs, lfsr_mdir_t *mdir,
}
uint8_t mrootchild_buf[LFSR_MPTR_DSIZE];
err = lfsr_mdir_commit__(lfs, &mrootanchor_, -1, -1,
err = lfsr_mdir_commit__(lfs, &mrootanchor_, -2, -1,
-1, LFSR_RATS(
LFSR_RAT(
LFSR_TAG_MAGIC, 0,
@@ -8656,6 +8787,7 @@ static int lfsr_mdir_commit(lfs_t *lfs, lfsr_mdir_t *mdir,
}
// gstate must have been committed by a lower-level function at this point
LFS_ASSERT(lfs->gcksum_d == 0);
LFS_ASSERT(lfsr_gdelta_iszero(lfs->grm_d, LFSR_GRM_DSIZE));
// sync on-disk state
@@ -8745,8 +8877,10 @@ static int lfsr_mdir_commit(lfs_t *lfs, lfsr_mdir_t *mdir,
>= (lfsr_srid_t)mdir_[0].rbyd.weight) {
o->mdir.mid += (1 << lfs->mdir_bits) - mdir_[0].rbyd.weight;
o->mdir.rbyd = mdir_[1].rbyd;
o->mdir.gcksumdelta = mdir_[1].gcksumdelta;
} else {
o->mdir.rbyd = mdir_[0].rbyd;
o->mdir.gcksumdelta = mdir_[0].gcksumdelta;
}
} else if (o->mdir.mid > mdir->mid) {
o->mdir.mid += mdelta;
@@ -8757,13 +8891,16 @@ static int lfsr_mdir_commit(lfs_t *lfs, lfsr_mdir_t *mdir,
if (mdelta > 0
&& mdir->mid == -1) {
mdir->rbyd = mroot_.rbyd;
mdir->gcksumdelta = mroot_.gcksumdelta;
} else if (mdelta > 0
&& lfsr_mid_rid(lfs, mdir->mid)
>= (lfsr_srid_t)mdir_[0].rbyd.weight) {
mdir->mid += (1 << lfs->mdir_bits) - mdir_[0].rbyd.weight;
mdir->rbyd = mdir_[1].rbyd;
mdir->gcksumdelta = mdir_[1].gcksumdelta;
} else {
mdir->rbyd = mdir_[0].rbyd;
mdir->gcksumdelta = mdir_[0].gcksumdelta;
}
// update mroot and mtree
@@ -13331,6 +13468,12 @@ static int lfs_init(lfs_t *lfs, uint32_t flags,
lfs->omdirs = NULL;
// zero gstate
lfs->gcksum = 0;
lfs->gcksum_p = 0;
lfs->gcksum_d = 0;
lfs->grm.mids[0] = -1;
lfs->grm.mids[1] = -1;
lfs_memset(lfs->grm_p, 0, LFSR_GRM_DSIZE);
lfs_memset(lfs->grm_d, 0, LFSR_GRM_DSIZE);
@@ -13796,6 +13939,9 @@ static int lfsr_mountinited(lfs_t *lfs) {
// numbers
lfs->seed ^= mdir->rbyd.cksum;
// build gcksum out of mdir cksums
lfs->gcksum_p ^= mdir->rbyd.cksum;
// collect any gdeltas from this mdir
err = lfsr_fs_consumegdelta(lfs, mdir);
if (err) {
@@ -13815,6 +13961,42 @@ static int lfsr_mountinited(lfs_t *lfs) {
}
}
// keep track of the current gcksum
lfs->gcksum = lfs->gcksum_p;
// validate gcksum by comparing its cube against the gcksumdeltas
//
// The use of cksum^3 here is important to avoid trivial
// gcksumdeltas. If we use a linear function (cksum, crc32c(cksum),
// cksum^2, etc), the state of the filesystem cancels out when
// calculating a new gcksumdelta:
//
// d_i = t(g') - t(g)
// d_i = t(g + c_i) - t(g)
// d_i = t(g) + t(c_i) - t(g)
// d_i = t(c_i)
//
// Using cksum^3 prevents this from happening:
//
// d_i = (g + c_i)^3 - g^3
// d_i = (g + c_i)(g + c_i)(g + c_i) - g^3
// d_i = (g^2 + gc_i + gc_i + c_i^2)(g + c_i) - g^3
// d_i = (g^2 + c_i^2)(g + c_i) - g^3
// d_i = g^3 + gc_i^2 + g^2c_i + c_i^3 - g^3
// d_i = gc_i^2 + g^2c_i + c_i^3
//
// cksum^3 also has some other nice properties, providing a perfect
// 1->1 mapping of t(g) in 2^31 fields, and losing at most 3-bits of
// info when calculating d_i.
//
if (lfsr_gcksum_cube(lfs->gcksum) != lfs->gcksum_d) {
LFS_ERROR("Found gcksum mismatch, cksum^3 %08"PRIx32" "
"(!= %08"PRIx32")",
lfsr_gcksum_cube(lfs->gcksum),
lfs->gcksum_d);
return LFS_ERR_CORRUPT;
}
// once we've mounted and derived a pseudo-random seed, initialize our
// block allocator
//
@@ -13924,7 +14106,8 @@ int lfsr_mount(lfs_t *lfs, uint32_t flags,
// TODO this should use any configured values
LFS_DEBUG("Mounted littlefs v%"PRId32".%"PRId32" %"PRId32"x%"PRId32" "
"0x{%"PRIx32",%"PRIx32"}.%"PRIx32" w%"PRId32".%"PRId32,
"0x{%"PRIx32",%"PRIx32"}.%"PRIx32" w%"PRId32".%"PRId32", "
"cksum %08"PRIx32,
LFS_DISK_VERSION_MAJOR,
LFS_DISK_VERSION_MINOR,
lfs->cfg->block_size,
@@ -13933,7 +14116,8 @@ int lfsr_mount(lfs_t *lfs, uint32_t flags,
lfs->mroot.rbyd.blocks[1],
lfsr_rbyd_trunk(&lfs->mroot.rbyd),
lfsr_mtree_weight_(&lfs->mtree) >> lfs->mdir_bits,
1 << lfs->mdir_bits);
1 << lfs->mdir_bits,
lfs->gcksum);
return 0;
@@ -13991,7 +14175,7 @@ static int lfsr_formatinited(lfs_t *lfs) {
uint8_t name_limit_buf[LFSR_LLEB128_DSIZE];
uint8_t file_limit_buf[LFSR_LEB128_DSIZE];
uint8_t bookmark_buf[LFSR_LEB128_DSIZE];
err = lfsr_rbyd_commit(lfs, &rbyd, -1, LFSR_RATS(
err = lfsr_rbyd_appendrats(lfs, &rbyd, -1, -1, -1, LFSR_RATS(
LFSR_RAT(
LFSR_TAG_MAGIC, 0,
LFSR_DATA_BUF("littlefs", 8)),
@@ -14025,6 +14209,13 @@ static int lfsr_formatinited(lfs_t *lfs) {
if (err) {
return err;
}
// prepare initial gcksum and commit
lfs->gcksum = rbyd.cksum;
err = lfsr_rbyd_appendcksum_(lfs, &rbyd, &(uint32_t){0});
if (err) {
return err;
}
}
// sync on-disk state

5
lfs.h

@@ -611,6 +611,7 @@ typedef struct {
typedef struct lfsr_mdir {
lfsr_smid_t mid;
lfsr_rbyd_t rbyd;
uint32_t gcksumdelta;
} lfsr_mdir_t;
typedef struct lfsr_omdir {
@@ -874,6 +875,10 @@ typedef struct lfs {
uint8_t *buffer;
} lookahead;
uint32_t gcksum;
uint32_t gcksum_p;
uint32_t gcksum_d;
lfsr_grm_t grm;
uint8_t grm_p[LFSR_GRM_DSIZE];
uint8_t grm_d[LFSR_GRM_DSIZE];

View File

@@ -76,37 +76,9 @@ ssize_t lfs_fromleb128(uint32_t *word, const void *buffer, size_t size) {
// return crc;
//}
// Calculate crc32c incrementally
uint32_t lfs_crc32c(uint32_t crc, const void *buffer, size_t size) {
// init with 0xffffffff so prefixed zeros affect the crc
const uint8_t *data = buffer;
crc ^= 0xffffffff;
// A couple crc32c implementations to choose from.
//
// The default, "small-table" implementation offers a decent performance
// without much additional code-size, reasonable for microcontrollers. For
// anything larger where you really don't care about an extra 1KiB of code
// the "big-table" implementation is probably better.
//
// Some quick measurements with GCC 11 using -Os -mcpu=cortex-m55, with
// instruction counts from QEMU and an input size of 4KiB. Note these are
// not cycle-accurate:
//
// code stack ins ld/st branch
// naive 48 12 221192 4099 36865
// small-table 124 12 49160 12291 4097
// big-table 1064 8 32776 8195 4097
//
#if defined(LFS_SMALLER_CRC32C)
for (size_t i = 0; i < size; i++) {
crc = crc ^ data[i];
for (size_t j = 0; j < 8; j++) {
crc = (crc >> 1) ^ ((crc & 1) ? 0x82f63b78 : 0);
}
}
#elif !defined(LFS_FASTER_CRC32C)
// crc32c tables (see lfs_crc32c for more info)
#if !defined(LFS_FASTER_CRC32C)
static const uint32_t lfs_crc32c_table[16] = {
0x00000000, 0x105ec76f, 0x20bd8ede, 0x30e349b1,
0x417b1dbc, 0x5125dad3, 0x61c69362, 0x7198540d,
@@ -114,11 +86,6 @@ uint32_t lfs_crc32c(uint32_t crc, const void *buffer, size_t size) {
0xc38d26c4, 0xd3d3e1ab, 0xe330a81a, 0xf36e6f75,
};
for (size_t i = 0; i < size; i++) {
crc = (crc >> 4) ^ lfs_crc32c_table[0xf & (crc ^ (data[i] >> 0))];
crc = (crc >> 4) ^ lfs_crc32c_table[0xf & (crc ^ (data[i] >> 4))];
}
#else
static const uint32_t lfs_crc32c_table[256] = {
0x00000000, 0xf26b8303, 0xe13b70f7, 0x1350f3f4,
@@ -186,7 +153,46 @@ uint32_t lfs_crc32c(uint32_t crc, const void *buffer, size_t size) {
0x79b737ba, 0x8bdcb4b9, 0x988c474d, 0x6ae7c44e,
0xbe2da0a5, 0x4c4623a6, 0x5f16d052, 0xad7d5351,
};
#endif
// Calculate crc32c incrementally
uint32_t lfs_crc32c(uint32_t crc, const void *buffer, size_t size) {
// init with 0xffffffff so prefixed zeros affect the crc
const uint8_t *data = buffer;
crc ^= 0xffffffff;
// A couple crc32c implementations to choose from.
//
// The default, "small-table" implementation offers a decent performance
// without much additional code-size, reasonable for microcontrollers. For
// anything larger where you really don't care about an extra 1KiB of code
// the "big-table" implementation is probably better.
//
// Some quick measurements with GCC 11 using -Os -mcpu=cortex-m55, with
// instruction counts from QEMU and an input size of 4KiB. Note these are
// not cycle-accurate:
//
// code stack ins ld/st branch
// naive 48 12 221192 4099 36865
// small-table 124 12 49160 12291 4097
// big-table 1064 8 32776 8195 4097
//
#if defined(LFS_SMALLER_CRC32C)
for (size_t i = 0; i < size; i++) {
crc = crc ^ data[i];
for (size_t j = 0; j < 8; j++) {
crc = (crc >> 1) ^ ((crc & 1) ? 0x82f63b78 : 0);
}
}
#elif !defined(LFS_FASTER_CRC32C)
for (size_t i = 0; i < size; i++) {
crc = (crc >> 4) ^ lfs_crc32c_table[0xf & (crc ^ (data[i] >> 0))];
crc = (crc >> 4) ^ lfs_crc32c_table[0xf & (crc ^ (data[i] >> 4))];
}
#else
for (size_t i = 0; i < size; i++) {
crc = (crc >> 8) ^ lfs_crc32c_table[0xff & (crc ^ data[i])];
}
@@ -197,4 +203,42 @@ uint32_t lfs_crc32c(uint32_t crc, const void *buffer, size_t size) {
return crc;
}
// Multiply two crc32cs in the crc32c ring
uint32_t lfs_crc32c_mul(uint32_t a, uint32_t b) {
    // Multiplication in a crc32c ring involves polynomial
    // multiplication modulo the crc32c polynomial to keep things
    // finite:
    //
    //   r = a * b mod P
    //
    // Note because the crc32c polynomial is not irreducible, this does
    // not give us a finite field, i.e. division is undefined. Still,
    // multiplication has useful properties.

    // This gets a bit funky because crc32cs are little-endian, but
    // fortunately pmul is symmetric. Unfortunately the product of two
    // 32-bit polynomials is only 63 bits wide, so we need to shift by 1
    // to align it in 64 bits.
    uint64_t r = lfs_pmul(a, b) << 1;

    // We can accelerate our modulo with crc32c tables if present, these
    // loops may look familiar.
    #if defined(LFS_SMALLER_CRC32C)
    for (int i = 0; i < 32; i++) {
        r = (r >> 1) ^ ((r & 1) ? 0x82f63b78 : 0);
    }
    #elif !defined(LFS_FASTER_CRC32C)
    for (int i = 0; i < 8; i++) {
        r = (r >> 4) ^ lfs_crc32c_table[0xf & r];
    }
    #else
    for (int i = 0; i < 4; i++) {
        r = (r >> 8) ^ lfs_crc32c_table[0xff & r];
    }
    #endif

    return (uint32_t)r;
}
#endif
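
The three loops in lfs_crc32c should all compute the same crc32c, trading code size for speed. A rough Python sketch of that equivalence follows. This is a port for illustration only, not the C code: make_table is a hypothetical helper that derives both tables from the 0x82f63b78 polynomial instead of hardcoding them, and the function names are made up here.

```python
POLY = 0x82f63b78  # bit-reflected crc32c polynomial, as used in the loops above


def make_table(bits):
    # Hypothetical helper: entry i is i pushed through `bits` single-bit
    # reduction steps, so bits=4 gives 16 entries and bits=8 gives 256.
    table = []
    for i in range(1 << bits):
        r = i
        for _ in range(bits):
            r = (r >> 1) ^ (POLY if r & 1 else 0)
        table.append(r)
    return table


SMALL_TABLE = make_table(4)
BIG_TABLE = make_table(8)


def crc32c_naive(data, crc=0):
    # one reduction step per bit
    crc ^= 0xffffffff
    for b in data:
        crc ^= b
        for _ in range(8):
            crc = (crc >> 1) ^ (POLY if crc & 1 else 0)
    return crc ^ 0xffffffff


def crc32c_small(data, crc=0):
    # one table lookup per nibble, low nibble first
    crc ^= 0xffffffff
    for b in data:
        crc = (crc >> 4) ^ SMALL_TABLE[0xf & (crc ^ (b >> 0))]
        crc = (crc >> 4) ^ SMALL_TABLE[0xf & (crc ^ (b >> 4))]
    return crc ^ 0xffffffff


def crc32c_big(data, crc=0):
    # one table lookup per byte
    crc ^= 0xffffffff
    for b in data:
        crc = (crc >> 8) ^ BIG_TABLE[0xff & (crc ^ b)]
    return crc ^ 0xffffffff


if __name__ == "__main__":
    msg = b"123456789"
    print("%08x %08x %08x" % (
        crc32c_naive(msg), crc32c_small(msg), crc32c_big(msg)))
```

All three should agree on any input. For the standard check string "123456789" the crc32c check value is e3069283, and the generated tables should reproduce the constants visible in the diff (for example 0xf26b8303 as the second entry of the 256-entry table).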


@@ -281,6 +281,25 @@ static inline int lfs_scmp(uint32_t a, uint32_t b) {
return (int)(unsigned)(a - b);
}
// Perform polynomial/carry-less multiplication
//
// This is a multiply where all adds are replaced with xors. If we view
// a and b as binary polynomials, xor is polynomial addition and pmul is
// polynomial multiplication.
static inline uint64_t lfs_pmul(uint32_t a, uint32_t b) {
    uint64_t r = 0;
    uint64_t a_ = a;
    while (b) {
        if (b & 1) {
            r ^= a_;
        }
        a_ <<= 1;
        b >>= 1;
    }
    return r;
}
// Convert between 32-bit little-endian and native order
static inline uint32_t lfs_fromle32(uint32_t a) {
#if !defined(LFS_NO_BUILTINS) && defined(LFS_LITTLE_ENDIAN)
@@ -603,6 +622,9 @@ static inline size_t lfs_strcspn(const char *a, const char *cs) {
//
uint32_t lfs_crc32c(uint32_t crc, const void *buffer, size_t size);
// Multiply two crc32cs in the crc32c ring
uint32_t lfs_crc32c_mul(uint32_t a, uint32_t b);
// Allocate memory, only used if buffers are not provided to littlefs
// Note, memory must be 64-bit aligned
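
To make the "shift by 1" in lfs_crc32c_mul a bit more concrete, here is a small Python sketch of pmul and both forms of the reduction: 31 single-bit steps on the raw 63-bit product (as in the Python scripts in this commit), and 32 steps on the product shifted left by 1 (as in the C code). The names crc32c_mul_shifted and ONE are made up for this sketch. ONE = 0x80000000 is the polynomial "1" in the bit-reflected representation, which is an observation from the math, not something the commit names.

```python
POLY = 0x82f63b78  # bit-reflected crc32c polynomial


def pmul(a, b):
    # carry-less multiply: shift-and-xor instead of shift-and-add
    r = 0
    while b:
        if b & 1:
            r ^= a
        a <<= 1
        b >>= 1
    return r


def crc32c_mul(a, b):
    # script form: 31 reduction steps on the 63-bit product
    r = pmul(a, b)
    for _ in range(31):
        r = (r >> 1) ^ (POLY if r & 1 else 0)
    return r


def crc32c_mul_shifted(a, b):
    # C form: align the product in 64 bits, then 32 reduction steps.
    # The first step always sees a zero low bit, so it only undoes the
    # shift, which is why the two forms agree.
    r = pmul(a, b) << 1
    for _ in range(32):
        r = (r >> 1) ^ (POLY if r & 1 else 0)
    return r & 0xffffffff


# the polynomial "1" in bit-reflected form (assumption of this sketch)
ONE = 0x80000000

if __name__ == "__main__":
    # (x+1)*(x+1) = x^2+1 as plain polynomials, vs 3*3 = 9 as integers
    print(bin(pmul(0b11, 0b11)))
    x, y = 0xdeadbeef, 0x12345678
    print("%08x %08x" % (crc32c_mul(x, y), crc32c_mul_shifted(x, y)))
    print("%08x" % crc32c_mul(x, ONE))
```

Since a run of 32 single-bit steps is exactly what the 4-bit and 8-bit table lookups batch together, the shifted form is the one that can reuse the crc32c tables.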


@@ -50,10 +50,11 @@ TAG_B = 0x0000
TAG_R = 0x2000
TAG_LE = 0x0000
TAG_GT = 0x1000
TAG_CKSUM = 0x3000 ## 0x3c0p v-11 cccc ---- ---p
TAG_CKSUM = 0x3000 ## 0x300p v-11 ---- ---- ---p
TAG_P = 0x0001
TAG_NOTE = 0x3100 # 0x3100 v-11 ---1 ---- ----
TAG_ECKSUM = 0x3200 # 0x3200 v-11 --1- ---- ----
TAG_NOTE = 0x3100 ## 0x3100 v-11 ---1 ---- ----
TAG_ECKSUM = 0x3200 ## 0x3200 v-11 --1- ---- ----
TAG_GCKSUMDELTA = 0x3300 ## 0x3300 v-11 --11 ---- ----
CHARS = 'mbd-'
@@ -594,7 +595,8 @@ class Bmap:
# our core rbyd type
class Rbyd:
def __init__(self, blocks, data, rev, eoff, trunk, weight, cksum):
def __init__(self, blocks, data, rev, eoff, trunk, weight, cksum,
gcksumdelta):
if isinstance(blocks, int):
blocks = (blocks,)
@@ -605,6 +607,7 @@ class Rbyd:
self.trunk = trunk
self.weight = weight
self.cksum = cksum
self.gcksumdelta = gcksumdelta
@property
def block(self):
@@ -680,6 +683,8 @@ class Rbyd:
weight = 0
weight_ = 0
weight__ = 0
gcksumdelta = None
gcksumdelta_ = None
while j_ < len(data) and (not trunk or eoff <= trunk):
# read next tag
v, tag, w, size, d = fromtag(data[j_:])
@@ -695,6 +700,11 @@ class Rbyd:
if not tag & TAG_ALT:
if (tag & 0xff00) != TAG_CKSUM:
cksum___ = crc32c(data[j_:j_+size], cksum___)
# found a gcksumdelta?
if (tag & 0xff00) == TAG_GCKSUMDELTA:
gcksumdelta_ = (tag, w, j_-d, d, data[j_:j_+size])
# found a cksum?
else:
# check cksum
@@ -706,6 +716,8 @@ class Rbyd:
cksum_ = cksum__
trunk_ = trunk__
weight = weight_
gcksumdelta = gcksumdelta_
gcksumdelta_ = None
# update perturb bit
perturb = tag & TAG_P
# revert to data cksum and perturb
@@ -737,6 +749,7 @@ class Rbyd:
0xfca42daf if perturb else 0)
trunk_ = trunk__
weight = weight_
gcksumdelta = gcksumdelta_
trunk___ = 0
# update canonical checksum, xoring out any perturb state
@@ -747,9 +760,9 @@ class Rbyd:
# cksum mismatch?
if cksum is not None and cksum_ != cksum:
return cls(block, data, rev, 0, 0, 0, cksum_)
return cls(block, data, rev, 0, 0, 0, cksum_, gcksumdelta)
return cls(block, data, rev, eoff, trunk_, weight, cksum_)
return cls(block, data, rev, eoff, trunk_, weight, cksum_, gcksumdelta)
def lookup(self, rid, tag):
if not self:


@@ -48,10 +48,11 @@ TAG_B = 0x0000
TAG_R = 0x2000
TAG_LE = 0x0000
TAG_GT = 0x1000
TAG_CKSUM = 0x3000 ## 0x3c0p v-11 cccc ---- ---p
TAG_CKSUM = 0x3000 ## 0x300p v-11 ---- ---- ---p
TAG_P = 0x0001
TAG_NOTE = 0x3100 # 0x3100 v-11 ---1 ---- ----
TAG_ECKSUM = 0x3200 # 0x3200 v-11 --1- ---- ----
TAG_NOTE = 0x3100 ## 0x3100 v-11 ---1 ---- ----
TAG_ECKSUM = 0x3200 ## 0x3200 v-11 --1- ---- ----
TAG_GCKSUMDELTA = 0x3300 ## 0x3300 v-11 --11 ---- ----
# some ways of block geometry representations
@@ -253,6 +254,11 @@ def tagrepr(tag, w=None, size=None, off=None):
' 0x%02x' % (tag & 0xff) if tag & 0xff else '',
' w%d' % w if w else '',
' %s' % size if size is not None else '')
elif (tag & 0x7f00) == TAG_GCKSUMDELTA:
return 'gcksumdelta%s%s%s' % (
' 0x%02x' % (tag & 0xff) if tag & 0xff else '',
' w%d' % w if w else '',
' %s' % size if size is not None else '')
else:
return '0x%04x%s%s' % (
tag,
@@ -265,7 +271,8 @@ TBranch = co.namedtuple('TBranch', 'a, b, d, c')
# our core rbyd type
class Rbyd:
def __init__(self, blocks, data, rev, eoff, trunk, weight, cksum):
def __init__(self, blocks, data, rev, eoff, trunk, weight, cksum,
gcksumdelta):
if isinstance(blocks, int):
blocks = (blocks,)
@@ -276,6 +283,7 @@ class Rbyd:
self.trunk = trunk
self.weight = weight
self.cksum = cksum
self.gcksumdelta = gcksumdelta
@property
def block(self):
@@ -351,6 +359,8 @@ class Rbyd:
weight = 0
weight_ = 0
weight__ = 0
gcksumdelta = None
gcksumdelta_ = None
while j_ < len(data) and (not trunk or eoff <= trunk):
# read next tag
v, tag, w, size, d = fromtag(data[j_:])
@@ -366,6 +376,11 @@ class Rbyd:
if not tag & TAG_ALT:
if (tag & 0xff00) != TAG_CKSUM:
cksum___ = crc32c(data[j_:j_+size], cksum___)
# found a gcksumdelta?
if (tag & 0xff00) == TAG_GCKSUMDELTA:
gcksumdelta_ = (tag, w, j_-d, d, data[j_:j_+size])
# found a cksum?
else:
# check cksum
@@ -377,6 +392,8 @@ class Rbyd:
cksum_ = cksum__
trunk_ = trunk__
weight = weight_
gcksumdelta = gcksumdelta_
gcksumdelta_ = None
# update perturb bit
perturb = tag & TAG_P
# revert to data cksum and perturb
@@ -408,6 +425,7 @@ class Rbyd:
0xfca42daf if perturb else 0)
trunk_ = trunk__
weight = weight_
gcksumdelta = gcksumdelta_
trunk___ = 0
# update canonical checksum, xoring out any perturb state
@@ -418,9 +436,9 @@ class Rbyd:
# cksum mismatch?
if cksum is not None and cksum_ != cksum:
return cls(block, data, rev, 0, 0, 0, cksum_)
return cls(block, data, rev, 0, 0, 0, cksum_, gcksumdelta)
return cls(block, data, rev, eoff, trunk_, weight, cksum_)
return cls(block, data, rev, eoff, trunk_, weight, cksum_, gcksumdelta)
def lookup(self, rid, tag):
if not self:


@@ -49,10 +49,11 @@ TAG_B = 0x0000
TAG_R = 0x2000
TAG_LE = 0x0000
TAG_GT = 0x1000
TAG_CKSUM = 0x3000 ## 0x3c0p v-11 cccc ---- ---p
TAG_CKSUM = 0x3000 ## 0x300p v-11 ---- ---- ---p
TAG_P = 0x0001
TAG_NOTE = 0x3100 # 0x3100 v-11 ---1 ---- ----
TAG_ECKSUM = 0x3200 # 0x3200 v-11 --1- ---- ----
TAG_NOTE = 0x3100 ## 0x3100 v-11 ---1 ---- ----
TAG_ECKSUM = 0x3200 ## 0x3200 v-11 --1- ---- ----
TAG_GCKSUMDELTA = 0x3300 ## 0x3300 v-11 --11 ---- ----
# some ways of block geometry representations
@@ -123,6 +124,21 @@ def crc32c(data, crc=0):
crc = (crc >> 1) ^ ((crc & 1) * 0x82f63b78)
return 0xffffffff ^ crc
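# polynomial/carry-less multiplication, all adds are xors
def pmul(a, b):
    r = 0
    while b:
        if b & 1:
            r ^= a
        a <<= 1
        b >>= 1
    return r

# multiply two crc32cs modulo the crc32c polynomial
def crc32cmul(a, b):
    r = pmul(a, b)
    # the product is 63 bits wide, so 31 reduction steps bring it
    # back down to 32 bits
    for _ in range(31):
        r = (r >> 1) ^ ((r & 1) * 0x82f63b78)
    return r

def popc(x):
    return bin(x).count('1')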
@@ -284,6 +300,11 @@ def tagrepr(tag, w=None, size=None, off=None):
' 0x%02x' % (tag & 0xff) if tag & 0xff else '',
' w%d' % w if w else '',
' %s' % size if size is not None else '')
elif (tag & 0x7f00) == TAG_GCKSUMDELTA:
return 'gcksumdelta%s%s%s' % (
' 0x%02x' % (tag & 0xff) if tag & 0xff else '',
' w%d' % w if w else '',
' %s' % size if size is not None else '')
else:
return '0x%04x%s%s' % (
tag,
@@ -296,7 +317,8 @@ TBranch = co.namedtuple('TBranch', 'a, b, d, c')
# our core rbyd type
class Rbyd:
def __init__(self, blocks, data, rev, eoff, trunk, weight, cksum):
def __init__(self, blocks, data, rev, eoff, trunk, weight, cksum,
gcksumdelta=None):
if isinstance(blocks, int):
blocks = (blocks,)
@@ -307,6 +329,7 @@ class Rbyd:
self.trunk = trunk
self.weight = weight
self.cksum = cksum
self.gcksumdelta = gcksumdelta
@property
def block(self):
@@ -382,6 +405,8 @@ class Rbyd:
weight = 0
weight_ = 0
weight__ = 0
gcksumdelta = None
gcksumdelta_ = None
while j_ < len(data) and (not trunk or eoff <= trunk):
# read next tag
v, tag, w, size, d = fromtag(data[j_:])
@@ -397,6 +422,11 @@ class Rbyd:
if not tag & TAG_ALT:
if (tag & 0xff00) != TAG_CKSUM:
cksum___ = crc32c(data[j_:j_+size], cksum___)
# found a gcksumdelta?
if (tag & 0xff00) == TAG_GCKSUMDELTA:
gcksumdelta_ = (tag, w, j_-d, d, data[j_:j_+size])
# found a cksum?
else:
# check cksum
@@ -408,6 +438,8 @@ class Rbyd:
cksum_ = cksum__
trunk_ = trunk__
weight = weight_
gcksumdelta = gcksumdelta_
gcksumdelta_ = None
# update perturb bit
perturb = tag & TAG_P
# revert to data cksum and perturb
@@ -439,6 +471,7 @@ class Rbyd:
0xfca42daf if perturb else 0)
trunk_ = trunk__
weight = weight_
gcksumdelta = gcksumdelta_
trunk___ = 0
# update canonical checksum, xoring out any perturb state
@@ -449,9 +482,9 @@ class Rbyd:
# cksum mismatch?
if cksum is not None and cksum_ != cksum:
return cls(block, data, rev, 0, 0, 0, cksum_)
return cls(block, data, rev, 0, 0, 0, cksum_, gcksumdelta)
return cls(block, data, rev, eoff, trunk_, weight, cksum_)
return cls(block, data, rev, eoff, trunk_, weight, cksum_, gcksumdelta)
def lookup(self, rid, tag):
if not self:
@@ -922,8 +955,11 @@ class Rbyd:
# have mdir?
done, rid, tag, w, j, _, data, _ = self.lookup(-1, TAG_MDIR)
if not done and rid == -1 and tag == TAG_MDIR:
if mbid == 0:
blocks = frommdir(data)
return False, 0, 0, Rbyd.fetch(f, block_size, blocks)
else:
return True, 0, 0, None
else:
# I guess we're inlined?
@@ -1192,15 +1228,11 @@ class GState:
def __init__(self, mleaf_weight):
self.gstate = {}
self.gdelta = {}
self.gcksum = 0
self.mleaf_weight = mleaf_weight
def xor(self, mbid, mw, mdir):
tag = TAG_GDELTA-0x1
while True:
done, rid, tag, w, j, d, data, _ = mdir.lookup(-1, tag+0x1)
if done or rid != -1 or (tag & 0xff00) != TAG_GDELTA:
break
def gxor(rid, tag, w, j, d, data):
# keep track of gdeltas
if tag not in self.gdelta:
self.gdelta[tag] = []
@@ -1213,7 +1245,35 @@ class GState:
a^b for a,b in it.zip_longest(
self.gstate[tag], data, fillvalue=0))
# gcksum deltas are a bit of a special case
self.gcksum ^= mdir.cksum
if mdir.gcksumdelta is not None:
tag, w, j, d, data = mdir.gcksumdelta
gxor(-1, tag, w, j, d, data)
# other gstate deltas
tag = TAG_GDELTA-0x1
while True:
done, rid, tag, w, j, d, data, _ = mdir.lookup(-1, tag+0x1)
if done or rid != -1 or (tag & 0xff00) != TAG_GDELTA:
break
gxor(rid, tag, w, j, d, data)
# parsers for some gstate
@ft.cached_property
def gcksum_(self):
# cubed gcksum
return crc32cmul(crc32cmul(self.gcksum, self.gcksum), self.gcksum)
@ft.cached_property
def gcksum__(self):
# gcksumdelta based cubed gcksum
if TAG_GCKSUMDELTA not in self.gstate:
return 0
return fromle32(self.gstate[TAG_GCKSUMDELTA])
@ft.cached_property
def grm(self):
if TAG_GRMDELTA not in self.gstate:
@@ -1233,7 +1293,10 @@ class GState:
def repr(self):
def grepr(tag, data):
if tag == TAG_GRMDELTA:
if tag == TAG_GCKSUMDELTA:
gcksum = fromle32(data)
return 'gcksum %08x' % gcksum
elif tag == TAG_GRMDELTA:
count, _ = fromleb128(data)
return 'grm %s' % (
'none' if count == 0
@@ -1826,7 +1889,7 @@ def main(disk, mroots=None, *,
corrupted = True
else:
rweight = max(rweight, mdir.weight)
gstate.xor(0, mdir)
gstate.xor(0, 0, mdir)
# find any dids
for rid, tag, w, j, d, data in mdir:
@@ -1908,14 +1971,14 @@ def main(disk, mroots=None, *,
if grmed_dir_dids != grmed_bookmark_dids:
corrupted = True
# are we going to end up rendering the dtree?
dtree = args.get('files') or not (
# are we going to end up rendering the ftree?
ftree = args.get('files') or not (
args.get('config') or args.get('gstate'))
# do a pass to find the width that fits file names+tree, this
# may not terminate! It's up to the user to use -Z in that case
f_width = 0
if dtree:
if ftree:
def rec_f_width(did, depth):
depth_ = 0
width_ = 0
@@ -1941,13 +2004,15 @@ def main(disk, mroots=None, *,
#### actual debugging begins here
# print some information about the filesystem
print('littlefs v%s.%s %dx%d %s w%d.%d, rev %08x' % (
print('littlefs v%s.%s %dx%d %s w%d.%d, rev %08x, cksum %08x%s' % (
config.version[0] if config.version[0] is not None else '?',
config.version[1] if config.version[1] is not None else '?',
(config.geometry[0] or 0), (config.geometry[1] or 0),
mroot.addr(),
bweight//mleaf_weight, 1*mleaf_weight,
mroot.rev))
mroot.rev,
gstate.gcksum,
'' if gstate.gcksum_ == gstate.gcksum__ else '!'))
# dynamically size the id field
w_width = max(
@@ -1982,14 +2047,24 @@ def main(disk, mroots=None, *,
# print gstate?
if args.get('gstate'):
for i, (repr_, tag, data) in enumerate(gstate.repr()):
print('%12s %*s %-*s %s' % (
# some special situations worth reporting
notes = []
# gcksum mismatch?
if (tag == TAG_GCKSUMDELTA
and gstate.gcksum_ != gstate.gcksum__):
notes.append('gcksum!=%08x' % gstate.gcksum_)
print('%s%12s %*s %-*s %s%s%s' % (
'\x1b[31m' if color and notes else '',
'gstate:' if i == 0 else '',
2*w_width+1, 'g' if i == 0 else '',
21+w_width, repr_,
next(xxd(data, 8), '')
if not args.get('raw')
and not args.get('no_truncate')
else ''))
else '',
' (%s)' % ', '.join(notes) if notes else '',
'\x1b[m' if color and notes else ''))
# show on-disk encoding
if args.get('raw') or args.get('no_truncate'):
@@ -2029,8 +2104,8 @@ def main(disk, mroots=None, *,
2*w_width+1, '',
line))
# print dtree?
if dtree:
# print ftree?
if ftree:
# only show mdir on change
pmbid = None
# recursively print directories
@@ -2091,7 +2166,7 @@ def main(disk, mroots=None, *,
if did_ not in grmed_dir_dids:
notes.append('orphaned')
# print human readable dtree entry
# print human readable ftree entry
print('%s%12s %*s %-*s %s%s%s' % (
'\x1b[31m' if color and not grmed and notes
else '\x1b[90m'
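
The check GState performs above boils down to: xor all mdir cksums into gcksum, cube it in the crc32c ring, and compare against the xor of all gcksumdeltas. Below is a toy Python model of that check. The verification side follows the script. The commit function is only a guess at how a delta might be updated (it assumes an mdir's cksum does not depend on its own delta), and the (cksum, gcksumdelta) pairs and cksum values are invented for the demo.

```python
import functools
import operator

POLY = 0x82f63b78  # bit-reflected crc32c polynomial


def crc32c_mul(a, b):
    # carry-less multiply, then 31 reduction steps (same as crc32cmul)
    r = 0
    while b:
        if b & 1:
            r ^= a
        a <<= 1
        b >>= 1
    for _ in range(31):
        r = (r >> 1) ^ (POLY if r & 1 else 0)
    return r


def cube(g):
    return crc32c_mul(crc32c_mul(g, g), g)


def xor_all(xs):
    return functools.reduce(operator.xor, xs, 0)


def gcksum_ok(mdirs):
    # mdirs is a list of (cksum, gcksumdelta) pairs, a toy stand-in
    # for fetched mdirs
    g = xor_all(c for c, _ in mdirs)
    d = xor_all(d for _, d in mdirs)
    return cube(g) == d


def commit(mdirs, i, new_cksum):
    # HYPOTHETICAL update rule: fold cube(g) ^ cube(g') into mdir i's
    # delta so the xor of all deltas tracks the cubed gcksum
    g = xor_all(c for c, _ in mdirs)
    old_cksum, old_delta = mdirs[i]
    g_ = g ^ old_cksum ^ new_cksum
    mdirs = list(mdirs)
    mdirs[i] = (new_cksum, old_delta ^ cube(g) ^ cube(g_))
    return mdirs


# made-up cksums, seeded so the invariant holds initially
cksums = [0x1234abcd, 0xdeadbeef, 0x0badf00d]
mdirs = [(c, 0) for c in cksums]
mdirs[0] = (cksums[0], cube(xor_all(cksums)))

snapshot = mdirs[1]                    # mdir 1 before any commits
mdirs = commit(mdirs, 1, 0x5eadbeef)   # update mdir 1
mdirs = commit(mdirs, 2, 0x4badf00d)   # then update mdir 2

rolled = list(mdirs)
rolled[1] = snapshot                   # mdir 1 "rolls back"

if __name__ == "__main__":
    print(gcksum_ok(mdirs), gcksum_ok(rolled))
```

In this model the consistent image passes and the image with a rolled-back mdir 1 fails, because mdir 2's delta was computed against a gcksum that included the newer mdir 1.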


@@ -48,10 +48,11 @@ TAG_B = 0x0000
TAG_R = 0x2000
TAG_LE = 0x0000
TAG_GT = 0x1000
TAG_CKSUM = 0x3000 ## 0x3c0p v-11 cccc ---- ---p
TAG_CKSUM = 0x3000 ## 0x300p v-11 ---- ---- ---p
TAG_P = 0x0001
TAG_NOTE = 0x3100 # 0x3100 v-11 ---1 ---- ----
TAG_ECKSUM = 0x3200 # 0x3200 v-11 --1- ---- ----
TAG_NOTE = 0x3100 ## 0x3100 v-11 ---1 ---- ----
TAG_ECKSUM = 0x3200 ## 0x3200 v-11 --1- ---- ----
TAG_GCKSUMDELTA = 0x3300 ## 0x3300 v-11 --11 ---- ----
# some ways of block geometry representations
@@ -268,6 +269,11 @@ def tagrepr(tag, w=None, size=None, off=None):
' 0x%02x' % (tag & 0xff) if tag & 0xff else '',
' w%d' % w if w else '',
' %s' % size if size is not None else '')
elif (tag & 0x7f00) == TAG_GCKSUMDELTA:
return 'gcksumdelta%s%s%s' % (
' 0x%02x' % (tag & 0xff) if tag & 0xff else '',
' w%d' % w if w else '',
' %s' % size if size is not None else '')
else:
return '0x%04x%s%s' % (
tag,
@@ -280,7 +286,8 @@ TBranch = co.namedtuple('TBranch', 'a, b, d, c')
# our core rbyd type
class Rbyd:
def __init__(self, blocks, data, rev, eoff, trunk, weight, cksum):
def __init__(self, blocks, data, rev, eoff, trunk, weight, cksum,
gcksumdelta):
if isinstance(blocks, int):
blocks = (blocks,)
@@ -291,6 +298,7 @@ class Rbyd:
self.trunk = trunk
self.weight = weight
self.cksum = cksum
self.gcksumdelta = gcksumdelta
@property
def block(self):
@@ -366,6 +374,8 @@ class Rbyd:
weight = 0
weight_ = 0
weight__ = 0
gcksumdelta = None
gcksumdelta_ = None
while j_ < len(data) and (not trunk or eoff <= trunk):
# read next tag
v, tag, w, size, d = fromtag(data[j_:])
@@ -381,6 +391,11 @@ class Rbyd:
if not tag & TAG_ALT:
if (tag & 0xff00) != TAG_CKSUM:
cksum___ = crc32c(data[j_:j_+size], cksum___)
# found a gcksumdelta?
if (tag & 0xff00) == TAG_GCKSUMDELTA:
gcksumdelta_ = (tag, w, j_-d, d, data[j_:j_+size])
# found a cksum?
else:
# check cksum
@@ -392,6 +407,8 @@ class Rbyd:
cksum_ = cksum__
trunk_ = trunk__
weight = weight_
gcksumdelta = gcksumdelta_
gcksumdelta_ = None
# update perturb bit
perturb = tag & TAG_P
# revert to data cksum and perturb
@@ -423,6 +440,7 @@ class Rbyd:
0xfca42daf if perturb else 0)
trunk_ = trunk__
weight = weight_
gcksumdelta = gcksumdelta_
trunk___ = 0
# update canonical checksum, xoring out any perturb state
@@ -433,9 +451,9 @@ class Rbyd:
# cksum mismatch?
if cksum is not None and cksum_ != cksum:
return cls(block, data, rev, 0, 0, 0, cksum_)
return cls(block, data, rev, 0, 0, 0, cksum_, gcksumdelta)
return cls(block, data, rev, eoff, trunk_, weight, cksum_)
return cls(block, data, rev, eoff, trunk_, weight, cksum_, gcksumdelta)
def lookup(self, rid, tag):
if not self:


@@ -58,10 +58,11 @@ TAG_B = 0x0000
TAG_R = 0x2000
TAG_LE = 0x0000
TAG_GT = 0x1000
TAG_CKSUM = 0x3000 ## 0x3c0p v-11 cccc ---- ---p
TAG_CKSUM = 0x3000 ## 0x300p v-11 ---- ---- ---p
TAG_P = 0x0001
TAG_NOTE = 0x3100 # 0x3100 v-11 ---1 ---- ----
TAG_ECKSUM = 0x3200 # 0x3200 v-11 --1- ---- ----
TAG_NOTE = 0x3100 ## 0x3100 v-11 ---1 ---- ----
TAG_ECKSUM = 0x3200 ## 0x3200 v-11 --1- ---- ----
TAG_GCKSUMDELTA = 0x3300 ## 0x3300 v-11 --11 ---- ----
# some ways of block geometry representations
@@ -256,6 +257,11 @@ def tagrepr(tag, w=None, size=None, off=None):
' 0x%02x' % (tag & 0xff) if tag & 0xff else '',
' w%d' % w if w else '',
' %s' % size if size is not None else '')
elif (tag & 0x7f00) == TAG_GCKSUMDELTA:
return 'gcksumdelta%s%s%s' % (
' 0x%02x' % (tag & 0xff) if tag & 0xff else '',
' w%d' % w if w else '',
' %s' % size if size is not None else '')
else:
return '0x%04x%s%s' % (
tag,
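
The new tagrepr branch matches on (tag & 0x7f00), which ignores both the top bit (the bit marked v in the layout comments) and the low byte. A minimal Python sketch of that matching for the cksum-family tags follows. cksum_tagname is a hypothetical helper, and the names other than 'gcksumdelta' are assumptions since their tagrepr branches are not shown in this hunk.

```python
TAG_CKSUM       = 0x3000
TAG_P           = 0x0001
TAG_NOTE        = 0x3100
TAG_ECKSUM      = 0x3200
TAG_GCKSUMDELTA = 0x3300


def cksum_tagname(tag):
    # Hypothetical helper: mask off the top bit and the low byte
    # (which holds TAG_P for cksum tags), then look up the rest.
    # Returns None for tags outside the cksum family.
    names = {
        TAG_CKSUM: 'cksum',              # assumed name
        TAG_NOTE: 'note',                # assumed name
        TAG_ECKSUM: 'ecksum',            # assumed name
        TAG_GCKSUMDELTA: 'gcksumdelta',  # as in tagrepr above
    }
    return names.get(tag & 0x7f00)


if __name__ == "__main__":
    for tag in [0x3300, 0xb300, TAG_CKSUM | TAG_P, 0x2000]:
        print('0x%04x' % tag, cksum_tagname(tag))
```

Note that Rbyd.fetch compares with the wider mask 0xff00, so there a tag with the top bit set would not be treated as a gcksumdelta.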


@@ -46,10 +46,11 @@ TAG_B = 0x0000
TAG_R = 0x2000
TAG_LE = 0x0000
TAG_GT = 0x1000
TAG_CKSUM = 0x3000 ## 0x3c0p v-11 cccc ---- ---p
TAG_CKSUM = 0x3000 ## 0x300p v-11 ---- ---- ---p
TAG_P = 0x0001
TAG_NOTE = 0x3100 # 0x3100 v-11 ---1 ---- ----
TAG_ECKSUM = 0x3200 # 0x3200 v-11 --1- ---- ----
TAG_NOTE = 0x3100 ## 0x3100 v-11 ---1 ---- ----
TAG_ECKSUM = 0x3200 ## 0x3200 v-11 --1- ---- ----
TAG_GCKSUMDELTA = 0x3300 ## 0x3300 v-11 --11 ---- ----
# some ways of block geometry representations
@@ -210,6 +211,11 @@ def tagrepr(tag, w=None, size=None, off=None):
' 0x%02x' % (tag & 0xff) if tag & 0xff else '',
' w%d' % w if w else '',
' %s' % size if size is not None else '')
elif (tag & 0x7f00) == TAG_GCKSUMDELTA:
return 'gcksumdelta%s%s%s' % (
' 0x%02x' % (tag & 0xff) if tag & 0xff else '',
' w%d' % w if w else '',
' %s' % size if size is not None else '')
else:
return '0x%04x%s%s' % (
tag,


@@ -2,6 +2,124 @@
after = ['test_traversal', 'test_gc', 'test_mount']
code = '''
// naive crc32c
static uint32_t test_ck_naive_crc32c(
        uint32_t crc, const void *buffer, size_t size) {
    const uint8_t *buffer_ = buffer;
    crc ^= 0xffffffff;
    for (size_t i = 0; i < size; i++) {
        crc = crc ^ buffer_[i];
        for (size_t j = 0; j < 8; j++) {
            crc = (crc >> 1) ^ ((crc & 1) ? 0x82f63b78 : 0);
        }
    }
    crc ^= 0xffffffff;
    return crc;
}

// naive crc32c multiplication
static uint32_t test_ck_naive_crc32c_mul(uint32_t a, uint32_t b) {
    // pmul
    uint64_t r = 0;
    for (int i = 0; i < 32; i++) {
        // shift an unsigned 1, 1 << 31 overflows a 32-bit signed int
        if (b & ((uint32_t)1 << i)) {
            r ^= (uint64_t)a << i;
        }
    }

    // mod crc32c
    for (int i = 0; i < 31; i++) {
        r = (r >> 1) ^ ((r & 1) ? 0x82f63b78 : 0);
    }

    return (uint32_t)r;
}
'''
# let's first check that our crc32c math probably works
# try some random inputs and compare with a naive implementation
[cases.test_ck_crc32c]
defines.SIZE = [1, 2, 4, 8, 16, 32, 64]
defines.SEED = 'range(10)'
defines.N = 1000
fuzz = 'SEED'
code = '''
    uint32_t prng = SEED;
    for (lfs_size_t i = 0; i < N; i++) {
        uint8_t buffer[SIZE];
        for (lfs_size_t j = 0; j < SIZE; j++) {
            buffer[j] = TEST_PRNG(&prng);
        }

        uint32_t a = test_ck_naive_crc32c(0, buffer, SIZE);
        uint32_t b = lfs_crc32c(0, buffer, SIZE);
        assert(a == b);
    }
'''
# test incremental crc32cs
[cases.test_ck_crc32c_incr]
defines.SIZE = [1, 2, 4, 8, 16, 32, 64]
defines.SEED = 'range(10)'
defines.N = 1000
fuzz = 'SEED'
code = '''
    uint32_t prng = SEED;
    for (lfs_size_t i = 0; i < N; i++) {
        uint8_t buffer[SIZE];
        for (lfs_size_t j = 0; j < SIZE; j++) {
            buffer[j] = TEST_PRNG(&prng);
        }

        uint32_t a = lfs_crc32c(0, buffer, SIZE);
        uint32_t b = 0;
        for (lfs_size_t j = 0; j < SIZE; j++) {
            b = lfs_crc32c(b, &buffer[j], 1);
        }
        assert(a == b);
    }
'''
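
The incremental test above relies on a property of this crc32c's calling convention: the final xor with 0xffffffff on one call is undone by the initial xor on the next, so feeding the previous result back in as crc continues the same checksum. A small Python sketch of the same idea follows, using the crc32c(data, crc=0) shape from the scripts in this commit. crc32c_chunked is a hypothetical helper for the demo.

```python
def crc32c(data, crc=0):
    # bitwise crc32c, same structure as the naive loop above
    crc ^= 0xffffffff
    for b in data:
        crc ^= b
        for _ in range(8):
            crc = (crc >> 1) ^ ((crc & 1) * 0x82f63b78)
    return 0xffffffff ^ crc


def crc32c_chunked(chunks):
    # Hypothetical helper: checksum a sequence of chunks by threading
    # the running crc through each call
    crc = 0
    for chunk in chunks:
        crc = crc32c(chunk, crc)
    return crc


if __name__ == "__main__":
    whole = b"hello littlefs"
    pieces = [whole[:5], whole[5:6], whole[6:]]
    print("%08x %08x" % (crc32c(whole), crc32c_chunked(pieces)))
```

The two printed values should match for any way of splitting the input, including one byte at a time as the test case does.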
# try some random inputs and compare with a naive implementation
[cases.test_ck_crc32c_mul]
defines.SEED = 'range(10)'
defines.N = 1000
fuzz = 'SEED'
code = '''
    uint32_t prng = SEED;
    for (lfs_size_t i = 0; i < N; i++) {
        uint32_t x = TEST_PRNG(&prng);
        uint32_t y = TEST_PRNG(&prng);

        uint32_t a = test_ck_naive_crc32c_mul(x, y);
        uint32_t b = lfs_crc32c_mul(x, y);
        assert(a == b);
    }
'''
# test that multiplication is distributive
[cases.test_ck_crc32c_mul_dist]
defines.SEED = 'range(10)'
defines.N = 1000
fuzz = 'SEED'
code = '''
    uint32_t prng = SEED;
    for (lfs_size_t i = 0; i < N; i++) {
        uint32_t x = TEST_PRNG(&prng);
        uint32_t y = TEST_PRNG(&prng);
        uint32_t z = TEST_PRNG(&prng);

        uint32_t a = lfs_crc32c_mul(x, y ^ z);
        uint32_t b = lfs_crc32c_mul(x, y) ^ lfs_crc32c_mul(x, z);
        assert(a == b);
    }
'''
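
Distributivity means multiplying by a fixed value is linear over xor. Cubing, which the debug scripts apply to the gcksum, is not: (a+b)^3 picks up the cross terms a^2*b + a*b^2. A short Python sketch of both facts follows. It presumably relates to why a plain xor-based gcksum was not enough, but that connection is a reading of the commit message, not something stated in the code. The values 0x80000000 and 0x40000000 are the polynomials 1 and x in bit-reflected form.

```python
import random

POLY = 0x82f63b78  # bit-reflected crc32c polynomial


def crc32c_mul(a, b):
    # carry-less multiply, then 31 reduction steps
    r = 0
    while b:
        if b & 1:
            r ^= a
        a <<= 1
        b >>= 1
    for _ in range(31):
        r = (r >> 1) ^ (POLY if r & 1 else 0)
    return r


def cube(a):
    return crc32c_mul(crc32c_mul(a, a), a)


def distributes(x, y, z):
    # x*(y+z) == x*y + x*z, with + being xor
    return crc32c_mul(x, y ^ z) == crc32c_mul(x, y) ^ crc32c_mul(x, z)


if __name__ == "__main__":
    rng = random.Random(0)
    print(all(
        distributes(*(rng.getrandbits(32) for _ in range(3)))
        for _ in range(1000)))

    # cubing is not xor-linear: compare (1+x)^3 with 1^3 + x^3
    one, x = 0x80000000, 0x40000000
    print("%08x %08x" % (cube(one ^ x), cube(one) ^ cube(x)))
```

The last line prints two different values: (1+x)^3 = 1+x+x^2+x^3, while 1^3 + x^3 = 1+x^3.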
# Test filesystem-level checksum things