Fixed pl hole in perturb logic

Turns out there's very _very_ small powerloss hole in our current
perturb logic.

We rely on tag valid bits to validate perturb bits, but these
intentionally don't end up in the commit checksum. This means there will
always be a powerloss hole when we write the last valid bit. If we lose
power after writing that bit, suddenly the remaining commit and any
following commits may appear as valid.

Now, this is really unlikely considering we need to lose power exactly
when we write the cksum tag's valid bit, and our nonce helps protect
against this. But a hole is a hole.

The solution here is to include the _current_ perturb bit (q) in the
commit's cksum tag, alongside the _next_ perturb bit (p). This will be
included in the commit's checksum, but _not_ in the canonical checksum,
allowing the commit's checksum validate the current perturb state
without ruining our erased-state agnostic checksums:

  .---+---+---+---. . . .---+---+---+---. \   \   \   \
  |v|    tag      |     |v|    tag      | |   |   |   |
  +---+---+---+---+     +---+---+---+---+ |   |   |   |
  |     commit    |     |     commit    | |   |   |   |
  |               |     |               | +-. |   |   |
  +---+---+---+---+     +---+---+---+---+ / | |   |   |
  |v|qp-------------.   |v|qp| tag      |   | .   .   .
  +---+---+---+---+ |   +---+---+---+---+   | .   .   .
  |     cksum     | |   |     cksum     |   | .   .   .
  +---+---+---+---+ |   +---+---+---+---+   | .   .   .
  |    padding    | |   |    padding    |   | .   .   .
  |               | |   |               |   | .   .   .
  +---+---+---+---+ | . +---+---+---+---+   | |   |   |
  |     erased    | +-> |v------------------' |   |   |
  |               | |   +---+---+---+---+     |   |   |
  .               . |   |     commit    |     +-. |   +- rbyd
  .               . |   |.----------------.   | | |   |  cksum
                    |   +| -+---+---+---+ |   / | +-. /
                    +-> |v|qp| tag      | '-----' | |
                    |   +- ^ ---+---+---+         / |
                    '------'  cksum ----------------'
                        +---+---+---+---+
                        |    padding    |
                        |               |
                        +---+---+---+---+
                        |     erased    |
                        |               |
                        .               .
                        .               .

(Ok maybe this diagram needs work...)

This adds another thing that needs to be checked during rbyd fetch, and
note, we _do_ need to explicitly check this, but it solves the problem.
If power is loss after v, q would be invalid, and if power is lost after
q, our cksum would be invalid.

Note this would have also been an issue for the previous cksum + parity
perturb scheme.

Code changes:

           code          stack
  before: 33570           2592
  after:  33598 (+0.1%)   2592 (+0.0%)
This commit is contained in:
Christopher Haster
2024-06-07 19:00:35 -05:00
parent 8a4f6fcf68
commit 898f916778
7 changed files with 66 additions and 30 deletions

View File

@@ -53,8 +53,8 @@ TAG_R = 0x2000
TAG_LE = 0x0000
TAG_GT = 0x1000
TAG_CKSUM = 0x3000
TAG_Q = 0x0000
TAG_P = 0x0001
TAG_Q = 0x0002
TAG_NOISE = 0x3100
TAG_ECKSUM = 0x3200
@@ -241,10 +241,10 @@ def tagrepr(tag, w=None, size=None, off=None):
else ' -%d' % size if size
else '')
elif (tag & 0x7f00) == TAG_CKSUM:
return 'cksum%s%s%s' % (
'p' if tag & 0xff == TAG_P
else 'q' if tag & 0xff == TAG_Q
else ' 0x%02x' % (tag & 0xff),
return 'cksum%s%s%s%s%s' % (
'q' if not tag & 0xfc and tag & TAG_Q else '',
'p' if not tag & 0xfc and tag & TAG_P else '',
' 0x%02x' % (tag & 0xff) if tag & 0xfc else '',
' w%d' % w if w else '',
' %s' % size if size is not None else '')
elif (tag & 0x7f00) == TAG_NOISE:
@@ -555,6 +555,10 @@ def dbg_log(data, block_size, rev, eoff, weight, *,
cksum_ = crc32c(data[j_:j_+size], cksum_)
# found a cksum?
else:
# check perturb bit
if perturb != bool(tag & TAG_Q):
notes.append('q!=%x' % perturb)
# check cksum
cksum__ = fromle32(data[j_:j_+4])
if cksum_ != cksum__:
notes.append('cksum!=%08x' % cksum__)
@@ -963,6 +967,10 @@ def main(disk, blocks=None, *,
cksum__ = crc32c(data[j_:j_+size], cksum__)
# found a cksum?
else:
# check perturb bit
if perturb != bool(tag & TAG_Q):
break
# check cksum
cksum___ = fromle32(data[j_:j_+4])
if cksum__ != cksum___:
break