Rearranged type encoding for crcs so they mostly fit in a single byte

I'm still not sure this is the best decision, since it may add some
complexity to tag parsing, but making most crcs one byte may be valuable
since these exist in every single commit.

This gives tags three high-level encodings:

  in-tree tags:
  iiiiiii iiiiitt ttTTTTT TTT00rv
              ^----^--------^--^^- 16-bit id
                   '--------|--||- 4-bit suptype
                            '--||- 8-bit subtype
                               '|- removed bit
                                '- valid bit
  lllllll lllllll lllllll lllllll
                                ^- n-bit length

  out-of-tree tags:
  ------- -----TT TTTTTTt ttt01pv
                       ^----^--^^- 8-bit subtype
                            '--||- 4-bit suptype
                               '|- perturb bit
                                '- valid bit
  lllllll lllllll lllllll lllllll
                                ^- n-bit length

  alt tags:
  wwwwwww wwwwwww wwwwwww www1dcv
                            ^-^^^- 28-bit weight
                              '||- direction bit
                               '|- color bit
                                '- valid bit
  jjjjjjj jjjjjjj jjjjjjj jjjjjjj
                                ^- n-bit jump

Having the location of the subtype flipped for crc tags vs tree tags is
unintuitive, but it makes more crc tags fit in a single byte, while
preserving expected tag ordering for tree tags.
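
As a rough illustration of the three encodings above, the following Python
sketch splits a tag into its fields. It assumes the tag has already had its
valid bit shifted off, which is what the masks in the accompanying script
suggest. decode_tag and the dict keys are names made up for this example,
not part of the script.

```python
def decode_tag(tag):
    """Split a tag into fields (assumes the valid bit is already stripped)."""
    if tag & 0x4:
        # alt tag: color bit, direction bit, mode bit, then 28-bit weight
        return {
            'kind': 'alt',
            'color': tag & 0x1,
            'direction': (tag >> 1) & 0x1,
            'weight': (tag >> 3) & 0xfffffff,
        }
    elif tag & 0x2:
        # out-of-tree tag: 4-bit suptype sits below the 8-bit subtype
        return {
            'kind': 'out-of-tree',
            'perturb': tag & 0x1,
            'suptype': (tag >> 3) & 0xf,
            'subtype': (tag >> 7) & 0xff,
        }
    else:
        # in-tree tag: 8-bit subtype sits below the 4-bit suptype,
        # with the raw 16-bit id field on top (the script subtracts 1
        # from this field before printing it)
        return {
            'kind': 'in-tree',
            'removed': tag & 0x1,
            'subtype': (tag >> 3) & 0xff,
            'suptype': (tag >> 11) & 0xf,
            'id': (tag >> 15) & 0xffff,
        }
```

With these masks, decode_tag(0x0002) and decode_tag(0x000a) come out as
out-of-tree tags with suptype 0 and 1, the values the script prints as crc
and fcrc.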

The only case where crc tags don't fit in a single byte is if non-crc
checksums (sha256?) are added, at which point I expect the subtype to
indicate which checksum algorithm is in use.
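
To make the single-byte claim concrete, here is a small sketch. It assumes
tags are serialized as a base-128 varint (7 payload bits per byte), which
this message does not state, and it builds tags from the out-of-tree
diagram with the valid bit included. leb128_size and crc_tag are
hypothetical helpers for this example only.

```python
def leb128_size(word):
    """Number of bytes needed to store word as a base-128 varint."""
    size = 1
    while word >= 0x80:
        word >>= 7
        size += 1
    return size

def crc_tag(suptype=0, subtype=0, perturb=0, valid=1):
    """Build an out-of-tree tag following the diagram's bit layout."""
    return ((subtype << 8)      # 8-bit subtype
            | (suptype << 4)    # 4-bit suptype
            | (0x1 << 2)        # "01" mode bits marking out-of-tree tags
            | (perturb << 1)    # perturb bit
            | valid)            # valid bit

# new layout: zero subtype leaves everything in the low bits
new_crc = crc_tag(suptype=0)
new_fcrc = crc_tag(suptype=1)
# a nonzero subtype (e.g. a future checksum algorithm) spills over
new_other = crc_tag(suptype=0, subtype=1)
# old layout: fcrc was 0x0802 before the valid bit was added
old_fcrc = (0x0802 << 1) | 1

sizes = [leb128_size(t) for t in (new_crc, new_fcrc, new_other, old_fcrc)]
```

Under that assumption the two crc tags with a zero subtype each take one
byte, while a nonzero subtype, or the old fcrc value, takes two.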

Author: Christopher Haster
Date: 2023-01-18 15:17:01 -06:00
Parent: 55b072e761
Commit: d08497c299
2 changed files with 24 additions and 14 deletions

@@ -54,6 +54,8 @@ def tagrepr(tag, size, off=None):
 type = tag & 0x7fff
 suptype = tag & 0x7807
 subtype = (tag >> 3) & 0xff
+xsuptype = tag & 0x7e
+xsubtype = (tag >> 7) & 0xff
 id = ((tag >> 15) & 0xffff) - 1
 if suptype == 0x0800:
@@ -73,14 +75,14 @@ def tagrepr(tag, size, off=None):
 subtype,
 ' id%d' % id if id != -1 else '',
 ' %d' % size if not tag & 0x1 else '')
-elif (suptype & ~0x1) == 0x0002:
+elif xsuptype == 0x0002:
 return 'crc%x%s %d' % (
-suptype & 0x1,
-' 0x%02x' % subtype if subtype else '',
+tag & 0x1,
+' 0x%02x' % xsubtype if xsubtype else '',
 size)
-elif suptype == 0x0802:
+elif xsuptype == 0x000a:
 return 'fcrc%s %d' % (
-' 0x%02x' % subtype if subtype else '',
+' 0x%02x' % xsubtype if xsubtype else '',
 size)
 elif suptype & 0x4:
 return 'alt%s%s 0x%x %s' % (
@@ -250,7 +252,7 @@ def show_log(block_size, data, rev, off, *,
 j_ += delta
 if not tag & 0x4:
-if (tag & 0x7806) != 0x0002:
+if (tag & 0x007e) != 0x0002:
 crc = crc32c(data[j_:j_+size], crc)
 # found a crc?
 else:
@@ -599,7 +601,7 @@ def main(disk, block_size, block1, block2=None, *,
 wastrunk = True
 if not tag & 0x4:
-if (tag & 0x7806) != 0x0002:
+if (tag & 0x007e) != 0x0002:
 crc = crc32c(data[j_:j_+size], crc)
 # keep track of id count
 if (tag & 0x7807) == 0x0800: