forked from Imagelibrary/littlefs
The main motivation for this was issues fitting a good tag encoding into
14-bits. The extra 2-bits (though really only 1 bit was needed) from
making this not a leb encoding opens up the space from 3 suptypes to
15 suptypes, which is nothing to shake a stick at.
The main downsides:
1. We can't rely on leb encoding for effectively-infinite extensions.
2. We can't shorten small tags (crcs, grows, shrinks) to one byte.
For 1., extending the leb encoding beyond 14-bits is already
unpalatable, because it would increase RAM costs in the tag
encoder/decoder,` which must assume a worst-case tag size, and would likely
add storage cost to every alt pointer, more on this in the next section.
The current encoding is quite generous, so I think it is unlikely we
will exceed the 16-bit encoding space. But even if we do, it's possible
to use a spare bit for an "extended" set of tags in the future.
As for 2., the lack of compression is a downside, but I've realized the
only tags that really matter storage-wise are the alt pointers. In any
rbyds there will be roughly O(m log m) alt pointers, but at most O(m) of
any other tags. What this means is that the encoding of any other tag is
in the noise of the encoding of our alt pointers.
Our alt pointers are already pretty densely packed. But because the
sparse key part of alt-pointers are stored as-is, the worst-case
encoding of in-tree tags likely ends up as the encoding of our
alt-pointers. So going up to 3-byte tags adds a surprisingly large
storage cost.
As a minor plus, le16s should be slightly cheaper to encode/decode. It
should also be slightly easier to debug tags on-disk.
tag encoding:
TTTTtttt ttttTTTv
^--------^--^^- 4+3-bit suptype
'---|- 8-bit subtype
'- valid bit
iiii iiiiiii iiiiiii iiiiiii iiiiiii
^- m-bit id/weight
llll lllllll lllllll lllllll lllllll
^- m-bit length/jump
Also renamed the "mk" tags, since they no longer have special behavior
outside of providing names for entries:
- LFSR_TAG_MK => LFSR_TAG_NAME
- LFSR_TAG_MKBRANCH => LFSR_TAG_BNAME
- LFSR_TAG_MKREG => LFSR_TAG_REG
- LFSR_TAG_MKDIR => LFSR_TAG_DIR