B-trees with names are now working, though this required a number of
changes to the B-tree layout:
1. B-trees no longer require name entries (LFSR_TAG_MK) on each branch.
This is a nice optimization to the design, since these name entries
just waste space in purely weight-based B-trees, which are probably
going to be most B-trees in the filesystem.
If a name entry is missing, the struct entry, which is required,
should have the effective weight of the entry.
The first entry in every rbyd block is expected to have no name
entry, since this is the default path for B-tree lookups.
2. The first entry in every rbyd block _may_ have a name entry, which
is ignored. I'm calling these "vestigial names" to make them sound
cooler than they actually are.
These vestigial names show up in a couple complicated B-tree
operations:
- During B-tree split, since pending attributes are calculated before
the split, we need to play out pending attributes into the rbyd
before deciding which name becomes the name of the entry in the parent.
This creates a vestigial name which we _could_ immediately remove,
but the remove adds additional size to the must-fit split operation.
- During B-tree pop/merge, if we remove the leading no-name entry,
the second, named entry becomes the leading entry. This creates a
vestigial name that _looks_ easy enough to remove when making the
pending attributes for pop/merge, but turns out to be surprisingly
tricky if the parent undergoes a split/merge at the same time.
It may be possible to remove all these vestigial names proactively,
but this adds additional rbyd lookups to figure out the exact tag to
remove, complicates things in a fragile way, and doesn't actually
reduce storage costs until the rbyd is compacted.
The main downside is that these B-trees may be a bit more confusing
to debug.
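As a rough illustration of why vestigial names are harmless, here is a
minimal sketch of a name lookup within a single rbyd block. This is not
the lfs.c implementation: the (name, weight) entry model and the
lookup_by_name helper are hypothetical, and the sketch assumes names
are sorted and that unnamed non-leading entries never match by name.

```python
from typing import List, Optional, Tuple

# Hypothetical in-memory model of one rbyd block: each entry is
# (name, weight), where name is None if the entry has no name entry.
Entry = Tuple[Optional[bytes], int]

def lookup_by_name(entries: List[Entry], target: bytes) -> Tuple[int, int]:
    """Return (index, weight_before) of the entry to follow for target.

    The leading entry is the default path, so any name on it is
    vestigial and is never compared against target.
    """
    best, best_before = 0, 0
    before = 0  # sum of the weights of all entries before entry i
    for i, (name, weight) in enumerate(entries):
        # skip the leading entry's name, and any entry without a name
        if i > 0 and name is not None and name <= target:
            best, best_before = i, before
        before += weight
    return best, best_before

# the leading entry carries a vestigial name (b"zzz") that is ignored
entries = [(b"zzz", 3), (b"d", 2), (b"m", 4)]
print(lookup_by_name(entries, b"a"))  # (0, 0), falls through to default
print(lookup_by_name(entries, b"k"))  # (1, 3)
```

Because the leading name is never compared, leaving it in place costs
only the storage it occupies until the next compaction.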
This was a rather simple exercise. lfsr_btree_commit does most of the
work already, so all this needed was setting up the pending attributes
correctly.
Also:
- Tweaked dbgrbyd.py's tree rendering to match dbgbtree.py's.
- Added a print to each B-tree test to help find the resulting B-tree
when debugging.
An example:
$ ./scripts/dbgbtree.py -B4096 disk 0xaa -t -i
btree 0xaa.1000, rev 35, weight 278
block ids name tag data
(truncated)
00aa.1000: +-+ 0-16 branch id16 3 7e d4 10 ~..
007e.0854: | |-> 0 inlined id0 1 73 s
| |-> 1 inlined id1 1 74 t
| |-> 2 inlined id2 1 75 u
| |-> 3 inlined id3 1 76 v
| |-> 4 inlined id4 1 77 w
| |-> 5 inlined id5 1 78 x
| |-> 6 inlined id6 1 79 y
| |-> 7 inlined id7 1 7a z
| |-> 8 inlined id8 1 61 a
| |-> 9 inlined id9 1 62 b
...
This added the idea of block+limit addresses such as 0xaa.1000. Added
this as an option to dbgrbyd.py along with a couple other tweaks:
- Added block+limit support (0x<block>.<limit>).
- Fixed in-device representation indentation when trees are present.
- Changed fromtag to implicitly fix up the off-by-one-ness of ids/weights,
which is consistent with lfs.c.
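For illustration, parsing and printing block+limit addresses might look
like the sketch below. This is not dbgrbyd.py's actual code:
parse_block_limit and format_block_limit are hypothetical names, and
reading the limit as hex is an assumption based on the example above,
where 0xaa.1000 appears alongside -B4096 (0x1000 = 4096).

```python
from typing import Optional, Tuple

def parse_block_limit(addr: str) -> Tuple[int, Optional[int]]:
    """Split "0x<block>.<limit>" into (block, limit).

    The limit is optional; a bare block address yields limit=None.
    Assumption: the limit is written in hex, as in the dump output.
    """
    block, sep, limit = addr.partition('.')
    return int(block, 0), (int(limit, 16) if sep else None)

def format_block_limit(block: int, limit: int) -> str:
    """Render an address in the style of the dump, e.g. 00aa.1000."""
    return '%04x.%04x' % (block, limit)

print(parse_block_limit('0xaa.1000'))      # (170, 4096)
print(format_block_limit(0x7e, 0x854))     # 007e.0854
```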