Commit Graph

103 Commits

Author SHA1 Message Date
Christopher Haster
89d5a5ef80 Working implementation of B-tree name split/lookup with vestigial names
B-trees with names are now working, though this required a number of
changes to the B-tree layout:

1. B-trees no longer require name entries (LFSR_TAG_MK) on each branch.
   This is a nice optimization to the design, since these name entries
   just waste space in purely weight-based B-trees, which are probably
   going to be most B-trees in the filesystem.

   If a name entry is missing, the struct entry, which is required,
   should have the effective weight of the entry.

   The first entry in every rbyd block is expected to have no name
   entry, since this is the default path for B-tree lookups.

2. The first entry in every rbyd block _may_ have a name entry, which
   is ignored. I'm calling these "vestigial names" to make them sound
   cooler than they actually are.

   These vestigial names show up in a couple complicated B-tree
   operations:

   - During B-tree split, since pending attributes are calculated before
     the split, we need to play out pending attributes into the rbyd
     before deciding what name becomes the name of the entry in the parent.
     This creates a vestigial name which we _could_ immediately remove,
     but the remove adds additional size to the must-fit split operation.

   - During B-tree pop/merge, if we remove the leading no-name entry,
     the second, named entry becomes the leading entry. This creates a
     vestigial name that _looks_ easy enough to remove when making the
     pending attributes for pop/merge, but turns out to be surprisingly
     tricky if the parent undergoes a split/merge at the same time.

   It may be possible to remove all these vestigial names proactively,
   but this adds additional rbyd lookups to figure out the exact tag to
   remove, complicates things in a fragile way, and doesn't actually
   reduce storage costs until the rbyd is compacted.

   The main downside is that these B-trees may be a bit more confusing
   to debug.
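
The lookup rules above can be sketched roughly as follows. This is only
an illustrative model in Python, not the C implementation: the Entry
type, lookup_name, lookup_id, and the example branch are all
hypothetical, and it assumes names within a branch are sorted and that
a name lookup picks the last entry whose name is <= the target.

```python
from typing import List, NamedTuple, Optional

class Entry(NamedTuple):
    name: Optional[bytes]  # None models a missing name entry
    weight: int            # effective weight, carried by the struct entry
    data: str              # stand-in for the struct entry's payload

def lookup_name(entries: List[Entry], name: bytes) -> int:
    """Index of the last entry whose name is <= name, defaulting to 0."""
    found = 0  # the leading entry is the default path
    for i, entry in enumerate(entries[1:], start=1):
        # only non-leading entries take part in name comparisons, so a
        # vestigial name on entries[0] is never looked at
        if entry.name is not None and entry.name <= name:
            found = i
    return found

def lookup_id(entries: List[Entry], id_: int) -> int:
    """Index of the entry covering id_, using only weights."""
    for i, entry in enumerate(entries):
        if id_ < entry.weight:
            return i
        id_ -= entry.weight
    raise IndexError("id out of range")

# hypothetical branch whose leading entry carries a vestigial name
branch = [
    Entry(b"c", 3, "left"),    # vestigial name, ignored by lookup_name
    Entry(b"g", 2, "middle"),
    Entry(b"t", 4, "right"),
]
```

With this model, lookup_name(branch, b"a") still lands on the leading
entry even though b"a" sorts before the vestigial b"c", and purely
weight-based lookups never touch names at all.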
2023-03-21 12:59:46 -05:00
Christopher Haster
a897b875d3 Implemented lfsr_btree_update and added more tests
This was a rather simple exercise. lfsr_btree_commit does most of the
work already, so all this needed was setting up the pending attributes
correctly.

Also:
- Tweaked dbgrbyd.py's tree rendering to match dbgbtree.py's.
- Added a print to each B-tree test to help find the resulting B-tree
  when debugging.
2023-03-17 14:20:40 -05:00
Christopher Haster
ce599be70d Added scripts/dbgbtree.py for debugging B-trees, tweaked dbgrbyd.py
An example:

  $ ./scripts/dbgbtree.py -B4096 disk 0xaa -t -i
  btree 0xaa.1000, rev 35, weight 278
  block            ids     name     tag                     data
  (truncated)
  00aa.1000: +-+      0-16          branch id16 3           7e d4 10                 ~..
  007e.0854: | |->       0          inlined id0 1           73                       s
             | |->       1          inlined id1 1           74                       t
             | |->       2          inlined id2 1           75                       u
             | |->       3          inlined id3 1           76                       v
             | |->       4          inlined id4 1           77                       w
             | |->       5          inlined id5 1           78                       x
             | |->       6          inlined id6 1           79                       y
             | |->       7          inlined id7 1           7a                       z
             | |->       8          inlined id8 1           61                       a
             | |->       9          inlined id9 1           62                       b
  ...

This added the idea of block+limit addresses such as 0xaa.1000. Added
this as an option to dbgrbyd.py along with a couple other tweaks:

- Added block+limit support (0x<block>.<limit>).
- Fixed in-device representation indentation when trees are present.
- Changed fromtag to implicitly fix up ids/weights off-by-one-ness; this
  is consistent with lfs.c.
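
For illustration, parsing a block+limit address might look something
like the sketch below. The parse_addr and format_addr helpers are
hypothetical, not taken from dbgrbyd.py. It assumes the limit is always
hexadecimal, which the example output suggests: 0x1000 matches -B4096,
and the branch data "7e d4 10" reads as block 0x7e plus the leb128
value 0x854, matching 007e.0854.

```python
from typing import Optional, Tuple

def parse_addr(s: str) -> Tuple[int, Optional[int]]:
    """Parse '<block>' or '<block>.<limit>' into (block, limit)."""
    block, sep, limit = s.partition(".")
    # int(x, 0) accepts 0x-prefixed hex as well as plain decimal
    block_ = int(block, 0)
    # assumption: the limit is hex even without a 0x prefix
    limit_ = int(limit, 16) if sep else None
    return block_, limit_

def format_addr(block: int, limit: int) -> str:
    """Render an address the way the example output does (00aa.1000)."""
    return "%04x.%04x" % (block, limit)
```

Here parse_addr("0xaa.1000") gives (170, 4096), and a bare "0xaa"
leaves the limit as None so the caller can fall back to the block size.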
2023-03-17 14:20:10 -05:00