Implemented lfsr_btree_pop and btree merges

B-tree remove/merge is the most annoying part of B-trees.

The implementation here follows the same ideas implemented in push/split:
1. Defer splits/merges until compaction.
2. Assume our split/merge will succeed and play it out into the rbyd.
3. On the first sign of failure, revert any unnecessary changes by
   appending deletes.
4. Do all of this in a single commit to avoid issues with single-prog
   blocks.
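
A minimal sketch of steps 2-4, not the actual C implementation. The
rbyd is modeled as a plain append-only list, and `live_size`,
`play_out_merge`, and the `("put"/"del", key, size)` tuples are all
hypothetical names made up for illustration:

```python
def live_size(log):
    """Sum the sizes of entries that are still live in an append-only log."""
    live = {}
    for op, key, size in log:
        if op == "put":
            live[key] = size
        else:
            # a delete cancels any earlier put of the same key
            live.pop(key, None)
    return sum(live.values())

def play_out_merge(log, sibling, limit):
    """Optimistically append the sibling's entries to a pending log.

    If the live size ever exceeds limit, revert by appending deletes
    for everything appended so far. Returns (pending, merged); either
    way pending is what would be written out as one commit.
    """
    pending = list(log)
    appended = []
    for key, size in sibling:
        pending.append(("put", key, size))
        appended.append(key)
        if live_size(pending) > limit:
            # first sign of failure, undo our appends with deletes
            for k in reversed(appended):
                pending.append(("del", k, 0))
            return pending, False
    return pending, True
```

Note the revert never rewrites anything already appended, it only
appends more, which is what lets the whole thing go out as one commit.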

Mapping this onto B-tree merge, the condition that triggers merge is
when our rbyd is <1/4 the block_size after compaction, and the condition
that aborts a merge is when our rbyd is >1/2 the block_size, since that
would trigger a split on a later compact.
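
As a rough sketch, the two thresholds look something like this. The
function names are hypothetical, and sizes are assumed to be in bytes:

```python
def should_try_merge(compacted_size, block_size):
    # merge is triggered when the compacted rbyd is <1/4 the block_size
    return 4 * compacted_size < block_size

def should_abort_merge(merged_size, block_size):
    # merge is aborted when the merged rbyd is >1/2 the block_size,
    # since that would trigger a split on a later compact
    return 2 * merged_size > block_size
```

With block_size=4096, an rbyd that compacts to 1000 bytes attempts a
merge, and the merge is abandoned if the result grows past 2048 bytes.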

Weaving this into lfsr_btree_commit is a bit subtle, but relatively
straightforward all things considered.

One downside is it's not physically possible to try merging with both
siblings, so we have to choose just one to attempt a merge. We handle
the corner case of merging the last sibling in a block explicitly, and
in theory the other sibling will eventually trigger a merge during its
own compaction.
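
One way to picture the sibling choice, as a sketch only. This assumes
the next (right) sibling is the default and the last sibling falls
back to its left neighbor, which is a guess at the policy and not
something stated above. `pick_merge_sibling` is a hypothetical name:

```python
def pick_merge_sibling(index, count):
    """Pick the one sibling to attempt a merge with, or None.

    index is our position among count children of the same parent.
    ASSUMPTION: prefer the right sibling, fall back to the left
    sibling only when we are the last child.
    """
    if count < 2:
        # no siblings, nothing to merge with
        return None
    if index + 1 < count:
        return index + 1
    # last sibling, handled explicitly
    return index - 1
```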

Extra annoying are the corner cases with merges in the root rbyd that
make the root rbyd degenerate. We really should avoid a compaction in
this case, as otherwise we would erase a block that we immediately
inline at a significant cost. However, determining if our root rbyd is
degenerate is tricky. We can determine a degenerate root with children
by checking if our rbyd's weight matches the B-tree's weight when we
merge. But determining a degenerate root that is a leaf requires
manually looking up both children in lfsr_btree_pop to see if they will
result in a degenerate root. Ugh.
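
The weight check for a root with children can be sketched like so.
This is only an illustration of the comparison, the function name is
hypothetical and it does not cover the leaf case:

```python
def merge_makes_root_degenerate(merged_weight, btree_weight):
    # if the merged rbyd carries the whole B-tree's weight, the root
    # is left with a single child and is degenerate
    return merged_weight == btree_weight

# two children with weights 3 and 5 in a B-tree of weight 8, merging
# them leaves one child that holds everything
two_children = merge_makes_root_degenerate(3 + 5, 8)

# with a third child of weight 4 the root still has two children
three_children = merge_makes_root_degenerate(3 + 5, 12)
```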

On the bright side, this does all seem to be working now, which
completes the last of the core B-tree algorithms.
Christopher Haster
2023-03-13 03:35:50 -05:00
parent a897b875d3
commit 8732904ef6
3 changed files with 1304 additions and 88 deletions

@@ -269,7 +269,7 @@ def show_log(block_size, data, rev, off, *,
lifetimes_[i:i+1] = []
shrinks.add(i + len(shrinks))
-checkpoint(j, weights, lifetimes, set(), shrinks, shrinks)
+checkpoint(j, weights, lifetimes, set(), shrinks, {i})
weights = weights_
lifetimes = lifetimes_