Commit Graph

13 Commits

Author SHA1 Message Date
Christopher Haster
5d9e7c8e86 Moved lifetimes in dbgrbyd.py so lifetimes and jumps can both be rendered
$ ./scripts/dbgrbyd.py disk 4096 0 -g -j
  mdir 0x0, rev 1, size 59
  off             tag                     data (truncated)
  00000004: .     createreg id1 4         aa aa aa aa              ....  <--.
  0000000c: |     altblt x80d0 x4                                        -' |
  00000010: | .   createreg id2 4         cc cc cc cc              ....  <. |
  00000018: | |   altrlt x80d0 x4                                        -|-'
  0000001c: | |   altbgt x8000 x10                                       -'
  00000020: | .\  createreg id2 4         bb bb bb bb              ....
  00000028: | | | fcrc 5                  51 53 7d 52 01           QS}R.
  0000002f: | | | crc0 7                  5f db 22 8a 1b 1b 1b     _."....
2023-02-12 14:44:28 -06:00
Christopher Haster
ca710b5a29 Initial, very, very rough implementation of rbyd range deletion
Tree deletion is such a pain. It always seems like an easy addition to
the core algorithm but always comes with problems.

The initial plan for deletes was to iterate through all tags, tombstone,
and then adjust weights as needed. This accomplishes deletes with little
change to the rbyd algorithm, but adds a complex traversal inside the
commit logic. Doable in one commit, but complex. It also risks weird
unintuitive corner cases since the cost of deletion grows with the number
of tags being deleted (O(m log n)).

But this rbyd data structure is a tree, so in theory it's possible to
delete a whole range of tags in a single O(log n) operation.

---

This is a proof-of-concept range deletion algorithm for rbyd trees.

Note, this does not preserve rbyd's balancing properties! But it is no
worse than tombstoning. This is acceptable for littlefs as any
unbalanced trees will be rebalanced during compaction.

The idea is to follow the same underlying dhara algorithm, where we
follow a search path and save any alt pointers not taken, but we follow
both search paths that form the outside of the range, and only keep
outside edges.

For example, a tree:

        .-------o-------.
        |               |
    .---o---.       .---o---.
    |       |       |       |
  .-o-.   .-o-.   .-o-.   .-o-.
  |   |   |   |   |   |   |   |
  a   b   c   d   e   f   g   h

To delete the range d-e, we would search for d, and search for e:

        ********o********
        *               *
    .---*****       *****---.
    |       *       *       |
  .-o-.   .-***   ***-.   .-o-.
  |   |   |   *   *   |   |   |
  a   b   c   d   e   f   g   h

And keep the outside edges:

    .---                 ---.
    |                       |
  .-o-.   .-         -.   .-o-.
  |   |   |           |   |   |
  a   b   c           f   g   h

But how do we combine the outside edges? The simpler option is to do
both searches separately, one after the other. This would end up with a
tree like this:

    .---------o
    |         |
  .-o-.   .---o
  |   |   |   |
  a   b   c   o---------.
              |         |
              o---.   .-o-.
              |   |   |   |
              _   f   g   h

But this horribly throws off the balance of our tree! It's worse than
tombstoning, and gets worse with more tags.

An alternative strategy, which is used here, is to alternate edges as we
descend down the tree. This unfortunately is more complex, and requires
~2x the RAM, but better preserves the balance of our tree. It isn't
perfect, because we lose color information, but we can leave that up to
compaction:

  .---------o
  |         |
.-o-.       o---------.
|   |       |         |
a   b   .---o       .-o-.
        |   |       |   |
        c   o---.   g   h
            |   |
            _   f

I also hope this can be merged into lfs_rbyd_append, deduplicating the
entire core rbyd append algorithm.
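
The "keep the outside edges" step can be sketched on a toy tree. This is only an illustration under stated assumptions, not the rbyd implementation: the tree is a plain in-memory binary tree of nested tuples, and the names `smallest` and `outside_edges` are hypothetical. It does not attempt the edge-combining or alternation described above, and the `smallest` helper makes each step cost more than a real search would.

```python
# Toy model only: a plain binary tree of nested tuples, not the on-disk rbyd.
# A leaf is a string; an internal node is a (left, right) pair.

def smallest(tree):
    """Return the leftmost (smallest) leaf under tree."""
    while not isinstance(tree, str):
        tree = tree[0]
    return tree

def outside_edges(tree, lo, hi):
    """Collect the subtrees that lie entirely outside the range [lo, hi].

    Walks the search path for lo and the search path for hi, keeping only
    the branches not taken that fall outside the range.
    """
    # Search for lo: every time we turn right, the left sibling is < lo.
    left_edges = []
    node = tree
    while not isinstance(node, str):
        left, right = node
        if lo >= smallest(right):
            left_edges.append(left)
            node = right
        else:
            node = left

    # Search for hi: every time we turn left, the right sibling is > hi.
    right_edges = []
    node = tree
    while not isinstance(node, str):
        left, right = node
        if hi < smallest(right):
            right_edges.append(right)
            node = left
        else:
            node = right

    return left_edges, right_edges

# The a-h tree from the diagrams above, deleting the range d-e.
TREE = ((("a", "b"), ("c", "d")), (("e", "f"), ("g", "h")))
left_edges, right_edges = outside_edges(TREE, "d", "e")
print(left_edges)   # [('a', 'b'), 'c']
print(right_edges)  # [('g', 'h'), 'f']
```

The surviving subtrees match the "outside edges" diagram: a-b and c on the left, f and g-h on the right.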
2023-02-12 13:29:06 -06:00
Christopher Haster
12edc5aee3 Added some ascii art to dbgrbyd.py to help debug how ids change over time
An example:

  $ ./scripts/dbgrbyd.py disk 4096 0 -i
  mdir 0x0, rev 1, size 59
  off       tag                     data (truncated)
  00000004: create x01 id1 4        aa aa aa aa              ....      .
  0000000c: altblt x80d0 x4                                            |
  00000010: create x01 id2 4        cc cc cc cc              ....      | .
  00000018: altrlt x80d0 x4                                            | |
  0000001c: altbgt x8000 x10                                           | |
  00000020: create x01 id2 4        bb bb bb bb              ....      | .\
  00000028: fcrc 5                  51 53 7d 52 01           QS}R.     | | |
  0000002f: crc0 7                  5f db 22 8a 1b 1b 1b     _."....   | | |
2023-02-12 13:23:56 -06:00
Christopher Haster
8d4991df6a Added the option to error on no valid commit to dbgrbyd.py
Considered adding --ignore-errors to watch.py, but it doesn't really
make sense with watch.py's implementation. watch.py would need to not update
in realtime, which conflicts with other use cases.
2023-02-12 13:19:46 -06:00
Christopher Haster
5cdda57373 Added the ability to remove rbyd tags via tombstoning
It's quite lucky a spare bit is free in the tag encoding; this means we
don't need a reserved length value as originally planned. We end up using
all of the bits that overlap the alt pointer encoding, which is nice and
unexpected.
2023-02-12 13:16:55 -06:00
Christopher Haster
d6ad74555b Made dbgrbyd.py a bit more resilient to truncated data
2023-02-12 13:14:54 -06:00
Christopher Haster
b48f7fcfb0 Added some more debug utilities to dbgrbyd.py, mainly --rbyd and --jumps
Not only is this a genuinely useful debugging tool, it looks very cool:

  $ ./scripts/dbgrbyd.py disk 4096 0 -j
  mdir 0x0, rev 1, size 73
  off       tag                     data (truncated)
  00000004: gstate x01 4            aa aa aa aa              ....      <----.
  0000000b: altblt x98 x4                                              -' | |
  0000000e: gstate x04 4            bb bb bb bb              ....      <------.
  00000015: altrlt x98 x4                                              -|-' | |
  00000018: altbgt x7ee8 xe                                            -'   | |
  0000001c: gstate x02 4            cc cc cc cc              ....      <.   | |
  00000023: altrlt x98 x4                                              -|---' |
  00000026: altrlt x80 x1c                                             -'     |
  00000029: altbgt x7e68 xe                                            -------'
  0000002d: gstate x03 4            dd dd dd dd              ....
  00000034: fcrc 5                  05 3f aa db 01           .?...
  0000003b: crc0 8                  5c 12 29 d9 1b 1b 1b 1b  \.).....
2023-02-12 13:02:59 -06:00
Christopher Haster
fe28837861 Rbyd trees with 4-leaves now working, fixed lfs_rtag_flip bug
- This is when flips start happening during lfs_rbyd_append
- lfs_rtag_flip had an off-by-one math mistake
2023-02-12 12:58:55 -06:00
Christopher Haster
c5fec90465 Rbyd rflips are now working, quite nicely actually
It turns out statefulness works quite well with this algorithm. (The
prototype was in Haskell, which created some artificial problems. I
think it may have just been too high-level a language for this
near-instruction-level algorithm.)
2023-02-12 12:58:29 -06:00
Christopher Haster
05276cef9a Added a bias to alt weights so in-between tags prefer larger tags
This bias makes it so that tag lookups always find a tag >= the
requested tag, unless we are at the end of the tree.

This makes tree traversal trivial, which is quite nice.
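
Why a ">= lookup" makes traversal trivial can be sketched with a sorted list standing in for the tree. This is a hedged illustration, not the rbyd code: `lookup` and `traverse` are hypothetical names, tags are assumed to be plain integers, and returning `None` at the end of the tree is an assumption.

```python
import bisect

def lookup(tags, tag):
    """Return the smallest stored tag >= tag, or None past the end."""
    i = bisect.bisect_left(tags, tag)
    return tags[i] if i < len(tags) else None

def traverse(tags):
    """Visit every stored tag using nothing but lookup()."""
    found = lookup(tags, 0)
    while found is not None:
        yield found
        # Asking for found + 1 lands on the next stored tag, if any.
        found = lookup(tags, found + 1)

TAGS = [0x01, 0x02, 0x04]   # sorted stand-in for the tree's tags
print(lookup(TAGS, 0x03))    # 4
print(list(traverse(TAGS)))  # [1, 2, 4]
```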

Need to remove ntag now, it's no longer needed.
2023-02-12 12:49:19 -06:00
Christopher Haster
024aaeba56 Some small tweaks
- Moved alt encoding 0x1 => 0x4, which can lead to slightly better
  lookup tables. The perturb bit now takes the same place as the color
  bit, which means both can be ignored in readonly operations.

- Dropped lfs_rbyd_fetchmatch. Asking each lfs_rbyd_fetch call to pass
  NULL isn't that bad.

New encoding:

  tags:
  iiii iiiiiii iiiiiTT TTTTTTt ttt0tpv
                   ^--------^------^^^- 16-bit id
                            '------|||- 8-bit type2
                                   '||- 5-bit type1
                                    '|- perturb bit
                                     '- valid bit
  llll lllllll lllllll lllllll lllllll
                                     ^- n-bit length

  alts:
  wwww wwwwwww wwwwwww wwwwwww www1dcv
                                 ^^^-^- 28-bit weight
                                  '|-|- color bit
                                   '-|- direction bit
                                     '- valid bit
  jjjj jjjjjjj jjjjjjj jjjjjjj jjjjjjj
                                     ^- n-bit jump
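
The field layout can be sketched as bit masks. This is a rough sketch that reads the diagrams literally as 32-bit words with bit 0 on the right, which is an assumption. The function names and the `READONLY_MASK` constant are hypothetical, and type1 is left undecoded because the diagram does not say how its split bits are ordered.

```python
def decode_alt(word):
    """Split a 32-bit alt word into its fields, reading the diagram literally."""
    return {
        "valid": word & 0x1,                 # bit 0
        "color": (word >> 1) & 0x1,          # bit 1
        "direction": (word >> 2) & 0x1,      # bit 2
        "is_alt": (word >> 3) & 0x1,         # bit 3, drawn as a constant 1
        "weight": (word >> 4) & 0x0FFFFFFF,  # bits 4..31, 28 bits
    }

def decode_tag(word):
    """Split a 32-bit tag word into the fields with unambiguous positions."""
    return {
        "valid": word & 0x1,           # bit 0
        "perturb": (word >> 1) & 0x1,  # bit 1, same position as the alt color bit
        "is_alt": (word >> 3) & 0x1,   # bit 3, drawn as a constant 0
        "type2": (word >> 8) & 0xFF,   # bits 8..15
        "id": (word >> 16) & 0xFFFF,   # bits 16..31
    }

# Hypothetical mask for readonly operations: clears bit 1, which holds the
# color bit in alts and the perturb bit in tags.
READONLY_MASK = 0xFFFFFFFF & ~0x2

# An alt with weight 5, direction set, color clear, valid set.
alt = (5 << 4) | 0x8 | 0x4 | 0x1
fields = decode_alt(alt)
print(fields["weight"], fields["direction"], fields["color"], fields["valid"])
```

Because the color and perturb bits share bit 1, one mask covers both, which is the property the commit message points out.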
2023-02-12 12:40:19 -06:00
Christopher Haster
9a0e3fc749 More rbyd tests, multi-commit now working
2023-02-12 12:39:17 -06:00
Christopher Haster
ad00ca79e2 Added dbgrbyd.py script, fixed some small things in rbyd commit
- We need to actually write the perturb bit
- It helps to encode the crc's leb128 length field correctly
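
For reference, unsigned leb128 encoding of a length field can be sketched as follows. This is the standard unsigned LEB128 scheme, not code taken from this commit; `leb128_encode` and `leb128_decode` are hypothetical helper names.

```python
def leb128_encode(value):
    """Encode a non-negative integer as unsigned LEB128 bytes."""
    if value < 0:
        raise ValueError("unsigned LEB128 cannot encode negative values")
    out = bytearray()
    while True:
        byte = value & 0x7F  # low 7 bits go into this byte
        value >>= 7
        if value:
            out.append(byte | 0x80)  # continuation bit: more bytes follow
        else:
            out.append(byte)
            return bytes(out)

def leb128_decode(data, offset=0):
    """Decode one unsigned LEB128 value; return (value, next_offset)."""
    result = 0
    shift = 0
    while True:
        byte = data[offset]
        offset += 1
        result |= (byte & 0x7F) << shift
        shift += 7
        if not byte & 0x80:  # continuation bit clear: last byte
            return result, offset

print(leb128_encode(5).hex())      # 05
print(leb128_encode(300).hex())    # ac02
print(leb128_decode(b"\xac\x02"))  # (300, 2)
```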
2023-02-12 12:38:07 -06:00