littlefs

Author	SHA1	Message	Date
Christopher Haster	865477d7e1	Changing coalesce strategy, reimplemented shrub/btree carve Note this is already showing better code reuse, which is a good sign, though maybe that's just the benefit of reimplementing similar logic multiple times. Now both reading and carving end up in the same lfsr_btree_readnext and lfsr_btree_buildcarve functions for both btrees and shrubs. Both btrees and shrubs are fundamentally rbyds, so we can share a lot of functionality as long as we redirect to the correct commit function at the last minute. This surprising opportunity for deduplication was noticed while putting together the dbg scripts. Planned logic (not actual function names): lfsr_file_readnext -> lfsr_shrub_readnext \| \| \| v '---------> lfsr_btree_readnext lfsr_file_flushbuffer -> lfsr_shrub_carve ------------. .---------------------' \| v v lfsr_file_flushshrub -> lfsr_btree_carve -> lfsr_btree_buildcarve Though the btree part of the above statement is only a hypothetical at the moment. Not even the shrubs can survive compaction now. The reason is the new SLICE tag which needs low-level support in rbyd compact. SLICE introduces indirect references to data located in the same rbyd, which removes any copying cost associated with coalescing. Previously, a large coalesce_size risked O(n^2) runtime when incrementally appending small amounts of data, but with SLICEs we can defer coalescing to compaction time, where the copy is effectively free. This compaction-time-coalescing is also hypothetical, which is why our tests are failing. But the theory is promising. I was originally against this idea because of how it crosses abstraction layers, requiring some very low-level code that absolutely can not be omitted in a simpler littlefs driver. But after working on the actual file writing code for a while I've become convinced the tradeoff is worth it. Note coalesce_size will likely still need to be configurable. 
Data in fragmenting/sparse btrees is still susceptible to coalescing, and it's not clear the impacts of internal fragmentation when data sizes approach the hard block_size/2 limit.	2023-10-17 23:21:18 -05:00
Christopher Haster	fce1612dc0	Reverted to separate BTREE/BRANCH encodings, reordered on-disk structs My current thinking is that these are conceptually different types, with BTREE tags representing the entire btree, and BRANCH tags representing only the inner btree nodes. We already have multiple btree tags anyways: btrees attached to files, the mtree, and in the future maybe a bmaptree. Having separate tags also makes it possible to store a btree in a btree, though I don't think we'll ever use this functionality. This also removes the redundant weight field from branches. The redundant weight field is only a minor cost relative to storage, but it also takes up a bit of RAM when encoding. Though measurements show this isn't really significant. New encodings: btree encoding: branch encoding: .---+- -+- -+- -+- -. .---+- -+- -+- -+- -. \| weight \| \| blocks \| +---+- -+- -+- -+- -+ ' ' \| blocks \| ' ' ' ' +---+- -+- -+- -+- -+ ' ' \| trunk \| +---+- -+- -+- -+- -+ +---+- -+- -+- -+- -' \| trunk \| \| cksum \| +---+- -+- -+- -+- -' '---+---+---+---' \| cksum \| '---+---+---+---' Code/RAM changes: code stack before: 30836 2088 after: 30944 (+0.4%) 2080 (-0.4%) Also reordered other on-disk structs with weight/size, so such structs always have weight/size as the first field. This may enable some optimizations around decoding the weight/size without needing to know the specific type in some cases. --- This change shouldn't have affected functionality, but it revealed a bug in a dtree test, where a did gets caught in an mdir split and the split name makes the did unreachable. Marking this as a TODO for now. The fix is going to be a bit involved (fundamental changes to the opened-mdir list), and similar work is already planned to make removed files work.	2023-10-15 14:53:07 -05:00
Christopher Haster	b936e33643	Tweaked dbg scripts to resize tag repr based on weight This is a compromise between padding the tag repr correctly and parsing speed. If we don't have to traverse an rbyd (for, say, tree printing), we don't want to since parsing rbyds can get quite slow when things get big (remember this is a filesystem!). This makes tag padding a bit of a hard sell. Previously this was hardcoded to 22 characters, but with the new file struct printing it quickly became apparent this would be a problematic limit: 12288-15711 block w3424 0x1a.0 3424 67 64 79 70 61 69 6e 71 gdypainq It's interesting to note that this has only become an issue for large trees, where the weight/size in the tag can be arbitrarily large. Fortunately we already have the weight of the rbyd after fetch, so we can use a heuristic similar to the id padding: tag padding = 21 + nlog10(max(weight,1)+1) --- Also dropped extra information with the -x/--device flag. It hasn't really been useful and was implemented inconsistently. Maybe -x/--device should just be dropped completely...	2023-10-14 01:25:14 -05:00
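The padding heuristic above can be sketched in a few lines. This is only an illustration, not the script's actual code: `nlog10` is assumed here to mean ceil(log10(x)), i.e. a digit count, and both function names are hypothetical.

```python
import math

def nlog10(x):
    # ASSUMPTION: "nlog10" is taken to mean ceil(log10(x)), with 0 for x <= 0.
    # The commit message does not define it.
    return 0 if x <= 0 else math.ceil(math.log10(x))

def tag_padding(weight):
    # tag padding = 21 + nlog10(max(weight,1)+1), as written in the message
    return 21 + nlog10(max(weight, 1) + 1)

if __name__ == "__main__":
    for w in (0, 10, 3424):
        print(w, tag_padding(w))
```

Under that reading, an empty rbyd gets 22 characters of padding, the same as the old hardcoded value, and the padding grows by one character per extra decimal digit of weight.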
Christopher Haster	a2aa25aa8e	Tweaked dbgrbyd.py to show -1 tag rids	2023-10-14 00:53:31 -05:00
Christopher Haster	ef691d4cfe	Tweaked rbyd lookup/append to use 0 lower rid bias Previously our lower/upper bounds were initialized to -1..weight. This made a lot of the math unintuitive and confusing, and it's not really necessary to support -1 rids (-1 rids arise naturally in order-statistic trees that can have weight=0). The tweak here is to use lower/upper bounds initialized to 0..weight, which makes the math behave as expected. -1 rids naturally arise from rid = upper-1.	2023-10-14 00:52:00 -05:00
Christopher Haster	3fb4350ce7	Updated dbg scripts to support shrub trees - Added shrub tags to tagrepr - Modified dbgrbyd.py to use last non-shrub trunk by default - Tweaked dbgrbyd's log mode to find maximum seen weight for id padding	2023-10-13 23:35:03 -05:00
Christopher Haster	2f38822820	Still missing quite a bit, but rudimentary inlined-trees are now working And by working, I mean you can create inlined trees, just don't compact/split/move/etc anything. But this does outline the path files take when writing buffers into inlined trees. "Inlined trees" in littlefs are entire small rbyd trees embedded as secondary trees in an mdir's main rbyd tree. When fetching, we can indicate if a given trunk belongs to the main tree or secondary tree by setting one of the unused mode bits in the trunk's tag, now called the "deferred" bit. This bit doesn't need to be included in the alt's "key" field, so there's no issue with it conflicting with the alt's mode bits. This requires a bit of tweaking lfsr_rbyd_fetch, since it needs to fall back to the previous trunk if it discovers the most recent trunk belongs to an inlined tree. But as a benefit we can leverage the full power of rbyds in inlined files, including holes, partial updates, etc. One downside is it looks like these inlined trees may involve more work in maintaining their state correctly, since they need to be sort of "brought along" when mdirs are compacted, even if they don't actually have a reference in the mdir yet. But the sheer amount of flexibility this gives inlined files may make this overhead worth it.	2023-10-13 23:11:35 -05:00
Christopher Haster	dd6a4e6496	Dropped the header from dbg scripts I had never noticed xxd has no header until comparing its output against dbgblock.py. Turns out these headers aren't really all that useful, and even sometimes wrong in dbglfs.py.	2023-09-15 17:45:17 -05:00
Christopher Haster	c1fe64314c	Reworked how filesystem-level config is stored Now, instead of storing a single contiguous block of config data, config is stored as tagged metadata like any other attribute. This allows more flexibility towards adding/removing config in the future, without cluttering up the config with deprecated entries (see ATA's "IDENTIFY DEVICE" response). Most of the config entries are single leb128 limits on various integer types, with the exception of the magic string and version (major/minor pair). --- Note this also includes some semantic changes to the config: - Limits are stored as size-1. This avoids issues with integer overflow at extreme ranges. This was also adopted for block size (block limit) and block count (disk limit). This deviation between on-disk config and user-facing config risks confusion, but allows the potential for the full 2^31 range for these values. - The default cksum type, crc32c, has been changed to 0. Originally this was 2 to allow the type to map to the crc width for crc8, crc16, crc32c, crc64, etc. But dropping this idea and numbering checksums as they are implemented simplifies things. May come back to this. - Storing these configs as attributes opens up the option of on-disk defaults when configs are missing. I'm being a bit conservative with this one, as it's not clear to me if we should prefer default configs (less code/storage, risk of untested config parsing) or prefer explicit on-disk configs. Currently the following have defaults since they seem the most obvious to me: - cksum type => defaults to crc32c - redund type => defaults to parity (TODO, should this default to no redund?) - utag_limit => defaults to 0x7f (no special tag decoding) - uattr_limit => defaults to block_limit (implicit)	2023-09-15 14:51:25 -05:00
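To make the size-1 convention concrete, here is a minimal sketch of storing a limit as an unsigned leb128 of size-1. The helper names (`encode_limit`, `decode_limit`) are hypothetical and are not taken from littlefs; only the "limit = size-1, leb128-encoded" idea comes from the message above.

```python
def leb128_encode(value):
    # Standard unsigned leb128: 7 bits per byte, high bit = "more bytes follow"
    assert value >= 0
    out = bytearray()
    while True:
        byte = value & 0x7f
        value >>= 7
        if value:
            out.append(byte | 0x80)
        else:
            out.append(byte)
            return bytes(out)

def leb128_decode(data):
    # Returns (value, number of bytes consumed)
    value = 0
    for i, byte in enumerate(data):
        value |= (byte & 0x7f) << (7 * i)
        if not byte & 0x80:
            return value, i + 1
    raise ValueError("truncated leb128")

def encode_limit(size):
    # Hypothetical helper: on-disk limit = user-facing size - 1
    return leb128_encode(size - 1)

def decode_limit(data):
    # Hypothetical helper: user-facing size = on-disk limit + 1
    limit, _ = leb128_decode(data)
    return limit + 1

if __name__ == "__main__":
    print(encode_limit(4096).hex())     # 4KiB block size stored as 4095
    print(encode_limit(2**31).hex())    # 2^31 stored as 2^31-1
```

The point of the -1 is visible in the second case: a size of 2^31 does not fit in a signed 32-bit integer, but the stored limit 2^31-1 does.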
Christopher Haster	f7900edc1c	Updated dbg scripts with changes, adopted mbid.mrid in debug prints This format for mids is a compromise in readability vs debugability. For example, if our mbid weight is 256 (4KiB blocks), the 19th entry in the second mdir would be the raw integer 275. With this mid format, we would print it as 256.19. The idea is to make it easy to see it's the 19th entry in the mdir while still making it relatively easy to see that 256.19 and 275 are equivalent when debugging. --- The scripts also took some tweaking due to the mid change. Tried to keep the names consistent, but I don't think it's worthwhile to change too much of the scripts while they are working.	2023-09-05 10:10:30 -05:00
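A minimal sketch of the mbid.mrid split described above, assuming the per-mdir weight is a power of two (256 in the example). The function name `mid_repr` is hypothetical.

```python
def mid_repr(mid, mdir_weight):
    # ASSUMPTION: mdir_weight is a power of two, so the mid splits cleanly
    # into high bits (mbid) and low bits (mrid).
    assert mdir_weight > 0 and mdir_weight & (mdir_weight - 1) == 0
    mask = mdir_weight - 1
    mbid = mid & ~mask   # start of the mdir's range in the mtree
    mrid = mid & mask    # entry index within that mdir
    return '%d.%d' % (mbid, mrid)

if __name__ == "__main__":
    print(mid_repr(275, 256))
```

With the numbers from the message, the raw mid 275 prints as 256.19, and adding the two halves back together recovers 275.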
Christopher Haster	256430213d	Dropped separate BTREE/BRANCH encodings There is a bit of redundancy here, as we already know the weights of btree's inner-branches from their parents. But in theory sharing the same encoding for both the top level btree reference and inner-branches should offer more chance for deduplication and hopefully less code. This also moves some members around in the btree encoding so that the redund blocks are at the beginning. This _might_ simplify decoding of the variable-length redund blocks at some point. Current btree encoding: .----+----+----+----. \| blocks ... redund leb128s (1-20 bytes) : : \|----+----+----+----\| \| trunk ... 1 leb128 (1-5 bytes) \|----+----+----+----\| \| weight ... 1 leb128 (1-5 bytes) \|----+----+----+----\| \| cksum \| 1 le32 (4 bytes) '----+----+----+----' This also partially reverts some tag name changes: - BNAME -> BRANCH - DMARK -> BOOKMARK	2023-08-22 13:20:37 -05:00
Christopher Haster	314c832588	Adopted new struct encoding scheme with redund tag bits Struct tags, in littlefs, generally encode pointers to different on-disk data structures. At this point, they've gotten a bit complex, with the btree struct, for example, containing 1. a block address, 2. the trunk offset, 3. the weight of the trunk, and 4. a checksum. Also some future plans: 1. Block redundancy will make it so these pointers may have a variable number of block addresses to contend with. 2. Different checksum types may make the checksum field itself variable length, at least on larger builds of littlefs. This may also happen if we support truncated checksums in littlefs for storage saving reasons. Having two variable sized fields becomes a bit of a pain. We can use the encoded tag size to figure out the size of one of these fields, but not both. The change here makes it so the tag size now determines the checksum size, requiring the redundancy amount to go somewhere else. This makes it so checksums can be variably sized, and the explicit redundancy amount avoids the need to parse the leb128s fully to know how many blocks we're expecting. But where to put the redundancy amount? This commit carves out 2-bits from the struct tag to store the amount of redundancy to allow up to 3 blocks of redundancy: v0000011 0TTTTTrr ^--^---^-^----^-^- valid bit '---\|-\|----\|-\|- 3-bit mode (0x0 for structs) '-\|----\|-\|- 4-bit suptype (0x3 for structs) '----\|-\|- 0 bit (reserved for leb128) '-\|- 5-bit subtype '- 2-bit redund 3 blocks may sound extremely limiting, but it's a common limit for filesystems, 1. because you have to keep in mind each redundant block adds that much more writing/reading overhead and 2. the fact that 2^(2^n)-1 is always divisible by 3 makes >3 parity blocks much more complicated mathematically. Worst case, if we ever have >3 redundant blocks, we can create new struct subtypes. 
Maybe adding extended struct types that prefix the block addresses with a leb128 encoding the redundancy amount. --- As a part of this, reorganized the on-disk btree and ecksum encodings to put the checksum last. Also split out the btree and inner btree branches as separate struct types. The btree includes the weight, whereas the weight is implicit in inner btree branches. This came about after realizing context-specific prefixes are relatively easy to add thanks to the composability of our parsers. This led to some name collisions though: - BRANCH -> BNAME - BOOKMARK -> DMARK	2023-08-11 12:55:48 -05:00
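The `v0000011 0TTTTTrr` layout can be sketched as simple bit packing. This treats the two tag bytes as one 16-bit value with the first byte in the high bits, which is an assumption made for illustration; the function names are hypothetical.

```python
STRUCT_MODE = 0x0      # 3-bit mode, 0x0 for structs (from the diagram)
STRUCT_SUPTYPE = 0x3   # 4-bit suptype, 0x3 for structs (from the diagram)

def struct_tag(subtype, redund, valid=0):
    # Pack v mmm tttt | 0 TTTTT rr into 16 bits
    assert 0 <= subtype < 32   # 5-bit subtype
    assert 0 <= redund < 4     # 2-bit redund
    return ((valid << 15) | (STRUCT_MODE << 12) | (STRUCT_SUPTYPE << 8)
            | (subtype << 2) | redund)

def struct_tag_fields(tag):
    # Unpack back into (valid, mode, suptype, subtype, redund)
    return ((tag >> 15) & 0x1, (tag >> 12) & 0x7, (tag >> 8) & 0xf,
            (tag >> 2) & 0x1f, tag & 0x3)

if __name__ == "__main__":
    tag = struct_tag(subtype=1, redund=2)
    print(hex(tag), struct_tag_fields(tag))
```

Bit 7 of the second byte is never set by this packing, which matches the "reserved for leb128" note in the diagram.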
Christopher Haster	d2f2b53262	Renamed fcksum -> ecksum This checksum is used to keep track of if we have erased, and not yet touched, the unused bytes trailing our current commit in the rbyd. The working theory is that if any prog attempt is made, it will, most likely, change the checksum of the contents, allowing littlefs to determine if trailing erased-state is safe to use, even under powerloss. littlefs can also perturb future data by a single bit, to force this checksum to always be invalidated during normal operation. The original name, "forward erased-state checksums (fcksum)", came from the idea that the checksum "looks forward" into the next commit. But after using them for a bit, I think the name is unnecessarily confusing. It, uh, also looks a lot like a swear word. I think shortening the name to just "erased-state checksums (ecksum)", even though the previous name is already in use in a release, is reasonable. --- It's probably hard to believe but the name change from fcrc -> ecrc really was unrelated to the crc -> cksum change. But boy is it convenient for avoiding an awkward name. A lot of these name changes involved sed scripts, so I didn't notice how awkward fcksum would be to use until writing this commit message.	2023-08-07 14:34:47 -05:00
Christopher Haster	7031d6e1b3	Changed most references to crc/csum -> cksum The reason for this is to move away from the idea that littlefs is strictly bound to CRCs and make the code more welcoming to other checksum types, such as SHA256, etc. Of course, changing the name doesn't really do anything. littlefs actually _is_ strictly bound to CRCs in a couple ways that other filesystems aren't. These would need to have workarounds for other checksum types: - We leverage the parity-preserving nature of (some) CRCs to not have to also calculate the parity of metadata in rbyd commits. - We leverage the linearity of CRCs to retroactively flip the perturb bit in the cksum tag without needing to recalculate the checksum. Though the fact we need to do this is because of how we use parity above, so this may just not be needed for non-CRC checksums. - The plans for global-CRCs (not yet implemented) rely heavily on the mathematical properties of CRC polynomials. This doesn't mean global-CRCs can't work with other checksums, you would just need to find a different type of polynomial.	2023-08-07 14:18:37 -05:00
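The linearity property mentioned above can be demonstrated in a few lines. Note the swap: this sketch uses zlib's CRC-32 rather than crc32c, simply because it is in Python's standard library; the property holds for both. The function name `flip_bit_crc` is hypothetical and this is not how littlefs implements the perturb bit.

```python
import zlib

def flip_bit_crc(old_crc, length, bit):
    # CRCs are affine over GF(2): for equal-length messages,
    #   crc(a ^ d) = crc(a) ^ crc(d) ^ crc(zeros)
    # so the effect of flipping one bit depends only on its position and
    # the message length, never on the message contents.
    delta = bytearray(length)
    delta[bit // 8] ^= 1 << (bit % 8)
    return old_crc ^ zlib.crc32(bytes(delta)) ^ zlib.crc32(bytes(length))

if __name__ == "__main__":
    data = b'littlefs metadata'
    flipped = bytearray(data)
    flipped[3] ^= 0x10
    patched = flip_bit_crc(zlib.crc32(data), len(data), 3 * 8 + 4)
    print(patched == zlib.crc32(bytes(flipped)))
```

Since the correction term does not depend on the data, it could be precomputed for a fixed bit position, which is what makes a retroactive flip cheap.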
Christopher Haster	d77a173d5c	Changed source to consistently use rid for rbyd ids Originally it made sense to name the rbyd ids, well, ids, at least in the internals of the rbyd functions. But this doesn't work well outside of the rbyd code, where littlefs has to juggle several different id types with different purposes: - rid => rbyd-id, 31-bit index into an rbyd - bid => btree-id, 31-bit index into a btree - mid => mdir-id, 15-bit+15-bit index into the mtree - did => directory-id, 31-bit unique identifier for directories Even though context makes it clear which id the id refers to in the rbyd internals, updating the name to rid makes it clearer that these are the same type of id when looking at code both inside and outside the rbyd functions.	2023-08-07 14:10:09 -05:00
Christopher Haster	64a1b46ea2	Renamed a couple directory related things - dstart -> bookmark - dnamelookup -> namelookup	2023-08-07 14:00:44 -05:00
Christopher Haster	c2d9f1b047	Implemented, but untested, global-removes This implementation is in theory correct, but of course, being untested, who knows? Though this does come with remounting added to all of the directory tests. This effectively tests that all of the directory creation tests we have so far maintain grm=0 after each unmount-mount cycle. Which is valuable.	2023-07-18 21:40:36 -05:00
Christopher Haster	cc0ac25b5e	Implemented infrastructure necessary for global-removes This has, in theory, global-removes (grm) being written out as a part of directory creation, but they aren't used in any form and so may not be being written correctly. But it did require quite a bit of problem solving to get to this point (the interactions between mtree splits and grms is really annoying), so it's worth a commit.	2023-07-18 21:40:30 -05:00
Christopher Haster	da810aca26	Implemented mtree path/dname lookup, rudimentary lfsr_mkdir/lfsr_dir_read This makes it now possible to create directories in the new system. The new system now uses a single global "mtree" to store all metadata entries in the filesystem. In this system, a directory is simply a range of metadata entries. This has a number of benefits, but does come with its own problems: 1. We need to indicate which directory each file belongs to. To do this the file's name entry has been changed to a tuple of leb128-encoded directory-id + actual file name: 01 66 69 6c 65 2e 74 78 74 .file.txt ^ '----------+----------' '------------\|------------ leb128 directory-id '------------ ascii/utf8 name If we include the directory-id as part of filename comparison, files should naturally be next to other files in the same directory. 2. We need a way to allocate directory-ids for new directories. This turns out to be a bit more tricky than I expected. We can't use any mid/bid/rid inherent to the mtree, because these change on any file creation/deletion. And since we commit the did into the tree, that's not acceptable. Initially I thought you could just find the largest did and increment, but this gives you no way to reclaim deleted dids. And sure, deleted dids have no storage consumption, but eventually you will overflow the did integer. Since this can suddenly happen in a filesystem that's been in a steady-state for years, that's pretty unacceptable. One solution is to do a simple linear search over the mtree for an unused did. But with a runtime of O(n^2 log(n)), this raises performance concerns. Sidenote: It's interesting to note that the Linux kernel's allocation of process-ids, a very similar problem, is surprisingly complex and relies on a radix-tree of bitmaps (struct idr). This suggests I'm not missing an obvious solution somewhere. The solution I settled on here is to instead treat the set of dids as a sort of hash table: 1. 
Hash the full directory path into a did. 2. Perform a linear search until we have no collision. leb128(truncate28(crc32c("dir"))) .--------' v 9e cd c8 30 66 69 6c 65 2e 74 78 74 ...0file.txt '----+----' '----------+----------' '-----------------\|------------ leb128 directory-id '------------ ascii/utf8 name Worst case, this can still exhibit the worst case O(n^2 log(n)) performance when we are close to full dids. However that seems unlikely to happen in practice, since we don't truncate our hashes, unlike normal hash tables. An additional 32-bit word for each file is a small price to pay for a low-chance of collisions. In the current implementation, I do truncate the hash to 28-bits. Since we encode the hash with leb128, and hashes are statistically random, this gives us better usage of the leb128 encoding. However it does limit a 32-bit littlefs to 256 Mi directories. Maybe this should be a configurable limit in the future. But that highlights another benefit of this scheme. It's easy to change in the future without disk changes. 3. We need a way to know if a directory-id is allocated, even if the directory is empty. For this we just introduce a new tag: LFSR_TAG_DSTART, which is an empty file entry that indicates the directory at the given did in the mtree is allocated. To create/delete these atomically with the reference in our parent directory, we can use the GRM system for atomic renames. Note this isn't implemented yet. This is also the first time we finally get around to testing all of the dname lookup functions, so this did find a few bugs, mostly around reporting the root correctly.	2023-07-05 13:41:21 -05:00
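The hash-then-probe scheme above can be sketched as follows. Several details are assumptions made for illustration: "truncate28" is taken to mean keeping the low 28 bits, the probe is a simple +1 with wraparound, the set of allocated dids is modeled as a Python set rather than an mtree lookup, and the name `alloc_did` is hypothetical.

```python
def crc32c(data, crc=0):
    # Bitwise CRC-32C (Castagnoli), reflected polynomial 0x82f63b78
    crc ^= 0xffffffff
    for byte in data:
        crc ^= byte
        for _ in range(8):
            crc = (crc >> 1) ^ (0x82f63b78 if crc & 1 else 0)
    return crc ^ 0xffffffff

def alloc_did(path, used, bits=28):
    # 1. hash the full directory path into a did
    # 2. linear search until there is no collision
    mask = (1 << bits) - 1
    if len(used) > mask:
        raise RuntimeError("no free dids")
    did = crc32c(path.encode('utf8')) & mask   # ASSUMPTION: low 28 bits
    while did in used:
        did = (did + 1) & mask                 # ASSUMPTION: +1 probe
    return did

if __name__ == "__main__":
    used = set()
    for path in ("dir", "dir/a", "dir/b"):
        did = alloc_did(path, used)
        used.add(did)
        print(path, hex(did))
```

A 28-bit did always fits in 4 leb128 bytes (4 x 7 bits), which is the "better usage of the leb128 encoding" the message refers to.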
Christopher Haster	cf588ac3fa	Dropped alt-always as an rbyd trunk terminator Now that tree rebalancing is implemented and needed a null terminator anyways, I think it's clear that the benefit of the alt-always pointers as trunk terminator has pretty limited value. Now a null or other tag is needed for every trunk, which simplifies checks for end-of-trunk. Alt-always tags are still emitted for deletes, etc, but there their behavior is implicit, so no special checks are needed. Alt-always tags are naturally cleaned up as a part of rbyd pruning.	2023-06-27 00:49:31 -05:00
Christopher Haster	43dc3a5c8d	Implemented tree rebalancing during rbyd compaction This isn't actually for performance reasons, but to reduce storage overhead of the rbyd metadata tree, which was showing signs of being problematic for small block sizes. Originally, the plan for compaction was to rely on the self-balancing rbyd append algorithm and simply append each tag to a new tree. Unfortunately, since each append requires a rewrite of the trunk (current search path), this introduces ~nlog(n) alts but only uses ~n alts for the final tree. This really starts to put pressure on small blocks, where the exponential-ness of the log doesn't kick in and overhead limits are already tight. Measuring lfsr_mdir_commit code size, this shows a ~556 byte cost on thumb: 16416 -> 16972 (+3.4%). Though there are still some optimizations on the table, this implementation needs a cleanup pass. alt overhead code cost rebalance: <= 28n 16972 append: <= 24nlog(n) 16416 Note these all assume worst case alt overhead, but we _need_ to assume worst case for our rbyd estimations, or else the filesystem can get stuck in unrecoverable compaction states. Because of the code cost I'm not sure if rebalancing will stay, be optional, or replace append-compaction completely yet. Some implementation notes: - Most tree balancing algorithms rely on true recursion, I suspect recursion may be a hard requirement in general, but it's hard to find bounded-ram algorithms. This solution gets around the ram requirement by leveraging the fact that our tags exist in a log to build up each layer in the tree tail-recursively. It's interesting to note that this is a special case of having little ram but lots of storage. - Humorously this shouldn't result in a performance improvement. Rbyd trees result in a worst case 2log(n) height, and rebalancing gives us a perfect worst case log(n) height, but, since we need an additional alt pointer for each node in our tree, things bump back up to 2log(n). 
- Originally the plan was to terminate each node with an alt-always tag, but during implementation I realized there was no easy way to get the key that splits the children with awkward tree lookups. As a workaround each node is terminated with an altle tag that contains the key followed by an unreachable null tag. This is redundant information, but makes the algorithm easier to implement. Fortunately null tags use the smallest tag encoding, which isn't that small, but that means this wastes at most 4*n bytes. - Note this preserves the first-tag-always-ends-up-at-off=0x4 rule, which is necessary for the littlefs magic to end up in a consistent place. - I've dropped dropping vestigial names for now, which means vestigial names can remain in btrees indefinitely. Need to revisit this.	2023-06-25 15:23:46 -05:00
Christopher Haster	e79c15b026	Implemented wide tags for both rbyd commit and lookup Wide tags are a happy accident that fell out of the realization that we can view all subtypes of a given tag suptype as a range in our rbyd. Combining this with how natural it is to operate on ranges in an rbyd allows us to perform operations on an entire range of subtypes as though it were a single tag. - lookup wide tag => find the smallest tag with this tag's suptype, O(log(n)) - remove wide tag => remove all tags with this tag's suptype, O(log(n)) - append wide tag => remove all tags with this tag's suptype, and then append our tag, O(log(n)) This is very useful for littlefs, where we've already been using tag's subtypes to hold extra type info, and have had to rely on awkward alternatives such as deleting existing subtypes before writing our new subtype. For example, when committing file metadata (not yet implemented), we can append a wide struct tag to update the metadata while also clearing out any lingering struct tags from previous commits, all in one rbyd append operation. This uses another mode bit in-device to change the behavior of lfsr_rbyd_commit, of which we have a couple: vwgrtttt 0TTTTTTT ^^^^---^--------^- valid bit (currently unused, maybe errors?) '\|\|---\|--------\|- wide bit, ignores subtype (in-device) '\|---\|--------\|- grow bit, don't create new id (in-device) '---\|--------\|- rm bit, remove this tag (in-device) '--------\|- 4-bit suptype '- leb128 subtype	2023-06-19 16:08:43 -05:00
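The wide-tag semantics can be modeled with a sorted list of (suptype, subtype) keys. This is only a model of the behavior described above: the real rbyd does these operations in O(log(n)), whereas list slicing here is O(n). The class and method names are hypothetical.

```python
import bisect

class TagStore:
    """Toy model: tags sorted by (suptype, subtype), so every suptype
    occupies one contiguous range."""

    def __init__(self):
        self.tags = []   # sorted list of (suptype, subtype)
        self.data = {}   # tag -> value

    def _range(self, suptype):
        lo = bisect.bisect_left(self.tags, (suptype, 0))
        hi = bisect.bisect_left(self.tags, (suptype + 1, 0))
        return lo, hi

    def lookup_wide(self, suptype):
        # find the smallest tag with this suptype
        lo, hi = self._range(suptype)
        return self.tags[lo] if lo < hi else None

    def remove_wide(self, suptype):
        # remove all tags with this suptype
        lo, hi = self._range(suptype)
        for tag in self.tags[lo:hi]:
            del self.data[tag]
        del self.tags[lo:hi]

    def append(self, suptype, subtype, value, wide=False):
        # wide append: clear the whole suptype range, then add our tag
        if wide:
            self.remove_wide(suptype)
        tag = (suptype, subtype)
        if tag not in self.data:
            bisect.insort(self.tags, tag)
        self.data[tag] = value

if __name__ == "__main__":
    store = TagStore()
    store.append(3, 1, 'old struct')
    store.append(3, 5, 'older struct')
    store.append(4, 0, 'attr')
    store.append(3, 2, 'new struct', wide=True)
    print(store.tags)
```

In the file-metadata example from the message, the single wide append plays the role of "delete any lingering struct tags, then write the new one".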
Christopher Haster	2467d2e486	Added a separate tag encoding for the mtree This helps with debugging and can avoid weird issues if a file btree ever accidentally ends up attached to id -1 (due to fs bug). Though a separate encoding isn't strictly necessary, maybe this should be reverted at some point.	2023-06-18 15:12:43 -05:00
Christopher Haster	7180b70c9c	Allowed "alta" (altbgt 0) to terminate rbyd trunks, dropped rm bit This replaces unr with null on disk, though note both the rm bit and unr are used in-device still, they just don't get written to disk. This removes the need for the rm bit on disk. Since we no longer need to figure out what's been removed during fetch, we can save this bit for both internal and future on-disk use. Special handling of alta allows us to avoid emitting an unr tag (now null) if the current trunk is truly unreachable. This is minor now, but important for a theoretical rbyd rebalance operation (planned), which brings the rbyd overhead down from ~3x to ~2x. These changes give us two ways to terminate trunks without a tag: 1. With an alta, if the current trunk is unreachable: altbgt 0x403 w0 0x7b altbgt 0x402 w0 0x29 alta w0 0x4 2. With a null, if the current trunk is reachable, either for code convenience or because emitting an alta is impossible (an empty rbyd for example): altbgt 0x403 w0 0x7b altbgt 0x402 w0 0x29 altbgt 0x401 w0 0x4 null	2023-06-17 18:11:45 -05:00
Christopher Haster	2113d877d6	Moved bits around in tag encoding to allow leb128 custom attributes Yet another tag encoding, but hopefully narrowing in on a good long term design. This change trades a subtype bit for the ability to extend subtypes indefinitely via leb128 in the future. The immediate benefit is ~unlimited custom attributes, though I'm not sure how to make this configurable yet. Extended custom attributes may have a significant impact on alt tag sizes, so it may be worth defaulting to only 8-bit custom attributes still. Tag encoding: vmmmtttt 0TTTTTTT 0wwwwwww 0sssssss ^--^---^--------^--------^--------^- valid bit '---\|--------\|--------\|--------\|- 3-bit mode '--------\|--------\|--------\|- 4-bit suptype '--------\|--------\|- leb128 subtype '--------\|- leb128 weight '- leb128 size/jump This limits subtypes to 7-bits, but this seems very reasonable at the moment. This also seems to limit custom attributes to 7-bits, but we can use two separate suptypes to bring this back up to 8-bits. I was planning to do this anyways to have separate "user-attributes" and "system-attributes", so this actually fits in really well.	2023-06-16 01:51:33 -05:00
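A rough sketch of the `vmmmtttt 0TTTTTTT 0wwwwwww 0sssssss` layout, with the weight and size as unsigned leb128s. The function names are hypothetical, and the subtype is kept to a single 7-bit byte as the message describes for now.

```python
def leb128_encode(value):
    # Standard unsigned leb128: 7 bits per byte, high bit = continuation
    out = bytearray()
    while True:
        byte = value & 0x7f
        value >>= 7
        if value:
            out.append(byte | 0x80)
        else:
            out.append(byte)
            return bytes(out)

def leb128_decode(data):
    # Returns (value, number of bytes consumed)
    value = 0
    for i, byte in enumerate(data):
        value |= (byte & 0x7f) << (7 * i)
        if not byte & 0x80:
            return value, i + 1
    raise ValueError("truncated leb128")

def encode_tag(valid, mode, suptype, subtype, weight, size):
    assert valid < 2 and mode < 8 and suptype < 16 and subtype < 128
    head = (valid << 7) | (mode << 4) | suptype   # vmmmtttt
    return (bytes([head, subtype])                # 0TTTTTTT
            + leb128_encode(weight)               # leb128 weight
            + leb128_encode(size))                # leb128 size/jump

def decode_tag(data):
    head, subtype = data[0], data[1]
    weight, n = leb128_decode(data[2:])
    size, _ = leb128_decode(data[2 + n:])
    return (head >> 7, (head >> 4) & 0x7, head & 0xf, subtype, weight, size)

if __name__ == "__main__":
    raw = encode_tag(1, 0, 0x3, 0x01, 200, 8)
    print(raw.hex(), decode_tag(raw))
```

Small weights and sizes cost one byte each, so a typical tag is 4 bytes, and only larger values pay for extra leb128 bytes.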
Christopher Haster	c60fa69ce1	Optimized dbg*.py tree generation/rendering by deduplicating edges Optimizing a script? This might sound premature, but the tree rendering was, uh, quite slow for any decently sized (>1024) btree. The main reason is that tree generation is quite hacky in places, repeatedly spitting out multiple copies of the inner node's rbyd trees for example. Rather than rewrite the tree generation implementation to be smarter, this just changes all edge representations to namedtuples (which may reduce memory pressure a bit), and collects them into a Python set. This has the effect of deduplicating generated edges efficiently, and improved the rendering performance significantly. --- I also considered memoizing rbyd tree, but dropped the idea since the current renderer performs well enough.	2023-05-30 18:17:51 -05:00
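The namedtuple-plus-set trick looks roughly like this. The `Edge` fields and the `collect_edges` helper are hypothetical stand-ins, not the names used in the dbg scripts.

```python
import collections

# namedtuples are hashable and compare by value, so identical edges
# generated more than once collapse into a single set entry
Edge = collections.namedtuple('Edge', ['a', 'b', 'color'])

def collect_edges(paths):
    # Each path is a root-to-leaf list of node ids. Paths that share a
    # prefix would otherwise emit the same edges repeatedly.
    edges = set()
    for path in paths:
        for a, b in zip(path, path[1:]):
            edges.add(Edge(a, b, 'b'))
    return edges

if __name__ == "__main__":
    paths = [[0, 1, 2], [0, 1, 3], [0, 1, 3]]
    for edge in sorted(collect_edges(paths)):
        print(edge)
```

Three overlapping paths contribute six raw edges here, but only three distinct ones survive in the set.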
Christopher Haster	9b803f9625	Reimplemented tree rendering in dbg*.py scripts The goal here was to add the option to show the combined rbyd trees in dbgbtree.py/dbgmtree.py. This was quite tricky, (and not really helped by the hackiness of these scripts), but was made a bit easier by adding a general purpose tree renderer that can render a precomputed set of branches into the tag output. For example, a 2-deep rendering of a simple btree with a block size of 1KiB, where you can see a bit of the emergent data-structure: $ ./scripts/dbgbtree.py disk -B1024 0x223 -t -Z2 -i btree 0x223.90, rev 46, weight 1024 rbyd ids tag ... 0223.0090: .-+ 0-199 btree w200 9 ... 00cb.0048: \| \| .-> 0-39 btree w40 7 ... \| \| .---+-> 40-79 btree w40 7 ... \| \| \| .---> 80-119 btree w40 7 ... \| \| \| \| .-> 120-159 btree w40 7 ... \| '-+-+-+-> 160-199 btree w40 7 ... 0223.0090: .---+-+ 200-399 btree w200 9 ... 013e.004b: \| \| .-> 200-239 btree w40 7 ... \| \| .---+-> 240-279 btree w40 8 ... \| \| \| .---> 280-319 btree w40 8 ... \| \| \| \| .-> 320-359 btree w40 8 ... \| '-+-+-+-> 360-399 btree w40 8 ... 0223.0090: \| .---+ 400-599 btree w200 9 ... 01a7.004c: \| \| \| .-> 400-439 btree w40 8 ... \| \| \| .---+-> 440-479 btree w40 8 ... \| \| \| \| .---> 480-519 btree w40 8 ... \| \| \| \| \| .-> 520-559 btree w40 8 ... \| \| '-+-+-+-> 560-599 btree w40 8 ... 0223.0090: \| \| .-+ 600-799 btree w200 9 ... 021e.004c: \| \| \| \| .-> 600-639 btree w40 8 ... \| \| \| \| .---+-> 640-679 btree w40 8 ... \| \| \| \| \| .---> 680-719 btree w40 8 ... \| \| \| \| \| \| .-> 720-759 btree w40 8 ... \| \| \| '-+-+-+-> 760-799 btree w40 8 ... 0223.0090: +-+-+-+ 800-1023 btree w224 10 ... 021f.0298: \| .-> 800-839 btree w40 8 ... \| .-+-> 840-879 btree w40 8 ... \| \| .-> 880-919 btree w40 8 ... '---+-+-> 920-1023 btree w104 9 ... This tree renderer also replaces the ad hoc tree renderer in dbgrbyd.py for consistency.	2023-05-30 18:04:54 -05:00
Christopher Haster	b67fcb0ee5	Added dbgmtree.py for debugging the littlefs metadata-tree This builds on dbgrbyd.py and dbgbtree.py by allowing for quick debugging of the littlefs mtree, which is a btree of rbyd pairs with a few bells and whistles. This also comes with a number of tweaks to dbgrbyd.py and dbgbtree.py, mostly changing rbyd addresses to support some more mdir friendly formats. The syntax for rbyd addresses is starting to converge into a couple common patterns, which is nice for quickly determining what type of address you are looking at at a glance: - 0x12 => An rbyd at block 0x12 - 0x12.34 => An rbyd at block 0x12 with trunk 0x34 - 0x{12,34} => An rbyd at either block 0x12 or block 0x34 (an mdir) - 0x{12,34}.56 => An rbyd at either block 0x12 or block 0x34 with trunk 0x56 These scripts have also been updated to support any number of blocks in an rbyd address, for example 0x{12,34,56,78}. This is a bit of future proofing. >2 blocks in mdirs may be explored in the future for the increased redundancy.	2023-05-30 18:04:54 -05:00
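The four address patterns listed above are regular enough to parse with a single regex. This is a sketch, not the scripts' actual parser: the function name `parse_rbyd_addr` is hypothetical, and the trunk is assumed to be written in hex like the block addresses.

```python
import re

# 0x12 | 0x12.34 | 0x{12,34} | 0x{12,34}.56
ADDR = re.compile(
    r'^0x(?:\{([0-9a-fA-F,]+)\}|([0-9a-fA-F]+))(?:\.([0-9a-fA-F]+))?$')

def parse_rbyd_addr(s):
    # Returns (list of block addresses, trunk or None)
    m = ADDR.match(s)
    if not m:
        raise ValueError("not an rbyd address: %r" % s)
    blocks_s = m.group(1) if m.group(1) is not None else m.group(2)
    blocks = [int(b, 16) for b in blocks_s.split(',')]
    # ASSUMPTION: the trunk is hex, matching the block addresses
    trunk = int(m.group(3), 16) if m.group(3) is not None else None
    return blocks, trunk

if __name__ == "__main__":
    for addr in ('0x12', '0x12.34', '0x{12,34}', '0x{12,34}.56',
                 '0x{12,34,56,78}'):
        print(addr, parse_rbyd_addr(addr))
```

Returning the blocks as a list means the >2-block form needs no special casing.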
Christopher Haster	975a98b099	Renamed a few superblock-related things - supermdir -> mroot - supermagic -> magic - superconfig -> config	2023-05-30 14:46:56 -05:00
Christopher Haster	738eb52159	Tweaked tag encoding/naming for btrees/branches LFSR_TAG_BNAME => LFSR_TAG_BRANCH LFSR_TAG_BRANCH => LFSR_TAG_BTREE Maybe this will be a problem in the future if our branch structure is not the same as a standalone btree, but I don't really see that happening.	2023-05-30 13:41:28 -05:00
Christopher Haster	85ebdd0881	Reintroduced Brent's algorithm for cycle detection in lfsr_mount	2023-05-30 13:28:07 -05:00
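For readers unfamiliar with Brent's algorithm, here is a minimal sketch of the general technique. It is not the code in lfsr_mount: `has_cycle` and `next_block` are hypothetical names, and the chain is assumed to end with None when it terminates normally.

```python
def has_cycle(next_block, start):
    """Brent's cycle detection over a chain that may end with None.

    next_block is a hypothetical callback returning the successor of a
    block, or None at the end of the chain. Only two positions are kept
    in memory, regardless of chain length.
    """
    tortoise = start
    hare = next_block(start)
    power = steps = 1
    while hare is not None:
        if hare == tortoise:
            return True
        if steps == power:
            # Teleport the tortoise to the hare and double the window
            tortoise = hare
            power *= 2
            steps = 0
        hare = next_block(hare)
        steps += 1
    return False
```

The tortoise only moves when the search window doubles, so each block is visited once by the hare and no per-block bookkeeping is needed.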
Christopher Haster	70a3a2b16e	Rough implementation of lfsr_format/mount/unmount This work already indicates we need more data-related helper functions. We shouldn't need this many function calls to do "simple" operations such as fetch the superconfig if it exists.	2023-05-30 13:16:03 -05:00
Christopher Haster	a511696bad	Added ability to bypass rbyd fetch during B-tree lookups This is an absurd optimization that stems from the observation that the branch encoding for the inner-rbyds in a B-tree is enough information to jump directly to the trunk of the rbyd without needing an lfsr_rbyd_fetch. This results in a pretty ridiculous performance jump from O(m log_m(n/m)) to O(log(m) log_m(n/m)). If the complexity analysis isn't impressive enough, look at some rough benchmarking of read operations for 4KiB-block, 1K-entry B-trees: 12KiB ^ :: :. :: .: .: :. : .: :. : : .. : : . : .: : : : \| .:: .::.::.:: ::.::::::::::::.::::::::.::::::::::::. \| : :::':: ::'::'::':: :' :':: :'::::::::': ::::::': : before \| ::: ::' :' :' :: :' '' ' ' '' : : : '' ' ' ' \| ::: '' \|: 0B :'------------------------------------------------------> .17KiB ^ ............::::::::::::::::::::::::::::: \| . .....:::::''''''''' ' ' ' \| .:::::::::::: after \| :':'' \|.:: .:' 0B :-------------------------------------------------------> 0 1K In order for this to work, the branch encoding did need to be tweaked slightly. Before it stored block+off, now it stores block+trunk where "trunk" is the offset of the entry point into the rbyd tree. Both off and trunk are enough info to know when to stop fetching, if necessary, but trunk allows lookups to jump directly into the branch's rbyd tree without a fetch. With the change to trunk, lfsr_rbyd_fetch has also been extended to allow fetching of any internal trunks, not just the last trunk in the commit. This is very useful for dbgrbyd.py, but doesn't currently have a use in littlefs itself. But it's at least valuable to have the feature available in case it does become useful. Note that two cases still require the slower O(m log_m(n/m)) lookup with lfsr_rbyd_fetch: 1. Name lookups, since we currently use a linear-search O(m) to find names. 2. Validating B-tree rbyd's, which requires a linear fetch O(m) to validate the checksums. We will need to do this at least once after mount. It's also worth mentioning this will likely have a large impact on B-tree traversal speed. Which is huge as I am expecting B-tree traversal to be the main bottleneck once garbage-collection (or its replacement) is involved.	2023-04-14 00:51:34 -05:00
Christopher Haster	7eb0c4763a	Reversed LFSR_ATTR id/tag argument order I've been wanting to make this change for a while now (tag,id => id,tag). The id,tag order matches the common lexicographic order used for sorting tuples. Sorting tag,id tuples by their id first is less common. The reason for this order in the codebase is because all attrs on disk start with their tag first, since its decoding determines the purpose of the id field (keep in mind this includes other non-tree tags such as crcs, alts, etc). But with the move to storing weights instead of tags on disk, this gives us a clear point to switch from tag,w to id,tag ordering. I may be thinking too much about this, but it does affect a significant amount of the codebase.	2023-04-14 00:43:33 -05:00
Christopher Haster	2142b4a09d	Reworked dbgrbyd.py's tree renderer to make more sense While the previous renderer was "technically correct", the attempt to map rotated alts to their nearest neighbor just made the resulting tree an unreadable mess. Now the renderer prunes alts with unreachable edges (like they would be during lfsr_rbyd_append). And aligns all alts with their destination trunk. This results in a much more readable, if slightly less accurate, rendering of the tree. Example: $ ./scripts/dbgrbyd.py -B4096 disk 0 -t rbyd 0x0, rev 1, size 1508, weight 40 off ids tag data (truncated) 0000032a: .-+-> 0 reg w1 1 73 s 00000026: \| '-> 1-5 reg w5 1 62 b 00000259: .-------+---> 6-11 reg w6 1 6f o 00000224: \| .-+-+-> 12-17 reg w6 1 6e n 0000028e: \| \| \| '-> 18 reg w1 1 70 p 00000076: \| \| '---> 19-20 reg w2 1 64 d 0000038f: \| \| .-> 21-22 reg w2 1 75 u 0000041d: \| .---+---+-> 23 reg w1 1 78 x 000001f3: \| \| .-> 24-27 reg w4 1 6d m 00000486: \| \| .-----+-> 28-29 reg w2 1 7a z 000004f3: \| \| \| .-----> 30-31 reg w2 1 62 b 000004ba: \| \| \| \| .---> 32-35 reg w4 1 61 a 0000058d: \| \| \| \| \| .-> 36-37 reg w2 1 65 e 000005c6: +-+-+-+-+-+-> 38-39 reg w2 1 66 f	2023-04-14 00:41:55 -05:00
Christopher Haster	0ccf283321	Changed in-tree tags to store their weights Storing weights instead of ids just had a number of benefits, suggesting this is a better design: - Calculating the id and delta of each rbyd trunk is surprisingly easier - id is now just lower+w-1, and no extra conditions are needed for unr tags, which just have a weight of zero. - Removes ambiguity around which id unr tags should be assigned to, especially unrs that delete ids. - No more +-1 weirdness when encoding/decoding tag ids - the weight can be written as-is and -1 ids are inferred from their weight and position in the tree (lower+w-1 = 0+0-1 = -1). - Weights compress better under leb128 encoding, since they are usually quite small.	2023-04-14 00:32:05 -05:00
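The id arithmetic and the leb128 claim can be illustrated with a short sketch. This is a simplification, not littlefs code: `ids_from_weights` and `leb128` are hypothetical helpers, and the lower bound is modeled as a running sum over a flat list of weights rather than derived from alt pointers during a tree descent.

```python
def ids_from_weights(weights):
    """Return the id of each tag, given tag weights in tree order.

    Each tag's id is lower+w-1, where lower is the sum of all weights
    before it (a flattened stand-in for the lower bound in the rbyd).
    """
    ids = []
    lower = 0
    for w in weights:
        ids.append(lower + w - 1)
        lower += w
    return ids

def leb128(x):
    """Encode a non-negative integer as unsigned leb128 bytes."""
    out = bytearray()
    while True:
        byte = x & 0x7f
        x >>= 7
        if x:
            out.append(byte | 0x80)  # more bytes follow
        else:
            out.append(byte)
            return bytes(out)
```

Feeding in the weights 1, 5, 6, 6, 1, 2 from the dbgrbyd.py example in this log gives ids 0, 5, 11, 17, 18, 20, which match the upper ends of the id ranges shown there. A weight of zero at the start of the tree gives 0+0-1 = -1. Any weight below 128 fits in a single leb128 byte.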
Christopher Haster	13852df071	Switched back to altgt 0 for unreachable tags, made btree tests pass again This fixed two notable bugs: 1. Using "altle 0xfff0" to terminate unreachable rbyd trunks threw off id calculations in lfsr_rbyd_fetch searches. We derive the tag's id+weight from the lower bound calculated as the sum of all "altle"s and an always-followed "altle 0xfff0" throws this off. We _could_ derive the tag's id+weight from the upper bound, inverting this relationship, but decided to revert back to using "altgt 0" to terminate unreachable rbyd trunks. Using the lower bound is more intuitive, and "altgt 0" has the benefit of supporting variable-length tags if we ever need to adopt those. To avoid the previous issues around 0-tag holes (which was the original motivation for altle 0xfff0), 0-tags are now automatically adjusted in lfsr_rbyd_lookup, and avoided in lfsr_rbyd_append. But note! if any implementation tries to look up 0-tags, this will eventually break! See previous commits for more info. 2. Unfortunately, we can't combine branch updates and weight updates in lfsr_btree_commit in the general case. If our btree contains bname tags, the weight is attached to the bname tag, separately from the branch tag. Branch updates in lfsr_btree_commit need two separate attrs for the weight and branch struct for this reason, which is unfortunate. The amount of extra conditions to make bname+branch pairs work makes me want to redesign the inner-nodes of the btrees, but I can't think of a better way to approach the problem.	2023-04-14 00:04:58 -05:00
Christopher Haster	e5ad09b380	Some btree progress, implementing rbyd-tag-weight changes	2023-04-14 00:02:51 -05:00
Christopher Haster	85bd28951c	Solved rbyd grow/insert ambiguity by adding a device-only "mk" bit This "mk" bit must not be written to disk, it would conflict with the other non-tree tag encodings. But we can use this bit in the context of lfsr_tag_append to disambiguate tags changing weight from inserting new tags. Note that in the context of rbyd compactions, this will make things a bit weird, since it's no longer just a direct one-to-one copy of each tag. To make compactions a bit easier, this implementation allows the "mk" bit to be set on any tag and ignores it when the weight delta is zero. It turns out that this scheme greatly simplifies the awkward leaf-split-alt calculation that previously had several if statements to handle different corner cases, with the caveat that "mk" tags need their ids adjusted by +1. Added this adjustment directly into lfsr_rbyd_append for now, so the upper-level interface can be a bit more intuitive. Though this may need to change later if it is more confusing than helpful.	2023-04-13 19:00:39 -05:00
Christopher Haster	5a1c36f210	Attempting to add weight changes to every rbyd append This does not work as is due to ambiguity with grows and insertions. Before, these were disambiguated by separate grow and attr tags. You effectively grew the neighboring id before claiming its weight as yours. But now that the attr itself creates the grow/insertion, it's ambiguous which one is intended.	2023-04-13 18:58:56 -05:00
Christopher Haster	8f26b68af2	Derived grows/shrinks from rbyd trunk, no longer needing explicit tags I only recently noticed there is enough information in each rbyd trunk to infer the effective grow/shrinks. This has a number of benefits: - Cleans up the tag encoding a bit, no longer expecting tag size to sometimes contain a weight (though this could've been fixed other ways). 0x6 in the lower nibble now reserved exclusively for in-device tags. - grow/shrinks can be implicit to any tag. Will attempt to leverage this in the future. - The weight of an rbyd can no longer go out-of-sync with itself. While this _shouldn't_ happen normally, if it does I imagine it'd be very hard to debug. Now, there is only one source of knowledge about the weight of the rbyd: The most recent set of alt-pointers. Note that remove/unreachable tags now behave _very_ differently when it comes to weight calculation, remove tags require the tree to make the tag unreachable. This is a tradeoff for the above.	2023-03-27 01:45:34 -05:00
Christopher Haster	546fff77fb	Adopted full le16 tags instead of 14-bit leb128 tags The main motivation for this was issues fitting a good tag encoding into 14-bits. The extra 2-bits (though really only 1 bit was needed) from making this not a leb encoding opens up the space from 3 suptypes to 15 suptypes, which is nothing to shake a stick at. The main downsides: 1. We can't rely on leb encoding for effectively-infinite extensions. 2. We can't shorten small tags (crcs, grows, shrinks) to one byte. For 1., extending the leb encoding beyond 14-bits is already unpalatable, because it would increase RAM costs in the tag encoder/decoder, which must assume a worst-case tag size, and would likely add storage cost to every alt pointer, more on this in the next section. The current encoding is quite generous, so I think it is unlikely we will exceed the 16-bit encoding space. But even if we do, it's possible to use a spare bit for an "extended" set of tags in the future. As for 2., the lack of compression is a downside, but I've realized the only tags that really matter storage-wise are the alt pointers. In any rbyds there will be roughly O(m log m) alt pointers, but at most O(m) of any other tags. What this means is that the encoding of any other tag is in the noise of the encoding of our alt pointers. Our alt pointers are already pretty densely packed. But because the sparse key part of alt-pointers are stored as-is, the worst-case encoding of in-tree tags likely ends up as the encoding of our alt-pointers. So going up to 3-byte tags adds a surprisingly large storage cost. As a minor plus, le16s should be slightly cheaper to encode/decode. It should also be slightly easier to debug tags on-disk. tag encoding: TTTTtttt ttttTTTv ^--------^--^^- 4+3-bit suptype '---\|- 8-bit subtype '- valid bit iiii iiiiiii iiiiiii iiiiiii iiiiiii ^- m-bit id/weight llll lllllll lllllll lllllll lllllll ^- m-bit length/jump Also renamed the "mk" tags, since they no longer have special behavior outside of providing names for entries: - LFSR_TAG_MK => LFSR_TAG_NAME - LFSR_TAG_MKBRANCH => LFSR_TAG_BNAME - LFSR_TAG_MKREG => LFSR_TAG_REG - LFSR_TAG_MKDIR => LFSR_TAG_DIR	2023-03-25 14:36:29 -05:00
Christopher Haster	89d5a5ef80	Working implementation of B-tree name split/lookup with vestigial names B-trees with names are now working, though this required a number of changes to the B-tree layout: 1. B-trees no longer require name entries (LFSR_TAG_MK) on each branch. This is a nice optimization to the design, since these name entries just waste space in purely weight-based B-trees, which are probably going to be most B-trees in the filesystem. If a name entry is missing, the struct entry, which is required, should have the effective weight of the entry. The first entry in every rbyd block is expected to have no name entry, since this is the default path for B-tree lookups. 2. The first entry in every rbyd block _may_ have a name entry, which is ignored. I'm calling these "vestigial names" to make them sound cooler than they actually are. These vestigial names show up in a couple complicated B-tree operations: - During B-tree split, since pending attributes are calculated before the split, we need to play out pending attributes into the rbyd before deciding what name becomes the name of the entry in the parent. This creates a vestigial name which we _could_ immediately remove, but the remove adds additional size to the must-fit split operation. - During B-tree pop/merge, if we remove the leading no-name entry, the second, named entry becomes the leading entry. This creates a vestigial name that _looks_ easy enough to remove when making the pending attributes for pop/merge, but turns out to be surprisingly tricky if the parent undergoes a split/merge at the same time. It may be possible to remove all these vestigial names proactively, but this adds additional rbyd lookups to figure out the exact tag to remove, complicates things in a fragile way, and doesn't actually reduce storage costs until the rbyd is compacted. The main downside is that these B-trees may be a bit more confusing to debug.	2023-03-21 12:59:46 -05:00
Christopher Haster	8732904ef6	Implemented lfsr_btree_pop and btree merges B-tree remove/merge is the most annoying part of B-trees. The implementation here follows the same ideas implemented in push/split: 1. Defer splits/merges until compaction. 2. Assume our split/merge will succeed and play it out into the rbyd. 3. On the first sign of failure, revert any unnecessary changes by appending deletes. 4. Do all of this in a single commit to avoid issues with single-prog blocks. Mapping this onto B-tree merge, the condition that triggers merge is when our rbyd is <1/4 the block_size after compaction, and the condition that aborts a merge is when our rbyd is >1/2 the block_size, since that would trigger a split on a later compact. Weaving this into lfsr_btree_commit is a bit subtle, but relatively straightforward all things considered. One downside is it's not physically possible to try merging with both siblings, so we have to choose just one to attempt a merge. We handle the corner case of merging the last sibling in a block explicitly, and in theory the other sibling will eventually trigger a merge during its own compaction. Extra annoying are the corner cases with merges in the root rbyd that make the root rbyd degenerate. We really should avoid a compaction in this case, as otherwise we would erase a block that we immediately inline at a significant cost. However determining if our root rbyd is degenerate is tricky. We can determine a degenerate root with children by checking if our rbyd's weight matches the B-tree's weight when we merge. But determining a degenerate root that is a leaf requires manually looking up both children in lfsr_btree_pop to see if they will result in a degenerate root. Ugh. On the bright side, this does all seem to be working now. Which completes the last of the core B-tree algorithms.	2023-03-17 14:29:02 -05:00
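The merge and split thresholds described above can be written down as two small predicates. This is a minimal sketch of the stated conditions only, not the logic in lfsr_btree_commit: the function names are hypothetical, and sizes are assumed to be measured in bytes after compaction.

```python
def compaction_action(rbyd_size, block_size):
    """Decide what a compacted rbyd should attempt next.

    Follows the thresholds in the commit message: more than half a block
    triggers a split, less than a quarter triggers a merge attempt.
    """
    if 2 * rbyd_size > block_size:
        return 'split'
    if 4 * rbyd_size < block_size:
        return 'merge'
    return 'none'

def merge_fits(merged_size, block_size):
    """A merge is aborted if the result would exceed half a block,
    since that would trigger a split on a later compaction."""
    return 2 * merged_size <= block_size
```

With a 4096-byte block, an rbyd of 900 bytes would attempt a merge, and the merge would be kept only if the combined rbyd stays at or below 2048 bytes.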
Christopher Haster	a897b875d3	Implemented lfsr_btree_update and added more tests This was a rather simple exercise. lfsr_btree_commit does most of the work already, so all this needed was setting up the pending attributes correctly. Also: - Tweaked dbgrbyd.py's tree rendering to match dbgbtree.py's. - Added a print to each B-tree test to help find the resulting B-tree when debugging.	2023-03-17 14:20:40 -05:00
Christopher Haster	89ab174f33	Reworked dbgrbyd.py's --lifetimes so it actually works Changed so there is no 1-to-1 mk-tag/id assumption, any unique ids create a simulated lifetime to render. This fixes the issue where grows/shrinks left-aligned ids confused dbgrbyd.py. As a plus, now dbgrbyd.py can actually handle multi-id grow/shrinks, and is more robust against out-of-sync grow/shrinks. These sorts of lifetime issues are when you'd want to run dbgrbyd.py, so it's a bit important this is handled gracefully.	2023-03-17 14:20:40 -05:00
Christopher Haster	ce599be70d	Added scripts/dbgbtree.py for debugging B-trees, tweaked dbgrbyd.py An example: $ ./scripts/dbgbtree.py -B4096 disk 0xaa -t -i btree 0xaa.1000, rev 35, weight 278 block ids name tag data (truncated) 00aa.1000: +-+ 0-16 branch id16 3 7e d4 10 ~.. 007e.0854: \| \|-> 0 inlined id0 1 73 s \| \|-> 1 inlined id1 1 74 t \| \|-> 2 inlined id2 1 75 u \| \|-> 3 inlined id3 1 76 v \| \|-> 4 inlined id4 1 77 w \| \|-> 5 inlined id5 1 78 x \| \|-> 6 inlined id6 1 79 y \| \|-> 7 inlined id7 1 7a z \| \|-> 8 inlined id8 1 61 a \| \|-> 9 inlined id9 1 62 b ... This added the idea of block+limit addresses such as 0xaa.1000. Added this as an option to dbgrbyd.py along with a couple other tweaks: - Added block+limit support (0x<block>.<limit>). - Fixed in-device representation indentation when trees are present. - Changed fromtag to implicitly fixup ids/weights off-by-one-ness, this is consistent with lfs.c.	2023-03-17 14:20:10 -05:00
Christopher Haster	88e3db98a9	Rough implementation of btree append This involves many, many hacks, but is enough to test the concept and start looking at how it interacts with different block sizes. Note only append (lfsr_btree_push on the end) is implemented, and it makes some assumptions about how the ids can interact when splitting rbyds.	2023-03-17 14:20:09 -05:00
Christopher Haster	6f4704474b	Changed GROW/SHRINK to always be explicit, dropped LFSR_TAG_RM Generally, less implicit behavior => simpler systems, which is the goal here.	2023-03-17 14:20:09 -05:00
Christopher Haster	1709aec95b	Rough draft of general btree implementation, needs work This implements a common B-tree using rbyd's as inner nodes. Since our rbyds actually map to sorted arrays, this fits together quite well. The main caveat/concern is that we can't rely on strict knowledge of the on-disk size of these things. This first shows up with B-tree insertion, we can't split in preparation to insert as we descend down the tree. Normally, this means our B-tree would require recursion in order to keep track of each parent as we descend down our tree. However, we can avoid this by not storing our parent, but by looking it up again on each step of the splitting operation. This brute-force-ish approach makes our algorithm tail-recursive, so bounded RAM, but raises our runtime from O(logB(n)) to O(logB(n)^2). That being said, O(logB(n)^2) is still sublinear, and, thanks to the B-tree's extremely high branching factor, may be insignificant.	2023-03-17 14:20:09 -05:00


125 Commits