Commit Graph

922 Commits

Author SHA1 Message Date
Christopher Haster
b05db8e3d3 Added support for lists of conditional ifs in test/bench.py
Any conditions in both the suites and cases are anded together to
determine when the test/bench should run.

Accepting a list here makes it easier to compose multiple conditions,
since toml-level elements are a bit easier to modify than strings of
C expressions.
2023-06-01 17:40:51 -05:00
Christopher Haster
07244fb2d4 In test/bench.py, added "internal" flag
This marks internal tests/benches (case.in="lfs.c") with an otherwise-unused
flag that is printed during --summary/--list-*. This just helps identify which
tests/benches are internal.
2023-06-01 17:40:48 -05:00
Christopher Haster
82027f3d90 Changed bench/test.py to error if explicit suite/case can't be found
Previously, no matches was a noop. While this is consistent with an
empty test suite, which contains no tests but shouldn't really error,
it made it easy to miss when a typo caused tests to be silently skipped.

Also added a bit of color to script-level errors in test/bench.py
2023-06-01 17:16:21 -05:00
Christopher Haster
2339e9865f Tweaked dbgmtree.py -Z flag to include mroots as depth
This helps debug a corrupted mtree with cycles, which has been a problem
in the past.

Also fixed a small issue with dbgmtree.py not connecting inner tree
edges to mdir roots correctly during rendering.
2023-06-01 13:52:14 -05:00
Christopher Haster
49ec7a12b9 Tweaked btree traversal to visit each leaf at most once
We are already paying the memory cost of a fetched lfsr_rbyd_t
during btree so we can traverse inner btree nodes. But we are currently
just wasting this memory when we traverse leaf entries.

Instead, we can use this memory to cache the btree's current leaf node,
avoiding a btree walk until we've iterated over all rids in the current
leaf.

Think about this for a second:

1. We cache the root rbyd in lfsr_btree_t because we share the memory with
   a union and we always need to read the root during lookups.

2. We cache the leaf rbyds in lfsr_btree_traversal_t because we need the
   memory for traversing inner btree nodes.

The only btree nodes we don't cache during traversal are inner nodes
when the height of the btree >= 3.

If you're familiar with how btrees behave on storage, you know the
height becomes _exponentially_ less likely to grow as the tree gets
larger. It's entirely possible for btree traversal to simply never
traverse a non-cached btree node once your block size gets large
enough.
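
As a rough illustration, a minimal self-contained sketch of the caching
check (the struct and field names here are assumptions, not the actual
lfsr_btree_traversal_t):

  // serve entries from the cached leaf until its rid range is
  // exhausted, only then walk the btree from the root again
  #include <stdbool.h>
  #include <stdint.h>

  typedef struct leaf_cache {
      bool valid;        // do we have a cached leaf?
      int32_t start_rid; // first rid covered by the cached leaf
      int32_t end_rid;   // last rid covered by the cached leaf
  } leaf_cache_t;

  // true => rid can be served from the cached leaf, no btree walk
  static bool leaf_cache_hit(const leaf_cache_t *cache, int32_t rid) {
      return cache->valid
              && rid >= cache->start_rid
              && rid <= cache->end_rid;
  }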

---

There may be a way to instead reclaim this memory, such as sharing this
memory with mtree traversal and higher layers, but the current API design
and hierarchy of C structs make this difficult. Maybe this is worth looking
into in the future, but at most it would save one lfsr_rbyd_t.

This also reverts some rid changes in btree lookups to share less code
but be a bit easier to reason about.
2023-05-30 19:49:15 -05:00
Christopher Haster
30bcb6947b Added a test for mtree cycle detection, limited cycle detection to mdirs
I intended to also add a test for cycles in the btree that backs the
mtree (and eventually other btrees), but something really curious
happened.

It turns out it's actually really hard to create a btree cycle, even
intentionally.

This is because each CoW btree pointer includes the expected CRC of
the branch's rbyd. To successfully create a cycle that isn't trivially
detected in a validating mtree traversal, you would somehow need to
solve for a cyclic set of dependent CRCs that are still valid.

I suspect this is slightly easier than a hash-based construction, due to
the linear nature of CRCs, but still I think it's unreasonable to expect
these sort of cycles to occur in the wild. Even with filesystem bugs.

---

Note this isn't true for the mdirs, which are mutable so storing a
checksum in the pointer isn't possible. For this reason, cycle detection
is kept for mdirs during mtree traversal. This may not be strictly
necessary for the mtree, but it is needed for the mroot chain.

Nonetheless, this does simplify things. Specifically it reduces the
cycle detection's tortoise state to only mdir pairs.
2023-05-30 19:41:39 -05:00
Christopher Haster
773278eb26 Adopted mtree traversal in lfsr_mountinited
This is a nice bit of deduplication as long as the mtree traversal can
handle both:

1. Cycle detection
2. Btree node validation

Eventually we'll also collect gstate here, which mtree traversal should
make quite easy.

The only catch is if we eventually need a non-fetching way to read the
mroot config, such as if we need to infer the csum type or block-size,
but that's a future problem.
2023-05-30 19:38:04 -05:00
Christopher Haster
1631ca8d78 Reimplemented Brent's cycle detection on top of mtree traversal
This is a bit tricky because our tortoise state is now quite large
thanks to how we are nesting traversals:

- Current mdir pair
- Current mtree block+trunk
- Current btree block+trunk? (TODO)
- Others? (TODO)

This also raises some questions about what constitutes a cycle in our
btrees. Since they are strictly CoW, they should be strictly DAGs worst
case. But is that still true when considering that btree nodes can
contain multiple trunk versions?

To be safe, I'm currently including the trunks in our tortoise state,
but it may be possible to relax this in the future.
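
For reference, a self-contained sketch of Brent's algorithm over an
abstract successor function; here state_t/step/state_eq are stand-ins
for the real (much larger) traversal state described above:

  #include <stdbool.h>

  typedef int state_t; // stand-in for the nested traversal state

  extern state_t step(state_t s);             // one traversal step
  extern bool state_eq(state_t a, state_t b); // compare states

  // returns true if a cycle is found within max_steps steps
  static bool has_cycle(state_t start, int max_steps) {
      state_t tortoise = start;
      state_t hare = step(start);
      int power = 1;  // size of the current power-of-two window
      int lambda = 1; // steps taken since the tortoise last moved
      while (!state_eq(tortoise, hare)) {
          if (max_steps-- <= 0) {
              return false; // traversal terminated, no cycle found
          }
          if (lambda == power) {
              // teleport the tortoise to the hare, double the window
              tortoise = hare;
              power *= 2;
              lambda = 0;
          }
          hare = step(hare);
          lambda += 1;
      }
      return true; // tortoise caught the hare => cycle
  }

The nice property for us is that the tortoise is only copied on
power-of-two boundaries, which matters when the state is this large.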
2023-05-30 19:37:36 -05:00
Christopher Haster
6a96866737 Added an mtree traversal benchmark
Note that because we amortize the traversal cost over the number of
entries, mtree traversal may have some strange looking results when
compared to mtree lookup.

Though it's interesting to note this is a valid result. In mtree lookups
we need to fetch the mdir for each entry, which is expensive. However
mtree traversal can strictly avoid fetching each mdir more than once.
This does make mtree traversal faster when iterating over all mdirs in
order.

This can be represented in big O notation if we treat the number of
entries (n) and block size (b) as variables:

- mtree traversal via lookup    = O(nb + n log(b) log_b(n))
- mtree traversal via traversal = O(n log(b) log_b(n))
2023-05-30 19:21:34 -05:00
Christopher Haster
09b3d24036 Moved btree rbyd validation into mtree traversal
Validating btree nodes during lfsr_btree_lookup was useful as a
proof-of-concept, but it's not really needed if we validate btree nodes
during mtree traversal.

mtree traversal provides the first reads into the filesystem. It's how
we find the real mroot, and (in theory at the moment) it provides the core
operation for error detection and correction. With this in mind,
implementing btree node validation in mtree traversal makes a lot of
sense, with lfsr_btree_lookup leveraging an assumed successful
validation for faster/smaller btree walks.

Note that btree node validation during traversal is still optional. We
really don't want to pay this cost during block allocation for example.

---

It may look concerning that there's no related validation in the btree
traversal layer itself.

It turns out that a quirk of btree traversal returning inner btree nodes on
first visit, before actually traversing the btree node, is that it's
safe for us to validate the btree node in only the mtree traversal layer.
As long as we don't continue traversing on finding a corrupted btree,
the btree traversal layer will never traverse an unvalidated btree node.

This keeps all the validation logic in the same place, mtree traversal.
I don't know if this will stay this way if/when more error correction
features are added, but it's convenient in the meantime.
2023-05-30 18:52:02 -05:00
Christopher Haster
34bcb62a9e Implemented incremental mtree traversal
Just like lfsr_btree_traversal_t, lfsr_mtree_traversal_t provides a
mechanism for traversing the mtree incrementally, including any inner
btree nodes.

This is one level more complex than btree traversal because we also need
to handle the mroot chain and traversal of rids in each mdir.

Again, mtree traversal returns temporary decoded rbyd structs for inner
nodes. Actually, mtree traversal only returns inner nodes... so maybe
using lfsr_data_t here is the wrong choice:

- tag=LFSR_TAG_BTREE => lfsr_rbyd_t
- tag=LFSR_TAG_MDIR  => lfsr_mdir_t
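
In other words, something like a tag-dependent union (a sketch with
stand-in types, not the actual structs):

  #include <stdint.h>

  typedef struct rbyd { uint32_t block; uint32_t trunk; } rbyd_t; // stand-in
  typedef struct mdir { uint32_t blocks[2]; } mdir_t;             // stand-in

  // the caller switches on the returned tag to know which decoded
  // struct the traversal produced
  typedef struct mtraversal_result {
      uint16_t tag; // LFSR_TAG_BTREE or LFSR_TAG_MDIR
      union {
          rbyd_t rbyd; // tag=LFSR_TAG_BTREE => inner btree node
          mdir_t mdir; // tag=LFSR_TAG_MDIR  => fetched mdir
      } u;
  } mtraversal_result_t;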
2023-05-30 18:47:30 -05:00
Christopher Haster
93bf68c84b Added lfsr_btree_traversal_t, incremental traversal of btree nodes
The main thing to note is that traversal here != iteration.

Thanks to the right-leaning nature of our btrees, iteration is already
provided by lfsr_btree_lookupnext, using the bid as the current
iteration state.

What btree traversal provides is traversal over every rbyd + entries
used in the btree, including the inner btree nodes. This is useful for
things like garbage collection and error detection that need to operate
on the raw rbyds.

Note that both btree traversal and iteration are still O(n log_b(n)). We
can't do any better than that without recursion.

One non-intuitive implementation detail: we return a tag describing each
entry, but instead of returning an on-disk data reference for inner
btree nodes, we return a pointer to a temporarily decoded rbyd struct.
This simplifies root handling, and we probably want the decoded version
anyways:

- tag=LFSR_TAG_BTREE => lfsr_rbyd_t
- tag=anything else  => lfsr_data_t

The reason for making btree traversal incremental, and not just use a
callback like we've done previously, is to eventually use this as a part
of high-level incremental garbage-collection/error-correction. For this
to work, all of the lower-levels also need to be incremental.
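
To illustrate the incremental shape (a sketch; the signature and error
code here are assumptions):

  #include <stdint.h>

  enum { ERR_NOENT = -2 }; // stand-in error code

  // hypothetical: fills in the next tag/entry, returns ERR_NOENT once
  // the traversal is exhausted, with all state kept in t
  extern int btree_traversal_next(void *t, uint16_t *tag, void *entry);

  static int traverse_all(void *t) {
      uint16_t tag;
      uint8_t entry[64]; // stand-in for the returned rbyd/data
      for (;;) {
          int err = btree_traversal_next(t, &tag, entry);
          if (err == ERR_NOENT) {
              return 0; // traversal finished
          }
          if (err) {
              return err;
          }
          // inspect tag/entry here, or return to the caller and resume
          // later -- nothing is lost between calls
      }
  }

Because each step is a plain function call, a higher-level incremental
GC/error-correction pass can do a bounded amount of work per call.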
2023-05-30 18:32:51 -05:00
Christopher Haster
565c8cb9c7 Reimplemented the internal opened-mdir linked-list
littlefs uses an invasive linked-list in open mdirs to keep any open
files/dirs (and some special mdirs) in sync during filesystem
operations. The main benefit of this is that the filesystem doesn't need
to know the number of open files at compile time.

The implementation here introduces a new type, lfsr_openedmdir_t, for
mdirs that want to participate in the opened-mdir linked-list. This
saves a couple words of memory in the cases where the mdir does not need
to participate in the opened-mdir linked-list.

Since we are creating quite a few more mdir structs in lfsr_mdir_commit now,
the size of this struct is valuable.

The implementation of lfsr_mdir_commit knew this was coming, so aside
from the new type, adding this feature was straightforward:

1. Update opened-mdirs based on in-flight attrs.
2. Update opened-mdirs rbyd state.
3. Mark any deleted opened-mdirs with the reserved mid -2.
4. Test.
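
For reference, a minimal self-contained sketch of the invasive-list
idea (field names are assumptions, not the actual lfsr_openedmdir_t):

  #include <stdint.h>

  // the list node lives inside each open file/dir struct, so tracking
  // any number of open mdirs requires no extra allocation
  typedef struct openedmdir {
      struct openedmdir *next; // invasive link through open handles
      int32_t mid;             // this handle's position in the mtree
  } openedmdir_t;

  // after a commit inserts/removes entries, walk the list and shift
  // any affected mids to keep open handles in sync
  static void patch_mids(openedmdir_t *head, int32_t start, int32_t delta) {
      for (openedmdir_t *o = head; o; o = o->next) {
          if (o->mid >= start) {
              o->mid += delta;
          }
      }
  }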
2023-05-30 18:24:36 -05:00
Christopher Haster
c60fa69ce1 Optimized dbg*.py tree generation/rendering by deduplicating edges
Optimizing a script? This might sound premature, but the tree rendering
was, uh, quite slow for any decently sized (>1024) btree.

The main reason is that tree generation is quite hacky in places, repeatedly
spitting out multiple copies of the inner node's rbyd trees for example.

Rather than rewrite the tree generation implementation to be smarter,
this just changes all edge representations to namedtuples (which may
reduce memory pressure a bit), and collects them into a Python set.

This has the effect of deduplicating generated edges efficiently, which
improved the rendering performance significantly.

---

I also considered memoizing the rbyd trees, but dropped the idea since
the current renderer performs well enough.
2023-05-30 18:17:51 -05:00
Christopher Haster
af0c3967b4 Adopted new tree renderer in dbgmtree, implemented mtree rendering
In addition to plugging in the rbyd and btree renderers in dbgbtree.py,
this required wiring in rbyd trees in the mdirs and mroots.

A bit tricky, but the implementation ended up more-or-less straightforward
thanks to the common edge description used by the tree renderer.

For example, a relatively small mtree:

  $ ./scripts/dbgmtree.py disk -B4096 -t -i
  mroot 0x{0,1}.45, rev 1, weight 0
  mdir                     ids   tag                     ...
  {0000,0001}: .--------->    -1 magic 8                 ...
               | .------->       config 21               ...
               +-+-+             btree 7                 ...
    0006.000a:     | .-+       0 mdir w1 2               ...
  {0002,0003}:     | | '->   0.0 inlined w1 1024         ...
    0006.000a:     '-+-+       1 mdir w1 2               ...
  {0004,0005}:         '->   1.0 inlined w1 1024         ...
2023-05-30 18:10:32 -05:00
Christopher Haster
9b803f9625 Reimplemented tree rendering in dbg*.py scripts
The goal here was to add the option to show the combined rbyd trees in
dbgbtree.py/dbgmtree.py.

This was quite tricky (and not really helped by the hackiness of these
scripts), but was made a bit easier by adding a general purpose tree renderer
that can render a precomputed set of branches into the tag output.

For example, a 2-deep rendering of a simple btree with a block size of
1KiB, where you can see a bit of the emergent data-structure:

  $ ./scripts/dbgbtree.py disk -B1024 0x223 -t -Z2 -i
  btree 0x223.90, rev 46, weight 1024
  rbyd                       ids       tag                     ...
  0223.0090:     .-+             0-199 btree w200 9            ...
  00cb.0048:     | |     .->      0-39 btree w40 7             ...
                 | | .---+->     40-79 btree w40 7             ...
                 | | | .--->    80-119 btree w40 7             ...
                 | | | | .->   120-159 btree w40 7             ...
                 | '-+-+-+->   160-199 btree w40 7             ...
  0223.0090: .---+-+           200-399 btree w200 9            ...
  013e.004b: |     |     .->   200-239 btree w40 7             ...
             |     | .---+->   240-279 btree w40 8             ...
             |     | | .--->   280-319 btree w40 8             ...
             |     | | | .->   320-359 btree w40 8             ...
             |     '-+-+-+->   360-399 btree w40 8             ...
  0223.0090: | .---+           400-599 btree w200 9            ...
  01a7.004c: | |   |     .->   400-439 btree w40 8             ...
             | |   | .---+->   440-479 btree w40 8             ...
             | |   | | .--->   480-519 btree w40 8             ...
             | |   | | | .->   520-559 btree w40 8             ...
             | |   '-+-+-+->   560-599 btree w40 8             ...
  0223.0090: | | .-+           600-799 btree w200 9            ...
  021e.004c: | | | |     .->   600-639 btree w40 8             ...
             | | | | .---+->   640-679 btree w40 8             ...
             | | | | | .--->   680-719 btree w40 8             ...
             | | | | | | .->   720-759 btree w40 8             ...
             | | | '-+-+-+->   760-799 btree w40 8             ...
  0223.0090: +-+-+-+          800-1023 btree w224 10           ...
  021f.0298:       |     .->   800-839 btree w40 8             ...
                   |   .-+->   840-879 btree w40 8             ...
                   |   | .->   880-919 btree w40 8             ...
                   '---+-+->  920-1023 btree w104 9            ...

This tree renderer also replaces the ad hoc tree renderer in dbgrbyd.py
for consistency.
2023-05-30 18:04:54 -05:00
Christopher Haster
b67fcb0ee5 Added dbgmtree.py for debugging the littlefs metadata-tree
This builds on dbgrbyd.py and dbgbtree.py by allowing for quick
debugging of the littlefs mtree, which is a btree of rbyd pairs with a
few bells and whistles.

This also comes with a number of tweaks to dbgrbyd.py and dbgbtree.py,
mostly changing rbyd addresses to support some more mdir friendly
formats.

The syntax for rbyd addresses is starting to converge into a couple
common patterns, which is nice for quickly determining, at a glance,
what type of address you are looking at:

- 0x12         => An rbyd at block 0x12
- 0x12.34      => An rbyd at block 0x12 with trunk 0x34
- 0x{12,34}    => An rbyd at either block 0x12 or block 0x34 (an mdir)
- 0x{12,34}.56 => An rbyd at either block 0x12 or block 0x34 with trunk 0x56

These scripts have also been updated to support any number of blocks in
an rbyd address, for example 0x{12,34,56,78}. This is a bit of future
proofing. >2 blocks in mdirs may be explored in the future for the
increased redundancy.
2023-05-30 18:04:54 -05:00
Christopher Haster
f7d4497b80 Added some simple mtree benchmarks
It's interesting to note the different performance characteristics of
purely CoW btrees vs our mutable mtree.

The main downside of our mtree is the need to fetch leaf mdirs. This
fetch is expensive, and can be avoided in CoW btrees by storing the
trunk in each branch's parent.

On the other hand, btrees need to propagate all changes upwards to the
root.

An interesting takeaway is that a sort of mdir-trunk cache may be a
worthwhile optimization for relatively little RAM cost. This may be
something to explore in the future.
2023-05-30 18:04:48 -05:00
Christopher Haster
41c28952c5 Updated benches/bench_btree.toml with btree changes 2023-05-30 16:36:50 -05:00
Christopher Haster
bac740a90e Reverted "source rbyd optional" and moved reloc/alloc into lfsr_mdir_compact_
This consolidates all mdir relocation and allocation logic into
lfsr_mdir_compact by using an extra mid hint to indicate if the mdir is
new and unallocated. When mid == -4, the mdir is NOT new, and the mid
should be taken from the mdir.

This deduplicates quite a bit of logic between mdir compaction and
uninlining, though it is incompatible with the optional rbyd scheme
previously used to compact mroot 0x{0,1} during mroot extensions.

Though I'm not sure this is the best approach. We should probably look
at this again after things have more of a shape.
2023-05-30 16:36:37 -05:00
Christopher Haster
cd2d54855e Added a number of tests over mdir relocations, fixed minor bugs
- lfsr_btree_isnull still used tag and not only weight for null trees
- relocation forgot the mid
- missed relocation when uninlining, though this fix should be cleaned up
- made revision count behavior a bit more consistent

Note that the new tests may be -Gnor exclusive, as they rely quite a bit on
exactly when compaction happens...
2023-05-30 16:36:37 -05:00
Christopher Haster
7b1c35a99b Tweaked lfsr_mdir_compact_ to make source rbyd optional
This allows lfsr_mdir_compact_ to also cover the mroot extension commit.

The mroot located at 0x{0,1} is a bit unique in that it can never be
relocated. Instead we "extend" the mroot chain by an additional mroot
that can relocate.

One nice thing is we can implement this by letting lfsr_mdir_commit_
perform a normal relocation, and then rewrite the 0x{0,1} mroot with
a pointer to the "relocated" mroot.

Though we have to take extra care to make sure this write doesn't
recursively trigger an additional relocate, which would never terminate.

Fortunately, this situation only happens when we are compacting the
0x{0,1} mroot, which simplifies things a bit.
2023-05-30 16:36:37 -05:00
Christopher Haster
395eff49ad Changed mdir weight->0 to drop caches instead of introducing a temporary commit
This was actually a bug, and would have eventually been caught when we resume
power-loss testing.

In our current implementation of the mtree, we immediately drop mdirs
when their weight goes to zero, since at this point there's no route to
write new commits to the mdir. This was implemented by 1. writing out the
commit, and then 2. removing the mdir from the mtree if its weight is zero.

But this has a problem analogous to why we can't salvage failed compacts
during mdir split: If we allow a valid commit to be written to an mdir
before we update its position in the mtree, that commit becomes immediately
visible in the case of a power-loss.

This is a bit tricky to fix since we rely entirely on appending tags to
the on-disk rbyd to determine weight changes. We tried simulating weight
changes previously, but that was a mistake that created complexity.

The solution here is to separate the appending of tags from the commit
finalization:

1. Append any pending tags.
2. If weight->0, drop caches, abort the commit.
3. Otherwise, write the checksum, finalizing the commit.

This has the new side-effect of intentionally leaving unfinalized commits
on-disk, but since we have no way to reclaim the erased bytes in these
mdirs, that is probably ok.
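
A sketch of the reordered flow (these names are hypothetical, not the
actual functions):

  #include <stdint.h>

  extern int append_attrs(void *rbyd, const void *attrs, int count);
  extern int append_cksum(void *rbyd);          // finalizes the commit
  extern void drop_caches(void *rbyd);          // discard pending progs
  extern int32_t rbyd_weight(const void *rbyd);

  static int mdir_commit(void *rbyd, const void *attrs, int count) {
      // 1. append any pending tags
      int err = append_attrs(rbyd, attrs, count);
      if (err) {
          return err;
      }
      // 2. weight->0 => drop caches, abort before the checksum makes
      //    the commit visible
      if (rbyd_weight(rbyd) == 0) {
          drop_caches(rbyd);
          return 0;
      }
      // 3. otherwise write the checksum, finalizing the commit
      return append_cksum(rbyd);
  }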
2023-05-30 16:36:37 -05:00
Christopher Haster
7877eeaa9d Restructured lfsr_mdir_commit into separate high/low-level implementations
lfsr_mdir_commit => lfsr_mdir_commit
                    |-> lfsr_mdir_commit_
                    '-> lfsr_mdir_compact_

The mess that was lfsr_mdir_commit was a growing problem. Flattening all
possible mdir operations into a single loop may have resulted in a
smaller code size, but at a significant cost to implementation
difficulty, readability, bugs, etc.

This restructure splits the mdir commit logic into three components:

1. lfsr_mdir_compact_

   This handles the swapping of mdir blocks, revision counts, erasing, etc.

   lfsr_mdir_compact_ also accepts a range of ids, allowing it to be
   called directly for mdir splitting/uninlining.

   Actually, the biggest feature in lfsr_mdir_compact_, which is easy to
overlook, is that it accepts two attr lists. This seems like a weird
   feature for an API, but keep in mind we have strict RAM limitations,
   so we can't really concatenate attr lists easily.

There is only a single case where we need two attr lists: when uninlining
   an mroot we need to include 1. any pending mroot attrs, and 2. the
   new mtree. But one case is enough to make attempted workarounds
   excessively complicated.

   Simply accepting two attr lists here resolves this.

2. lfsr_mdir_commit_

   This handles the low-level mdir commit logic: It tries to do a simple
   rbyd commit, and if that fails falls back to a compact/relocate loop.

   Perhaps surprisingly, lfsr_mdir_commit_ does not handle mdir splits.
   The exact behavior of mdir splits is context specific, so
lfsr_mdir_commit_ simply errors if lfsr_rbyd_estimate indicates
   compaction will be unsuccessful.

   Less surprisingly, lfsr_mdir_commit_ does not handle any
   mtree/internal state updates. lfsr_mdir_commit_ is only concerned
   with the specific mdir struct provided.

3. lfsr_mdir_commit

   This ties together all of the mdir commit logic and provides the main
   mechanism by which the rest of the filesystem interacts with mdirs.

   lfsr_mdir_commit is mainly responsible for handling the side-effects
   of the low-level operations:

   - Propagating mtree/mroot updates caused by relocations/splits/drops
   - Updating the provided mdir struct correctly if it splits/relocates
     based on a rid hint
   - Updating the internally tracked mroot/mtree state on success
   - Updating any open mdirs on success (TODO)

   This is a complicated function, but most of that complexity can be
   captured in a large, but relatively simple, tree of if statements.
   Not great for code cost, but this may just be a necessity of the new
   mtree data-structure.

   This also includes the tail-recursive mroot propagation loop, which
   is an excellent example of how splitting the high/low-level logic
   helps separate context-specific logic.

This still needs work, but the significantly improved readability of
lfsr_mdir_commit provides much more confidence in this design.

This already has the strong advantage that the extra mdir copies make it
clear when exactly the higher-level mdir copies are updated. This gives
us much better confidence that errors will not render the mdir state
unusable, though this may come with a RAM cost.
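
As a signature-level sketch of the two-attr-list idea (types and names
are assumptions):

  #include <stdint.h>

  typedef struct attr { int dummy; } attr_t; // stand-in for lfsr attrs
  typedef struct mdir { int dummy; } mdir_t; // stand-in for lfsr mdirs

  // accepting two attr lists lets callers logically concatenate
  // pending mroot attrs + the new mtree without the RAM needed to
  // physically concatenate them
  extern int mdir_compact(mdir_t *mdir,
          int32_t start_rid, int32_t end_rid,
          const attr_t *attrs1, int attr1_count,
          const attr_t *attrs2, int attr2_count);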
2023-05-30 16:33:20 -05:00
Christopher Haster
ef4fb9d3d3 Added specific tests to cover complex mdir split/drop corner cases
Dropped the high-level "large entry" tests in exchange for these low-level
tests. The high-level tests accomplished the same thing, but worse and
less reliably.

Added some rough fixes (this whole code path needs to be rewritten).

Also made lfsr_rbyd_bisect a bit better behaved when dealing with a
small number of large entries. This was necessary for the split/drop
corner case tests since these rely on precise control of when mdirs
split.
2023-05-30 14:57:45 -05:00
Christopher Haster
6bc85375ea Added a very rough implementation of mdir drops
mdirs behave a bit differently than btree nodes here. When an mdir's
weight drops to zero, we eagerly drop the mdir. Unfortunately this
introduces a large number of conditions into lfsr_mdir_commit. Maybe
there's some different way to structure the code to avoid this...

Also expanded mtree tests to cover more corner cases; these are
desperately needed for any confidence that mdir drops work.
2023-05-30 14:57:19 -05:00
Christopher Haster
ea28413eb2 Added a bit of fuzz testing over mtree splits
This isn't the greatest coverage as we don't have a verifiable simulation.
Simulating the splitting-bucket-tree that is the mtree is tricky.

So right now this mostly just checks that there are no internal assert
failures and that we have the expected number of entries afterwards.
2023-05-30 14:55:56 -05:00
Christopher Haster
3625882343 Added separate lfsr_btree_lookupnext_/lfsr_btree_lookupnext
- lfsr_btree_lookupnext_ => gives you the underlying rbyd/rid, intended
  for btree-internal use.

- lfsr_btree_lookupnext => does not give you the underlying rbyd/rid, used
  for general purpose lookups/iteration.

This is for consistency with other *_lookupnext functions, and
discourages use of the leaf rbyd/rid. These are sensitive to internal
btree state.
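
Roughly (a sketch, not the actual signatures):

  #include <stdint.h>

  // hypothetical internal variant: also returns the leaf rbyd/rid
  extern int btree_lookupnext_(void *btree, int32_t bid,
          void **rbyd, int32_t *rid, uint16_t *tag, void *data);

  // the public variant simply discards the state-sensitive rbyd/rid
  static int btree_lookupnext(void *btree, int32_t bid,
          uint16_t *tag, void *data) {
      void *rbyd;
      int32_t rid;
      return btree_lookupnext_(btree, bid, &rbyd, &rid, tag, data);
  }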
2023-05-30 14:54:33 -05:00
Christopher Haster
755789701f Adopted lfsr_data_t in btree push/update/split functions
This is mostly for consistency. It's unclear if we'll ever actually
use the on-disk lfsr_data_t representation here, since current thinking
is that most btrees will only store pointers with in-device
representations.

It may be worth reverting this in the future.
2023-05-30 14:52:56 -05:00
Christopher Haster
99e9e18baa Moved heavy work of copying/filtering tags into compact/appendall
After implementing lfsr_btree_commit and lfsr_mdir_commit, a common
pattern emerged for all compact/split operations:

1. Copy over subrange of tags.
2. Apply pending tags in that subrange.

lfsr_rbyd_appendall and lfsr_rbyd_compact now provide these operations,
allowing for better code sharing across these two algorithms.

The only hiccup is vestigial names in btree commit, which require a flag
and some special handling.

Note that btree merge is a bit of a special case for now.
2023-05-30 14:52:05 -05:00
Christopher Haster
abbcf58d07 Simplified btree merge estimate at the risk of failed merges
This trades off simpler estimation (and more flexibility for
experimenting with better compaction strategies) for the possibility of
failed merges that require cleanup.

Note, we still estimate whether our merge fits post-compaction; this just
isn't reliable for knowing whether the merge fits when appended to the
current compaction, with in-flight attrs for both the in-flight commit and
the merge name.

Fortunately, since our estimate is conservative, we shouldn't see any
split<->merge oscillation.

---

It's also worth noting that since the merge name isn't accounted for in
our sibling, it wasn't clear if our estimation was correct in the first
place.

This change avoids any issues that may cause.

One downside of using our compaction estimate as a heuristic to avoid
failed merges: The merge abort code path may be difficult to cover in tests.

We should make sure merge aborts don't go untested.
2023-05-30 14:47:29 -05:00
Christopher Haster
a57e79bc68 Some cleanup, reverted merge of rbyd_estimate/isdegenerate
Unfortunately due to different early-exit conditions,
estimate/isdegenerate aren't trivially compatible. The previous, merged
implementation missed the opportunity to inline btrees with two large
entries undergoing compaction.

It's unlikely to hit this, but splitting these back into two separate
passes simplifies the code and avoids the potential for other bugs from
this combination of unrelated pieces of logic.

Keep in mind lfsr_rbyd_isdegenerate is cheap:
1. It only runs when compacting the root of a btree.
2. The cutoff is usually small; at the moment it requires at most 2 ids,
   or 2*2 rbyd lookups with the current btree implementation.
2023-05-30 14:47:01 -05:00
Christopher Haster
975a98b099 Renamed a few superblock-related things
- supermdir -> mroot
- supermagic -> magic
- superconfig -> config
2023-05-30 14:46:56 -05:00
Christopher Haster
7925f9f019 Some more mtree split/uninlining tests and fixes
Currently relying on lfsr_rbyd_append/appendattrs to inject extra
attributes during lfsr_mdir_commit; need to consider if this is really
the best solution. This probably results in more function calls than we
really need.
2023-05-30 14:44:18 -05:00
Christopher Haster
9b72406632 Implemented mtree uninlining and splitting
This is the first step towards a working mtree, though it raises more
questions than it resolves.
2023-05-30 13:55:21 -05:00
Christopher Haster
a3bfa3488f Adopted a different strategy for mdir split threshold estimation
This approach is simpler: fall back to using two passes if we split a
supermdir.

This trades off code complexity for runtime, but I think we really don't
care about the runtime here, since this operation should really only happen
once in a filesystem's entire lifetime.
2023-05-30 13:55:16 -05:00
Christopher Haster
06b04bda6b Working toward supermdir split, consolidated more logic into lfsr_rbyd_inthresh
This is tricky because of the number of corner-cases that can occur:

1. Our supermdir fits as is => compact normally.

2. Our supermdir does not fit, but it does if we separate the superattrs
   from file attrs => uninline, but don't split.

3. Our supermdir does not fit, and does not fit after separating the
   superattrs => uninline and split.
2023-05-30 13:53:59 -05:00
Christopher Haster
f15add4374 Simplified rbyd compaction estimate
This trades off a simpler compaction estimate for a looser upper bound.
We now only lower the bound for:

- The number of alts per tag.
- The worst-case leb128 encoding assuming current block_size.

Since this worst case encoding only depends on the block_size, it can
also be precalculated and stored somewhere, though we're currently not
doing that.

On the plus side, this no longer varies depending on the rbyd's weight,
which could cause hard-to-detect issues for very large btrees.
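
For example, the worst-case leb128 width depends only on the largest
value we might encode (a sketch):

  #include <stdint.h>

  // leb128 stores 7 bits per byte, so the worst-case width of any
  // value bounded by max (e.g. offsets bounded by block_size) is
  // ceil(bits(max)/7) -- since this depends only on block_size it
  // could be precalculated once and stored
  static unsigned leb128_maxwidth(uint32_t max) {
      unsigned width = 0;
      do {
          width += 1;
          max >>= 7;
      } while (max);
      return width;
  }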
2023-05-30 13:44:08 -05:00
Christopher Haster
b97192886c Updated benches to match internal API changes 2023-05-30 13:43:40 -05:00
Christopher Haster
beba584501 Implemented and adopted rbyd compaction estimates
Still needs work, but at least adopted optionally in the btree.

Ignoring the mdirs for now, which is a bit ironic, because the mdir
compaction is really what this feature is for. But this at least proves
the concept.

---

Unlike btrees, mdirs simply cannot perform the attempt-then-delete-half
strategy currently performed by the btrees during compaction with a single
pcache. This is because the moment we finish the commit with the delete,
it becomes visible to the filesystem. We can't abort the commit temporarily
to deal with the other half of the split, because our pcache is in use.

So, instead, the idea is to estimate the compacted rbyd size before
compacting, using conservative (but tight!) estimates for various leb128
encoded parts of the metadata.

And if we adopt this strategy for mdirs, we should probably adopt it in
the btrees for better code sharing.

A couple benefits:

- Major reduction in progs during split, since we don't write out tags
  just to delete them.

- btree merge can actually consider both siblings now.

- Not needing to weave the split/merge logic around compact offers a
  better route for code deduplication.

- mdir compact will actually work, that's generally a good thing.

And a couple downsides:

- This estimate is complex, meaning more code-cost and a bigger surface
  area for bugs.

- This results in a minor performance hit for the common compact case,
  since we need to read the rbyd being compacted twice instead of once.
2023-05-30 13:41:41 -05:00
Christopher Haster
738eb52159 Tweaked tag encoding/naming for btrees/branches
LFSR_TAG_BNAME => LFSR_TAG_BRANCH
LFSR_TAG_BRANCH => LFSR_TAG_BTREE

Maybe this will be a problem in the future if our branch structure is
not the same as a standalone btree, but I don't really see that
happening.
2023-05-30 13:41:28 -05:00
Christopher Haster
4e3dca0b81 Partial implementation of a rudimentary mtree
This became surprisingly tricky.

The main issue is knowing when to split mdirs, and how to determine
this without wasting erase cycles.

Unlike splitting btree nodes, we can't salvage failed compacts here. As
soon as the salvage commit is written to disk, the commit becomes immediately
visibile to the filesystem because it still exists in the mtree. This is
a problem if we lose power.

We're likely going to need to implement rbyd estimates. This is
something I hoped to avoid because it brings in quite a bit of
complexity and might lead to an annoying amount of storage waste since
our estimates will need to be conservative to avoid unrecoverable
situations.

---

Also changed the on-disk btree/branch struct to store a copy of the weight.

This was already required for the root of the btree; requiring the
weight to be stored in every btree pointer allows better code
deduplication at the cost of some redundancy on btree branches, where
the weight is already implied by the rbyd structure.

This weight is usually a single byte for most branches anyways.

This may be worth revisiting at some point to see if there's any other
unexpected tradeoffs.
2023-05-30 13:28:35 -05:00
Christopher Haster
85ebdd0881 Reintroduced Brent's algorithm for cycle detection in lfsr_mount 2023-05-30 13:28:07 -05:00
Christopher Haster
6236f460a4 Added rough draft of the rest of superblock parsing 2023-05-30 13:27:05 -05:00
Christopher Haster
eacf5895c6 Reorganized lfsr_mount/lfsr_format to better reuse code
Added lfsr_mountinited/lfsr_formatinited, mostly so lfsr_formatinited
can just call lfsr_mountinited for its mount-check.

This also leads to a nice consolidation of the cleanup-on-error part of
lfsr_mount/lfsr_format.
2023-05-30 13:26:14 -05:00
Christopher Haster
038f6b4c4b Adopted more pedantic names for lookupnext/lookup
The exact behavior of lfsr_rbyd_lookup is a bit unusual, and has already
resulted in a few mistakes. To make this more clear at a glance, names
have been changed and a few more helper functions added.

The new names and expected behavior:

- *_lookupnext - lookup the smallest id/tag greater than or equal to the
  requested id/tag, returns LFS_ERR_NOENT if id/tag is greater than all
  ids/tags in the data structure.

- *_lookup - lookup the exact id/tag, returns LFS_ERR_NOENT if id/tag
  is not in the data structure.

These have been adopted in all current data structures: rbyd/btree/mdir

- lfsr_rbyd_lookup => lfsr_rbyd_lookupnext
- lfsr_btree_lookup => lfsr_btree_lookupnext
- lfsr_btree_namelookup => lfsr_btree_namelookupnext
- lfsr_mdir_lookup => lfsr_mdir_lookupnext

Note that no lfsr_btree_namelookup is added; this is more complicated
than lfsr_btree_lookup (we need to cmp the name on-disk for equality)
and also probably not needed.
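
For reference, exact lookup can be built on top of lookupnext by
checking the result (a sketch with stand-in names):

  #include <stdint.h>

  enum { ERR_NOENT = -2 }; // stand-in error code

  // hypothetical lookupnext: returns the smallest id/tag >= requested
  extern int lookupnext(void *t, int32_t id, uint16_t tag,
          int32_t *id_, uint16_t *tag_);

  // exact lookup: NOENT unless lookupnext found exactly what we asked
  static int lookup(void *t, int32_t id, uint16_t tag) {
      int32_t id_;
      uint16_t tag_;
      int err = lookupnext(t, id, tag, &id_, &tag_);
      if (err) {
          return err;
      }
      if (id_ != id || tag_ != tag) {
          return ERR_NOENT;
      }
      return 0;
  }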
2023-05-30 13:25:40 -05:00
Christopher Haster
c83d8b7abc Added lfsr_data_add for more lfsr_data_t manipulation
Also fixed an internal (currently unreachable) bug in lfs_bd_cmp where
the hint could underflow if zero.
2023-05-30 13:25:29 -05:00
Christopher Haster
59552e8f1e Merged lfsr_data_progcsum into lfsr_data_prog
Considering that _most_ progs in littlefs need to be checksummed for
future consistency checks, it's not really worth it to have a separate
non-checksumming function. Especially when you consider the fanout of
the prog extensions for different types.

A similar problem, sort of unresolved, is what to do with all the
validating functions. I guess we'll cross that bridge when we need to.
2023-05-30 13:24:40 -05:00
Christopher Haster
21bd43fa0c Implemented a set of convenience lfsr_data_read* functions
This makes lfsr_data_t a more powerful primitive in littlefs.

- Implemented a set of convenience lfsr_data_read* functions for
  easier reading from muxed on-disk/in-ram data.

- Adopted these functions and lfsr_data_t in *_fromdisk functions
  as well as low-level rbyd operations.

- Changed leb128 parsing to rely on returned limits instead of
  pre-initialized 0xffs to detect truncated leb128s.

- Prefer incremental reading+parsing in more places as a side-effect,
  though we don't really have a measurement of whether this is a net
  benefit or cost code/ram-wise.
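
The core of the muxed representation looks something like this (a
self-contained sketch, not the actual lfsr_data_t):

  #include <stdbool.h>
  #include <stddef.h>
  #include <stdint.h>

  // a data reference is either an in-RAM buffer or an on-disk extent;
  // lfsr_data_read*-style helpers dispatch on which one it is, so
  // callers don't care where the bytes live
  typedef struct data {
      bool on_disk;
      union {
          struct { const uint8_t *buf; size_t size; } ram;
          struct { uint32_t block; uint32_t off; size_t size; } disk;
      } u;
  } data_t;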
2023-05-30 13:23:48 -05:00
Christopher Haster
283b8e84c4 Some more experimental lfsr_bd_ functions
These functions offer more than previous internal bd functions, the idea being
that the more functionality we can move into this layer, the less
functionality gets duplicated across dependent functions.

- lfsr_bd_read - caching read with hint
- lfsr_bd_readcsum - read with checksum
- lfsr_bd_csum - calculate checksum, don't read data
- lfsr_bd_cmp - compare data against a buffer
- lfsr_bd_prog - caching prog
- lfsr_bd_progcsum - prog with checksum
- lfsr_bd_sync - complete an in-flight prog
- lfsr_bd_progvalidate - prog with read-back validation
- lfsr_bd_progcsumvalidate - prog with checksum and read-back validation
- lfsr_bd_syncvalidate - complete an in-flight prog with read-back validation
- lfsr_bd_erase - erase a block

- lfsr_bd_readtag - read a tag with optional checksum
- lfsr_bd_progtag - prog a tag with checksum

Of course these are all susceptible to change.
2023-05-30 13:17:01 -05:00