Much like the erased-state checksums in our rbyds (ecksums), these
block-level erased-state checksums (becksums) allow us to detect failed
progs to erased parts of a block and are key to achieving efficient
incremental write performance with large blocks and frequent power
cycles/open-close cycles.
These are also key to achieving _reasonable_ write performance for
simple writes (linear, non-overwriting), since littlefs now relies
solely on becksums to efficiently append to blocks.
Though I suppose the previous block staging logic used with the CTZ
skip-list could be brought back to make becksums optional and avoid
btree lookups during simple writes (we do a _lot_ of btree
lookups)... I'll leave this open as a future optimization...
Unlike in-rbyd ecksums, becksums need to be stored out-of-band so our
data blocks only contain raw data. Since they are optional, an
additional tag in the file's btree makes sense.
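
As a rough sketch of how a becksum gates an append (the names and
signatures here are made up, only the shape matters):

    #include <stdbool.h>
    #include <stdint.h>
    #include <stddef.h>

    // placeholder checksum standing in for whatever checksum the
    // filesystem actually uses on-disk
    static uint32_t fs_cksum(uint32_t cksum, const void *buffer, size_t size) {
        const uint8_t *b = buffer;
        for (size_t i = 0; i < size; i++) {
            cksum = (cksum << 5) + cksum + b[i];
        }
        return cksum;
    }

    // before appending, check that the erased tail of the block still
    // matches the becksum stored out-of-band in the file's btree; if it
    // doesn't, a failed prog may have disturbed the erased state and we
    // need to fall back to allocating a new block
    static bool block_can_append(const uint8_t *erased_tail, size_t tail_size,
            uint32_t becksum) {
        return fs_cksum(0, erased_tail, tail_size) == becksum;
    }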
Becksums are relatively simple, but they bring some challenges:
1. Adding becksums to file btrees is the first case we have for multiple
struct tags per btree id.
This isn't too complicated a problem, but requires some new internal
btree APIs.
Looking forward, which I probably shouldn't be doing this often,
multiple struct tags will also be useful for parity and content ids
as a part of data redundancy and data deduplication, though I think
it's uncontroversial to consider these both heavier-weight features...
2. Becksums only work if unfilled blocks are aligned to the prog_size.
This is the whole point of crystal_size -- to provide temporary
storage for unaligned writes -- but actually aligning the block
during writes turns out to be a bit tricky without a bunch of
unnecessary btree lookups (we already do too many btree lookups!).
The current implementation here discards the pcache to force
alignment, taking advantage of the requirement that
cache_size >= prog_size, but this is corrupting our block checksums.
Code cost:
code stack
before: 31248 2792
after: 32060 (+2.5%) 2864 (+2.5%)
Also lfsr_ftree_flush needs work. I'm usually open to gotos in C when
they improve internal logic, but even for me, the multiple goto jumps
from every left-neighbor lookup into the block writing loop are a bit
much...
The lookups are reduced mainly by not looking up left neighbors after
the first of many block/fragment/crystal writes.
The right neighbors should already be avoiding redundant lookups since
redundant lookups imply a full fragment/block on the not-last write.
Additionally, once we fail our crystallization check, we can assume all
future fragment writes will fail, so we only need to do that lookup
once.
There's probably still more to optimize, but the way these heuristics
interact is tricky...
This combines the previous lfsr_data_truncate/lfsr_data_fruncate
behavior into a single flexible function, and makes truncate/fruncate
small aliases (drop in the future?).
The combined behavior lets us adopt lfsr_data_slice in more places.
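
With guessed signatures and a simplified lfsr_data_t, the aliases end up
looking something like this:

    #include <stdint.h>

    typedef uint32_t lfs_size_t;

    // a simplified stand-in for lfsr_data_t, which in reality can also
    // reference on-disk data
    typedef struct lfsr_data {
        const uint8_t *buffer;
        lfs_size_t size;
    } lfsr_data_t;

    // the single flexible function: keep only [off, off+size) of the data
    static lfsr_data_t lfsr_data_slice(lfsr_data_t data,
            lfs_size_t off, lfs_size_t size) {
        if (off > data.size) {
            off = data.size;
        }
        if (size > data.size - off) {
            size = data.size - off;
        }
        return (lfsr_data_t){.buffer = data.buffer + off, .size = size};
    }

    // truncate keeps the front, fruncate keeps the back
    static lfsr_data_t lfsr_data_truncate(lfsr_data_t data, lfs_size_t size) {
        return lfsr_data_slice(data, 0, size);
    }

    static lfsr_data_t lfsr_data_fruncate(lfsr_data_t data, lfs_size_t size) {
        return lfsr_data_slice(data,
                (size < data.size) ? data.size - size : 0, size);
    }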
Looking forward, bptr checksums provide an easy mechanism to validate
data residing in blocks. This extends the merkle-tree-like nature of the
filesystem all the way down to the data level, and is common in other
COW filesystems.
Two interesting things to note:
1. We don't actually check data-level checksums yet, but we do calculate
data-level checksums unconditionally.
Writing checksums is easy, but validating checksums is a bit more
tricky. This is made a bit harder for littlefs, since we can't hold
an entire block of data in RAM, so we have to choose between separate
bus transactions for checksum + data reads, or extremely expensive
overreads on every read.
Note this already exists at the metadata level: the separate bus
transactions for rbyd fetch + rbyd lookup mean we _are_ susceptible
to a very small window where bit errors can get through.
But anyways, writing checksums is easy, and has basically no cost
since we are already processing the data for our write. So we might
as well write the data-level checksums at all times, even if we
aren't validating at the data level.
2. To make bptr checksums work cheaply we need an additional cksize
field to indicate how much data is checksummed.
This field seems redundant when we already have the bptr's data size,
but if we didn't have this field, we would be forced to recalculate
the checksum every time a block is sliced. This would be
unreasonable.
The immutable cksize field does mean we may be checksumming more data
than we need to when validating, but we should be avoiding small
block slices anyways for storage cost reasons.
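
For illustration, the bptr now carries something like the following
(the exact field names are guesses):

    #include <stdint.h>

    typedef uint32_t lfs_block_t;
    typedef uint32_t lfs_size_t;

    typedef struct lfsr_bptr {
        lfs_block_t block;   // block the data lives in
        lfs_size_t off;      // offset of this slice in the block
        lfs_size_t size;     // size of this slice, changes when sliced
        lfs_size_t cksize;   // how much of the block the cksum covers,
                             // immutable so slicing never needs to
                             // recalculate the checksum
        uint32_t cksum;      // checksum over the first cksize bytes
    } lfsr_bptr_t;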
This does add some stack cost because our bptr struct is larger now:
code stack
before: 31200 2768
after: 31272 (+0.2%) 2800 (+1.1%)
This avoids redundant lookups when holes are involved.
And we don't really leverage data holes as an abstraction well. We use
data holes in two places, but they do two different things, so they may
as well be specialized operations.
code stack
before: 31092 2752
after: 31200 (+0.3%) 2768 (+0.6%)
Instead of writing every possible config that has the potential to be
useful in the future, stick to just writing the configs that we know are
useful, and error if we see any configs we don't understand.
This prevents unnecessary config bloat, while still allowing configs to
be introduced in a backwards compatible way in the future.
Currently unknown configs are treated as a mount error, but in theory
you could still try to read the filesystem, just with potentially
corrupted data. Maybe this could be behind some sort of "FORCE" mount
flag. littlefs must never write to the filesystem if it finds unknown
configs.
---
This also creates a curious case for the hole in our tag encoding
previously taken up by the OCOMPATFLAGS config. We can query for any
config > SIZELIMIT with lookupnext, but the OCOMPATFLAGS flag would need
an extra lookup which just isn't worth it.
Instead I'm just adding OCOMPATFLAGS back in. To support OCOMPATFLAGS
littlefs has to do literally nothing, so this is really more of a
documentation change. And who knows, maybe OCOMPATFLAGS will have some
weird use case in the future...
Also:
- Renamed GSTATE -> GDELTA for gdelta tags. GSTATE tags added as
separate in-device flags. The GSTATE tags were already serving
this dual purpose.
- Renamed BSHRUB* -> SHRUB when the tag is not necessarily operating
on a file bshrub.
- Renamed TRUNK -> BSHRUB
The tag encoding space now has a couple funky holes:
- 0x0005 - Hole for aligning config tags.
I guess this could be used for OCOMPATFLAGS in the future?
- 0x0203 - Hole so that ORPHAN can be a 1-bit difference from REG. This
could be after BOOKMARK, but having a bit to differentiate littlefs
specific file types (BOOKMARK, ORPHAN) from normal file types (REG,
DIR) is nice.
I guess this could be used for SYMLINK if we ever want symlinks in the
future?
- 0x0314-0x0318 - Hole so that the mdir related tags (MROOT, MDIR,
MTREE) are nicely aligned.
This is probably a good place for file-related tags to go in the
future (BECKSUM, CID, COMPR), but we only have two slots, so will
probably run out pretty quickly.
- 0x3028 - Hole so that all btree related tags (BTREE, BRANCH, MTREE)
share a common lower bit-pattern.
I guess this could be used for MSHRUB if we ever want mshrubs in the
future?
I'm just not seeing a use case for optional compat flags (ocompat), so
dropping for now. It seems their *nix equivalent, feature_compat, is
used to inform fsck of things, but this doesn't really make sense in
littlefs since there is no fsck. Or from a different perspective,
littlefs is always running fsck.
Ocompat flags can always be added later (since they do nothing).
Unfortunately this really ruins the alignment of the tag encoding. For
whatever reason config limits tend to come in pairs. For now the best
solution is to just leave tag 0x0006 unused. I guess you can consider it
reserved for hypothetical ocompat flags in the future.
---
This adds an rcompat flag for the grm, since in theory a filesystem
doesn't need to support grms if it never renames files (or creates
directories?). But if a filesystem doesn't support grms and a grm gets
written into the filesystem, this can lead to corruption.
I think every piece of gstate will end up with its own compat flag for
this reason.
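
As a sketch of the idea (the flag values and names here are made up),
mount just rejects any rcompat bits it doesn't understand:

    #include <stdint.h>

    enum {
        LFS_ERR_INVAL = -22,
    };

    // rcompat flags we know how to handle; an implementation that never
    // renames files could in theory omit the grm flag
    #define LFSR_RCOMPAT_GRM        0x00000001
    #define LFSR_RCOMPAT_SUPPORTED  LFSR_RCOMPAT_GRM

    static int lfsr_mount_checkrcompat(uint32_t rcompatflags) {
        // an unknown rcompat bit means we don't fully understand the
        // filesystem and risk corrupting it, so refuse to mount
        if (rcompatflags & ~(uint32_t)LFSR_RCOMPAT_SUPPORTED) {
            return LFS_ERR_INVAL;
        }
        return 0;
    }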
---
Also renamed r/w/oflags -> r/w/ocompatflags to make their purpose
clearer.
---
The code impact of adding the grm rcompat flag is minimal, and will
probably be less for additional rcompat flags:
code stack
before: 31528 2752
after: 31584 (+0.2%) 2752 (+0.0%)
Also renamed lfsr_rbyd_compact -> lfsr_rbyd_appendcompaction to make
room for a more high-level lfsr_rbyd_compact and emphasize that this is
an append operation.
lfsr_rbyd_appendshrub was a common pattern emerging in mdir commit
related functions for moving shrubs around. It's just a useful function
to have.
lfsr_rbyd_compact, on the other hand, is only useful for evicting
bshrubs. Most other rbyd compaction code involves weird corner cases and
doesn't seem to generalize well (or at least I can't see it). But adding
lfsr_rbyd_compact is nice for consistency with the mdir commit/compact
functions. And it can be used by lfsr_bshrub_commit at least...
Code changes minimal:
code stack
before: 31596 2752
after: 31528 (-0.2%) 2752 (+0.0%)
Now that error recovery is well defined (at least in theory), and
high-level mdir/file functions create on-stack copies, the on-stack
copies for the low-level rbyd/btree/bshrub commit functions are
redundant and not useful.
Dropping the redundant on-stack copies in low-level functions saves a
bit of stack usage.
Additionally, we can adopt lfsr_rbyd_commit in more places where it was
previously avoided, dropping even more redundant on-stack copies.
code stack
before: 31676 2776
after: 31596 (-0.3%) 2752 (-0.9%)
Since we need access to the pending attr-list in lfsr_mdir_compact__
now, we might as well just do the pending commit. Worst case, a call to
lfsr_mdir_compact__ can provide a NULL attr-list for the previous
behavior (though we always follow up compaction with a commit).
The only downside is that this behavior is now a bit different from
lfsr_rbyd_compact. Need to revisit lfsr_rbyd_compact and see if it
should do the same.
This had a tiny improvement to code size, which I think is just the cost
of two function calls:
code stack
before: 31716 2776
after: 31676 (-0.1%) 2776 (+0.0%)
Turns out we don't need lfsr_ftree_(unbuffered)readnext. Even if we did,
the unbuffered behavior can be obtained by passing a NULL buffer to
lfsr_ftree_readnext.
Code changes minimal:
code stack
before: 31760 2776
after: 31716 (-0.1%) 2776 (+0.0%)
lfsr_ftree_t acts as a sort of proto-file type, holding enough
information for file reads/writes if the relevant mdir is known.
This lets low-level file write operations operate on a copy of the
proto-file without needing to copy the relevant mdir, file stuff, etc.
To make this work, lfsr_mdir_commit also needs to stage any bshrubs in
the attr-list, since these may not be in our opened file list, but this
is a good thing to handle implicitly anyways. We should only ever have
one untracked bshrub being operated on (multithreaded support would be a
whole other can of worms).
Unfortunately the extra machinery in lfsr_mdir_commit, and the fact
that passing two pointers around instead of one adds quite a bit of
code, mean this comes with a code cost. But the stack savings, and the
fact that there's no longer a risk of stack pointers in our opened file
list, make this probably worth it:
code stack
before: 31584 2824
after: 31760 (+0.6%) 2776 (-1.7%)
Not yet tested, thus the "in theory". Testing this is going to be a bit
tricky. Fortunately on-stack copies are a pretty resilient way to
recover from errors.
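
The pattern itself is simple enough, something like this with
placeholder types:

    // the on-stack-copy recovery pattern, with placeholder types; the
    // real mdir/file commit functions follow the same shape
    struct state {
        int weight;
        // ...
    };

    static int try_commit(struct state *s, int (*commit)(struct state *)) {
        // work on a copy so the caller's state is untouched if the
        // commit fails partway through
        struct state s_ = *s;
        int err = commit(&s_);
        if (err) {
            // the original still describes the last good on-disk state
            return err;
        }
        // only adopt the new state once everything succeeded
        *s = s_;
        return 0;
    }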
This comes with an unfortunate, but necessary, code/stack increase:
code stack
before: 31280 2736
after: 31584 (+1.0%) 2824 (+3.2%)
The main optimization here is to try to minimize the number of
individual btree/bshrub commits. We can't always perform
lfsr_btree_carve in a single commit unfortunately, due to unbounded
crystal fragments and needing to commit to different leaf rbyds, but we
can combine attrs into single commits more than we were previously.
This is especially important for bshrubs, where commits can trigger full
mdir compactions.
Additionally, the method we use for breaking up small blocks into
fragments has been changed to never "lose" the underlying block pointer
until the fragmentation is complete. This prevents the block pointer
misallocation bug found earlier.
In theory this won't be necessary once file writes create on-stack
bshrub copies for error recovery, but it's at least nice to prove that
this is possible in case we ever want to not maintain on-stack bshrub
copies (code savings?).
code stack
before: 31512 2648
after: 31280 (-0.7%) 2736 (+3.3%)
This is a nuanced optimization that is relatively unique to littlefs's
use case.
Because we're in a RAM constrained environment, it's not unreasonable
for whatever temporary buffer is used to write to a file to exceed the
write buffer allocated for the file. Since the temporary buffer is
temporary, it may be orders of magnitude larger than the file's write
buffer. If this happens, breaking up the write into write-buffer-sized
chunks so we can copy the data through the file's write buffer just
wastes IO.
In theory, littlefs should work just fine with a zero-sized file write
buffer, though this is not yet tested.
One interesting subtlety with the current implementation: we still
flush the file's write buffer even when we bypass it. In theory you can
avoid
flushing, but this risks strange write orders that could make low-level
write heuristics (such as the crystallization threshold) behave really
poorly.
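
With made-up names, the write path now looks something like this:

    #include <stddef.h>

    typedef struct fs fs_t;
    typedef struct file {
        size_t buffer_size;  // size of the file's own write buffer
        // ...
    } file_t;

    // stand-ins for the real buffered/unbuffered write paths
    int file_flush(fs_t *fs, file_t *file);
    int ftree_write(fs_t *fs, file_t *file, const void *buffer, size_t size);
    int buffered_write(fs_t *fs, file_t *file, const void *buffer, size_t size);

    static int file_write(fs_t *fs, file_t *file,
            const void *buffer, size_t size) {
        if (size >= file->buffer_size) {
            // the caller's buffer is at least as large as our write
            // buffer: flush what we have first (keeps write ordering
            // sane for the low-level heuristics), then write directly
            // from the caller's buffer, skipping the redundant copy
            int err = file_flush(fs, file);
            if (err) {
                return err;
            }
            return ftree_write(fs, file, buffer, size);
        }

        // small writes still go through the file's write buffer
        return buffered_write(fs, file, buffer, size);
    }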
---
Unfortunately the tests are now failing due to an interesting but
unrelated bug. It turns out bypassing our file buffer allows in-btree
block pointers to interact with heavily fragmented btrees for the first
time in our tests. This leads to an incorrect allocation of a block that
is in the previous copy of a file's btree during lfsr_btree_carve.
This isn't an issue for btree inner nodes. Btree commits happen
atomically, with the new btree being allocated on the stack and
protected by allocator checkpoints until completion.
But this is an issue for the block pointers, because lfsr_btree_carve is
not atomic and results in intermediary states where the block pointers
are lost.
What's a bit funny is that this commit is actually part of some
preparation to introduce temporary file copies during lfsr_file_write
for error recovery. This would mean the previous state of our file would
remain viewable by the block allocator, fixing this bug.
I need to think a bit more on if this is the correct solution (debugging
these multi-layer bugs turns my brain into mush), but I think it is, in
which case this bug was fixed before it was even discovered.
Fortunately these operations are heavily tested in test_dirs. The only
difference with files is the possibility for shrubs to need to be
copied.
Bugs fixed:
- It's counterintuitive, but lfsr_rbyd_appendcompactattr _can_ error
with LFS_ERR_RANGE when we are copying a shrub. This can happen if the
underlying mdir needs compaction itself.
- It's possible for null-trunk bshrubs to appear in our filesystem
traversal. Null-trunk bshrubs don't usually appear in any stable
state, but they are created by lfsr_bshrub_alloc and lfsr_btree_commit
to represent new, yet-uncommitted shrubs.
This gets a bit tricky because we also use null-trunks to indicate if
lfsr_btree_traversal has traversed the root. We can't rely on
bid >= weight for this because zero-weight btrees are allowed.
The solution here, though maybe temporary (famous last words), is to
treat null-trunk btrees as not having a root. Which isn't really true,
but null-trunk btree roots only exist between allocator checkpoints,
so they are allowed to be unreachable.
We really need more asserts that this is the case though... At least
added an assert that we never commit/read null trunks on disk.
POSIX is notoriously full of subtle and confusing nuances. Not through
any fault of POSIX, but as a result of trying to describe a complex
system with simple and easy to use operations.
Corner cases fixed here:
- rename("dir", "file") => ENOTDIR
This is the main surprise to me, and a mistake on my part. I thought
EISDIR would be appropriate for any renames with mismatched types,
since both involve a directory. It would be simpler code-wise, and
avoid ambiguity around whether "file" is not a dir, or some other file
exists in the file's path. But I guess ENOTDIR makes more sense if you
think of the destination as the target being operated on.
- remove("/") => EINVAL
- rename("/", "x") => EINVAL
- rename("x", "/") => ENOTEMPTY
- open("/") => EISDIR
It's a bit difficult to look up what error codes around root operations
should be, since they mostly end up as EPERM on modern systems, but
this doesn't really make sense for littlefs.
The solution chosen here is to prefer directory-related errors (EISDIR,
ENOTEMPTY) when possible, and fall back to EINVAL when the only issue
is that the target is the root directory.
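
Written as test-style expectations, assuming the usual lfs_* API and
that "dir", "file", and "x" already exist:

    #include <assert.h>
    #include "lfs.h"

    static void check_root_errors(lfs_t *lfs) {
        assert(lfs_rename(lfs, "dir", "file") == LFS_ERR_NOTDIR);
        assert(lfs_remove(lfs, "/") == LFS_ERR_INVAL);
        assert(lfs_rename(lfs, "/", "x") == LFS_ERR_INVAL);
        assert(lfs_rename(lfs, "x", "/") == LFS_ERR_NOTEMPTY);

        lfs_file_t file;
        assert(lfs_file_open(lfs, &file, "/", LFS_O_RDONLY) == LFS_ERR_ISDIR);
    }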
Also I tweaked lfsr_mtree_pathlookup a bit so mid=0 indicates the target
is the root and mid=-1 indicates the target can't be created (because of
a missing directory). I think using mid=0 for the latter is a leftover
from when mid=-1 was a bit of a mess...
The previous encoding was a bit problematic with our linear and log
heuristics, which can grow thousands of powerlosses deep. You know you
have a problem when you're copying a test id that spans a dozen lines.
It also meant we were spending O(n^2) time just encoding powerloss ids:
before: 942.68s
after: 921.94s (-2.2%)
This new encoding takes advantage of the unused characters in our leb16
encoding, with an 'x' prefix indicating linear-heuristic powerlosses
and a 'y' prefix indicating log-heuristic powerlosses ('w' is used for
negative leb16s).
Before:
- explicit: 42q2q2
- linear: 123456789abcdefg1h1i1j1k1l1m1n1o1p1q1r1s1t1u1v1
- log: 1248g1g2g4g8gg1gg2gg4gg8
After:
- explicit: 42q2q2
- linear: xg2
- log: yc
This turned out to not be all that useful.
Tests already take quite a bit to run, which is a good thing! We have a
lot of tests! 942.68s or ~15 minutes of tests at the time of writing to
be exact. But simply multiplying the number of tests by some number of
geometries is heavy handed and not a great use of testing time.
Instead, tests where different geometries are relevant can parameterize
READ_SIZE/PROG_SIZE/BLOCK_SIZE at the suite level where needed. The
geometry system was just another define parameterization layer anyways.
Testing different geometries can still be done in CI by overriding the
relevant defines anyways, and it _might_ be interesting there.
Since we were only registering our inotify reader after the previous
operation completed, it was easy to miss modifications that happened
faster than our scripts. With our scripts being in Python, this happened
quite often and made it hard to trust the current state of scripts run
with --keep-open, sort of defeating the purpose of --keep-open...
I think previously this race condition wasn't avoided because of the
potential to loop indefinitely if --keep-open referenced a file that the
script itself modified, but it's up to the user to avoid this if it is
an issue.
---
Also while fixing this, I noticed our use of the inotify_simple library
was leaking file descriptors everywhere! I just wasn't closing any
inotify objects at all. A bit concerning since scripts with --keep-open
can be quite long lived...
It turns out permanent root bookmark creates some rather interesting
constraints on our mtree:
1. We can never delete all mids, since at least one mid needs to exist
to represent the root's bookmark.
2. We can never revert to an inlined mdir after uninlining, since our
root bookmark always exists to stop this. This is an unfortunate
downside as it would be nice to be able to reinline mdirs, but not
the end of the world.
This restricts what operations are possible, and transitively, what we
can test.
This commit drops the removal of root bookmarks in test_mtree, which was
a workaround to keep tests from an early implementation running. This was
preventing some minor optimizations. This required dropping some tests,
but these tests tested operations that aren't really possible in
practice.
Dropping the removal of root bookmarks allowed for a minor optimization
in lfsr_mdir_drop, and may lead to more in the future (or maybe just
stricter asserts):
code stack
before: 31280 2648
after: 31208 (-0.2%) 2648 (+0.0%)
The logic is simpler with lfsr_mdir_drop being an implicit side-effect
of lfsr_mdir_commit, and right now a simpler system is preferred over a
complex one.
I think it will be more useful to focus on reducing the stack overhead
of lfsr_mdir_drop in general.
Still, it may be worth reverting this revert in the future.
This directly trades code cost for stack cost, by moving the stack
overhead of lfsr_mdir_drop out of the hot-path. Currently, the stack
hot-path is the mdir commit needed to flush file bshrubs, and this can
never drop an mdir:
code stack
before: 31280 2648
after: 31312 (+0.1%) 2520 (-4.8%)
This ended up being much less of a simplification than I hoped it would.
It's still easier/more efficient to revert to a relocation in most cases
when dropping in an mdir split, and the small gain from simplifying how
drops/commits interact is overshadowed by the code duplication necessary
to separate lfsr_mdir_drop out from lfsr_mdir_commit:
code stack
before: 30952 2528
after: 31280 (+1.1%) 2648 (+4.7%)
Still, this does at least simplify the logical corner cases (we don't
need to abort commits when droppable anymore), and lfsr_mdir_drop is
ultimately necessary for supporting lazy file creation.
Also, having a fix-orphans step during mount gives other littlefs
implementations the option to create orphaned mdirs without compat
issues. So this ends up being the more flexible approach.
It _might_ be worth having both eager mdir drops and an explicit
lfsr_mdir_drop for lazy file creation in the future, but I doubt this
will end up worth the code duplication...
---
Oh right, I forgot to actually describe this change.
This trades eager mdir drops:
1. Drop mdirs from the mtree immediately as soon as their weight goes
to zero.
For lazy mdir drops:
1. Drop mdirs from the mtree in a second commit.
2. Scan and drop orphaned mdirs on the first write after mount.
This sounds very similar to the previous "deorphan" scan, which risked
an extreme performance cost during mount, but it should be noted this
orphan scan only needs to touch every mdir once. This makes it no worse
than the overhead of actually mounting the filesystem.
We can also keep an eye out for orphaned mdirs when we mount, so no
extra scan is needed unless there was an unlucky powerloss.
Eager mdir dropping sounds simpler, but thanks to deferred commits
introduces some subtle complexity around aborting commits that would
drop an mdir to zero. Remember commits are viewable on-disk as soon as a
commit completes.
In _theory_, lazy mdir drops simplify the logic around committing to
mdirs.
Though the real kicker is that lazy mdir drops are required for lazy file
creation.
The current idea for lazy file creation involves tracking mid-less
opened-but-not-yet-created files. These files can have bshrubs, so they
need space on an mdir somewhere. But they aren't actually created yet,
so they don't have an mid.
This is fine (though it's probably going to be tricky) as long as we
allocate an mid on file sync, but there is always a risk of losing power
with mdirs that contain only RAM-backed files. Fortunately, no-mids
means no orphaned files, but it does mean orphaned mdirs with no synced
contents.
Long story short, lazy mdir drops are currently a necessary evil, and a
logical simplification, that unfortunately comes with some cost.
This saves a bit of stack space in theory, but after the redund block
restructure this amounts to a single word.
But lfsr_mdir_commit__ isn't on the hot path anyways, so this doesn't
even matter...
code stack
before: 30948 2528
after: 30952 (+0.0%) 2528 (+0.0%)
The idea here is to revert moving redund blocks into lfsr_rbyd_t, and
instead just keep a redundant copy of the rbyd blocks in the redund
blocks in lfsr_mdir_t.
Surprisingly, extra overhead in lfsr_mdir_t ended up with worse stack
usage than extra overhead in lfsr_rbyd_t. I guess we end up allocating
more mdirs than rbyds, which makes a bit of sense given how complicated
lfsr_mdir_commit is:
code stack structs
redund union: 30976 2496 1072
redund in rbyd: 30948 (-0.1%) 2528 (+1.3%) 1100 (+2.6%)
redund in mdir: 31000 (+0.1%) 2536 (+1.6%) 1092 (+1.8%)
The mdir option does seem to improve struct overhead, but this hasn't
been a reliable measurement since it doesn't take into account how many
of each struct is allocated.
Given that the mdir option is inferior in both code and stack cost, and
requires more care to keep the rbyd/redund blocks in sync, I think I'm
going to revert this for now but keep the commit in the commit history
since it's an interesting comparison.
This simplifies dependent structs with redundancy, mainly lfsr_mdir_t,
at a significant RAM cost:
code stack structs
before: 30976 2496 1072
after: 30948 (-0.1%) 2528 (+1.3%) 1100 (+2.6%)
Which, to be honest, is not as bad as I thought it would be. Though it
is still pretty bad for no new features.
The motivation for this change:
1. The organization of the previous lfsr_mdir_t struct was a bit hacky
and relied on exact padding so the redund block array and rbyd block
lined up at the right offset.
2. The previous organization prevented theoretical "read-only rbyd
structs" that could omit write-related fields, e.g. eoff and cksum.
This idea is currently unused.
3. The current mdir=level-1, btree/data=level-0 redund design makes this
RAM tradeoff pretty bad, but in theory higher btree redund levels
would need the extra redund blocks in the rbyd struct anyways.
Still, the RAM impact to the current default configuration means this
should probably be reverted...
- Renamed mdir->u.m to mdir->u.mdir.
- Prefer mdir->u.rbyd.* where possible.
- Changed file/dir mdirs to be stored directly, requiring a cast to
lfsr_openedmdir_t to enroll in the opened mdir list.
Reverted to one set of signed lfsr_mid_rid/bid functions, and tried to
make their usage more consistent.
We have two ways to compare mdirs now, lfsr_mdir_cmp (compares block
addresses) and lfsr_mdir_bid (compares mids), and it's not very clear
when to use which one. lfsr_mdir_cmp is a bit more robust in weird mid
cases (mainly inlined mdirs when mroot mid=-1), so currently preferring
that.
Also did some bit twiddling to preserve mid=-1 => bid=-1 and rid=-1,
which saves a bit of code:
code stack
before: 31056 2488
after: 30972 (-0.3%) 2496 (+0.3%)
There's only one mtree in a given filesystem. With the recent
lfsr_mdir_commit restructure, it makes more sense for the mtree to be
implicit.
code stack
before: 31096 2480
after: 31016 (-0.3%) 2480 (+0.0%)
Originally, the intention of this rework was to make it possible to
shrub the mtree, i.e. allow an mshrub, i.e. allow the root rbyd of the
mtree to be inlined in the mroot.
This would allow small mtrees (2, 3, etc mdirs) to save the block that
would otherwise be needed for the mtree's root.
But as the mshrub was progressing, minor problems kept unfolding, and
ultimately I've decided to shelve the idea of mshrubs for now. They add
quite a bit of complexity for relatively little gain:
- bshrubs are just complicated to update. They require a call to
lfsr_mdir_commit to update the inlined-root, which is a bit of a
problem when your mshrub needs to be updated inside lfsr_mdir_commit,
and your system disallows recursion...
Recursion _can_ be avoided by separate bshrub commit variants that go
through either lfsr_mdir_commit or lfsr_mdir_commit_, but this
complicates things and requires some code duplication, weakening the
value of reusing the bshrub data-structure.
- It's not always possible to compact the mshrub's backing mroot when
we need to modify the mshrub.
If an mroot becomes full and needs to split, for example, we need to
allocate the new mdirs, update the (new) mshrub, and then commit
everything into the mroot when we compact. But the "update the (new)
mshrub" step can't be done until after we compact, because the mroot
is by definition full.
This _can_ also be worked around, by building an attr list containing
all of the mshrub changes, and committing the mshrub/mroot changes in
the same transaction, but this complicates things and increases the
stack cost for the current hot-path.
- Every shrub needs a configurable shrub size, and the mshrub is no
exception. This adds another config option and complicates shared
shrub eviction code.
- The value for mshrubs is not actually that great.
Unlike file bshrubs, there's only one mshrub in the filesystem, and
I'm not sure there's a situation where a filesystem has >1 mdirs and
the exact number of allocated blocks is critical.
And this complexity is reflected in code cost and robustness, not to
mention developer time. I think for littlefs this is just not worth
doing. At least not now.
We can always introduce mshrubs in a backwards compatible manner if
needed.
---
But this rework did lead to better code organization around mdir commits
and how they update the mtree/mroot, so I'm keeping those changes.
In general lfsr_mdir_commit has been broken up into mtree/mroot specific
functions that _do_ propagate in-device changes. Any commit to the mroot
changes the on-disk state of the filesystem anyways, so the mroot commit
_must_ be the last thing lfsr_mdir_commit does.
This leads to some duplicated updates, but that's not really a problem.
Here's the new call graph inside lfsr_mdir_commit:
lfsr_mdir_commit
.---------' | | | '-----------------.
v | | '-----------------. |
lfsr_mtree_commit | '--------. | |
'---------. | | | |
v v | | |
lfsr_mroot_commit | | |
| '--------. | | |
| v v | |
| lfsr_mdir_commit_ | |
| .--------' '--------. | |
| | .-----------------|-' |
v v v v v
lfsr_mdir_commit__ lfsr_mdir_compact__
This rework didn't really impact code/stack that much. It added a bit of
code, but saved a bit of RAM. The real value is that the narrower-scoped
functions contain more focused logic:
code stack
before: 30780 2504
after: 31096 (+1.0%) 2480 (-1.0%)
This is just a useful type to have to make the code a bit more
readable.
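
Something along these lines (the exact definition in lfs.c may differ):

    #include <stdint.h>

    typedef uint32_t lfs_block_t;

    // an mdir is backed by a pair of blocks, so a "metadata pointer" is
    // really just that pair
    typedef struct lfsr_mptr {
        lfs_block_t blocks[2];
    } lfsr_mptr_t;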
This doesn't affect the code that much, except we are making more
on-stack copies of mptrs since the mdir doesn't technically contain
a mutable mptr. Maybe this should change?
code stack
before: 30768 2496
after: 30776 (+0.0%) 2504 (+0.3%)
- There was a lingering strict pcache assert in lfs_bd_erase. Very
unlikely to hit, but it is possible and shouldn't be an assert now
that pcache can be left in an arbitrary state. That being said, it
was asserting on an actual bug in this case.
- Our btree traversal was not traversing the roots of zero-weight
btrees. Zero-weight btrees can happen as an intermediary step during
btree/bshrub carving. If the stars align with the block allocator and
intermediary carving states this can cause incorrect block
allocations.
- Staged updates to bsprouts/bshrubs need to be played out before
updates to opened mdirs in lfsr_mdir_commit. This is just because
lfsr_file_isbsprout/isbshrub depend on mdir.block, and updating the
mdirs first corrupts this.
Maybe a different organization of this code would be useful; it is
already full of TODOs.
It turned out that by implicitly handling root allocation in
lfsr_btree_commit_, we were never allowing lfsr_bshrub_commit to
intercept new roots as new bshrubs. Fixing this required moving the
root allocation logic up into lfsr_btree_commit.
This resulted in quite a bit of small bug fixing because it turns out if
you can never create non-inlined bshrubs you never test non-inlined
bshrubs:
- Our previous rbyd.weight == btree.weight check for whether we've
reached the root no longer works; changed to an explicit check that the
blocks match. Fortunately, now that new roots set trunk=0, new roots
are no longer a problematic case.
- We need to only evict when we've calculated an accurate estimate; the
previous code had a bug where eviction occurred early based only on the
progged-since-last-estimate.
- We need to manually set bshrub.block=mdir.block on new bshrubs,
otherwise the lfsr_bshrub_isbshrub check fails in mdir commit staging.
Also updated btree/bshrub following code in the dbg scripts, which
mostly meant making them accept both BRANCH and SHRUBBRANCH tags as
btree/bshrub branches. Conveniently very little code needs to change
to extend btree read operations to support bshrubs.
Unfortunately, waiting to evict shrubs until mdir compaction does not
work because we only have a single pcache. When we evict a bshrub we
need a pcache for writing the new btree root, but if we do this during
mdir compaction, our pcache is already busy handling the mdir
compaction. We can't do a separate pass for bshrub eviction, since this
would require tracking an unbounded number of new btree roots.
In the previous shrub design, we meticulously tracked the compacted
shrub estimate in RAM, determining exactly how the estimate would change
as a part of shrub carve operations.
This worked, but was fragile. It was easy for the shrub estimate to
diverge from the actual value, and required quite a bit of extra code to
maintain. Since the use cases for bshrubs are growing a bit, I didn't
want to return to this design.
So here's a new approach based on emulating btree compacts/splits inside
the shrubs:
1. When a bshrub is fetched, scan the bshrub and calculate a compaction
estimate. Store this.
2. On every commit, find the upper bound of new data being progged, and
keep track of estimate + progged. We can at least get this relatively
easily from commit attr lists. We can't get the amount deleted, which
is the problem.
3. When estimate + progged exceeds shrub_size, scan the bshrub again and
recalculate the estimate.
4. If estimate exceeds the shrub_size/2, evict the bshrub, converting it
into a btree.
As you may note, this is very close to how our btree compacts/splits
work, but emulated. In particular, evictions/splits occur at half the
shrub_size/block_size (respectively) in order to avoid runaway costs
when the bshrub/btree gets close to full.
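
With made-up names, the per-commit check looks something like this:

    #include <stddef.h>

    typedef struct shrub {
        size_t estimate;  // compaction estimate from the last full scan
        size_t progged;   // upper bound of bytes progged since that scan
    } shrub_t;

    // stand-ins for the full rescan and for converting a too-big shrub
    // into a proper btree
    size_t shrub_estimate(shrub_t *shrub);
    int shrub_evict(shrub_t *shrub);

    static int shrub_commit_check(shrub_t *shrub,
            size_t commit_size, size_t shrub_size) {
        // 2. track an upper bound of new data progged by this commit
        shrub->progged += commit_size;

        // 3. once estimate + progged exceeds shrub_size, rescan for an
        // accurate estimate (this also fixes any divergence)
        if (shrub->estimate + shrub->progged > shrub_size) {
            shrub->estimate = shrub_estimate(shrub);
            shrub->progged = 0;

            // 4. if the accurate estimate still exceeds shrub_size/2,
            // evict the shrub into a btree, mirroring how btree splits
            // kick in at ~1/2 full
            if (shrub->estimate > shrub_size/2) {
                return shrub_evict(shrub);
            }
        }

        return 0;
    }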
Benefits:
- This eviction heuristic is very robust. Calculating the amount progged
from the attr list is relatively cheap and easy, and any divergence
should be fixed when we recalculate the estimate.
- The runtime cost is relatively small, amortized O(log n) which is
the existing runtime to commit to rbyds.
Downsides:
- Just like btree splits, evictions force our bshrub to be ~1/2 full on
average. This combined with the 2x cost for mdir pairs, the 2x cost
for mdirs being ~1/2 full on average, and the need for both a synced
and unsynced copy of file bshrubs brings our file bshrub's overhead up
to ~16x, which is getting quite high...
Anyways, bshrubs now work, and the new file topology is passing testing.
An unfortunate surprise is the jump in stack cost. This seems to come from
moving the lfsr_btree_flush logic into the hot-path that includes bshrub
commit + mdir commit + all the mtree logic. Previously the separation of
btree/shrub commits meant that the more complex block/btree/crystal logic
was on a separate path from the mdir commit logic:
code stack lfsr_file_t
before bshrubs: 31840 2072 120
after bshrubs: 30756 (-3.5%) 2448 (+15.4%) 104 (-15.4%)
I _think_ the reality is not actually as bad as measured; most of these
flush/carve/commit functions calculate some work and then commit it in
separate steps. In theory GCC's shrink-wrapping optimizations should
limit the stack to only what we need as we finish different
calculations, but our current stack measurement scripts just add
together the whole frames, so any per-call stack optimizations get
missed...
Note this is intentionally different from how lfsr_rbyd_fetch behaves
in lfs.c. We only call lfsr_rbyd_fetch when we need validated checksums,
otherwise we just don't fetch.
The dbg scripts, on the other hand, always go through fetch, but it is
useful to be able to inspect the state of incomplete trunks when
debugging.
This used to be how the dbg scripts behaved, but they broke because of
some recent script work.
As a part of the general redesign of files, all files, not just small
files, can inline some data directly in the metadata log. Originally,
this was a single piece of inlined data or an inlined tree (shrub) that
effectively acted as an overlay over the block/btree data.
This is now changed so that when we have a block/btree, the root of the
btree is inlined. In effect making a full btree a sort of extended
shrub.
I'm currently calling this a "geoxylic btree", since that seems to be a
somewhat related botanical term. Geoxylic btrees have, at least on
paper, a number of benefits:
- There is a single lookup path instead of two, which simplifies code a
bit and decreases lookup costs.
- One data structure instead of two also means lfsr_file_t requires
less RAM, since all of the on-disk variants can go into one big union.
Though I'm not sure this is very significant vs stack/buffer costs.
- The write path is much simpler and has less duplication (it was
difficult to deduplicate the shrub/btree code because of how the
shrub goes through the mdir).
In this redesign, lfsr_btree_commit_ leaves root attrs uncommitted,
allowing lfsr_bshrub_commit to finish the job via lfsr_mdir_commit.
- We don't need to maintain a shrub estimate, we just lazily evict trees
during mdir compaction. This has a side-effect of allowing shrubs to
temporarily grow larger than shrub_size before eviction.
NOTE THIS (fundamentally?) DOESN'T WORK
- There is no awkwardly high overhead for small btrees. The btree root
for two-block files should be able to comfortably fit in the shrub
portion of the btree, for example.
- It may be possible to also make the mtree geoxylic, which should
reduce storage overhead of small mtrees and make better use of the
mroot.
All of this being said, things aren't working yet. Shrub eviction during
compaction runs into a problem with a single pcache -- how do we write
the new btrees without dropping the compaction pcache? We can't evict
btrees in a separate pass because their number is unbounded...
Also limited block_size/block_count updates to only happen when the
configured value is None. This matches dbgbmap.py.
Basically just a cleanup of some bugs after the rework related to
matching dbgbmap.py. Unfortunately these scripts have too much surface
area and no tests...