littlefs

Author	SHA1	Message	Date
Christopher Haster	e8b1be17fe	Added bshrub support to dbgbmap.py Forgot about this script.	2024-02-03 18:16:49 -06:00
Christopher Haster	bea13dcf8e	Use sign bit of rbyd.trunk to indicate shrubness of rbyds Shrubness should have always been a property of lfsr_rbyd_t. You know you've made a good design decision when things just sort of fall into place and the code somehow becomes cleaner. The downside of this change is accessing rbyd trunks requires a mask, which is annoying, but the upside is we don't need to signal shrubness via extra booleans in internal functions anymore. The funny thing is, the actual motivation for this change was just to free up a bit in our tag encoding. Simplifying some of the internal functions was just a nice side effect. code stack before: 33940 2928 after: 33928 (-0.0%) 2912 (-0.5%)	2024-02-03 18:16:45 -06:00
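The trunk-masking idea in the commit above can be sketched roughly as follows. This is a minimal illustration, not the real `lfsr_rbyd_t`: the struct layout, the `example_` names, and the choice of an unsigned 32-bit field whose top bit carries the flag are all assumptions made for the sketch.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

// Hypothetical stand-in for an rbyd handle; the real lfsr_rbyd_t differs.
typedef struct example_rbyd {
    uint32_t block;
    // Top bit: "this rbyd is a shrub". Low 31 bits: the trunk offset.
    uint32_t trunk;
} example_rbyd_t;

// Assumed flag position: the top (sign) bit of the trunk field.
#define EXAMPLE_TRUNK_SHRUB 0x80000000u

// Shrubness travels with the rbyd, so callers need no extra boolean.
static inline bool example_rbyd_isshrub(const example_rbyd_t *rbyd) {
    return (rbyd->trunk & EXAMPLE_TRUNK_SHRUB) != 0;
}

// The cost of the encoding: every trunk access has to mask the flag off.
static inline uint32_t example_rbyd_trunk(const example_rbyd_t *rbyd) {
    return rbyd->trunk & ~EXAMPLE_TRUNK_SHRUB;
}

static inline void example_rbyd_settrunk(example_rbyd_t *rbyd,
        uint32_t trunk, bool shrub) {
    // The offset itself must fit in the low 31 bits.
    assert((trunk & EXAMPLE_TRUNK_SHRUB) == 0);
    rbyd->trunk = trunk | (shrub ? EXAMPLE_TRUNK_SHRUB : 0u);
}
```

Internal functions that previously took a separate "is this a shrub" argument could, under this scheme, take only the rbyd and call `example_rbyd_isshrub` when they need to branch.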
Christopher Haster	15593ccc49	Renamed scratch files -> orphan files I was originally avoiding naming these orphans, as they're _technically_ not orphans. They do exist in the mtree. But the name orphan just describes this type's purpose too well. This does lead to some confusing terms, such as the fact that orphan files can be non-orphaned if there are any in-device references. But I think this makes sense? - LFSR_TAG_SCRATCH -> LFSR_TAG_ORPHAN - LFSR_F_UNCREAT -> LFSR_F_ORPHAN - test_fscratch.toml -> test_forphan.toml	2024-02-03 18:15:38 -06:00
Christopher Haster	ba505c2a37	Implemented scratch file basics "Scratch files" are a new file type added to solve the zero-sized file problem. Though they have a few other uses that may be quite valuable. The "zero-sized file problem" is a common surprise for users, where what seems like a simple file create+write operation: lfs_file_open(&lfs, &file, "hi", LFS_O_WRONLY | LFS_O_CREAT | LFS_O_EXCL); lfs_file_write(&lfs, &file, "hello!", strlen("hello!")); lfs_file_close(&lfs, &file); can end up creating a zero-sized file under powerloss, breaking user assumptions and their code. The tricky thing is that this is actually correct behavior as defined by POSIX. `open` with O_CREAT creates a file entry immediately, which is initially zero-sized. And the fact that power can be lost between `open` and `close` isn't really avoidable. But this is a common enough footgun that it's probably worth deviating from POSIX here. But how to avoid zero-sized files exactly? First thought: Delay the file creation until sync/close, tracking uncreated files in-device until then. This solves the problem and avoids any intermediary state if we lose power, but came with a number of headaches: 1. Since we delay file creation, we don't immediately write the filename to disk on open. This implies we need to keep the filename allocated in RAM until the first sync/close call. The requirement to keep the filename allocated for new files until first sync/close could be added to open, and with the option to call sync immediately to save the filename (and accept the risk of zero-sized files), I don't think it would be _that_ bad of an API. But it would still be pretty bad. Extra bad because 1. there's no way to warn on misuse at compile-time, 2. use-after-free bugs have a tendency to go unnoticed annoyingly often, 3. it's a regression from the previous API, and 4. who the heck reads the more-or-less same `open` documentation for every filesystem they adopt. 2. 
Without an allocated mid, tracking files internally gets a lot harder. The best option I could think of was to keep the opened-file linked-list sorted by mid + (in-device) file name. This did not feel like a great solution and was going to add more code cost. 3. Handling mdir splits containing uncreated files adds another headache. Complicated lfsr_mdir_estimate further as it needs to decide in which mdir the uncreated files will end up, and potentially split on a filename that isn't even created yet. 4. Since the number of uncreated files can be potentially unbounded, you can't prevent an mdir from filling up with only uncreated files. On disk this ends up looking like an "empty" mdir, which needs special handling in littlefs to reclaim after powerloss. Support for empty mdirs -- the orphaned mdir scan -- was already added earlier. We already scan each mdir to build gstate, so it doesn't really add much cost. Notice that last bullet point? We already scan each mdir during mount. Why not, instead of scanning for orphaned mdirs, scan for orphaned files? So this leads to the idea of "scratch files". Instead of actually delaying file creation, fake it. Create a scratch file during open, and on the first sync/close, convert it to a regular file. If we lose power, scan for scratch files during mount, and remove them on first write. Some tradeoffs: 1. The orphan scan for scratch files is a bit more expensive than for mdirs on storage with large block sizes. We need to look at each file entry vs just each mdir, which pushed the runtime up to O(BlogB) vs O(B). Though if you also consider large mtrees, the worst case is still O(nlogn). 2. Creating intermediate scratch files adds another commit to file creation. This is probably not a big issue for flash, but may be more of a concern on devices with large prog sizes. 3. Scratch files complicate unrelated mkdir/rename/etc code a bit, since we need to consider what happens when the dest is a scratch file. 
But the end result is simple. And simple is good. Both for implementation headaches, and code size. Even if the on-disk state is conceptually more complicated. You may have noticed these scratch files are basically isomorphic to just setting an "uncreated" flag on the file, and that's true. There may have been a simpler route to end up with the design, but hey, as long as it works. As a plus, scratch files present a solution for a couple other things: 1. Removing an open file can become a scratch file until closed. 2. Scratch files can be used as temporary files. Open a file with O_DESYNC and never call sync and you have yourself a temporary file. Maybe in the future we should add O_TMPFILE to avoid the need for unique filenames, but that is low priority.	2024-02-03 18:15:29 -06:00
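The scratch-file lifecycle described in the commit above (create a scratch entry on open, convert it on first sync/close, drop leftovers after power loss) can be sketched with a toy in-memory model. Everything here is hypothetical: the `example_` types and functions are not littlefs APIs, the fixed-size entry table stands in for the mtree, and the scan removes scratch entries immediately, whereas the commit says the real removal is deferred until the first write after mount.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

// Hypothetical entry types; EXAMPLE_FREE is 0 so zero-init means "empty".
enum example_type { EXAMPLE_FREE = 0, EXAMPLE_REG, EXAMPLE_SCRATCH };

struct example_entry {
    enum example_type type;
    char name[16];
};

#define EXAMPLE_MAX_ENTRIES 8

// Toy stand-in for on-disk metadata.
struct example_fs {
    struct example_entry entries[EXAMPLE_MAX_ENTRIES];
};

// "open with create": the name is committed right away, but only as a
// scratch entry, so no zero-sized regular file is ever visible.
static int example_create(struct example_fs *fs, const char *name) {
    assert(strlen(name) < sizeof(fs->entries[0].name));
    for (int i = 0; i < EXAMPLE_MAX_ENTRIES; i++) {
        if (fs->entries[i].type == EXAMPLE_FREE) {
            fs->entries[i].type = EXAMPLE_SCRATCH;
            strcpy(fs->entries[i].name, name);
            return i;
        }
    }
    return -1; // no space left
}

// First sync/close: convert the scratch entry into a regular file.
static void example_sync(struct example_fs *fs, int id) {
    if (fs->entries[id].type == EXAMPLE_SCRATCH) {
        fs->entries[id].type = EXAMPLE_REG;
    }
}

// Mount after power loss: any entry still marked scratch was never
// synced, so it is dropped. Returns how many entries were removed.
static int example_mount_scan(struct example_fs *fs) {
    int removed = 0;
    for (int i = 0; i < EXAMPLE_MAX_ENTRIES; i++) {
        if (fs->entries[i].type == EXAMPLE_SCRATCH) {
            fs->entries[i].type = EXAMPLE_FREE;
            removed++;
        }
    }
    return removed;
}
```

In this model a file that was created but never synced simply disappears at the next mount, which is also why an unsynced scratch file behaves like a temporary file.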
Christopher Haster	f29a4982c4	Added block-level erased-state checksums Much like the erased-state checksums in our rbyds (ecksums), these block-level erased-state checksums (becksums) allow us to detect failed progs to erased parts of a block and are key to achieving efficient incremental write performance with large blocks and frequent power cycles/open-close cycles. These are also key to achieving _reasonable_ write performance for simple writes (linear, non-overwriting), since littlefs now relies solely on becksums to efficiently append to blocks. Though I suppose the previous block staging logic used with the CTZ skip-list could be brought back to make becksums optional and avoid btree lookups during simple writes (we do a _lot_ of btree lookups)... I'll leave this open as a future optimization... Unlike in-rbyd ecksums, becksums need to be stored out-of-band so our data blocks only contain raw data. Since they are optional, an additional tag in the file's btree makes sense. Becksums are relatively simple, but they bring some challenges: 1. Adding becksums to file btrees is the first case we have for multiple struct tags per btree id. This isn't too complicated a problem, but requires some new internal btree APIs. Looking forward, which I probably shouldn't be doing this often, multiple struct tags will also be useful for parity and content ids as a part of data redundancy and data deduplication, though I think it's uncontroversial to consider these both heavier-weight features... 2. Becksums only work if unfilled blocks are aligned to the prog_size. This is the whole point of crystal_size -- to provide temporary storage for unaligned writes -- but actually aligning the block during writes turns out to be a bit tricky without a bunch of unnecessary btree lookups (we already do too many btree lookups!). 
The current implementation here discards the pcache to force alignment, taking advantage of the requirement that cache_size >= prog_size, but this is corrupting our block checksums. Code cost: code stack before: 31248 2792 after: 32060 (+2.5%) 2864 (+2.5%) Also lfsr_ftree_flush needs work. I'm usually open to gotos in C when they improve internal logic, but even for me, the multiple goto jumps from every left-neighbor lookup into the block writing loop is a bit much...	2023-12-14 01:05:34 -06:00
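The core check behind an erased-state checksum can be sketched as follows: record a checksum of the region that is expected to still be erased, and before appending, recompute it to see whether a failed prog disturbed that region. This is only an illustration under assumptions. The `example_becksum` fields, the choice of CRC-32C, and the in-memory block buffer are all hypothetical and not taken from the commit.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

// Bitwise CRC-32C (reflected polynomial 0x82f63b78). The checksum
// function is an assumption; any checksum would do for the sketch.
static uint32_t example_crc32c(uint32_t crc, const void *buf, size_t size) {
    const uint8_t *data = buf;
    crc = ~crc;
    for (size_t i = 0; i < size; i++) {
        crc ^= data[i];
        for (int j = 0; j < 8; j++) {
            crc = (crc >> 1) ^ (0x82f63b78u & (0u - (crc & 1u)));
        }
    }
    return ~crc;
}

// Hypothetical out-of-band record: how many bytes after the written
// data were checksummed, and what their checksum was while erased.
struct example_becksum {
    uint32_t size;
    uint32_t cksum;
};

// Take a checksum of the erased region starting at off.
static struct example_becksum example_becksum_take(
        const uint8_t *block, size_t off, uint32_t size) {
    struct example_becksum b = {size, example_crc32c(0, &block[off], size)};
    return b;
}

// True if the region at off still matches the recorded erased state,
// i.e. it should be safe to append there without erasing first.
static bool example_becksum_valid(const uint8_t *block, size_t block_size,
        size_t off, const struct example_becksum *b) {
    assert(block != NULL && b != NULL);
    if (off > block_size || b->size > block_size - off) {
        return false;
    }
    return example_crc32c(0, &block[off], b->size) == b->cksum;
}
```

If the check fails, the safe fallback in this model is to treat the rest of the block as unusable and write to a freshly erased block instead.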
Christopher Haster	6ccd9eb598	Adopted different strategy for hypothetical future configs Instead of writing every possible config that has the potential to be useful in the future, stick to just writing the configs that we know are useful, and error if we see any configs we don't understand. This prevents unnecessary config bloat, while still allowing configs to be introduced in a backwards compatible way in the future. Currently unknown configs are treated as a mount error, but in theory you could still try to read the filesystem, just with potentially corrupted data. Maybe this could be behind some sort of "FORCE" mount flag. littlefs must never write to the filesystem if it finds unknown configs. --- This also creates a curious case for the hole in our tag encoding previously taken up by the OCOMPATFLAGS config. We can query for any config > SIZELIMIT with lookupnext, but the OCOMPATFLAGS flag would need an extra lookup which just isn't worth it. Instead I'm just adding OCOMPATFLAGS back in. To support OCOMPATFLAGS littlefs has to do literally nothing, so this is really more of a documentation change. And who knows, maybe OCOMPATFLAGS will have some weird use case in the future...	2023-12-08 14:03:56 -06:00
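The "error on configs we don't understand" policy from the commit above can be sketched as a simple mount-time check. The tag values, the error code, and the `example_` names are hypothetical placeholders, not littlefs's real tag encoding or API; the commit also describes finding unknown configs via a lookupnext past SIZELIMIT rather than a linear scan.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

// Hypothetical result codes for the sketch.
enum { EXAMPLE_OK = 0, EXAMPLE_ERR_UNKNOWN_CONFIG = -1 };

// Hypothetical set of config tags this implementation understands.
static const uint16_t example_known_configs[] = {0x0001, 0x0002, 0x0003};

static bool example_config_isknown(uint16_t tag) {
    size_t n = sizeof(example_known_configs)
            / sizeof(example_known_configs[0]);
    for (size_t i = 0; i < n; i++) {
        if (example_known_configs[i] == tag) {
            return true;
        }
    }
    return false;
}

// Refuse to mount if any on-disk config tag is not understood, since
// writing to such a filesystem could corrupt it.
static int example_check_configs(const uint16_t *tags, size_t count) {
    assert(tags != NULL || count == 0);
    for (size_t i = 0; i < count; i++) {
        if (!example_config_isknown(tags[i])) {
            return EXAMPLE_ERR_UNKNOWN_CONFIG;
        }
    }
    return EXAMPLE_OK;
}
```

A read-only "FORCE" style mount, as floated in the commit, would correspond to ignoring this error for reads while still refusing all writes.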
Christopher Haster	337bdf61ae	Rearranged tag encodings to make space for BECKSUM, ORPHAN, etc Also: - Renamed GSTATE -> GDELTA for gdelta tags. GSTATE tags added as separate in-device flags. The GSTATE tags were already serving this dual purpose. - Renamed BSHRUB* -> SHRUB when the tag is not necessarily operating on a file bshrub. - Renamed TRUNK -> BSHRUB The tag encoding space now has a couple funky holes: - 0x0005 - Hole for aligning config tags. I guess this could be used for OCOMPATFLAGS in the future? - 0x0203 - Hole so that ORPHAN can be a 1-bit difference from REG. This could be after BOOKMARK, but having a bit to differentiate littlefs specific file types (BOOKMARK, ORPHAN) from normal file types (REG, DIR) is nice. I guess this could be used for SYMLINK if we ever want symlinks in the future? - 0x0314-0x0318 - Hole so that the mdir related tags (MROOT, MDIR, MTREE) are nicely aligned. This is probably a good place for file-related tags to go in the future (BECKSUM, CID, COMPR), but we only have two slots, so will probably run out pretty quickly. - 0x3028 - Hole so that all btree related tags (BTREE, BRANCH, MTREE) share a common lower bit-pattern. I guess this could be used for MSHRUB if we ever want mshrubs in the future?	2023-12-08 13:28:47 -06:00
Christopher Haster	04c6b5a067	Added grm rcompat flag, dropped ocompat, tweaked compat flags a bit I'm just not seeing a use case for optional compat flags (ocompat), so dropping for now. It seems their *nix equivalent, feature_compat, is used to inform fsck of things, but this doesn't really make sense in littlefs since there is no fsck. Or from a different perspective, littlefs is always running fsck. Ocompat flags can always be added later (since they do nothing). Unfortunately this really ruins the alignment of the tag encoding. For whatever reason config limits tend to come in pairs. For now the best solution is just leave tag 0x0006 unused. I guess you can consider it reserved for hypothetical ocompat flags in the future. --- This adds an rcompat flag for the grm, since in theory a filesystem doesn't need to support grms if it never renames files (or creates directories?). But if a filesystem doesn't support grms and a grm gets written into the filesystem, this can lead to corruption. I think every piece of gstate will end up with its own compat flag for this reason. --- Also renamed r/w/oflags -> r/w/ocompatflags to make their purpose clearer. --- The code impact of adding the grm rcompat flag is minimal, and will probably be less for additional rcompat flags: code stack before: 31528 2752 after: 31584 (+0.2%) 2752 (+0.0%)	2023-12-07 15:05:51 -06:00
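The role of an rcompat flag can be sketched with a bitmask check: if the on-disk flags contain any bit the implementation does not support, it must not mount. The bit assignments and names below are hypothetical and are not the real littlefs flag values.

```c
#include <assert.h>
#include <stdint.h>

// Hypothetical rcompat bits; real littlefs values may differ.
#define EXAMPLE_RCOMPAT_GRM 0x1u
#define EXAMPLE_RCOMPAT_ALL (EXAMPLE_RCOMPAT_GRM)

enum { EXAMPLE_OK = 0, EXAMPLE_ERR_INCOMPAT = -1 };

// ondisk: rcompat flags read from the filesystem.
// supported: rcompat features this build implements.
static int example_check_rcompat(uint32_t ondisk, uint32_t supported) {
    // A build can only claim support for bits that are defined.
    assert((supported & ~EXAMPLE_RCOMPAT_ALL) == 0);
    // Any on-disk bit we don't support means we can't safely read.
    if ((ondisk & ~supported) != 0) {
        return EXAMPLE_ERR_INCOMPAT;
    }
    return EXAMPLE_OK;
}
```

Under this sketch, a build without grm support passes `supported = 0` and refuses any filesystem whose flags include the grm bit, which avoids the corruption case the commit describes.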
Christopher Haster	4793d2f144	Fixed new bshrub roots and related bug fixing It turned out by implicitly handling root allocation in lfsr_btree_commit_, we were never allowing lfsr_bshrub_commit to intercept new roots as new bshrubs. Fixing this required moving the root allocation logic up into lfsr_btree_commit. This resulted in quite a bit of small bug fixing because it turns out if you can never create non-inlined bshrubs you never test non-inlined bshrubs: - Our previous rbyd.weight == btree.weight check for if we've reached the root no longer works, changed to an explicit check that the blocks match. Fortunately, now that new roots set trunk=0 new roots are no longer a problematic case. - We need to only evict when we calculate an accurate estimate, the previous code had a bug where eviction occurred early based only on the progged-since-last-estimate. - We need to manually set bshrub.block=mdir.block on new bshrubs, otherwise the lfsr_bshrub_isbshrub check fails in mdir commit staging. Also updated btree/bshrub following code in the dbg scripts, which mostly meant making them accept both BRANCH and SHRUBBRANCH tags as btree/bshrub branches. Conveniently very little code needs to change to extend btree read operations to support bshrubs.	2023-11-21 00:06:08 -06:00
Christopher Haster	6b82e9fb25	Fixed dbg scripts to allow explicit trunks without checksums Note this is intentionally different from how lfsr_rbyd_fetch behaves in lfs.c. We only call lfsr_rbyd_fetch when we need validated checksums, otherwise we just don't fetch. The dbg scripts, on the other hand, always go through fetch, but it is useful to be able to inspect the state of incomplete trunks when debugging. This used to be how the dbg scripts behaved, but they broke because of some recent script work.	2023-11-20 23:28:27 -06:00
Christopher Haster	1e4d4cfdcf	Tried to write errors to stderr consistently in scripts	2023-11-05 15:55:07 -06:00
Christopher Haster	4ecf4cc654	Added dbgbmap.py, tweaked tracebd.py to match dbgbmap.py parses littlefs's mtree/btrees and displays the status of every block in use: $ ./scripts/dbgbmap.py disk -B4096x256 -Z -H8 -W64 bd 4096x256, 7.8% mdir, 10.2% btree, 78.1% data mmddbbddddddmmddddmmdd--bbbbddddddddddddddbbdddd--ddddddmmdddddd mmddddbbddbbddddddddddddddddbbddddbbddddddmmddbbdddddddddddddddd bbdddddddddddd--ddddddddddddddddbbddddmmmmddddddddddddmmmmdddddd ddddddddddbbdddddddddd--ddddddddddddddmmddddddddddddddddddddmmdd ddddddbbddddddddbb--ddddddddddddddddddddbb--mmmmddbbdddddddddddd ddddddddddddddddddddbbddbbdddddddddddddddddddddddddddddddddddddd dddddddddd--ddddbbddddddddmmbbdd--ddddddddddddddbbmmddddbbdddddd ddmmddddddddddmmddddddddmmddddbbbbdddddddd--ddbbddddddmmdd--ddbb (ok, it looks a bit better with colors) dbgbmap.py matches the layout and has the same options as tracebd.py, allowing the combination of both to provide valuable insight into what exactly littlefs is doing. This required a bit of tweaking of tracebd.py to get right, mostly around conflicting order-based arguments. This also reworks the internal Bmap class to be more resilient to out-of-window ops, and adds an optional informative header.	2023-10-30 15:52:33 -05:00


62 Commits