Prevented unnecessary superblock rewrites if old version in superblock chain

Because multiple, out-of-date superblocks can exist in our superblock chain, we need to be careful to make sure newer superblock entries override older superblock entries. If we see an older on-disk minor version in the superblock chain, we were correctly overriding the on-disk minor version, but we were also leaving the "needs superblock" bit set in our consistency state. This isn't a hard-error, but would lead to a superblock rewrite every mount. The rewrite would make no progress, as the out-of-date version is effectively immutable at this point, and just waste prog cycles. This should fix that by clearing the "needs superblock" bit if we see a newer on-disk minor version.
Clarified what is accessible at specific superblock offsets in SPEC.md
2024-03-19 00:49:28 -05:00 · 2024-03-19 00:49:28 -05:00 · 2024-03-19 00:48:56 -05:00 · 2024-03-08 16:49:59 -06:00 · 2024-03-08 16:49:11 -06:00 · 2024-03-08 16:48:50 -06:00
4 changed files with 77 additions and 13 deletions
--- a/README.md
+++ b/README.md
@@ -258,6 +258,9 @@ License Identifiers that are here available: http://spdx.org/licenses/
  use with the MirageOS library operating system project. It is interoperable
  with the reference implementation, with some caveats.

+- [nim-littlefs] - A Nim wrapper and API for littlefs. Includes a fuse 
+  implementation based on [littlefs-fuse]
+
 [BSD-3-Clause]: https://spdx.org/licenses/BSD-3-Clause.html
 [littlefs-disk-img-viewer]: https://github.com/tniessen/littlefs-disk-img-viewer
 [littlefs-fuse]: https://github.com/geky/littlefs-fuse
@@ -274,3 +277,4 @@ License Identifiers that are here available: http://spdx.org/licenses/
 [littlefs-python]: https://pypi.org/project/littlefs-python/
 [littlefs2-rust]: https://crates.io/crates/littlefs2
 [chamelon]: https://github.com/yomimono/chamelon
+[nim-littlefs]: https://github.com/Graveflo/nim-littlefs
--- a/SPEC.md
+++ b/SPEC.md
@@ -441,9 +441,10 @@ Superblock fields:

 7. **Attr max (32-bits)** - Maximum size of file attributes in bytes.

-The superblock must always be the first entry (id 0) in a metadata pair as well
-as be the first entry written to the block. This means that the superblock
-entry can be read from a device using offsets alone.
+The superblock must always be the first entry (id 0) in the metadata pair, and
+the name tag must always be the first tag in the metadata pair. This makes it
+so that the magic string "littlefs" will always reside at offset=8 in a valid
+littlefs superblock.

 ---
 #### `0x2xx` LFS_TYPE_STRUCT
--- a/lfs.c
+++ b/lfs.c
@@ -710,11 +710,14 @@ static lfs_stag_t lfs_dir_getslice(lfs_t *lfs, const lfs_mdir_t *dir,
    lfs_tag_t ntag = dir->etag;
    lfs_stag_t gdiff = 0;

+    // synthetic moves
    if (lfs_gstate_hasmovehere(&lfs->gdisk, dir->pair) &&
-            lfs_tag_id(gmask) != 0 &&
-            lfs_tag_id(lfs->gdisk.tag) <= lfs_tag_id(gtag)) {
-        // synthetic moves
-        gdiff -= LFS_MKTAG(0, 1, 0);
+            lfs_tag_id(gmask) != 0) {
+        if (lfs_tag_id(lfs->gdisk.tag) == lfs_tag_id(gtag)) {
+            return LFS_ERR_NOENT;
+        } else if (lfs_tag_id(lfs->gdisk.tag) < lfs_tag_id(gtag)) {
+            gdiff -= LFS_MKTAG(0, 1, 0);
+        }
    }

    // iterate over dir block backwards (for faster lookups)
@@ -2188,7 +2191,8 @@ static int lfs_dir_splittingcompact(lfs_t *lfs, lfs_mdir_t *dir,
                // we can do, we'll error later if we've become frozen
                LFS_WARN("Unable to expand superblock");
            } else {
-                end = begin;
+                // duplicate the superblock entry into the new superblock
+                end = 1;
            }
        }
    }
@@ -2355,7 +2359,9 @@ fixmlist:;

            while (d->id >= d->m.count && d->m.split) {
                // we split and id is on tail now
-                d->id -= d->m.count;
+                if (lfs_pair_cmp(d->m.tail, lfs->root) != 0) {
+                    d->id -= d->m.count;
+                }
                int err = lfs_dir_fetch(lfs, &d->m, d->m.tail);
                if (err) {
                    return err;
@@ -4463,6 +4469,7 @@ static int lfs_mount_(lfs_t *lfs, const struct lfs_config *cfg) {
            // found older minor version? set an in-device only bit in the
            // gstate so we know we need to rewrite the superblock before
            // the first write
+            bool needssuperblock = false;
            if (minor_version < lfs_fs_disk_version_minor(lfs)) {
                LFS_DEBUG("Found older minor version "
                        "v%"PRIu16".%"PRIu16" < v%"PRIu16".%"PRIu16,
@@ -4470,10 +4477,11 @@ static int lfs_mount_(lfs_t *lfs, const struct lfs_config *cfg) {
                        minor_version,
                        lfs_fs_disk_version_major(lfs),
                        lfs_fs_disk_version_minor(lfs));
-                // note this bit is reserved on disk, so fetching more gstate
-                // will not interfere here
-                lfs_fs_prepsuperblock(lfs, true);
+                needssuperblock = true;
            }
+            // note this bit is reserved on disk, so fetching more gstate
+            // will not interfere here
+            lfs_fs_prepsuperblock(lfs, needssuperblock);

            // check superblock configuration
            if (superblock.name_max) {
--- a/tests/test_superblocks.toml
+++ b/tests/test_superblocks.toml
@@ -14,6 +14,24 @@ code = '''
    lfs_unmount(&lfs) => 0;
 '''

+# make sure the magic string "littlefs" is always at offset=8
+[cases.test_superblocks_magic]
+code = '''
+    lfs_t lfs;
+    lfs_format(&lfs, cfg) => 0;
+
+    // check our magic string
+    //
+    // note if we lose power we may not have the magic string in both blocks!
+    // but we don't lose power in this test so we can assert the magic string
+    // is present in both
+    uint8_t magic[lfs_max(16, READ_SIZE)];
+    cfg->read(cfg, 0, 0, magic, lfs_max(16, READ_SIZE)) => 0;
+    assert(memcmp(&magic[8], "littlefs", 8) == 0);
+    cfg->read(cfg, 1, 0, magic, lfs_max(16, READ_SIZE)) => 0;
+    assert(memcmp(&magic[8], "littlefs", 8) == 0);
+'''
+
 # mount/unmount from interpretting a previous superblock block_count
 [cases.test_superblocks_mount_unknown_block_count]
 code = '''
@@ -28,7 +46,6 @@ code = '''
    lfs_unmount(&lfs) => 0;
 '''

-
 # reentrant format
 [cases.test_superblocks_reentrant_format]
 reentrant = true
@@ -135,6 +152,39 @@ code = '''
    lfs_unmount(&lfs) => 0;
 '''

+# make sure the magic string "littlefs" is always at offset=8
+[cases.test_superblocks_magic_expand]
+defines.BLOCK_CYCLES = [32, 33, 1]
+defines.N = [10, 100, 1000]
+code = '''
+    lfs_t lfs;
+    lfs_format(&lfs, cfg) => 0;
+    lfs_mount(&lfs, cfg) => 0;
+    for (int i = 0; i < N; i++) {
+        lfs_file_t file;
+        lfs_file_open(&lfs, &file, "dummy",
+                LFS_O_WRONLY | LFS_O_CREAT | LFS_O_EXCL) => 0;
+        lfs_file_close(&lfs, &file) => 0;
+        struct lfs_info info;
+        lfs_stat(&lfs, "dummy", &info) => 0;
+        assert(strcmp(info.name, "dummy") == 0);
+        assert(info.type == LFS_TYPE_REG);
+        lfs_remove(&lfs, "dummy") => 0;
+    }
+    lfs_unmount(&lfs) => 0;
+
+    // check our magic string
+    //
+    // note if we lose power we may not have the magic string in both blocks!
+    // but we don't lose power in this test so we can assert the magic string
+    // is present in both
+    uint8_t magic[lfs_max(16, READ_SIZE)];
+    cfg->read(cfg, 0, 0, magic, lfs_max(16, READ_SIZE)) => 0;
+    assert(memcmp(&magic[8], "littlefs", 8) == 0);
+    cfg->read(cfg, 1, 0, magic, lfs_max(16, READ_SIZE)) => 0;
+    assert(memcmp(&magic[8], "littlefs", 8) == 0);
+'''
+
 # expanding superblock with power cycle
 [cases.test_superblocks_expand_power_cycle]
 defines.BLOCK_CYCLES = [32, 33, 1]
@@ -221,6 +271,7 @@ code = '''
    lfs_unmount(&lfs) => 0;
 '''

+
 # mount with unknown block_count
 [cases.test_superblocks_unknown_blocks]
 code = '''
Author	SHA1	Message	Date
Christopher Haster	11b036cc6c	Prevented unnecessary superblock rewrites if old version in superblock chain Because multiple, out-of-date superblocks can exist in our superblock chain, we need to be careful to make sure newer superblock entries override older superblock entries. If we see an older on-disk minor version in the superblock chain, we were correctly overriding the on-disk minor version, but we were also leaving the "needs superblock" bit set in our consistency state. This isn't a hard-error, but would lead to a superblock rewrite every mount. The rewrite would make no progress, as the out-of-date version is effectively immutable at this point, and just waste prog cycles. This should fix that by clearing the "needs superblock" bit if we see a newer on-disk minor version.	2024-03-19 00:49:28 -05:00
Christopher Haster	25ee90fdf1	Clarified what is accessible at specific superblock offsets in SPEC.md It used to be the case that the entire superblock entry could be found at specific offsets, but this was only possible while the superblock entry was immutable. Now that the superblock entry is very mutable (block-count changes, lfs2.0 -> lfs2.1 version bumps, etc), the correct superblock entry may end up later in the metadata log. At the very least, the "littlefs" magic string is still immutable and at the specific offset offset=8. This is arguably the most useful fixed-offset item.	2024-03-19 00:49:28 -05:00
Christopher Haster	a60a986c9c	Duplicate the superblock entry during superblock expansion The documentation does not match the implementation here. The intended behavior of superblock expansion was to duplicate the current superblock entry into the new superblock: .--------. .--------. .\|littlefs\|->\|littlefs\| \|\|bs=4096 \| \|\|bs=4096 \| \|\|bc=256 \| \|\|bc=256 \| \|\|crc32 \| \|\|root dir\| \|\| \| \|\|crc32 \| \|'--------' \|'--------' '--------' '--------' The main benefit is that we can rely on the magic string "littlefs" always residing in blocks 0x{0,1}, even if the superblock chain has multiple superblocks. The downside is that earlier superblocks in the superblock chain may contain out-of-date configuration. This is a bit annoying, and risks hard-to-reach bugs, but in theory shouldn't break anything as long as the filesystem is aware of this. Unfortunately this was lost at some point during refactoring in the early v2-alpha work. A lot of code was moving around in this stage, so it's a bit hard to track down the change and if it was intentional. The result is superblock expansion creates a valid linked-list of superblocks, but only the last superblock contains a valid superblock entry: .--------. .--------. .\|crc32 \|->\|littlefs\| \|\| \| \|\|bs=4096 \| \|\| \| \|\|bc=256 \| \|\| \| \|\|root dir\| \|\| \| \|\|crc32 \| \|'--------' \|'--------' '--------' '--------' What's interesting is this isn't invalid as far as lfs_mount is concerned. lfs_mount is happy as long as a superblock entry exists anywhere in the superblock chain. This is good for compat flexibility, but is the main reason this has gone unnoticed for so long. --- With the benefit of more time to think about the problem, it may have been more preferable to copy only the "littlefs" magic string and NOT the superblock entry: .--------. .--------. .\|littlefs\|->\|littlefs\| \|\|crc32c \| \|\|bs=4096 \| \|\| \| \|\|bc=256 \| \|\| \| \|\|root dir\| \|\| \| \|\|crc32 \| \|'--------' \|'--------' '--------' '--------' This would allow for simple "littlefs" magic string checks without the risks associated with out-of-date superblock entries. Unfortunately the current implementation errors if it finds a "littlefs" magic string without an associated superblock entry, so such a change would not be compatible with old drivers. --- This commit tweaks superblock expansion to duplicate the superblock entry instead of simply moving it to the new superblock. And adds tests over the magic string "littlefs" both before and after superblock expansion. Found by rojer and Nikola Kosturski	2024-03-19 00:48:56 -05:00
Christopher Haster	4dd30c1b8f	Merge pull request #948 from littlefs-project/fix-sync-ordering Fix sync issue where data writes could appear before metadata writes	2024-03-08 16:49:59 -06:00
Christopher Haster	5c0d332ecd	Merge pull request #939 from Graveflo/master Add nim-littlefs to readme	2024-03-08 16:49:11 -06:00
Christopher Haster	cf68333a55	Merge pull request #937 from littlefs-project/fix-pending-rm-get-underflow Fix synthetic move underflows in lfs_dir_get	2024-03-08 16:48:50 -06:00
Ryan McConnell	2752d8c486	add nim-littlefs to readme	2024-02-07 02:53:16 -05:00
Christopher Haster	ddbfcaa722	Fixed synthetic move underflows in lfs_dir_get By "luck" the previous code somehow managed to not be broken, though it was possible to traverse the same file twice in lfs_fs_traverse/size (which is not an error). The problem was an underlying assumption in lfs_dir_get that it would never be called when the requested id is pending removal because of a powerloss. The assumption was either: 1. lfs_dir_find would need to be called first to find the id, and it would correctly toss out pending-rms with LFS_ERR_NOENT. 2. lfs_fs_mkconsistent would be implicitly called before any filesystem traversals, cleaning up any pending-rms. This is at least true for allocator scans. But, as noted by andriyndev, both lfs_fs_traverse and lfs_fs_size can call lfs_fs_get with a pending-rm id if called in a readonly context. --- By "luck" this somehow manages to not break anything: 1. If the pending-rm id is >0, the id is decremented by 1 in lfs_fs_get, returning the previous file entry during traversal. Worst case, this reports any blocks owned by the previous file entry twice. Note this is not an error, lfs_fs_traverse/size may return the same block multiple times due to underlying copy-on-write structures. 2. More concerning, if the pending-rm id is 0, the id is decremented by 1 in lfs_fs_get and underflows. This underflow propagates into the type field of the tag we are searching for, decrementing it from 0x200 (LFS_TYPE_STRUCT) to 0x1ff (LFS_TYPE_INTERNAL(UNUSED)). Fortunately, since this happens to underflow to the INTERNAL tag type, the type intended to never exist on disk, we should never find a matching tag during our lfs_fs_get search. The result? lfs_dir_get returns LFS_ERR_NOENT, which is actually what we want. Also note that LFS_ERR_NOENT does not terminate the mdir traversal early. If it did we would have missed files instead of duplicating files, which is a slightly worse situation. --- The fix is to add an explicit check for pending-rms in lfs_dir_get, just like in lfs_dir_find. This avoids relying on unintended underflow propagation, and should make the internal API behavior more consistent. This is especially important for potential future gc extensions. Found by andriyndev	2024-02-04 15:12:31 -06:00