This just means a rewrite of the superblock entry with the new minor
version.
Though it's interesting to note, we don't need to rewrite the superblock
entry until the first write operation in the filesystem, an optimization
that is already in use for the fixing of orphans and in-flight moves.
To keep track of any outdated minor version found during lfs_mount, we
can carve out a bit from the reserved bits in our gstate. These are
currently used for a counter tracking the number of orphans in the
filesystem, but this is usually a very small number so this hopefully
won't be an issue.
In-device gstate tag:
[-- 32 --]
[1|- 11 -| 10 |1| 9 ]
^----^-----^--^--^-- 1-bit has orphans
'-----|--|--|-- 11-bit move type
'--|--|-- 10-bit move id
'--|-- 1-bit needs superblock
'-- 9-bit orphan count
This is a bit tricky since we need two different version of littlefs in
order to test for most compatibility concerns.
Fortunately we already have scripts/changeprefix.py for version-specific
symbols, so it's not that hard to link in the previous version of
littlefs in CI as a separate set of symbols, "lfsp_" in this case.
So that we can at least test the compatibility tests locally, I've added
an ifdef against the expected define "LFSP" to define a set of aliases
mapping "lfsp_" symbols to "lfs_" symbols. This is manual at the moment,
and a bit hacky, but gets the job done.
---
Also changed BUILDDIR creation to derive subdirectories from a few
Makefile variables. This makes the subdirectories less manual and more
flexible for things like LFSP. Note this wasn't possible until BUILDDIR
was changed to default to "." when omitted.
Removed the weird alignment requirement from the general truncate tests.
This explicitly hid off-by-one truncation errors.
These tests now reveal the same issue as the block-sized truncation test
while also testing for other potential off-by-one errors.
When truncation is done on a file to the block size, there seems to be
an error where it points to an incorrect block. Perform a write /
truncate / readback operation to verify this issue.
Signed-off-by: Colin Foster <colin.foster@in-advantage.com>
- General cleanup from integration, including cleaning up some older
commit code
- Partial-prog tests do not make sense when prog_size == block_size
(there can't be partial-progs!)
- Fixed signed-comparison issue in modified filebd
This change is necessary to handle out-of-order writes found by pjsg's
fuzzing work.
The problem is that it is possible for (non-NOR) block devices to write
pages in any order, or to even write random data in the case of a
power-loss. This breaks littlefs's use of the first bit in a page to
indicate the erase-state.
pjsg notes this behavior is documented in the W25Q here:
https://community.cypress.com/docs/DOC-10507
---
The basic idea here is to CRC the next page, and use this "erase-state CRC" to
check if the next page is erased and ready to accept programs.
.------------------. \ commit
| metadata | |
| | +---.
| | | |
|------------------| | |
| erase-state CRC -----. |
|------------------| | | |
| commit CRC ---|-|-'
|------------------| / |
| padding | | padding (doesn't need CRC)
| | |
|------------------| \ | next prog
| erased? | +-'
| | | |
| v | /
| |
| |
'------------------'
This is made a bit annoying since littlefs doesn't actually store the
page (prog_size) in the superblock, since it doesn't need to know the
size for any other operation. We can work around this by storing both
the CRC and size of the next page when necessary.
Another interesting note is that we don't need to any bit tweaking
information, since we read the next page every time we would need to
know how to clobber the erase-state CRC. And since we only read
prog_size, this works really well with our caching, since the caches
must be a multiple of prog_size.
This also brings back the internal lfs_bd_crc function, in which we can
use some optimizations added to lfs_bd_cmp.
Needs some cleanup but the idea is passing most relevant tests.
When you add a function to every benchmark suite, you know if should
probably be provided by the benchmark runner itself. That being said,
randomness in tests/benchmarks is a bit tricky because it needs to be
strictly controlled and reproducible.
No global state is used, allowing tests/benches to maintain multiple
randomness stream which can be useful for checking results during a run.
There's an argument for having global prng state in that the prng could
be preserved across power-loss, but I have yet to see a use for this,
and it would add a significant requirement to any future test/bench runner.
These are just incorrect limits in the tests that can be triggered by
powerloss testing, which can end up with more metadata-pairs than
without powerloss testing due to orphans.
- Fixed prettyasserts.py parsing when '->' is in expr
- Made prettyasserts.py failures not crash (yay dynamic typing)
- Fixed the initial state of the emubd disk file to match the internal
state in RAM
- Fixed true/false getting changed to True/False in test.py/bench.py
defines
- Fixed accidental substring matching in plot.py's --by comparison
- Fixed a missed LFS_BLOCk_CYCLES in test_superblocks.toml that was
missed
- Changed test.py/bench.py -v to only show commands being run
Including the test output is still possible with test.py -v -O-, making
the implicit inclusion redundant and noisy.
- Added license comments to bench_runner/test_runner
These are really just different flavors of test.py and test_runner.c
without support for power-loss testing, but with support for measuring
the cumulative number of bytes read, programmed, and erased.
Note that the existing define parameterization should work perfectly
fine for running benchmarks across various dimensions:
./scripts/bench.py \
runners/bench_runner \
bench_file_read \
-gnor \
-DSIZE='range(0,131072,1024)'
Also added a couple basic benchmarks as a starting point.
The main benefit is small test ids everywhere, though this is with the
downside of needing longer names to properly prefix and avoid
collisions. But this fits into the rest of the scripts with globally
unique names a bit better. This is a C project after all.
The other small benefit is test generators may have an easier time since
per-case symbols can expect to be unique.
This is really more work for the bench runner. With this change defines
can be manipulated at a rather high level at runtime. Which should be
useful for generating benchmarks across various dimensions.
The define grammar in the test_runner is now a bit more powerful,
accepting:
1. A single value: -DN=42
2. A list of values, which get permuted: -DN=1,2,3
3. A range: -DN=range(10)
4. Some combo: -DN=1,2,range(3,0,-1)
This is more complex in the test .toml defines, which can also be C
expressions:
1. A single value: define=42
2. A single expression: define='42*42'
3. A list: define=[1,2,3]
4. A comma separated string: define='1,2,3'
5. A range: define='42*range(10)'
6. This mess: define=[1,2,'3,4,range(2)*range(2)+3']
On one hand this seems like the wrong place for these tests, on the
other hand, it's good to know that the block device is behaving as
expected when debugging the filesystem.
Maybe this should be moved to an external program for users to test
their block devices in the future?
This mostly required names for each test case, declarations of
previously-implicit variables since the new test framework is more
conservative with what it declares (the small extra effort to add
declarations is well worth the simplicity and improved readability),
and tweaks to work with not-really-constant defines.
Also renamed test_ -> test, replacing the old ./scripts/test.py,
unfortunately git seems to have had a hard time with this.
This was caused by the new lfs_file_rawseek optimization that can skip
flushing when calculated file->pos is unchanged combined with an
implicit expectation in lfs_file_truncate that lfs_file_rawseek
unconditionally sets file->pos.
Because of this assumption, lfs_file_truncate could leave file->pos in
an outdated state while changing the internal file metadata. Humorously,
this was always gauranteed to trigger the skip in lfs_file_rawseek when
we try to restore the file->pos, leaving the file->cache used to do the
CTZ skip-list lookup in a potentially bad state.
The easiest fix is to just update file->pos correctly. Note we don't
want to explicitly flush since we can leverage the same noop
optimization if we truncate to the file position. Which I've added a
test for.
These two features have been much requested by users, and have even had
several PRs proposed to fix these in several cases. Before this, these
error conditions usually were caught by internal asserts, however
asserts prevented users from implementing their own workarounds.
It's taken me a while to provide/accept a useful recovery mechanism
(returning LFS_ERR_CORRUPT instead of asserting) because my original thinking
was that these error conditions only occur due to bugs in the filesystem, and
these bugs should be fixed properly.
While I still think this is mostly true, the point has been made clear
that being able to recover from these conditions is definitely worth the
code cost. Hopefully this new behaviour helps the longevity of devices
even if the storage code fails.
Another, less important, reason I didn't want to accept fixes for these
situations was the lack of tests that prove the code's value. This has
been fixed with the new testing framework thanks to the additional of
"internal tests" which can call C static functions and really take
advantage of the internal information of the filesystem.
With the superblock expansion stuff, the test_format tests have grown
to test more advanced superblock-related features. This is fine but
deserves a rename so it's more clear.
Also fixed a typo that meant tests never ran with block cycles.
Byte-level writes are expensive and not suggested (caches >= 4 bytes
make much more sense), however there are many corner cases with
byte-level writes that can be easy to miss (power-loss leaving single
bytes written to disk).
Unfortunately, byte-level writes mixed with power-loss testing, the
Travis infrastructure, and Arm Thumb instruction set simulation
exceeds the 50-minute budget Travis allocates for jobs.
For now I'm disabling the byte-level tests under Qemu, with the hope that
performance improvements in littlefs will let us turn these tests back
on in the future.
- Added caching to Travis install dirs, because otherwise
pip3 install fails randomly
- Increased size of littlefs-fuse disk because test script has
a larger footprint now
- Skip a couple of reentrant tests under byte-level writes because
the tests just take too long and cause Travis to bail due to no
output for 10m
- Fixed various Valgrind errors
- Suppressed uninit checks for tests where LFS_BLOCK_ERASE_VALUE == -1.
In this case rambd goes uninitialized, which is fine for rambd's
purposes. Note I couldn't figure out how to limit this suppression
to only the malloc in rambd, this doesn't seem possible with Valgrind.
- Fixed memory leaks in exhaustion tests
- Fixed off-by-1 string null-terminator issue in paths tests
- Fixed lfs_file_sync issue caused by revealed by fixing memory leaks
in exhaustion tests. Getting ENOSPC during a file write puts the file
in a bad state where littlefs doesn't know how to write it out safely.
In this case, lfs_file_sync and lfs_file_close return 0 without
writing out state so that device-side resources can still be cleaned
up. To recover from ENOSPC, the file needs to be reopened and the
writes recreated. Not sure if there is a better way to handle this.
- Added some quality-of-life improvements to Valgrind testing
- Fit Valgrind messages into truncated output when not in verbose mode
- Turned on origin tracking
Moved .travis.yml over to use the new test framework. A part of this
involved testing all of the configurations ran on the old framework
and deciding which to carry over. The new framework duplicates some of
the cases tested by the configurations so some configurations could be
dropped.
The .travis.yml includes some extreme ones, such as no inline files,
relocations every cycle, no intrinsics, power-loss every byte, unaligned
block_count and lookahead, and odd read_sizes.
There were several configurations were some tests failed because of
limitations in the tests themselves, so many conditions were added
to make sure the configurations can run on as many tests as possible.
These should probably have been cleaned up in each commit to allow
cherry-picking, but due to time I haven't been able to.
- Went with creating an mdir copy in lfs_dir_commit. This handles a
number of related cleanup issues in lfs_dir_compact and it does so
more robustly. As a plus we can use the copy to update dependencies
in the mlist.
- Eliminated code left by the ENOSPC file outlining
- Cleaned up TODOs and lingering comments
- Changed the reentrant many directory create/rename/remove test to use
a smaller set of directories because of space issues when
READ/PROG_SIZE=512
This was caused by the previous fix for allocations during
lfs_fs_deorphan in this branch. To catch half-orphans during block
allocations we needed to duplicate all metadata-pairs reported to
lfs_fs_traverse. Unfortunately this causes lfs_fs_size to report 2x the
number of metadata-pairs, which would undoubtably confuse users.
The fix here is inelegantly simple, just do a different traversale for
allocations and size measurements. It reuses the same code but touches
slightly different sets of blocks.
Unfortunately, this causes the public lfs_fs_traverse and lfs_fs_size
functions to split in how they report blocks. This is technically
allowed, since lfs_fs_traverse may report blocks multiple times due to
CoW behavior, however it's undesirable and I'm sure there will be some
confusion.
But I don't have a better solution, so from this point lfs_fs_traverse
will be reporting 2x metadata-blocks and shouldn't be used for finding
the number of available blocks on the filesystem.
This was an interesting issue found during a GitHub discussion with
rmollway and thrasher8390.
Blocks in the metadata-pair are relocated every "block_cycles", or, more
mathy, when rev % block_cycles == 0 as long as rev += 1 every block write.
But there's a problem, rev isn't += 1 every block write. There are two
blocks in a metadata-pair, so looking at it from each blocks
perspective, rev += 2 every block write.
This leads to a sort of aliasing issue, where, if block_cycles is
divisible by 2, one block in the metadata-pair is always relocated, and
the other block is _never_ relocated. Causing a complete failure of
block-level wear-leveling.
Fortunately, because of a previous workaround to avoid block_cycles = 1
(since this will cause the relocation algorithm to never terminate), the
actual math is rev % (block_cycles+1) == 0. This means the bug only
shows its head in the much less likely case where block_cycles is a
multiple of 2 plus 1, or, in more mathy terms, block_cycles = 2n+1 for
some n.
To workaround this we can bitwise or our block_cycles with 1 to force it
to never be a multiple of 2n.
(Maybe we should do this during initialization? But then block_cycles
would need to be mutable.)
---
There's a few unrelated changes mixed into this commit that shouldn't be
there since I added this as part of a branch of bug fixes I'm putting
together rather hastily, so unfortunately this is not easily cherry-pickable.
It's interesting how many ways block devices can show failed writes:
1. prog can error
2. erase can error
3. read can error after writing (ECC failure)
4. prog doesn't error but doesn't write the data correctly
5. erase doesn't error but doesn't erase correctly
Can read fail without an error? Yes, though this appears the same as
prog and erase failing.
These weren't all simulated by testbd since I unintentionally assumed
the block device could always error. Fixed by added additional bad-black
behaviors to testbd.
Note: This also includes a small fix where we can miss bad writes if the
underlying block device contains a valid commit with the exact same
size in the exact same offset.
Fixes:
- Fixed reproducability issue when we can't read a directory revision
- Fixed incorrect erase assumption if lfs_dir_fetch exceeds block size
- Fixed cleanup issue caused by lfs_fs_relocate failing when trying to
outline a file in lfs_file_sync
- Fixed cleanup issue if we run out of space while extending a CTZ skip-list
- Fixed missing half-orphans when allocating blocks during lfs_fs_deorphan
Also:
- Added cycle-detection to readtree.py
- Allowed pseudo-C expressions in test conditions (and it's
beautifully hacky, see line 187 of test.py)
- Better handling of ctrl-C during test runs
- Added build-only mode to test.py
- Limited stdout of test failures to 5 lines unless in verbose mode
Explanation of fixes below
1. Fixed reproducability issue when we can't read a directory revision
An interesting subtlety of the block-device layer is that the
block-device is allowed to return LFS_ERR_CORRUPT on reads to
untouched blocks. This can easily happen if a user is using ECC or
some sort of CMAC on their blocks. Normally we never run into this,
except for the optimization around directory revisions where we use
uninitialized data to start our revision count.
We correctly handle this case by ignoring whats on disk if the read
fails, but end up using unitialized RAM instead. This is not an issue
for normal use, though it can lead to a small information leak.
However it creates a big problem for reproducability, which is very
helpful for debugging.
I ended up running into a case where the RAM values for the revision
count was different, causing two identical runs to wear-level at
different times, leading to one version running out of space before a
bug occured because it expanded the superblock early.
2. Fixed incorrect erase assumption if lfs_dir_fetch exceeds block size
This could be caused if the previous tag was a valid commit and we
lost power causing a partially written tag as the start of a new
commit.
Fortunately we already have a separate condition for exceeding the
block size, so we can force that case to always treat the mdir as
unerased.
3. Fixed cleanup issue caused by lfs_fs_relocate failing when trying to
outline a file in lfs_file_sync
Most operations involving metadata-pairs treat the mdir struct as
entirely temporary and throw it out if any error occurs. Except for
lfs_file_sync since the mdir is also a part of the file struct.
This is relevant because of a cleanup issue in lfs_dir_compact that
usually doesn't have side-effects. The issue is that lfs_fs_relocate
can fail. It needs to allocate new blocks to relocate to, and as the
disk reaches its end of life, it can fail with ENOSPC quite often.
If lfs_fs_relocate fails, the containing lfs_dir_compact would return
immediately without restoring the previous state of the mdir. If a new
commit comes in on the same mdir, the old state left there could
corrupt the filesystem.
It's interesting to note this is forced to happen in lfs_file_sync,
since it always tries to outline the file if it gets ENOSPC (ENOSPC
can mean both no blocks to allocate and that the mdir is full). I'm
not actually sure this bit of code is necessary anymore, we may be
able to remove it.
4. Fixed cleanup issue if we run out of space while extending a CTZ
skip-list
The actually CTZ skip-list logic itself hasn't been touched in more
than a year at this point, so I was surprised to find a bug here. But
it turns out the CTZ skip-list could be put in an invalid state if we
run out of space while trying to extend the skip-list.
This only becomes a problem if we keep the file open, clean up some
space elsewhere, and then continue to write to the open file without
modifying it. Fortunately an easy fix.
5. Fixed missing half-orphans when allocating blocks during
lfs_fs_deorphan
This was a really interesting bug. Normally, we don't have to worry
about allocations, since we force consistency before we are allowed
to allocate blocks. But what about the deorphan operation itself?
Don't we need to allocate blocks if we relocate while deorphaning?
It turns out the deorphan operation can lead to allocating blocks
while there's still orphans and half-orphans on the threaded
linked-list. Orphans aren't an issue, but half-orphans may contain
references to blocks in the outdated half, which doesn't get scanned
during the normal allocation pass.
Fortunately we already fetch directory entries to check CTZ lists, so
we can also check half-orphans here. However this causes
lfs_fs_traverse to duplicate all metadata-pairs, not sure what to do
about this yet.
- Removed old tests and test scripts
- Reorganize the block devices to live under one directory
- Plugged new test framework into Makefile
renamed:
- scripts/test_.py -> scripts/test.py
- tests_ -> tests
- {file,ram,test}bd/* -> bd/*
It took a surprising amount of effort to make the Makefile behave since
it turns out the "test_%" rule could override "tests/test_%.toml.test"
which is generated as part of test.py.
Sometimes small, single line code change hides behind it a complicated
story. This is one of those times.
If you look at this diff, you may note that this is a case of
lfs_dir_fetchmatch not correctly handling a tag that invalidates a
callback used to search for some condition, in this case a search for a
parent, which is invalidated by a later dir tag overwritting the
previous dir pair.
But how can this happen? Dir-pair-tags are only overwritten during
relocations (when a block goes bad or exceeds the block_cycles config
option for dynamic wear-leveling). Other dir operations create new
directory entries. And the only lfs_dir_fetchmatch condition that relies
on overwrites (as opposed to proper deletes) is when we need to find a
directory's parent, an operation that only occurs during a _different_
relocation. And a false _positive_, can only happen if we don't have a
parent. Which is really unlikely when we search for directory parents!
This bug and minimal test case was found by Matthew Renzelmann. In a
unfortunate series of events, first a file creation causes a directory
split to occur. This creates a new, orphaned metadata-pair containing
our new file. However, the revision count on this metadata-pair
indicates the pair is due for relocation as a part of wear-leveling.
Normally, this is fine, even though this metadata-pair has no parent,
the lfs_dir_find should return ENOENT and continue without error.
However, here we get hit by our fetchmatch bug. A previous, unrelated
relocation overwrites a pair which just happens to contain the block
allocated for a new metadata-pair. When we search for a parent,
lfs_dir_fetchmatch incorrectly finds this old, outdated metadata pair
and incorrectly tells our orphan it's found its parent.
As you can imagine the orphan's dissapointment must be immense.
So an unfortunately timed dir split triggers a relocation which
incorrectly finds a previously written parent that has been outdated
by another relocation.
As a solution we can outdate our found tag if it is overwritten by
an exact match during lfs_dir_fetchmatch.
As a part of this I started adding a new set of tests: tests/test_relocations,
for aggressive relocations tests. This is already by appended to by
another PR. I suspect relocations is relatively under-tested and is
becoming more important due to recent improvements in wear-leveling.
The superblock entry takes up id 0 in the root directory (not all
entries are files, though currently the superblock is the only
exception). Normally, reading a directory correctly skips the
superblock and only reports non-superblock files.
However, this doesn't work perfectly for lfs_dir_seek, which tries
to be clever to not touch the disk.
Fortunately, we can fix this by adding an offset for the superblock.
This will only work while the superblock is the only non-file entry,
otherwise we would need to touch the disk to properly seek in a
directory (though we already touch the disk a bit to get dir-tails
during seeks).
Found by jhartika
This is caused by dir->head not being updated when dir->m.pair may be.
This causes the two to fall out of sync and later dir rewinds to fail.
This bug stems all the way back from the first commits of littlefs, so
it's surprising it has avoided detection for this long. Perhaps because
lfs_dir_rewind is not used often.
When using lfs_file_truncate() to make a file shorter the file block and
off were incorrectly positioned at the new end, resulting in invalid
data accessed when reading. Lift the seek pointer restoration to apply
to both increasing and reducing truncates.
Signed-off-by: Peter A. Bigot <pab@pabigot.com>
This is primarily to get better test coverage over devices with very
large erase/prog/read sizes. The unfortunate state of the tests is
that most of them rely on a specific block device size, so that
ENOSPC and ECORRUPT errors occur in specific situations.
This should be improved in the future, but at least for now we can
open up some of the simpler tests to run on these different
configurations.
Also added testing over both 0x00 and 0xff erase values in emubd.
Also added a number of small file tests that expose issues prevalent
on NAND devices.
- Now test errors have correct line reporting! #line directives
are passed to the compiler that reference the relevant line in
the test case shell script.
--- Multi-block directory ---
./tests/test_dirs.sh:109: assert failed with 0, expected 1
lfs_unmount(&lfs) => 1
- Cleaned up the number of implicit global variables provided to
tests. A lot of these were infrequently used and made it difficult
to remember what was provided. This isn't an MCU, so there's very
little cost to stack allocations when needed.
- Minimized the results.py script (previously stats.py) output to
match minimization of test output.
Kind of a two-fold issue. One, the programming to the middle of inline
files was causing the cache to get updated to a half programmed state.
While fine, as all programs do occur in order in a block, this is less
efficient when writing to inline files as it would cause the inline file
to need to be reread even if it fits in the cache.
Two, the rereading of the inline file was broken and passed the file's
tag all the way to where a user would expect an error. This was easy to
fix but adds to the reasons we should have test coverage information.
Found by ebinans
The cause was mistakenly setting file->ctz.size directly instead of
file->pos, which file->ctz.size gets overwritten with later in
lfs_file_flush.
Also added better seek test cases specifically for inline files. This
should also catch most of the inline corner cases related to
lfs_file_size/lfs_file_tell.
Found by ebinans
The problem was not setting the file state correctly after the truncate.
To truncate < size, we end up using the cache to traverse the ctz
skip-list far away from where our file->pos is.
We can leave the last block in the cache in case we're going to append
to the file, but if we do this we need to set up file->block+file->off
to tell use where we are in the file, and set the LFS_F_READING flag to
indicate that our cache contains read data.
Note this is different than the LFS_F_DIRTY, which we need also. The
purpose of the flags are as follows:
- LFS_F_DIRTY - file ctz skip-list branch is out of sync with
filesystem, need to update metadata
- LFS_F_READING - file cache is in use for reading, need to drop cache
- LFS_F_WRITING - file cache is in use for writing, need to write out
cache to disk
The difference between flags is subtle but important because read/prog
caches are handled differently. Prog caches have asserts in place to
catch programs without erases (the infamous pcache->block == 0xffffffff
assert).
Though maybe the names deserve an update...
Found by ebinans
- Fixed uninitialized values found by valgrind.
- Fixed uninitialized value in lfs_dir_fetchmatch when handling revision
counts.
- Fixed mess left by lfs_dir_find when attempting to find the root
directory in lfs_rename and lfs_remove.
- Fixed corner case with definitions of lfs->cfg->block_cycles.
- Added test cases around different forms of the root directory.
I think all of these were found by TheLoneWolfling, so props!
This was caused by any commit containing entries large enough to
_always_ force a compaction. This would cause littlefs to think that it
would need to split infinitely because there was no base case.
The fix here is pretty simple: treat any commit with only a single entry
as unsplittable. This forces littlefs to first try overcompacting
(fitting more in a block than what has optimal runtime), and then
failing that return LFS_ERR_NOSPC for higher layers to handle.
found by TheLoneWolfling
While linked-lists do have some minor benefits, arrays are more
idiomatic in C and may provide a more intuitive API.
Initially the linked-list approach was more beneficial than it is now,
since it allowed custom attributes to be chained to internal linked
lists of attributes. However, this was dropped because exposing the
internal attribute list in this way created a rather messy user
interface that required strictly encoding the attributes with the
on-disk tag format.
Minor downside, users can no longer introduce custom attributes in
different layers (think OS vs app). Minor upside, the code size and
stack usage was reduced a bit.
Fortunately, this API can always be changed in the future without
breaking anything (except maybe API compatibility).
Before, the tag format's type field was limited to 9-bits. This sounds
like a lot, but this field needed to encode up to 256 user-specified
types. This limited the flexibility of the encoded types. As time went
on, more bits in the type field were repurposed for various things,
leaving a rather fragile type field.
Here we make the jump to full 11-bit type fields. This comes at the cost
of a smaller length field, however the use of the length field was
always going to come with a RAM limitation. Rather than putting pressure
on RAM for inline files, the new type field lets us encode a chunk
number, splitting up inline files into multiple updatable units. This
actually pushes the theoretical inline max from 8KiB to 256KiB! (Note
that we only allow a single 1KiB chunk for now, chunky inline files
is just a theoretical future improvement).
Here is the new 32-bit tag format, note that there are multiple levels
of types which break down into more info:
[---- 32 ----]
[1|-- 11 --|-- 10 --|-- 10 --]
^. ^ . ^ ^- entry length
|. | . \------------ file id chunk info
|. \-----.------------------ type info (type3)
\.-----------.------------------ valid bit
[-3-|-- 8 --]
^ ^- chunk info
\------- type info (type1)
Additionally, I've split the CREATE tag into separate SPLICE and NAME
tags. This simplified the new compact logic a bit. For now, littlefs
still follows the rule that a NAME tag precedes any other tags related
to a file, but this can change in the future.
There was an interesting subtlety with the existing layout of tags that
could become a problem in the future. Basically, littlefs avoids writing to
any region of storage it is not absolutely sure has been erased
beforehand. This is a part of limiting the number of assumptions about
storage. It's possible a storage technology can't support writes without
erases in a way that is undetectable at write time (Maybe changing a bit
without an erase decreases the longevity of the information stored on
the bit).
But the existing layout had a very tiny corner case where this wasn't
true. Consider the location of the valid bit in the tag struct:
[1|--- 31 ---]
^--- valid bit
The responsibility of this bit is to indicate if an attempt has been
made to write the following commit. If it is not set (the specific value
is dependent on a previous read and identified by the preceeding commit),
the assumption is that it is safe to write to the next region because it
has been erased previously. If it is set, we check if the next commit is
valid, if it isn't (because of CRC failure, likely due to power-loss), we
discard the commit. But because an attempt has been made to write to
that storage, we must then do a compaction to move to the other block in
the metadata-pair.
This plan looks good on paper, but what does it look like on storage?
The problem is that words in littlefs are in little-endian. So on
storage the tag actually looks like this:
[- 8 -|- 8 -|- 8 -|1|- 7 -]
^-- valid bit
This means that we don't actually set the valid bit before writing the
tag! We write the lower bytes first. If we lose power, we may have
written 3 bytes without this fact being detectable.
We could restructure the tag structure to store the valid bit lower,
however because none of the fields are 7 bits, this would make the
extraction more costly, and we then lose the ability to check this
valid bit with a sign comparison.
The simple solution is to just store the tag in big-endian. A small
benefit is that this will actually have a negative code cost on
big-endian machines.
This mixture of endiannesses is frustrating, however it is a pragmatic
solution with only a 20-byte code size cost.