Long story short: There is a specific case where removing a directory
can trigger a deorphan pass, but lfs_remove did not check for this,
would try to clean up the (already cleaned) directory orphan, and
trigger an assert:
lfs.c:4890:assert: assert failed with false, expected eq true
LFS_ASSERT(lfs_tag_size(lfs->gstate.tag) > 0x000 || orphans >= 0);
The specific case being a remove commit that triggers a relocation that
creates an orphan.
This is also possible in lfs_rename, but only if you're renaming a
directory that implies a remove, which is a pretty rare operation.
---
This was probably an oversight introduced in the non-recursive commit
logic rework.
Fortunately the fix is to just check if we even have an orphan before
trying to remove it. We can rely on this instead of the file type, so
this fix shouldn't even increase the code size.
Found and root-caused by Hugh-Baoa
These are the same as the related reentrant variants, but by opting out
of powerloss testing, we can test a much larger number of states without
having to worry about the impact on powerloss testing runtime.
Bumped CYCLES from 20 -> 2000.
This reveals an orphan remove bug found by Hugh-Baoa.
lfs_gstate_t was assumed to be a packed array of uint32_t,
but this is not always guaranteed. Access the fields directly
instead of attempting to loop over an array of uint32_t
Fixes clang tidy warnings about use of uninitialized memory
accessed.
When removing a file, we mark all open handles as "removed" (
pair={-1,-1}) to avoid trying to later read metadata that no longer
exists. Unfortunately, this also includes open dir handles that happen
to be pointing at the removed file, causing them to return
LFS_ERR_CORRUPT on the next read.
The good news is this is _not_ actual filesystem corruption, only a
logic error in lfs_dir_read.
We actually already have logic in place to nudge the dir to the next id,
but it was unreachable with the existing logic. I suspect this worked at
one point but was broken during a refactor due to lack of testing.
---
Fortunately, all we need to do is _not_ clobber the handle if the
internal type is a dir. Then the dir-nudging logic can correctly take
over.
I've also added test_dirs_remove_read to test this and prevent another
regression, adapted from tests provided by tpwrules that identified the
original bug.
Found by tpwrules
In v2.5, we introduced an optimization to avoid rereading data when
seeking inside the file cache. Unfortunately this used a slightly
wrong condition to check if the cache was "live", which meant seeks from
end-of-blocks could end up with invalid caches and wrong data. Not
great.
The problem is the nuance of when a file's cache is "live":
1. The file is marked as LFS_F_READING or LFS_F_WRITING.
But we can't reuse the cache when writing, so we only care about
LFS_F_READING.
2. file->off != lfs->cfg->block_size (end-of-block).
This is an optimization to avoid eagerly reading blocks we may not
actually care about.
We weren't checking for the end-of-block case, which meant if you seeked
_from_ the end of a block to a seemingly valid location in the file
cache, you could end up with an invalid cache.
Note that end-of-block may not be powers-of-two due to CTZ skip-list
pointers.
---
The fix is to check for the end-of-block case in lfs_file_seek. Note
this now matches the need-new-block logic in lfs_file_flushedread.
This logic change may also make lfs_file_seek call lfs_file_flush more
often, but only in cases where lfs_file_flush is a noop.
I've also extended the test_seek tests to cover a few more boundary-read
cases and prevent a regression in the future.
Found by wjl and lrodorigo
Looks like I missed a line during refactoring, resulted in only x86
sizes being reported in GitHub statuses.
If we wanted to limited these to one architecture, thumb would have
probably been a better pick.
- before: lfs_file_open("missing/") => LFS_ERR_ISDIR
- after: lfs_file_open("missing/") => LFS_ERR_NOTDIR
As noted by bmcdonnell-fb, returning LFS_ERR_ISDIR here was inconsistent
with the case where the file exists:
case before after
lfs_file_open("dir_a") => LFS_ERR_ISDIR LFS_ERR_ISDIR
lfs_file_open("dir_a/") => LFS_ERR_ISDIR LFS_ERR_ISDIR
lfs_file_open("reg_a/") => LFS_ERR_NOTDIR LFS_ERR_NOTDIR
lfs_file_open("missing_a/") => LFS_ERR_ISDIR LFS_ERR_NOTDIR
Note this is consistent with the behavior of lfs_stat:
lfs_file_open("reg_a/") => LFS_ERR_NOTDIR
lfs_stat("reg_a/") => LFS_ERR_NOTDIR
And the only other function that can "create" files, lfs_rename:
lfs_file_open("missing_a/") => LFS_ERR_NOTDIR
lfs_rename("reg_a", "missing_a/") => LFS_ERR_NOTDIR
There is some ongoing discussion about if these should return NOTDIR,
ISDIR, or INVAL, but this is at least an improvement over the
rename/open mismatch.
- test_paths_noent_trailing_slashes
- test_paths_noent_trailing_dots
- test_paths_noent_trailing_dotdots
These managed to slip through our path testing but should be tested, if
anything just to know exactly what errors these return.
Before this, the empty path ("") was treated as an alias for the root.
This was unintentional and just a side-effect of how the path parser
worked.
Now, the empty path should always result in LFS_ERR_INVAL:
- before: lfs_stat("") => 0
- after: lfs_stat("") => LFS_ERR_INVAL
Unlike normal files, dots (".") should not change the depth when
attempting to skip dotdot ("..") entries.
A weird nuance in the path parser, but at least it had a relatively easy
fix.
Added test_paths_dot_dotdots to prevent a regression.
This changes the behavior of paths that attempt to navigate above root
to now return LFS_ERR_INVAL:
- before: lfs_stat("/../a") => 0
- after: lfs_stat("/../a") => LFS_ERR_INVAL
This is a bit of an opinionated change while making other path
resolution tweaks.
In terms of POSIX-compatibility, it's a bit unclear exactly what dotdots
above the root should do.
POSIX notes:
> As a special case, in the root directory, dot-dot may refer to the
> root directory itself.
But the word choice of "may" implies it is up to the implementation.
I originally implement this as a root-loop simply because that is what
my Linux machine does, but I now think that's not the best option. Since
we're making other path-related tweaks, we might as well try to adopt
behavior that is, in my opinion, safer and less... weird...
This should also help make paths more consistent with future theoretical
openat-list APIs, where saturating at the current directory is sort of
the least expected behavior.
- lfs_mkdir now accepts trailing slashes:
- before: lfs_mkdir("a/") => LFS_ERR_NOENT
- after: lfs_mkdir("a/") => 0
- lfs_stat, lfs_getattr, etc, now reject trailing slashes if the file is
not a directory:
- before: lfs_stat("reg_a/") => 0
- after: lfs_stat("reg_a/") => LFS_ERR_NOTDIR
Note trailing slashes are accepted if the file is a directory:
- before: lfs_stat("dir_a/") => 0
- after: lfs_stat("dir_a/") => 0
- lfs_file_open now returns LFS_ERR_NOTDIR if the file exists but the
path contains trailing slashes:
- before: lfs_file_open("reg_a/") => LFS_ERR_NOENT
- after: lfs_file_open("reg_a/") => LFS_ERR_NOTDIR
To make these work, the internal lfs_dir_find API required some
interesting changes:
- lfs_dir_find no longer sets id=0x3ff on not finding a parent entry in
the path. Instead, lfs_path_islast can be used to determine if the
modified path references a parent entry or child entry based on the
remainder of the path string.
Note this is only necessary for functions that create new entries
(lfs_mkdir, lfs_rename, lfs_file_open).
- Trailing slashes mean we can no longer rely on the modified path being
NULL-terminated. lfs_path_namelen provides an alternative to strlen
that stops at slash or NULL.
- lfs_path_isdir also tells you if the modified path must reference a
dir (contains trailing slashes). I considered handling this entirely
in lfs_dir_find, but the behavior of entry-creating functions is too
nuanced.
At least lfs_dir_find returns LFS_ERR_NOTDIR if the file exists on
disk.
Like strlen, lfs_path_namelen/islast/isdir are all O(n) where n is the
name length. This isn't great, but if you're using filenames large
enough for this to actually matter... uh... open an issue on GitHub and
we might improve this in the future.
---
There are a couple POSIX incompatibilities that I think are not
worth fixing:
- Root modifications return EINVAL instead of EBUSY:
- littlefs: remove("/") => EINVAL
- POSIX: remove("/") => EBUSY
Reason: This would be the only use of EBUSY in the system.
- We accept modifications of directories with trailing dots:
- littlefs: remove("a/.") => 0
- POSIX: remove("a/.") => EBUSY
Reason: Not worth implementing.
- We do not check for existence of directories followed by dotdots:
- littlefs: stat("a/missing/..") => 0
- POSIX: stat("a/missing/..") => ENOENT
Reason: Difficult to implement non-recursively.
- We accept modifications of directories with trailing dotdots:
- littlefs: rename("a/b/..", "c") => 0
- POSIX: rename("a/b/..", "c") => EBUSY
Reason: Not worth implementing.
These are at least now documented in tests/test_paths.toml, which isn't
the greatest location, but it's at least something until a better
document is created.
Note that these don't really belong in SPEC.md because path parsing is
a function of the driver and has no impact on disk.
As expected these are failing and will need some work to pass.
The issue with lfs_file_open allowing trailing slashes was found by
rob-zeno, and the issue with lfs_mkdir disallowing trailing slashes was
found by XinStellaris, PoppaChubby, pavel-kirienko, inf265, Xywzel,
steverpalmer, and likely others.
This should be a superset of the previous test_paths test suite, while
covering a couple more things (more APIs, more path synonyms, utf8,
non-printable ascii, non-utf8, etc).
Not yet tested are some corner cases with known bugs, mainly around
trailing slashes.
These two small libraries provide examples of error-correction
compatible with littlefs (or any filesystem really).
It would be nice to eventually provide these as drop-in solutions, but
right now it's not really possible without breaking changes to
littlefs's block device API.
In the meantime, ramcrc32bd and ramrsbd at least provide example
implementations that can be adapted to users' own block devices.