Commit Graph

174 Commits

Author SHA1 Message Date
Christopher Haster
259535ee73 Added lfs_fs_mkconsistent
lfs_fs_mkconsistent allows running the internal consistency operations
(desuperblock/deorphan/demove) on demand and without any other
filesystem changes.

This can be useful for front-loading and persisting consistency operations
when you don't want to pay for this cost on the first write to the
filesystem.

Conveniently, this also offers a way to force the on-disk minor version
to bump, if that is wanted behavior.

Idea from kasper0
2023-04-26 21:45:26 -05:00
Christopher Haster
9e28c75482 Bumped minor version to v2.6 and on-disk minor version to lfs2.1 2023-04-21 00:57:00 -05:00
Christopher Haster
d1b254da2c Reverted removal of 1-bit counter threaded through tags
Initially I thought the fcrc would be sufficient for all of the
end-of-commit context, since indicating that there is a new commit is a
simple as invalidating the fcrc. But it turns out there are cases that
make this impossible.

The surprising, and actually common, case, is that of an fcrc that
will end up containing a full commit. This is common as soon as the
prog_size is big, as small commits are padded to the prog_size at
minimum.

  .------------------. \
  |     metadata     | |
  |                  | |
  |                  | +-.
  |------------------| | |
  |   foward CRC ------------.
  |------------------| / |   |
  |   commit CRC    -----'   |
  |------------------|       |
  |     padding      |       |
  |                  |       |
  |------------------| \   \ |
  |     metadata     | |   | |
  |                  | +-. | |
  |                  | | | +-'
  |------------------| / | |
  |   commit CRC --------' |
  |------------------|     |
  |                  |     /
  '------------------'

When the commit + crc is all contained in the fcrc, something silly
happens with the math behind crcs. Everything in the commit gets
canceled out:

  crc(m) = m(x) x^|P|-1 mod P(x)

  m ++ crc(m) = m(x) x^|P|-1 + (m(x) x^|P|-1 mod P(x))

  crc(m ++ crc(m)) = (m(x) x^|P|-1 + (m(x) x^|P|-1 mod P(x))) x^|P|-1 mod P(x)

  crc(m ++ crc(m)) = (m(x) x^|P|-1 + m(x) x^|P|-1) x^|P|-1 mod P(x)

  crc(m ++ crc(m)) = 0 * x^|P|-1 mod P(x)

This is the reason the crc of a message + naive crc is zero. Even with an
initializer/bit-fiddling, the crc of the whole commit ends up as some
constant.

So no manipulation of the commit can change the fcrc...

But even if this did work, or we changed this scheme to use two
different checksums, it would still require calculating the fcrc of
the whole commit to know if we need to tweak the first bit to invalidate
the unlikely-but-problematic case where we happen to match the fcrc. This
would add a large amount of complexity to the commit code.

It's much simpler and cheaper to keep the 1-bit counter in the tag, even
if it adds another moving part to the system.
2022-12-17 12:42:05 -06:00
Christopher Haster
2f26966710 Continued implementation of forward-crcs, adopted new test runners
This fixes most of the remaining bugs (except one with multiple padding
commits + noop erases in test_badblocks), with some other code tweaks.

The biggest change was dropping reliance on end-of-block commits to know
when to stop parsing commits. We can just continue to parse tags and
rely on the crc for catch bad commits, avoiding a backwards-compatiblity
hiccup. So no new commit tag.

Also renamed nprogcrc -> fcrc and commitcrc -> ccrc and made naming in
the code a bit more consistent.
2022-12-17 12:42:05 -06:00
Christopher Haster
b4091c6871 Switched to separate-tag encoding of forward-looking CRCs
Previously forward-looking CRCs was just two new CRC types, one for
commits with forward-looking CRCs, one without. These both contained the
CRC needed to complete the current commit (note that the commit CRC
must come last!).

         [--   32   --|--   32   --|--   32   --|--   32   --]
with:    [  crc3 tag  | nprog size |  nprog crc | commit crc ]
without: [  crc2 tag  | commit crc ]

This meant there had to be several checks for the two possible structure
sizes, messying up the implementation.

         [--   32   --|--   32   --|--   32   --|--   32   --|--   32   --]
with:    [nprogcrc tag| nprog size |  nprog crc | commit tag | commit crc ]
without: [ commit tag | commit crc ]

But we already have a mechanism for storing optional metadata! The
different metadata tags! So why not use a separate tage for the
forward-looking CRC, separate from the commit CRC?

I wasn't sure this would actually help that much, there are still
necessary conditions for wether or not a forward-looking CRC is there,
but in the end it simplified the code quite nicely, and resulted in a ~200 byte
code-cost saving.
2022-12-17 12:42:05 -06:00
Christopher Haster
0c781dd822 Merge remote-tracking branch 'origin/master' into test-and-bench-runners 2022-12-06 23:08:53 -06:00
Xenoamor
a25681b2a6 Improve lfs_file_close usage description
Improve the lfs_file_close usage description to make it clearer that the configuration structure must remain valid for its lifetime

In reference to #722
2022-09-12 12:29:06 -05:00
Christopher Haster
61455b6191 Added back heuristic-based power-loss testing
The main change here from the previous test framework design is:

1. Powerloss testing remains in-process, speeding up testing.

2. The state of a test, included all powerlosses, is encoded in the
   test id + leb16 encoded powerloss string. This means exhaustive
   testing can be run in CI, but then easily reproduced locally with
   full debugger support.

   For example:

   ./scripts/test.py test_dirs#reentrant_many_dir#10#1248g1g2 --gdb

   Will run the test test_dir, case reentrant_many_dir, permutation #10,
   with powerlosses at 1, 2, 4, 8, 16, and 32 cycles. Dropping into gdb
   if an assert fails.

The changes to the block-device are a work-in-progress for a
lazily-allocated/copy-on-write block device that I'm hoping will keep
exhaustive testing relatively low-cost.
2022-08-23 19:12:22 -05:00
Christopher Haster
148e312ea3 Bumped minor version to v2.5 2022-04-13 22:47:43 -05:00
Christopher Haster
0ced3623d4 Merge pull request #657 from littlefs-project/copyright-update
Update copyright notice
2022-04-10 21:59:27 -05:00
Christopher Haster
bfb9bd2483 Merge pull request #614 from nnayo/fix_no_malloc_2
don't use lfs_file_open() when LFS_NO_MALLOC is set
2022-04-10 14:44:33 -05:00
Christopher Haster
5801169348 Merge pull request #635 from mikee47/fix/spelling-errors
Fix spelling errors
2022-03-20 23:09:23 -05:00
Christopher Haster
2db5dc80c2 Update copyright notice 2022-03-20 23:03:52 -05:00
Emilio Lopes
e29e7aeefa Specify unit of the size members of the lfs_config struct
Fixes littlefs-project/littlefs#568
2022-02-18 21:09:19 -06:00
yog
e334983767 don't use lfs_file_open() when LFS_NO_MALLOC is set 2022-02-18 20:57:20 -06:00
mikee47
4977fa0c0e Fix spelling errors 2022-01-29 09:52:00 +00:00
Christopher Haster
3d4e4f2085 Bumped minor version to v2.4 2021-01-18 20:23:54 -06:00
Will
37f4de2976 Remove inline_files_max and lfs_t entry for metadata_max 2020-12-18 13:05:20 +10:00
Will
6b16dafb4d Add metadata_max and inline_file_max to config
We have seen poor read performance on NAND flashes with 128kB blocks.
The root cause is inline files having to traverse many sets of metadata
pairs inside the current block before being fully reconstructed. Simply
disabling inline files is not enough, as the metadata will still fill up
the block and eventually need to be compacted.

By allowing configuration of how much size metadata takes up, along with
limiting (or disabling) inline file size, we achieve read performance
improvements on an order of magnitude.
2020-12-15 12:59:32 +10:00
Christopher Haster
288a5cbc8d Bumped minor version to v2.3 2020-12-04 01:31:27 -06:00
Noah Gorny
7388b2938a Deprecate LFS_F_OPENED and use lfs_mlist_isused instead
Instead of additional flag, we can just go through the mlist.
2020-12-04 00:26:19 -06:00
Christopher Haster
ce425a56c3 Merge pull request #470 from renesas/SWFLEX-1517-littlefs-thread-safe-option
Add thread safe wrappers
2020-12-03 23:47:32 -06:00
Christopher Haster
45afded784 Moved LFS_TRACE calls to API wrapper functions
This removes quite a bit of extra code needed to entertwine the
LFS_TRACE calls into the original funcions.

Also changed temporary return type to match API declaration where
necessary.
2020-12-03 23:46:59 -06:00
Christopher Haster
00a9ba7826 Tweaked thread-safe implementation
- Stayed on non-system include for lfs_util.h for now
- Named internal functions "lfs_functionraw"
- Merged lfs_fs_traverseraw
- Added LFS_LOCK/UNLOCK macros
- Changed LFS_THREADSAFE from 1/0 to defined/undefined to
  match LFS_READONLY
2020-12-03 23:46:59 -06:00
Bill Gesner
fc6988c7c3 make raw functions static. formatting tweaks 2020-12-03 23:46:54 -06:00
Bill Gesner
d0f055d321 Squash of thread-safe PR cleanup
- expand functions
- add comment
- rename functions
- fix locking issue in format and mount
- use global include
- fix ac6 linker issue
- use the global config file
- address review comments
- minor cleanup
- minor cleanup
- review comments
2020-12-03 23:41:01 -06:00
Maxime Vincent
754b4c3cda Squash of LFS_READONLY cleanup
- undef unavailable function declarations altogether
- even less code, assert on write attempts
- remove LFS_O_WRONLY and other flags when compiling with LFS_READONLY
- do not annotate #endif, as requested
- move ifdef before comments blocks, rework dangling opening bracket
- ifdef file flags that are not needed in read-only mode
- slight refactor
- ifdef LFS_F_ERRED out as well
2020-12-03 23:03:29 -06:00
Maxime Vincent
8e6826c4e2 Add LFS_READYONLY define, to allow smaller builds providing read-only mode 2020-10-28 16:09:13 +01:00
Bill Gesner
10ac6b9cf0 add thread safe wrappers 2020-09-17 23:41:20 +00:00
Christopher Haster
6622f3deee Bumped minor version to v2.2 2020-03-29 21:43:58 -05:00
Christopher Haster
a5d614fbfb Added tests for power-cycled-relocations and fixed the bugs that fell out
The power-cycled-relocation test with random renames has been the most
aggressive test applied to littlefs so far, with:
- Random nested directory creation
- Random nested directory removal
- Random nested directory renames (this could make the
  threaded linked-list very interesting)
- Relocating blocks every write (maximum wear-leveling)
- Incrementally cycling power every write

Also added a couple other tests to test_orphans and test_relocations.

The good news is the added testing worked well, it found quite a number
of complex and subtle bugs that have been difficult to find.

1. It's actually possible for our parent to be relocated and go out of
   sync in lfs_mkdir. This can happen if our predecessor's predecessor
   is our parent as we are threading ourselves into the filesystem's
   threaded list. (note this doesn't happen if our predecessor _is_ our
   parent, as we then update our parent in a single commit).

   This is annoying because it only happens if our parent is a long (>1
   pair) directory, otherwise we wouldn't need to catch relocations.
   Fortunately we can reuse the internal open file/dir linked-list to
   catch relocations easily, as long as we're careful to unhook our
   parent whenever lfs_mkdir returns.

2. Even more surprising, it's possible for the child in lfs_remove
   to be relocated while we delete the entry from our parent. This
   can happen if we are our own parent's predecessor, since we need
   to be updated then if our parent relocates.

   Fortunately we can also hook into the open linked-list here.

   Note this same issue was present in lfs_rename.

   Fortunately, this means now all fetched dirs are hooked into the
   open linked-list if they are needed across a commit. This means
   we shouldn't need assumptions about tree movement for correctness.

3. lfs_rename("deja/vu", "deja/vu") with the same source and destination
   was broken and tried to delete the entry twice.

4. Managing gstate deltas when we lose power during relocations was
   broken. And unfortunately complicated.

   The issue happens when we lose power during a relocation while
   removing a directory.

   When we remove a directory, we need to move the contents of its
   gstate delta to another directory or we'll corrupt littlefs gstate.
   (gstate is an xor of all deltas on the filesystem). We used to just
   xor the gstate into our parent's gstate, however this isn't correct.

   The gstate isn't built out of the directory tree, but rather out of
   the threaded linked-list (which exists to make collecting this
   gstate efficient).

   Because we have to remove our dir in two operations, there's a point
   were both the updated parent and child can exist in threaded
   linked-list and duplicate the child's gstate delta.

     .--------.
   ->| parent |-.
     | gstate | |
   .-|   a    |-'
   | '--------'
   |     X <- child is orphaned
   | .--------.
   '>| child  |->
     | gstate |
     |   a    |
     '--------'

   What we need to do is save our child's gstate and only give it to our
   predecessor, since this finalizes the removal of the child.

   However we still need to make valid updates to the gstate to mark
   that we've created an orphan when we start removing the child.

   This led to a small rework of how the gstate is handled. Now we have
   a separation of the gpending state that should be written out ASAP
   and the gdelta state that is collected from orphans awaiting
   deletion.

5. lfs_deorphan wasn't actually able to handle deorphaning/desyncing
   more than one orphan after a power-cycle. Having more than one orphan
   is very rare, but of course very possible. Fortunately this was just
   a mistake with using a break the in the deorphan, perhaps left from
   v1 where multiple orphans weren't possible?

   Note that we use a continue to force a refetch of the orphaned block.
   This is needed in the case of a half-orphan, since the fetched
   half-orphan may have an outdated tail pointer.
2020-01-26 23:45:54 -06:00
Christopher Haster
db054684a6 Bump version to v2.1 2019-07-29 01:42:28 -05:00
Christopher Haster
74fe46de3d Merge pull request #233 from ARMmbed/discourage-no-wear-leveling
Change block_cycles disable from 0 to -1
2019-07-28 21:35:48 -05:00
Christopher Haster
31e28fddb7 Merge pull request #237 from Ar2rL/reverse_finalize_close
Protect (LFS_ASSERT) file operations against using not opened or closed files.
2019-07-28 21:26:03 -05:00
Christopher Haster
3806d88285 Fixed seek-related typos in lfs.h
- lfs_file_rewind == lfs_file_seek(lfs, file, 0, LFS_SEEK_SET)
- lfs_file_seek returns the _new_ position of the file
2019-07-28 21:25:18 -05:00
Christopher Haster
38a2a8d2a3 Minor improvement to documentation over block_cycles
Suggested by haneefmubarak
2019-07-28 20:42:13 -05:00
Peter A. Bigot
eb013e6dd6 lfs: correct documentation on lookahead-related values
The size of the lookahead buffer is required to be a multiple of 8 bytes
in anticipation of a future improvement.  The buffer itself need only be
aligned to support access through a uint32_t pointer.

Signed-off-by: Peter A. Bigot <pab@pabigot.com>
2019-07-23 11:05:04 -05:00
Ar2rL
df2e676562 Add necessary flag to mark file as being opened. 2019-07-21 11:34:14 +02:00
Christopher Haster
53a6e04712 Changed block_cycles disable from 0 to -1
As it is now, block_cycles = 0 disables wear leveling. This was a
mistake as 0 is the "default" value for several other config options.
It's even worse when migrating from v1 as it's easy to miss the addition
of block_cycles and end up with a filesystem that is not actually
wear-leveling.

Clearly, block_cycles = 0 should do anything but disable wear-leveling.

Here, I've changed block_cycles = 0 to assert. Forcing users to set a
value for block_cycles (500 is suggested). block_cycles can be set to -1
to explicitly disable wear leveling if desired.
2019-07-17 17:05:20 -05:00
Christopher Haster
73ea008b74 Merge pull request #151 from Krakonos/master
Fixed documentation for return lfs_dir_read return value.
2019-04-12 17:07:25 -05:00
Christopher Haster
1ff6432298 Added clarification on buffer alignment.
In v2, the lookahead_buffer was changed from requiring 4-byte alignment
to requiring 8-byte alignment. This was not documented as well as it
could be, and as FabianInostroza noted, this also implies that
lfs_malloc must provide 8-byte alignment.

To protect against this, I've also added an assert on the alignment of
both the lookahead_size and lookahead_buffer.

found by FabianInostroza and amitv87
2019-04-10 11:27:48 -05:00
Ladislav Láska
26d25608b6 Fixed documentation for return lfs_dir_read return value.
lfs_dir_read breaks the convention of returning non-zero on success,
this feature should be at least documented.
2019-03-01 10:01:02 +01:00
Christopher Haster
4ad09d6c4e Added migration from littlefs v1
This is the help the introduction of littlefs v2, which is disk
incompatible with littlefs v1. While v2 can't mount v1, what we can
do is provide an optional migration, which can convert v1 into v2
partially in-place.

At worse, we only need to carry over the readonly operations on v1,
which are much less complicated than the write operations, so the extra
code cost may be as low as 25% of the v1 code size. Also, because v2
contains only metadata changes, it's possible to avoid copying file
data during the update.

Enabling the migration requires two steps
1. Defining LFS_MIGRATE
2. Call lfs_migrate (only available with the above macro)

Each macro multiplies the number of configurations needed to be tested,
so I've been avoiding macro controlled features since there's still work
to be done around testing the single configuration that's already
available. However, here the cost would be too high if we included migration
code in the standard build. We can't use the lfs_migrate function for
link time gc because of a dependency between the allocator and v1 data
structures.

So how does lfs_migrate work? It turned out to be a bit complicated, but
the answer is a multistep process that relies on mounting v1 readonly and
building the metadata skeleton needed by v2.

1. For each directory, create a v2 directory
2. Copy over v1 entries into v2 directory, including the soft-tail entry
3. Move head block of v2 directory into the unused metadata block in v1
   directory. This results in both a v1 and v2 directory sharing the
   same metadata pair.
4. Finally, create a new superblock in the unused metadata block of the
   v1 superblock.

Just like with normal metadata updates, the completion of the write to
the second metadata block marks a succesful migration that can be
mounted with littlefs v2. And all of this can occur atomically, enabling
complete fallback if power is lost of an error occurs.

Note there are several limitations with this solution.

1. While migration doesn't duplicate file data, it does temporarily
   duplicate all metadata. This can cause a device to run out of space if
   storage is tight and the filesystem as many files. If the device was
   created with >~2x the expected storage, it should be fine.

2. The current implementation is not able to recover if the metadata
   pairs develop bad blocks. It may be possilbe to workaround this, but
   it creates the problem that directories may change location during
   the migration. The other solutions I've looked at are complicated and
   require superlinear runtime. Currently I don't think it's worth
   fixing this limitation.

3. Enabling the migration requires additional code size. Currently this
   looks like it's roughly 11% at least on x86.

And, if any failure does occur, no harm is done to the original v1
filesystem on disk.
2019-02-27 19:58:07 -06:00
Christopher Haster
e1f9d2bc09 Added support for RAM-independent reading of inline files
One of the new features in LittleFS is "inline files", which is the
inlining of small files in the parent directory. Inline files have a big
limitation in that they no longer have a dedicated scratch area to write
out data before commit-time. This is fine as long as inline files are
small enough to fit in RAM.

However, this dependency on RAM creates an uncomfortable situation for
portability, with larger devices able to create larger files than
smaller devices. This problem is especially important on embedded
systems, where RAM is at a premium.

Recently, I realized this RAM requirement is necessary for _writing_
inline files, but not for _reading_ inline files. By allowing fetches of
specific slices of inline files it's possible to read inline files
without the RAM to back it.

However however, this creates a conflict with COW semantics. Normally,
when a file is open twice, it is referenced by a COW data structure that
can be updated independently. Inlines files that fit in RAM also allows
independent updates, but the moment an inline file can't fit in
RAM, any updates to that directory block could corrupt open files
referencing the inline file. The fact that this behaviour is only
inconsistent for inline files created on a different device with more
RAM creates a potential nightmare for user experience.

Fortunately, there is a workaround for this. When we are commiting to a
directory, any open files needs to live in a COW structure or in RAM.
While we could move large inline files to COW structures at open time,
this would break the separation of read/write operations and could lead
to write errors at read time (ie ENOSPC). But since this is only an
issue for commits, we can defer the move to a COW structure to any
commits to that directory. This means when committing to a directory we
need to find any _open_ large inline files and evict them from the
directory, leaving the file with a new COW structure even if it was
opened read only.

While complicated, the end result is inline files that can use the
MAX RAM that is available, but can be read with MIN RAM, even with
multiple write operations happening to the underlying directory block.
This prevents users from needing to learn the idiosyncrasies of inline
files to use the filesystem portably.
2019-01-22 20:59:59 -06:00
Christopher Haster
51b2c7e4b6 Changed custom attribute descriptors to used arrays
While linked-lists do have some minor benefits, arrays are more
idiomatic in C and may provide a more intuitive API.

Initially the linked-list approach was more beneficial than it is now,
since it allowed custom attributes to be chained to internal linked
lists of attributes. However, this was dropped because exposing the
internal attribute list in this way created a rather messy user
interface that required strictly encoding the attributes with the
on-disk tag format.

Minor downside, users can no longer introduce custom attributes in
different layers (think OS vs app). Minor upside, the code size and
stack usage was reduced a bit.

Fortunately, this API can always be changed in the future without
breaking anything (except maybe API compatibility).
2019-01-13 23:56:53 -06:00
Christopher Haster
66d751544d Modified global state format to work with new tag format
The main difference here is a change from encoding "hasorphans" and
"hasmove" bits in the tag itself. This worked with the old format, but
in the new format the space these bits take up must be consistent for
each tag type. The tradeoff is that the new tag format allows for up to
256 different global states which may be useful in the future (for
example, a global free list).

The new format encodes this info in the data blob, using an additional
word of storage. This word is actually formatted the same as though it
was a tag, which simplified internal handling and may allow other tag
types in the future.

Format for global state:
[----                          96 bits                         ----]
[1|- 11 -|- 10 -|- 10 -|---                 64                  ---]
 ^    ^      ^      ^                        ^- move dir pair
 |    |      |      \-------------------------- unused, must be 0s
 |    |      \--------------------------------- move id
 |    \---------------------------------------- type, 0xfff for move
 \--------------------------------------------- has orphans

This also included another iteration over globals (renamed to gstate)
with some simplifications to how globals are handled.
2019-01-13 23:56:50 -06:00
Christopher Haster
b989b4a89f Cleaned up tag encoding, now with clear chunk field
Before, the tag format's type field was limited to 9-bits. This sounds
like a lot, but this field needed to encode up to 256 user-specified
types. This limited the flexibility of the encoded types. As time went
on, more bits in the type field were repurposed for various things,
leaving a rather fragile type field.

Here we make the jump to full 11-bit type fields. This comes at the cost
of a smaller length field, however the use of the length field was
always going to come with a RAM limitation. Rather than putting pressure
on RAM for inline files, the new type field lets us encode a chunk
number, splitting up inline files into multiple updatable units. This
actually pushes the theoretical inline max from 8KiB to 256KiB! (Note
that we only allow a single 1KiB chunk for now, chunky inline files
is just a theoretical future improvement).

Here is the new 32-bit tag format, note that there are multiple levels
of types which break down into more info:

[----            32             ----]
[1|--  11   --|--  10  --|--  10  --]
 ^.     ^     .     ^          ^- entry length
 |.     |     .     \------------ file id chunk info
 |.     \-----.------------------ type info (type3)
 \.-----------.------------------ valid bit
  [-3-|-- 8 --]
    ^     ^- chunk info
    \------- type info (type1)

Additionally, I've split the CREATE tag into separate SPLICE and NAME
tags. This simplified the new compact logic a bit. For now, littlefs
still follows the rule that a NAME tag precedes any other tags related
to a file, but this can change in the future.
2019-01-13 23:56:01 -06:00
Christopher Haster
a548ce68c1 Switched to traversal-based compact logic
This simplifies some of the interactions between reading and writing
inside the commit logic. Unfortunately this change didn't decrease
code size as was initially hoped, but it does offer a nice runtime
improvement for the common case and should improve debugability.

Before, the compact logic required three iterations:
1. iterate through all the ids in a directory
2. scan attrs bound to each id in the directory
3. lookup attrs in the in-progress commit

The code for this, while terse and complicated, did have some nice side
effect. The directory lookup logic could be reused for looking up in the
in-progress commit, and iterating through each id allows us to know
exactly how many ids we can fit during a compact. Giving us a O(n^3)
compact and O(n^3) split.

However, this was complicated by a few things.

First, this compact logic doesn't handle deleted attrs. To work around this,
I added a marker for the last commit (or first based on your perspective)
which would indicate if a delete should be copied over. This worked but was
a bit hacky and meant deletes weren't cleaned up on the first compact.

Second, we can't actually figure out our compacted size until we
compact. This worked ok except for the fact that splits will always have a
failed compact. This means we waste an erase which could very expensive.
It is possible to work around this by keeping our work, but with only a
single prog cache this was very tricky and also somewhat hacky.

Third, the interactions between reading and writing to the same block
were tricky and error-prone. They should mostly be working now, but
seeing this requirement go away does not make me sad.

The new compact logic fixes these issues by moving the complexity into a
general-purpose lfs_dir_traverse function which has much fewer side
effects on the system. We can even use it for dry-runs to precompute our
estimated size.

How does it work?
1. iterate through all attr in the directory
2. for each attr, scan the rest of the directory to figure out the
   attr's history, this will change the attr based on dir modifications
   and may even exit early if the attr was deleted.

The end result is a traversal function that gives us the resulting state
of each attr in only O(n^2). To make this complete, we allow a bounded
recursion into mcu-side move attrs, although this ends up being O(n^3)
unlike moves in the original solution (however moves are less common.

This gives us a nice traversal function we can use for compacts and
moves, handles deletes, and is overall simpler to reason about.

Two minor hiccups:
1. We need to handle create attrs specially, since this algorithm
   doesn't care or id order, which can cause problems since attr
   insertion are order sensitive. We can fix this by simply looking up
   each create (since there is only one per file) in order at the
   beginning of our traversal. This is oddly complimentary to the move
   logic, which also handles create attrs separately.

2. We no longer know exactly how many ids we can write to a dir during
   splits. However, since we can do a dry-run traversal, we can use that
   to simply binary search for the mid-point.

This gives us a O(n^2) compact and O(n^2 log n) split, which is a nice
minor improvement (remember n is bounded by block size).
2018-12-28 11:17:51 -06:00
Christopher Haster
c8a39c4b23 Merge remote-tracking branch 'origin/master' into v2-rebase-part2 2018-10-20 21:02:25 -05:00
Christopher Haster
195075819e Added 2GiB file size limit and EFBIG reporting
On disk, littlefs uses 32-bit integers to track file size. This sets a
theoretical limit of 4GiB for files.

However, the API passes file sizes around as signed numbers, with
negative values representing error codes. This means that not all of the
APIs will work with file sizes > 2GiB.

Because of related complications over in FUSE land, I've added the LFS_FILE_MAX
constant and proper error reporting if file writes/seeks exceed the 2GiB limit.
In v2 this will join the other constants that get stored in the
superblock to help portability. Since littlefs is targeting
microcontrollers, it's likely this will be a sufficient solution.

Note that it's still possible to enable partial-support for 4GiB files
by defining LFS_FILE_MAX during compilation. This will work for most of
the APIs, except lfs_file_seek, lfs_file_tell, and lfs_file_size.

We can also consider improving support for 4GiB files, by making seek a
bit more complicated and adding a lfs_file_stat function. I'll leave
this for a future improvement if there's interest.

Found by cgrozemuller
2018-10-20 12:34:23 -05:00