forked from Imagelibrary/littlefs
Implemented mtree path/dname lookup, rudimentary lfsr_mkdir/lfsr_dir_read
This makes it now possible to create directories in the new system.
The new system now uses a single global "mtree" to store all metadata
entries in the filesystem. In this system, a directory is simply a range
of metadata entries. This has a number of benefits, but does come with
its own problems:
1. We need to indicate which directory each file belongs to. To do this
the file's name entry has been changed to a tuple of leb128-encoded
directory-id + actual file name:
  01 66 69 6c 65 2e 74 78 74  .file.txt
  ^  '----------+----------'
  '-------------|------------ leb128 directory-id
                '------------ ascii/utf8 name
If we include the directory-id as part of filename comparison, files
should naturally be next to other files in the same directory.
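As a rough illustration of the name-entry layout above, here is a minimal Python sketch. The helper names (leb128_encode, dname_encode, and so on) are hypothetical and not taken from the littlefs sources, and comparing the decoded (did, name) tuple is only an assumption about how the ordering could work:

```python
def leb128_encode(n: int) -> bytes:
    """Encode a non-negative integer as unsigned leb128."""
    out = bytearray()
    while True:
        byte = n & 0x7f
        n >>= 7
        if n:
            out.append(byte | 0x80)  # more bytes follow
        else:
            out.append(byte)
            return bytes(out)

def leb128_decode(data: bytes):
    """Decode an unsigned leb128 prefix, returning (value, bytes consumed)."""
    value = 0
    for i, byte in enumerate(data):
        value |= (byte & 0x7f) << (7 * i)
        if not byte & 0x80:
            return value, i + 1
    raise ValueError("truncated leb128")

def dname_encode(did: int, name: bytes) -> bytes:
    """Hypothetical helper: leb128 directory-id followed by the name."""
    return leb128_encode(did) + name

def dname_decode(data: bytes):
    """Hypothetical helper: split an encoded entry back into (did, name)."""
    did, used = leb128_decode(data)
    return did, data[used:]

# Sorting by (did, name) keeps files of the same directory adjacent.
entries = [(2, b"b"), (1, b"z"), (2, b"a"), (1, b"a")]
ordered = sorted(entries)
```

With this sketch, dname_encode(1, b"file.txt") reproduces the bytes in the diagram above.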
2. We need a way to allocate directory-ids for new directories. This
turns out to be a bit more tricky than I expected.
We can't use any mid/bid/rid inherent to the mtree, because these
change on any file creation/deletion. And since we commit the did
into the tree, that's not acceptable.
Initially I thought you could just find the largest did and increment,
but this gives you no way to reclaim deleted dids. And sure, deleted
dids have no storage consumption, but eventually you will overflow
the did integer. Since this can suddenly happen in a filesystem
that's been in a steady-state for years, that's pretty unacceptable.
One solution is to do a simple linear search over the mtree for an
unused did. But with a runtime of O(n^2 log(n)), this raises
performance concerns.
Sidenote: It's interesting to note that the Linux kernel's allocation
of process-ids, a very similar problem, is surprisingly complex and
relies on a radix-tree of bitmaps (struct idr). This suggests I'm not
missing an obvious solution somewhere.
The solution I settled on here is to instead treat the set of dids as
a sort of hash table:
1. Hash the full directory path into a did.
2. Perform a linear search until we have no collision.
                leb128(truncate28(crc32c("dir")))
       .--------'
       v
  9e cd c8 30 66 69 6c 65 2e 74 78 74  ...0file.txt
  '----+----' '----------+----------'
       '-----------------|------------ leb128 directory-id
                         '------------ ascii/utf8 name
In the worst case, this still degrades to O(n^2 log(n)) performance
when we are close to running out of dids. However that seems unlikely
to happen in practice, since, unlike normal hash tables, we don't
truncate our hashes down to the size of the table. An additional
32-bit word for each file is a small price to pay for a low chance of
collisions.
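The hash-then-probe idea can be sketched in a few lines of Python. This is only a toy model: the set `used` stands in for the lookup the real code would perform against the mtree, alloc_did is a hypothetical name, masking off the low 28 bits is an assumption about what truncate28 does, and the crc32c here uses the common init/xorout conventions, which may not match what littlefs uses on disk:

```python
def crc32c(data: bytes) -> int:
    """Bitwise CRC-32C (Castagnoli), reflected polynomial 0x82f63b78."""
    crc = 0xffffffff
    for byte in data:
        crc ^= byte
        for _ in range(8):
            crc = (crc >> 1) ^ (0x82f63b78 if crc & 1 else 0)
    return crc ^ 0xffffffff

def alloc_did(path: str, used: set, bits: int = 28) -> int:
    """Hypothetical allocator: hash the path, then probe linearly.

    `used` models the set of dids already present in the mtree.
    """
    mask = (1 << bits) - 1
    did = crc32c(path.encode("utf-8")) & mask  # assumed truncation
    for _ in range(mask + 1):
        if did not in used:
            return did
        did = (did + 1) & mask  # collision, try the next did
    raise RuntimeError("no free directory-ids")

# Two directories whose paths hash to the same did end up as neighbors.
used = set()
first = alloc_did("dir", used)
used.add(first)
second = alloc_did("dir", used)
```

Deleting a directory simply removes its did from the set, so the slot becomes available again without any extra bookkeeping.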
In the current implementation, I do truncate the hash to 28 bits.
Since we encode the hash with leb128, and hashes are statistically
random, this gives us better usage of the leb128 encoding: 28 bits
fill exactly four leb128 bytes, while a full 32-bit hash would almost
always need five. However it does limit a 32-bit littlefs to 256 Mi
directories. Maybe this should be a configurable limit in the future.
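The arithmetic behind the 28-bit choice can be checked with a small Python sketch. leb128 stores 7 payload bits per byte, so the helper below (leb128_len, a hypothetical name) just counts how many 7-bit groups a value needs:

```python
def leb128_len(n: int) -> int:
    """Number of bytes an unsigned leb128 encoding of n occupies."""
    length = 1
    while n >= 0x80:
        n >>= 7
        length += 1
    return length

max_28bit = (1 << 28) - 1      # largest truncated hash
max_32bit = (1 << 32) - 1      # largest untruncated hash

len_28 = leb128_len(max_28bit)  # four bytes, every payload bit used
len_32 = leb128_len(max_32bit)  # five bytes, fifth byte carries only 4 bits

# Fraction of uniformly random 32-bit values that spill into a fifth byte.
spill_fraction = 1 - (1 << 28) / (1 << 32)

# Number of distinct 28-bit dids, i.e. the 256 Mi directory limit.
did_count = 1 << 28
```

So for uniformly random hashes, 15 out of 16 untruncated values would cost an extra byte per file name, in exchange for 4 more bits of did space.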
But that highlights another benefit of this scheme. It's easy to
change in the future without disk changes.
3. We need a way to know if a directory-id is allocated, even if the
directory is empty.
For this we just introduce a new tag: LFSR_TAG_DSTART, which
is an empty file entry that indicates the directory at the given did
in the mtree is allocated.
To create/delete these atomically with the reference in our parent
directory, we can use the GRM system for atomic renames.
Note this isn't implemented yet.
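To make the role of the dstart entry concrete, here is a toy in-memory model in Python. The tag values come from the script changed in this commit, but everything else is an assumption for illustration: the ToyMtree class, modeling dstart as an entry with an empty name, and using did 0 for the root. The two writes in mkdir are exactly the pair that would need to be made atomic on disk:

```python
TAG_DSTART = 0x0201
TAG_REG = 0x0202
TAG_DIR = 0x0203

class ToyMtree:
    """Toy model: maps (did, name) -> (tag, child did or None)."""

    def __init__(self):
        # assumption: the root directory lives at did 0
        self.entries = {(0, b""): (TAG_DSTART, None)}

    def is_allocated(self, did: int) -> bool:
        """A did is allocated iff its dstart entry exists."""
        tag, _ = self.entries.get((did, b""), (None, None))
        return tag == TAG_DSTART

    def mkdir(self, parent_did: int, name: bytes, did: int) -> None:
        if self.is_allocated(did):
            raise ValueError("did already allocated")
        # these two updates must land together in a real filesystem
        self.entries[(did, b"")] = (TAG_DSTART, None)
        self.entries[(parent_did, name)] = (TAG_DIR, did)

    def listdir(self, did: int) -> list:
        """Names in a directory, in (did, name) order, skipping dstart."""
        return [name
                for (d, name), (tag, _) in sorted(self.entries.items(),
                                                  key=lambda kv: kv[0])
                if d == did and tag != TAG_DSTART]

tree = ToyMtree()
tree.mkdir(0, b"a", 5)
```

After the mkdir call, did 5 is reported as allocated even though the directory contains no files, which is the property the dstart tag exists to provide.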
This is also the first time we finally get around to testing all of the
dname lookup functions, so this did find a few bugs, mostly around
reporting the root correctly.
@@ -20,10 +20,11 @@ COLORS = [
 TAG_NULL = 0x0000
 TAG_SUPERMAGIC = 0x0003
 TAG_SUPERCONFIG = 0x0004
-TAG_NAME = 0x0100
-TAG_BRANCH = 0x0100
-TAG_REG = 0x0101
-TAG_DIR = 0x0102
+TAG_NAME = 0x0200
+TAG_BRANCH = 0x0200
+TAG_DSTART = 0x0201
+TAG_REG = 0x0202
+TAG_DIR = 0x0203
 TAG_STRUCT = 0x0300
 TAG_INLINED = 0x0300
 TAG_BLOCK = 0x0302
@@ -31,6 +32,7 @@ TAG_BTREE = 0x0303
 TAG_MROOT = 0x0304
 TAG_MDIR = 0x0305
 TAG_MTREE = 0x0306
+TAG_DID = 0x0307
 TAG_UATTR = 0x0400
 TAG_SATTR = 0x0500
 TAG_ALT = 0x4000
@@ -38,6 +40,7 @@ TAG_CRC = 0x2000
 TAG_FCRC = 0x2100
 
 
+
 # parse some rbyd addr encodings
 # 0xa -> [0xa]
 # 0xa.b -> ([0xa], b)
@@ -130,6 +133,7 @@ def tagrepr(tag, w, size, off=None):
     elif (tag & 0xff00) == TAG_NAME:
         return '%s%s %d' % (
             'branch' if tag == TAG_BRANCH
+            else 'dstart' if tag == TAG_DSTART
             else 'reg' if tag == TAG_REG
             else 'dir' if tag == TAG_DIR
             else 'name 0x%02x' % (tag & 0xff),
@@ -143,6 +147,7 @@ def tagrepr(tag, w, size, off=None):
             else 'mroot' if tag == TAG_MROOT
             else 'mdir' if tag == TAG_MDIR
             else 'mtree' if tag == TAG_MTREE
+            else 'did' if tag == TAG_DID
             else 'struct 0x%02x' % (tag & 0xff),
             ' w%d' % w if w else '',
             size)