Any conditions in both the suites and cases are anded together to
determine when the test/bench should run.
Accepting a list here makes it easier to compose multiple conditions,
since toml-level elements are a bit easier to modify than strings of
C expressions.
This marks internal tests/benches (case.in="lfs.c") with an otherwise-unused
flag that is printed during --summary/--list-*. This just helps identify which
tests/benches are internal.
Previously, no matches was a noop. While this is consistent with an empty
test suite that contains no tests and shouldn't really be an error, it made
it easy to miss when a typo caused tests to be silently skipped.
Also added a bit of color to script-level errors in test/bench.py
This helps debug a corrupted mtree with cycles, which has been a problem
in the past.
Also fixed a small rendering issue with dbgmtree.py not connecting inner
tree edges to mdir roots correctly during rendering.
We are already paying the memory cost of a fetched lfsr_rbyd_t during
btree traversal so we can traverse inner btree nodes. But we are currently
just wasting this memory when we traverse leaf entries.
Instead, we can use this memory to cache the btree's current leaf node,
avoiding a btree walk until we've iterated over all rids in the current
leaf.
Think about this for a second:
1. We cache the root rbyd in lfsr_btree_t because we share the memory with
a union and we always need to read the root during lookups.
2. We cache the leaf rbyds in lfsr_btree_traversal_t because we need the
memory for traversing inner btree nodes.
The only btree nodes we don't cache during traversal are inner nodes
when the height of the btree >= 3.
If you're familiar with how btrees behave on storage, you know the
height becomes _exponentially_ less likely to grow as the tree gets
larger. It's entirely possible for btree traversal to simply never
traverse a non-cached btree node once your block size gets large
enough.
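Here's a minimal sketch of the idea with hypothetical, simplified types
(the real lfsr_btree_traversal_t carries more state than this):

    #include <stdint.h>

    // hypothetical, simplified stand-ins for the real littlefs types
    typedef struct rbyd {
        uint32_t block;
        uint32_t trunk;
        uint32_t weight;
    } rbyd_t;

    typedef struct btree_traversal {
        uint32_t bid;   // global btree id of the next entry
        rbyd_t leaf;    // cached leaf rbyd, reused until exhausted
        uint32_t rid;   // next rid in the cached leaf
    } btree_traversal_t;

    // sketch: only walk the btree from the root again once the cached
    // leaf has been exhausted
    static int btree_traversal_next(btree_traversal_t *t) {
        if (t->rid < t->leaf.weight) {
            // cheap case: the next entry comes straight from the cached leaf
            t->rid += 1;
            t->bid += 1;
            return 0;
        }

        // expensive case: walk root->leaf to fetch the next leaf rbyd
        // (left out here, this is where the real lookup would happen)
        return -1;
    }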
---
There may be a way to instead reclaim this memory, such as sharing this
memory with mtree traversal and higher layers, but the current API design
and hierarchy of C structs make this difficult. Maybe this is worth looking
into in the future, but at most it would save one lfsr_rbyd_t.
This also reverts some rid changes in btree lookups to share less code
but be a bit easier to reason about.
I intended to also add a test for cycles in the btree that backs the
mtree (and eventually other btrees), but something really curious
happened.
It turns out it's actually really hard to create a btree cycle, even
intentionally.
This is because each CoW btree pointer includes the expected CRC of
the branch's rbyd. To successfully create a cycle that isn't trivially
detected in a validating mtree traversal, you would somehow need to
solve for a cyclic set of dependent CRCs that are still valid.
I suspect this is slightly easier than a hash-based construction, due to
the linear nature of CRCs, but I still think it's unreasonable to expect
these sorts of cycles to occur in the wild, even with filesystem bugs.
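For illustration only, a decoded CoW branch pointer looks roughly like
this; the field names are assumptions and the real on-disk encoding is
leb128-based:

    #include <stdint.h>
    #include <stdbool.h>

    // hypothetical decoded form of a CoW branch pointer
    typedef struct branch {
        uint32_t block;  // block containing the child rbyd
        uint32_t trunk;  // offset of the child rbyd's trunk
        uint32_t cksum;  // expected checksum of the child rbyd
    } branch_t;

    // a validating traversal rejects any branch whose fetched child
    // doesn't match the checksum recorded in the parent, so an
    // undetected cycle would need a self-consistent set of checksums
    static bool branch_validate(const branch_t *b, uint32_t fetched_cksum) {
        return fetched_cksum == b->cksum;
    }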
---
Note this isn't true for the mdirs, which are mutable so storing a
checksum in the pointer isn't possible. For this reason, cycle detection
is kept for mdirs during mtree traversal. This may not be strictly
necessary for the mtree, but it is needed for the mroot chain.
Nonetheless, this does simplify things. Specifically it reduces the
cycle detection's tortoise state to only mdir pairs.
This is a nice bit of deduplication as long as the mtree traversal can
handle both:
1. Cycle detection
2. Btree node validation
Eventually we'll also collect gstate here, which mtree traversal should
make quite easy.
The only catch is if we eventually need a non-fetching way to read the
mroot config, such as if we need to infer the csum type or block-size,
but that's a future problem.
This is a bit tricky because our tortoise state is now quite large
thanks to how we are nesting traversals:
- Current mdir pair
- Current mtree block+trunk
- Current btree block+trunk? (TODO)
- Others? (TODO)
This also raises some questions about what constitutes a cycle in our
btrees. Since they are strictly CoW, they should be strictly DAGs worst
case. But is that still true when considering that btree nodes can
contain multiple trunk versions?
To be safe, I'm currently including the trunks in our tortoise state,
but it may be possible to relax this in the future.
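A rough sketch of a Brent-style tortoise over mdir pairs; the layout is
hypothetical and, as noted above, the real state also has to carry the
mtree/btree block+trunk pairs:

    #include <stdint.h>
    #include <stdbool.h>

    // hypothetical cycle-detection state; park the tortoise on the
    // starting mdir and initialize with i=0, period=1
    typedef struct tortoise {
        uint32_t blocks[2];  // mdir pair the tortoise is parked on
        uint32_t i;          // steps since the tortoise last moved
        uint32_t period;     // doubles each time the tortoise moves
    } tortoise_t;

    // step the hare one mdir; returns true if it landed on the
    // tortoise, i.e. we've found a cycle (Brent's algorithm)
    static bool tortoise_step(tortoise_t *t, const uint32_t blocks[2]) {
        // mdir pairs are unordered, so compare both orderings
        if ((blocks[0] == t->blocks[0] && blocks[1] == t->blocks[1])
                || (blocks[0] == t->blocks[1] && blocks[1] == t->blocks[0])) {
            return true;
        }

        // teleport the tortoise to the hare every 2^n steps
        if (t->i == t->period) {
            t->blocks[0] = blocks[0];
            t->blocks[1] = blocks[1];
            t->i = 0;
            t->period *= 2;
        }
        t->i += 1;
        return false;
    }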
Note that because we amortize the traversal cost over the number of
entries, mtree traversal may have some strange looking results when
compared to mtree lookup.
Though it's interesting to note this is a valid result. In mtree lookups
we need to fetch the mdir for each entry, which is expensive. However
mtree traversal can strictly avoid fetching each mdir more than once.
This does make mdir traversal faster when iterating over all mdirs in
order.
This can be represented in big O notation if we treat the number of
entries (n) and block size (b) as variables:
- mtree traversal via lookup = O(n*b + n*log(b)*log_b(n))
- mtree traversal via traversal = O(n*log(b)*log_b(n))
Validating btree nodes during lfsr_btree_lookup was useful as a
proof-of-concept, but it's not really needed if we validate btree nodes
during mtree traversal.
mtree traversal provides the first reads into the filesystem. It's how
we find the real mroot, and (in theory at the moment) it provides the core
operation for error detection and correction. With this in mind,
implementing btree node validation in mtree traversal makes a lot of
sense, with lfsr_btree_lookup leveraging an assumed successful
validation for faster/smaller btree walks.
Note that btree node validation during traversal is still optional. We
really don't want to pay this cost during block allocation for example.
---
It may look concerning that there's no related validation in the btree
traversal layer itself.
It turns out that a quirk of btree traversal returning inner btree nodes on
first visit, before actually traversing the btree node, is that it's
safe for us to validate the btree node in only the mtree traversal layer.
As long as we don't continue traversing on finding a corrupted btree,
the btree traversal layer will never traverse an unvalidated btree node.
This keeps all the validation logic in the same place, mtree traversal.
I don't know if this will stay this way if/when more error correction
features are added, but it's convenient in the meantime.
Just like lfsr_btree_traversal_t, lfsr_mtree_traversal_t provides a
mechanism for traversing the mtree incrementally, including any inner
btree nodes.
This is one level more complex than btree traversal because we also need
to handle the mroot chain and traversal of rids in each mdir.
Again, mtree traversal returns temporary decoded rbyd structs for inner
nodes. Actually, mtree traversal only returns inner nodes... so maybe
using lfsr_data_t here is the wrong choice:
- tag=LFSR_TAG_BTREE => lfsr_rbyd_t
- tag=LFSR_TAG_MDIR => lfsr_mdir_t
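As a sketch, the per-step result could look something like this, with
simplified, hypothetical stand-ins for lfsr_rbyd_t/lfsr_mdir_t:

    #include <stdint.h>

    // simplified, hypothetical stand-ins for the real lfsr_* structs
    typedef struct rbyd { uint32_t block, trunk; } rbyd_t;
    typedef struct mdir { uint32_t blocks[2]; } mdir_t;

    enum traversal_tag { TAG_BTREE, TAG_MDIR };

    // mtree traversal hands back a tag plus an already-decoded struct,
    // rather than a raw on-disk data reference
    typedef struct mtraversal_entry {
        enum traversal_tag tag;
        union {
            rbyd_t rbyd;  // tag == TAG_BTREE => inner btree node
            mdir_t mdir;  // tag == TAG_MDIR  => leaf mdir
        } u;
    } mtraversal_entry_t;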
The main thing to note is that traversal here != iteration.
Thanks to the right-leaning nature of our btrees, iteration is already
provided by lfsr_btree_lookupnext, using the bid as the current
iteration state.
What btree traversal provides is traversal over every rbyd + entries
used in the btree, including the inner btree nodes. This is useful for
things like garbage collection and error detection that need to operate
on the raw rbyds.
Note that both btree traversal and iteration are still O(n log_b(n)). We
can't do any better than that without recursion.
One non-intuitive implementation detail: we return a tag describing each
entry, but instead of returning an on-disk data reference for inner
btree nodes, we return a pointer to a temporarily decoded rbyd struct.
This simplifies root handling, and we probably want the decoded version
anyways:
- tag=LFSR_TAG_BTREE => lfsr_rbyd_t
- tag=anything else => lfsr_data_t
The reason for making btree traversal incremental, and not just use a
callback like we've done previously, is to eventually use this as a part
of high-level incremental garbage-collection/error-correction. For this
to work, all of the lower-levels also need to be incremental.
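To make the distinction concrete, a rough comparison of the two API
shapes with hypothetical names; the incremental form is what lets a
higher layer bound the work done per call:

    #include <stdint.h>

    // simplified, hypothetical types
    typedef struct rbyd { uint32_t block, trunk; } rbyd_t;

    // callback style: the whole walk runs to completion inside one call,
    // so the caller can't pause/resume it to spread the cost out
    int btree_traverse(const rbyd_t *root,
            int (*cb)(void *data, const rbyd_t *node), void *data);

    // incremental style: the caller owns the loop, so a higher-level
    // incremental GC/error-correction pass can do a few steps at a time
    typedef struct btree_traversal {
        rbyd_t branch;  // cached rbyd for the current position
        uint32_t bid;   // traversal position
    } btree_traversal_t;

    int btree_traversal_next(btree_traversal_t *t, rbyd_t *node);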
littlefs uses an invasive linked-list in open mdirs to keep any open
files/dirs (and some special mdirs) in sync during filesystem
operations. The main benefit of this is that the filesystem doesn't need
to know the number of open files at compile time.
The implementation here introduces a new type, lfsr_openedmdir_t, for
mdirs that want to participate in the opened-mdir linked-list. This
saves a couple words of memory in the cases where the mdir does not need
to participate in the opened-mdir linked-list.
Since we are creating quite a few more mdir structs in lfsr_mdir_commit now,
keeping this struct small is valuable.
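A minimal sketch of the invasive-list idea with hypothetical, simplified
types (the real lfsr_openedmdir_t/lfsr_mdir_t layouts differ):

    #include <stdint.h>

    // simplified stand-ins for the real littlefs types
    typedef struct mdir {
        int32_t mid;
        uint32_t blocks[2];
    } mdir_t;

    // only mdirs that need to stay in sync across commits (open
    // files/dirs, some special mdirs) pay for the extra list pointer
    typedef struct openedmdir {
        struct openedmdir *next;  // invasive singly-linked list
        mdir_t mdir;
    } openedmdir_t;

    // after a commit, walk the opened list and patch up affected mdirs;
    // this particular fixup (shifting mids) is illustrative only
    static void opened_update(openedmdir_t *head, int32_t mid, int32_t delta) {
        for (openedmdir_t *o = head; o; o = o->next) {
            if (o->mdir.mid >= mid) {
                o->mdir.mid += delta;
            }
        }
    }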
The implementation of lfsr_mdir_commit knew this was coming, so aside
from the new type, adding this feature was straightforward:
1. Update opened-mdirs based on in-flight attrs.
2. Update opened-mdirs rbyd state.
3. Mark any deleted opened-mdirs with the reserved mid -2.
4. Test.
Optimizing a script? This might sound premature, but the tree rendering
was, uh, quite slow for any decently sized (>1024) btree.
The main reason is that tree generation is quite hacky in places, repeatedly
spitting out multiple copies of the inner node's rbyd trees for example.
Rather than rewrite the tree generation implementation to be smarter,
this just changes all edge representations to namedtuples (which may
reduce memory pressure a bit), and collects them into a Python set.
This has the effect of deduplicating generated edges efficiently, and
improves the rendering performance significantly.
---
I also considered memoizing the rbyd trees, but dropped the idea since the
current renderer performs well enough.
In addition to plugging in the rbyd and btree renderers in dbgbtree.py,
this required wiring in rbyd trees in the mdirs and mroots.
A bit tricky, but the implementation ended up more-or-less straightforward
thanks to the common edge description used by the tree renderer.
For example, a relatively small mtree:
$ ./scripts/dbgmtree.py disk -B4096 -t -i
mroot 0x{0,1}.45, rev 1, weight 0
mdir ids tag ...
{0000,0001}: .---------> -1 magic 8 ...
| .-------> config 21 ...
+-+-+ btree 7 ...
0006.000a: | .-+ 0 mdir w1 2 ...
{0002,0003}: | | '-> 0.0 inlined w1 1024 ...
0006.000a: '-+-+ 1 mdir w1 2 ...
{0004,0005}: '-> 1.0 inlined w1 1024 ...
This builds on dbgrbyd.py and dbgbtree.py by allowing for quick
debugging of the littlefs mtree, which is a btree of rbyd pairs with a
few bells and whistles.
This also comes with a number of tweaks to dbgrbyd.py and dbgbtree.py,
mostly changing rbyd addresses to support some more mdir friendly
formats.
The syntax for rbyd addresses is starting to converge into a couple
common patterns, which is nice for quickly determining what type of
address you are looking at at a glance:
- 0x12 => An rbyd at block 0x12
- 0x12.34 => An rbyd at block 0x12 with trunk 0x34
- 0x{12,34} => An rbyd at either block 0x12 or block 0x34 (an mdir)
- 0x{12,34}.56 => An rbyd at either block 0x12 or block 0x34 with trunk 0x56
These scripts have also been updated to support any number of blocks in
an rbyd address, for example 0x{12,34,56,78}. This is a bit of future
proofing. >2 blocks in mdirs may be explored in the future for the
increased redundancy.
It's interesting to note the different performance characteristics of
purely CoW btrees vs our mutable mtree.
The main downside of our mtree is the need to fetch leaf mdirs. This
fetch is expensive, and can be avoided in CoW btrees by storing the
trunk in each branch's parent.
On the other hand, btrees need to propagate all changes upwards to the
root.
An interesting takeaway is that a sort of mdir-trunk cache may be a very
interesting optimization for relatively little RAM cost. This may be
something to explore in the future.
This consolidates all mdir relocation and allocation logic into
lfsr_mdir_compact by using an extra mid hint to indicate if the mdir is
new and unallocated. When mid == -4, the mdir is NOT new, and the mid
should be taken from the mdir.
This deduplicates quite a bit of logic between mdir compaction and
uninlining, though it is incompatible with the optional rbyd scheme
previously used to compact mroot 0x{0,1} during mroot extensions.
Though I'm not sure this is the best approach. We should probably look
at this again after things have more of a shape.
- lfsr_btree_isnull still used tag and not only weight for null trees
- relocation forgot the mid
- missed relocation when uninlining, though this fix should be cleaned up
- made revision count behavior a bit more consistent
Note that the new tests may be -Gnor exclusive; they rely quite a bit on
exactly when compaction happens...
This allows lfsr_mdir_compact_ to also cover the mroot extension commit.
The mroot located at 0x{0,1} is a bit unique in that it can never be
relocated. Instead we "extend" the mroot chain by an additional mroot
that can relocate.
One nice thing is we can implement this by letting lfsr_mdir_commit_
perform a normal relocation, and then rewrite the 0x{0,1} mroot with
a pointer to the "relocated" mroot.
Though we have to take extra care to make sure this write doesn't
recursively trigger an additional relocate, which would never terminate.
Fortunately, this situation only happens when we are compacting the
0x{0,1} mroot, which simplifies things a bit.
This was actually a bug, and would have eventually been caught when we resume
power-loss testing.
In our current implementation of the mtree, we immediately drop mdirs
when their weight goes to zero, since at this point there's no route to
write new commits to the mdir. This was implemented by 1. writing out the
commit, and then 2. removing the mdir from the mtree if its weight is zero.
But this has a problem analogous to why we can't salvage failed compacts
during mdir split: If we allow a valid commit to be written to an mdir
before we update its position in the mtree, that commit becomes immediately
visible in the case of a power-loss.
This is a bit tricky to fix since we rely entirely on appending tags to
the on-disk rbyd to determine weight changes. We tried simulating weight
changes previously, but that was a mistake that created complexity.
The solution here is to separate the appending of tags from the commit
finalization:
1. Append any pending tags.
2. If weight->0, drop caches, abort the commit.
3. Otherwise, write the checksum, finalizing the commit.
This has the new side-effect of intentionally leaving unfinalized commits
on-disk, but since we have no way to reclaim the erased bytes in these
mdirs, that is probably ok.
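A sketch of the split, with hypothetical helpers stubbed out in place of
the real rbyd commit machinery:

    #include <stdint.h>

    // hypothetical simplified types/helpers, stubbed for illustration
    typedef struct rbyd { uint32_t weight; } rbyd_t;

    static int rbyd_appendattrs(rbyd_t *rbyd) { (void)rbyd; return 0; }  // 1. stage pending tags
    static int rbyd_appendcksum(rbyd_t *rbyd) { (void)rbyd; return 0; }  // 3. write cksum, commit becomes valid
    static void pcache_drop(void) {}                                     // 2. discard staged, unflushed progs

    // append first, and only finalize if the mdir still has weight; the
    // staged-but-unfinalized tags are never made valid on disk, so a
    // power-loss can't observe a commit to a dropped mdir
    static int mdir_commit_(rbyd_t *rbyd) {
        int err = rbyd_appendattrs(rbyd);
        if (err) {
            return err;
        }

        if (rbyd->weight == 0) {
            // dropping this mdir: abort before the checksum is written
            pcache_drop();
            return 0;
        }

        // finalize, making the commit visible
        return rbyd_appendcksum(rbyd);
    }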
lfsr_mdir_commit => lfsr_mdir_commit
                    |-> lfsr_mdir_commit_
                    '-> lfsr_mdir_compact_
The mess that was lfsr_mdir_commit was a growing problem. Flattening all
possible mdir operations into a single loop may have resulted in a
smaller code size, but at a significant cost to implementation
difficulty, readability, bugs, etc.
This restructure splits the mdir commit logic into three components:
1. lfsr_mdir_compact_
This handles the swapping of mdir blocks, revision counts, erasing, etc.
lfsr_mdir_compact_ also accepts a range of ids, allowing it to be
called directly for mdir splitting/uninlining.
Actually, the biggest feature in lfsr_mdir_compact_, which is easy to
overlook, is that it accepts two attr lists. This seems like a weird
feature for an API, but keep in mind we have strict RAM limitations,
so we can't really concatenate attr lists easily.
   There is only a single case where we need two attr lists: when uninlining
an mroot we need to include 1. any pending mroot attrs, and 2. the
new mtree. But one case is enough to make attempted workarounds
excessively complicated.
Simply accepting two attr lists here resolves this.
2. lfsr_mdir_commit_
This handles the low-level mdir commit logic: It tries to do a simple
rbyd commit, and if that fails falls back to a compact/relocate loop.
Perhaps surprisingly, lfsr_mdir_commit_ does not handle mdir splits.
The exact behavior of mdir splits is context specific, so
lfsr_mdir_commit_ simply errors if lfsr_rbyd_estimate indicates
compaction will be unsuccessful.
Less surprisingly, lfsr_mdir_commit_ does not handle any
mtree/internal state updates. lfsr_mdir_commit_ is only concerned
with the specific mdir struct provided.
3. lfsr_mdir_commit
This ties together all of the mdir commit logic and provides the main
mechanism by which the rest of the filesystem interacts with mdirs.
lfsr_mdir_commit is mainly responsible for handling the side-effects
of the low-level operations:
- Propagating mtree/mroot updates caused by relocations/splits/drops
- Updating the provided mdir struct correctly if it splits/relocates
based on a rid hint
- Updating the internally tracked mroot/mtree state on success
- Updating any open mdirs on success (TODO)
This is a complicated function, but most of that complexity can be
captured in a large, but relatively simple, tree of if statements.
Not great for code cost, but this may just be a necessity of the new
mtree data-structure.
This also includes the tail-recursive mroot propagation loop, which
is an excellent example of how splitting the high/low-level logic
helps separate context-specific logic.
This still needs work, but the significantly improved readability of
lfsr_mdir_commit provides much more confidence in this design.
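Roughly, the shape of the split looks like this (hypothetical and heavily
simplified, the real functions take attr lists and much more state):

    // hypothetical, simplified sketch of the new structure
    typedef struct mdir { int mid; } mdir_t;
    enum { OK = 0, ERR_NOFIT = -1 };

    static int rbyd_tryappend(mdir_t *mdir) { (void)mdir; return OK; }  // stub
    static int mdir_compact_(mdir_t *mdir) { (void)mdir; return OK; }   // stub

    // low-level commit: append, falling back to compact/relocate; note
    // that mdir splits are NOT handled here, the caller decides what to do
    static int mdir_commit_(mdir_t *mdir) {
        int err = rbyd_tryappend(mdir);
        if (err != ERR_NOFIT) {
            return err;
        }
        return mdir_compact_(mdir);
    }

    // high-level commit: handle the side-effects of the low-level operations
    static int mdir_commit(mdir_t *mdir) {
        int err = mdir_commit_(mdir);
        if (err) {
            return err;
        }
        // propagate mtree/mroot updates from relocations/splits/drops,
        // fix up the caller's mdir copy, update the tracked mroot/mtree
        // state, and keep any opened mdirs in sync
        return OK;
    }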
This already has the strong advantage that the extra mdir copies make it
clear when exactly the higher-level mdir copies are updated. This gives
us much better confidence that errors will not render the mdir state
unusable, though it may come with a RAM cost.
Dropped the high-level "large entry" tests in exchange for these low-level
tests. The high-level tests accomplished the same thing, but worse and
less reliably.
Added some rough fixes (this whole code path needs to be rewritten).
Also made lfsr_rbyd_bisect a bit better behaved when dealing with a
small number of large entries. This was necessary for the split/drop
corner case tests since these rely on precise control of when mdirs
split.
mdirs behave a bit differently than btree nodes here. When an mdir's
weight drops to zero, we eagerly drop the mdir. Unfortunately this
introduces a large number of conditions into lfsr_mdir_commit. Maybe
there's some different way to structure the code to avoid this...
Also expanded the mtree tests to cover more corner cases; these are
desperately needed for any confidence that mdir drops work.
This isn't the greatest coverage as we don't have a verifiable simulation.
Simulating the splitting-bucket-tree that is the mtree is tricky.
So right now this mostly just checks that there are no internal assert
failures and that we have the expected number of entries afterwards.
- lfsr_btree_lookupnext_ => gives you the underlying rbyd/rid, intended
for btree-internal use.
- lfsr_btree_lookupnext => does not give you underlying rbyd/rid, used
for general purpose lookups/iteration.
This is for consistency with other *_lookupnext functions, and
discourages use of the leaf rbyd/rid. These are sensitive to internal
btree state.
This is mostly for consistency. It's unclear if we'll ever actually
use the on-disk lfsr_data_t representation here, since current thoughts
expect most btrees to only store pointers with in-device
representations.
It may be worth reverting this in the future.
After implementing lfsr_btree_commit and lfsr_mdir_commit, a common
pattern emerged for all compact/split operations:
1. Copy over subrange of tags.
2. Apply pending tags in that subrange.
lfsr_rbyd_appendall and lfsr_rbyd_compact now provide these operations,
allowing for better code sharing across these two algorithms.
The only hiccup is vestigial names in btree commit, which require a flag
and some special handling.
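A sketch of how the two primitives compose during a split, with
hypothetical signatures and all of the actual rbyd work elided:

    // hypothetical simplified types/primitives, stubbed for illustration
    typedef struct rbyd { int dummy; } rbyd_t;
    typedef struct attr { int dummy; } attr_t;

    // 1. copy the [start_rid, end_rid) subrange of tags from src into dst
    static int rbyd_compact(rbyd_t *dst, const rbyd_t *src,
            int start_rid, int end_rid) {
        (void)dst; (void)src; (void)start_rid; (void)end_rid;
        return 0;
    }

    // 2. apply only the pending attrs that land in the same subrange
    static int rbyd_appendall(rbyd_t *dst,
            const attr_t *attrs, int attr_count,
            int start_rid, int end_rid) {
        (void)dst; (void)attrs; (void)attr_count;
        (void)start_rid; (void)end_rid;
        return 0;
    }

    // a split is then just two compact+appendall passes, one per sibling
    static int rbyd_split(rbyd_t *left, rbyd_t *right, const rbyd_t *src,
            int split_rid, int weight,
            const attr_t *attrs, int attr_count) {
        int err;
        if ((err = rbyd_compact(left, src, 0, split_rid))
                || (err = rbyd_appendall(left, attrs, attr_count, 0, split_rid))) {
            return err;
        }
        if ((err = rbyd_compact(right, src, split_rid, weight))
                || (err = rbyd_appendall(right, attrs, attr_count, split_rid, weight))) {
            return err;
        }
        return 0;
    }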
Note that btree merge is a bit of a special case for now.
This accepts the possibility of failed merges that require cleanup in
exchange for simpler estimation (and more flexibility for experimenting
with better compaction strategies).
Note that we still estimate if our merge fits post-compaction; this just
isn't reliable for knowing if the merge fits when appended to the current
compaction, with in-flight attrs for both the in-flight commit and the merge name.
Fortunately, since our estimate is conservative, we shouldn't see any
split<->merge oscillation.
---
It's also worth noting that since the merge name isn't accounted for in
our sibling, it wasn't clear if our estimation was correct in the first
place.
This change avoids any issues that might result from that.
One downside of using our compaction estimate as a heuristic to avoid
failed merges: The merge abort code path may be difficult to cover in tests.
We should make sure merge aborts don't go untested.
Unfortunately due to different early-exit conditions,
estimate/isdegenerate isn't trivially compatible. The previous, merged
implementation missed the opportunity to inline btrees with two large
entries undergoing compaction.
It's unlikely to hit this, but splitting these back into two separate
passes simplifies the code and avoids the potential for other bugs from
this combination of unrelated pieces of logic.
Keep in mind lfsr_rbyd_isdegenerate is cheap:
1. It's only run when compacting the root of a btree.
2. The cutoff is usually small; at the moment it requires at most 2 ids, or
   2*2 rbyd lookups with the current btree implementation.
Currently relying on lfsr_rbyd_append/appendattrs to inject extra
attributes during lfsr_mdir_commit; need to consider if this is really
the best solution. This probably results in more function calls than we
really need.
This approach is simpler: fall back to using two passes if we split a
supermdir.
This trades runtime for code simplicity, but I think we really don't
care about the runtime here, since this operation should really only happen
once in a filesystem's entire lifetime.
This is tricky because of the number of corner-cases that can occur:
1. Our supermdir fits as is => compact normally.
2. Our supermdir does not fit, but it does if we separate the superattrs
from file attrs => uninline, but don't split.
3. Our supermdir does not fit, and does not fit after separating the
superattrs => uninline and split.
This accepts a looser upper bound in exchange for a simpler compaction estimate.
We now only lower the bound for:
- The number of alts per tag.
- The worst-case leb128 encoding assuming current block_size.
Since this worst case encoding only depends on the block_size, it can
also be precalculated and stored somewhere, though we're currently not
doing that.
On the plus side, this no longer varies depending on the rbyd's weight,
which could cause hard-to-detect issues for very large B-trees.
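As a minimal sketch of the idea (hypothetical helper, not the actual
code), the worst-case leb128 size only depends on the bound:

    #include <stdint.h>

    // worst-case size of a leb128-encoded value bounded by max; each
    // leb128 byte carries 7 bits of payload
    static uint32_t leb128_maxsize(uint32_t max) {
        uint32_t size = 0;
        do {
            size += 1;
            max >>= 7;
        } while (max);
        return size;
    }

    // e.g. any offset/size bounded by a 4096-byte block encodes in at
    // most leb128_maxsize(4096-1) = 2 bytes, and this only depends on
    // block_size, so it could be computed once at mount or precalculated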
Still needs work, but at least adopted optionally in the btree.
Ignoring the mdirs for now, which is a bit ironic, because the mdir
compaction is really what this feature is for. But this at least proves
the concept.
---
Unlike btrees, mdirs simply cannot perform the attempt-then-delete-half
strategy currently performed by the btrees during compaction with a single
pcache. This is because the moment we finish the commit with the delete,
it becomes visible to the filesystem. We can't abort the commit temporarily
to deal with the other half of the split, because our pcache is in use.
So, instead, the idea is to estimate the compacted rbyd size before
compacting, using conservative (but tight!) estimates for various leb128
encoded parts of the metadata.
And if we adopt this strategy for mdirs, we should probably adopt it in
the btrees for better code sharing.
A couple benefits:
- Major reduction in progs during split, since we don't write out tags
just to delete them.
- btree merge can actually consider both siblings now.
- Not needing to weave the split/merge logic around compact offers a
better route for code deduplication.
- mdir compact will actually work, that's generally a good thing.
And a couple downsides:
- This estimate is complex, meaning more code-cost and a bigger surface
area for bugs.
- This results in a minor performance hit for the common compact case,
since we need to read the rbyd being compacted twice instead of once.
LFSR_TAG_BNAME => LFSR_TAG_BRANCH
LFSR_TAG_BRANCH => LFSR_TAG_BTREE
Maybe this will be a problem in the future if our branch structure is
not the same as a standalone btree, but I don't really see that
happening.
This became surprisingly tricky.
The main issue is knowing when to split mdirs, and how to determine
this without wasting erase cycles.
Unlike splitting btree nodes, we can't salvage failed compacts here. As
soon as the salvage commit is written to disk, the commit becomes immediately
visible to the filesystem because it still exists in the mtree. This is
a problem if we lose power.
We're likely going to need to implement rbyd estimates. This is
something I hoped to avoid because it brings in quite a bit of
complexity and might lead to an annoying amount of storage waste since
our estimates will need to be conservative to avoid unrecoverable
situations.
---
Also changed the on-disk btree/branch struct to store a copy of the weight.
This was already required for the root of the btree; requiring the
weight to be stored in every btree pointer allows better code
deduplication at the cost of some redundancy on btree branches, where
the weight is already implied by the rbyd structure.
This weight is usually a single byte for most branches anyways.
This may be worth revisiting at some point to see if there's any other
unexpected tradeoffs.
Added lfsr_mountinited/lfsr_formatinited, mostly so lfsr_formatinited
can just call lfsr_mountinited for its mount-check.
This also leads to a nice consolidation of the cleanup-on-error part of
lfsr_mount/lfsr_format.
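Something along these lines, heavily simplified and with hypothetical
signatures:

    // hypothetical, simplified sketch of the relationship
    typedef struct lfs { int dummy; } lfs_t;

    // assumes lfs is already initialized with its caches/buffers
    static int mountinited(lfs_t *lfs) {
        // find the real mroot, parse config, etc.
        (void)lfs;
        return 0;
    }

    static int formatinited(lfs_t *lfs) {
        // write out a fresh filesystem...
        // ...then reuse mountinited as the format's mount-check
        return mountinited(lfs);
    }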
The exact behavior of lfsr_rbyd_lookup is a bit unusual, and has already
resulted in a few mistakes. To make this more clear at a glance, names
have been changed and a few more helper functions added.
The new names and expected behavior:
- *_lookupnext - lookup the smallest id/tag greater than or equal to the
requested id/tag, returns LFS_ERR_NOENT if id/tag is greater than all
ids/tags in the data structure.
- *_lookup - lookup the exact id/tag, returns LFS_ERR_NOENT if id/tag
is not in the data structure.
These have been adopted in all current data structures: rbyd/btree/mdir
- lfsr_rbyd_lookup => lfsr_rbyd_lookupnext
- lfsr_btree_lookup => lfsr_btree_lookupnext
- lfsr_btree_namelookup => lfsr_btree_namelookupnext
- lfsr_mdir_lookup => lfsr_mdir_lookupnext
Note that no lfsr_btree_namelookup is added; this is more complicated
than lfsr_btree_lookup (we need to cmp the name on-disk for equality)
and also probably not needed.
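A toy illustration of the two behaviors over a sorted list of ids; this
stands in for the rbyd/btree/mdir variants and is not the actual code:

    #include <stddef.h>

    enum { ERR_NOENT = -2 };  // stand-in for LFS_ERR_NOENT

    // *_lookupnext: smallest id >= the requested id, ERR_NOENT if none
    static int lookupnext(const int *ids, size_t len, int id) {
        for (size_t i = 0; i < len; i++) {
            if (ids[i] >= id) {
                return ids[i];
            }
        }
        return ERR_NOENT;
    }

    // *_lookup: exact matches only
    static int lookup(const int *ids, size_t len, int id) {
        int next = lookupnext(ids, len, id);
        return (next == id) ? id : ERR_NOENT;
    }

    // with ids = {1, 3, 7}: lookupnext(2) -> 3, lookup(2) -> ERR_NOENT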
Considering that _most_ progs in littlefs need to be checksummed for
future consistency checks, it's not really worth it to have a separate
non-checksumming function. Especially when you consider the fanout of
the prog extensions for different types.
A similar problem, sort of unresolved, is what to do with all the
validating functions. I guess we'll cross that bridge when we need to.
This makes lfsr_data_t a more powerful primitive in littlefs.
- Implemented a set of convenience lfsr_data_read* functions for
easier reading from muxed on-disk/in-ram data.
- Adopted these functions and lfsr_data_t in *_fromdisk functions
as well as low-level rbyd operations.
- Changed leb128 parsing to rely on returned limits instead of
pre-initialized 0xffs to detect truncated leb128s.
- Prefer incremental reading+parsing in more places as a side-effect,
  though we don't really have a measurement of whether this is a net
  benefit or cost code/RAM-wise.
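The basic idea of the mux, with hypothetical field names (the real
lfsr_data_t packs this more compactly):

    #include <stdint.h>
    #include <stdbool.h>
    #include <string.h>

    // hypothetical simplified mux of on-disk and in-RAM data
    typedef struct data {
        bool ondisk;
        uint32_t size;
        union {
            struct { uint32_t block, off; } disk;  // on-disk slice
            const uint8_t *buf;                    // in-RAM buffer
        } u;
    } data_t;

    // placeholder for the cached block-device read
    static int bd_read(uint32_t block, uint32_t off,
            void *buffer, uint32_t size) {
        (void)block; (void)off;
        memset(buffer, 0, size);
        return 0;
    }

    // one read helper lets parsers not care where the bytes live
    static int data_read(data_t *d, void *buffer, uint32_t size) {
        if (size > d->size) {
            size = d->size;  // clamp to remaining data
        }

        if (d->ondisk) {
            int err = bd_read(d->u.disk.block, d->u.disk.off, buffer, size);
            if (err) {
                return err;
            }
            d->u.disk.off += size;
        } else {
            memcpy(buffer, d->u.buf, size);
            d->u.buf += size;
        }

        d->size -= size;
        return (int)size;
    }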
These functions offer more than the previous internal bd functions, the idea being
that the more functionality we can move into this layer, the less
functionality gets duplicated across dependent functions.
- lfsr_bd_read - caching read with hint
- lfsr_bd_readcsum - read with checksum
- lfsr_bd_csum - calculate checksum, don't read data
- lfsr_bd_cmp - compare data against a buffer
- lfsr_bd_prog - caching prog
- lfsr_bd_progcsum - prog with checksum
- lfsr_bd_sync - complete an in-flight prog
- lfsr_bd_progvalidate - prog with read-back validation
- lfsr_bd_progcsumvalidate - prog with checksum and read-back validation
- lfsr_bd_syncvalidate - complete an in-flight prog with read-back validation
- lfsr_bd_erase - erase a block
- lfsr_bd_readtag - read a tag with optional checksum
- lfsr_bd_progtag - prog a tag with checksum
Of course these are all susceptible to change.