forked from Imagelibrary/binutils-gdb
5a1d8eca5c331edab4e424c2034685433efa4bf5
3 Commits
| Author | SHA1 | Message | Date | |
|---|---|---|---|---|
|
|
a480362d88 |
libctf: string: refs rework
This commit moves provisional (not-yet-serialized) string refs towards the
scheme to be used for CTF IDs in the future. In particular:

 - provisional string offsets now count downwards from just under the
   external string offset space (all bits on but the high bit). This makes
   it possible to detect an overflowing strtab, and also makes it trivial to
   determine whether any string offset (ref) updates were missed -- where
   before we might get a slightly corrupted or incorrect string, we now get
   a huge high strtab offset corresponding to no string, and an error is
   emitted at read time.

 - refs are emitted at serialization time during the pass through the types.
   They are strictly associated with the newly-written-out buffer: the
   existing opened CTF dict is not changed, though it does still get the new
   strtab so that new refs to the same string can just refer directly to it.
   The provisional strtab hash table that contains these strings is not
   deleted after serialization (because we might serialize again): instead,
   we keep track in the parent of the lowest-yet-used ("latest") provisional
   strtab offset, and any strtab offset above that, but not external
   (high-bit-on) is considered provisional.

   This is sort-of-enforced by moving most of the ref-addition function
   declarations (including ctf_str_add_ref) to a new ctf-ref.h, which is
   not included by ctf-create.c or ctf-open.c.

 - because we don't add refs when adding types, we don't need to handle the
   case where we add things to expanding vlens (enums, struct members) and
   have to realloc() them. So the entire painful movable refs system can
   just be deleted, along with the ability to remove refs piecemeal at all
   (purging all of them is still possible). Strings added during type
   addition are added via ctf_str_add(), which adds no refs: the strings are
   picked up at serialization time and refs to their final, serialized
   resting place added. The DTDs never have any refs in them, and their
   provisional strtab offsets are never updated by the ref system.
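As a rough illustration of the offset scheme in the list above, here is a
minimal sketch in C. It is a toy model, not libctf code: prov_state,
prov_alloc, classify and offset_ok_at_read are hypothetical names, and the
32-bit constants are assumed from the description (high bit on for external
offsets, all other bits on for the first provisional offset). Treating an
offset equal to the lowest-yet-used one as provisional is this sketch's own
choice.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Toy model only: every name here is hypothetical, not a libctf identifier.  */

#define EXTERNAL_BIT UINT32_C (0x80000000) /* High bit on: external strtab.  */
#define FIRST_PROV   UINT32_C (0x7fffffff) /* All bits on but the high bit.  */

enum str_kind { STR_INTERNAL, STR_PROVISIONAL, STR_EXTERNAL };

struct prov_state
{
  uint32_t lowest_used; /* Lowest provisional offset handed out so far, or
                           EXTERNAL_BIT if none has been handed out yet.  */
  uint32_t strtab_len;  /* Length of the serialized, real strtab.  */
};

/* Hand out the next provisional offset, counting downwards.  Return false
   on overflow, i.e. if the offset would fall inside the real strtab.  */
static bool
prov_alloc (struct prov_state *s, uint32_t *off)
{
  assert (s->lowest_used > 0 && s->lowest_used <= EXTERNAL_BIT);
  uint32_t next = s->lowest_used - 1;

  if (next < s->strtab_len)
    return false;
  s->lowest_used = next;
  *off = next;
  return true;
}

/* Decide which offset space OFF belongs to.  */
static enum str_kind
classify (const struct prov_state *s, uint32_t off)
{
  if (off & EXTERNAL_BIT)
    return STR_EXTERNAL;
  if (off >= s->lowest_used)
    return STR_PROVISIONAL;
  return STR_INTERNAL;
}

/* Read-time check: a non-external offset must lie inside the strtab, so a
   provisional offset that was never updated is rejected rather than
   silently naming the wrong string.  */
static bool
offset_ok_at_read (uint32_t strtab_len, uint32_t off)
{
  return (off & EXTERNAL_BIT) != 0 || off < strtab_len;
}
```

In this model a missed update leaves something like 0x7ffffffe in the
output, which offset_ok_at_read rejects for any realistic strtab length.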

This caused several bugs to fall out of the earlier work and get fixed.
In particular, attempts to look up a string in a child dict now search
the parent's provisional strtab too: we add some extra special casing
for the null string so we don't need to worry about deduplication
moving it somewhere other than offset zero.

Finally, the optimization that removes an unreferenced synthetic external
strtab (the record of the strings the linker has told us about, kept around
internally for lookup during late serialization) is faulty. References to a
strtab entry only produce CTF-level refs if their value might change, and
an external string's offset won't change, so it produces no refs. Worse
yet, even if we did get a ref (say, if the string was originally believed
to be internal and only later were we told that the linker knew about it
too), all of a strtab's refs are dropped when we serialize it (since
they've been updated and can no longer change). So if we serialized it a
second time, its synthetic external strtab would be considered empty and
dropped, even though the same external strings as before still exist,
referencing it. We must keep the synthetic external strtab around as long
as external strings exist that reference it, i.e. for the life of the dict.
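The flaw in that optimization can be shown with a deliberately tiny model.
This is a sketch under assumptions, not the libctf implementation: toy_dict
and the three functions are hypothetical, and the dict is reduced to two
counters.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Toy model only: hypothetical names, not libctf data structures.  */
struct toy_dict
{
  size_t n_external_strings; /* Strings the linker has told us about.  */
  size_t n_external_refs;    /* Live refs that point at those strings.  */
};

/* Serialization updates every ref and then drops them all, since their
   values can no longer change.  */
static void
toy_serialize (struct toy_dict *d)
{
  assert (d != NULL);
  d->n_external_refs = 0;
}

/* Faulty test: treats "no refs" as "nothing uses the external strtab".  */
static bool
faulty_keep_external_strtab (const struct toy_dict *d)
{
  return d->n_external_refs > 0;
}

/* Fixed test: keep the strtab for as long as external strings exist.  */
static bool
fixed_keep_external_strtab (const struct toy_dict *d)
{
  return d->n_external_strings > 0;
}
```

After one call to toy_serialize the faulty test says the strtab can go,
while the fixed test still keeps it because the external strings have not
gone anywhere.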

One benefit of all this: now that we're emitting provisional string offsets
at a really high value, they're out of the way of the consecutive,
deduplicated string offsets in child dicts. So we can drop the constraint
that you cannot add strings to a dict with children, which allows us to add
types freely to parent dicts again. What you can't do is write that dict
out again. When we serialize, we currently update the dict being serialized
with the updated strtabs, so when you write a dict out, its provisional
strings become real strings, and suddenly the offsets would overlap once
more. But opening a dict and its children, adding to it, and then writing
it out again is rare indeed, and we have a workaround: anyone wanting to do
this can just use ctf_link instead.
|
||
|
|
e695879142 |
libctf, testsuite: fix various warnings in tests
These warnings are all off by default, but if they do fire you get spurious
ERRORs when running make check-libctf.

libctf/ChangeLog
2021-09-27  Nick Alcock  <nick.alcock@oracle.com>

	* testsuite/libctf-lookup/enum-symbol.c: Remove unused label.
	* testsuite/libctf-lookup/conflicting-type-syms.c: Remove unused
	variables.
	* testsuite/libctf-regression/pptrtab.c: Likewise.
	* testsuite/libctf-regression/type-add-unnamed-struct.c: Likewise.
	* testsuite/libctf-writable/pptrtab.c: Likewise.
	* testsuite/libctf-writable/reserialize-strtab-corruption.c: Likewise.
	* testsuite/libctf-regression/nonstatic-var-section-ld-r.c: Fix
	format string.
	* testsuite/libctf-regression/nonstatic-var-section-ld.c: Likewise.
	* testsuite/libctf-regression/nonstatic-var-section-ld.lk: Adjust.
	* testsuite/libctf-writable/symtypetab-nonlinker-writeout.c: Fix
	initializer.
|
||
|
|
986e9e3aa0 |
libctf: do not corrupt strings across ctf_serialize
The preceding change revealed a new bug: the string table is sorted for
better compression, so repeated serialization with type (or member)
additions in the middle can move strings around. But every serialization
flushes the set of refs (the memory locations that are automatically
updated with a final string offset when the strtab is updated), so if we
are not to have string offsets go stale, we must do all ref additions
within the serialization code (which walks the complete set of types and
symbols anyway). Unfortunately, we were adding one ref in another place:
the type name in the dynamic type definitions, which has a ref added to it
by ctf_add_generic.

So adding a type, serializing (via, say, one of the ctf_write functions),
adding another type with a name that sorts earlier, and serializing again
will corrupt the name of the first type, because it no longer has a ref
pointing to its dtd entry's name when its string offset is shifted later
in the strtab to make way for the other type.

To ensure that we don't miss strings, we also maintain a set of *pending
refs* that will be added later (during serialization), and remove entries
from that set when the ref is finally added. We always use
ctf_str_add_pending outside ctf-serialize.c, ensure that ctf_serialize
adds all strtab offsets as refs (even those in the dtds) on every
serialization, and mandate that no refs are live on entry to ctf_serialize
and that all pending refs are gone before strtab finalization. (Of
necessity ctf_serialize has to traverse all strtab offsets in the dtds in
order to serialize them, so adding them as refs at the same time is easy.)

(Note that we still can't erase unused atoms when we roll back, though we
can erase unused refs: members and enums are still not removed by
rollbacks and might reference strings added after the snapshot.)

libctf/ChangeLog
2021-03-18  Nick Alcock  <nick.alcock@oracle.com>

	* ctf-hash.c (ctf_dynset_elements): New.
	* ctf-impl.h (ctf_dynset_elements): Declare it.
	(ctf_str_add_pending): Likewise.
	(ctf_dict_t) <ctf_str_pending_ref>: New, set of refs that must be
	added during serialization.
	* ctf-string.c (ctf_str_create_atoms): Initialize it.
	(CTF_STR_ADD_REF): New flag.
	(CTF_STR_MAKE_PROVISIONAL): Likewise.
	(CTF_STR_PENDING_REF): Likewise.
	(ctf_str_add_ref_internal): Take a flags word rather than int
	params.  Populate, and clear out, ctf_str_pending_ref.
	(ctf_str_add): Adjust accordingly.
	(ctf_str_add_external): Likewise.
	(ctf_str_add_pending): New.
	(ctf_str_remove_ref): Also remove the potential ref if it is a
	pending ref.
	* ctf-serialize.c (ctf_serialize): Prohibit addition of strings
	with ctf_str_add_ref before serialization.  Ensure that the
	ctf_str_pending_ref set is empty before strtab finalization.
	(ctf_emit_type_sect): Add a ref to the ctt_name.
	* ctf-create.c (ctf_add_generic): Add the ctt_name as a pending
	ref.
	* testsuite/libctf-writable/reserialize-strtab-corruption.*: New test.
|
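The corruption described in the last commit message above (a sorted strtab
plus an offset that no ref keeps up to date) can be reproduced with a toy
string table. This is a sketch, not libctf code: toy_strtab,
strtab_serialize and strtab_offset_of are hypothetical names, and the fixed
array sizes are arbitrary.

```c
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>
#include <string.h>

/* Toy model only: hypothetical names, not libctf data structures.  */
#define MAX_STRS 8

struct toy_strtab
{
  const char *strs[MAX_STRS]; /* Strings added so far.  */
  size_t nstrs;
  char buf[128];              /* Serialized table: sorted, NUL-terminated
                                 strings after a null string at offset 0.  */
  size_t len;
};

static int
cmp_str (const void *a, const void *b)
{
  return strcmp (*(const char *const *) a, *(const char *const *) b);
}

/* Sort the strings and write them out again from scratch.  */
static void
strtab_serialize (struct toy_strtab *t)
{
  qsort (t->strs, t->nstrs, sizeof (t->strs[0]), cmp_str);
  t->buf[0] = '\0';
  t->len = 1;
  for (size_t i = 0; i < t->nstrs; i++)
    {
      size_t n = strlen (t->strs[i]) + 1;

      assert (t->len + n <= sizeof (t->buf));
      memcpy (t->buf + t->len, t->strs[i], n);
      t->len += n;
    }
}

/* Return the current offset of STR in the serialized table, or 0 (the null
   string) if it is absent.  This is the value a ref update would store.  */
static uint32_t
strtab_offset_of (const struct toy_strtab *t, const char *str)
{
  size_t off = 1;

  while (off < t->len)
    {
      if (strcmp (t->buf + off, str) == 0)
        return (uint32_t) off;
      off += strlen (t->buf + off) + 1;
    }
  return 0;
}
```

Add "zebra", serialize, and remember its offset; then add "aardvark" and
serialize again. The remembered offset now names "aardvark", which is the
stale-offset case that re-adding every ref during serialization avoids.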