binutils-gdb

Author	SHA1	Message	Date
Nick Alcock	83e9ca77b2	libctf, create: typedefs Nothing here but adjustment to internal API changes. Typedefs have no special properties that need querying, so there are no changes to ctf-types.c at all.	2025-04-25 18:07:43 +01:00
Nick Alcock	bd0c033b29	libctf, create, types: slices Nothing difficult for this CTF-specific type kind, just the usual adjustment to internal API changes.	2025-04-25 18:07:43 +01:00
Nick Alcock	0cd5118024	libctf: create, types: arrays The same internal API changes for arrays. There is one ABI change here, to ctf_arinfo_t: - uint32_t ctr_nelems; /* Number of elements. */ + size_t ctr_nelems; /* Number of elements. */	2025-04-25 18:07:43 +01:00
Nick Alcock	d65d03bec4	libctf: create, types: reftypes and pointers This is pure adjustment for internal API changes, and a change to the type-compatibility of pointers to type 0 now that it can be void as well as "unrepresentable". By now this dance should be quite familiar.	2025-04-25 18:07:43 +01:00
Nick Alcock	d5dd8997b3	libctf: create, types: conflicting types The conflicting type kind is a CTF-specific prefix kind consisting purely of an optional translation unit name. It takes the place of the old hidden bit: we have already seen it used to prefix types added with a CTF_ADD_NONROOT flag. The deduplicator will also use them to label conflicting types from different TUs smushed into the same dict by the CU-mapping mechanism: unlike the hidden bit, with this scheme users can tell which CUs the conflicting types came from. New API: +int ctf_type_conflicting (ctf_dict_t , ctf_id_t, const char cuname); +int ctf_set_conflicting (ctf_dict_t , ctf_id_t, const char *); (Frankly I expect ctf_set_conflicting to be used only by deduplicators and things like that, but if we provide an option to query something we should also provide an option to produce it...)	2025-04-25 18:07:43 +01:00
Nick Alcock	fb8917ac21	libctf, create, types: type and decl tags These are a little more fiddly than previous kinds, because their namespacing rules are odd: they have names (so presumably we want an API to look them up by name), but the names are not unique (they don't need to be, because they are not entities you can refer to from C), so many distinct tags in the same TU can have the same name. Type tags only refer to a type ID: decl tags refer to a specific function parameter or structure member via a zero-indexed "component index". The name tables for these things are a hash of name to a set of type IDs; rather different from all the other named entities in libctf. As a consequence, they can presently be looked up only using their own dedicated functions, not using ctf_lookup_by_name et al. (It's not clear if this restriction could ever be lifted: ctf_lookup_by_name and friends return a type ID, not a set of them.) They are similar enough to each other that we can at least have one function to look up both type and decl tags if you don't care about their component_idx and only want a type ID: ctf_tag. (And one to iterate over them, ctf_tag_next). (A caveat: because tags aren't widely used or generated yet, much of this is more or less untested and/or supposition and will need testing later.) New API, more or less the minimum needed because it's not entirely clear how these things will be used: +ctf_id_t ctf_tag (ctf_dict_t , ctf_id_t tag); +ctf_id_t ctf_decl_tag (ctf_dict_t , ctf_id_t decl_tag, + int64_t component_idx); +ctf_id_t ctf_tag_next (ctf_dict_t , const char tag, ctf_next_t ); +ctf_id_t ctf_add_type_tag (ctf_dict_t , uint32_t, ctf_id_t, const char ); +ctf_id_t ctf_add_decl_type_tag (ctf_dict_t , uint32_t, ctf_id_t, const char ); +ctf_id_t ctf_add_decl_tag (ctf_dict_t , uint32_t, ctf_id_t, const char *, + int component_idx);	2025-04-25 18:07:43 +01:00
Nick Alcock	39cdb3e395	libctf: create, types: functions, linkage, arg names (API ADVICE) Functions change in CTFv4 by growing argument names as well as argument types; the representation changes into a two-element array of (type, string offset) rather than a simple array of arg types. Functions also gain an explicit linkage in a different type kind (CTF_K_FUNC_LINKAGE, which corresponds to BTF_KIND_FUNC). New API: typedef struct ctf_funcinfo { /* ... */ - uint32_t ctc_argc; /* Number of typed arguments to function. */ + size_t ctc_argc; /* Number of typed arguments to function. */ }; int ctf_func_arg_names (ctf_dict_t , unsigned long, uint32_t, const char *); int ctf_func_type_arg_names (ctf_dict_t , ctf_id_t, uint32_t, const char *names); +extern int ctf_type_linkage (ctf_dict_t , ctf_id_t); -extern ctf_id_t ctf_add_function (ctf_dict_t , uint32_t, - const ctf_funcinfo_t , const ctf_id_t ); +extern ctf_id_t ctf_add_function (ctf_dict_t , uint32_t, + const ctf_funcinfo_t , const ctf_id_t , + const char *arg_names); +extern ctf_id_t ctf_add_function_linkage (ctf_dict_t , uint32_t, + ctf_id_t, const char *, int linkage); Adding this is fairly straightforward; the only annoying part is the way the callers need to allocate space for the arg name and type arrays. Maybe we should rethink these into something like ctf_type_aname(), allocating space for the caller so the caller doesn't need to? It would certainly make all the callers in libctf much less complex... While we're at it, adjust ctf_type_reference, ctf_type_align, and ctf_type_size for the new internal API changes (they also all have special-case code for functions).	2025-04-25 18:07:43 +01:00
Nick Alcock	a632f3ed33	libctf, types: ctf_type_kind_{iter,next} et al These new functions let you iterate over types by kind, letting you get all variables, all enums, all datasecs, etc. (This is amenable to future optimization, and some is expected shortly.) We also add new internal functions ctf_type_kind_{forwarded_,unsliced_,}tp which are like the corresponding non-_tp functions except that they take a ctf_type_t rather than a type ID: doing this allows the deduplicator to use these nearly-public functions more. The public ctf_type_kind* functions are reimplemented in terms of these. This machinery is the principal place where the magic encoding of forwards is encoded.	2025-04-25 18:07:43 +01:00
Nick Alcock	ea21a1b2ae	libctf: create, types: variables and datasecs (REVIEW NEEDED) This is an area of significant difference from CTFv3. The API changes significantly, with quite a few additions to allow creation and querying of these new datasec entities: -typedef int ctf_variable_f (const char name, ctf_id_t type, void arg); +typedef int ctf_variable_f (ctf_dict_t , const char name, ctf_id_t type, + void arg); +typedef int ctf_datasec_var_f (ctf_dict_t fp, ctf_id_t type, size_t offset, + size_t datasec_size, void arg); +/* Search a datasec for a variable covering a given offset. + + Errors with ECTF_NODATASEC if not found. */ + +ctf_id_t ctf_datasec_var_offset (ctf_dict_t fp, ctf_id_t datasec, + uint32_t offset); + +/* Return the datasec that a given variable appears in, or ECTF_NODATASEC if + none. */ + +ctf_id_t ctf_variable_datasec (ctf_dict_t fp, ctf_id_t var); +int ctf_datasec_var_iter (ctf_dict_t , ctf_id_t, ctf_datasec_var_f , + void ); +ctf_id_t ctf_datasec_var_next (ctf_dict_t , ctf_id_t, ctf_next_t *, + size_t size, size_t offset); -int ctf_add_variable (ctf_dict_t , const char , ctf_id_t); +/* ctf_add_variable adds variables to no datasec at all; + ctf_add_section_variable adds them to the given datasec, or to no datasec at + all if the datasec is NULL. */ + +ctf_id_t ctf_add_variable (ctf_dict_t , const char , int linkage, ctf_id_t); +ctf_id_t ctf_add_section_variable (ctf_dict_t , uint32_t, + const char datasec, const char name, + int linkage, ctf_id_t type, + size_t size, size_t offset); We tie datasecs quite closely to variables at addition (and, as should become clear later, dedup) time: you never create datasecs, you only create variables in datasecs, and the datasec springs into existence when you do so: datasecs are always found in the same dict as the variables they contain (the variables are never in the parent if the datasec is in a child or anything). 
We keep track of the variable->datasec mapping in ctf_var_datasecs (populating it at addition and open time), to allow ctf_variable_datasec to work at reasonable speed. (But, as yet, there are no tests of this function at all.) The datasecs are created unsorted (to avoid variable addition becoming O(n^2)) and sorted at serialization time, and when ctf_datasec_var_offset is invoked. We reuse the natural-alignment code from struct addition to get a plausible offset in datasecs if an alignment of -1 is specified: maybe this is unnecessary now (it was originally added when ctf_add_variable added variables to a "default datasec", while now it just leaves them out of all datasecs, like externs are). One constraint of this is that we currently prohibit the addition of nonrepresentable-typed variables, because we can't tell what their natural alignment is: if we dropped the whole "align" and just required everyone adding a variable to a datasec to specify an offset, we could drop that restriction. WDYT? One additional caveat: right now, ctf_lookup_variable() looks up the type of a variable (because when it was invented, variables were not entities in themselves that you could look up). This name is confusing as hell as a result. It might be less confusing to make it return the CTF_K_VAR, but that would be awful to adapt callers to, since both are represented with ctf_id_t's, so the compiler wouldn't warn about the needed change at all... I've vacillated on this three or four times now.	2025-04-25 18:07:43 +01:00
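The "create unsorted, sort on demand, then search for the variable covering an offset" approach described in ea21a1b2ae can be sketched in standalone C. This is only an illustration under stated assumptions: `fake_secvar_t`, `cmp_offset` and `find_covering` are hypothetical names, entries are assumed not to overlap, and none of this is libctf's actual implementation.

```c
#include <stddef.h>
#include <stdint.h>
#include <stdlib.h>

/* Hypothetical datasec entry: a variable's type ID, offset and size.  */
typedef struct fake_secvar
{
  uint32_t type;
  size_t offset;
  size_t size;
} fake_secvar_t;

/* Order entries by ascending offset.  */
static int
cmp_offset (const void *a, const void *b)
{
  const fake_secvar_t *x = a;
  const fake_secvar_t *y = b;

  return (x->offset > y->offset) - (x->offset < y->offset);
}

/* Sort VARS in place, then binary-search for the entry whose
   [offset, offset + size) range covers OFFSET.  Returns NULL if no
   entry covers it.  Assumes entries do not overlap.  */
static const fake_secvar_t *
find_covering (fake_secvar_t *vars, size_t n, size_t offset)
{
  size_t lo = 0, hi = n;

  qsort (vars, n, sizeof (*vars), cmp_offset);

  while (lo < hi)
    {
      size_t mid = lo + (hi - lo) / 2;

      if (offset < vars[mid].offset)
        hi = mid;
      else if (offset >= vars[mid].offset + vars[mid].size)
        lo = mid + 1;
      else
        return &vars[mid];
    }
  return NULL;
}
```

Deferring the sort keeps each addition cheap; a real implementation would presumably remember whether the array is already sorted rather than re-sorting on every lookup as this sketch does.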
Nick Alcock	097ff012e4	libctf: decl, types: revise ctf_decl, ctf_type_name These all need fairly trivial revisions for prefix types. While we're at it, we can add explicit handling of nonrepresentable types, returning CTF_K_UNKNOWN for such types rather than throwing an error, so that type printing prints (nonrepresentable type) for such types as it always intended to.	2025-04-25 18:07:42 +01:00
Nick Alcock	2c1a0a70d1	libctf, create, types: encoding, BTF floats This adds support for the nearly useless BTF_KIND_FLOAT, under the name CTF_K_BTF_FLOAT. At the same time we fix up the ctf_add_encoding and ctf_type_encoding machinery for the new API changes. I expect this to change a bit: Ali Bahrami reckons I've oversimplified the CTFv4 encoding representation and need to reintroduce at least a width. New API: ctf_id_t ctf_add_btf_float (ctf_dict_t , uint32_t, const char , const ctf_encoding_t *);	2025-04-25 18:07:42 +01:00
Nick Alcock	03609073b0	libctf: create, types: enums and enum64s; type encoding This commit adapts most aspects of enum handling: querying and iteration, enumerator querying and iteration, ctf_type_add, etc. We have to adapt to enum64s and to signed versus unsigned enums, to our vlen and DTD changes and other internal API changes to handle prefix types etc, and fix the types of things to allow for 64-bit enumerators. We can also (finally!) get useful info on enum size rather than being restricted to a value hardwired into libctf. We also adjust all the type-encoding functions for the internal API changes, since enums are the first encodable entities we have covered. API changes: -typedef int ctf_enum_f (const char name, int val, void arg); +typedef int ctf_enum_f (const char name, int64_t val, void arg); +typedef int ctf_unsigned_enum_f (const char name, uint64_t val, void arg); -extern const char ctf_enum_name (ctf_dict_t , ctf_id_t, int); -extern int ctf_enum_value (ctf_dict_t , ctf_id_t, const char , int ); +extern const char ctf_enum_name (ctf_dict_t , ctf_id_t, int64_t); +extern int ctf_enum_value (ctf_dict_t , ctf_id_t, const char , int64_t ); +extern int ctf_enum_unsigned_value (ctf_dict_t , ctf_id_t, const char , uint64_t ); + +/* Return 1 if this enum's contents are unsigned, so you can tell which of the + above functions to use. */ + +extern int ctf_enum_unsigned (ctf_dict_t , ctf_id_t); -/* Return all enumeration constants in a given enum type. */ -extern int ctf_enum_iter (ctf_dict_t , ctf_id_t, ctf_enum_f , void ); +/* Return all enumeration constants in a given enum type. The return value, and + VAL argument, may need to be cast to uint64_t: see ctf_enum_unsigned(). */ +extern int64_t ctf_enum_iter (ctf_dict_t , ctf_id_t, ctf_enum_f , void ); extern const char ctf_enum_next (ctf_dict_t , ctf_id_t, ctf_next_t *, - int ); + int64_t ); + +/* enums are created signed by default. If you want an unsigned enum, + use ctf_add_enum_encoded() with an encoding of 0 (CTF_INT_SIGNED and + everything else off). This will not create a slice, unlike all other + uses of ctf_add_enum_encoded(), and the result is still representable + as BTF. */ + +extern ctf_id_t ctf_add_enum64_encoded (ctf_dict_t , uint32_t, const char , + const ctf_encoding_t ); +extern ctf_id_t ctf_add_enum64 (ctf_dict_t , uint32_t, const char ); -extern int ctf_add_enumerator (ctf_dict_t , ctf_id_t, const char , int); +extern int ctf_add_enumerator (ctf_dict_t , ctf_id_t, const char , int64_t); The only aspects of enums that are not now handled are forwards to enums, dumping of enums, and deduplication of enums.	2025-04-25 18:07:42 +01:00
Nick Alcock	0a3ee49dd0	libctf: types: add ctf_struct_bitfield (NEEDS REVIEW) This new public API function allows you to find out if a struct has the bitfield flag set or not. (There are no other properties specific to a struct, so we needed a new function for it. I am open to a ctf_struct_info() function handing back a struct if people prefer.) New API: int ctf_struct_bitfield (ctf_dict_t *, ctf_id_t);	2025-04-25 18:07:42 +01:00
Nick Alcock	ceb15ece5e	libctf: create: ctf_add_type modifications This adapts ctf_add_type a little, adding support for prefix types, shifting away from hardwired things towards API functions that can adapt to the CTFv4 changes, adapting to the structure/union API changes, and adding bitfielded structures and structure members as needed.	2025-04-25 18:07:42 +01:00
Nick Alcock	4a4312b684	libctf: types: ctf_type_resolve_nonrepresentable This new internal function allows us to say "resolve a type to its base type, but treat type 0 like BTF, returning 0 if it is found rather than erroring with ECTF_NONREPRESENTABLE". Used in the next commit.	2025-04-25 18:07:42 +01:00
Nick Alcock	20e6f72dc7	libctf: create: structure and union member addition There is one API addition here: int ctf_add_member_bitfield (ctf_dict_t , ctf_id_t souid, const char , ctf_id_t type, unsigned long bit_offset, int bit_width); SoU addition handles the representational changes for bitfields and for CTF_K_BIG structs (i.e. all structs you can add members to), errors out if you add bitfields to structs that aren't created with the CTF_ADD_STRUCT_BITFIELDS flag, and arranges to add padding as needed if there is too much of a gap for the offsets to encode in one hop (that part is still untested).	2025-04-25 18:07:42 +01:00
Nick Alcock	cd8ea31666	libctf: create: struct/union addition There's one API addition here: the existing CTF_ADD_ROOT / CTF_ADD_NONROOT flags can have a new flag ORed with them, CTF_ADD_STRUCT_BITFIELDS, indicating that the newly-added struct/union is capable of having bitfields added to it via the new ctf_add_member_bitfield function (see a later commit). Without this, you can only add bitfields via the deprecated slice or base type encoding representations (the former will force CTF output). Implementation notes: structs and unions are always added with a CTF_K_BIG prefix: if promoting from a forward, one is added. These are elided at serialization time if they are not needed to encode this size of struct / this number of members. (This means you don't have to figure out in advance if your struct will be too big for BTF: you can just add members to it, and libctf will figure it out and upgrade the dict as needed, or tell you it can't if you've forbidden such things.) We take advantage of this to merge a couple of very similar functions, saving a bit of code.	2025-04-25 18:07:42 +01:00
Nick Alcock	d5bb2772c6	libctf: create: DTD addition and deletion; ctf_rollback DTD deletion changes mostly relate to the changes to the ctf_dtdef_t, but also we no longer need to have special handling for forwards (we can just use ctf_type_kind_forwarded like everyone else). Rollback no longer needs to delete things by hand (it hasn't needed to for years): it can just call ctf_dtd_delete. ctf_add_generic changes substantially, mostly to allow for the ctf_dtdef_t changes. Rather than returning a type ID it now returns the DTD it just allocated: it can also be asked to add some prefixes, and return the first prefix added (which may not be the first prefix in the type, because if it is asked to add a non-root-visible type it will additionally allocate a CTF_K_CONFLICTING prefix to encode that). Finally, duplicate name detection is suppressed for type and decl tags.	2025-04-25 18:07:42 +01:00
Nick Alcock	05a2970ad1	libctf: create, lookup: delete DVDs; ctf_lookup_by_kind Variable handling in BTF and CTFv4 works quite differently from in CTFv3. Rather than a separate section containing sorted, bsearchable variables, they are simply named entities like types, stored in CTF_K_VARs. As a first stage towards migrating to this, delete most references to the ctf_varent_t and ctf_dvdef_t, including the DVD lookup code, all the linking code, and quite a lot of the serialization code. Note: CTF_LINK_OMIT_VARIABLES_SECTION, and the whole "delete variables that already exist in the symtypetabs section" stuff, has yet to be reimplemented. We can implement CTF_LINK_OMIT_VARIABLES_SECTION by simply excising all CTF_K_VARs at deduplication time if requested. (Note: symtypetabs should still point directly at the type, not at the CTF_K_VAR.) (Symtypetabs in general need a bit more thought -- perhaps we can now store them in a separate .ctf.symtypetab section with its own little four-entry header for the symtypetabs and their indexes, making .ctf even more like .BTF; the only difference would then be that .ctf could include prefix types, CTF_K_FLOAT, and external string refs. For later discussion.) We also add ctf_lookup_by_kind() at this stage (because it is hopelessly diff-entangled with ctf_lookup_variable): this looks up a type of a particular kind, without needing a per-kind lookup function for it, nor needing to hack around adding string prefixes (so you can do ctf_lookup_by_kind (fp, CTF_K_STRUCT, "foo") rather than having to do ctf_lookup_by_name (fp, "struct foo"): often this is more convenient, and anything that reduces string buffer manipulation in C is good.)	2025-04-25 18:07:42 +01:00
Nick Alcock	a93ad066f7	libctf: create: vlen growth and prefix addition (NEEDS REVIEW) This commit modifies ctf_grow_vlen to account for the recent changes to ctf_dtdef_t, and adds a new ctf_add_prefix function to add a prefix to an existing type, moving the dtd_data and dtd_vlen up accordingly. It deserves close review, since this is probably the single greatest bug cluster in libctf: the number of times I added to a variable of type ctf_type_t and assumed it would move it in bytes rather than ctf_type_t units is hard to believe.	2025-04-25 18:07:42 +01:00
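The bug class called out in a93ad066f7 is ordinary C pointer scaling: adding to a record pointer moves in whole records, not bytes. A minimal standalone sketch, assuming a hypothetical 12-byte `fake_type_t` record (not libctf's real `ctf_type_t`) and hypothetical helper names:

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical fixed-size record standing in for a type node.  */
typedef struct fake_type
{
  uint32_t name;
  uint32_t info;
  uint32_t size;
} fake_type_t;

/* Advance by N records: pointer arithmetic scales by
   sizeof (fake_type_t), so this is wrong if N is a byte count.  */
static fake_type_t *
advance_records (fake_type_t *tp, size_t n)
{
  return tp + n;
}

/* Advance by NBYTES bytes: go via a byte pointer so that no scaling
   is applied.  The result is not dereferenced here.  */
static fake_type_t *
advance_bytes (fake_type_t *tp, size_t nbytes)
{
  return (fake_type_t *) ((unsigned char *) tp + nbytes);
}

/* Distance in bytes between two pointers into the same buffer.  */
static size_t
byte_distance (const void *from, const void *to)
{
  assert (from != NULL && to != NULL);
  return (size_t) ((const unsigned char *) to
                   - (const unsigned char *) from);
}
```

With this record layout, `advance_records (tp, 1)` moves `sizeof (fake_type_t)` bytes, while `advance_bytes (tp, 4)` moves exactly 4.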
Nick Alcock	64b65a0a34	libctf: types: struct/union member querying and iteration This commit revises ctf_member_next, ctf_member_iter, ctf_member_count, and ctf_member_info for the new CTFv4 world. This also pulls in a bunch of infrastructure used by most of the type querying functions, and fundamental changes to the way DTD records are represented in libctf (ctf-create not yet adjusted). Other type querying functions affected by changes in struct representation are also changed. There are some API changes here: new bit-width fields in ctf_member_f, ctf_membinfo_t and ctf_member_next, and a fix to the type of the offset in ctf_member_f, ctf_membinfo_t and ctf_member_count. (ctf_member_next got the offset type right already.) ctf_member_f also gets a new ctf_dict_t arg so that you can actually use the member type it passes in without having to package up and pass in the dict type yourself (a frequent need). This change is later echoed in most of the rest of the _f typedefs. typedef struct ctf_membinfo { ctf_id_t ctm_type; /* Type of struct or union member. */ - unsigned long ctm_offset; /* Offset of member in bits. */ + size_t ctm_offset; /* Offset of member in bits. */ + int ctm_bit_width; /* Width of member in bits: -1: not bitfield */ } ctf_membinfo_t; -typedef int ctf_member_f (const char name, ctf_id_t membtype, - unsigned long offset, void arg); +typedef int ctf_member_f (ctf_dict_t , const char name, ctf_id_t membtype, + size_t offset, int bit_width, void arg); extern ssize_t ctf_member_next (ctf_dict_t , ctf_id_t, ctf_next_t , const char name, ctf_id_t membtype, - int flags); + int bit_width, int flags); -int ctf_member_count (ctf_dict_t , ctf_id_t); +ssize_t ctf_member_count (ctf_dict_t , ctf_id_t); 
The DTD changes are that where before the ctf_dtdef_t had a dtd_data which was the ctf_type_t type node for a type, and a separate dtd_vlen which was the vlen buffer which (in the final serialized representation) would directly follow that type, now it has one single buffer, dtd_buf, which consists of a stream of one or more ctf_type_t nodes, followed by a vlen, as it will appear in the final serialized form. This buffer has internal pointers into it: dtd_data is a pointer to the last ctf_type_t in the stream (the true type node, after all prefixes), and dtd_vlen is a pointer to the vlen (precisely one ctf_type_t after the dtd_data). This representation is nice because it means there is even less distinction between a dynamic type added by ctf_add_*() and a static one read directly out of a dict: you can traverse the entire type without caring where it came from, simplifying most of the type querying functions. (There are a few more things in there which will be useful mostly when adding new types: their uses will be seen later.) 
Two new nontrivial functions exist (one of which is annoyingly tangled up in the diff, sorry about that): ctf_find_prefix, which hunts down a given prefix (if it exists) among the possibly many that may exist on a type (so you can ask it to find the CTF_K_BIG prefix for a type if it exists, and it'll return you a pointer to its ctf_type_t record), and ctf_vlen, which you hand a type ID and its ctf_type_t *, and it gives you back a pointer to its vlen and tells you how long it is. (This is one of only two places left in ctf-types.c which cares whether a type is dynamic or not. The other has yet to be added). Almost every function in ctf-types.c will end up calling ctf_lookup_by_id and ctf_vlen in turn. ctf_next_t has changed significantly: the ctn_type member is split in two so that we can tell whether a given iterator works using types or indexes, and we gain the ability to iterate over enum64s, DTDs themselves, and datasecs (most of this will only be used in later commits). The old internal function ctf_struct_member, which handled the distinction between ctf_member_t and ctf_lmember_t, is gone. Instead we have new code that handles the different representation of bitfield versus non-bitfield structs and unions, and more code to handle the different representation of CTF_K_BIG structs and unions (their offsets are the distance from the last offset, rather than the distance from the start of the structure).	2025-04-25 18:07:42 +01:00
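The closing remark in 64b65a0a34, that CTF_K_BIG member offsets are stored as the distance from the previous offset rather than from the start of the structure, amounts to delta encoding. A minimal standalone sketch, assuming hypothetical helper names, a 32-bit delta field, and a first delta measured from zero (the on-disk layout is not given in the message, so these are illustrative choices only):

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Turn N delta-encoded bit offsets into absolute bit offsets by
   keeping a running sum.  */
static void
deltas_to_absolute (const uint32_t *deltas, size_t n, uint64_t *absolute)
{
  uint64_t running = 0;

  assert (deltas != NULL && absolute != NULL);
  for (size_t i = 0; i < n; i++)
    {
      running += deltas[i];
      absolute[i] = running;
    }
}

/* Inverse: turn ascending absolute bit offsets into deltas.  A gap
   too large for one 32-bit delta is rejected here; per 20e6f72dc7 a
   real encoder would instead add padding to bridge the gap.  */
static void
absolute_to_deltas (const uint64_t *absolute, size_t n, uint32_t *deltas)
{
  uint64_t prev = 0;

  assert (absolute != NULL && deltas != NULL);
  for (size_t i = 0; i < n; i++)
    {
      assert (absolute[i] >= prev);
      assert (absolute[i] - prev <= UINT32_MAX);
      deltas[i] = (uint32_t) (absolute[i] - prev);
      prev = absolute[i];
    }
}
```

For example, deltas of 0, 32 and 8 bits decode to absolute offsets 0, 32 and 40.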
Nick Alcock	ad13b7d44f	libctf: CTFv4: type opening The majority of this commit rejigs the core type table opening code for CTFv4: there are a few ancillary bits it drags in, indicated below. The internal definition of a child dict (that may not have type or string lookups performed in it until ctf_open time) used to be 'has a cth_parent_name', but since BTF doesn't have one of those at all, we add an additional check: a dict the first byte of whose strtab is not 0 must be a child. (If either is true, this is a child dict, which allows for the possibility of CTF dicts with non-deduplicated strtabs -- thus with leading \0's -- to exist in future.) The initial sweep through the type table in init_static_types (to size the name-table lookup hashes) also now checks for various types which indicate that this must be a CTF dict, in addition to being adjusted to cater for new CTFv4 representations of things like forwards. (At this early stage, we cannot rely on the functions in ctf-type.c to abstract over this for us.) We make some new hashtables for new namespace-like things: datasecs and type and decl tags. The main name-population loop in init_static_types_names_internal takes prefixes into account, looking for the name on the suffix type (where the name is always found). LSTRUCT handling is removed (they no longer exist); ENUM64s, enum forwards, VARs, datasecs, and type and decl tags get their names suitably populated. Some buggy code which tried to populate the name tables for cvr-quals (which are nameless) was dropped. We add an extra pass which traverses all datasecs and keeps track of which datasec each var is instantiated in (if any) in a new ctf_var_datasecs hash table. (This uses a number of type-querying functions which don't yet exist: they'll be added in the upcoming commits.) 
We handle the type 0 == void case by pointing the first element of ctf_txlate at a type read in named "void" (making type 0 an alias to it), or, if one doesn't exist, creating a new one (outside the type table and dtd arrays), and pointing type 0 at that. Since it is numbered 0 and not in the type table or dtd arrays, it will never be written out at serialization time, but since it is present, libctf consumers who expect the void type to have an integral definition rather than being a magic number will get what they expect.	2025-04-25 18:07:42 +01:00
Nick Alcock	f7d05ab342	libctf: CTFv4: core opening (other than the type table) This commit modifies the core opening code to handle opening CTFv4 and BTF. Much of the back-compatibility side is left for later and is currently untested, as is the type table side of things. We keep the v3 header (if any) stashed away in ctf_dict_t.ctf_v3_header, for the sake of the CTF dumper; we "upgrade" the BTF header to CTF (so that the rest of the code can ignore the distinction, and so that you can do CTFish things like adding symtypetab entries even to things opened as BTF), but keep note of the fact that it was opened as BTF in ctf_dict_t.ctf_opened_btf, so that things like ctf_import can allow for the absence of the various parent-length fields. A couple of ctf_dict_t fields are renamed for consistency with the headers' names for them (ctf_parname becomes ctf_parent_name; ctf_dynparname becomes ctf_dyn_parent_name; ctf_cuname becomes ctf_cu_name). Not all users are yet adjusted.	2025-04-25 18:07:42 +01:00
Nick Alcock	99e9ab4828	libctf: adjust foreign-endian byteswapping for v4 Split into a separate commit because it's not yet really tested. Callers not yet adjusted.	2025-04-25 18:07:41 +01:00
Nick Alcock	7bcd444b9c	libctf, include: debuggability improvements When --enable-libctf-hash-debugging is on, make ctf_set_errno and ctf_set_typed_errno into real functions, not inlines, so you can drop breakpoints on them. Since we are breaking API, also move ECTF_NEXT_END to the start of the _CTF_ERRORS array, so you can check for real (non-ECTF_NEXT_END) errors in breakpoints on those functions by checking for err > 1000.	2025-04-25 18:07:41 +01:00
Nick Alcock	2ef9554023	libctf: ctf-lookup: support prefixes in ctf_lookup_by_id ctf_lookup_by_id now has a new optional suffix argument, which, if set, returns the suffix of a prefixed type: the ctf_type_t it returns remains (as ever) the first one in the type (i.e. it may be a prefix type). This is most convenient because the prefix is the ctf_type_t that LCTF_KIND and other LCTF functions taking ctf_type_t's expect. Callers not yet adjusted.	2025-04-25 18:07:41 +01:00
Nick Alcock	a80b903b45	libctf: simplify ctf_txlate Before now, this critical internal structure was an array mapping from a type ID to the type index of the type with that ID. This was critical for the old world in which ctf_update() reserialized the entire dict, so things moved around in memory all the time: but these days, a ctf_type_t * never moves after creation, so we can just make ctf_txlate an array of ctf_type_t * and be done with it. This lets us point type indexes anywhere in memory, not just to entries in the ctf_buf, which means we can have synthetic ones for various purposes. And we will.	2025-04-25 18:07:41 +01:00
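The idea in a80b903b45, a translation table of record pointers that can also point at synthetic records living outside the file buffer, can be sketched as follows. All names here (`fake_type_t`, `fake_dict_t`, `fake_dict_init`, `fake_lookup`) are hypothetical, and slot 0 is used for the synthetic record purely as an illustration of the type 0 handling described in ad13b7d44f; this is not libctf's actual data structure.

```c
#include <stddef.h>
#include <stdint.h>
#include <stdlib.h>

/* Hypothetical type record.  */
typedef struct fake_type
{
  uint32_t name;
  uint32_t info;
} fake_type_t;

typedef struct fake_dict
{
  fake_type_t *buf;            /* Records read from the file.  */
  size_t nbuf;                 /* Number of records in BUF.  */
  fake_type_t synthetic_void;  /* Lives outside BUF; never serialized.  */
  const fake_type_t **txlate;  /* Index -> record pointer.  */
} fake_dict_t;

/* Build the table: slot 0 points at the synthetic record, slots
   1..NBUF at the records in BUF.  The dict must not be moved
   afterwards, since slot 0 points into it.  Returns 0 on success,
   -1 on allocation failure.  */
static int
fake_dict_init (fake_dict_t *d, fake_type_t *buf, size_t nbuf)
{
  d->txlate = malloc ((nbuf + 1) * sizeof (*d->txlate));
  if (d->txlate == NULL)
    return -1;

  d->buf = buf;
  d->nbuf = nbuf;
  d->synthetic_void.name = 0;
  d->synthetic_void.info = 0;

  d->txlate[0] = &d->synthetic_void;
  for (size_t i = 0; i < nbuf; i++)
    d->txlate[i + 1] = &buf[i];
  return 0;
}

/* Look up a record by index; NULL if out of range.  */
static const fake_type_t *
fake_lookup (const fake_dict_t *d, size_t idx)
{
  return idx <= d->nbuf ? d->txlate[idx] : NULL;
}
```

Because lookups only follow a pointer, callers need not care whether a record came from the buffer or was synthesized.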
Nick Alcock	1d70873382	libctf: dynhash/dynset: a bit of const-correctness A pile of dynhash and dynset functions were requiring non-const hashes/sets unnecessarily. Fix them.	2025-04-25 18:07:41 +01:00
Nick Alcock	40aea6c596	libctf: ctf_next_t.ctn_size: make a size_t Literally every single user would rather this is a size_t, rather than an ssize_t. Change it.	2025-04-25 18:07:41 +01:00
Nick Alcock	6a4a485c7b	libctf: adapt core dictops for v4 and prefix types The heart of libctf's reading code is the ctf_dictops_t and the functions it provides for reading various things no matter what the CTF version in use: these are called via LCTF_*() macros that translate into calls into the dictops. The introduction of prefix types in v4 requires changes here: in particular, we want the ability to get the type kind of whatever ctf_type_t we are looking at (the 'unprefixed' kind), as well as the ability to get the type kind taking prefixes into account: and more generally we want the ability to both look at a given prefix and look at the type as a whole. So several ctf_dictops_t entries are added for this (ctfo_get_prefixed_kind, ctfo_get_prefixed_vlen). This means API changes (no callers yet adjusted, it'll happen as we go), because the existing macros were mostly called with e.g. a ctt_info value and returned a type kind, while now we need to be called with the actual ctf_type_t itself, so we can possibly walk beyond it to find the real type record. ctfo_get_vbytes needs adjusting for this. We also add names to most of the ctf_type_t parameters, because suddenly we can have up to three of them: one relating to the first entry in the type record (which may be a prefix, usually called 'prefix'), one relating to the true type record (which may be a suffix, so usually called 'suffix'), and one possibly relating to some intermediate record if we have multiple prefixes (usually called 'tp'). There is one horrible special case in here: the vlen of the new CTF_K_FUNC_LINKAGE kind (equivalent to BTF_KIND_FUNC) is always zero: it reuses the vlen field to encode the linkage (!). BTF is rife with ugly hacks like this.	2025-04-25 18:07:41 +01:00
Nick Alcock	ab3ad58be9	libctf: don't warn about unused fp in ctf_assert When hash debugging is enabled and NDEBUG is not set, ctf_assert() translates into a true assert(). Don't leave the fp parameter unused in this case (which can cause compiler errors when -Werror is also on).	2025-04-25 18:07:41 +01:00
Nick Alcock	3ae061cfb0	libctf: split out compatibility code The compatibility-opening code is quite voluminous, and is stuck right in the middle of ctf-open.c, rather interfering with maintenance. Split it out into a new ctf-open-compat.c. (Since it is not yet upgraded to support v4, the new file is not added to the build system yet: indeed, even the calls to it haven't been diked out at this stage.)	2025-04-25 18:07:41 +01:00
Nick Alcock	e0490fbc73	include, libctf, binutils: drop labels These have never been implemented properly and don't work with the linker or deduplicator: BTF has nothing like them, so the default assumption should be that we drop them. If we need something like them in future, we can add them back (which we do not expect). Quite a bit of label detritus is left in libctf after this: it's tied up with later changes so will be removed as part of later commits. (Because the entire thing is disabled, the non-compilability of this intermediate state is not a concern.)	2025-04-25 18:07:41 +01:00
Nick Alcock	de5a31a8ca	include, libctf: header and soname changes for CTFv4 These changes bump the current file format version to CTF_VERSION_4, and introduce a new VERSION_5 identical with it to get the version integer and the name identical again. A great many changes are made to account for the changes to handle CTFv4 (which is a BTF superset). libctf will not compile after these changes, which is why it's been diked out of the build system and forced-off until the series is complete. Because all the CTF_K constants have changed values, this is necessarily an ABI break: add a #define to make picking up this break at compile time obvious. Note that the ABI has broken by bumping the soname (deriving it now from libctf/libtool-version) and folding all newer symbols in the symbol version file into a new LIBCTF_2.0 version, which is now the only exported version.	2025-04-25 18:07:41 +01:00
Nick Alcock	13ce9e17b7	libctf: don't include cv-quals or pointers in the name table Even if these types have a name recorded against them, we should ignore it. They don't have names, full stop. libctf/ChangeLog: * ctf-open.c (init_static_types): Drop nameless types when sizing the name table. (init_static_types_names_internal): Never pass in their name.	2025-03-16 15:25:28 +00:00
Nick Alcock	456d5bedcc	types: add some more error checking A few places with inadequate error checking have fallen out of the ctf_id_t work: - ctf_add_slice doesn't make sure that the type it is slicing actually exists - ctf_add_member_offset doesn't check that the type of the member exists (though it will often fail if it doesn't, it doesn't explicitly check, so if you're unlucky it can sometimes succeed, giving you a corrupted dict) - ctf_type_encoding doesn't check whether its sliced type exists: it should verify it so it can return a decent error, rather than a thoroughly misleading one - ctf_type_compat has the same problem with respect to both of its arguments. It would definitely be nicer if we could call ctf_type_compat and just get a boolean answer, but it's not clear to me whether a type can be said to be compatible or incompatible with a nonexistent one, and we should probably alert the users to a likely bug regardless. C error checking, sigh...	2025-03-16 15:25:28 +00:00
Nick Alcock	648f857144	Tiny stylistic spacing and comment tweaks	2025-03-16 15:25:28 +00:00
Nick Alcock	b5d3790c66	libctf: consecutive ctf_id_t assignment This change modifies type ID assignment in CTF so that it works like BTF: rather than flipping the high bit on for types in child dicts, types ascend directly from IDs in the parent to IDs in the child, without interruption (so type 0x4 in the parent is immediately followed by 0x5 in all children). Doing this while retaining useful semantics for modification of parents is challenging. By definition, child type IDs are not known until the parent is written out, but we don't want to find ourselves constrained to adding types to the parent in one go, followed by all child types: that would make the deduplicator a nightmare and would frankly make the entire ctf_add*() interface next to useless: all existing clients that add types at all add types to both parents and children without regard for ordering, and breaking that would probably necessitate redesigning all of them. So we have to be a little cleverer. We approach this the same way as we approach strings in the recent refs rework: if a parent has children attached (or has ever had them attached since it was created or last read in), any new types created in the parent are assigned provisional IDs starting at the very top of the type space and working down. (Their indexes in the internal libctf arrays remain unchanged, so we don't suddenly need multigigabyte indexes!). At writeout (preserialization) time, we traverse the type table (and all other tables containing type IDs) and assign refs to every type ID in exactly the same way we assign refs to every string offset (just a different set of refs -- we don't want to update type IDs with string offset values!). For a parent dict with children, these refs are real entities in memory: pointers to the memory locations where type IDs are stored, tracked in the DTD of each type. 
As we traverse the type table, we assign real IDs to each type (by simple incrementation), storing those IDs in a new dtd_final_type field in the DTD for each type. Once the type table and all other tables containing type IDs are fully traversed, we update all the refs and overwrite the IDs currently residing in each with the final IDs for each type. That fixes up IDs in the parent dict itself (including forward references in structs and the like: that's why the ref updates only happen at the end); but what about child dicts' references, both to parent types and to their own? We add armouring to enforce that parent dicts are always serialized before their children (which ctf-link.c already does, because it's a precondition for strtab deduplication), and then arrange that when a ref is added to a type whose ID has been assigned (has a dtd_final_type), we just immediately do an update rather than storing a ref for later updating. Since the parent is already serialized, all parent type IDs have a dtd_final_type by this point, and all parent IDs in the children are properly updated. The child types can now be renumbered now that we know the number of types in the parent, and their refs updated identically to what was just done with the parent. One wrinkle: before the child refs are updated, while we are working over the child's type section, the type IDs in the child start from 1 (or something like that), which might seem to overlap the parent IDs. But this is not the case: when you serialize the parent, the IDs written out to disk are changed, but the only change to the representation in memory is that we remember a dtd_final_type for each type (and use it to update all the child type refs): its ID in memory is the same as it always was, a nonoverlapping provisional ID higher than any other valid ID. 
We enforce all of this by asserting that when you add a ref to a type, the memory location that is modified must be in the buffer being serialized: the code will not let you accidentally modify the actual DTDs in memory. We track the number of types in the parent in a new CTFv4 (not BTF) header field (the dumper is updated): we will also use this to open CTFv3 child dicts without change by simply declaring for them that the parent dict has 2^31 types in it (or 2^15, for v2 and below): the IDs in the children then naturally come out right with no other changes needed. (Right now, opening CTFv3 child dicts requires extra compatibility code that has not been written, but that code will no longer need to worry about type ID differences.) Various things are newly forbidden: - you cannot ctf_import() a child into a parent if you already ctf_add()ed types to the child, because all its IDs would change (and since you already cannot ctf_add() types to a child that hasn't had its parent imported, this in practice means only that ctf_create() must be followed immediately by a ctf_import() if this is a new child, which all sane clients were doing anyway). - You cannot import a child into a parent which has the wrong number of (non-provisional) types, again because all its IDs would be wrong: because parents only add types in the provisional space if children are attached to it, this would break the not unknown case of opening an archive, adding types to the parent, and only then importing children into it, so we add a special case: archive members which are not children in an archive with more than one member always pretend to have at least one child, so type additions in them are always provisional even before you ctf_import anything. In practice, this does exactly what we want, since all archives so far are created by the linker and have one parent and N children of that parent. 
Because this introduces huge gaps between index and type ID for provisional types, some extra assertions are added to ensure that the internal ctf_type_to_index() is only ever called on types in the current dict (never a parent dict): before now, this was just taken on trust, and it was often wrong (which at best led to wrong results, as wrong array indexes were used, and at worst to a buffer overflow). When hash debugging is on (suggesting that the user doesn't mind expensive checks), every ctf_type_to_index() triggers a ctf_index_to_type() to make sure that the operations are proper inverses. Lots and lots of tests are added to verify that assignment works and that updating of every type kind works fine -- existing tests suffice for type IDs in the variable and symtypetab sections. The ld-ctf tests get a bunch of largely display-based updates: various tests refer to 0x8... type IDs, which no longer exist, and because the IDs are shorter all the spacing and alignment has changed.	2025-03-16 15:25:27 +00:00
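The renumbering described in the commit above can be illustrated with a toy model. This is a sketch under stated assumptions, not libctf code: every `toy_*` name is hypothetical, the provisional space is assumed to start at `UINT32_MAX`, child in-memory IDs are assumed to start at 1, and a linear search stands in for libctf's ref lists. It shows final IDs being assigned consecutively (parent first, then child), IDs being rewritten only in the output buffers, and the parent's in-memory provisional IDs staying untouched so they never collide with the child's.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical toy model; none of these names exist in libctf.  */
#define TOY_PROVISIONAL_TOP UINT32_MAX  /* Assumed top of the type space.  */

typedef struct toy_type
{
  uint32_t id;        /* In-memory ID (provisional in the parent).  */
  uint32_t ref_id;    /* In-memory ID of a referenced type, 0 if none.  */
  uint32_t final_id;  /* Assigned at serialization time, 0 until then.  */
} toy_type_t;

typedef struct toy_dict
{
  toy_type_t *types;
  size_t ntypes;
  struct toy_dict *parent;  /* NULL for a parent dict.  */
} toy_dict_t;

/* Hand out provisional IDs from the top of the space, working down.  */
static uint32_t
toy_next_provisional (uint32_t *next)
{
  return (*next)--;
}

/* Map an in-memory ID to its final ID, looking in D and then its parent.  */
static uint32_t
toy_final_id (const toy_dict_t *d, uint32_t id)
{
  for (; d != NULL; d = d->parent)
    for (size_t i = 0; i < d->ntypes; i++)
      if (d->types[i].id == id)
        return d->types[i].final_id;
  return 0;
}

/* Pass 1 assigns consecutive final IDs; pass 2 writes final IDs into the
   output buffers only, so forward references resolve and the in-memory
   IDs are never modified.  */
static void
toy_serialize (toy_dict_t *d, uint32_t *out_ids, uint32_t *out_refs)
{
  uint32_t first = 1;

  if (d->parent != NULL)
    {
      /* The parent must already have been serialized.  */
      assert (d->parent->ntypes == 0 || d->parent->types[0].final_id != 0);
      first = (uint32_t) d->parent->ntypes + 1;
    }

  for (size_t i = 0; i < d->ntypes; i++)
    d->types[i].final_id = first + (uint32_t) i;

  for (size_t i = 0; i < d->ntypes; i++)
    {
      out_ids[i] = d->types[i].final_id;
      out_refs[i] = d->types[i].ref_id
	? toy_final_id (d, d->types[i].ref_id) : 0;
    }
}

/* Returns 1 if a two-type parent and one-type child renumber as expected.  */
int
toy_demo (void)
{
  uint32_t next = TOY_PROVISIONAL_TOP;
  toy_type_t ptypes[2] = { { 0, 0, 0 }, { 0, 0, 0 } };
  toy_type_t ctypes[1] = { { 1, 0, 0 } };
  uint32_t pids[2], prefs[2], cids[1], crefs[1];

  ptypes[0].id = toy_next_provisional (&next);
  ptypes[1].id = toy_next_provisional (&next);
  ptypes[0].ref_id = ptypes[1].id;	/* Forward reference.  */
  ctypes[0].ref_id = ptypes[0].id;	/* Child refers to a parent type.  */

  toy_dict_t parent = { ptypes, 2, NULL };
  toy_dict_t child = { ctypes, 1, &parent };

  toy_serialize (&parent, pids, prefs);
  toy_serialize (&child, cids, crefs);

  return pids[0] == 1 && pids[1] == 2 && prefs[0] == 2 && prefs[1] == 0
    && cids[0] == 3 && crefs[0] == 1
    && ptypes[0].id == TOY_PROVISIONAL_TOP;
}
```

In this model the child's first type receives ID 3 because the parent holds two types, mirroring the "0x4 in the parent is immediately followed by 0x5 in all children" rule.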
Nick Alcock	274cc1f13d	libctf: fix ctf_type_pointer on parent dicts, etc Before now, ctf_type_pointer was crippled: it returned some type (if any) that was a pointer to the type passed in, but only if both types were in the current dict: if either (or both) was in the parent dict, it said there was no pointer though there was. This breaks real users: it's past time to lift the restriction. WIP (complete, but not yet tested).	2025-02-28 15:13:24 +00:00
Nick Alcock	3737d9200d	libctf: don't call ctf_type_to_index with types in other dicts ctf_type_to_index has never given meaningful results when called with dicts in which the specified type does not reside: its only purpose is to return the offset in various dict-internal arrays in which this type is located, so doing so makes no sense. Stop ctf_lookup_by_name and refresh_pptrtab (which it calls) from doing so. As part of this, refactor ctf_lookup_by_name so that it's a bit less repetitive and squirrelly.	2025-02-28 15:13:24 +00:00
Nick Alcock	beccf36b88	libctf: move string deduplication into ctf-archive This means that any archive containing dicts can get its strings dedupped together, rather than only those that are ctf_linked. (For now, we are still constrained to ctf_linked archives, since fixing that requires further changes to ctf_dedup_strings: but this gives us the first half of what is necessary.) libctf/ * ctf-link.c (ctf_link_write): Move string dedup into... * ctf-archive.c (ctf_arc_preserialize): ... this new function. (ctf_arc_write_fd): Call it.	2025-02-28 15:13:24 +00:00
Nick Alcock	5a1d8eca5c	libctf: fix slices of slices and of enums Slices had a bunch of horrible usability problems. In particular, while towers of cv-quals are resolved away by functions that need to do it, towers of cv-quals with slices in the middle are not resolved away by functions like ctf_enum_value that can see through slices: resolving volatile -> slice -> const -> enum will leave it with a 'const', which will error pointlessly, annoying callers, who reasonably expect slices to be more invisible than this. (The user-callable ctf_type_resolve still does not resolve away slices, because this is the only way users can see that the slices are there at all.) This is induced by a fix for another wart: ctf_add_enumerator does not resolve anything away at all, so you can't even add enumerators to const or volatile enums -- and more problematically, you can't add enumerators to enums with an explicit encoding without resolving away the types by hand, since ctf_add_enum_encoded works by returning a slice! ctf_add_enumerator now resolves away all of those, so any cvr-or-typedef-or-slice-qual terminating in an enum can be added to, exactly as callers likely expect. (New tests added.) libctf/ * ctf-create.c (ctf_add_enumerator): Resolve away cvr-qualness. * ctf-types.c (ctf_type_resolve_unsliced): Don't terminate at the first slice. * testsuite/libctf-writable/slice-of-slice.*: New test.	2025-02-28 15:13:24 +00:00
Nick Alcock	a480362d88	libctf: string: refs rework This commit moves provisional (not-yet-serialized) string refs towards the scheme to be used for CTF IDs in the future. In particular - provisional string offsets now count downwards from just under the external string offset space (all bits on but the high bit). This makes it possible to detect an overflowing strtab, and also makes it trivial to determine whether any string offset (ref) updates were missed -- where before we might get a slightly corrupted or incorrect string, we now get a huge high strtab offset corresponding to no string, and an error is emitted at read time. - refs are emitted at serialization time during the pass through the types. They are strictly associated with the newly-written-out buffer: the existing opened CTF dict is not changed, though it does still get the new strtab so that new refs to the same string can just refer directly to it. The provisional strtab hash table that contains these strings is not deleted after serialization (because we might serialize again): instead, we keep track in the parent of the lowest-yet-used ("latest") provisional strtab offset, and any strtab offset above that, but not external (high-bit-on) is considered provisional. This is sort-of-enforced by moving most of the ref-addition function declarations (including ctf_str_add_ref) to a new ctf-ref.h, which is not included by ctf-create.c or ctf-open.c. - because we don't add refs when adding types, we don't need to handle the case where we add things to expanding vlens (enums, struct members) and have to realloc() them. So the entire painful movable refs system can just be deleted, along with the ability to remove refs piecemeal at all (purging all of them is still possible). Strings added during type addition are added via ctf_str_add(), which adds no refs: the strings are picked up at serialization time and refs to their final, serialized resting place added. 
The DTDs never have any refs in them, and their provisional strtab offsets are never updated by the ref system. This caused several bugs to fall out of the earlier work and get fixed. In particular, attempts to look up a string in a child dict now search the parent's provisional strtab too: we add some extra special casing for the null string so we don't need to worry about deduplication moving it somewhere other than offset zero. Finally, the optimization that removes an unreferenced synthetic external strtab (the record of the strings the linker has told us about, kept around internally for lookup during late serialization) is faulty: references to a strtab entry will only produce CTF-level refs if their value might change, and an external string's offset won't change, so it produces no refs: worse yet, even if we did get a ref (say, if the string was originally believed to be internal and only later were we told that the linker knew about it too), when we serialize a strtab, all its refs are dropped (since they've been updated and can no longer change); so if we serialized it a second time, its synthetic external strtab would be considered empty and dropped, even though the same external strings as before still exist, referencing it. We must keep the synthetic external strtab around as long as external strings exist that reference it, i.e. for the life of the dict. One benefit of all this: now we're emitting provisional string offsets at a really high value, it's out of the way of the consecutive, deduplicated string offsets in child dicts. So we can drop the constraint that you cannot add strings to a dict with children, which allows us to add types freely to parent dicts again. What you can't do is write that dict out again: when we serialize, we currently update the dict being serialized with the updated strtabs: when you write a dict out, its provisional strings become real strings, and suddenly the offsets would overlap once more. 
But opening a dict and its children, adding to it, and then writing it out again is rare indeed, and we have a workaround: anyone wanting to do this can just use ctf_link instead.	2025-02-28 15:13:24 +00:00
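The offset scheme in the commit above can be sketched as a small classifier. This is an illustration under assumptions, not the libctf implementation: the `toy_*` names are hypothetical, each provisional string is assumed to consume exactly one offset, and "above the lowest-yet-used offset" is read as inclusive. External offsets have the high bit set, provisional offsets count down from just below that, and everything lower is an ordinary serialized offset.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical toy model; none of these names exist in libctf.  */
#define TOY_STR_EXTERNAL_BIT 0x80000000u

typedef enum
{
  TOY_STR_REAL,
  TOY_STR_PROVISIONAL,
  TOY_STR_EXTERNAL
} toy_str_kind_t;

typedef struct toy_strtab
{
  uint32_t lowest_prov;  /* Lowest provisional offset handed out so far;
			    TOY_STR_EXTERNAL_BIT when none has been.  */
  uint32_t real_len;	 /* Size in bytes of the serialized strtab.  */
} toy_strtab_t;

/* Hand out the next provisional offset, counting downwards.  Returns 0
   (never a provisional offset) if the provisional space would run into
   the serialized strtab, i.e. the strtab has overflowed.  */
uint32_t
toy_str_new_provisional (toy_strtab_t *st)
{
  if (st->lowest_prov <= st->real_len || st->lowest_prov <= 1)
    return 0;
  return --st->lowest_prov;
}

/* Classify an offset as external, provisional, or real.  */
toy_str_kind_t
toy_str_classify (const toy_strtab_t *st, uint32_t off)
{
  if (off & TOY_STR_EXTERNAL_BIT)
    return TOY_STR_EXTERNAL;
  if (off >= st->lowest_prov)
    return TOY_STR_PROVISIONAL;
  return TOY_STR_REAL;
}

/* At read time, a non-external offset beyond the end of the strtab
   corresponds to no string: a provisional offset whose update was missed.  */
int
toy_str_valid_on_read (uint32_t real_len, uint32_t off)
{
  return (off & TOY_STR_EXTERNAL_BIT) != 0 || off < real_len;
}
```

Because a missed update leaves a very large offset in the output, `toy_str_valid_on_read` rejects it outright instead of silently returning a neighbouring string.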
Nick Alcock	97a72b2a35	libctf: create: fix vlen / vbytes confusion The initial_vlen parameter to ctf_add_generic is misnamed: it's not the initial vlen (the initial number of members of a struct, etc), but rather the initial size of the vlen region. We have a term for that, vbytes: use it. Amazingly this doesn't seem to have caused any bugs to creep in.	2025-02-28 15:13:24 +00:00
Nick Alcock	dc93d01ff2	libctf: de-macroize LCTF_TYPE_TO_INDEX / LCTF_INDEX_TO_TYPE Making these functions is unnecessary right now, but will become much clearer shortly. While we're at it, we can drop the third child argument to LCTF_INDEX_TO_TYPE: it's only used for nontrivial purposes that aren't literally the same as getting the result from the fp in one place, in ctf_lookup_by_name_internal, and that place is easily fixed by just looking in the right dictionary in the first place.	2025-02-28 15:13:24 +00:00
Nick Alcock	003f19bfa7	libctf: make ctf_dynamic_type() the inverse of ctf_static_type() They're meant to be inverses, which makes it unfortunate that they check different bounds. No visible effect yet, since ctf_typemax and ctf_stypes currently cover the entire type ID space, but will have an effect shortly.	2025-02-28 15:13:24 +00:00
Nick Alcock	b875301e74	libctf: drop LCTF_TYPE_ISPARENT/LCTF_TYPE_ISCHILD Parent/child determination is about to become rather more complex, making a macro impractical. Use the ctf_type_isparent/ischild function calls everywhere and remove the macro. Make them more const-correct too, to make them more widely usable. While we're about it, change several places that hand-implemented ctf_get_dict() to call it instead, and armour several functions against the null returns that were always possible in this case (but previously unprotected-against).	2025-02-28 15:13:24 +00:00
Nick Alcock	9835747b21	libctf: generalize the ref system Despite the removal of the separate movable ref list, the ref system as a whole is more than complex enough to be worth generalizing now that we are adding different kinds of ref. Refs now are lists of uint32_t * which can be updated through the pointer for all entries in the list and moved to new sites for all pointers in a given range: they are no longer references to string offsets in particular and can be references to other uint32_t-sized things instead (note that ctf_id_t is a typedef to a uint32_t). ctf-string.c has been adjusted accordingly (the adjustments are tiny, more or less just turning a bunch of references to atom into &atom->csa_refs).	2025-02-28 15:13:24 +00:00
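A generalized ref list of the kind this commit describes might look roughly like the following. It is a minimal sketch with hypothetical `toy_*` names and a plain singly linked list, which is an assumption and not necessarily libctf's data structure. It covers the two operations the message names: updating every referenced `uint32_t` through its pointer, and moving all refs that point into a given range to a new site.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <stdlib.h>

/* Hypothetical toy model; none of these names exist in libctf.  */
typedef struct toy_ref
{
  struct toy_ref *next;
  uint32_t *site;	/* Location holding a string offset, type ID, etc.  */
} toy_ref_t;

typedef struct toy_ref_list
{
  toy_ref_t *head;
} toy_ref_list_t;

/* Record SITE as a place to update later.  Returns 0 on success, -1 on
   allocation failure.  */
int
toy_ref_add (toy_ref_list_t *l, uint32_t *site)
{
  toy_ref_t *r = malloc (sizeof (*r));

  if (r == NULL)
    return -1;
  r->site = site;
  r->next = l->head;
  l->head = r;
  return 0;
}

/* Write VALUE through every recorded pointer.  */
void
toy_ref_update_all (const toy_ref_list_t *l, uint32_t value)
{
  for (const toy_ref_t *r = l->head; r != NULL; r = r->next)
    *r->site = value;
}

/* Retarget every ref pointing into [SRC, SRC + NBYTES) to the same
   relative position within DEST (e.g. after a buffer was reallocated).  */
void
toy_ref_move_range (toy_ref_list_t *l, void *src, size_t nbytes, void *dest)
{
  uintptr_t s = (uintptr_t) src;

  for (toy_ref_t *r = l->head; r != NULL; r = r->next)
    {
      uintptr_t p = (uintptr_t) r->site;

      if (p >= s && p < s + nbytes)
	r->site = (uint32_t *) ((unsigned char *) dest + (p - s));
    }
}

/* Drop all refs at once.  */
void
toy_ref_purge (toy_ref_list_t *l)
{
  while (l->head != NULL)
    {
      toy_ref_t *next = l->head->next;

      free (l->head);
      l->head = next;
    }
}
```

Since the list stores nothing but `uint32_t *`, the same code serves string offsets and type IDs alike, which is the point of the generalization.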
Nick Alcock	69d4f6d74c	libctf, string: remove movable refs properly Ever since pending refs were replaced with movable refs, we were failing to remove movable ref backpointers properly on ctf_remove_ref. I don't see how this could cause any problem but a memory leak, but since we do ultimately write down refs, leaking references to refs is still risky: best to fix this.	2025-02-28 15:13:23 +00:00
Nick Alcock	21f748e1e3	libctf, string: delete separate movable ref storage again This was added last year to let us maintain a backpointer to the movable refs dynhash in movable ref atoms without spending space for the backpointer on the majority of (non-movable) refs and also without causing an atom which had some refs movable and some refs not movable to dereference unallocated storage when freed. The backpointer's only purpose was to let us locate the ctf_str_movable_refs dynhash during item freeing, when we had nothing but a pointer to the atom being freed. Now we have a proper freeing arg, we don't need the backpointer at all: we can just pass a pointer to the dict in to the atoms dynhash as a freeing arg for the atom freeing functions, and throw the whole backpointer and separate movable ref list complexity away.	2025-02-28 15:13:23 +00:00


437 Commits