Commit Graph

45 Commits

Author SHA1 Message Date
Nick Alcock
c14bdfc7a4 libctf: serialize: kind suppression and prohibition
The CTF serialization machinery decides whether to write out a dict as BTF
or CTF (or, in LIBCTF_BTM_BTF mode, whether to write out a dict or fail with
ECTF_NOTBTF) in part by looking at the type kinds in the dictionary.

It is possible that you'd like to extend this check and ban specific type
kinds from the dictionary (possibly even if it's CTF); it's also possible
that you'd like to *not* fail even if a CTF-only kind is found, but rather
replace it with a still-valid stub (CTF_K_UNKNOWN / BTF_KIND_UNKNOWN) and
keep going.  (The kernel's btfarchive machinery does this to ensure that
the compiler and previous link stages have emitted only valid BTF type
kinds.)

ctf_write_suppress_kind supports both these use cases:

+int ctf_write_suppress_kind (ctf_dict_t *fp, int kind, int prohibited);

This commit adds only the core population code: the actual suppression is
spread across the serializer and will be added in the next commits.
2025-04-25 18:07:44 +01:00
Nick Alcock
f7f72bcca6 libctf, open: new API for getting the size of CTF/BTF file sections
I wrote this for BTF type size querying programs, but it might be
of more general use and it's impossible to get this info in any
other way, so we might want to keep it.

New API:
+size_t ctf_sect_size (ctf_dict_t *, ctf_sect_names_t sect);
2025-04-25 18:07:43 +01:00
Nick Alcock
4837852527 libctf: types: access to raw type data
This new API lets users ask for the raw type data associated with a type
(either the whole lot including prefixes, or just the suffix if this is not
a CTF_K_BIG type), and then they can manipulate it using ctf.h functions
or whatever else they like.  Doing this does not preclude using libctf
querying functions at the same time (just don't change the type!  It's
const for a reason).

New API:

+const ctf_type_t *ctf_type_data (ctf_dict_t *, ctf_id_t, int prefix);

This function was unimplementable before the DTD changes, because the
ctf_type_t and vlen were separated in memory: but now they're always stored
in a single buffer, it's reliable and simple, indeed trivial.
2025-04-25 18:07:43 +01:00
Nick Alcock
d5dd8997b3 libctf: create, types: conflicting types
The conflicting type kind is a CTF-specific prefix kind consisting purely of
an optional translation unit name.  It takes the place of the old hidden
bit: we have already seen it used to prefix types added with a
CTF_ADD_NONROOT flag.  The deduplicator will also use them to label
conflicting types from different TUs smushed into the same dict by the
CU-mapping mechanism: unlike the hidden bit, with this scheme users can tell
which CUs the conflicting types came from.

New API:

+int ctf_type_conflicting (ctf_dict_t *, ctf_id_t, const char **cuname);
+int ctf_set_conflicting (ctf_dict_t *, ctf_id_t, const char *);

(Frankly I expect ctf_set_conflicting to be used only by deduplicators and
things like that, but if we provide an option to query something we should
also provide an option to produce it...)
2025-04-25 18:07:43 +01:00
Nick Alcock
fb8917ac21 libctf, create, types: type and decl tags
These are a little more fiddly than previous kinds, because their
namespacing rules are odd: they have names (so presumably we want an API to
look them up by name), but the names are not unique (they don't need to be,
because they are not entities you can refer to from C), so many distinct
tags in the same TU can have the same name.  Type tags only refer to a type
ID: decl tags refer to a specific function parameter or structure member via
a zero-indexed "component index".

The name tables for these things are a hash of name to a set of type IDs;
rather different from all the other named entities in libctf.  As a
consequence, they can presently be looked up only using their own dedicated
functions, not using ctf_lookup_by_name et al.  (It's not clear if this
restriction could ever be lifted: ctf_lookup_by_name and friends return a
type ID, not a set of them.)

They are similar enough to each other that we can at least have one function
to look up both type and decl tags if you don't care about their
component_idx and only want a type ID: ctf_tag.  (And one to iterate over
them, ctf_tag_next).

(A caveat: because tags aren't widely used or generated yet, much of this is
more or less untested and/or supposition and will need testing later.)

New API, more or less the minimum needed because it's not entirely clear how
these things will be used:

+ctf_id_t ctf_tag (ctf_dict_t *, ctf_id_t tag);
+ctf_id_t ctf_decl_tag (ctf_dict_t *, ctf_id_t decl_tag,
+		       int64_t *component_idx);
+ctf_id_t ctf_tag_next (ctf_dict_t *, const char *tag, ctf_next_t **);
+ctf_id_t ctf_add_type_tag (ctf_dict_t *, uint32_t, ctf_id_t, const char *);
+ctf_id_t ctf_add_decl_type_tag (ctf_dict_t *, uint32_t, ctf_id_t, const char *);
+ctf_id_t ctf_add_decl_tag (ctf_dict_t *, uint32_t, ctf_id_t, const char *,
+			   int component_idx);
2025-04-25 18:07:43 +01:00
Nick Alcock
39cdb3e395 libctf: create, types: functions, linkage, arg names (API ADVICE)
Functions change in CTFv4 by growing argument names as well as argument
types; the representation changes into a two-element array of (type, string
offset) rather than a simple array of arg types.  Functions also gain an
explicit linkage in a different type kind (CTF_K_FUNC_LINKAGE, which
corresponds to BTF_KIND_FUNC).

New API:

 typedef struct ctf_funcinfo {
 /* ... */
-  uint32_t ctc_argc;		/* Number of typed arguments to function.  */
+  size_t ctc_argc;		/* Number of typed arguments to function.  */
};

int ctf_func_arg_names (ctf_dict_t *, unsigned long, uint32_t, const char **);
int ctf_func_type_arg_names (ctf_dict_t *, ctf_id_t, uint32_t,
		 	    const char **names);
+extern int ctf_type_linkage (ctf_dict_t *, ctf_id_t);
-extern ctf_id_t ctf_add_function (ctf_dict_t *, uint32_t,
-				  const ctf_funcinfo_t *, const ctf_id_t *);
+extern ctf_id_t ctf_add_function (ctf_dict_t *, uint32_t,
+				  const ctf_funcinfo_t *, const ctf_id_t *,
+				  const char **arg_names);
+extern ctf_id_t ctf_add_function_linkage (ctf_dict_t *, uint32_t,
+					  ctf_id_t, const char *, int linkage);

Adding this is fairly straightforward; the only annoying part is the way the
callers need to allocate space for the arg name and type arrays.  Maybe we
should rethink these into something like ctf_type_aname(), allocating
space for the caller so the caller doesn't need to?  It would certainly
make all the callers in libctf much less complex...

While we're at it, adjust ctf_type_reference, ctf_type_align, and
ctf_type_size for the new internal API changes (they also all have
special-case code for functions).
2025-04-25 18:07:43 +01:00
Nick Alcock
a632f3ed33 libctf, types: ctf_type_kind_{iter,next} et al
These new functions let you iterate over types by kind, letting you get all
variables, all enums, all datasecs, etc.  (This is amenable to future
optimization, and some is expected shortly.)

We also add new iternal functions ctf_type_kind_{forwarded_,unsliced_,}tp
which are like the corresponding non-_tp functions except that they
take a ctf_type_t rather than a type ID: doing this allows the deduplicator
to use these nearly-public functions more.  The public ctf_type_kind*
functions are reimplemented in terms of these.

This machinery is the principal place where the magic encoding of forwards
is encoded.
2025-04-25 18:07:43 +01:00
Nick Alcock
ea21a1b2ae libctf: create, types: variables and datasecs (REVIEW NEEDED)
This is an area of significant difference from CTFv3.  The API changes
significantly, with quite a few additions to allow creation and querying of
these new datasec entities:

-typedef int ctf_variable_f (const char *name, ctf_id_t type, void *arg);
+typedef int ctf_variable_f (ctf_dict_t *, const char *name, ctf_id_t type,
+			    void *arg);
+typedef int ctf_datasec_var_f (ctf_dict_t *fp, ctf_id_t type, size_t offset,
+			       size_t datasec_size, void *arg);

+/* Search a datasec for a variable covering a given offset.
+
+   Errors with ECTF_NODATASEC if not found.  */
+
+ctf_id_t ctf_datasec_var_offset (ctf_dict_t *fp, ctf_id_t datasec,
+				 uint32_t offset);
+
+/* Return the datasec that a given variable appears in, or ECTF_NODATASEC if
+   none.  */
+
+ctf_id_t ctf_variable_datasec (ctf_dict_t *fp, ctf_id_t var);

+int ctf_datasec_var_iter (ctf_dict_t *, ctf_id_t, ctf_datasec_var_f *,
+			  void *);
+ctf_id_t ctf_datasec_var_next (ctf_dict_t *, ctf_id_t, ctf_next_t **,
+			       size_t *size, size_t *offset);

-int ctf_add_variable (ctf_dict_t *, const char *, ctf_id_t);
+/* ctf_add_variable adds variables to no datasec at all;
+   ctf_add_section_variable adds them to the given datasec, or to no datasec at
+   all if the datasec is NULL.  */
+
+ctf_id_t ctf_add_variable (ctf_dict_t *, const char *, int linkage, ctf_id_t);
+ctf_id_t ctf_add_section_variable (ctf_dict_t *, uint32_t,
+				   const char *datasec, const char *name,
+				   int linkage, ctf_id_t type,
+				   size_t size, size_t offset);

We tie datasecs quite closely to variables at addition (and, as should
become clear later, dedup) time: you never create datasecs, you only create
variables *in* datasecs, and the datasec springs into existence when you do
so: datasecs are always found in the same dict as the variables they contain
(the variables are never in the parent if the datasec is in a child or
anything).  We keep track of the variable->datasec mapping in
ctf_var_datasecs (populating it at addition and open time), to allow
ctf_variable_datasec to work at reasonable speed.  (But, as yet, there are
no tests of this function at all.)

The datasecs are created unsorted (to avoid variable addition becoming
O(n^2)) and sorted at serialization time, and when ctf_datasec_var_offset is
invoked.

We reuse the natural-alignment code from struct addition to get a plausible
offset in datasecs if an alignment of -1 is specified: maybe this is
unnecessary now (it was originally added when ctf_add_variable added
variables to a "default datasec", while now it just leaves them out of
all datasecs, like externs are).

One constraint of this is that we currently prohibit the addition of
nonrepresentable-typed variables, because we can't tell what their natural
alignment is: if we dropped the whole "align" and just required everyone
adding a variable to a datasec to specify an offset, we could drop that
restriction. WDYT?

One additional caveat: right now, ctf_lookup_variable() looks up the type of
a variable (because when it was invented, variables were not entities in
themselves that you could look up).  This name is confusing as hell as a
result.  It might be less confusing to make it return the CTF_K_VAR, but
that would be awful to adapt callers to, since both are represented with
ctf_id_t's, so the compiler wouldn't warn about the needed change at all...
I've vacillated on this three or four times now.
2025-04-25 18:07:43 +01:00
Nick Alcock
2c1a0a70d1 libctf, create, types: encoding, BTF floats
This adds support for the nearly useless BTF_KIND_FLOAT, under the name
CTF_K_BTF_FLOAT.  At the same time we fix up the ctf_add_encoding and
ctf_type_encoding machinery for the new API changes.

I expect this to change a bit: Ali Bahrami reckons I've oversimplified the
CTFv4 encoding representation and need to reintroduce at least a width.

New API:

ctf_id_t ctf_add_btf_float (ctf_dict_t *, uint32_t,
                            const char *, const ctf_encoding_t *);
2025-04-25 18:07:42 +01:00
Nick Alcock
03609073b0 libctf: create, types: enums and enum64s; type encoding
This commit adapts most aspects of enum handling: querying and iteration,
enumerator querying and iteration, ctf_type_add, etc.  We have to adapt to
enum64s and to signed versus unsigned enums, to our vlen and DTD changes and
other internal API changes to handle prefix types etc, and fix the types of
things to allow for 64-bit enumerators.  We can also (finally!) get useful
info on enum size rather than being restricted to a value hardwired into
libctf.

We also adjust all the type-encoding functions for the internal API changes,
since enums are the first encodable entities we have covered.

API changes:

-typedef int ctf_enum_f (const char *name, int val, void *arg);
+typedef int ctf_enum_f (const char *name, int64_t val, void *arg);
+typedef int ctf_unsigned_enum_f (const char *name, uint64_t val, void *arg);

-extern const char *ctf_enum_name (ctf_dict_t *, ctf_id_t, int);
-extern int ctf_enum_value (ctf_dict_t *, ctf_id_t, const char *, int *);
+extern const char *ctf_enum_name (ctf_dict_t *, ctf_id_t, int64_t);
+extern int ctf_enum_value (ctf_dict_t *, ctf_id_t, const char *, int64_t *);
+extern int ctf_enum_unsigned_value (ctf_dict_t *, ctf_id_t, const char *, uint64_t *);
+
+/* Return 1 if this enum's contents are unsigned, so you can tell which of the
+   above functions to use.  */
+
+extern int ctf_enum_unsigned (ctf_dict_t *, ctf_id_t);

-/* Return all enumeration constants in a given enum type.  */
-extern int ctf_enum_iter (ctf_dict_t *, ctf_id_t, ctf_enum_f *, void *);
+/* Return all enumeration constants in a given enum type.  The return value, and
+   VAL argument, may need to be cast to uint64_t: see ctf_enum_unsigned().  */
+extern int64_t ctf_enum_iter (ctf_dict_t *, ctf_id_t, ctf_enum_f *, void *);
 extern const char *ctf_enum_next (ctf_dict_t *, ctf_id_t, ctf_next_t **,
-				  int *);
+				  int64_t *);
+
+/* enums are created signed by default.  If you want an unsigned enum,
+   use ctf_add_enum_encoded() with an encoding of 0 (CTF_INT_SIGNED and
+   everything else off).  This will not create a slice, unlike all other
+   uses of ctf_add_enum_encoded(), and the result is still representable
+   as BTF.  */
+
+extern ctf_id_t ctf_add_enum64_encoded (ctf_dict_t *, uint32_t, const char *,
+					const ctf_encoding_t *);
+extern ctf_id_t ctf_add_enum64 (ctf_dict_t *, uint32_t, const char *);

-extern int ctf_add_enumerator (ctf_dict_t *, ctf_id_t, const char *, int);
+extern int ctf_add_enumerator (ctf_dict_t *, ctf_id_t, const char *, int64_t);

The only aspects of enums that are not now handled are forwards to enums,
dumping of enums, and deduplication of enums.
2025-04-25 18:07:42 +01:00
Nick Alcock
0a3ee49dd0 libctf: types: add ctf_struct_bitfield (NEEDS REVIEW)
This new public API function allows you to find out if a struct has the
bitfield flag set or not.  (There are no other properties specific to a
struct, so we needed a new function for it.  I am open to a
ctf_struct_info() function handing back a struct if people prefer.)

New API:

int ctf_struct_bitfield (ctf_dict_t *, ctf_id_t);
2025-04-25 18:07:42 +01:00
Nick Alcock
20e6f72dc7 libctf: create: structure and union member addition
There is one API addition here:

int ctf_add_member_bitfield (ctf_dict_t *, ctf_id_t souid,
                             const char *, ctf_id_t type,
                             unsigned long bit_offset,
                             int bit_width);

SoU addition handles the representational changes for bitfields and for
CTF_K_BIG structs (i.e. all structs you can add members to), errors out if
you add bitfields to structs that aren't created with the
CTF_ADD_STRUCT_BITFIELDS flag, and arranges to add padding as needed if
there is too much of a gap for the offsets to encode in one hop (that
part is still untested).
2025-04-25 18:07:42 +01:00
Nick Alcock
05a2970ad1 libctf: create, lookup: delete DVDs; ctf_lookup_by_kind
Variable handling in BTF and CTFv4 works quite differently from in CTFv3.
Rather than a separate section containing sorted, bsearchable variables,
they are simply named entities like types, stored in CTF_K_VARs.

As a first stage towards migrating to this, delete most references to
the ctf_varent_t and ctf_dvdef_t, including the DVD lookup code, all
the linking code, and quite a lot of the serialization code.

Note: CTF_LINK_OMIT_VARIABLES_SECTION, and the whole "delete variables that
already exist in the symtypetabs section" stuff, has yet to be
reimplemented.  We can implement CTF_LINK_OMIT_VARIABLES_SECTION by simply
excising all CTF_K_VARs at deduplication time if requested.  (Note:
symtypetabs should still point directly at the type, not at the CTF_K_VAR.)

(Symtypetabs in general need a bit more thought -- perhaps we can now store
them in a separate .ctf.symtypetab section with its own little four-entry
header for the symtypetabs and their indexes, making .ctf even more like
.BTF; the only difference would then be that .ctf could include prefix
types, CTF_K_FLOAT, and external string refs.  For later discussion.)

We also add ctf_lookup_by_kind() at this stage (because it is hopelessly
diff-entangled with ctf_lookup_variable): this looks up a type of a
particular kind, without needing a per-kind lookup function for it,
nor needing to hack around adding string prefixes (so you can do
ctf_lookup_by_kind (fp, CTF_K_STRUCT, "foo") rather than having to
do ctf_lookup_by_name (fp, "struct foo"): often this is more convenient, and
anything that reduces string buffer manipulation in C is good.)
2025-04-25 18:07:42 +01:00
Nick Alcock
e0490fbc73 include, libctf, binutils: drop labels
These have never been implemented properly and don't work with the linker or
deduplicator: BTF has nothing like them, so the default assumption should be
that we drop them.  If we need something like them in future, we can add
them back (which we do not expect).

Quite a bit of label detritus is left in libctf after this: it's tied up
with later changes so will be removed as part of later commits.  (Because
the entire thing is disabled, the non-compilability of this intermediate
state is not a concern.)
2025-04-25 18:07:41 +01:00
Nick Alcock
de5a31a8ca include, libctf: header and soname changes for CTFv4
These changes bump the current file format version to CTF_VERSION_4, and
introduce a new VERSION_5 identical with it to get the version integer and
the name identical again.  A great many changes are made to account for
the changes to handle CTFv4 (which is a BTF superset).

libctf will not compile after these changes, which is why it's been diked
out of the build system and forced-off until the series is complete.

Because all the CTF_K constants have changed values, this is necessarily an
ABI break: add a #define to make picking up this break at compile time
obvious.

Note that the ABI has broken by bumping the soname (deriving it now from
libctf/libtool-version) and folding all newer symbols in the symbol version
file into a new LIBCTF_2.0 version, which is now the only exported version.
2025-04-25 18:07:41 +01:00
Alan Modra
e8e7cf2abe Update year range in copyright notice of binutils files 2025-01-01 18:29:57 +10:30
Nick Alcock
6da9267482 libctf, include: add ctf_dict_set_flag: less enum dup checking by default
The recent change to detect duplicate enum values and return ECTF_DUPLICATE
when found turns out to perturb a great many callers.  In particular, the
pahole-created kernel BTF has the same problem we historically did, and
gleefully emits duplicated enum constants in profusion.  Handling the
resulting duplicate errors from BTF -> CTF converters reasonably is
unreasonably difficult (it amounts to forcing them to skip some types or
reimplement the deduplicator).

So let's step back a bit.  What we care about mostly is that the
deduplicator treat enums with conflicting enumeration constants as
conflicting types: programs that want to look up enumeration constant ->
value mappings using the new APIs to do so might well want the same checks
to apply to any ctf_add_* operations they carry out (and since they're
*using* the new APIs, added at the same time as this restriction was
imposed, there is likely to be no negative consequence of this).

So we want some way to allow processes that know about duplicate detection
to opt into it, while allowing everyone else to stay clear of it: but we
want ctf_link to get this behaviour even if its caller has opted out.

So add a new concept to the API: dict-wide CTF flags, set via
ctf_dict_set_flag, obtained via ctf_dict_get_flag.  They are not bitflags
but simple arbitrary integers and an on/off value, stored in an unspecified
manner (the one current flag, we translate into an LCTF_* flag value in the
internal ctf_dict ctf_flags word). If you pass in an invalid flag or value
you get a new ECTF_BADFLAG error, so the caller can easily tell whether
flags added in future are valid with a particular libctf or not.

We check this flag in ctf_add_enumerator, and set it around the link
(including on child per-CU dicts).  The newish enumerator-iteration test is
souped up to check the semantics of the flag as well.

The fact that the flag can be set and unset at any time has curious
consequences. You can unset the flag, insert a pile of duplicates, then set
it and expect the new duplicates to be detected, not only by
ctf_add_enumerator but also by ctf_lookup_enumerator.  This means we now
have to maintain the ctf_names and conflicting_enums enum-duplication
tracking as new enums are added, not purely as the dict is opened.
Move that code out of init_static_types_internal and into a new
ctf_track_enumerator function that addition can also call.

(None of this affects the file format or serialization machinery, which has
to be able to handle duplicate enumeration constants no matter what.)

include/
	* ctf-api.h (CTF_ERRORS) [ECTF_BADFLAG]: New.
	(ECTF_NERR): Update.
	(CTF_STRICT_NO_DUP_ENUMERATORS): New flag.
	(ctf_dict_set_flag): New function.
	(ctf_dict_get_flag): Likewise.

libctf/
	* ctf-impl.h (LCTF_STRICT_NO_DUP_ENUMERATORS): New flag.
	(ctf_track_enumerator): Declare.
	* ctf-dedup.c (ctf_dedup_emit_type): Set it.
	* ctf-link.c (ctf_create_per_cu): Likewise.
	(ctf_link_deduplicating_per_cu): Likewise.
	(ctf_link): Likewise.
	(ctf_link_write): Likewise.
	* ctf-subr.c (ctf_dict_set_flag): New function.
	(ctf_dict_get_flag): New function.
	* ctf-open.c (init_static_types_internal): Move enum tracking to...
	* ctf-create.c (ctf_track_enumerator): ... this new function.
	(ctf_add_enumerator): Call it.
	* libctf.ver: Add the new functions.
	* testsuite/libctf-lookup/enumerator-iteration.c: Test them.
2024-07-31 21:02:05 +01:00
Nick Alcock
2fa4b6e6df libctf, include: new functions for looking up enumerators
Three new functions for looking up the enum type containing a given
enumeration constant, and optionally that constant's value.

The simplest, ctf_lookup_enumerator, looks up a root-visible enumerator by
name in one dict: if the dict contains multiple such constants (which is
possible for dicts created by older versions of the libctf deduplicator),
ECTF_DUPLICATE is returned.

The next simplest, ctf_lookup_enumerator_next, is an iterator which returns
all enumerators with a given name in a given dict, whether root-visible or
not.

The most elaborate, ctf_arc_lookup_enumerator_next, finds all
enumerators with a given name across all dicts in an entire CTF archive,
whether root-visible or not, starting looking in the shared parent dict;
opened dicts are cached (as with all other ctf_arc_*lookup functions) so
that repeated use does not incur repeated opening costs.

All three of these return enumerator values as int64_t: unfortunately, API
compatibility concerns prevent us from doing the same with the other older
enum-related functions, which all return enumerator constant values as ints.
We may be forced to add symbol-versioning compatibility aliases that fix the
other functions in due course, bumping the soname for platforms that do not
support such things.

ctf_arc_lookup_enumerator_next is implemented as a nested ctf_archive_next
iterator, and inside that, a nested ctf_lookup_enumerator_next iterator
within each dict.  To aid in this, add support to ctf_next_t iterators for
iterators that are implemented in terms of two simultaneous nested iterators
at once.  (It has always been possible for callers to use as many nested or
semi-overlapping ctf_next_t iterators as they need, which is one of the
advantages of this style over the _iter style that calls a function for each
thing iterated over: the iterator change here permits *ctf_next_t iterators
themselves* to be implemented by iterating using multiple other iterators as
part of their internal operation, transparently to the caller.)

Also add a testcase that tests all these functions (which is fairly easy
because ctf_arc_lookup_enumerator_next is implemented in terms of
ctf_lookup_enumerator_next) in addition to enumeration addition in
ctf_open()ed dicts, ctf_add_enumerator duplicate enumerator addition, and
conflicting enumerator constant deduplication.

include/
	* ctf-api.h (ctf_lookup_enumerator): New.
	(ctf_lookup_enumerator_next): Likewise.
	(ctf_arc_lookup_enumerator_next): Likewise.

libctf/
	* libctf.ver: Add them.
	* ctf-impl.h (ctf_next_t) <ctn_next_inner>: New.
	* ctf-util.c (ctf_next_copy): Copy it.
        (ctf_next_destroy): Destroy it.
	* ctf-lookup.c (ctf_lookup_enumerator): New.
	(ctf_lookup_enumerator_next): New.
	* ctf-archive.c (ctf_arc_lookup_enumerator_next): New.
	* testsuite/libctf-lookup/enumerator-iteration.*: New test.
	* testsuite/libctf-lookup/enum-ctf-2.c: New test CTF, used by the
	  above.
2024-06-18 13:20:32 +01:00
Nicholas Vinson
d8e1bca7eb libctf: Remove undefined functions from ver. map
Starting with ld.lld-17, ld.lld is invoked with the option
--no-undefined-version enabled by default. Furthermore, The functions
ctf_label_set() and ctf_label_get() are not defined. Their inclusion in
libctf/libctf.ver causes ld.lld-17 to fail emitting the following error
messages:

ld.lld: error: version script assignment of 'LIBCTF_1.0' to symbol 'ctf_label_set' failed: symbol not defined
ld.lld: error: version script assignment of 'LIBCTF_1.0' to symbol 'ctf_label_get' failed: symbol not defined

This patch fixes the issue by removing the symbol names from
libctf/libctf.ver.

[nca: fused in later commit that marked ctf_arc_open as libctf
      only as well.  Added ChangeLog entry.]

Signed-off-by: Nicholas Vinson <nvinson234@gmail.com>

libctf/
	* libctf.ver: drop nonexistent label functions: mark
	ctf_arc_open as libctf-only.
2024-04-19 16:14:48 +01:00
Alan Modra
fd67aa1129 Update year range in copyright notice of binutils files
Adds two new external authors to etc/update-copyright.py to cover
bfd/ax_tls.m4, and adds gprofng to dirs handled automatically, then
updates copyright messages as follows:

1) Update cgen/utils.scm emitted copyrights.
2) Run "etc/update-copyright.py --this-year" with an extra external
   author I haven't committed, 'Kalray SA.', to cover gas testsuite
   files (which should have their copyright message removed).
3) Build with --enable-maintainer-mode --enable-cgen-maint=yes.
4) Check out */po/*.pot which we don't update frequently.
2024-01-04 22:58:12 +10:30
Alan Modra
d87bef3a7b Update year range in copyright notice of binutils files
The newer update-copyright.py fixes file encoding too, removing cr/lf
on binutils/bfdtest2.c and ld/testsuite/ld-cygwin/exe-export.exp, and
embedded cr in binutils/testsuite/binutils-all/ar.exp string match.
2023-01-01 21:50:11 +10:30
Alan Modra
a2c5833233 Update year range in copyright notice of binutils files
The result of running etc/update-copyright.py --this-year, fixing all
the files whose mode is changed by the script, plus a build with
--enable-maintainer-mode --enable-cgen-maint=yes, then checking
out */po/*.pot which we don't update frequently.

The copy of cgen was with commit d1dd5fcc38ead reverted as that commit
breaks building of bfp opcodes files.
2022-01-02 12:04:28 +10:30
Nick Alcock
ea9c200911 libctf: try several possibilities for linker versioning flags
Checking for linker versioning by just grepping ld --help output for
mentions of --version-script is inadequate now that Solaris 11.4
implements a --version-script with different semantics.  Try linking a
test program with a small wildcard-using version script with each
supported set of flags in turn, to make sure that linker versioning is
not only advertised but actually works.

The Solaris "GNU-compatible" linker versioning is not quite
GNU-compatible enough, but we can work around the differences by
generating a new version script that removes the comments from the
original (Solaris ld requires #-style comments), and making another
version script for libctf-nonbfd in particular which doesn't mention any
of the symbols that appear in libctf.la, to avoid Solaris ld introducing
corresponding new NOTYPE symbols to match the version script.

libctf/ChangeLog
2021-09-27  Nick Alcock  <nick.alcock@oracle.com>

	PR libctf/27967
	* configure.ac (VERSION_FLAGS): Replace with...
	(ac_cv_libctf_version_script): ... this multiple test.
	(VERSION_FLAGS_NOBFD): Substitute this too.
	* Makefile.am (libctf_nobfd_la_LDFLAGS): Use it.  Split out...
	(libctf_ldflags_nover): ... non-versioning flags here.
	(libctf_la_LDFLAGS): Use it.
	* libctf.ver: Give every symbol not in libctf-nobfd a comment on
	the same line noting as much.
2021-09-27 20:31:24 +01:00
Nick Alcock
49da556c65 libctf, include: support an alternative encoding for nonrepresentable types
Before now, types that could not be encoded in CTF were represented as
references to type ID 0, which does not itself appear in the
dictionary. This choice is annoying in several ways, principally that it
forces generators and consumers of CTF to grow special cases for types
that are referenced in valid dicts but don't appear.

Allow an alternative representation (which will become the only
representation in format v4) whereby nonrepresentable types are encoded
as actual types with kind CTF_K_UNKNOWN (an already-existing kind
theoretically but not in practice used for padding, with value 0).
This is backward-compatible, because CTF_K_UNKNOWN was not used anywhere
before now: it was used in old-format function symtypetabs, but these
were never emitted by any compiler and the code to handle them in libctf
likely never worked and was removed last year, in favour of new-format
symtypetabs that contain only type IDs, not type kinds.

In order to link this type, we need an API addition to let us add types
of unknown kind to the dict: we let them optionally have names so that
GCC can emit many different unknown types and those types with identical
names will be deduplicated together.  There are also small tweaks to the
deduplicator to actually dedup such types, to let opening of dicts with
unknown types with names work, to return the ECTF_NONREPRESENTABLE error
on resolution of such types (like ID 0), and to print their names as
something useful but not a valid C identifier, mostly for the sake of
the dumper.

Tests added in the next commit.

include/ChangeLog
2021-05-06  Nick Alcock  <nick.alcock@oracle.com>

	* ctf.h (CTF_K_UNKNOWN): Document that it can be used for
	nonrepresentable types, not just padding.
	* ctf-api.h (ctf_add_unknown): New.

libctf/ChangeLog
2021-05-06  Nick Alcock  <nick.alcock@oracle.com>

	* ctf-open.c (init_types): Unknown types may have names.
	* ctf-types.c (ctf_type_resolve): CTF_K_UNKNOWN is as
	non-representable as type ID 0.
	(ctf_type_aname): Print unknown types.
	* ctf-dedup.c (ctf_dedup_hash_type): Do not early-exit for
	CTF_K_UNKNOWN types: they have real hash values now.
	(ctf_dedup_rwalk_one_output_mapping): Treat CTF_K_UNKNOWN types
	like other types with no referents: call the callback and do not
	skip them.
	(ctf_dedup_emit_type): Emit via...
	* ctf-create.c (ctf_add_unknown): ... this new function.
	* libctf.ver (LIBCTF_1.2): Add it.
2021-05-06 09:30:59 +01:00
Nick Alcock
f4f60336da libctf, include: find types of symbols by name
The existing ctf_lookup_by_symbol and ctf_arc_lookup_symbol functions
suffice to look up the types of symbols if the caller already has a
symbol number.  But the caller often doesn't have one of those and only
knows the name of the symbol: also, in object files, the caller might
not have a useful symbol number in any sense (and neither does libctf:
the 'symbol number' we use in that case literally starts at 0 for the
lexicographically first-sorted symbol in the symtypetab and counts those
symbols, so it corresponds to nothing useful).

This means that even though object files have a symtypetab (generated by
the compiler or by ld -r), the only way we can look up anything in it is
to iterate over all symbols in turn with ctf_symbol_next until we find
the one we want.

This is unhelpful and pointlessly inefficient.

So add a pair of functions to look up symbols by name in a dict and in a
whole archive: ctf_lookup_by_symbol_name and ctf_arc_lookup_symbol_name.
These are identical to the existing functions except that they take
symbol names rather than symbol numbers.

To avoid insane repetition, we do some refactoring in the process, so
that both ctf_lookup_by_symbol and ctf_arc_lookup_symbol turn into thin
wrappers around internal functions that do both lookup by symbol index
and lookup by name.  This massively reduces code duplication because
even the existing lookup-by-index stuff wants to use a name sometimes
(when looking up in indexed sections), and the new lookup-by-name stuff
has to turn it into an index sometimes (when looking up in non-indexed
sections): doing it this way lets us share most of that.

The actual name->index lookup is done by ctf_lookup_symbol_idx.  We do
not anticipate this lookup to be as heavily used as ld.so symbol lookup
by many orders of magnitude, so using the ELF symbol hashes would
probably take more time to read them than is saved by using the hashes,
and it adds a lot of complexity.  Instead, do a linear search for the
symbol name, caching all the name -> index mappings as we go, so that
future searches are likely to hit in the cache.  To avoid having to
repeat this search over and over in a CTF archive when
ctf_arc_lookup_symbol_name is used, have cached archive lookups (the
sort done by ctf_arc_lookup_symbol* and the ctf_archive_next iterator)
pick out the first dict they cache in a given archive and store it in a
new ctf_archive field, ctfi_crossdict_cache.  This can be used to store
cross-dictionary cached state that depends on things like the ELF symbol
table rather than the contents of any one dict.  ctf_lookup_symbol_idx
then caches its name->index mappings in the dictionary named in the
crossdict cache, if any, so that ctf_lookup_symbol_idx in other dicts
in the same archive benefit from the previous linear search, and the
symtab only needs to be scanned at most once.

(Note that if you call ctf_lookup_by_symbol_name in one specific dict,
and then follow it with a ctf_arc_lookup_symbol_name, the former will
not use the crossdict cache because it's only populated by the dict
opens in ctf_arc_lookup_symbol_name. This is harmless except for a small
one-off waste of memory and time: it's only a cache, after all.  We can
fix this later by using the archive caching machinery more
aggressively.)

In ctf-archive, we do similar things, turning ctf_arc_lookup_symbol into
a wrapper around a new function that does both index -> ID and name ->
ID lookups across all dicts in an archive.  We add a new
ctfi_symnamedicts cache that maps symbol names to the ctf_dict_t * that
it was found in (so that linear searches for symbols don't need to be
repeated): but we also *remove* a cache, the ctfi_syms cache that was
memoizing the actual ctf_id_t returned from every call to
ctf_arc_lookup_symbol.  This is pointless: all it saves is one call to
ctf_lookup_by_symbol, and that's basically an array lookup and nothing
more so isn't worth caching.  (Equally, given that symbol -> index
mappings are cached by ctf_lookup_by_symbol_name, those calls are nearly
free after the first call, so there's no point caching the ctf_id_t in
that case either.)

We fix up one test that was doing manual symbol lookup to use
ctf_arc_lookup_symbol instead, and enhance it to check that the caching
layer is not totally broken: we also add a new test to do lookups in a
.o file, and another to do lookups in an archive with conflicted types
and make sure that sort of multi-dict lookup is actually working.

include/ChangeLog
2021-02-17  Nick Alcock  <nick.alcock@oracle.com>

	* ctf-api.h (ctf_arc_lookup_symbol_name): New.
	(ctf_lookup_by_symbol_name): Likewise.

libctf/ChangeLog
2021-02-17  Nick Alcock  <nick.alcock@oracle.com>

	* ctf-impl.h (ctf_dict_t) <ctf_symhash>: New.
	<ctf_symhash_latest>: Likewise.
	(struct ctf_archive_internal) <ctfi_crossdict_cache>: New.
	<ctfi_symnamedicts>: New.
	<ctfi_syms>: Remove.
	(ctf_lookup_symbol_name): Remove.
	* ctf-lookup.c (ctf_lookup_symbol_name): Propagate errors from
	parent properly.  Make static.
	(ctf_lookup_symbol_idx): New, linear search for the symbol name,
	cached in the crossdict cache's ctf_symhash (if available), or
	this dict's (otherwise).
	(ctf_try_lookup_indexed): Allow the symname to be passed in.
	(ctf_lookup_by_symbol): Turn into a wrapper around...
	(ctf_lookup_by_sym_or_name): ... this, supporting name lookup too,
	using ctf_lookup_symbol_idx in non-writable dicts.  Special-case
	name lookup in dynamic dicts without reported symbols, which have
	no symtab or dynsymidx but where name lookup should still work.
	(ctf_lookup_by_symbol_name): New, another wrapper.
	* ctf-archive.c (enosym): Note that this is present in
	ctfi_symnamedicts too.
	(ctf_arc_close): Adjust for removal of ctfi_syms.  Free the
	ctfi_symnamedicts.
	(ctf_arc_flush_caches): Likewise.
	(ctf_dict_open_cached): Memoize the first cached dict in the
	crossdict cache.
	(ctf_arc_lookup_symbol): Turn into a wrapper around...
	(ctf_arc_lookup_sym_or_name): ... this.  No longer cache
	ctf_id_t lookups: just call ctf_lookup_by_symbol as needed (but
	still cache the dicts those lookups succeed in).  Add
	lookup-by-name support, with dicts of successful lookups cached in
	ctfi_symnamedicts.  Refactor the caching code a bit.
	(ctf_arc_lookup_symbol_name): New, another wrapper.
	* ctf-open.c (ctf_dict_close): Free the ctf_symhash.
	* libctf.ver (LIBCTF_1.2): New version.  Add
	ctf_lookup_by_symbol_name, ctf_arc_lookup_symbol_name.
	* testsuite/libctf-lookup/enum-symbol.c (main): Use
	ctf_arc_lookup_symbol rather than looking up the name ourselves.
	Fish it out repeatedly, to make sure that symbol caching isn't
	broken.
	(symidx_64): Remove.
	(symidx_32): Remove.
	* testsuite/libctf-lookup/enum-symbol-obj.lk: Test symbol lookup
	in an unlinked object file (indexed symtypetab sections only).
	* testsuite/libctf-writable/symtypetab-nonlinker-writeout.c
	(try_maybe_reporting): Check symbol types via
	ctf_lookup_by_symbol_name as well as ctf_symbol_next.
	* testsuite/libctf-lookup/conflicting-type-syms.*: New test of
	lookups in a multi-dict archive.
2021-02-20 16:37:08 +00:00
Alan Modra
250d07de5c Update year range in copyright notice of binutils files 2021-01-01 10:31:05 +10:30
Nick Alcock
53651de80f libctf, include: support foreign-endianness symtabs with CTF
The CTF symbol lookup machinery added recently has one deficit: it
assumes the symtab is in the machine's native endianness.  This is
always true when the linker is writing out symtabs (because cross
linkers byteswap symbols only after libctf has been called on them), but
may be untrue in the cross case when the linker or another tool
(objdump, etc) is reading them.

Unfortunately the easy way to model this to the caller, as an endianness
field in the ctf_sect_t, is precluded because doing so would change the
size of the ctf_sect_t, which would be an ABI break.  So, instead, allow
the endianness of the symtab to be set after open time, by calling one
of the two new API functions ctf_symsect_endianness (for ctf_dict_t's)
or ctf_arc_symsect_endianness (for entire ctf_archive_t's).  libctf
calls these functions automatically for objects opened via any of the
BFD-aware mechanisms (ctf_bfdopen, ctf_bfdopen_ctfsect, ctf_fdopen,
ctf_open, or ctf_arc_open), but the various mechanisms that just take
raw ctf_sect_t's will assume the symtab is in native endianness and need
a later call to ctf_*symsect_endianness to adjust it if needed.  (This
call is basically free if the endianness is actually native: it only
costs anything if the symtab endianness was previously guessed wrong,
and there is a symtab, and we are using it directly rather than using
symtab indexing.)

Obviously, calling ctf_lookup_by_symbol or ctf_symbol_next before the
symtab endianness is correctly set will probably give wrong answers --
but you can set it at any time as long as it is before then.

include/ChangeLog
2020-11-23  Nick Alcock  <nick.alcock@oracle.com>

	* ctf-api.h: Style nit: remove () on function names in comments.
	(ctf_sect_t): Mention endianness concerns.
	(ctf_symsect_endianness): New declaration.
	(ctf_arc_symsect_endianness): Likewise.

libctf/ChangeLog
2020-11-23  Nick Alcock  <nick.alcock@oracle.com>

	* ctf-impl.h (ctf_dict_t) <ctf_symtab_little_endian>: New.
	(struct ctf_archive_internal) <ctfi_symsect_little_endian>: Likewise.
	* ctf-create.c (ctf_serialize): Adjust for new field.
	* ctf-open.c (init_symtab): Note the semantics of repeated calls.
	(ctf_symsect_endianness): New.
	(ctf_bufopen_internal): Set ctf_symtab_little_endian suitably for
	the native endianness.
	(_Static_assert): Moved...
	(swap_thing): ... with this...
	* swap.h: ... to here.
	* ctf-util.c (ctf_elf32_to_link_sym): Use it, byteswapping the
	Elf32_Sym if the ctf_symtab_little_endian demands it.
	(ctf_elf64_to_link_sym): Likewise swap the Elf64_Sym if needed.
	* ctf-archive.c (ctf_arc_symsect_endianness): New, set the
	endianness of the symtab used by the dicts in an archive.
	(ctf_archive_iter_internal): Initialize to unknown (assumed native,
	do not call ctf_symsect_endianness).
	(ctf_dict_open_by_offset): Call ctf_symsect_endianness if need be.
	(ctf_dict_open_internal): Propagate the endianness down.
	(ctf_dict_open_sections): Likewise.
	* ctf-open-bfd.c (ctf_bfdopen_ctfsect): Get the endianness from the
	struct bfd and pass it down to the archive.
	* libctf.ver: Add ctf_symsect_endianness and
	ctf_arc_symsect_endianness.
2020-11-25 19:11:35 +00:00
Nick Alcock
97a2a623d0 libctf, include: add ctf_getsymsect and ctf_getstrsect
libctf has long provided ctf_getdatasect, which hands back a pointer to
the CTF section a (read-only) dict came from.  But it has no such
functions to return pointers to the ELF symbol table or string table
it's working from, which is unfortunate because several libctf functions
(ctf_open, ctf_fdopen, and ctf_bfdopen) figure out which string and
symbol table to use themselves, and don't tell the user what they
decided, so the caller can't agree on which symtab to use with libctf
even if it wanted to.

Add a pair of functions to return the symtab and strtab in use.  Like
ctf_getdatasect, these return ctf_sect_t structures by value, filled
with all-NULL/0 content if a symtab or strtab is not being used.

include/ChangeLog
2020-11-20  Nick Alcock  <nick.alcock@oracle.com>

	* ctf-api.h (ctf_getsymsect): New.
	(ctf_getstrsect): Likewise.

libctf/ChangeLog
2020-11-20  Nick Alcock  <nick.alcock@oracle.com>

	* ctf-open.c (ctf_getsymsect): New.
	(ctf_getstrsect): Likewise.
	* libctf.ver: Add them.
2020-11-20 13:34:12 +00:00
Nick Alcock
2c78e92523 libctf, include: CTF-archive-wide symbol lookup
CTF archives may contain multiple dicts, each of which contain many
types and possibly a bunch of symtypetab entries relating to those
types: each symtypetab entry is going to appear in exactly one dict,
with the corresponding entries in the other dicts empty (either pads, or
indexed symtypetabs that do not mention that symbol).  But users of
libctf usually want to get back the type associated with a symbol
without having to dig around to find out which dict that type might be
in.

This adds machinery to do that -- and since you probably want to do it
repeatedly, it adds internal caching to the ctf-archive machinery so
that iteration over archives via ctf_archive_next and repeated symbol
lookups do not have to repeatedly reopen the archive.  (Iteration using
ctf_archive_iter will gain caching soon.)

Two new API functions:

ctf_dict_t *
ctf_arc_lookup_symbol (ctf_archive_t *arc, unsigned long symidx,
		       ctf_id_t *typep, int *errp);

This looks up the symbol with index SYMIDX in the archive ARC, returning
the dictionary in which it resides and optionally the type index as
well.  Errors are returned in ERRP.  The dict should be
ctf_dict_close()d when done, but is also cached inside the ctf_archive
so that the open cost is only paid once.  The result of the symbol
lookup is also cached internally, so repeated lookups of the same symbol
are nearly free.

void ctf_arc_flush_caches (ctf_archive_t *arc);

Flush all the caches. Done at close time, but also available as an API
function if users want to do it by hand.

include/ChangeLog
2020-11-20  Nick Alcock  <nick.alcock@oracle.com>

	* ctf-api.h (ctf_arc_lookup_symbol): New.
	(ctf_arc_flush_caches): Likewise.
	* ctf.h: Document new auto-ctf_import behaviour.

libctf/ChangeLog
2020-11-20  Nick Alcock  <nick.alcock@oracle.com>

	* ctf-impl.h (struct ctf_archive_internal) <ctfi_dicts>: New, dicts
	the archive machinery has opened and cached.
	<ctfi_symdicts>: New, cache of dicts containing symbols looked up.
	<ctfi_syms>: New, cache of types of symbols looked up.
	* ctf-archive.c (ctf_arc_close): Free them on close.
	(enosym): New, flag entry for 'symbol not present'.
	(ctf_arc_import_parent): New, automatically import the parent from
	".ctf" if this is a child in an archive and ".ctf" is present.
	(ctf_dict_open_sections): Use it.
	(ctf_archive_iter_internal): Likewise.
	(ctf_cached_dict_close): New, thunk around ctf_dict_close.
	(ctf_dict_open_cached): New, open and cache a dict.
	(ctf_arc_flush_caches): New, flush the caches.
	(ctf_arc_lookup_symbol): New, look up a symbol in (all members of)
	an archive, and cache the lookup.
	(ctf_archive_iter): Note the new caching behaviour.
	(ctf_archive_next): Use ctf_dict_open_cached.
	* libctf.ver: Add ctf_arc_lookup_symbol and ctf_arc_flush_caches.
2020-11-20 13:34:11 +00:00
Nick Alcock
1136c37971 libctf: symbol type linking support
This adds facilities to write out the function info and data object
sections, which efficiently map from entries in the symbol table to
types.  The write-side code is entirely new: the read-side code was
merely significantly changed and support for indexed tables added
(pointed to by the no-longer-unused cth_objtidxoff and cth_funcidxoff
header fields).

With this in place, you can use ctf_lookup_by_symbol to look up the
types of symbols of function and object type (and, as before, you can
use ctf_lookup_variable to look up types of file-scope variables not
present in the symbol table, as long as you know their name: but
variables that are also data objects are now found in the data object
section instead.)

(Compatible) file format change:

The CTF spec has always said that the function info section looks much
like the CTF_K_FUNCTIONs in the type section: an info word (including an
argument count) followed by a return type and N argument types. This
format is suboptimal: it means function symbols cannot be deduplicated
and it causes a lot of ugly code duplication in libctf.  But
conveniently the compiler has never emitted this!  Because it has always
emitted a rather different format that libctf has never accepted, we can
be sure that there are no instances of this function info section in the
wild, and can freely change its format without compatibility concerns or
a file format version bump.  (And since it has never been emitted in any
code that generated any older file format version, either, we need keep
no code to read the format as specified at all!)

So the function info section is now specified as an array of uint32_t,
exactly like the object data section: each entry is a type ID in the
type section which must be of kind CTF_K_FUNCTION, the prototype of
this function.

This allows function types to be deduplicated and also correctly encodes
the fact that all functions declared in C really are types available to
the program: so they should be stored in the type section like all other
types.  (In format v4, we will be able to represent the types of static
functions as well, but that really does require a file format change.)

We introduce a new header flag, CTF_F_NEWFUNCINFO, which is set if the
new function info format is in use.  A sufficiently new compiler will
always set this flag.  New libctf will always set this flag: old libctf
will refuse to open any CTF dicts that have this flag set.  If the flag
is not set on a dict being read in, new libctf will disregard the
function info section.  Format v4 will remove this flag (or, rather, the
flag has no meaning there and the bit position may be recycled for some
other purpose).

New API:

Symbol addition:
  ctf_add_func_sym: Add a symbol with a given name and type.  The
                    type must be of kind CTF_K_FUNCTION (a function
                    pointer).  Internally this adds a name -> type
                    mapping to the ctf_funchash in the ctf_dict.
  ctf_add_objt_sym: Add a symbol with a given name and type.  The type
                    kind can be anything, including function pointers.
		    This adds to ctf_objthash.

These both treat symbols as name -> type mappings: the linker associates
symbol names with symbol indexes via the ctf_link_shuffle_syms callback,
which sets up the ctf_dynsyms/ctf_dynsymidx/ctf_dynsymmax fields in the
ctf_dict.  Repeated relinks can add more symbols.

Variables that are also exposed as symbols are removed from the variable
section at serialization time.

CTF symbol type sections which have enough pads, defined by
CTF_INDEX_PAD_THRESHOLD (whether because they are in dicts with symbols
where most types are unknown, or in archive where most types are defined
in some child or parent dict, not in this specific dict) are sorted by
name rather than symidx and accompanied by an index which associates
each symbol type entry with a name: the existing ctf_lookup_by_symbol
will map symbol indexes to symbol names and look the names up in the
index automatically.  (This is currently ELF-symbol-table-dependent, but
there is almost nothing specific to ELF in here and we can add support
for other symbol table formats easily).

The compiler also uses index sections to communicate the contents of
object file symbol tables without relying on any specific ordering of
symbols: it doesn't need to sort them, and libctf will detect an
unsorted index section via the absence of the new CTF_F_IDXSORTED header
flag, and sort it if needed.

Iteration:
  ctf_symbol_next: Iterator which returns the types and names of symbols
                   one by one, either for function or data symbols.

This does not require any sorting: the ctf_link machinery uses it to
pull in all the compiler-provided symbols cheaply, but it is not
restricted to that use.

(Compatible) changes in API:
  ctf_lookup_by_symbol: can now be called for object and function
                        symbols: never returns ECTF_NOTDATA (which is
			now not thrown by anything, but is kept for
                        compatibility and because it is a plausible
                        error that we might start throwing again at some
                        later date).

Internally we also have changes to the ctf-string functionality so that
"external" strings (those where we track a string -> offset mapping, but
only write out an offset) can be consulted via the usual means
(ctf_strptr) before the strtab is written out.  This is important
because ctf_link_add_linker_symbol can now be handed symbols named via
strtab offsets, and ctf_link_shuffle_syms must figure out their actual
names by looking in the external symtab we have just been fed by the
ctf_link_add_strtab callback, long before that strtab is written out.

include/ChangeLog
2020-11-20  Nick Alcock  <nick.alcock@oracle.com>

	* ctf-api.h (ctf_symbol_next): New.
	(ctf_add_objt_sym): Likewise.
	(ctf_add_func_sym): Likewise.
	* ctf.h: Document new function info section format.
	(CTF_F_NEWFUNCINFO): New.
	(CTF_F_IDXSORTED): New.
	(CTF_F_MAX): Adjust accordingly.

libctf/ChangeLog
2020-11-20  Nick Alcock  <nick.alcock@oracle.com>

	* ctf-impl.h (CTF_INDEX_PAD_THRESHOLD): New.
	(_libctf_nonnull_): Likewise.
	(ctf_in_flight_dynsym_t): New.
	(ctf_dict_t) <ctf_funcidx_names>: Likewise.
	<ctf_objtidx_names>: Likewise.
	<ctf_nfuncidx>: Likewise.
	<ctf_nobjtidx>: Likewise.
	<ctf_funcidx_sxlate>: Likewise.
	<ctf_objtidx_sxlate>: Likewise.
	<ctf_objthash>: Likewise.
	<ctf_funchash>: Likewise.
	<ctf_dynsyms>: Likewise.
	<ctf_dynsymidx>: Likewise.
	<ctf_dynsymmax>: Likewise.
	<ctf_in_flight_dynsym>: Likewise.
	(struct ctf_next) <u.ctn_next>: Likewise.
	(ctf_symtab_skippable): New prototype.
	(ctf_add_funcobjt_sym): Likewise.
	(ctf_dynhash_sort_by_name): Likewise.
	(ctf_sym_to_elf64): Rename to...
	(ctf_elf32_to_link_sym): ... this, and...
	(ctf_elf64_to_link_sym): ... this.
	* ctf-open.c (init_symtab): Check for lack of CTF_F_NEWFUNCINFO
	flag, and presence of index sections.  Refactor out
	ctf_symtab_skippable and ctf_elf*_to_link_sym, and use them.  Use
	ctf_link_sym_t, not Elf64_Sym.  Skip initializing objt or func
	sxlate sections if corresponding index section is present.  Adjust
	for new func info section format.
	(ctf_bufopen_internal): Add ctf_err_warn to corrupt-file error
	handling.  Report incorrect-length index sections.  Always do an
	init_symtab, even if there is no symtab section (there may be index
	sections still).
	(flip_objts): Adjust comment: func and objt sections are actually
	identical in structure now, no need to caveat.
	(ctf_dict_close):  Free newly-added data structures.
	* ctf-create.c (ctf_create): Initialize them.
	(ctf_symtab_skippable): New, refactored out of
	init_symtab, with st_nameidx_set check added.
	(ctf_add_funcobjt_sym): New, add a function or object symbol to the
	ctf_objthash or ctf_funchash, by name.
	(ctf_add_objt_sym): Call it.
	(ctf_add_func_sym): Likewise.
	(symtypetab_delete_nonstatic_vars): New, delete vars also present as
	data objects.
	(CTF_SYMTYPETAB_EMIT_FUNCTION): New flag to symtypetab emitters:
	this is a function emission, not a data object emission.
	(CTF_SYMTYPETAB_EMIT_PAD): New flag to symtypetab emitters: emit
	pads for symbols with no type (only set for unindexed sections).
	(CTF_SYMTYPETAB_FORCE_INDEXED): New flag to symtypetab emitters:
	always emit indexed.
	(symtypetab_density): New, figure out section sizes.
	(emit_symtypetab): New, emit a symtypetab.
	(emit_symtypetab_index): New, emit a symtypetab index.
	(ctf_serialize): Call them, emitting suitably sorted symtypetab
	sections and indexes.  Set suitable header flags.  Copy over new
	fields.
	* ctf-hash.c (ctf_dynhash_sort_by_name): New, used to impose an
	order on symtypetab index sections.
	* ctf-link.c (ctf_add_type_mapping): Delete erroneous comment
	relating to code that was never committed.
	(ctf_link_one_variable): Improve variable name.
	(check_sym): New, symtypetab analogue of check_variable.
	(ctf_link_deduplicating_one_symtypetab): New.
	(ctf_link_deduplicating_syms): Likewise.
	(ctf_link_deduplicating): Call them.
	(ctf_link_deduplicating_per_cu): Note that we don't call them in
	this case (yet).
	(ctf_link_add_strtab): Set the error on the fp correctly.
	(ctf_link_add_linker_symbol): New (no longer a do-nothing stub), add
	a linker symbol to the in-flight list.
	(ctf_link_shuffle_syms): New (no longer a do-nothing stub), turn the
	in-flight list into a mapping we can use, now its names are
	resolvable in the external strtab.
	* ctf-string.c (ctf_str_rollback_atom): Don't roll back atoms with
	external strtab offsets.
	(ctf_str_rollback): Adjust comment.
	(ctf_str_write_strtab): Migrate ctf_syn_ext_strtab population from
	writeout time...
	(ctf_str_add_external): ... to string addition time.
	* ctf-lookup.c (ctf_lookup_var_key_t): Rename to...
	(ctf_lookup_idx_key_t): ... this, now we use it for syms too.
	<clik_names>: New member, a name table.
	(ctf_lookup_var): Adjust accordingly.
	(ctf_lookup_variable): Likewise.
	(ctf_lookup_by_id): Shuffle further up in the file.
	(ctf_symidx_sort_arg_cb): New, callback for...
	(sort_symidx_by_name): ... this new function to sort a symidx
	found to be unsorted (likely originating from the compiler).
	(ctf_symidx_sort): New, sort a symidx.
	(ctf_lookup_symbol_name): Support dynamic symbols with indexes
	provided by the linker.  Use ctf_link_sym_t, not Elf64_Sym.
	Check the parent if a child lookup fails.
	(ctf_lookup_by_symbol): Likewise.  Work for function symbols too.
	(ctf_symbol_next): New, iterate over symbols with types (without
	sorting).
	(ctf_lookup_idx_name): New, bsearch for symbol names in indexes.
	(ctf_try_lookup_indexed): New, attempt an indexed lookup.
	(ctf_func_info): Reimplement in terms of ctf_lookup_by_symbol.
	(ctf_func_args): Likewise.
	(ctf_get_dict): Move...
	* ctf-types.c (ctf_get_dict): ... here.
	* ctf-util.c (ctf_sym_to_elf64): Re-express as...
	(ctf_elf64_to_link_sym): ... this.  Add new st_symidx field, and
	st_nameidx_set (always 0, so st_nameidx can be ignored).  Look in
	the ELF strtab for names.
	(ctf_elf32_to_link_sym): Likewise, for Elf32_Sym.
	(ctf_next_destroy): Destroy ctf_next_t.u.ctn_next if need be.
	* libctf.ver: Add ctf_symbol_next, ctf_add_objt_sym and
	ctf_add_func_sym.
2020-11-20 13:34:08 +00:00
Nick Alcock
3d16b64e28 bfd, include, ld, binutils, libctf: CTF should use the dynstr/sym
This is embarrassing.

The whole point of CTF is that it remains intact even after a binary is
stripped, providing a compact mapping from symbols to types for
everything in the externally-visible interface of an ELF object: it has
connections to the symbol table for that purpose, and to the string
table to avoid duplicating symbol names.  So it's a shame that the hooks
I implemented last year served to hook it up to the .symtab and .strtab,
which obviously disappear on strip, leaving any accompanying the CTF
dict containing references to strings (and, soon, symbols) which don't
exist any more because their containing strtab has been vaporized.  The
original Solaris design used .dynsym and .dynstr (well, actually,
.ldynsym, which has more symbols) which do not disappear. So should we.

Thankfully the work we did before serves as guide rails, and adjusting
things to use the .dynstr and .dynsym was fast and easy.  The only
annoyance is that the dynsym is assembled inside elflink.c in a fairly
piecemeal fashion, so that the easiest way to get the symbols out was to
hook in before every call to swap_symbol_out (we also leave in a hook in
front of symbol additions to the .symtab because it seems plausible that
we might want to hook them in future too: for now that hook is unused).
We adjust things so that rather than being offered a whole hash table of
symbols at once, libctf is now given symbols one at a time, with st_name
indexes already resolved and pointing at their final .dynstr offsets:
it's now up to libctf to resolve these to names as needed using the
strtab info we pass it separately.

Some bits might be contentious.  The ctf_new_dynstr callback takes an
elf_internal_sym, and this remains an elf_internal_sym right down
through the generic emulation layers into ldelfgen.  This is no worse
than the elf_sym_strtab we used to pass down, but in the future when we
gain non-ELF CTF symtab support we might want to lower the
elf_internal_sym to some other representation (perhaps a
ctf_link_symbol) in bfd or in ldlang_ctf_new_dynsym.  We rename the
'apply_strsym' hooks to 'acquire_strings' instead, becuse they no longer
have anything to do with symbols.

There are some API changes to pieces of API which are technically public
but actually totally unused by anything and/or unused by anything but ld
so they can change freely: the ctf_link_symbol gains new fields to allow
symbol names to be given as strtab offsets as well as strings, and a
symidx so that the symbol index can be passed in.  ctf_link_shuffle_syms
loses its callback parameter: the idea now is that linkers call the new
ctf_link_add_linker_symbol for every symbol in .dynsym, feed in all the
strtab entries with ctf_link_add_strtab, and then a call to
ctf_link_shuffle_syms will apply both and arrange to use them to reorder
the CTF symtab at CTF serialization time (which is coming in the next
commit).

Inside libctf we have a new preamble flag CTF_F_DYNSTR which is always
set in v3-format CTF dicts from this commit forwards: CTF dicts without
this flag are associated with .strtab like they used to be, so that old
dicts' external strings don't turn to garbage when loaded by new libctf.
Dicts with this flag are associated with .dynstr and .dynsym instead.
(The flag is not the next in sequence because this commit was written
quite late: the missing flags will be filled in by the next commit.)

Tests forthcoming in a later commit in this series.

bfd/ChangeLog
2020-11-20  Nick Alcock  <nick.alcock@oracle.com>

	* elflink.c (elf_finalize_dynstr): Call examine_strtab after
	dynstr finalization.
	(elf_link_swap_symbols_out): Don't call it here.  Call
	ctf_new_symbol before swap_symbol_out.
	(elf_link_output_extsym): Call ctf_new_dynsym before
	swap_symbol_out.
	(bfd_elf_final_link): Likewise.
	* elf.c (swap_out_syms): Pass in bfd_link_info.  Call
	ctf_new_symbol before swap_symbol_out.
	(_bfd_elf_compute_section_file_positions): Adjust.

binutils/ChangeLog
2020-11-20  Nick Alcock  <nick.alcock@oracle.com>

	* readelf.c (dump_section_as_ctf): Use .dynsym and .dynstr, not
	.symtab and .strtab.

include/ChangeLog
2020-11-20  Nick Alcock  <nick.alcock@oracle.com>

	* bfdlink.h (struct elf_sym_strtab): Replace with...
	(struct elf_internal_sym): ... this.
	(struct bfd_link_callbacks) <examine_strtab>: Take only a
	symstrtab argument.
	<ctf_new_symbol>: New.
	<ctf_new_dynsym>: Likewise.
	* ctf-api.h (struct ctf_link_sym) <st_symidx>: New.
	<st_nameidx>: Likewise.
	<st_nameidx_set>: Likewise.
	(ctf_link_iter_symbol_f): Removed.
	(ctf_link_shuffle_syms): Remove most parameters, just takes a
	ctf_dict_t now.
	(ctf_link_add_linker_symbol): New, split from
	ctf_link_shuffle_syms.
	* ctf.h (CTF_F_DYNSTR): New.
	(CTF_F_MAX): Adjust.

ld/ChangeLog
2020-11-20  Nick Alcock  <nick.alcock@oracle.com>

	* ldelfgen.c (struct ctf_strsym_iter_cb_arg): Rename to...
	(struct ctf_strtab_iter_cb_arg): ... this, changing fields:
	<syms>: Remove.
	<symcount>: Remove.
	<symstrtab>: Rename to...
	<strtab>: ... this.
	(ldelf_ctf_strtab_iter_cb): Adjust.
	(ldelf_ctf_symbols_iter_cb): Remove.
	(ldelf_new_dynsym_for_ctf): New, tell libctf about a single
	symbol.
	(ldelf_examine_strtab_for_ctf): Rename to...
	(ldelf_acquire_strings_for_ctf): ... this, only doing the strtab
	portion and not symbols.
	* ldelfgen.h: Adjust declarations accordingly.
	* ldemul.c (ldemul_examine_strtab_for_ctf): Rename to...
	(ldemul_acquire_strings_for_ctf): ... this.
	(ldemul_new_dynsym_for_ctf): New.
	* ldemul.h: Adjust declarations accordingly.
	* ldlang.c (ldlang_ctf_apply_strsym): Rename to...
	(ldlang_ctf_acquire_strings): ... this.
	(ldlang_ctf_new_dynsym): New.
	(lang_write_ctf): Call ldemul_new_dynsym_for_ctf with NULL to do
	the actual symbol shuffle.
	* ldlang.h (struct elf_strtab_hash): Adjust accordingly.
	* ldmain.c (bfd_link_callbacks): Wire up new/renamed callbacks.

libctf/ChangeLog
2020-11-20  Nick Alcock  <nick.alcock@oracle.com>

	* ctf-link.c (ctf_link_shuffle_syms): Adjust.
	(ctf_link_add_linker_symbol): New, unimplemented stub.
	* libctf.ver: Add it.
	* ctf-create.c (ctf_serialize): Set CTF_F_DYNSTR on newly-serialized
	dicts.
	* ctf-open-bfd.c (ctf_bfdopen_ctfsect): Check for the flag: open the
	symtab/strtab if not present, dynsym/dynstr otherwise.
	* ctf-archive.c (ctf_arc_bufpreamble): New, get the preamble from
	some arbitrary member of a CTF archive.
	* ctf-impl.h (ctf_arc_bufpreamble): Declare it.
2020-11-20 13:34:07 +00:00
Nick Alcock
ae41200ba8 libctf, include, binutils, gdb: rename CTF-opening functions
The functions that return ctf_dict_t's given a ctf_archive_t and a name
are very clumsily named.  It sounds like they return *archives*, not
dictionaries, and the names are very long and clunky.  Why do we
have a ctf_arc_open_by_name when it opens a dictionary, not an archive,
and when there is no way to open a dictionary in any other way?  The
answer is purely internal: the function is located in ctf-archive.c,
and everything in there was called ctf_arc_*, and there is another
way to open a dict (by offset in the archive), that is internal to
ctf-archive.c and that nothing else can call.

This is clearly bad naming. The internal organization of the source tree
should not dictate public API names!

So rename things (keeping the old, bad names for compatibility), and
adjust all users.  You now open a dict using ctf_dict_open, and
open it giving ELF sections via ctf_dict_open_sections.

binutils/ChangeLog
2020-11-20  Nick Alcock  <nick.alcock@oracle.com>

	* objdump.c (dump_ctf): Use ctf_dict_open, not
	ctf_arc_open_by_name.
	* readelf.c (dump_section_as_ctf): Likewise.

gdb/ChangeLog
2020-11-20  Nick Alcock  <nick.alcock@oracle.com>

	* ctfread.c (elfctf_build_psymtabs): Use ctf_dict_open, not
	ctf_arc_open_by_name.

include/ChangeLog
2020-11-20  Nick Alcock  <nick.alcock@oracle.com>

	* ctf-api.h (ctf_arc_open_by_name): Rename to...
	(ctf_dict_open): ... this, keeping compatibility function.
	(ctf_arc_open_by_name_sections): Rename to...
	(ctf_dict_open_sections): ... this, keeping compatibility function.

libctf/ChangeLog
2020-11-20  Nick Alcock  <nick.alcock@oracle.com>

	* ctf-archive.c (ctf_arc_open_by_offset): Rename to...
	(ctf_dict_open_by_offset): ... this.  Adjust callers.
	(ctf_arc_open_by_name_internal): Rename to...
	(ctf_dict_open_internal): ... this.  Adjust callers.
	(ctf_arc_open_by_name_sections): Rename to...
	(ctf_dict_open_sections): ... this, keeping compatibility function.
	(ctf_arc_open_by_name): Rename to...
	(ctf_dict_open): ... this, keeping compatibility function.
	* libctf.ver: New functions added.
	* ctf-link.c (ctf_link_one_input_archive): Adjusted accordingly.
	(ctf_link_deduplicating_open_inputs): Likewise.
2020-11-20 13:34:05 +00:00
Nick Alcock
139633c307 libctf, include, binutils, gdb, ld: rename ctf_file_t to ctf_dict_t
The naming of the ctf_file_t type in libctf is a historical curiosity.
Back in the Solaris days, CTF dictionaries were originally generated as
a separate file and then (sometimes) merged into objects: hence the
datatype was named ctf_file_t, and known as a "CTF file".  Nowadays, raw
CTF is essentially never written to a file on its own, and the datatype
changed name to a "CTF dictionary" years ago.  So the term "CTF file"
refers to something that is never a file!  This is at best confusing.

The type has also historically been known as a 'CTF container", which is
even more confusing now that we have CTF archives which are *also* a
sort of container (they contain CTF dictionaries), but which are never
referred to as containers in the source code.

So fix this by completing the renaming, renaming ctf_file_t to
ctf_dict_t throughout, and renaming those few functions that refer to
CTF files by name (keeping compatibility aliases) to refer to dicts
instead.  Old users who still refer to ctf_file_t will see (harmless)
pointer-compatibility warnings at compile time, but the ABI is unchanged
(since C doesn't mangle names, and ctf_file_t was always an opaque type)
and things will still compile fine as long as -Werror is not specified.
All references to CTF containers and CTF files in the source code are
fixed to refer to CTF dicts instead.

Further (smaller) renamings of annoyingly-named functions to come, as
part of the process of souping up queries across whole archives at once
(needed for the function info and data object sections).

binutils/ChangeLog
2020-11-20  Nick Alcock  <nick.alcock@oracle.com>

	* objdump.c (dump_ctf_errs): Rename ctf_file_t to ctf_dict_t.
	(dump_ctf_archive_member): Likewise.
	(dump_ctf): Likewise. Use ctf_dict_close, not ctf_file_close.
	* readelf.c (dump_ctf_errs): Rename ctf_file_t to ctf_dict_t.
	(dump_ctf_archive_member): Likewise.
	(dump_section_as_ctf): Likewise.  Use ctf_dict_close, not
	ctf_file_close.

gdb/ChangeLog
2020-11-20  Nick Alcock  <nick.alcock@oracle.com>

	* ctfread.c: Change uses of ctf_file_t to ctf_dict_t.
	(ctf_fp_info::~ctf_fp_info): Call ctf_dict_close, not ctf_file_close.

include/ChangeLog
2020-11-20  Nick Alcock  <nick.alcock@oracle.com>

	* ctf-api.h (ctf_file_t): Rename to...
	(ctf_dict_t): ... this.  Keep ctf_file_t around for compatibility.
	(struct ctf_file): Likewise rename to...
	(struct ctf_dict): ... this.
	(ctf_file_close): Rename to...
	(ctf_dict_close): ... this, keeping compatibility function.
	(ctf_parent_file): Rename to...
	(ctf_parent_dict): ... this, keeping compatibility function.
	All callers adjusted.
	* ctf.h: Rename references to ctf_file_t to ctf_dict_t.
	(struct ctf_archive) <ctfa_nfiles>: Rename to...
	<ctfa_ndicts>: ... this.

ld/ChangeLog
2020-11-20  Nick Alcock  <nick.alcock@oracle.com>

	* ldlang.c (ctf_output): This is a ctf_dict_t now.
	(lang_ctf_errs_warnings): Rename ctf_file_t to ctf_dict_t.
	(ldlang_open_ctf): Adjust comment.
	(lang_merge_ctf): Use ctf_dict_close, not ctf_file_close.
	* ldelfgen.h (ldelf_examine_strtab_for_ctf): Rename ctf_file_t to
	ctf_dict_t.  Change opaque declaration accordingly.
	* ldelfgen.c (ldelf_examine_strtab_for_ctf): Adjust.
	* ldemul.h (examine_strtab_for_ctf): Likewise.
	(ldemul_examine_strtab_for_ctf): Likewise.
	* ldeuml.c (ldemul_examine_strtab_for_ctf): Likewise.

libctf/ChangeLog
2020-11-20  Nick Alcock  <nick.alcock@oracle.com>

	* ctf-impl.h: Rename ctf_file_t to ctf_dict_t: all declarations
	adjusted.
	(ctf_fileops): Rename to...
	(ctf_dictops): ... this.
	(ctf_dedup_t) <cd_id_to_file_t>: Rename to...
	<cd_id_to_dict_t>: ... this.
	(ctf_file_t): Fix outdated comment.
	<ctf_fileops>: Rename to...
	<ctf_dictops>: ... this.
	(struct ctf_archive_internal) <ctfi_file>: Rename to...
	<ctfi_dict>: ... this.
	* ctf-archive.c: Rename ctf_file_t to ctf_dict_t.
	Rename ctf_archive.ctfa_nfiles to ctfa_ndicts.
	Rename ctf_file_close to ctf_dict_close.  All users adjusted.
	* ctf-create.c: Likewise.  Refer to CTF dicts, not CTF containers.
	(ctf_bundle_t) <ctb_file>: Rename to...
	<ctb_dict): ... this.
	* ctf-decl.c: Rename ctf_file_t to ctf_dict_t.
	* ctf-dedup.c: Likewise.  Rename ctf_file_close to
	ctf_dict_close. Refer to CTF dicts, not CTF containers.
	* ctf-dump.c: Likewise.
	* ctf-error.c: Likewise.
	* ctf-hash.c: Likewise.
	* ctf-inlines.h: Likewise.
	* ctf-labels.c: Likewise.
	* ctf-link.c: Likewise.
	* ctf-lookup.c: Likewise.
	* ctf-open-bfd.c: Likewise.
	* ctf-string.c: Likewise.
	* ctf-subr.c: Likewise.
	* ctf-types.c: Likewise.
	* ctf-util.c: Likewise.
	* ctf-open.c: Likewise.
	(ctf_file_close): Rename to...
	(ctf_dict_close): ...this.
	(ctf_file_close): New trivial wrapper around ctf_dict_close, for
	compatibility.
	(ctf_parent_file): Rename to...
	(ctf_parent_dict): ... this.
	(ctf_parent_file): New trivial wrapper around ctf_parent_dict, for
	compatibility.
	* libctf.ver: Add ctf_dict_close and ctf_parent_dict.
2020-11-20 13:34:04 +00:00
Nick Alcock
6dd2819ffc libctf, link: add the ability to filter out variables from the link
The CTF variables section (containing variables that have no
corresponding symtab entries) can cause the string table to get very
voluminous if the names of variables are long.  Some callers want to
filter out particular variables they know they won't need.

So add a "variable filter" callback that does that: it's passed the name
of the variable and a corresponding ctf_file_t / ctf_id_t pair, and
should return 1 to filter it out.

ld doesn't use this machinery yet, but we could easily add it later if
desired.  (But see later for a commit that turns off CTF variable-
section linking in ld entirely by default.)

include/
	* ctf-api.h (ctf_link_variable_filter_t): New.
	(ctf_link_set_variable_filter): Likewise.

libctf/
	* libctf.ver (ctf_link_set_variable_filter): Add.
	* ctf-impl.h (ctf_file_t) <ctf_link_variable_filter>: New.
	<ctf_link_variable_filter_arg>: Likewise.
	* ctf-create.c (ctf_serialize): Adjust.
	* ctf-link.c (ctf_link_set_variable_filter): New, set it.
	(ctf_link_one_variable): Call it if set.
2020-07-22 18:02:18 +01:00
Nick Alcock
8d2229ad1e libctf, link: add lazy linking: clean up input members: err/warn cleanup
This rather large and intertwined pile of changes does three things:

First, it transitions from dprintf to ctf_err_warn for things the user might
care about: this one file is the major impetus for the ctf_err_warn
infrastructure, because things like file names are crucial in linker
error messages, and errno values are utterly incapable of
communicating them

Second, it stabilizes the ctf_link APIs: you can now call
ctf_link_add_ctf without a CTF argument (only a NAME), to lazily
ctf_open the file with the given NAME when needed, and close it as soon
as possible, to save memory.  This is not an API change because a null
CTF argument was prohibited before now.

Since getting CTF directly from files uses ctf_open, passing in only a
NAME requires use of libctf, not libctf-nobfd.  The linker's behaviour
is unchanged, as it still passes in a ctf_archive_t as before.

This also let us fix a leak: we were opening ctf_archives and their
containing ctf_files, then only closing the files and leaving the
archives open.

Third, this commit restructures the ctf_link_in_member argument used by
the CTF linking machinery and adjusts its users accordingly.

We drop two members:

- arcname, which is difficult to construct and then only used in error
  messages (that were only dprintf()ed, so never seen!)
- share_mode, since we store the flags passed to ctf_link (including the
  share mode) in a new ctf_file_t.ctf_link_flags to help dedup get hold
  of it

We rename others whose existing names were fairly dreadful:

- done_main_member -> done_parent, using consistent terminology for .ctf
  as the parent of all archive members
- main_input_fp -> in_fp_parent, likewise
- file_name -> in_file_name, likewise

We add one new member, cu_mapped.

Finally, we move the various frees of things like mapping table data to
the top-level ctf_link, since deduplicating links will want to do that
too.

include/
	* ctf-api.h (ECTF_NEEDSBFD): New.
	(ECTF_NERR): Adjust.
	(ctf_link): Rename share_mode arg to flags.
libctf/
	* Makefile.am: Set -DNOBFD=1 in libctf-nobfd, and =0 elsewhere.
	* Makefile.in: Regenerated.
	* ctf-impl.h (ctf_link_input_name): New.
	(ctf_file_t) <ctf_link_flags>: New.
	* ctf-create.c (ctf_serialize): Adjust accordingly.
	* ctf-link.c: Define ctf_open as weak when PIC.
	(ctf_arc_close_thunk): Remove unnecessary thunk.
	(ctf_file_close_thunk): Likewise.
	(ctf_link_input_name): New.
	(ctf_link_input_t): New value of the ctf_file_t.ctf_link_input.
	(ctf_link_input_close): Adjust accordingly.
	(ctf_link_add_ctf_internal): New, split from...
	(ctf_link_add_ctf): ... here.  Return error if lazy loading of
	CTF is not possible.  Change to just call...
	(ctf_link_add): ... this new function.
	(ctf_link_add_cu_mapping): Transition to ctf_err_warn.  Drop the
	ctf_file_close_thunk.
	(ctf_link_in_member_cb_arg_t) <file_name> Rename to...
	<in_file_name>: ... this.
	<arcname>: Drop.
	<share_mode>: Likewise (migrated to ctf_link_flags).
	<done_main_member>: Rename to...
	<done_parent>: ... this.
	<main_input_fp>: Rename to...
	<in_fp_parent>: ... this.
	<cu_mapped>: New.
	(ctf_link_one_type): Adjuwt accordingly.  Transition to
	ctf_err_warn, removing a TODO.
	(ctf_link_one_variable): Note a case too common to warn about.
	Report in the debug stream if a cu-mapped link prevents addition
	of a conflicting variable.
	(ctf_link_one_input_archive_member): Adjust.
	(ctf_link_lazy_open): New, open a CTF archive for linking when
	needed.
	(ctf_link_close_one_input_archive): New, close it again.
	(ctf_link_one_input_archive): Adjust for lazy opening, member
	renames, and ctf_err_warn transition.  Move the
	empty_link_type_mapping call to...
	(ctf_link): ... here.  Adjut for renamings and thunk removal.
	Don't spuriously fail if some input contains no CTF data.
	(ctf_link_write): ctf_err_warn transition.
	* libctf.ver: Remove not-yet-stable comment.
2020-07-22 18:02:18 +01:00
Nick Alcock
8b37e7b63e libctf, ld, binutils: add textual error/warning reporting for libctf
This commit adds a long-missing piece of infrastructure to libctf: the
ability to report errors and warnings using all the power of printf,
rather than being restricted to one errno value.  Internally, libctf
calls ctf_err_warn() to add errors and warnings to a list: a new
iterator ctf_errwarning_next() then consumes this list one by one and
hands it to the caller, which can free it.  New errors and warnings are
added until the list is consumed by the caller or the ctf_file_t is
closed, so you can dump them at intervals.  The caller can of course
choose to print only those warnings it wants.  (I am not sure whether we
want objdump, readelf or ld to print warnings or not: right now I'm
printing them, but maybe we only want to print errors?  This entirely
depends on whether warnings are voluminous things describing e.g. the
inability to emit single types because of name clashes or something.
There are no users of this infrastructure yet, so it's hard to say.)

There is no internationalization here yet, but this at least adds a
place where internationalization can be added, to one of
ctf_errwarning_next or ctf_err_warn.

We also provide a new ctf_assert() function which uses this
infrastructure to provide non-fatal assertion failures while emitting an
assert-like string to the caller: to save space and avoid needlessly
duplicating unchanging strings, the assertion test is inlined but the
print-things-out failure case is not.  All assertions in libctf will be
converted to use this machinery in future commits and propagate
assertion-failure errors up, so that the linker in particular cannot be
killed by libctf assertion failures when it could perfectly well just
print warnings and drop the CTF section.

include/
	* ctf-api.h (ECTF_INTERNAL): Adjust error text.
	(ctf_errwarning_next): New.
libctf/
	* ctf-impl.h (ctf_assert): New.
	(ctf_err_warning_t): Likewise.
	(ctf_file_t) <ctf_errs_warnings>: Likewise.
	(ctf_err_warn): New prototype.
	(ctf_assert_fail_internal): Likewise.
	* ctf-inlines.h (ctf_assert_internal): Likewise.
	* ctf-open.c (ctf_file_close): Free ctf_errs_warnings.
	* ctf-create.c (ctf_serialize): Copy it on serialization.
	* ctf-subr.c (ctf_err_warn): New, add an error/warning.
	(ctf_errwarning_next): New iterator, free and pass back
	errors/warnings in succession.
	* libctf.ver (ctf_errwarning_next): Add.
ld/
	* ldlang.c (lang_ctf_errs_warnings): New, print CTF errors
	and warnings.  Assert when libctf asserts.
	(lang_merge_ctf): Call it.
	(land_write_ctf): Likewise.
binutils/
	* objdump.c (ctf_archive_member): Print CTF errors and warnings.
	* readelf.c (dump_ctf_archive_member): Likewise.
2020-07-22 18:02:17 +01:00
Nick Alcock
688d28f621 libctf, next: introduce new class of easier-to-use iterators
The libctf machinery currently only provides one way to iterate over its
data structures: ctf_*_iter functions that take a callback and an arg
and repeatedly call it.

This *works*, but if you are doing a lot of iteration it is really quite
inconvenient: you have to package up your local variables into
structures over and over again and spawn lots of little functions even
if it would be clearer in a single run of code.  Look at ctf-string.c
for an extreme example of how unreadable this can get, with
three-line-long functions proliferating wildly.

The deduplicator takes this to the Nth level. It iterates over a whole
bunch of things: if we'd had to use _iter-class iterators for all of
them there would be twenty additional functions in the deduplicator
alone, for no other reason than that the iterator API requires it.

Let's do something better. strtok_r gives us half the design: generators
in a number of other languages give us the other half.

The *_next API allows you to iterate over CTF-like entities in a single
function using a normal while loop. e.g. here we are iterating over all
the types in a dict:

ctf_next_t *i = NULL;
int *hidden;
ctf_id_t id;

while ((id = ctf_type_next (fp, &i, &hidden, 1)) != CTF_ERR)
  {
    /* do something with 'hidden' and 'id' */
  }
if (ctf_errno (fp) != ECTF_NEXT_END)
    /* iteration error */

Here we are walking through the members of a struct with CTF ID
'struct_type':

ctf_next_t *i = NULL;
ssize_t offset;
const char *name;
ctf_id_t membtype;

while ((offset = ctf_member_next (fp, struct_type, &i, &name,
                                  &membtype)) >= 0
  {
    /* do something with offset, name, and membtype */
  }
if (ctf_errno (fp) != ECTF_NEXT_END)
    /* iteration error */

Like every other while loop, this means you have access to all the local
variables outside the loop while inside it, with no need to tiresomely
package things up in structures, move the body of the loop into a
separate function, etc, as you would with an iterator taking a callback.

ctf_*_next allocates 'i' for you on first entry (when it must be NULL),
and frees and NULLs it and returns a _next-dependent flag value when the
iteration is over: the fp errno is set to ECTF_NEXT_END when the
iteartion ends normally.  If you want to exit early, call
ctf_next_destroy on the iterator.  You can copy iterators using
ctf_next_copy, which copies their current iteration position so you can
remember loop positions and go back to them later (or ctf_next_destroy
them if you don't need them after all).

Each _next function returns an always-likely-to-be-useful property of
the thing being iterated over, and takes pointers to parameters for the
others: with very few exceptions all those parameters can be NULLs if
you're not interested in them, so e.g. you can iterate over only the
offsets of members of a structure this way:

while ((offset = ctf_member_next (fp, struct_id, &i, NULL, NULL)) >= 0)

If you pass an iterator in use by one iteration function to another one,
you get the new error ECTF_NEXT_WRONGFUN back; if you try to change
ctf_file_t in mid-iteration, you get ECTF_NEXT_WRONGFP back.

Internally the ctf_next_t remembers the iteration function in use,
various sizes and increments useful for almost all iterations, then
uses unions to overlap the actual entities being iterated over to keep
ctf_next_t size down.

Iterators available in the public API so far (all tested in actual use
in the deduplicator):

/* Iterate over the members of a STRUCT or UNION, returning each member's
   offset and optionally name and member type in turn.  On end-of-iteration,
   returns -1.  */
ssize_t
ctf_member_next (ctf_file_t *fp, ctf_id_t type, ctf_next_t **it,
                 const char **name, ctf_id_t *membtype);

/* Iterate over the members of an enum TYPE, returning each enumerand's
   NAME or NULL at end of iteration or error, and optionally passing
   back the enumerand's integer VALue.  */
const char *
ctf_enum_next (ctf_file_t *fp, ctf_id_t type, ctf_next_t **it,
              int *val);

/* Iterate over every type in the given CTF container (not including
   parents), optionally including non-user-visible types, returning
   each type ID and optionally the hidden flag in turn. Returns CTF_ERR
   on end of iteration or error.  */
ctf_id_t
ctf_type_next (ctf_file_t *fp, ctf_next_t **it, int *flag,
               int want_hidden);

/* Iterate over every variable in the given CTF container, in arbitrary
   order, returning the name and type of each variable in turn.  The
   NAME argument is not optional.  Returns CTF_ERR on end of iteration
   or error.  */
ctf_id_t
ctf_variable_next (ctf_file_t *fp, ctf_next_t **it, const char **name);

/* Iterate over all CTF files in an archive, returning each dict in turn as a
   ctf_file_t, and NULL on error or end of iteration.  It is the caller's
   responsibility to close it.  Parent dicts may be skipped.  Regardless of
   whether they are skipped or not, the caller must ctf_import the parent if
   need be.  */
ctf_file_t *
ctf_archive_next (const ctf_archive_t *wrapper, ctf_next_t **it,
                  const char **name, int skip_parent, int *errp);

ctf_label_next is prototyped but not implemented yet.

include/
	* ctf-api.h (ECTF_NEXT_END): New error.
	(ECTF_NEXT_WRONGFUN): Likewise.
	(ECTF_NEXT_WRONGFP): Likewise.
	(ECTF_NERR): Adjust.
	(ctf_next_t): New.
	(ctf_next_create): New prototype.
	(ctf_next_destroy): Likewise.
	(ctf_next_copy): Likewise.
	(ctf_member_next): Likewise.
	(ctf_enum_next): Likewise.
	(ctf_type_next): Likewise.
	(ctf_label_next): Likewise.
	(ctf_variable_next): Likewise.

libctf/
	* ctf-impl.h (ctf_next): New.
	(ctf_get_dict): New prototype.
	* ctf-lookup.c (ctf_get_dict): New, split out of...
	(ctf_lookup_by_id): ... here.
	* ctf-util.c (ctf_next_create): New.
	(ctf_next_destroy): New.
	(ctf_next_copy): New.
	* ctf-types.c (includes): Add <assert.h>.
	(ctf_member_next): New.
	(ctf_enum_next): New.
	(ctf_type_iter): Document the lack of iteration over parent
	types.
	(ctf_type_next): New.
	(ctf_variable_next): New.
	* ctf-archive.c (ctf_archive_next): New.
	* libctf.ver: Add new public functions.
2020-07-22 17:57:50 +01:00
Nick Alcock
2399827bfa libctf: add ctf_ref
This allows you to bump the refcount on a ctf_file_t, so that you can
smuggle it out of iterators which open and close the ctf_file_t for you
around the loop body (like ctf_archive_iter).

You still can't use this to preserve a ctf_file_t for longer than the
lifetime of its containing entity (e.g. ctf_archive).

include/
	* ctf-api.h (ctf_ref): New.
libctf/
	* libctf.ver (ctf_ref): New.
	* ctf-open.c (ctf_ref): Implement it.
2020-07-22 17:57:49 +01:00
Nick Alcock
9c23dfa5aa libctf: add ctf_archive_count
Another count that was otherwise unavailable without doing expensive
operations.

include/
	* ctf-api.h (ctf_archive_count): New.

libctf/
	* ctf-archive.c (ctf_archive_count): New.
	* libctf.ver: New public function.
2020-07-22 17:57:39 +01:00
Nick Alcock
e0325e2ced libctf: add ctf_member_count
This returns the number of members in a struct or union, or the number
of enumerations in an enum.  (This was only available before now by
iterating across every member, but it can be returned much faster than
that.)

include/
	* ctf-api.h (ctf_member_count): New.

libctf/
	* ctf-types.c (ctf_member_count): New.
	* libctf.ver: New public function.
2020-07-22 17:57:38 +01:00
Nick Alcock
9b15cbb789 libctf: add ctf_type_kind_forwarded
This is just like ctf_type_kind, except that forwards get the
type of the thing being pointed to rather than CTF_K_FORWARD.

include/
	* ctf-api.h (ctf_type_kind_forwarded): New.
libctf/
	* ctf-types.c (ctf_type_kind_forwarded): New.
2020-07-22 17:57:37 +01:00
Nick Alcock
01d9317436 libctf: add ctf_type_name_raw
We already have a function ctf_type_aname_raw, which returns the raw
name of a type with no decoration for structures or arrays or anything
like that: just the underlying name of whatever it is that's being
ultimately pointed at.

But this can be inconvenient to use, becauswe it always allocates new
storage for the string and copies it in, so it can potentially fail.
Add ctf_type_name_raw, which just returns the string directly out of
libctf's guts: it will live until the ctf_file_t is closed (if we later
gain the ability to remove types from writable dicts, it will live as
long as the type lives).

Reimplement ctf_type_aname_raw in terms of it.

include/
	* ctf-api.c (ctf_type_name_raw): New.

libctf/
	* ctf-types.c (ctf_type_name_raw): New.
	(ctf_type_aname_raw): Reimplement accordingly.
2020-07-22 17:57:36 +01:00
Nick Alcock
2f6ecaed66 libctf, binutils: support CTF archives like objdump
objdump and readelf have one major CTF-related behavioural difference:
objdump can read .ctf sections that contain CTF archives and extract and
dump their members, while readelf cannot.  Since the linker often emits
CTF archives, this means that readelf intermittently and (from the
user's perspective) randomly fails to read CTF in files that ld emits,
with a confusing error message wrongly claiming that the CTF content is
corrupt.  This is purely because the archive-opening code in libctf was
needlessly tangled up with the BFD code, so readelf couldn't use it.

Here, we disentangle it, moving ctf_new_archive_internal from
ctf-open-bfd.c into ctf-archive.c and merging it with the helper
function in ctf-archive.c it was already using.  We add a new public API
function ctf_arc_bufopen, that looks very like ctf_bufopen but returns
an archive given suitable section data rather than a ctf_file_t: the
archive is a ctf_archive_t, so it can be called on raw CTF dictionaries
(with no archive present) and will return a single-member synthetic
"archive".

There is a tiny lifetime tweak here: before now, the archive code could
assume that the symbol section in the ctf_archive_internal wrapper
structure was always owned by BFD if it was present and should always be
freed: now, the caller can pass one in via ctf_arc_bufopen, wihch has
the usual lifetime rules for such sections (caller frees): so we add an
extra field to track whether this is an internal call from ctf-open-bfd,
in which case we still free the symbol section.

include/
	* ctf-api.h (ctf_arc_bufopen): New.
libctf/
	* ctf-impl.h (ctf_new_archive_internal): Declare.
	(ctf_arc_bufopen): Remove.
	(ctf_archive_internal) <ctfi_free_symsect>: New.
	* ctf-archive.c (ctf_arc_close): Use it.
	(ctf_arc_bufopen): Fuse into...
	(ctf_new_archive_internal): ... this, moved across from...
	* ctf-open-bfd.c: ... here.
	(ctf_bfdopen_ctfsect): Use ctf_arc_bufopen.
	* libctf.ver: Add it.
binutils/
	* readelf.c (dump_section_as_ctf): Support .ctf archives using
	ctf_arc_bufopen.  Automatically load the .ctf member of such
	archives as the parent of all other members, unless specifically
	overridden via --ctf-parent.  Split out dumping code into...
	(dump_ctf_archive_member): ... here, as in objdump, and call
	it once per archive member.
	(dump_ctf_indent_lines): Code style fix.
2020-06-26 15:56:39 +01:00
Alan Modra
b3adc24a07 Update year range in copyright notice of binutils files 2020-01-01 18:42:54 +10:30
Nick Alcock
87279e3cef libctf: installable libctf as a shared library
This lets other programs read and write CTF-format data.

Two versioned shared libraries are created: libctf.so and
libctf-nobfd.so.  They contain identical content except that
libctf-nobfd.so contains no references to libbfd and does not implement
ctf_open, ctf_fdopen, ctf_bfdopen or ctf_bfdopen_ctfsect, so it can be
used by programs that cannot use BFD, like readelf.

The soname major version is presently .0 until the linker API
stabilizes, when it will flip to .1 and hopefully never change again.

New in v3.
v4: libtoolize and turn into a pair of shared libraries.  Drop
    --enable-install-ctf: now controlled by --enable-shared and
    --enable-install-libbfd, like everything else.
v5: Add ../bfd to ACLOCAL_AMFLAGS and AC_CONFIG_MACRO_DIR.  Fix tabdamage.

	* Makefile.def (host_modules): libctf is no longer no_install.
	* Makefile.in: Regenerated.
libctf/
	* configure.ac (AC_DISABLE_SHARED): New, like opcodes/.
	(LT_INIT): Likewise.
	(AM_INSTALL_LIBBFD): Likewise.
	(dlopen): Note why this is necessary in a comment.
	(SHARED_LIBADD): Initialize for possibly-PIC libiberty: derived from
	opcodes/.
	(SHARED_LDFLAGS): Likewise.
	(BFD_LIBADD): Likewise, for libbfd.
	(BFD_DEPENDENCIES): Likewise.
	(VERSION_FLAGS): Initialize, using a version script if ld supports
	one, or libtool -export-symbols-regex otherwise.
	(AC_CONFIG_MACRO_DIR): Add ../BFD.
	* Makefile.am (ACLOCAL_AMFLAGS): Likewise.
	(INCDIR): New.
	(AM_CPPFLAGS): Use $(srcdir), not $(top_srcdir).
	(noinst_LIBRARIES): Replace with...
	[INSTALL_LIBBFD] (lib_LTLIBRARIES): This, or...
	[!INSTALL_LIBBFD] (noinst_LTLIBRARIES): ... this, mentioning new
	libctf-nobfd.la as well.
	[INSTALL_LIBCTF] (include_HEADERS): Add the CTF headers.
	[!INSTALL_LIBCTF] (include_HEADERS): New, empty.
	(libctf_a_SOURCES): Rename to...
	(libctf_nobfd_la_SOURCES): ... this, all of libctf other than
	ctf-open-bfd.c.
	(libctf_la_SOURCES): Now derived from libctf_nobfd_la_SOURCES,
	with ctf-open-bfd.c added.
	(libctf_nobfd_la_LIBADD): New, using @SHARED_LIBADD@.
	(libctf_la_LIBADD): New, using @BFD_LIBADD@ as well.
	(libctf_la_DEPENDENCIES): New, using @BFD_DEPENDENCIES@.
	* Makefile.am [INSTALL_LIBCTF]: Use it.
	* aclocal.m4: Add ../bfd/acinclude.m4, ../config/acx.m4, and the
	libtool macros.
	* libctf.ver: New, everything is version LIBCTF_1.0 currently (even
	the unstable components).
	* Makefile.in: Regenerated.
	* config.h.in: Likewise.
	* configure: Likewise.
binutils/
	* Makefile.am (LIBCTF): Mention the .la file.
	(LIBCTF_NOBFD): New.
	(readelf_DEPENDENCIES): Use it.
	(readelf_LDADD): Likewise.
	* Makefile.in: Regenerated.
ld/
	* configure.ac (TESTCTFLIB): Set to the .so or .a, like TESTBFDLIB.
	* Makefile.am (TESTCTFLIB): Use it.
	(LIBCTF): Use the .la file.
	(check-DEJAGNU): Use it.
	* Makefile.in: Regenerated.
	* configure: Likewise.
include/
	* ctf-api.h: Note the instability of the ctf_link interfaces.
2019-10-03 17:04:56 +01:00