Andrew Burgess ccd56e2270 gdb: remove attempted type de-duplication when building gdb-index
This commit removes the attempted de-duplication of types when
building the gdb-index.  This commit is the natural extension of this
earlier commit:

  commit aef36dee93
  Date:   Sun Aug 13 14:08:06 2023 +0200

      [gdb/symtab] Don't deduplicate variables in gdb-index

That commit removed the de-duplication of variables.  It is worth
reading, as all the justifications for that patch also apply to this
one.

Currently, when building the gdb-index we sort the type entries,
moving declarations to the end of the entry list and non-declarations
to the front.  Then, within each group (declarations and
non-declarations), the index entries are sorted by CU offset.

We then emit the first entry for any given type name.

There are two problems with this.

First, a non-declaration entry could be a definition, but it could
also be a typedef.  Now sure, a typedef is a type definition, but not
necessarily a useful one.

If we have a header file that contains:

  typedef struct foo_t foo_t;

And a CU which makes use of 'foo_t', then the CU will include both a
typedef and a type declaration.  The target of the typedef will be the
declaration.  But notice, the CU will not include a type definition.

If we have two CUs, one which only sees the above typedef and
declaration, and another which sees the typedef and an actual type
definition, then the final list of entries for this type's name will
be:

  1. A typedef entry that points at the declaration.
  2. A typedef entry that points at the definition.
  3. A definition.
  4. A declaration.

Now (4) will get sorted to the end of the entry list.  But the order
of (1), (2), and (3) will depend on the CU offset.  If the CU
containing the typedef and declaration has the smallest offset,
then (1) will be sorted to the front of the list of entries for this
type name.  Due to the de-duplication code this means that only (1)
will be added to the gdb-index.

After GDB starts and parses the index, if a user references 'foo_t'
GDB will look in the index and find just (1).  GDB loads the CU
containing (1) and finds both the typedef and the declaration.  But
GDB does not find the full type definition.  As a result GDB will
display 'foo_t' as an incomplete type.

This differs from the behaviour when no index is used.  With no index
GDB expands the first CU containing 'foo_t', finds the typedef and
type declaration, decides that this is not good enough and carries on.
GDB will then expand the second CU and find the type's definition, GDB
now has a full understanding of the type, and can print the type
correctly.

We could solve this problem by marking typedefs as a distinct
sub-category of types, just as we do with declarations.  Then we could
sort definitions to the front of the list, then typedefs, and finally,
declarations after that.  This would, I think, mean that we always
prefer emitting a definition for a type, which would resolve this
first problem, or at least, it would resolve it well enough, but it
wouldn't fix the second problem.

The second problem is that the Python API and the 'info types' command
can be used to query all type symbols.  As such, GDB needs to be able
to find all the CUs which contain a given type.  Especially as it is
possible that a type might be defined differently within different
CUs.

NOTE: Obviously a program doing this (defining a type differently in
  different CUs) would need to be mindful of the One Definition Rule,
  but so long as the type doesn't escape outside of a single CU then
  reusing a type name isn't, as I understand it, wrong.  And even if
  it is, the fact that it compiles, and could be a source of bugs,
  means (in my opinion) that GDB should handle this case to enable
  debugging of it.

Even something as simple as 'info types ....' relies on GDB being able
to find multiple entries for a given type in different CUs.  If the
index only contains a single type entry, then this means GDB will see
different things depending on which CUs happen to have been expanded.

Given all of the above, I think that any attempt to remove type
entries from the gdb-index is unsafe and can result in GDB behaving
differently when using the gdb-index compared to using no index.

The solution is to remove the de-duplication code, which is what this
patch does.

Now that we no longer need to sort declarations to the end of the
entry list, I've removed all the code related to the special use of
GDB_INDEX_SYMBOL_KIND_UNUSED5 (which is how we marked declarations),
this cleans things up a little bit.

I've also renamed some of the functions away from minimize, now that
there's no minimization being done.

A problem was revealed by this change.  When running the test
gdb.cp/stub-array-size.exp with the --target_board=cc-with-gdb-index,
I was seeing a failure using gcc 15.1.0.

This test has two CUs, and a type 'A'.  The test description says:

  Test size of arrays of stubbed types (structures where the full
  definition is not immediately available).

I don't really understand this, given the test's source code.  The
type 'A' is defined in a header, which is included in both CUs.
However, the test description does seem to be accurate; in one CU the
type looks like this:

 <1><4a>: Abbrev Number: 8 (DW_TAG_structure_type)
    <4b>   DW_AT_name        : A
    <4d>   DW_AT_declaration : 1
    <4d>   DW_AT_sibling     : <0x6d>
 <2><51>: Abbrev Number: 9 (DW_TAG_subprogram)
    <52>   DW_AT_external    : 1
    <52>   DW_AT_name        : ~A
    <55>   DW_AT_decl_file   : 2
    <56>   DW_AT_decl_line   : 20
    <57>   DW_AT_decl_column : 11
    <58>   DW_AT_linkage_name: (indirect string, offset: 0x103): _ZN1AD4Ev
    <5c>   DW_AT_virtuality  : 1        (virtual)
    <5d>   DW_AT_containing_type: <0x4a>
    <61>   DW_AT_declaration : 1
    <61>   DW_AT_object_pointer: <0x66>
    <65>   DW_AT_inline      : 0        (not inlined)
 <3><66>: Abbrev Number: 10 (DW_TAG_formal_parameter)
    <67>   DW_AT_type        : <0x8c>
    <6b>   DW_AT_artificial  : 1
 <3><6b>: Abbrev Number: 0
 <2><6c>: Abbrev Number: 0

while in the second CU, the type looks like this:

 <1><178>: Abbrev Number: 4 (DW_TAG_structure_type)
    <179>   DW_AT_name        : A
    <17b>   DW_AT_byte_size   : 8
    <17c>   DW_AT_decl_file   : 2
    <17d>   DW_AT_decl_line   : 18
    <17e>   DW_AT_decl_column : 8
    <17f>   DW_AT_containing_type: <0x178>
    <183>   DW_AT_sibling     : <0x1ac>
 <2><187>: Abbrev Number: 5 (DW_TAG_member)
    <188>   DW_AT_name        : (indirect string, offset: 0x19e): _vptr.A
    <18c>   DW_AT_type        : <0x1be>
    <190>   DW_AT_data_member_location: 0
    <191>   DW_AT_artificial  : 1
 <2><191>: Abbrev Number: 6 (DW_TAG_subprogram)
    <192>   DW_AT_external    : 1
    <192>   DW_AT_name        : ~A
    <195>   DW_AT_decl_file   : 1
    <196>   DW_AT_decl_line   : 20
    <197>   DW_AT_decl_column : 1
    <198>   DW_AT_linkage_name: (indirect string, offset: 0x103): _ZN1AD4Ev
    <19c>   DW_AT_virtuality  : 1       (virtual)
    <19d>   DW_AT_containing_type: <0x178>
    <1a1>   DW_AT_declaration : 1
    <1a1>   DW_AT_object_pointer: <0x1a5>
 <3><1a5>: Abbrev Number: 7 (DW_TAG_formal_parameter)
    <1a6>   DW_AT_type        : <0x1cd>
    <1aa>   DW_AT_artificial  : 1
 <3><1aa>: Abbrev Number: 0
 <2><1ab>: Abbrev Number: 0

So, for reasons that I don't understand, the type, despite (as far as
I can see) having its full definition available, is recorded only as
declared in one CU.

The test then performs some actions that rely on 'sizeof(A)' and
expects GDB to correctly figure out the size.  This requires GDB to
find, and expand the CU containing the real definition of 'A'.

Prior to this patch GDB would sort the two type entries for 'A',
placing the declaration second, and then record only one entry, the
definition.  When it came to expansion there was only one thing to
expand, and this was the definition we needed.  It happens that in
this test the definition is in the second CU, that is, the CU with the
biggest offset.  This means that, if all index entries were considered
equal, the definition entry would be second.  However, currently, due
to the way GDB forces definitions to the front, the entry for the
second CU, the definition, is placed first in the index, and with
de-duplication, this is the only entry added to the index.

After this patch, both the declaration and the definition are placed
in the index, and as the declaration is in the CU at offset 0, the
declaration is added first to the index.

This should be fine.  When looking for 'A' GDB should expand the CU
containing the declaration, see that all we have is a declaration, and
so continue, next expanding the definition, at which point we're done.

However, in read-gdb-index.c, in the function
mapped_gdb_index::build_name_components, there is a workaround for
gold bug PR gold/15646.  Ironically, the bug there is that gold was
not removing duplicate index entries, and it is noted that this has a
performance impact on GDB.  A workaround for this was added to GDB in
commit:

  commit 8943b87476
  Date:   Tue Nov 12 09:43:17 2013 -0800

      Work around gold/15646.

A test for this was added in:

  commit 40d22035a7
  Date:   Tue May 26 11:35:32 2020 +0200

      [gdb/testsuite] Add test-case gold-gdb-index.exp

And the fix was tweaked in commit:

  commit f030440daa
  Date:   Thu May 28 17:26:22 2020 +0200

      [gdb/symtab] Make gold index workaround more precise

The problem specifically called out in the bug report is that
namespaces can appear in multiple CUs, and that trying to complete
'ns::misspelled' would expand every CU containing namespace 'ns' due
to the duplicate 'ns' type symbols.

The workaround added in 8943b87476 was to ignore duplicate
global symbols when expanding entries from the index.  In commit
f030440daa this workaround was restricted to ignore only
duplicate type entries.  This restriction was required to allow the
earlier de-duplication patch aef36dee93 to function correctly.

Now that I'm taking the work started in aef36dee93 to its
logical conclusion, and allowing duplicate type entries, the
workaround of ignoring duplicate global type symbols is no longer
needed, and can be removed.

The associated test for this, added in 40d22035a7, is also
removed in this commit.

To be clear; the performance issue mentioned in PR gold/15646 is now
back again.  But my claim is that gold was right all along to include
the duplicate index entries, and any performance hit we see as a
result, though unfortunate, is just a consequence of doing it right.

That doesn't mean there isn't room for optimisation and improvement
in the future, though I don't have any immediate ideas or plans in
this area.  It's just that we can't throw out a bunch of critical
index entries and claim this as a performance optimisation.

I am seeing some failures with this patch when using the board file
dwarf5-fission-debug-types.  These failures all have the error:

  DWARF Error: wrong unit_type in unit header (is DW_UT_skeleton, should be DW_UT_type) [in module ....]

However, I ran the whole testsuite with this board, and this error
crops up often, so I don't think it is something specific to this
patch; I'm choosing to ignore it.

Bug: https://sourceware.org/bugzilla/show_bug.cgi?id=15646
Bug: https://sourceware.org/bugzilla/show_bug.cgi?id=15035

Approved-By: Tom Tromey <tom@tromey.com>
2025-11-13 16:29:25 +00:00