950 Commits

Author SHA1 Message Date
Tom Tromey
2caf7b1689 Introduce gdbsupport/cxx-thread.h and use it
This introduces a new file, gdbsupport/cxx-thread.h, which provides
stubs for the C++ threading functionality on systems that don't
support it.

On fully-working ports, this header just supplies a number of aliases
in the gdb namespace.  So, for instance, gdb::mutex is just an alias
for std::mutex.

For non-working ports, compatibility stubs are provided for the subset
of threading functionality that's used in gdb.  These generally do
nothing and assume single-threaded operation.

The idea behind this is to reduce the number of checks of
CXX_STD_THREAD, making the code cleaner.
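
As an illustration only, here is a minimal sketch of the alias-or-stub
pattern.  CXX_STD_THREAD and gdb::mutex are the names used above; the
stub members, gdb::lock_guard, and guarded_increment are assumptions
made for this sketch and are not copied from gdbsupport/cxx-thread.h.

```cpp
#include <cassert>

#if CXX_STD_THREAD
#include <mutex>
#endif

// Sketch only: CXX_STD_THREAD is normally set by configure.  When it is
// unset (or zero), the single-threaded stubs below are used instead.
namespace gdb
{
#if CXX_STD_THREAD
using mutex = std::mutex;
template<typename M> using lock_guard = std::lock_guard<M>;
#else
// Hypothetical stub: locking is a no-op because only one thread exists.
struct mutex
{
  void lock () {}
  bool try_lock () { return true; }
  void unlock () {}
};

// Hypothetical stub: accepts a mutex and does nothing with it.
template<typename M>
struct lock_guard
{
  explicit lock_guard (M &) {}
};
#endif
} // namespace gdb

// Callers use the gdb:: names and need no #if of their own.
static int
guarded_increment (gdb::mutex &m, int &counter)
{
  gdb::lock_guard<gdb::mutex> guard (m);
  return ++counter;
}
```

The calling code is identical in both configurations, which is the
reduction in CXX_STD_THREAD checks that the patch is after.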

Not all spots using CXX_STD_THREAD could readily be converted.
In particular:

* Unit tests
* --config output
* Code manipulating threads themselves
* The extension interrupt-handling code

These all seem fine to me.

Note there's also a check in py-dap.c.  This one is perhaps slightly
subtle: DAP starts threads on the Python side, but it relies on gdb
itself being thread-savvy, for instance in gdb.post_event.

Approved-By: Simon Marchi <simon.marchi@efficios.com>
2025-10-02 11:55:14 -06:00
Tom Tromey
e576e948da Remove two unused includes
dwarf2/read.c no longer uses gdb::task_group, so the include isn't
needed.  Simon pointed out that the thread-pool.h include isn't needed
either.

Approved-By: Simon Marchi <simon.marchi@efficios.com>
2025-10-02 11:55:01 -06:00
Simon Marchi
dad36cf919 gdb/dwarf: use dynamic partitioning for DWARF CU indexing
The DWARF indexer splits the work statically based on the unit sizes,
attempting to give each worker thread about the same amount of bytes to
process.  This works relatively well with standard compilation.  But
when compiling with DWO files (-gsplit-dwarf), it's not as good.  I see
this when loading a relatively big program (telegram-desktop, which
includes a lot of static dependencies) compiled with -gsplit-dwarf:

    Time for "DWARF indexing worker": wall 0.000, user 0.000, sys 0.000, user+sys 0.000, -nan % CPU
    Time for "DWARF indexing worker": wall 0.001, user 0.000, sys 0.000, user+sys 0.000, 0.0 % CPU
    Time for "DWARF indexing worker": wall 0.001, user 0.001, sys 0.000, user+sys 0.001, 100.0 % CPU
    Time for "DWARF indexing worker": wall 0.748, user 0.284, sys 0.297, user+sys 0.581, 77.7 % CPU
    Time for "DWARF indexing worker": wall 0.818, user 0.408, sys 0.262, user+sys 0.670, 81.9 % CPU
    Time for "DWARF indexing worker": wall 1.196, user 0.580, sys 0.402, user+sys 0.982, 82.1 % CPU
    Time for "DWARF indexing worker": wall 1.250, user 0.511, sys 0.500, user+sys 1.011, 80.9 % CPU
    Time for "DWARF indexing worker": wall 7.730, user 5.891, sys 1.729, user+sys 7.620, 98.6 % CPU

Note how the wall times vary from 0 to 7.7 seconds.  This is
undesirable, because the indexing step as a whole takes as long as the
slowest worker thread.

The imbalance in this step also causes imbalance in the following
"finalize" step:

    Time for "DWARF finalize worker": wall 0.007, user 0.004, sys 0.002, user+sys 0.006, 85.7 % CPU
    Time for "DWARF finalize worker": wall 0.012, user 0.005, sys 0.005, user+sys 0.010, 83.3 % CPU
    Time for "DWARF finalize worker": wall 0.015, user 0.010, sys 0.004, user+sys 0.014, 93.3 % CPU
    Time for "DWARF finalize worker": wall 0.389, user 0.359, sys 0.029, user+sys 0.388, 99.7 % CPU
    Time for "DWARF finalize worker": wall 0.680, user 0.644, sys 0.035, user+sys 0.679, 99.9 % CPU
    Time for "DWARF finalize worker": wall 0.929, user 0.907, sys 0.020, user+sys 0.927, 99.8 % CPU
    Time for "DWARF finalize worker": wall 1.093, user 1.055, sys 0.037, user+sys 1.092, 99.9 % CPU
    Time for "DWARF finalize worker": wall 2.016, user 1.934, sys 0.082, user+sys 2.016, 100.0 % CPU
    Time for "DWARF finalize worker": wall 25.882, user 25.471, sys 0.404, user+sys 25.875, 100.0 % CPU

With DWO files, the split of the workload by size doesn't work, because
it is done using the size of the skeleton units in the main file, which
is not representative of how much DWARF is contained in each DWO file.

I haven't tried it, but a similar problem could occur with cross-unit
imports, which can happen with dwz or LTO.  You could have a small unit
that imports a lot from other units, in which case the size of the units
is not representative of the work to accomplish.

To try to improve this situation, change the DWARF indexer to use
dynamic partitioning, using gdb::parallel_for_each_async.  With this,
each worker thread pops one unit at a time from a shared work queue to
process it, until the queue is empty.  That should result in a more
balanced workload split.  I chose 1 as the minimum batch size here,
because I judged that indexing one CU was a big enough piece of work
compared to the synchronization overhead of the queue.  That can always
be tweaked later if someone wants to do more tests.
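
To illustrate the idea (this is not the gdb::parallel_for_each_async
implementation), here is a minimal sketch of dynamic partitioning with
a batch size of 1.  The name dynamic_for_each and the use of a shared
atomic counter as the "queue" are assumptions made for this sketch.

```cpp
#include <atomic>
#include <cassert>
#include <cstddef>
#include <thread>
#include <vector>

// Each worker repeatedly claims the next unclaimed item (batch size 1)
// from a shared counter until none are left.  process_one stands in for
// "index one CU" and is called with the item's index.
template<typename Func>
void
dynamic_for_each (std::size_t n_items, unsigned n_workers, Func process_one)
{
  std::atomic<std::size_t> next {0};

  auto worker = [&] ()
    {
      for (;;)
        {
          // Atomically take one index; no two workers get the same one.
          std::size_t i = next.fetch_add (1);
          if (i >= n_items)
            break;
          process_one (i);
        }
    };

  std::vector<std::thread> threads;
  for (unsigned t = 0; t < n_workers; ++t)
    threads.emplace_back (worker);
  for (std::thread &t : threads)
    t.join ();
}
```

A worker that happens to draw an expensive unit simply claims fewer
units overall, so no thread sits idle while work remains.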

As a result, the timings are much more balanced:

    Time for "DWARF indexing worker": wall 2.325, user 1.033, sys 0.573, user+sys 1.606, 69.1 % CPU
    Time for "DWARF indexing worker": wall 2.326, user 1.028, sys 0.568, user+sys 1.596, 68.6 % CPU
    Time for "DWARF indexing worker": wall 2.326, user 1.068, sys 0.513, user+sys 1.581, 68.0 % CPU
    Time for "DWARF indexing worker": wall 2.326, user 1.005, sys 0.579, user+sys 1.584, 68.1 % CPU
    Time for "DWARF indexing worker": wall 2.326, user 1.070, sys 0.516, user+sys 1.586, 68.2 % CPU
    Time for "DWARF indexing worker": wall 2.326, user 1.063, sys 0.584, user+sys 1.647, 70.8 % CPU
    Time for "DWARF indexing worker": wall 2.326, user 1.049, sys 0.550, user+sys 1.599, 68.7 % CPU
    Time for "DWARF indexing worker": wall 2.328, user 1.058, sys 0.541, user+sys 1.599, 68.7 % CPU
    ...
    Time for "DWARF finalize worker": wall 2.833, user 2.791, sys 0.040, user+sys 2.831, 99.9 % CPU
    Time for "DWARF finalize worker": wall 2.939, user 2.896, sys 0.043, user+sys 2.939, 100.0 % CPU
    Time for "DWARF finalize worker": wall 3.016, user 2.969, sys 0.046, user+sys 3.015, 100.0 % CPU
    Time for "DWARF finalize worker": wall 3.076, user 2.957, sys 0.118, user+sys 3.075, 100.0 % CPU
    Time for "DWARF finalize worker": wall 3.159, user 3.054, sys 0.104, user+sys 3.158, 100.0 % CPU
    Time for "DWARF finalize worker": wall 3.198, user 3.082, sys 0.114, user+sys 3.196, 99.9 % CPU
    Time for "DWARF finalize worker": wall 3.197, user 3.076, sys 0.121, user+sys 3.197, 100.0 % CPU
    Time for "DWARF finalize worker": wall 3.268, user 3.136, sys 0.131, user+sys 3.267, 100.0 % CPU
    Time for "DWARF finalize worker": wall 1.907, user 1.810, sys 0.096, user+sys 1.906, 99.9 % CPU

In absolute terms, the total time for GDB to load the file and exit goes
down from about 42 seconds to 17 seconds.

Some implementation notes:

 - The state previously kept as local variables in
   cooked_index_worker_debug_info::process_units becomes fields of the
   new parallel worker object.

 - The work previously done for each unit in
   cooked_index_worker_debug_info::process_units becomes the operator()
   of the new parallel worker object.

 - The work previously done at the end of
   cooked_index_worker_debug_info::process_units (including calling
   bfd_thread_cleanup) becomes the destructor of the new parallel worker
   object.

 - The "done" callback of gdb::task_group becomes the "done" callback of
   gdb::parallel_for_each_async.

 - I placed the parallel_indexing_worker struct inside
   cooked_index_worker_debug_info, so that it has access to
   cooked_index_worker_debug_info's private fields (e.g. m_results, to
   push the results).  It will also be possible to re-use it for
   skeletonless type units in a later patch.

Change-Id: I5dc5cf8793abe9ebe2659e78da38ffc94289e5f2
Approved-By: Tom Tromey <tom@tromey.com>
2025-09-30 19:37:20 +00:00
Tom Tromey
8bbc7f91fc Fix test in anonymous_struct_prefix
I noticed a bad test in dwarf2/read.c:anonymous_struct_prefix:

  attr = dw2_linkage_name_attr (die, cu);
  const char *attr_name = attr->as_string ();
  if (attr == NULL || attr_name == NULL)
    return NULL;

Here, if attr==NULL, this will crash before the test can be executed.

This patch fixes the problem by hoisting the test.  I'm checking this
in as obvious.
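
For illustration, here is a sketch of the hoisted form.  The type
attribute_sketch and the function linkage_name_or_null are hypothetical
stand-ins, not the code in dwarf2/read.c.

```cpp
#include <cassert>

// Hypothetical stand-in for a DWARF attribute; not GDB's real type.
struct attribute_sketch
{
  const char *value;
  const char *as_string () const { return value; }
};

// Test the pointer first, and only then dereference it.
static const char *
linkage_name_or_null (const attribute_sketch *attr)
{
  if (attr == nullptr)
    return nullptr;

  const char *attr_name = attr->as_string ();
  if (attr_name == nullptr)
    return nullptr;

  return attr_name;
}
```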
2025-09-20 14:08:19 -06:00
Keith Seitz
31cb4bb676 Correct bounds check when working around GAS DWARF 5 directory table bug
Recent Go toolchains are causing GDB to crash on a relatively recent
workaround for a GAS bug:

commit a833790a62
Date:   Wed Nov 1 00:33:12 2023 +0100

    [gdb/symtab] Work around gas PR28629

In the original GAS bug, the first directory table entry did not contain
the current directory of the compilation. So the above commit added a
workaround fix to prepend the second directory table entry.

However recent Go toolchain compilations (specifically on aarch64)
only output a single directory table entry. Looking at the workaround:

       if (lh->version == 5 && lh->is_valid_file_index (1))
         {
           std::string dir = lh->include_dir_at (1);
           fnd.set_comp_dir (std::move (dir));
         }

`lh->is_valid_file_index (1)' is true, but since the directory table only
has one entry, `include_dir_at (1)' returns nullptr. Consequently the
std::string ctor will segfault. Since there are no guarantees that the file
and directory tables are the same size, a better bounds check is to simply
rely on `include_dir_at' to ensure a valid directory table entry.
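
A minimal sketch of the revised check follows.  line_header_sketch and
comp_dir_from_second_entry are hypothetical names for this sketch, and
the DWARF version test from the real workaround is left out for
brevity.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <vector>

// Hypothetical, simplified line header: separate directory and file
// tables that need not have the same number of entries.
struct line_header_sketch
{
  std::vector<const char *> dirs;
  std::vector<const char *> files;

  bool is_valid_file_index (std::size_t i) const
  { return i < files.size (); }

  // Returns nullptr when the index is out of range.
  const char *include_dir_at (std::size_t i) const
  { return i < dirs.size () ? dirs[i] : nullptr; }
};

// Returns true and sets COMP_DIR only if directory entry 1 exists.
// The file table is deliberately not consulted.
static bool
comp_dir_from_second_entry (const line_header_sketch &lh,
                            std::string &comp_dir)
{
  const char *dir = lh.include_dir_at (1);
  if (dir == nullptr)
    return false;
  comp_dir = dir;
  return true;
}
```

With one directory entry and two file entries, the file-index test
would pass while the directory lookup fails, which is the case the
sketch guards against.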

I have updated the workaround commit's test, gdb.dwarf2/dw2-gas-workaround.exp
and tested on x86_64 and aarch64 RHEL 9 and Fedora 41.

Approved-By: Andrew Burgess <aburgess@redhat.com>
2025-09-19 09:50:46 -07:00
Tom Tromey
ae912a65f9 Rename expand_symtabs_matching
After this series, expand_symtabs_matching is now misnamed.  This
patch renames it, renames some associated types, and also fixes up
some comments that I previously missed.

Acked-By: Simon Marchi <simon.marchi@efficios.com>
2025-09-10 16:07:58 -06:00
Tom Tromey
8ac273ba2a Make dw_expand_symtabs_matching_file_matcher static
dw_expand_symtabs_matching_file_matcher is no longer needed outside of
read.c, so it can be made static.

Acked-By: Simon Marchi <simon.marchi@efficios.com>
2025-09-10 16:07:58 -06:00
Tom Tromey
486bc5ac81 Rewrite the .gdb_index reader
This patch rewrites the .gdb_index reader to create the same data
structures that are created by the cooked indexer and the .debug_names
reader.

This is done in support of this series; but also because, from what I
can tell, the "templates.exp" change didn't really work properly with
this reader.

In addition to fixing that problem, this patch removes a lot of code.

Implementing this required a couple of hacks, as .gdb_index does not
contain all the information that's used by the cooked index
implementation.

* The index-searching code likes to differentiate between the various
  DWARF tags when matching, but .gdb_index lumps many things into a
  single "other" category.  To handle this, we introduce a phony tag
  that's used so that the match method can match on multiple domains.

* Similarly, .gdb_index doesn't distinguish between the type and
  struct domains, so another phony tag is used for this.

* The reader must attempt to guess the language of various symbols.
  This is somewhat finicky.  "Plain" (unqualified) symbols are marked
  as language_unknown and then a couple of hacks are used to handle
  these -- one in expand_symtabs_matching and another when recognizing
  "main".

For what it's worth, I consider .gdb_index to be near the end of its
life.  While .debug_names is not perfect -- we found a number of bugs
in the standard while implementing it -- it is better than .gdb_index
and also better documented.

After this patch, we could conceivably remove dwarf_scanner_base.
However, I have not done this.

Finally, this patch also changes this reader to dump the content of
the index, as the other DWARF readers do.  This can be handy when
debugging gdb.

Acked-By: Simon Marchi <simon.marchi@efficios.com>
Bug: https://sourceware.org/bugzilla/show_bug.cgi?id=33316
2025-09-10 16:05:28 -06:00
Tom Tromey
f88f9f42db Have expand_symtabs_matching work for already-expanded CUs
Currently, gdb will search the already-expanded symtabs in one loop,
and then also expand matching symtabs in another loop.  However, this
is somewhat inefficient -- when searching the already-expanded
symtabs, all such symtabs are examined.  However, the various "quick"
implementations already know which subset of symtabs might have a
match.

This changes the contract of expand_symtabs_matching to also call the
callback for an already-expanded symtab.  With this change, and some
subsequent enabling changes, the number of searched symtabs should
sometimes be reduced.  This also cuts down on the amount of redundant
code.

Bug: https://sourceware.org/bugzilla/show_bug.cgi?id=16994
Bug: https://sourceware.org/bugzilla/show_bug.cgi?id=16998
Bug: https://sourceware.org/bugzilla/show_bug.cgi?id=30736
Acked-By: Simon Marchi <simon.marchi@efficios.com>
2025-09-10 16:05:28 -06:00
Tom Tromey
29fa4279c2 Remove dwarf2_per_cu_data::mark
This removes dwarf2_per_cu_data::mark, replacing it with a
locally-allocated boolean vector.  It also inverts the sense of the
flag -- now, the flag is true when a CU should be skipped, and false
when the CU should be further examined.  Also, the validity of the
flag is no longer dependent on 'file_matcher != NULL'.
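
As a rough sketch of the new arrangement (names are hypothetical, and
units are reduced to plain indices), the flag vector could be built
like this:

```cpp
#include <cassert>
#include <cstddef>
#include <functional>
#include <vector>

// Build a per-call "skip" vector instead of storing a flag in every CU.
// cu_skip[i] is true when unit i can be ignored.  With no file matcher,
// nothing is skipped, so the vector is meaningful either way.
static std::vector<bool>
compute_skip_flags (std::size_t n_units,
                    const std::function<bool (std::size_t)> &file_matcher)
{
  std::vector<bool> cu_skip (n_units, false);
  if (file_matcher)
    for (std::size_t i = 0; i < n_units; ++i)
      cu_skip[i] = !file_matcher (i);
  return cu_skip;
}
```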

This patch makes the subsequent patch to searching a bit simpler, so
I've separated it out.

Bug: https://sourceware.org/bugzilla/show_bug.cgi?id=16994
Bug: https://sourceware.org/bugzilla/show_bug.cgi?id=16998
Acked-By: Simon Marchi <simon.marchi@efficios.com>
2025-09-10 16:05:28 -06:00
Tom Tromey
a7343246f5 Ada import functions not in index
The cooked index does not currently contain entries for Ada import
functions.  This means that whether or not these are visible to
"break" depends on which CUs were previously expanded -- clearly a
bug.

This patch fixes the issue.  I think the comments in the patch explain
the fix reasonably well.

Perhaps one to-do item here is to change GNAT to use
DW_TAG_imported_declaration for these imports.  This may eventually
let us remove some of the current hacks.

This version includes a fix from Simon to initialize the new member.

Bug: https://sourceware.org/bugzilla/show_bug.cgi?id=32511
Acked-By: Simon Marchi <simon.marchi@efficios.com>
2025-09-10 16:05:27 -06:00
Tom Tromey
3719472095 Use gnulib c-ctype module in gdb
PR ada/33217 points out that gdb incorrectly calls the <ctype.h>
functions.  In particular, gdb feels free to pass a 'char' like:

    char *str = ...;
    ... isdigit (*str)

This is incorrect as isdigit only accepts EOF and values that can be
represented as 'unsigned char' -- that is, a cast is needed here to
avoid undefined behavior when 'char' is signed and a character in the
string might be sign-extended.  (As an aside, I think this API seems
obviously bad, but unfortunately this is what the standard says, and
some systems check this.)

Rather than adding casts everywhere, this changes all the code in gdb
that uses any <ctype.h> API to instead call the corresponding c-ctype
function.
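
To make the two options concrete, here is a small sketch.  The helper
names isdigit_with_cast and ascii_isdigit are hypothetical; the second
one only imitates the C-locale behavior of c-ctype and is not the
gnulib function.

```cpp
#include <cassert>
#include <cctype>

// Option 1: keep <cctype>, but convert to unsigned char first so that a
// negative 'char' value is never passed to std::isdigit.
static bool
isdigit_with_cast (char c)
{
  return std::isdigit (static_cast<unsigned char> (c)) != 0;
}

// Option 2: a locale-independent check that accepts any 'char' value,
// in the spirit of c-ctype.
static bool
ascii_isdigit (char c)
{
  return c >= '0' && c <= '9';
}
```

The second form needs no cast at the call site, which is why switching
APIs is less invasive than adding casts everywhere.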

Now, c-ctype has some limitations compared to <ctype.h>.  It works as
if the C locale is in effect, so in theory some non-ASCII characters
may be misclassified.  This would only affect a subset of character
sets, though, and in most places I think ASCII is sufficient -- for
example the many places in gdb that check for whitespace.
Furthermore, in practice most users are using UTF-8-based locales,
where these functions aren't really informative for non-ASCII
characters anyway; see the existing workarounds in gdb/c-support.h.

Note that safe-ctype.h cannot be used because it causes conflicts with
readline.h.  And, we cannot poison the <ctype.h> identifiers as this
provokes errors from some libstdc++ headers.

Bug: https://sourceware.org/bugzilla/show_bug.cgi?id=33217
Approved-By: Simon Marchi <simon.marchi@efficios.com>
2025-09-09 11:59:04 -06:00
Tom Tromey
247712bb94 Move compute_include_file_name earlier
I noticed that the compute_include_file_name intro comment was
slightly wrong, and while looking at this, I also noticed that it has
a single caller.  This patch hoists it slightly so that a forward
declaration isn't needed.

Approved-By: Simon Marchi <simon.marchi@efficios.com>
2025-09-08 18:52:47 -06:00
Tom Tromey
5b17433341 Move lnp_state_machine to new file
This patch moves lnp_state_machine and some supporting code to a new
file, dwarf2/line-program.c.  The main benefit of this is shrinking
dwarf2/read.c a bit.

Approved-By: Simon Marchi <simon.marchi@efficios.com>
2025-09-08 18:51:58 -06:00
Simon Marchi
4260abb7a7 gdb: rename address_class -> location_class
The enum address_class and related fields and methods seem misnamed to
me.  Generalize it to "location_class".  The enumerators in
address_class are already prefixed with LOC, so the new name seems
logical to me.  Rename related fields and methods as well.

Plus, address_class could easily be mistaken for other unrelated things
named "address class" in GDB or DWARF.

Tested by rebuilding.

Change-Id: I0dca3738df412b350715286c608041b08e9b4d82
Approved-by: Kevin Buettner <kevinb@redhat.com>
2025-08-19 09:49:46 -04:00
Simon Marchi
94de78f9d0 gdb/dwarf: clear per_bfd::num_{comp,type}_units on error
Commit bedd6a7a44 ("gdb/dwarf: track compilation and type unit count")
causes this internal error:

    $ ./gdb -nx -q --data-directory=data-directory testsuite/outputs/gdb.dwarf2/debug-names-duplicate-cu/debug-names-duplicate-cu -ex "save gdb-index -dwarf-5 /tmp" -batch

    warning: Section .debug_names has incorrect number of CUs in CU table, ignoring .debug_names.
    /home/smarchi/src/binutils-gdb/gdb/dwarf2/index-write.c:1454: internal-error: write_debug_names: Assertion `comp_unit_counter == per_bfd->num_comp_units' failed.

This is visible when running this test:

    $ make check TESTS="gdb.dwarf2/debug-names-duplicate-cu.exp" RUNTESTFLAGS="--target_board=cc-with-debug-names"
    ...
    Running /home/smarchi/src/binutils-gdb/gdb/testsuite/gdb.dwarf2/debug-names-duplicate-cu.exp ...
    gdb compile failed, warning: Section .debug_names has incorrect number of CUs in CU table, ignoring .debug_names.
    /home/smarchi/src/binutils-gdb/gdb/dwarf2/index-write.c:1454: internal-error: write_debug_names: Assertion `comp_unit_counter == per_bfd->num_comp_units' failed.
    ...
                    === gdb Summary ===

    # of untested testcases         1

However, it's easy to miss because it only causes an "UNTESTED" to be
recorded, not a FAIL or UNRESOLVED.  This is because the problem happens
while trying to create the .debug_names index, as part of the test case
compilation.

The problem is: when we bail out from using .debug_names because we
detect it is inconsistent with the units in .debug_info, we clear
per_bfd->all_units, to destroy all units previously created, before
proceeding to read the units with an index.  However, we don't clear
per_bfd->num_{comp,type}_units.  As a result, per_bfd->all_units
contains one unit, while per_bfd->num_comp_units is 2.  Whenever we
clear per_bfd->all_units, we should also clear
per_bfd->num_{comp,type}_units.

While at it, move this logic inside a scoped object.
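
A scoped object of this kind might look roughly like the sketch below.
per_bfd_sketch and scoped_remove_all_units_sketch are hypothetical
names, and units are reduced to plain integers; this is not the class
added by the patch.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical, reduced version of the per-BFD state named above.
struct per_bfd_sketch
{
  std::vector<int> all_units;
  std::size_t num_comp_units = 0;
  std::size_t num_type_units = 0;
};

// Clears the unit list and both counters together unless release () is
// called, so the three cannot get out of step on an error path.
class scoped_remove_all_units_sketch
{
public:
  explicit scoped_remove_all_units_sketch (per_bfd_sketch &per_bfd)
    : m_per_bfd (&per_bfd)
  {}

  ~scoped_remove_all_units_sketch ()
  {
    if (m_per_bfd == nullptr)
      return;
    m_per_bfd->all_units.clear ();
    m_per_bfd->num_comp_units = 0;
    m_per_bfd->num_type_units = 0;
  }

  // Call on success to keep the units.
  void release () { m_per_bfd = nullptr; }

  scoped_remove_all_units_sketch
    (const scoped_remove_all_units_sketch &) = delete;
  scoped_remove_all_units_sketch &operator=
    (const scoped_remove_all_units_sketch &) = delete;

private:
  per_bfd_sketch *m_per_bfd;
};
```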

I added an assertion in finalize_all_units to verify that the size of
per_bfd->all_units equals per_bfd->num_comp_units +
per_bfd->num_type_units.  This makes the problem (if omitting the fix)
visible when running gdb.dwarf2/debug-names-duplicate-cu.exp with the
unix (default) target board:

    ERROR: Couldn't load debug-names-duplicate-cu into GDB (GDB internal error).
    FAIL: gdb.dwarf2/debug-names-duplicate-cu.exp: find index type (GDB internal error)
    FAIL: gdb.dwarf2/debug-names-duplicate-cu.exp: find index type, check type is valid

                    === gdb Summary ===

    # of expected passes            1
    # of unexpected failures        2
    # of unresolved testcases       1

I considered changing the code to build a local vector of units first,
then move it in per_bfd->all_units on success, that would avoid having
to clean it up on error.  I did not do it because it's a much larger
change, but we could consider it.

Change-Id: I49bcc0cb4b34aba3d882b27c8a93c168e8875c08
Approved-By: Tom Tromey <tom@tromey.com>
2025-08-14 12:49:20 -04:00
Tom Tromey
89495c3326 Change type::fields to return an array_view
This patch changes type::fields to return an array_view of the fields,
then fixes up the fallout.

More cleanups would be possible here (in particular in the field
initialization code) but I haven't done so.

The main motivation for this patch was to make it simpler to iterate
over the fields of a type.

Regression tested on x86-64 Fedora 41.

Approved-By: Simon Marchi <simon.marchi@efficios.com>
2025-08-12 08:30:37 -06:00
Simon Marchi
6474c699a5 gdb/dwarf: sort dwarf2_per_bfd::all_units by (section, offset)
This patch started as a fix for PR 29518 ("GDB doesn't handle
DW_FORM_ref_addr DIE references correctly with .debug_types sections")
[1], but the scope has expanded a bit to fix the problem more generally,
after I spotted a few issues related to the order of all_units.  The
first version of this patch is here [2].

PR 29518 shows that dwarf2_find_containing_comp_unit can erroneously
find a type unit.  The obvious problem is that the
dwarf2_find_containing_comp_unit function searches the whole all_units
vector (containing both comp and type units), when really it should just
search the compilation units.  A simple solution would be to make it
search the all_comp_units view (which is removed in a patch earlier in
this series).

I then realized that in DWARF 5, since type units are in .debug_info
(versus .debug_types in DWARF 4), type units can be interleaved with
comp units in the all_units vector.  That would make the all_comp_units
and all_type_units views erroneous, and dwarf2_find_containing_comp_unit
could still return something wrong.  In v1, I added a sort in
finalize_all_units to make sure all_units is in the order that
dwarf2_find_containing_comp_unit expects:

 - comp units from the main file
 - type units from the main file
 - comp units from the dwz file
 - type units from the dwz file (not actually supported, see PR 30838)

Another problem I spotted is that the .gdb_index reader creates units in
this order:

 - comp units from .gdb_index from main file
 - comp units from .gdb_index from dwz file
 - type units from .gdb_index from main file

This isn't the same order as above, so it would need the same sort step.

Finally, I'm not exactly sure if and when it happens, but it looks like
lookup_signatured_type can be called at a later time (after the initial
scan and creation of dwarf2_per_cu objects), when expanding a
symtab.  And that could lead to the creation of a new type unit (see
function add_type_unit), which would place the new type unit at the end
of the all_units vector, possibly screwing up the previous order.

To handle all this in a nice and generic way, Tom Tromey proposed to
change the all_units order, so that units are sorted by section, then
section offset.  This is what this patch implements.  The sorting is
done in finalize_all_units.

This works well, because when looking up a unit by section offset, the
caller knows which section the unit is in.  Passing down a (section,
section offset) tuple makes it clear and unambiguous what unit the
caller is referring to.  It should help eliminate some bugs where the
callee used the section offset in the wrong section.  Passing down the
section along with the section offset replaces the "is_dwz" flag passed
to dwarf2_find_containing_comp_unit and a bunch of other functions in a
more general way.

dwarf2_find_containing_comp_unit can now legitimately find and return
type units even though it shouldn't be needed (type units are typically
referred to by signature).  But I don't think there is harm for this
function to be more generic than needed.  I therefore renamed it to
dwarf2_find_containing_unit.

The sort criterion for "section" can be anything, as long as we use the
same for sorting and searching.  In this patch, I use the pointer to
dwarf2_section_info, because it's easy.  The downside is that the actual
order depends on what the memory allocator decided to return, so could
change from run to run, or machine to machine.  Later, I might change it
so that sections are ordered based on their properties, making the order
stable across the board.  This logic is encapsulated in the
all_units_less_than function, so it's easy to change.
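
The ordering and the containing-unit lookup can be sketched as follows.
This is an illustration under assumptions: unit_sketch is a reduced
record, the section key is an integer id rather than a
dwarf2_section_info pointer, and the function names are hypothetical.

```cpp
#include <algorithm>
#include <cassert>
#include <cstdint>
#include <tuple>
#include <vector>

// Hypothetical, reduced unit record.
struct unit_sketch
{
  int section;            // opaque section key
  std::uint64_t offset;   // start of the unit within its section
  std::uint64_t length;   // size of the unit in bytes
};

// Order by section first, then by offset within the section.
static bool
unit_less_than (const unit_sketch &a, const unit_sketch &b)
{
  return std::tie (a.section, a.offset) < std::tie (b.section, b.offset);
}

static void
sort_units (std::vector<unit_sketch> &units)
{
  std::sort (units.begin (), units.end (), unit_less_than);
}

// Find the unit in SECTION whose byte range contains OFFSET, or nullptr.
// UNITS must already be sorted with sort_units.
static const unit_sketch *
find_containing_unit (const std::vector<unit_sketch> &units,
                      int section, std::uint64_t offset)
{
  unit_sketch key {section, offset, 0};
  // First unit that sorts strictly after (section, offset) ...
  auto it = std::upper_bound (units.begin (), units.end (), key,
                              unit_less_than);
  if (it == units.begin ())
    return nullptr;
  // ... so the only candidate is the unit just before it.
  --it;
  if (it->section != section || offset >= it->offset + it->length)
    return nullptr;
  return &*it;
}
```

Because the same comparison is used for sorting and searching, any
consistent section key works, which matches the point made above about
the sort criterion.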

The .debug_names reader can no longer rely on the order of the all_units
vector for its checks, since all_units won't be the same order as found
in the .debug_names lists.  In fact, even before, it wasn't: this check
assumed that .debug_info had all CUs before TUs, and that the index
listed them in the exact same order.  When I build a file with gcc and
"-gdwarf-5 -fdebug-types-section", type units appear first in
.debug_info.  This caused GDB to reject a .debug_names index that it had
produced:

    $ GDB="./gdb -nx -q --data-directory=data-directory" /home/smarchi/src/binutils-gdb/gdb/contrib/gdb-add-index.sh -dwarf-5 hello.so
    $ ./gdb -nx -q --data-directory=data-directory hello.so
    Reading symbols from hello.so...

    ⚠️  warning: Section .debug_names has incorrect entry in CU table, ignoring .debug_names.

To make it work, add a new dwarf2_find_unit function that allows looking
up a unit by start address (unlike dwarf2_find_containing_unit, which
can find by any containing address), and make the .debug_names reader
use it.  It might make the load time of .debug_names a bit longer (the
build and check step is now going to be O(n*log(n)) instead of O(n)
where n is the number of units, or something like that), but I think
it's important to be correct here.
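
A lookup by exact start offset can be sketched with a binary search
over the sorted vector.  unit_key_sketch and find_unit_exact are
hypothetical names, not the dwarf2_find_unit signature; doing one such
O(log n) search for each of the n index entries gives the O(n*log(n))
figure mentioned above.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <tuple>
#include <vector>

// Hypothetical key identifying where a unit starts.
struct unit_key_sketch
{
  int section;
  std::uint64_t offset;
};

static bool
key_less_than (const unit_key_sketch &a, const unit_key_sketch &b)
{
  return std::tie (a.section, a.offset) < std::tie (b.section, b.offset);
}

// Return the index of the unit that starts exactly at (SECTION, OFFSET)
// in SORTED_UNITS, or -1 if no unit starts there.
static std::ptrdiff_t
find_unit_exact (const std::vector<unit_key_sketch> &sorted_units,
                 int section, std::uint64_t offset)
{
  unit_key_sketch key {section, offset};
  auto it = std::lower_bound (sorted_units.begin (), sorted_units.end (),
                              key, key_less_than);
  if (it == sorted_units.end ()
      || it->section != section
      || it->offset != offset)
    return -1;
  return it - sorted_units.begin ();
}
```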

This patch adds a test
(gdb.dwarf2/dw-form-ref-addr-with-type-units.exp), which tries to
replicate the problem as shown by PR 29518.

gdb.base/varval.exp needs a small change, because an error message
changes (for the better, I think).

gdb.dwarf2/debug-names-non-ascending-cu.exp now fails, because GDB no
longer rejects a .debug_names index which lists CUs in a different order
than .debug_info.  Given the change I did to the .debug_names reader,
explained above, I don't think this is a problem anymore (GDB can accept
an index like that).  I also don't think that DWARF 5 mandates that CUs
are in ascending order.  Delete this test.

[1] https://sourceware.org/bugzilla/show_bug.cgi?id=29518
[2] https://inbox.sourceware.org/gdb-patches/20250218193443.118139-1-simon.marchi@efficios.com/

Change-Id: I45f982d824d3842ac1eb73f8cce721a0a24b5faa
Approved-By: Tom Tromey <tom@tromey.com>
2025-08-01 00:25:54 -04:00
Simon Marchi
3e27b49025 gdb/dwarf: remove all_{comp,type}_units views
In DWARF 5, type units appear in the .debug_info section, interleaved
with comp units, and the order in all_units reflects that.  The
all_comp_units and all_type_units views are wrong in that case
(all_comp_units contains some type units, and vice-versa).

It would be possible to manually sort all_units to ensure that type
units follow comp units, but this series takes the approach of sorting
the units by section and section offset.

Remove those views, and replace their uses with num_comp_units and
num_type_units.  It appears that the views were only used to know the
number of each kind.

The finalize_all_units function is now empty, but I am keeping it
because a subsequent patch adds a call to std::sort in there to sort the
all_units vector.

Change-Id: I42a65b6f1b6192957b55cea0e2eaff097e13a33b
Approved-By: Tom Tromey <tom@tromey.com>
2025-08-01 00:25:15 -04:00
Simon Marchi
bedd6a7a44 gdb/dwarf: track compilation and type unit count
A subsequent commit will remove the all_comp_units and all_type_units
array views, since it's not possible to assume that the all_units
vector is segmented between comp and type units.  Some callers still
need to know the number of each kind, so track that separately.

Change-Id: I712fbdfbf10b333c431b688b881cc0987e74f688
Approved-By: Tom Tromey <tom@tromey.com>
2025-08-01 00:25:15 -04:00
Simon Marchi
91bca5d7bc gdb/dwarf: apply DW_AT_bit_offset when DW_AT_data_member_location is constant block
Since commit 420d030e88 ("Handle field with dynamic bit offset"), I see:

    $ make check TESTS="gdb.trace/unavailable-dwarf-piece.exp" RUNTESTFLAGS="--target_board=native-extended-gdbserver"
    FAIL: gdb.trace/unavailable-dwarf-piece.exp: tracing bar: p/d x
    FAIL: gdb.trace/unavailable-dwarf-piece.exp: tracing bar: p/d y
    FAIL: gdb.trace/unavailable-dwarf-piece.exp: tracing bar: p/d z

The first FAIL is:

    p/d x
    $4 = {a = 0, b = <unavailable>, c = <unavailable>, d = <unavailable>, e = <unavailable>, f = <unavailable>, g = <unavailable>, h = <unavailable>, i = <unavailable>, j = 0}
    (gdb) FAIL: gdb.trace/unavailable-dwarf-piece.exp: tracing bar: p/d x

When we should see:

    p/d x
    $4 = {a = 0, b = <unavailable>, c = 0, d = 0, e = 0, f = 0, g = 0, h = 0, i = 0, j = 0}
    (gdb) PASS: gdb.trace/unavailable-dwarf-piece.exp: tracing bar: p/d x

The structure we print is:

    0x0000004f:   DW_TAG_structure_type
                    DW_AT_name [DW_FORM_string]     ("t")
                    DW_AT_byte_size [DW_FORM_sdata] (3)
                    DW_AT_decl_file [DW_FORM_udata] (0)
                    DW_AT_decl_line [DW_FORM_udata] (1)

    0x00000055:     DW_TAG_member
                      DW_AT_name [DW_FORM_string]   ("a")
                      DW_AT_type [DW_FORM_ref4]     (0x00000019 "unsigned char")
                      DW_AT_data_member_location [DW_FORM_exprloc]  (DW_OP_plus_uconst 0x0)

    0x0000005f:     DW_TAG_member
                      DW_AT_name [DW_FORM_string]   ("b")
                      DW_AT_type [DW_FORM_ref4]     (0x00000019 "unsigned char")
                      DW_AT_byte_size [DW_FORM_sdata]       (1)
                      DW_AT_bit_size [DW_FORM_sdata]        (1)
                      DW_AT_bit_offset [DW_FORM_sdata]      (7)
                      DW_AT_data_member_location [DW_FORM_exprloc]  (DW_OP_plus_uconst 0x1)

    ...

The particularity of field "b" (and the following ones, not shown here)
is that they have:

 - a DW_AT_data_member_location of expression form, but that GDB reduces
   to a constant
 - a DW_AT_bit_offset

What I think happens is that the code path taken in this particular
scenario never ends up using the DW_AT_bit_offset value.  Fix it by
calling apply_bit_offset_to_field, like what is done when
data_member_location_attr is using a constant form.
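
For reference, here is a sketch of how a DWARF 2/3-style
DW_AT_bit_offset is conventionally turned into a bit position.  The
function field_bit_position and its parameters are hypothetical; this
is my understanding of the usual conversion, not a copy of
apply_bit_offset_to_field.

```cpp
#include <cassert>

// Convert a DW_AT_bit_offset (counted from the most significant bit of
// the storage unit) into a bit position counted from the start of the
// structure.  All names here are illustrative.
static long
field_bit_position (long member_byte_offset, long storage_byte_size,
                    long bit_size, long bit_offset, bool big_endian)
{
  long base = member_byte_offset * 8;
  if (big_endian)
    return base + bit_offset;
  // Little endian: go to the MSB of the storage unit, then step back
  // over the bits to the left of the field and the field itself.
  return base + storage_byte_size * 8 - bit_offset - bit_size;
}
```

With the attributes of field "b" above (location 1, byte size 1, bit
size 1, bit offset 7), this yields bit position 8 on a little-endian
target and 15 on a big-endian one.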

Bug: https://sourceware.org/bugzilla/show_bug.cgi?id=33136
Change-Id: I18e838e6c56a548495d3af332aeff3051188eaa9
Approved-By: Tom Tromey <tom@tromey.com>
2025-07-25 09:25:37 -04:00
Simon Marchi
165d75b0ec gdb/dwarf: rename some variables in handle_member_location
For legibility, use more specific names for attribute variables and
don't reuse them for different attributes.

Change-Id: I98d8bb32fc64b5f6357fbc88f6fe93f2ddc8ef7c
Approved-By: Tom Tromey <tom@tromey.com>
2025-07-25 09:25:37 -04:00
Tom Tromey
5fe70629ce Change file initialization to use INIT_GDB_FILE macro
This patch introduces a new macro, INIT_GDB_FILE.  This is used to
replace the current "_initialize_" idiom when introducing a per-file
initialization function.  That is, rather than write:

    void _initialize_something ();
    void
    _initialize_something ()
    {
       ...
    }

... now you would write:

    INIT_GDB_FILE (something)
    {
       ...
    }

The macro handles both the declaration and definition of the function.
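
One plausible shape for such a macro is sketched below.  The expansion
shown is an assumption for illustration rather than a quote of the
definition in the patch, and something_initialized is a hypothetical
variable used only to make the effect observable.

```cpp
#include <cassert>

// Assumed shape of the macro: declare the initializer, then open its
// definition so that the braces that follow become the function body.
#define INIT_GDB_FILE(NAME) \
  extern void _initialize_##NAME (); \
  void _initialize_##NAME ()

static int something_initialized = 0;

// Expands to a declaration plus the definition of
// _initialize_something.
INIT_GDB_FILE (something)
{
  something_initialized = 1;
}
```

Since the declaration and the definition come from a single macro
invocation, a tool scanning for initializers only has to recognize one
pattern.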

The point of this approach is that it makes it harder to accidentally
cause an initializer to be omitted; see commit 2711e475 ("Ensure
cooked_index_entry self-tests are run").  Specifically, the regexp now
used by make-init-c seems harder to trick.

New in v2: un-did some erroneous changes made by the script.

The bulk of this patch was written by script.
Regression tested on x86-64 Fedora 41.
2025-06-26 06:15:59 -06:00
Simon Marchi
7af3b05ce9 gdb/dwarf: change CUs -> units in print_stats
Change the messages to reflect that these numbers include type units,
not only compile units.

Change-Id: Id2f511d4666e5cf92112be917d72ff76791b7e1d
Approved-by: Kevin Buettner <kevinb@redhat.com>
2025-06-19 13:17:51 -04:00
Simon Marchi
b3f4f211e2 gdb/dwarf: rename get_cu -> get_unit
This method returns type units too, so "get_unit" is a better name.

Change-Id: I6ec9de3f783637a3e206bcaaec96a4e00b4b7d31
Approved-By: Tom Tromey <tom@tromey.com>
2025-06-17 14:51:42 -04:00
Simon Marchi
e4a998f4b6 gdb/dwarf2: remove erroneous comment in open_and_init_dwo_file
When writing commit 28f15782ad ("gdb/dwarf: read multiple .debug_info.dwo
sections"), I initially thought that the gcc behavior of producing multiple
.debug_info.dwo sections was a bug (it is not).  I updated the commit
message, but it looks like this comment stayed.  Remove it, since it can
be misleading.

Change-Id: I027712d44b778e836f41afbfafab993da02726ef
Approved-By: Tom Tromey <tom@tromey.com>
2025-06-10 11:05:47 -04:00
Tom Tromey
21b25b168d Fix regression with DW_AT_bit_offset handling
Internal AdaCore testing using -gdwarf-4 found a spot where GCC will
emit a negative DW_AT_bit_offset.  However, my recent signed/unsigned
changes assumed that this value had to be positive.

I feel this bug somewhat invalidates my previous thinking about how
DWARF attributes should be handled.

In particular, both GCC and LLVM understand that a negative bit
offset can be generated -- but for positive offsets they might use a
smaller "data" form, which is expected not to be sign-extended.  LLVM
has similar code but GCC does:

  if (bit_offset < 0)
    add_AT_int (die, DW_AT_bit_offset, bit_offset);
  else
    add_AT_unsigned (die, DW_AT_bit_offset, (unsigned HOST_WIDE_INT) bit_offset);

What this means is that this attribute is "signed but default
unsigned".

To fix this, I've added a new attribute::confused_constant method.
This should be used when a constant value might be signed, but where
narrow forms (e.g., DW_FORM_data1) should *not* cause sign extension.
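
The "signed but default unsigned" idea can be sketched roughly as
follows.  This is an assumption-laden illustration, not the actual
attribute::confused_constant code: the form enum, the attr_value struct
and the function name are all made up for the example.

```cpp
#include <cstdint>

// Hypothetical, simplified stand-ins for DWARF constant forms.
enum class form { data1, data2, data4, data8, udata, sdata };

// RAW holds the bits as read from the section.  Fixed-size and udata
// forms are stored zero-extended; sdata is stored as the decoded
// two's-complement value.
struct attr_value
{
  form f;
  uint64_t raw;
};

// Interpret a constant that may be signed, without sign-extending the
// narrow "data" forms.
int64_t
confused_constant_sketch (const attr_value &a)
{
  switch (a.f)
    {
    case form::sdata:
      // Explicitly signed form: the value may be negative.
      return (int64_t) a.raw;
    case form::data1:
    case form::data2:
    case form::data4:
    case form::data8:
    case form::udata:
      // RAW is already zero-extended, so 0xff in data1 stays 255.
      return (int64_t) a.raw;
    }
  return 0;
}
```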

I examined the GCC and LLVM DWARF writers to come up with the list of
attributes where this applies, namely DW_AT_bit_offset,
DW_AT_const_value and DW_AT_data_member_location (GCC only, but LLVM
always emits it as unsigned, so we're safe here).

This patch corrects the bug and imports the relevant test case.

Regression tested on x86-64 Fedora 41.

Bug: https://sourceware.org/bugzilla/show_bug.cgi?id=32680
Bug: https://gcc.gnu.org/bugzilla/show_bug.cgi?id=118837
Approved-By: Simon Marchi <simon.marchi@efficios.com>
2025-06-06 10:13:23 -06:00
Tom Tromey
692252c4b0 Handle dynamic DW_AT_data_bit_offset
In Ada, a field can have a dynamic bit offset in its enclosing record.

In DWARF 3, this was handled using a dynamic
DW_AT_data_member_location, combined with a DW_AT_bit_offset -- this
combination worked out ok because in practice GNAT only needs a
dynamic byte offset with a fixed offset within the byte.

However, this approach was deprecated in DWARF 4 and then removed in
DWARF 5.  No replacement approach was given, meaning that in strict
mode there is no way to express this.

This is a DWARF bug, see

    https://dwarfstd.org/issues/250501.1.html

In a discussion on the DWARF mailing list, a couple people mentioned
that compilers could use the obvious extension of a dynamic
DW_AT_data_bit_offset.  I've implemented this for LLVM:

    https://github.com/llvm/llvm-project/pull/141106

In preparation for that landing, this patch implements support for
this construct in gdb.

New in v2: renamed some constants and added a helper method, per
Simon's review.

New in v3: more renamings.

Approved-By: Simon Marchi <simon.marchi@efficios.com>
2025-06-03 07:17:53 -06:00
Tom de Vries
c6115b5eac [gdb/cli] Use captured per_command_time in worker threads
With test-case gdb.base/maint.exp, I ran into:
...
(gdb) file maint^M
Reading symbols from maint...^M
(gdb) mt set per-command on^M
(gdb) Time for "DWARF indexing worker": ...^M
Time for "DWARF indexing worker": ...^M
Time for "DWARF indexing worker": ...^M
Time for "DWARF indexing worker": ...^M
Time for "DWARF skeletonless type units": ...^M
Time for "DWARF add parent map": ...^M
Time for "DWARF finalize worker": ...^M
Time for "DWARF finalize worker": ...^M
Time for "DWARF finalize worker": ...^M
Time for "DWARF finalize worker": ...^M
Time for "DWARF finalize worker": ...^M
FAIL: $exp: warnings: per-command: mt set per-command on (timeout)
mt set per-command off^M
2025-05-31 09:33:44.711 - command started^M
(gdb) PASS: $exp: warnings: per-command: mt set per-command off
...

I didn't manage to reproduce this by rerunning the test-case, but it's fairly
easy to reproduce using a file with more debug info, for instance gdb:
...
$ gdb -q -batch -ex "file build/gdb/gdb" -ex "mt set per-command on"
...

Due to the default "mt dwarf synchronous" == off, the file command starts
building the cooked index in the background, and returns immediately without
waiting for the result.

The subsequent "mt set per-command on" implies "mt set per-command time on",
which switches on displaying of per-command execution time.

The "Time for" lines are the result of those two commands, but these lines
shouldn't be there because "mt per-command time" == off at the point of
issuing the file command.

Fix this by capturing the per_command_time variable, and using the captured
value instead.
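
The capture idea can be sketched like this.  It is a minimal
illustration under assumed names (make_worker_task and the output
vector are hypothetical), not the gdb implementation.

```cpp
#include <functional>
#include <string>
#include <vector>

// Stands in for the "mt set per-command time" setting.
static bool per_command_time = false;

// Stands in for what would be printed to the terminal.
static std::vector<std::string> output;

// Create a background task.  The setting is sampled here, when the
// task is created, rather than later when the task happens to run.
std::function<void ()>
make_worker_task ()
{
  bool captured = per_command_time;
  return [captured] ()
    {
      if (captured)
	output.push_back ("Time for \"worker\": ...");
    };
}
```

A task created while the setting is off stays silent even if the
setting is switched on before the task runs.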

Tested on x86_64-linux.

Approved-By: Simon Marchi <simon.marchi@efficios.com>

PR cli/33039
Bug: https://sourceware.org/bugzilla/show_bug.cgi?id=33039
2025-06-03 08:59:58 +02:00
Tom de Vries
11cb20e27b [gdb/symtab] Note errors in process_skeletonless_type_units
With a hello world a.out, and using the compiler flags from target board
dwarf5-fission-debug-types:
...
$ gcc -gdwarf-5 -fdebug-types-section -gsplit-dwarf ~/data/hello.c
...
I run into:
...
$ gdb -q -batch a.out
terminate called after throwing an instance of 'gdb_exception_error'
...

What happens is that an error is thrown due to invalid dwarf, but the error is
not caught, causing gdb to terminate.

In a way, this is a duplicate of PR32861, in the sense that we no longer run
into this after:
- applying the proposed patch (work around compiler bug), or
- using gcc 9 or newer (compiler bug fixed).

But in this case, the failure mode is worse than in PR32861.

Fix this by catching the error in
cooked_index_worker_debug_info::process_skeletonless_type_units.

With the patch, we get instead:
...
$ gdb -q -batch a.out
Offset from DW_FORM_GNU_str_index or DW_FORM_strx pointing outside of \
  .debug_str.dwo section in CU at offset 0x0 [in module a.out]
...

While we're at it, absorb the common use of
cooked_index_worker_result::note_error:
...
  try
    {
      ...
    }
  catch (gdb_exception &exc)
    {
      (...).note_error (std::move (exc));
    }
...
into the method and rename it to catch_error, resulting in more compact code
for the fix:
...
  (...).catch_error ([&] ()
    {
      ...
    });
...
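
For illustration, a wrapper of this shape could look as follows.  This
sketch uses std::exception in place of gdb_exception, and the class
name worker_result_sketch is hypothetical.

```cpp
#include <exception>
#include <stdexcept>
#include <string>
#include <utility>
#include <vector>

class worker_result_sketch
{
public:
  // Run BODY; if it throws, record the error instead of letting the
  // exception escape from the worker.
  template<typename F>
  void catch_error (F &&body)
  {
    try
      {
	body ();
      }
    catch (const std::exception &exc)
      {
	m_errors.push_back (exc.what ());
      }
  }

  // Errors recorded so far, in the order they occurred.
  const std::vector<std::string> &errors () const
  { return m_errors; }

private:
  std::vector<std::string> m_errors;
};
```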

While we're at it, also use it in
cooked_index_worker_debug_info::process_units which looks like it needs the
same fix.

Tested on x86_64-linux.

PR symtab/32979
Bug: https://sourceware.org/bugzilla/show_bug.cgi?id=32979
2025-05-28 22:17:19 +02:00
Tom de Vries
e64cd55419 [gdb/build] Fix unused var in lookup_dwo_unit_in_dwp
On x86_64-linux, with gcc 7.5.0 I ran into a build breaker:
...
gdb/dwarf2/read.c: In function ‘dwo_unit* lookup_dwo_unit_in_dwp()’:
gdb/dwarf2/read.c:7403:22: error: unused variable ‘inserted’ \
  [-Werror=unused-variable]
    auto [it, inserted] = dwo_unit_set.emplace (std::move (dwo_unit));
                      ^
...

Fix this by dropping the unused variable.

Tested on x86_64-linux, by completing a build.
2025-05-24 10:27:12 +02:00
Simon Marchi
fbf19b6cc6 gdb/dwarf: split dwo_lock in more granular locks
The dwo_lock mutex is used to synchronize access to some dwo/dwp-related
data structures, such as dwarf2_per_bfd::dwo_files and
dwp_file::loaded_{cus,tus}.  Right now the scope of the lock is kind of
coarse.  It is taken in the top-level lookup_dwo_unit function, and held
while the thread:

 - looks up an existing dwo_file in the per-bfd hash table for the given
   id/signature
 - if there's no existing dwo_file, attempt to find a .dwo file, open
   it, build the list of units it contains
 - if a new dwo_file was created, insert it in the per-bfd hash table
 - look up the desired unit in the dwo_file

And something similar for the dwp code path.  This means that two
indexing threads can't read in two dwo files simultaneously.  This isn't
ideal in terms of parallelism.

This patch breaks this lock into 3 more fine grained locks:

 - one lock to access dwarf2_per_bfd::dwo_files
 - one lock to access dwp_file::loaded_{cus,tus}
 - one lock in try_open_dwop_file, where we do two operations that
   aren't thread safe (bfd_check_format and gdb_bfd_record_inclusion)

Unfortunately I don't see a clear speedup on my computer with 8 threads.
But the change shouldn't hurt, in theory, and hopefully this can be a
piece that helps in making GDB scale better on machines with many cores
(if we ever bump the max number of worker threads).

This patch uses "double-checked locking" to avoid holding the lock(s)
for the whole duration of reading in dwo files.  The idea is, when
looking up a dwo with a given name:

 - with the lock held, check for an existing dwo_file with that name in
   dwarf2_per_bfd::dwo_files, if found return it
 - if not found, drop the lock, load the dwo file and create a dwo_file
   describing it
 - with the lock held, attempt to insert the new dwo_file in
   dwarf2_per_bfd::dwo_files.  If an entry exists, it means another
   thread simultaneously created an equivalent dwo_file, but won the
   race.  Drop the new dwo_file and use the existing one.  The new
   dwo_file is automatically deleted, because it is held by a unique_ptr
   and the insertion into the unordered_set fails.

Note that it shouldn't normally happen for two threads to look up a dwo
file with the same name, since different units will point to different
dwo files.  But if it were to happen, we handle it.  This way of doing
things allows two threads to read in two different dwo files
simultaneously, which in theory should help get better parallelism.  The
same technique is used for dwp_file::loaded_{cus,tus}.
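
The lookup steps above can be sketched as follows.  This is a
simplified illustration with hypothetical names (dwo_file_sketch,
dwo_table_sketch), using a map keyed by name rather than gdb's actual
data structures.

```cpp
#include <memory>
#include <mutex>
#include <string>
#include <unordered_map>
#include <utility>

// Hypothetical stand-in for a loaded dwo file.
struct dwo_file_sketch
{
  std::string name;
};

class dwo_table_sketch
{
public:
  dwo_file_sketch *lookup_or_load (const std::string &name)
  {
    {
      // Step 1: with the lock held, look for an existing entry.
      std::lock_guard<std::mutex> guard (m_lock);
      auto it = m_files.find (name);
      if (it != m_files.end ())
	return it->second.get ();
    }

    // Step 2: without the lock, do the expensive work of loading.
    std::unique_ptr<dwo_file_sketch> fresh (new dwo_file_sketch);
    fresh->name = name;

    // Step 3: with the lock held again, try to insert.  If another
    // thread won the race, the insertion fails, the existing entry is
    // returned, and the object created here is destroyed.
    std::lock_guard<std::mutex> guard (m_lock);
    auto result = m_files.emplace (name, std::move (fresh));
    return result.first->second.get ();
  }

private:
  std::mutex m_lock;
  std::unordered_map<std::string,
		     std::unique_ptr<dwo_file_sketch>> m_files;
};
```

The lock is never held during the load itself, which is what lets two
threads read two different files at the same time.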

I have some local CI jobs that run the fission and fission-dwp boards,
and I haven't seen regressions.  In addition to the regular testing, I
ran a few tests using those boards on a ThreadSanitizer build of GDB.

Change-Id: I625c98b0aa97b47d5ee59fe22a137ad0eafc8c25
Reviewed-By: Andrew Burgess <aburgess@redhat.com>
2025-05-23 11:14:20 -04:00
Simon Marchi
e95749bd0d gdb/dwarf: allocate DWP dwarf2_section_info with new
For the same reason as explained in the previous patch (allocations on
obstacks aren't thread-safe), change the allocation of
dwarf2_section_info objects for dwo files within dwp files to use "new".

The dwo_file::section object is not always owned by the dwo_file, so
introduce a new "dwo_file::section_holder" object that is only set when
the dwo_file owns the dwarf2_section_info.

Change-Id: I74c4608573c7a435bf3dadb83f96a805d21798a2
Approved-By: Tom Tromey <tom@tromey.com>
2025-05-23 11:12:53 -04:00
Simon Marchi
e82c588969 gdb/dwarf: allocate dwo_unit with new
The following patch reduces the duration where the dwo_lock mutex is
taken.  One operation that is not thread safe is the allocation of
dwo_units on the per_bfd obstack:

    dwo_unit *dwo_unit = OBSTACK_ZALLOC (&per_bfd->obstack, struct dwo_unit);

We could take the lock around this allocation, but I think it's just
easier to avoid the problem by having the dwo_unit objects allocated
with "new".

Change-Id: Ida04f905cb7941a8826e6078ed25dbcf57674090
Approved-By: Tom Tromey <tom@tromey.com>
2025-05-23 11:12:53 -04:00
Tom Tromey
e1ec485cfa Update comment for find_field_create_baton
Andrew pointed out that a recent commit neglected to update the
comment for find_field_create_baton.  This patch fixes the oversight.
2025-05-16 08:17:49 -06:00
Tom Tromey
9b02626409 Fix regression with dynamic array bounds
Kévin discovered that commit ba005d32b0 ("Handle dynamic field
properties") regressed a test in the internal AdaCore test suite.

The problem here is that, when writing that patch, I did not consider
the case where an array type's bounds might come from a member of a
structure -- but where the array is not defined in the structure's
scope.

In this scenario the field-resolution logic would trip this condition:

  /* Defensive programming in case we see unusual DWARF.  */
  if (fi == nullptr)
    return nullptr;

This patch reworks this area, partly backing out that commit, and
fixes the problem.

In the new code, I chose to simply duplicate the field's location
information.  This isn't totally ideal, in that it might result in
multiple copies of a baton.  However, this seemed nicer than tracking
the DIE/field correspondence for every field in every CU -- my
thinking here is that this particular dynamic scenario is relatively
rare overall.  Also, if the baton cost does prove onerous, we could
intern the batons somewhere.

Regression tested on x86-64 Fedora 41.  I also tested this using the
AdaCore internal test suite.

Tested-By: Simon Marchi <simon.marchi@efficios.com>
2025-05-15 06:51:19 -06:00
Andreas Schwab
a22a215fa8 gdb: rename ldirname to gdb_ldirname
It conflicts with the ldirname function that will be added in the next
libiberty sync.
2025-05-15 10:20:53 +02:00
Simon Marchi
57eea4cd0d gdb/dwarf: skip broken .debug_macro.dwo
Running gdb.base/errno.exp with gcc <= 13 with split DWARF results in:

    $ make check TESTS="gdb.base/errno.exp" RUNTESTFLAGS="CC_FOR_TARGET=gcc-13 --target_board=fission"
    (gdb) break -qualified main
    /home/smarchi/src/binutils-gdb/gdb/dwarf2/read.c:7549: internal-error: locate_dwo_sections: Assertion `!dw_sect->readin' failed.
    A problem internal to GDB has been detected,
    further debugging may prove unreliable.
    ...
    FAIL: gdb.base/errno.exp: macros: gdb_breakpoint: set breakpoint at main (GDB internal error)

The assert being hit has been added in 28f15782ad ("gdb/dwarf: read
multiple .debug_info.dwo sections"), but it merely exposed an existing
problem.

gcc versions <= 13 are affected by this bug:

  https://gcc.gnu.org/bugzilla/show_bug.cgi?id=111409

Basically, it produces .dwo files with multiple .debug_macro.dwo
sections, with some unresolved links between them.  I think that this
macro debug info is unusable, and all we can do is ignore it.

In locate_dwo_sections, if we detect a second .debug_macro.dwo section,
forget about the previous .debug_macro.dwo and any subsequent one.  This
will effectively make it as if the macro debug info wasn't there at all.
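
The loop logic described here can be sketched as follows.  This is a
hypothetical simplification working on a list of section names, not
the real locate_dwo_sections; the function name and the -1 return
convention are made up for the example.

```cpp
#include <string>
#include <vector>

// Return the index of the single .debug_macro.dwo section in NAMES, or
// -1 if there is none or if there is more than one (in which case the
// macro info is treated as unusable).
int
pick_macro_section_sketch (const std::vector<std::string> &names)
{
  int macro_index = -1;
  bool broken = false;

  for (int i = 0; i < (int) names.size (); ++i)
    {
      if (names[i] != ".debug_macro.dwo")
	continue;

      if (broken)
	{
	  // Already known to be broken: ignore any subsequent one.
	  continue;
	}

      if (macro_index != -1)
	{
	  // Second occurrence: forget the previous one as well.
	  macro_index = -1;
	  broken = true;
	}
      else
	macro_index = i;
    }

  return macro_index;
}
```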

The errno test seems happy with it:

    # of expected passes            84
    # of expected failures          8

Change-Id: I6489b4713954669bf69f6e91865063ddcd1ac2c8
Approved-By: Tom Tromey <tom@tromey.com>
2025-05-12 14:50:31 -04:00
Simon Marchi
8422833a4f gdb/dwarf: move loops into locate_dw{o,z}_sections
For a subsequent patch, it would be easier if the loop over sections
were inside locate_dwo_sections (I want to maintain some state for the
duration of the loop).  Move the for loop in there.  And because
locate_dwz_sections is very similar, modify that one too, to keep both
in sync.

Change-Id: I90b3d44184910cc2d86af265bb4b41828a5d2c2e
Approved-By: Tom Tromey <tom@tromey.com>
2025-05-12 14:05:21 -04:00
Tom Tromey
420d030e88 Handle field with dynamic bit offset
I discovered that GCC emitted incorrect DWARF for the test case
included in this patch.  Eric wrote a fix for GCC, but then he found
that gdb crashed on the resulting file.

This test has a field that is at a non-constant bit offset from the
start of the type.  DWARF 5 does not allow for this situation (I've
sent a report to the DWARF list), but DWARF 3 did allow for this via a
combination of an expression for the byte offset and then the use of
DW_AT_bit_offset.  This looks like:

 <5><117a>: Abbrev Number: 17 (DW_TAG_member)
    <117b>   DW_AT_name        : (indirect string, offset: 0x1959): another_field
...
    <1188>   DW_AT_bit_offset  : 6
    <1189>   DW_AT_data_member_location: 6 byte block: 99 3d 1 0 0 22 	(DW_OP_call4: <0x1193>; DW_OP_plus)
...
 <3><1193>: Abbrev Number: 2 (DW_TAG_dwarf_procedure)
    <1194>   DW_AT_location    : 15 byte block: 97 94 1 37 1a 32 1e 23 7 38 1b 31 1c 23 3 	(DW_OP_push_object_address; DW_OP_deref_size: 1; DW_OP_lit7; DW_OP_and; DW_OP_lit2; DW_OP_mul; DW_OP_plus_uconst: 7; DW_OP_lit8; DW_OP_div; DW_OP_lit1; DW_OP_minus; DW_OP_plus_uconst: 3)

Now, that combination is not fully general, in that the bit offset
must be a constant -- only the byte offset may really vary.  However,
I couldn't come up with a situation where full generality is needed,
mainly because GNAT won't seem to pack fields into the padding of a
variable-length array.

Meanwhile, the reason for the gdb crash is that the code handling
DW_AT_bit_offset assumes that the byte offset is a constant.  This
causes an assertion failure.

This patch arranges for DW_AT_bit_offset to be applied during field
resolution, when needed.
2025-05-06 09:01:55 -06:00
Tom Tromey
ee580641bc Introduce apply_bit_offset_to_field helper function
This patch makes a new function, apply_bit_offset_to_field, that is
used to handle the logic of DW_AT_bit_offset.  Currently there is just
a single caller, but the next patch will change this.
2025-05-06 09:01:55 -06:00
Tom Tromey
1d9fb3ba19 Use OBSTACK_ZALLOC when allocating batons
I found some places in dwarf2/read.c that allocate a location baton,
but fail to initialize one of the fields.  It seems safer to me to use
OBSTACK_ZALLOC here, so this patch makes this change.  This will be
useful in a subsequent patch as well, where a new field is added to
one of the batons.
2025-05-06 09:01:54 -06:00
Tom Tromey
b6acdd724d Clean up handle_member_location
This removes a redundant check from handle_member_location, and also
changes the complaint -- currently it will issue the "complex
location" complaint, but really what is happening here is an
unrecognized form.
2025-05-06 09:01:54 -06:00
Tom Tromey
ba005d32b0 Handle dynamic field properties
I found a situation where gdb could not properly decode an Ada type.
In this first scenario, the discriminant of a type is a bit-field.
PROP_ADDR_OFFSET does not handle this situation, because it only
allows an offset -- not a bit-size.

My original approach to this just added a bit size as well, but after
some discussion with Eric Botcazou, we found another failing case: a
tagged type can have a second discriminant that appears at a variable
offset.

So, this patch changes this code to accept a general 'struct field'
instead of trying to replicate the field-finding machinery by itself.

This is handled at property-evaluation time by simply using a 'field'
and resolving its dynamic properties.  Then the usual field-extraction
function is called to get the value.

Because the baton now just holds a field, I renamed PROP_ADDR_OFFSET
to PROP_FIELD.

The DWARF reader now defers filling in the property baton until the
fields have been attached to the type.

Finally, I noticed that if the discriminant field has a biased
representation, then unpack_field_as_long would not handle this
either.  This bug is also fixed here, and the test case checks this.

Regression tested on x86-64 Fedora 41.
2025-05-06 09:01:54 -06:00
Tom Tromey
f3d834df28 Fix sign of Ada rational constants
My earlier patch commit 0c03db90 ("Use correct sign in get_mpz") was
(very) incorrect.  It changed get_mpz to check for a strict sign when
examining part of an Ada rational constant.  However, in Ada the
"delta" for a fixed-point type must be positive, and so the components
of the rational representation will be positive.

This patch corrects the error.  It also renames the get_mpz function,
in case anyone is tempted to reuse this code for another purpose.

Finally, this pulls over the test from the internal AdaCore test suite
that found this issue.
2025-05-05 07:37:18 -06:00
Tom de Vries
6ec31a457e [gdb/symtab] Throw DWARF error on out-of-bounds DW_FORM_strx
With the test-case contained in the patch, and gdb built with
-fsanitize=address we get:
...
==23678==ERROR: AddressSanitizer: heap-buffer-overflow ...^M
READ of size 1 at 0x6020000c30dc thread T3^[[1m^[[0m^M
ptype global_var^M
    #0 0x2c6a40b in bfd_getl32 bfd/libbfd.c:846^M
    #1 0x168f96c in read_str_index gdb/dwarf2/read.c:15349^M
...

The executable contains an out-of-bounds DW_FORM_strx attribute:
...
$ readelf -wi $exec
<2eb>   DW_AT_name        :readelf: Warning: string index of 1 converts to \
  an offset of 0xc which is too big for section .debug_str
 (indexed string: 0x1): <string index too big>
...
and read_str_index doesn't check for this:
...
  info_ptr = (str_offsets_section->buffer
	      + str_offsets_base
	      + str_index * offset_size);
   if (offset_size == 4)
     str_offset = bfd_get_32 (abfd, info_ptr);
...
and consequently reads out-of-bounds.

Fix this in read_str_index by checking for the out-of-bounds condition and
throwing a DWARF error:
...
(gdb) ptype global_var
DWARF Error: Offset from DW_FORM_GNU_str_index or DW_FORM_strx pointing \
  outside of .debug_str_offsets section in CU at offset 0x2d7 \
  [in module dw-form-strx-out-of-bounds]
No symbol "global_var" in current context.
(gdb)
...
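
A bounds check of this kind can be sketched as below.  This is an
illustration only: it assumes a 4-byte little-endian offset size, works
on a plain byte vector, and throws std::out_of_range where gdb would
report a DWARF error.  The function name is hypothetical.

```cpp
#include <cstddef>
#include <cstdint>
#include <stdexcept>
#include <vector>

// Read entry STR_INDEX of a table of 4-byte little-endian offsets that
// starts at byte BASE of SECTION, refusing to read out of bounds.
uint32_t
read_str_offset_sketch (const std::vector<uint8_t> &section,
			size_t base, uint64_t str_index)
{
  const size_t offset_size = 4;

  if (base > section.size ())
    throw std::out_of_range ("base outside of section");

  // Compare against the remaining space by division, so that
  // STR_INDEX * OFFSET_SIZE cannot overflow before the check.
  size_t avail = section.size () - base;
  if (avail < offset_size
      || str_index > (avail - offset_size) / offset_size)
    throw std::out_of_range ("string index outside of section");

  const uint8_t *p = section.data () + base + str_index * offset_size;
  return ((uint32_t) p[0]
	  | ((uint32_t) p[1] << 8)
	  | ((uint32_t) p[2] << 16)
	  | ((uint32_t) p[3] << 24));
}
```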

Tested on x86_64-linux.

Approved-By: Tom Tromey <tom@tromey.com>
2025-05-02 22:21:36 +02:00
Tom de Vries
9c1f84c9b4 [gdbsupport] Reimplement phex and phex_nz as templates
Gdbsupport functions phex and phex_nz have a parameter sizeof_l:
...
extern const char *phex (ULONGEST l, int sizeof_l);
extern const char *phex_nz (ULONGEST l, int sizeof_l);
...
and a lot of calls use:
...
  phex (l, sizeof (l))
...

Make this easier by reimplementing the functions as a template, allowing us to
simply write:
...
  phex (l)
...
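
As a rough sketch of the idea, the width can be deduced from the
argument type.  This is not the gdbsupport implementation: the name
phex_sketch is hypothetical, and it returns a std::string instead of a
const char * for simplicity.
...
```cpp
#include <cstdint>
#include <cstdio>
#include <string>
#include <type_traits>

// Format VALUE as hex, zero-padded to the full width of its type.
template<typename T>
std::string
phex_sketch (T value)
{
  static_assert (std::is_integral<T>::value, "integral type expected");

  // Convert through the unsigned variant of T so that negative values
  // are not sign-extended beyond the width of T.
  using U = typename std::make_unsigned<T>::type;

  char buf[2 * sizeof (T) + 1];
  std::snprintf (buf, sizeof (buf), "%0*llx", (int) (2 * sizeof (T)),
		 (unsigned long long) (U) value);
  return buf;
}
```
...
With this shape, the caller no longer passes sizeof (l) explicitly, so
the width cannot get out of sync with the argument's type.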

Simplify existing code using:
...
$ find gdb* -type f \
    | xargs sed -i 's/phex (\([^,]*\), sizeof (\1))/phex (\1)/'
$ find gdb* -type f \
    | xargs sed -i 's/phex_nz (\([^,]*\), sizeof (\1))/phex_nz (\1)/'
...
and manually review:
...
$ find gdb* -type f | xargs grep "phex (.*, sizeof.*)"
$ find gdb* -type f | xargs grep "phex_nz (.*, sizeof.*)"
...

Tested on x86_64-linux.

Approved-By: Tom Tromey <tom@tromey.com>
2025-05-02 22:10:53 +02:00
Simon Marchi
2c00fd5c6c gdb/dwarf: change a bunch of functions to be methods of cooked_index_worker_debug_info
Move a few functions exclusively used to process units to become methods
of cooked_index_worker_debug_info.  Rename them to a more consistent
name scheme, which gets rid of outdated naming.  The comments were also
quite outdated.

Change-Id: I2e7dcc2e4ff372007dcb4f6c3d34187c9cc2da05
Approved-By: Tom Tromey <tom@tromey.com>
2025-04-29 15:58:03 -04:00
Simon Marchi
0bd12b5c06 gdb/dwarf: move cooked_index_worker_debug_info up
The next patch moves some functions to be methods of
cooked_index_worker_debug_info.  Move cooked_index_worker_debug_info
above those functions, to make that easier (methods can't be defined
before the class declaration).

Change-Id: I7723cb42efadb2cc86f2227b3c2fb275e2d620f9
Approved-By: Tom Tromey <tom@tromey.com>
2025-04-29 15:58:03 -04:00
Simon Marchi
dbfd92856a gdb/dwarf: clean up some cutu_reader::is_dummy() calls
This patch tries to standardize the places where we check if units are
dummy.  When checking if a unit is dummy, it is not necessary to check
for some other conditions.

 - cutu_reader::is_dummy() is a superset of cutu_reader::cu() returning
   nullptr, so it's not necessary to check if the cu method returns
   nullptr if also checking if the unit is dummy.
 - cutu_reader::is_dummy() is a superset of cutu_reader::top_level_die()
   returning nullptr, so same deal.

Remove some spots that check for these conditions in addition to
cutu_reader::is_dummy().

In addition, also remove the checks for:

    !new_reader->top_level_die ()->has_children

in cooked_indexer::ensure_cu_exists.  IMO, it is not useful to special
case the units having a single DIE.  Especially in this function, which
deals with importing things from another unit, a unit with a single DIE
would be an edge case that should not happen with good debug info.  I
think it's preferable to have simpler code.

Change-Id: I4529d7b3a0bd2891a60f41671de8cfd3114adb4a
Approved-By: Tom Tromey <tom@tromey.com>
2025-04-29 15:54:17 -04:00