binutils-gdb/gdb/dwarf2/section.h
Simon Marchi 6474c699a5 gdb/dwarf: sort dwarf2_per_bfd::all_units by (section, offset)
This patch started as a fix for PR 29518 ("GDB doesn't handle
DW_FORM_ref_addr DIE references correctly with .debug_types sections")
[1], but the scope has expanded a bit to fix the problem more generally,
after I spotted a few issues related to the order of all_units.  The
first version of this patch is here [2].

PR 29518 shows that dwarf2_find_containing_comp_unit can erroneously
find a type unit.  The obvious problem is that the
dwarf2_find_containing_comp_unit function searches the whole all_units
vector (containing both comp and type units), when really it should just
search the compilation units.  A simple solution would be to make it
search the all_comp_units view (which is removed in a patch earlier in
this series).

I then realized that in DWARF 5, since type units are in .debug_info
(versus .debug_types in DWARF 4), type units can be interleaved with
comp units in the all_units vector.  That would make the all_comp_units
and all_type_units views erroneous, and dwarf2_find_containing_comp_unit
could still return something wrong.  In v1, I added a sort in
finalize_all_units to make sure all_units is in the order that
dwarf2_find_containing_comp_unit expects:

 - comp units from the main file
 - type units from the main file
 - comp units from the dwz file
 - type units from the dwz file (not actually supported, see PR 30838)

Another problem I spotted is that the .gdb_index reader creates units in
this order:

 - comp units from .gdb_index from main file
 - comp units from .gdb_index from dwz file
 - type units from .gdb_index from main file

This isn't the same order as above, so it would need the same sort step.

Finally, I'm not exactly sure if and when it happens, but it looks like
lookup_signatured_type can be called at a later time (after the initial
scan and the creation of the dwarf2_per_cu objects), when expanding a
symtab.  And that could lead to the creation of a new type unit (see
function add_type_unit), which would place the new type unit at the end
of the all_units vector, possibly screwing up the previous order.

To handle all this in a nice and generic way, Tom Tromey proposed to
change the all_units order, so that units are sorted by section, then
section offset.  This is what this patch implements.  The sorting is
done in finalize_all_units.

This works well, because when looking up a unit by section offset, the
caller knows which section the unit is in.  Passing down a (section,
section offset) tuple makes it clear and unambiguous what unit the
caller is referring to.  It should help eliminate some bugs where the
callee used the section offset in the wrong section.  Passing down the
section along with the section offset replaces the "is_dwz" flag passed
to dwarf2_find_containing_comp_unit and a bunch of other functions in a
more general way.

dwarf2_find_containing_comp_unit can now legitimately find and return
type units, even though that should not be needed (type units are
typically referred to by signature).  But I don't think there is harm in
this function being more generic than needed.  I therefore renamed it to
dwarf2_find_containing_unit.
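
The lookup described above can be sketched as follows.  This is an
illustration only, not the code from this patch: sec_info, unit_info,
sec_and_off and find_containing_unit are hypothetical stand-ins, and the
sketch assumes each unit records its length and that the vector is
already sorted by (section, offset).

```cpp
#include <algorithm>
#include <cstdint>
#include <functional>
#include <iterator>
#include <vector>

/* Hypothetical stand-in for a section descriptor.  */
struct sec_info { const char *name; };

/* Hypothetical stand-in for a unit: where it starts and how long it is.  */
struct unit_info
{
  const sec_info *section;
  std::uint64_t offset;   /* Start offset within SECTION.  */
  std::uint64_t length;   /* Total size of the unit, in bytes.  */
};

/* A (section, offset) pair naming a location unambiguously.  */
struct sec_and_off
{
  const sec_info *section;
  std::uint64_t offset;
};

/* Return true if TARGET sorts strictly before the start of UNIT.  */
inline bool
target_before_unit (const sec_and_off &target, const unit_info &unit)
{
  if (target.section != unit.section)
    return std::less<const sec_info *> () (target.section, unit.section);
  return target.offset < unit.offset;
}

/* Find the unit containing TARGET, or nullptr if there is none.
   UNITS must be sorted by (section, offset).  */
inline const unit_info *
find_containing_unit (const std::vector<unit_info> &units, sec_and_off target)
{
  /* First unit that starts after TARGET; the candidate is the one
     just before it.  */
  auto it = std::upper_bound (units.begin (), units.end (), target,
			      target_before_unit);
  if (it == units.begin ())
    return nullptr;

  const unit_info &candidate = *std::prev (it);
  if (candidate.section != target.section
      || target.offset >= candidate.offset + candidate.length)
    return nullptr;

  return &candidate;
}
```

Because the section is part of the search key, an offset that is valid
in one section can never match a unit that lives in another one.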

The sort criterion for "section" can be anything, as long as we use the
same for sorting and searching.  In this patch, I use the pointer to
dwarf2_section_info, because it's easy.  The downside is that the actual
order depends on what the memory allocator decided to return, so could
change from run to run, or machine to machine.  Later, I might change it
so that sections are ordered based on their properties, making the order
stable across the board.  This logic is encapsulated in the
all_units_less_than function, so it's easy to change.
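
A minimal sketch of such a comparison, assuming simplified types:
section_sketch, unit_sketch, units_less_than and sort_units are
hypothetical names, not the ones used in GDB.  It orders by section
pointer first and by offset second, using std::less so that comparing
unrelated pointers is well defined.

```cpp
#include <algorithm>
#include <cstdint>
#include <functional>
#include <vector>

/* Hypothetical stand-in for a section descriptor.  */
struct section_sketch { const char *name; };

/* Hypothetical stand-in for a unit.  */
struct unit_sketch
{
  const section_sketch *section;
  std::uint64_t offset;   /* Start offset within SECTION.  */
};

/* Order units by section identity, then by offset in the section.
   std::less yields a total order over pointers, even when they do not
   point into the same array.  */
inline bool
units_less_than (const unit_sketch &a, const unit_sketch &b)
{
  if (a.section != b.section)
    return std::less<const section_sketch *> () (a.section, b.section);
  return a.offset < b.offset;
}

/* Sort UNITS in place; the same predicate must be used when searching.  */
inline void
sort_units (std::vector<unit_sketch> &units)
{
  std::sort (units.begin (), units.end (), units_less_than);
}
```

Swapping the pointer comparison for one based on section properties
would only touch units_less_than in this sketch, which is the point of
keeping the criterion in a single function.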

The .debug_names reader can no longer rely on the order of the all_units
vector for its checks, since all_units won't be the same order as found
in the .debug_names lists.  In fact, even before, it wasn't: this check
assumed that .debug_info had all CUs before TUs, and that the index
listed them in the exact same order.  When I build a file with gcc and
"-gdwarf-5 -fdebug-types-section", type units appear first in
.debug_info.  This caused GDB to reject a .debug_names index that it had
produced:

    $ GDB="./gdb -nx -q --data-directory=data-directory" /home/smarchi/src/binutils-gdb/gdb/contrib/gdb-add-index.sh -dwarf-5 hello.so
    $ ./gdb -nx -q --data-directory=data-directory hello.so
    Reading symbols from hello.so...

    ⚠️  warning: Section .debug_names has incorrect entry in CU table, ignoring .debug_names.

To make it work, add a new dwarf2_find_unit function that allows looking
up a unit by start address (unlike dwarf2_find_containing_unit, which
can find by any containing address), and make the .debug_names reader
use it.  It might make the load time of .debug_names a bit longer (the
build and check step is now going to be O(n*log(n)) instead of O(n)
where n is the number of units, or something like that), but I think
it's important to be correct here.
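
To illustrate the cost argument, here is a rough sketch of an exact
lookup by start offset and of an index check built on it.  The names
(sec_desc, unit_desc, find_unit_at, index_offsets_valid) are
hypothetical and the types are simplified; this is not the
dwarf2_find_unit implementation.  Each lookup is a binary search, so
checking n listed offsets costs O(n*log(n)).

```cpp
#include <algorithm>
#include <cstdint>
#include <functional>
#include <vector>

/* Hypothetical stand-in for a section descriptor.  */
struct sec_desc { const char *name; };

/* Hypothetical stand-in for a unit.  */
struct unit_desc
{
  const sec_desc *section;
  std::uint64_t offset;   /* Start offset within SECTION.  */
};

/* Return the unit starting exactly at OFFSET in SECTION, or nullptr.
   UNITS must be sorted by (section, offset).  */
inline const unit_desc *
find_unit_at (const std::vector<unit_desc> &units, const sec_desc *section,
	      std::uint64_t offset)
{
  auto less_than = [] (const unit_desc &a, const unit_desc &b)
    {
      if (a.section != b.section)
	return std::less<const sec_desc *> () (a.section, b.section);
      return a.offset < b.offset;
    };

  unit_desc key {section, offset};
  auto it = std::lower_bound (units.begin (), units.end (), key, less_than);
  if (it == units.end () || it->section != section || it->offset != offset)
    return nullptr;

  return &*it;
}

/* Return true if every offset in LISTED is the start of a known unit.
   The order of LISTED does not matter.  */
inline bool
index_offsets_valid (const std::vector<unit_desc> &units,
		     const sec_desc *section,
		     const std::vector<std::uint64_t> &listed)
{
  for (std::uint64_t offset : listed)
    if (find_unit_at (units, section, offset) == nullptr)
      return false;

  return true;
}
```

A check of this shape accepts an index that lists units in a different
order than the section does, and still rejects an offset that does not
start a unit.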

This patch adds a test
(gdb.dwarf2/dw-form-ref-addr-with-type-units.exp), which tries to
replicate the problem as shown by PR 29518.

gdb.base/varval.exp needs a small change, because an error message
changes (for the better, I think).

gdb.dwarf2/debug-names-non-ascending-cu.exp now fails, because GDB no
longer rejects a .debug_names index which lists CUs in a different order
than .debug_info.  Given the change I did to the .debug_names reader,
explained above, I don't think this is a problem anymore (GDB can accept
an index like that).  I also don't think that DWARF 5 mandates that CUs
are in ascending order.  Delete this test.

[1] https://sourceware.org/bugzilla/show_bug.cgi?id=29518
[2] https://inbox.sourceware.org/gdb-patches/20250218193443.118139-1-simon.marchi@efficios.com/

Change-Id: I45f982d824d3842ac1eb73f8cce721a0a24b5faa
Approved-By: Tom Tromey <tom@tromey.com>
2025-08-01 00:25:54 -04:00

/* DWARF 2 low-level section code

   Copyright (C) 1994-2025 Free Software Foundation, Inc.

   Adapted by Gary Funck (gary@intrepid.com), Intrepid Technology,
   Inc. with support from Florida State University (under contract
   with the Ada Joint Program Office), and Silicon Graphics, Inc.
   Initial contribution by Brent Benson, Harris Computer Systems, Inc.,
   based on Fred Fish's (Cygnus Support) implementation of DWARF 1
   support.

   This file is part of GDB.

   This program is free software; you can redistribute it and/or modify
   it under the terms of the GNU General Public License as published by
   the Free Software Foundation; either version 3 of the License, or
   (at your option) any later version.

   This program is distributed in the hope that it will be useful,
   but WITHOUT ANY WARRANTY; without even the implied warranty of
   MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the
   GNU General Public License for more details.

   You should have received a copy of the GNU General Public License
   along with this program. If not, see <http://www.gnu.org/licenses/>. */

#ifndef GDB_DWARF2_SECTION_H
#define GDB_DWARF2_SECTION_H

#include "dwarf2/types.h"

/* A descriptor for dwarf sections.

   S.SECTION, SIZE are typically initialized when the objfile is first
   scanned. BUFFER, READIN are filled in later when the section is read.
   If the section contained compressed data then SIZE is updated to record
   the uncompressed size of the section.

   DWP file format V2 introduces a wrinkle that is easiest to handle by
   creating the concept of virtual sections contained within a real section.
   In DWP V2 the sections of the input DWO files are concatenated together
   into one section, but section offsets are kept relative to the original
   input section.

   If this is a virtual dwp-v2 section, S.CONTAINING_SECTION is a backlink to
   the real section this "virtual" section is contained in, and BUFFER,SIZE
   describe the virtual section. */
struct dwarf2_section_info
{
  /* Return the name of this section. */
  const char *get_name () const;

  /* Return the containing section of this section, which must be a
     virtual section. */
  struct dwarf2_section_info *get_containing_section () const;

  /* Return the bfd owner of this section. */
  struct bfd *get_bfd_owner () const;

  /* Return the bfd section of this section.
     Returns NULL if the section is not present. */
  asection *get_bfd_section () const;

  /* Return the name of the file this section is in. */
  const char *get_file_name () const;

  /* Return the id of this section.
     Returns 0 if this section doesn't exist. */
  int get_id () const;

  /* Return the flags of this section. This section (or containing
     section if this is a virtual section) must exist. */
  int get_flags () const;

  /* Return true if this section does not exist or if it has no
     contents. */
  bool empty () const;

  /* Read the contents of this section.
     OBJFILE is the main object file, but not necessarily the file where
     the section comes from. E.g., for DWO files the bfd of INFO is the bfd
     of the DWO file.
     If the section is compressed, uncompress it before returning. */
  void read (struct objfile *objfile);

  /* Issue a complaint that something was outside the bounds of this
     buffer. */
  void overflow_complaint () const;

  /* Return pointer to string in this section at offset STR_OFFSET
     with error reporting string FORM_NAME. */
  const char *read_string (struct objfile *objfile, LONGEST str_offset,
                           const char *form_name);

  union
  {
    /* If this is a real section, the bfd section. */
    asection *section;
    /* If this is a virtual section, pointer to the containing ("real")
       section. */
    struct dwarf2_section_info *containing_section;
  } s;

  /* Pointer to section data, only valid if readin. */
  const gdb_byte *buffer;

  /* The size of the section, real or virtual. */
  bfd_size_type size;

  /* If this is a virtual section, the offset in the real section.
     Only valid if is_virtual. */
  bfd_size_type virtual_offset;

  /* True if we have tried to read this section. */
  bool readin;

  /* True if this is a virtual section, False otherwise.
     This specifies which of s.section and s.containing_section to use. */
  bool is_virtual;
};

using dwarf2_section_info_up = std::unique_ptr<dwarf2_section_info>;

/* A pair-like structure to represent an offset into a section. */

struct section_and_offset
{
  const dwarf2_section_info *section;
  sect_offset offset;
};

#endif /* GDB_DWARF2_SECTION_H */