Commit Graph

117 Commits

Author SHA1 Message Date
Simon Marchi
cb208105eb gdb/dwarf: sort units when writing index
The order of all_units can't be relied on when writing the CU and TU
lists to .gdb_index or .debug_names.

Both the .gdb_index and .debug_names writers expect that all_units
contains comp units followed by type units.  As of this commit, when
reading a DWARF 5 .debug_info, the all_units vector is ordered based on
the order the units appear in .debug_info, where type units can be
interleaved with comp units.

It probably worked fine with DWARF 4, where type units were in a section
of their own (.debug_types).  They were read after comp units, and
therefore ended up after them in the all_units vector.

Change the writers to use a common function that splits the units in two
lists (comp units and type units).  Sort both lists by section offset.
This is more than required, but it should help produce a stable and
predictable output.
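
As a rough illustration of the split-and-sort step described above, here is a
minimal sketch.  The `unit_sketch` type and the `split_and_sort_units` name are
hypothetical stand-ins, not GDB's actual types or functions, and each unit is
reduced to a kind flag and a section offset.

```cpp
#include <algorithm>
#include <cstdint>
#include <utility>
#include <vector>

// Hypothetical stand-in for a DWARF unit; not GDB's real per-CU structure.
struct unit_sketch
{
  bool is_type_unit;       // true for a type unit, false for a comp unit
  std::uint64_t sect_off;  // offset of the unit within its section
};

// Split UNITS into (comp units, type units), each sorted by section offset.
std::pair<std::vector<unit_sketch>, std::vector<unit_sketch>>
split_and_sort_units (const std::vector<unit_sketch> &units)
{
  std::vector<unit_sketch> comp_units, type_units;

  // Units may be interleaved in the input, so classify each one.
  for (const unit_sketch &u : units)
    (u.is_type_unit ? type_units : comp_units).push_back (u);

  auto by_offset = [] (const unit_sketch &a, const unit_sketch &b)
    { return a.sect_off < b.sect_off; };

  std::sort (comp_units.begin (), comp_units.end (), by_offset);
  std::sort (type_units.begin (), type_units.end (), by_offset);

  return { std::move (comp_units), std::move (type_units) };
}
```

With both writers going through one helper of this shape, neither depends on
the order in which units were discovered.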

Change-Id: I5a22e2e354145e3d6b5b2822dc2a3af2f9d6bb76
Approved-By: Tom Tromey <tom@tromey.com>
2025-08-01 00:25:54 -04:00
Simon Marchi
3e27b49025 gdb/dwarf: remove all_{comp,type}_units views
In DWARF 5, type units appear in the .debug_info section, interleaved
with comp units, and the order in all_units reflects that.  The
all_comp_units and all_type_units views are wrong in that case
(all_comp_units contains some type units, and vice-versa).

It would be possible to manually sort all_units to ensure that type
units follow comp units, but this series takes the approach of sorting
the units by section and section offset.

Remove those views, and replace their uses with num_comp_units and
num_type_units.  It appears that the views were only used to know the
number of each kind.

The finalize_all_units function is now empty, but I am keeping it
because a subsequent patch adds a call to std::sort in there to sort the
all_units vector.

Change-Id: I42a65b6f1b6192957b55cea0e2eaff097e13a33b
Approved-By: Tom Tromey <tom@tromey.com>
2025-08-01 00:25:15 -04:00
Tom Tromey
5fe70629ce Change file initialization to use INIT_GDB_FILE macro
This patch introduces a new macro, INIT_GDB_FILE.  This is used to
replace the current "_initialize_" idiom when introducing a per-file
initialization function.  That is, rather than write:

    void _initialize_something ();
    void
    _initialize_something ()
    {
       ...
    }

... now you would write:

    INIT_GDB_FILE (something)
    {
       ...
    }

The macro handles both the declaration and definition of the function.
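
For readers unfamiliar with this kind of macro, one way it could be written is
sketched below.  The macro name `INIT_FILE_SKETCH` and the
`something_initialized` variable are hypothetical, and the real definition of
INIT_GDB_FILE may differ.

```cpp
// Hypothetical macro, deliberately named differently from GDB's.
// It expands to a declaration followed by the head of the definition, so
// the braces written after the invocation become the function body.
#define INIT_FILE_SKETCH(NAME) \
  extern void _initialize_##NAME (); \
  void _initialize_##NAME ()

// Illustrative state that the initializer sets.
static int something_initialized = 0;

INIT_FILE_SKETCH (something)
{
  something_initialized = 1;
}
```

Because the declaration and definition come from a single token sequence, a
script scanning for initializers only has to recognize one spelling.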

The point of this approach is that it makes it harder to accidentally
cause an initializer to be omitted; see commit 2711e475 ("Ensure
cooked_index_entry self-tests are run").  Specifically, the regexp now
used by make-init-c seems harder to trick.

New in v2: un-did some erroneous changes made by the script.

The bulk of this patch was written by script.
Regression tested on x86-64 Fedora 41.
2025-06-26 06:15:59 -06:00
Tom Tromey
d01e823438 Update copyright dates to include 2025
This updates the copyright headers to include 2025.  I did this by
running gdb/copyright.py and then manually modifying a few files as
noted by the script.

Approved-By: Eli Zaretskii <eliz@gnu.org>
2025-04-08 10:54:39 -06:00
Simon Marchi
6fca4d9694 gdbsupport: add some -Wunused-* warning flags
Add a few -Wunused-* diagnostic flags that look useful.  Some are known
to gcc, some to clang, some to both.  Fix the fallouts.

-Wunused-const-variable=1 is understood by gcc, but not clang.
-Wunused-const-variable would be understood by both, but for gcc at
least it would flag the unused const variables in headers.  This doesn't
make sense to me, because as soon as one source file includes a header
but doesn't use a const variable defined in that header, it's an error.
With `=1`, gcc only warns about unused const variable in the main source
file.  It's not a big deal that clang doesn't understand it though: any
instance of that problem will be flagged by any gcc build.

Change-Id: Ie20d99524b3054693f1ac5b53115bb46c89a5156
Approved-By: Tom Tromey <tom@tromey.com>
2025-03-17 16:14:08 -04:00
Simon Marchi
2920415622 gdb/dwarf: use ranged for loop in some spots
I noticed that these loops could be written to avoid the iteration
variable `i`.
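
A generic illustration of this kind of rewrite follows.  The functions are
made up for the example and are not taken from the patch.

```cpp
#include <cstddef>
#include <vector>

// Index-based form: `i` exists only to subscript the vector.
int sum_indexed (const std::vector<int> &values)
{
  int total = 0;
  for (std::size_t i = 0; i < values.size (); ++i)
    total += values[i];
  return total;
}

// Range-based form: same traversal, no iteration variable.
int sum_ranged (const std::vector<int> &values)
{
  int total = 0;
  for (int v : values)
    total += v;
  return total;
}
```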

Change-Id: I8b58eb9913b6ac8505ee45eb8009ef7027236cb9
2025-03-12 22:27:12 -04:00
Simon Marchi
c19c928f7b Revert past commits
I accidentally pushed my work-in-progress branch... revert that.  Sorry
for the noise :(.

The list of reverted commits is:

    ae2a50a9ae attempt to revamp to the CU/TU list
    e9386435c9 gdb/dwarf: print DWARF CUs/TUs in "maint print objfiles"
    6cbd64aa3e gdb/dwarf: add dwarf_source_language_name
    32a187da76 libiberty: move DW_LANG_* definitions to dwarf2.def
    b3fa38aef5 gdb/dwarf: move index unit vectors to debug names reader and use them
    30ba744189 gdb/dwarf: track comp and type units count
    bedb4e09f2 gdb/dwarf: remove unnecessary braces
    b4f18de12c gdb/dwarf: use ranged for loop in some pots

Change-Id: I80aed2847025f5b15c16c997680783b39858a703
2025-03-10 16:15:42 -04:00
Simon Marchi
ae2a50a9ae attempt to revamp to the CU/TU list
Change-Id: I1c8214413583d540c10c9a2322ef2a21f8bb54e7
2025-03-10 16:09:02 -04:00
Simon Marchi
b4f18de12c gdb/dwarf: use ranged for loop in some pots
I noticed that these loops could be written to avoid the iteration
variable `i`.

Change-Id: Ia3717acbbf732f0337870d35ac60fe6400383324
2025-03-10 16:09:02 -04:00
Tom Tromey
c05c9914b1 Use flags enum for cooked_index_entry::full_name
I found a small bug coming from a couple of recent patches of mine for
cooked_index_entry::full_name.

First, commit aab26529b3 (Add "Ada linkage" mode to
cooked_index_entry::full_name) added a small hack to optionally
compute the Ada linkage name.

Then, commit aab2ac34d7 (Avoid excessive CU expansion on failed
matches) changed the relevant expand_symtabs_matching implementation
to use this feature.

However, the feature was used unconditionally, causing a bad side
effect: the non-canonical name is now used for all languages, not just
Ada.  But, for C++ this is wrong.

Furthermore, consider the declaration of full_name:

   const char *full_name (struct obstack *storage,
                          bool for_main = false,
                          bool for_ada_linkage = false,
                          const char *default_sep = nullptr) const;

... and then consider this call in cooked_index::dump:

       gdb_printf ("    qualified:  %s\n",
                   entry->full_name (&temp_storage, false, "::"));

Oops!  The "::" is silently converted to 'true' here.

To fix both of these problems, this patch changes full_name to accept
a flags enum rather than booleans.  This avoids the type-safety
problem.

Then, full_name is changed to remove the "Ada" flag when the entry is
not in fact an Ada symbol.
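
The pitfall and the fix can be reproduced in a small standalone sketch.  All
names here (`request`, `old_style`, `new_style`, `name_flag`) are hypothetical
and only mirror the shape of the problem; GDB's actual enum and signature may
differ.

```cpp
#include <string>

// What the callee ends up seeing, for inspection.
struct request
{
  bool for_main;
  bool for_ada_linkage;
  std::string sep;
};

// Old style: consecutive bool parameters followed by a string.
request old_style (bool for_main = false, bool for_ada_linkage = false,
                   const char *default_sep = nullptr)
{
  return { for_main, for_ada_linkage,
           default_sep != nullptr ? default_sep : "" };
}

// New style: a scoped enum used as a flag set.  Neither a string literal
// nor a bool converts implicitly to name_flag.
enum class name_flag : unsigned
{
  none = 0,
  for_main = 1u << 0,
  for_ada_linkage = 1u << 1,
};

constexpr name_flag operator| (name_flag a, name_flag b)
{
  return static_cast<name_flag> (static_cast<unsigned> (a)
                                 | static_cast<unsigned> (b));
}

constexpr bool has_flag (name_flag set, name_flag f)
{
  return (static_cast<unsigned> (set) & static_cast<unsigned> (f)) != 0;
}

// ENTRY_IS_ADA models the language check: the Ada flag is ignored when
// the entry is not an Ada symbol.
request new_style (bool entry_is_ada, name_flag flags = name_flag::none,
                   const char *default_sep = nullptr)
{
  return { has_flag (flags, name_flag::for_main),
           entry_is_ada && has_flag (flags, name_flag::for_ada_linkage),
           default_sep != nullptr ? default_sep : "" };
}
```

In this sketch, `old_style (false, "::")` compiles and sets `for_ada_linkage`
while leaving the separator empty, whereas `new_style (false, "::")` is
rejected by the compiler.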

Regression tested on x86-64 Fedora 40.

Approved-By: Simon Marchi <simon.marchi@efficios.com>
2025-03-10 13:40:25 -06:00
Tom Tromey
4a4a50517b Add "synthetic" marker for index entries
Currently, gdb will synthesize DW_TAG_module entries for Ada names.
These entries are treated specially by the index writer.

When GNAT starts emitting DW_TAG_module, the special case will be
incorrect, because there will be non-synthetic DW_TAG_module entries
in the index.

This patch arranges to mark the synthetic entries and changes the
index writer to follow.
2025-03-06 14:17:17 -07:00
Tom Tromey
e382ede5ea Use DW_TAG_module for Ada
In GCC we decided to use DW_TAG_module to represent Ada packages, so
make this same decision in gdb.  This also updates tag_matches_domain
to handle this case.
2025-03-06 14:17:17 -07:00
Simon Marchi
66195fe109 gdb/dwarf: make dwarf2_get_dwz_file a method of dwarf2_per_bfd
dwarf2_get_dwz_file looks more or less like a simple getter of
dwarf2_per_bfd::dwz_file, so make it into a method.

I typically avoid the `get_` prefix for getters, but that would conflict
with the field name here.

Change-Id: Idd0d5b1bd3813babf438b20aac514b19c77cfc18
Approved-By: Tom Tromey <tom@tromey.com>
2025-03-04 10:41:54 -05:00
Tom Tromey
268c8bda25 Add language to type unit in debug-names-tu.exp.tcl
I think debug-names-tu.exp.tcl only passes by accident -- the type
unit does not have a language, which gdb essentially requires.

This isn't noticeable right now because the type unit in question is
expanded in one phase and then the symbol found in another.  However,
I'm working on a series that would regress this.

This patch partially fixes the problem by correcting the test case,
adding the language to the TU.

However, it then goes a bit further and arranges for this information
not to be written to .debug_names.  Whether or not a type should be
considered "static" seems like something that is purely internal to
gdb, so this patch has the entry-creation function apply the
appropriate transform.

It also may make sense to change the "debug_names" proc in the test
suite to process attributes more like the ordinary "cu" proc does.
2025-03-03 14:16:44 -07:00
Simon Marchi
b55c841c51 gdb/dwarf: rename dwarf2_per_cu_data -> dwarf2_per_cu
This scratches an itch I had for a while.  I don't know why this struct
type has "data" in its name.  Others like "dwarf2_per_objfile" and
"dwarf2_per_bfd" don't.  The primary job of a structure is to hold data,
there's no need to specify it.  It also makes the name a bit shorter,
which is always nice.

Rename related types too.

Change-Id: Ifb63195ff105809fc15b502f639c0bb4d18a675e
Approved-By: Tom Tromey <tom@tromey.com>
Reviewed-By: Guinevere Larsen <guinevere@redhat.com>
2025-03-03 15:57:03 -05:00
Tom Tromey
dfd769b375 Fix formatting in dwarf2/index-write.c
I noticed a spot in dwarf2/index-write.c that was mis-formatted.  This
fixes it.
2025-02-23 10:10:26 -07:00
Simon Marchi
890d1d385f gdb/dwarf: std::unordered_{set,map} -> gdb::unordered_{set,map} throughout
No behavior changes expected.

Change-Id: I16ff6c67058362c65cc8edb05d1948e48be6b2e1
Approved-By: Tom Tromey <tom@tromey.com>
2025-02-19 11:14:41 -05:00
Simon Marchi
de33cf88da gdb/dwarf: allow for cooked_index_shard::m_addrmap to be nullptr
The following patch makes the .debug_names reader create multiple cooked
index shards, only one of them having an address map.  The others will
have a nullptr address map.

Change the code using cooked_index_shard::m_addrmap to account for the
fact that it can be nullptr.

Change-Id: Id05b974e661d901dd43bb5ecb3a8fcfc15abc7ed
Approved-By: Tom Tromey <tom@tromey.com>
2025-02-10 11:28:56 -05:00
Simon Marchi
ad6dde5aaa gdb/dwarf: write offset to parent entry for DW_IDX_parent
New in v2:

 - add doc
 - fix computation of offset in entry pool

Due to a mistake in the DWARF 5 spec, the way that GDB interprets
DW_IDX_parent when generating and reading .debug_names is not correct.

In Section 6.1.1.2, the parent index entry attribute is described as:

  Parent debugging information entry, a reference to the index entry for
  the parent. This is represented as the offset of the entry relative to
  the start of the entry pool.

But in Table 6.1, DW_IDX_parent is described as:

  Index of name table entry for parent

These two contradict each other.  The former is the correct one and the
latter is an unfortunate leftover from an earlier version of the
proposal, according to [1].  It does make sense, because pointing to a
name table entry is ambiguous, while pointing to an index entry directly
is not.  Unfortunately, GDB implemented pointing to a name table entry.

Changes on the writer side:

 - For each written pool entry, remember the offset within the pool.

 - Change the DW_IDX_parent form to DW_FORM_data4.

   Using DW_FORM_udata isn't an option, because we don't know the actual
   value when doing the first pass of writing the pool (see next point),
   so we wouldn't know how many bytes to reserve, if we used a
   variable-size encoding.

   Using a fixed 4-byte encoding would be an issue if the entry pool
   was larger than 4 GiB, but that seems unlikely.

   Note that clang uses DW_FORM_ref4 for this, but I'm not sure it is
   appropriate, since forms of the reference class are specified as
   referring "to one of the debugging information entries that describe
   the program".  Since we're not referring to a DIE, I decided to stay
   with a form of the "constant" class.  I think that readers will be
   able to understand either way.

 - Write a dummy 4-byte number when writing the pool, then patch those
   values later.  This is needed because parents can appear after their
   children in the pool (there's no way to ensure that parents always
   appear before their children), so we might not know at first what
   value to put in.

 - Add a `write_uint` method to `class data_buf` to support that use
   case of patching a value in the middle of the data buffer.

 - Simplify the type of `m_name_to_value_set`; we no longer need to
   track the index at which a name will be written.

 - Produce a new augmentation string, "GDB3", to be able to distinguish
   "old" and "new" indexes.  It would be possible for a reader to
   distinguish the two semantics of DW_IDX_parent using the form.
   However, current versions of GDB don't do that, so they would be
   confused trying to read a new index.  I think it is preferable to use
   a new augmentation string so that they will reject a new index
   instead.
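
The placeholder-then-patch approach from the writer-side list can be sketched
as follows.  This is a simplified model, not GDB's code: `byte_buf_sketch`,
`entry_sketch` and `write_pool` are hypothetical names, every entry is reduced
to one 4-byte payload plus one 4-byte parent slot, little-endian byte order is
assumed, and entries without a parent keep a zero slot rather than omitting
the attribute.

```cpp
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical byte buffer; GDB's data_buf has a different interface.
class byte_buf_sketch
{
public:
  // Append VALUE as 4 little-endian bytes and return the offset used.
  std::size_t append_uint32 (std::uint32_t value)
  {
    std::size_t offset = m_bytes.size ();
    m_bytes.resize (offset + 4);
    write_uint32 (offset, value);
    return offset;
  }

  // Overwrite the 4 bytes at OFFSET with VALUE (the patching step).
  void write_uint32 (std::size_t offset, std::uint32_t value)
  {
    for (int i = 0; i < 4; ++i)
      m_bytes.at (offset + i) = static_cast<std::uint8_t> (value >> (8 * i));
  }

  std::uint32_t read_uint32 (std::size_t offset) const
  {
    std::uint32_t value = 0;
    for (int i = 0; i < 4; ++i)
      value |= static_cast<std::uint32_t> (m_bytes.at (offset + i)) << (8 * i);
    return value;
  }

  std::size_t size () const { return m_bytes.size (); }

private:
  std::vector<std::uint8_t> m_bytes;
};

struct entry_sketch
{
  std::uint32_t payload;  // stand-in for the entry's other attributes
  int parent;             // index of the parent entry, or -1 for none
};

// Write ENTRIES to POOL and return the pool offset of each entry.
std::vector<std::size_t>
write_pool (const std::vector<entry_sketch> &entries, byte_buf_sketch &pool)
{
  std::vector<std::size_t> entry_offset (entries.size ());
  std::vector<std::size_t> parent_slot (entries.size ());

  // First pass: emit entries with a dummy parent value, remembering
  // where each entry starts and where its parent slot lives.
  for (std::size_t i = 0; i < entries.size (); ++i)
    {
      entry_offset[i] = pool.append_uint32 (entries[i].payload);
      parent_slot[i] = pool.append_uint32 (0);
    }

  // Second pass: all offsets are known now, so patch the slots.
  for (std::size_t i = 0; i < entries.size (); ++i)
    if (entries[i].parent >= 0)
      {
        std::size_t parent = static_cast<std::size_t> (entries[i].parent);
        pool.write_uint32 (parent_slot[i],
                           static_cast<std::uint32_t> (entry_offset[parent]));
      }

  return entry_offset;
}
```

The fixed-width slot is what makes patching possible: a variable-length
encoding would change the size of the pool, and with it every later offset.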

Changes on the reader side:

 - Track the GDB augmentation version, in addition to whether the
   augmentation string indicates the index was produced by GDB.

 - When reading index entries, maintain a "pool offset" -> "cooked index
   entry" mapping, to be able to find parents by pool offset.

 - When resolving parents, keep the existing behavior of finding parents
   by name table index if the augmentation string is "GDB2".  Otherwise,
   look up parents by pool offset.  This assumes that .debug_names from
   other producers (if/when we add support for reading them) use pool
   offsets for DW_IDX_parent.  This is at least what clang does.

 - Simplify augmentation string comparison a bit by using array views.
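
On the reader side, resolving parents by pool offset amounts to a map lookup.
The sketch below uses hypothetical names (`read_entry_sketch`,
`resolve_parents`) and covers only the pool-offset case, not the "GDB2"
name-table-index fallback.

```cpp
#include <cstdint>
#include <string>
#include <unordered_map>
#include <vector>

// Hypothetical, simplified view of an index entry as read from the pool.
struct read_entry_sketch
{
  std::string name;
  std::uint64_t pool_offset;   // where this entry starts in the entry pool
  bool has_parent;             // whether a DW_IDX_parent value was present
  std::uint64_t parent_value;  // raw DW_IDX_parent value (a pool offset)
  const read_entry_sketch *parent = nullptr;
};

// Fill in the `parent` pointers.  ENTRIES must not be resized afterwards,
// since the pointers refer to elements of the vector.
void resolve_parents (std::vector<read_entry_sketch> &entries)
{
  // "pool offset" -> "entry" mapping, built once all entries are read.
  std::unordered_map<std::uint64_t, const read_entry_sketch *> by_offset;
  for (const read_entry_sketch &e : entries)
    by_offset.emplace (e.pool_offset, &e);

  for (read_entry_sketch &e : entries)
    {
      if (!e.has_parent)
        continue;
      auto it = by_offset.find (e.parent_value);
      if (it != by_offset.end ())
        e.parent = it->second;
    }
}
```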

Update the "Extensions to ‘.debug_names’" section of the documentation
to reflect the new augmentation string version.

Tested by:

 - manually producing executables with "GDB2" and "GDB3" .debug_names
   sections and reading them.
 - running the testsuite with the cc-with-debug-names board

[1] https://lists.dwarfstd.org/pipermail/dwarf-discuss/2025-January/002618.html

Change-Id: I265fa38070b86ef320e0a972c300d1d755735d8d
Reviewed-By: Eli Zaretskii <eliz@gnu.org>
Approved-By: Tom Tromey <tom@tromey.com>
2025-02-10 11:26:44 -05:00
Simon Marchi
d0cfeca7e9 gdb: remove unused includes from dwarf2/index-write.c
These includes are reported as unused by clangd.

Change-Id: Ibf3cdc881abad5f5969edca623412ceac7212149
2025-01-30 12:46:54 -05:00
Tom Tromey
8d6a3c8e72 Add missing includes of extract-store-integer.h
I found a number of .c files that need to include
extract-store-integer.h but that were only including it indirectly.
This patch adds the missing includes.  This change enables the next
patch.

Approved-By: Andrew Burgess <aburgess@redhat.com>
2025-01-17 08:52:04 -07:00
Tom de Vries
80ac478511 [gdb/symtab] Handle multiple .debug_info sections
When compiling dw2-multiple-debug-info.c using -gdwarf-5
-fdebug-types-section, we end up with two .debug_info sections in the object
file:
...
$ g++ gdb.dwarf2/dw2-multiple-debug-info.c -c -g \
    -gdwarf-5 \
    -fdebug-types-section
$ readelf -WS dw2-multiple-debug-info.o | grep -v RELA | grep .debug_info
  [10] .debug_info  PROGBITS        0 000128 0000cd 00  GC  0   0  8
  [12] .debug_info  PROGBITS        0 0001f8 0000ad 00   C  0   0  8
...

One of them contains the CU for dw2-multiple-debug-info.c, the other contains
the TU for the type of variable a.

When trying to print the type of variable a, we get:
...
$ gdb -q -batch dw2-multiple-debug-info.o -ex "ptype a"
'a' has unknown type; cast it to its declared type
...
because the TU hasn't been read.

Fix this by adding support for reading multiple .debug_info sections, similar
to how that is done for multiple .debug_types sections, getting us instead:
...
$ gdb -q -batch dw2-multiple-debug-info.o -ex "ptype a"
type = class sp1::A {
  ...
}
...

Tested on x86_64-linux.

PR symtab/32223
Bug: https://sourceware.org/bugzilla/show_bug.cgi?id=32223
2024-10-29 10:08:04 +01:00
Andrew Burgess
e454ae416a gdb: allow quoted filenames for commands that have custom completion
This commit changes how GDB processes command arguments for the
following commands:

  compile file
  maint print c-tdesc
  save gdb-index

After this commit these commands will now expect their single filename
argument to be (optionally) quoted if it contains any special
characters (e.g. white space or quotes).

If the filename does not contain any special characters then nothing
changes.  As an example:

   (gdb) save gdb-index /path/to/some/directory/

will work before and after this patch.  However, if the directory
name contains a white space then before this patch a user would write:

  (gdb) save gdb-index /path/to some/directory/

But this will now fail as GDB will consider this as two arguments,
'/path/to' and 'some/directory/'.  To pass this single directory name
a user must now do one of these:

  (gdb) save gdb-index "/path/to some/directory/"
  (gdb) save gdb-index '/path/to some/directory/'
  (gdb) save gdb-index /path/to\ some/directory/

This brings these commands into line with commands like 'file' and
'symbol-file', which have supported quoted filenames for a while.
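
To make the three accepted spellings concrete, here is a small argument
splitter that treats them the same way.  It is an illustrative sketch only:
`split_args_sketch` is a hypothetical function, not GDB's parser, and its
rules are simplified (no escapes inside quotes, and an unterminated quote
simply runs to the end of the input).

```cpp
#include <cstddef>
#include <string>
#include <vector>

// Split INPUT on unquoted white space, honoring "...", '...' and
// backslash escapes outside quotes.
std::vector<std::string> split_args_sketch (const std::string &input)
{
  std::vector<std::string> args;
  std::string current;
  bool in_arg = false;  // true once the current argument has started
  char quote = '\0';    // active quote character, or '\0' when unquoted

  for (std::size_t i = 0; i < input.size (); ++i)
    {
      char c = input[i];

      if (quote != '\0')
        {
          // Inside quotes: everything is literal until the closing quote.
          if (c == quote)
            quote = '\0';
          else
            current += c;
        }
      else if (c == '"' || c == '\'')
        {
          quote = c;
          in_arg = true;
        }
      else if (c == '\\' && i + 1 < input.size ())
        {
          // Backslash makes the next character literal.
          current += input[++i];
          in_arg = true;
        }
      else if (c == ' ' || c == '\t')
        {
          // Unquoted white space ends the current argument, if any.
          if (in_arg)
            {
              args.push_back (current);
              current.clear ();
              in_arg = false;
            }
        }
      else
        {
          current += c;
          in_arg = true;
        }
    }

  if (in_arg)
    args.push_back (current);
  return args;
}
```

Under these rules the unquoted `/path/to some/directory/` yields two
arguments, while each of the three quoted or escaped forms yields one.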

The motivation for this change is to make handling of filename
arguments consistent throughout GDB.  We can't move to all commands
taking non-quoted filenames as the non-quoted style only allows for a
single argument.  Additionally, the non-quoted style doesn't allow for
filenames that end in white space (though this is probably pretty
rare).  So, if we want to have consistency the only choice is to move
towards supporting quoted filenames.

Reviewed-By: Eli Zaretskii <eliz@gnu.org>
2024-09-07 20:28:59 +01:00
Andrew Burgess
dc22ab49e9 gdb: deprecated filename_completer and associated functions
Following on from the previous commit, this commit marks the old
unquoted filename completion related functions as deprecated.

The aim of doing this is to make it more obvious to someone adding a
new command that they should not be using the older unquoted style
filename argument handling.

I split this change from the previous to make for an easier review.
This commit touches more files, but is _just_ function renaming.
Check out gdb/completer.{c,h} for what has been renamed.  All the
other files have just been updated to use the new names.

There should be no user visible changes after this commit.
2024-09-07 20:28:58 +01:00
Simon Marchi
e5afccc7e4 gdb/dwarf2: cleanup includes
Cleanup includes in dwarf2/*.

 1. Add the necessary includes so that clangd reports no errors when
    opening header files.  This ensures that header files include what
    they use.

 2. Remove all includes reported as unused by clangd (except
    gdb-safe-ctype.h, which I think does some magic that affects what
    follows).

Build-tested with --enable-threading set to "yes" and "no", since there
are some portions of code gated by `#ifdef CXX_STD_THREAD`.

Change-Id: I21debffcd7c2caf90f08e1e0fbba3ce30422d042
Approved-By: Tom Tromey <tom@tromey.com>
2024-08-30 13:57:11 -04:00
Simon Marchi
05d9d66d92 gdb: remove unused includes in utils.h
Remove some includes reported as unused by clangd.  Add some includes in
other files that were previously relying on the transitive include.

Change-Id: Ibdd0a998b04d21362a20d0ca8e5267e21e2e133e
2024-05-30 22:43:52 -04:00
Simon Marchi
5b9707eb87 gdb: remove gdbcmd.h
Most files including gdbcmd.h currently rely on it to access things
actually declared in cli/cli-cmds.h (setlist, showlist, etc).  To make
things easy, replace all includes of gdbcmd.h with includes of
cli/cli-cmds.h.  This might lead to some unused includes of
cli/cli-cmds.h, but it's harmless, and much faster than going through
the 170 or so files by hand.

Change-Id: I11f884d4d616c12c05f395c98bbc2892950fb00f
Approved-By: Tom Tromey <tom@tromey.com>
2024-04-25 12:59:02 -04:00
Simon Marchi
18d2988e5d gdb, gdbserver, gdbsupport: remove includes of early headers
Now that defs.h, server.h and common-defs.h are included via the
`-include` option, it is no longer necessary for source files to include
them.  Remove all the inclusions of these files I could find.  Update
the generation scripts where relevant.

Change-Id: Ia026cff269c1b7ae7386dd3619bc9bb6a5332837
Approved-By: Pedro Alves <pedro@palves.net>
2024-03-26 21:13:22 -04:00
Tom Tromey
932e5949a9 Use DW_FORM_ref_addr for DIE offset in .debug_names
Today I realized that while the .debug_names writer uses DW_FORM_udata
for the DIE offset, DW_FORM_ref_addr would be more appropriate here.
This patch makes this change.

Bug: https://sourceware.org/bugzilla/show_bug.cgi?id=31361
2024-02-29 17:12:58 -07:00
Tom Tromey
974b36c2ae Use the new symbol domains
This patch changes the DWARF reader to use the new symbol domains.  It
also adjusts many bits of associated code to adapt to this change.

The non-DWARF readers are updated on a best-effort basis.  This is
somewhat simpler since most of them only support C and C++.  I have no
way to test a few of these.

I went back and forth a few times on how to handle the "tag"
situation.  The basic problem is that C has a special namespace for
tags, which is separate from the type namespace.  Other languages
don't do this.  So, the question is, should a DW_TAG_structure_type
end up in the tag domain, or the type domain, or should it be
language-dependent?

I settled on making it language-dependent using a thought experiment.
Suppose there was a Rust compiler that only emitted nameless
DW_TAG_structure_type objects, and specified all structure type names
using DW_TAG_typedef.  This DWARF would be correct, in that it
faithfully represents the source language -- but would not work with a
purely struct-domain implementation in gdb.  Therefore gdb would be
wrong.

Now, this approach is a little tricky for C++, which uses tags but
also enters a typedef for them.  I notice that some other readers --
like stabsread -- actually emit a typedef symbol as well.  And, I
think this is a reasonable approach.  It uses more memory, but it
makes the internals simpler.  However, DWARF never did this for
whatever reason, and so in the interest of keeping the series slightly
shorter, I've left some C++-specific hacks in place here.

Note that this patch includes language_minimal as a language that uses
tags.  I did this to avoid regressing gdb.dwarf2/debug-names-tu.exp,
which doesn't specify the language for a type unit.  Arguably this
test case is wrong.

Bug: https://sourceware.org/bugzilla/show_bug.cgi?id=30164
2024-01-28 10:58:16 -07:00
Tom Tromey
91a42a6180 Rewrite .debug_names writer
This rewrites GDB's .debug_names writer.  It is now closer to the form
imagined in the DWARF spec.  In particular, names are emitted exactly
as they appear in the original DWARF.

In order to make the reader work nicely, some extensions were needed.
These were all documented in an earlier patch.  Note that in
particular this writer solves the "main name" problem by putting a
flag into the table.

GDB does not use the .debug_names hash table, so it also does not
write one.  I consider this hash table to be essentially useless in
general, due to the name canonicalization problem -- while DWARF says
that writers should use the system demangling style, (1) this style
varies across systems, so it can't truly be relied on; and (2) at
least GCC and one other compiler don't actually follow this part of
the spec anyway.

It's important to note, though, that even if the hash was somehow
useful, GDB probably still would not use it -- a sorted list of names
is needed for completion and performs reasonably well for other
lookups, so a hash table is just overhead, IMO.
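
The point about completion can be illustrated with a sorted vector and a
binary search.  This is a generic sketch, not GDB's lookup code;
`complete_prefix` is a hypothetical name.

```cpp
#include <algorithm>
#include <string>
#include <vector>

// Return every name in SORTED_NAMES that starts with PREFIX.
// SORTED_NAMES must be sorted in ascending order.
std::vector<std::string>
complete_prefix (const std::vector<std::string> &sorted_names,
                 const std::string &prefix)
{
  std::vector<std::string> matches;

  // All names sharing PREFIX are contiguous, starting at the first
  // name that is not less than PREFIX.
  auto it = std::lower_bound (sorted_names.begin (), sorted_names.end (),
                              prefix);

  for (; it != sorted_names.end ()
         && it->compare (0, prefix.size (), prefix) == 0; ++it)
    matches.push_back (*it);

  return matches;
}
```

A hash table can answer exact-match queries but cannot enumerate names by
prefix without scanning every bucket, which is why a sorted list serves both
purposes here.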

String emission is also simplified.  There's no need in this writer to
ingest the contents of .debug_str.

A couple of tests are updated to reflect the fact that they now "fail"
because the tests don't include .debug_aranges in the .S file.
Arguably the .debug_names writer should also create this section; but
I did not implement that in this series, and there is a separate bug
about it.

Bug: https://sourceware.org/bugzilla/show_bug.cgi?id=24820
Bug: https://sourceware.org/bugzilla/show_bug.cgi?id=24549
2024-01-18 08:51:06 -07:00
Tom Tromey
6e53356b77 Do not write the index cache from an index
The new .debug_names reader will work by creating a cooked index from
.debug_names.  This patch updates cooked_index::maybe_write_index to
avoid writing the index in this case.

However, in order to do this in a clean way, the readers are changed
so that a nullptr result from index_for_writing means "cannot be
done", and then the error message is moved into write_dwarf_index
(where it historically lived).
2024-01-18 08:20:16 -07:00
Tom Tromey
5902fa8a55 Add language to cooked_index_entry
This adds a new 'lang' member to cooked_index_entry.  This holds the
language of the symbol.  This is primarily useful for the new
.debug_names reader, which will not scan the CUs for languages up
front.

This also changes cooked_index_shard::add to return a non-const
pointer.  This doesn't impact the current code, but is needed for the
new reader.
2024-01-18 08:20:16 -07:00
Andrew Burgess
1d506c26d9 Update copyright year range in header of all files managed by GDB
This commit is the result of the following actions:

  - Running gdb/copyright.py to update all of the copyright headers to
    include 2024,

  - Manually updating a few files the copyright.py script told me to
    update, these files had copyright headers embedded within the
    file,

  - Regenerating gdbsupport/Makefile.in to refresh its copyright
    date,

  - Using grep to find other files that still mentioned 2023.  If
    these files were updated last year from 2022 to 2023 then I've
    updated them this year to 2024.

I'm sure I've probably missed some dates.  Feel free to fix them up as
you spot them.
2024-01-12 15:49:57 +00:00
Nils-Christian Kempke
3396471b4c dwarf, fortran: add support for DW_TAG_entry_point
Fortran provides additional entry points for subroutines and functions.
These entry points may use only a subset (or a different set) of the
parameters of the original subroutine.  The entry points may be described
via the DWARF tag DW_TAG_entry_point.

This commit adds support for parsing the DW_TAG_entry_point DWARF tag.
Currently, between ifx/ifort/gfortran, only ifort is actually emitting
this tag.  Both ifx and gfortran use the DW_TAG_subprogram tag as a
workaround/alternative.  Thus, this patch really only adds more ifort
support.  Even so, some of the attached tests still fail for ifort, due
to some wrong line info generated for the entry points in ifort.

After this patch it is possible to set a breakpoint in gdb with the
ifort compiled example at the entry points 'foo' and 'foobar', which was not
possible before.

As gcc and ifx do not emit the tag I also added a test to gdb.dwarf2
which uses some underlying C compiled code and adds some Fortran style DWARF
to it emitting the DW_TAG_entry_point.  Before this patch it was not
possible to actually define breakpoints at the entry point tags.

For gfortran there actually exists a bug on bugzilla, asking for the use
of DW_TAG_entry_point over DW_TAG_subprogram:

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=37134

This patch was originally posted here

https://sourceware.org/legacy-ml/gdb-patches/2017-07/msg00317.html

but its review/pinging got lost after a while.  I reworked it to fit the
current GDB.

Co-authored-by: Bernhard Heckel <bernhard.heckel@intel.com>
Co-authored-by: Tim Wiederhake  <tim.wiederhake@intel.com>
Approved-by: Tom Tromey <tom@tromey.com>
2023-12-29 11:31:10 +01:00
Andrew Burgess
5c5e642dc0 gdb: improve error reporting for 'save gdb-index'
While making recent changes to 'save gdb-index' command I triggered
some errors -- of the kind a user might be expected to trigger if they
do something wrong -- and I didn't find GDB's output as helpful as it
might be.

For example:

  $ gdb -q /tmp/hello.x
  ...
  (gdb) save gdb-index /non_existing_dir
  Error while writing index for `/tmp/hello': mkstemp: No such file or directory.

That the error message mentions '/tmp/hello', which does exist, but
doesn't mention '/non_existing_dir', which doesn't, is, I think,
confusing.

Also, I find the 'mkstemp' in the error message confusing for a user
facing error.  A user might not know what mkstemp means, and even if
they do, that it appears in the error message is an internal GDB
detail.  The user doesn't care what function failed, but wants to know
what was wrong with their input, and what they should do to fix
things.

Similarly, for a directory that does exist, but can't be written to:

  (gdb) save gdb-index /no_access_dir
  Error while writing index for `/tmp/hello': mkstemp: Permission denied.

In this case, the 'Permission denied' might make the user think there
is a permissions issue with '/tmp/hello', which is not the case.

After this patch, the new errors are:

  (gdb) save gdb-index /non_existing_dir
  Error while writing index for `/tmp/hello': `/non_existing_dir': No such file or directory.

and:

  (gdb) save gdb-index /no_access_dir
  Error while writing index for `/tmp/hello': `/no_access_dir': Permission denied.

we also have:

  (gdb) save gdb-index /tmp/not_a_directory
  Error while writing index for `/tmp/hello': `/tmp/not_a_directory': Is not a directory.

I think these do a better job of guiding the user towards fixing the
problem.

I've added a new test that exercises all of these cases, and also
checks the case where a user tries to use an executable that already
contains an index in order to generate an index.  As part of the new
test I've factored out some code from ensure_gdb_index (lib/gdb.exp)
into a new proc (get_index_type), which I've then used in the new
test.  I've confirmed that all the tests that use ensure_gdb_index
still pass.

During review it was pointed out that the testsuite proc
have_index (lib/gdb.exp) is similar to the new get_index_type proc, so
I've rewritten have_index to also use get_index_type, I've confirmed
that all the tests that use have_index still pass.

Nothing that worked correctly before this patch should give an error
after this patch; I've only changed the output when the user was going
to get an error anyway.

Reviewed-By: Tom de Vries <tdevries@suse.de>
Reviewed-By: Tom Tromey <tom@tromey.com>
Approved-By: Tom Tromey <tom@tromey.com>
2023-12-13 08:54:06 +00:00
Tom Tromey
b096524827 Use try_emplace in index-write.c
index-write.c has a comment indicating that C++17's try_emplace could
be used.  This patch makes the change.
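
For reference, a typical use of `try_emplace` looks like the sketch below.
The `intern_name` helper is a made-up example and is not taken from
index-write.c.

```cpp
#include <string>
#include <unordered_map>

// Return the id for NAME, assigning the next free id on first use.
int intern_name (std::unordered_map<std::string, int> &ids,
                 const std::string &name)
{
  // try_emplace inserts only if NAME is absent; either way it returns
  // an iterator to the element, so no separate find() is needed.
  return ids.try_emplace (name, static_cast<int> (ids.size ())).first->second;
}
```

Unlike `operator[]` followed by an assignment, this performs a single lookup
and leaves an existing value untouched.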

Approved-By: Pedro Alves <pedro@palves.net>
2023-11-29 14:29:44 -07:00
Andrew Burgess
3644f41dc8 gdb: generate dwarf-5 index identically as worker-thread count changes
Similar to the previous commit, this commit ensures that the dwarf-5
index files are generated identically as the number of worker-threads
changes.

Building the dwarf-5 index makes use of a closed hash table, the
bucket_hash local within debug_names::build().  Entries are added to
bucket_hash from m_name_to_value_set, which, in turn, is populated
by calls to debug_names::insert() in write_debug_names.  The insert
calls are ordered based on the entries within the cooked_index, and
the ordering within cooked_index depends on the number of worker
threads that GDB is using.

My proposal is to sort each chain within the bucket_hash closed hash
table prior to using this to build the dwarf-5 index.

The buckets within bucket_hash will always have the same ordering (for
a given GDB build with a given executable), and by sorting the chains
within each bucket, we can be sure that GDB will see each entry in a
deterministic order.
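
A minimal sketch of the idea, assuming a simplified table where each
bucket holds a chain of names.  The types, the toy hash, and
build_buckets are all hypothetical and are not the debug_names code.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <string>
#include <vector>

// Hypothetical stand-in for the closed hash table: one chain per bucket.
using bucket_table = std::vector<std::vector<std::string>>;

// Toy string hash, for illustration only (not the DWARF 5 hash).
unsigned
toy_hash (const std::string &s)
{
  unsigned h = 5381;
  for (unsigned char c : s)
    h = h * 33 + c;
  return h;
}

// Distribute NAMES over NBUCKETS chains, then sort every chain.  Which
// bucket a name lands in depends only on its hash, so after sorting the
// result no longer depends on the order in which names arrived.
bucket_table
build_buckets (const std::vector<std::string> &names, std::size_t nbuckets)
{
  assert (nbuckets > 0);
  bucket_table buckets (nbuckets);
  for (const std::string &n : names)
    buckets[toy_hash (n) % nbuckets].push_back (n);
  for (auto &chain : buckets)
    std::sort (chain.begin (), chain.end ());
  return buckets;
}
```

Feeding the same names in two different orders yields tables that
compare equal.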

I've extended the index creation test to cover this case.

Approved-By: Tom Tromey <tom@tromey.com>
2023-11-28 10:23:19 +00:00
Andrew Burgess
aff250145a gdb: generate gdb-index identically regardless of work thread count
It was observed that changing the number of worker threads that GDB
uses (maintenance set worker-threads NUM) would have an impact on the
layout of the generated gdb-index.

The cause seems to be how the CUs are distributed between threads, and
then symbols that appear in multiple CUs can be encountered earlier or
later depending on whether a particular CU moves between threads.

I certainly found this behaviour was reproducible when generating an
index for GDB itself, like:

  gdb -q -nx -nh -batch \
      -eiex 'maint set worker-threads NUM' \
      -ex 'save gdb-index /tmp/'

And then setting different values for NUM will change the generated
index.

Now, the question is: does this matter?

I would like to suggest that yes, this does matter.  At Red Hat we
generate a gdb-index as part of the build process, and we would
ideally like to have reproducible builds: for the same source,
compiled with the same tool-chain, we should get the exact same output
binary.  And we do .... except for the index.

Now we could simply force GDB to only use a single worker thread when
we build the index, but, I don't think the idea of reproducible builds
is that strange, so I think we should ensure that our generated
indexes are always reproducible.

To achieve this, I propose that we add an extra step when building the
gdb-index file.  After constructing the initial symbol hash table
contents, we will pull all the symbols out of the hash, sort them,
then re-insert them in sorted order.  This will ensure that the
structure of the generated hash will remain consistent (given the same
set of symbols).
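
The effect can be sketched with a toy open-addressing table.  This is a
simplification under stated assumptions: slot_table, insert_name, and
build_sorted are hypothetical, the hash is deliberately weak so that
collisions occur, and the names are sorted before the first insertion
rather than pulled out of an existing table.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <string>
#include <vector>

// Hypothetical open-addressing table; an empty string marks a free slot.
using slot_table = std::vector<std::string>;

// Insert NAME with linear probing.  When two names collide, the final
// layout depends on which one was inserted first.  The caller must
// keep at least one slot free.
void
insert_name (slot_table &slots, const std::string &name)
{
  assert (!name.empty ());
  // Toy hash: first character only, to force collisions.
  std::size_t i = static_cast<unsigned char> (name[0]) % slots.size ();
  while (!slots[i].empty ())
    {
      if (slots[i] == name)
        return;			// Duplicate, nothing to add.
      i = (i + 1) % slots.size ();
    }
  slots[i] = name;
}

// Sort and de-duplicate NAMES, then insert them in sorted order, so the
// layout depends only on the set of names.
slot_table
build_sorted (std::vector<std::string> names, std::size_t nslots)
{
  std::sort (names.begin (), names.end ());
  names.erase (std::unique (names.begin (), names.end ()), names.end ());
  assert (names.size () < nslots);

  slot_table slots (nslots);
  for (const std::string &n : names)
    insert_name (slots, n);
  return slots;
}
```

Inserting "foo" then "far" directly gives a different layout from
inserting "far" then "foo", while build_sorted gives the same layout for
both input orders.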

I've extended the existing index-file test to check that the generated
index doesn't change if we adjust the number of worker threads used.
Given that this test is already rather slow, I've only made one change
to the worker-thread count.  Maybe this test should be changed to use
a smaller binary, which is quicker to load, and for which we could
then try many different worker thread counts.

Approved-By: Tom Tromey <tom@tromey.com>
2023-11-28 10:23:19 +00:00
Andrew Burgess
acc117b57f gdb: C++-ify mapped_symtab from dwarf2/index-write.c
Make the static functions add_index_entry, find_slot, and hash_expand
member functions of the mapped_symtab class.

Fold an additional snippet of code from write_gdbindex into
mapped_symtab::minimize, this code relates to minimisation, so this
seems like a good home for it.

Make the n_elements, data, and m_string_obstack member variables of
mapped_symtab private.  Provide a new obstack() member function to
provide access to the obstack when needed, and also add member
functions begin(), end(), cbegin(), and cend() so that the
mapped_symtab class can be treated like a container and iterated
over.

I've also taken this opportunity to split out the logic for whether
the hash table (m_data) needs expanding; this is the new function
hash_needs_expanding.  This will be useful in a later commit.
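
To show the shape of such an interface, here is a hypothetical
miniature.  The class mini_symtab, its members, and the capacity
parameter are invented for the example and do not reflect the real
mapped_symtab layout.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <vector>

// Hypothetical miniature table class with private data and an
// iterator interface.
class mini_symtab
{
public:
  using const_iterator = std::vector<std::string>::const_iterator;

  void add (const std::string &name)
  { m_data.push_back (name); }

  // True once the table would be at least 75% full for CAPACITY slots.
  // The threshold is an assumption made for this sketch.
  bool hash_needs_expanding (std::size_t capacity) const
  {
    assert (capacity > 0);
    return 4 * m_data.size () >= 3 * capacity;
  }

  const_iterator begin () const { return m_data.begin (); }
  const_iterator end () const { return m_data.end (); }
  const_iterator cbegin () const { return m_data.cbegin (); }
  const_iterator cend () const { return m_data.cend (); }

private:
  std::vector<std::string> m_data;
};

// With begin() and end() available, callers can use a range-based for
// loop without touching the private data.
std::size_t
total_name_length (const mini_symtab &tab)
{
  std::size_t total = 0;
  for (const std::string &name : tab)
    total += name.size ();
  return total;
}
```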

There should be no user visible changes after this commit.

Approved-By: Tom Tromey <tom@tromey.com>
2023-11-28 10:23:19 +00:00
Andrew Burgess
aa19bc1d25 gdb: reduce size of generated gdb-index file
I noticed in passing that our algorithm for generating the gdb-index
file is incorrect.  When building the hash table in add_index_entry we
count every incoming entry, and rehash when the number of entries gets
too large.  However, some of the incoming entries will be duplicates,
which don't actually result in new items being added to the hash
table.

As a result, we grow the gdb-index hash table far too often.

With an unmodified GDB, generating a gdb-index for GDB, I see a file
size of 90M, with a hash usage (in the generated index file) of just
2.6%.

With a patched GDB, generating a gdb-index for the _same_ GDB binary,
I now see a gdb-index file size of 30M, with a hash usage of 41.9%.

This is a 67% reduction in gdb-index file size.

Obviously, not every gdb-index file is going to see such big savings,
however, the larger a program, and the more symbols that are
duplicated between compilation units, the more GDB would over count,
and so, over-grow the index.

The gdb-index hash table we create has a minimum size of 1024, and
then we grow the hash when it is 75% full, doubling the hash table at
that time.  Given this, then we expect that either:

  a. The hash table is size 1024, and less than 75% full, or
  b. The hash table is between 37.5% and 75% full.

I've included a test that checks some of these constraints -- I've not
bothered to check the upper limit, an over-full hash table isn't
really a problem here, but if the fill percentage is less than 37.5%
then this indicates that we've done something wrong (obviously, I also
check for the 1024 minimum size).
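
The over-counting can be illustrated with a small model of the growth
policy.  This is a sketch under stated assumptions: final_capacity is a
hypothetical helper, a std::unordered_set stands in for the real table,
and the exact comparison used for "75% full" is a guess.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <unordered_set>
#include <vector>

// Return the final capacity after feeding NAMES through a table that
// starts at 1024 slots and doubles when it is 75% full.  When
// COUNT_DUPLICATES is true every incoming entry is counted (the
// behaviour described as incorrect above); otherwise only entries that
// are actually new are counted.
std::size_t
final_capacity (const std::vector<std::string> &names, bool count_duplicates)
{
  std::size_t capacity = 1024;
  std::size_t counted = 0;
  std::unordered_set<std::string> seen;

  for (const std::string &n : names)
    {
      bool is_new = seen.insert (n).second;
      if (is_new || count_duplicates)
        ++counted;
      if (4 * counted >= 3 * capacity)
        capacity *= 2;
    }

  // When only new entries are counted the table never ends up 75% full.
  assert (count_duplicates || 4 * seen.size () < 3 * capacity);
  return capacity;
}
```

With 10 distinct names each repeated 1000 times, counting duplicates
grows the model table to 16384 slots, while counting only new entries
leaves it at the 1024 minimum.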

Approved-By: Tom Tromey <tom@tromey.com>
2023-11-28 10:23:19 +00:00
Andrew Burgess
640713c5a0 gdb: option completion for 'save gdb-index' command
Add proper support for option completion to the 'save gdb-index'
command.  Update save_gdb_index_command function to make use of the
new option_def data structures for parsing the '-dwarf-5' option.

Approved-By: Tom Tromey <tom@tromey.com>
2023-11-28 10:23:18 +00:00
Andrew Burgess
4793f551a5 gdb: allow use of ~ in 'save gdb-index' command
Add a call to gdb_tilde_expand in the save_gdb_index_command function,
this means that we can now do:

  (gdb) save gdb-index ~/blah/

Previously this wouldn't work.
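
For readers unfamiliar with tilde expansion, the sketch below shows the
basic idea.  It is not gdb_tilde_expand: expand_tilde is a hypothetical
helper that takes the home directory as a parameter and ignores
"~user" forms.

```cpp
#include <cassert>
#include <string>

// Hypothetical helper: replace a leading "~" or "~/" in PATH with HOME.
// Any other path, including "~user/...", is returned unchanged.
std::string
expand_tilde (const std::string &path, const std::string &home)
{
  assert (!home.empty ());
  if (path == "~")
    return home;
  if (path.compare (0, 2, "~/") == 0)
    return home + path.substr (1);
  return path;
}
```

With a home directory of /home/u, the path ~/blah/ becomes
/home/u/blah/, and /tmp/ is left as it is.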

Approved-By: Tom Tromey <tom@tromey.com>
2023-11-28 10:23:18 +00:00
Lancelot Six
6b09f1342c gdb: Replace gdb::optional with std::optional
Since GDB now requires C++17, we don't need the internally maintained
gdb::optional implementation.  This patch makes the following replacements:
  - gdb::optional -> std::optional
  - gdb::in_place -> std::in_place
  - #include "gdbsupport/gdb_optional.h" -> #include <optional>

This change has mostly been done automatically.  One exception is
gdbsupport/thread-pool.* which did not use the gdb:: prefix as it
already lives in the gdb namespace.
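
For reference, a small standalone example of the standard facilities
named above.  The functions find_suffix and make_padding are
hypothetical and do not come from the GDB sources.

```cpp
#include <cassert>
#include <cstddef>
#include <optional>
#include <string>
#include <utility>

// Return the suffix of PATH starting at the last '.', or an empty
// optional when PATH contains no '.'.
std::optional<std::string>
find_suffix (const std::string &path)
{
  std::size_t dot = path.rfind ('.');
  if (dot == std::string::npos)
    return std::nullopt;
  return path.substr (dot);
}

// std::in_place constructs the contained value directly from the
// constructor arguments, here std::string (N, ' ').
std::optional<std::string>
make_padding (std::size_t n)
{
  return std::optional<std::string> (std::in_place, n, ' ');
}

// Typical call-site usage.
void
check_optional_examples ()
{
  assert (!find_suffix ("README").has_value ());
  assert (find_suffix ("a.gdb-index").value_or ("") == ".gdb-index");
  assert (make_padding (3).value () == "   ");
}
```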

Change-Id: I19a92fa03e89637bab136c72e34fd351524f65e9
Approved-By: Tom Tromey <tom@tromey.com>
Approved-By: Pedro Alves <pedro@palves.net>
2023-11-21 11:52:35 +00:00
Tom Tromey
fb8ea9d2ca Don't include cooked-index.h from dwarf2/read.h
dwarf2/read.h includes cooked-index.h, but it doesn't need to.  This
patch removes the inclusion from this header, and adds one to
index-write.c to make up for the absence.
2023-10-20 13:03:34 -06:00
Tom de Vries
938459015c [gdb/symtab] Fix more style issues in v9 .gdb_index section support
I noticed a few more style issues in commit 8b9c08edda ("[gdb/symtab] Add
name_of_main and language_of_main to the DWARF index"), after checking it
with gcc's check_GNU_style.{sh,py}.

Fix these.

Built on x86_64-linux.
2023-10-20 11:56:49 +02:00
Tom Tromey
379435351c Fix race in DWARF reader
The recent change to record the DWARF language in the per-CU data
yielded a race warning in my testing:

ThreadSanitizer: data race ../../binutils-gdb/gdb/dwarf2/read.c:21779 in prepare_one_comp_unit

This patch fixes the bug by applying the same style of fix that was
done for the ordinary (gdb) language.

I wonder if this code could be improved.  Requiring an atomic for the
language in particular seems unfortunate, as it is often consulted
during index finalization.  However, I haven't investigated this.
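
One common way to make a set-once field safe against this kind of race
is an atomic with a compare-and-exchange.  The sketch below assumes
that approach; unit_info, lang_id, and set_lang are hypothetical names
and this is not the code from read.c.

```cpp
#include <atomic>
#include <cassert>

// Hypothetical language tag; not GDB's enum language.
enum class lang_id { unknown, c, cplus, fortran };

struct unit_info
{
  // Record LANG if no language has been recorded yet.  Returns true if
  // this call stored the value.  If several threads call this at the
  // same time, exactly one of them wins and the others leave the field
  // untouched.
  bool set_lang (lang_id lang)
  {
    assert (lang != lang_id::unknown);
    lang_id expected = lang_id::unknown;
    return m_lang.compare_exchange_strong (expected, lang);
  }

  lang_id lang () const
  { return m_lang.load (); }

private:
  std::atomic<lang_id> m_lang {lang_id::unknown};
};
```

The cost mentioned above shows up in lang(): every read goes through an
atomic load, even long after the value has stopped changing.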

Regression tested on x86-64 Fedora 38.

Reviewed-by: Tom de Vries <tdevries@suse.de>
2023-10-19 16:51:29 -06:00
Tom de Vries
8bb3d8b1f9 [gdb/symtab] Fix two style issues in gdb/dwarf2/index-write.c
While reviewing gdb/dwarf2/index-write.c I noticed two style issues.

Fix these.

Tested on x86_64-linux.

Approved-By: Tom Tromey <tom@tromey.com>
2023-10-18 07:45:39 +02:00
Tom de Vries
729d066794 [gdb/symtab] Fix style issues in v9 .gdb_index section support
Post-commit review pointed out a few style issues in commit 8b9c08edda
("[gdb/symtab] Add name_of_main and language_of_main to the DWARF index").

Fix these.

Tested on x86_64-linux.

Reported-By: Tom Tromey <tom@tromey.com>
Approved-By: Tom Tromey <tom@tromey.com>
2023-10-18 07:45:39 +02:00
Matheus Branco Borella
8b9c08edda [gdb/symtab] Add name_of_main and language_of_main to the DWARF index
This patch adds a new section to the DWARF index containing the name
and the language of the main function symbol, gathered from
`cooked_index::get_main`, if available. Currently, for lack of a better name,
this section is called the "shortcut table". The way this name is saved, and
then applied when an index is loaded, mirrors how it is done in
`cooked_index_functions`: the full name of the main function symbol is saved,
and `set_objfile_main_name` is used to apply it after the index is loaded.

The main use case for this patch is in improving startup times when dealing with
large binaries. Currently, when an index is used, GDB has to expand symtabs
until it finds out what the language of the main function symbol is. For some
large executables, this may take a considerable amount of time to complete,
slowing down startup. This patch bypasses that operation by having both the name
and language of the main function symbol be provided ahead of time by the index.

In my testing (a binary with about 1.8GB worth of DWARF data) this change brings
startup time down from about 34 seconds to about 1.5 seconds.

When testing the patch with target board cc-with-gdb-index, test-case
gdb.fortran/nested-funcs-2.exp starts failing, but this is due to a
pre-existing issue, filed as PR symtab/30946.

Tested on x86_64-linux, with target board unix and cc-with-gdb-index.

PR symtab/24549
Bug: https://sourceware.org/bugzilla/show_bug.cgi?id=24549

Approved-By: Tom de Vries <tdevries@suse.de>
2023-10-10 10:26:40 +02:00