mirror of
https://github.com/bminor/binutils-gdb.git
synced 2025-12-05 23:23:09 +00:00
This commit removes the attempted de-duplication of types when building the gdb-index. This commit is the natural extension of this earlier commit: commitaef36dee93Date: Sun Aug 13 14:08:06 2023 +0200 [gdb/symtab] Don't deduplicate variables in gdb-index Which removed the de-duplication of variables. It is worth reading the earlier commit as all the justifications for that patch also apply to this one. Currently, when building the gdb-index we sort the type entries, moving declarations to the end of the entry list, and non-declarations to the front. Then within each group, declarations, and non-declarations, the index entries are sorted by CU offset. We then emit the first entry for any given type name. There are two problems with this. First, a non-declaration entry could be a definition, but it could also be a typedef. Now sure, a typedef is a type definition, but not necessarily a useful one. If we have a header file that contains: typedef struct foo_t foo_t; And a CU which makes use of 'foo_t', then the CU will include both a typedef and a type declaration. The target of the typedef will be the declaration. But notice, the CU will not include a type definition. If we have two CUs, one which only sees the above typedef and declaration, and another which sees the typedef and an actual type definition, then the final list of entries for this type's name will be: 1. A typedef entry that points at the declaration. 2. A typedef entry that points at the definition. 3. A definition. 4. A declaration. Now (4) will get sorted to the end of the entry list. But the order of (1), (2), and (3) will depend on the CU offset. If the CU which containing the typedef and declaration has the smallest offset, then (1) will be sorted to the front of the list of entries for this type name. Due to the de-duplication code this means that only (1) will be added to the gdb-index. After GDB starts and parses the index, if a user references 'foo_t' GDB will look in the index and find just (1). GDB loads the CU containing (1) and finds both the typedef and the declaration. But GDB does not find the full type definition. As a result GDB will display 'foo_t' as an incomplete type. This differs from the behaviour when no index is used. With no index GDB expands the first CU containing 'foo_t', finds the typedef and type declaration, decides that this is not good enough and carries on. GDB will then expand the second CU and find the type's definition, GDB now has a full understanding of the type, and can print the type correctly. We could solve this problem by marking typedefs as a distinct sub-category of types, just as we do with declarations. Then we could sort definitions to the front of the list, then typedefs, and finally, declarations after that. This would, I think, mean that we always prefer emitting a definition for a type, which would resolve this first problem, or at least, it would resolve it well enough, but it wouldn't fix the second problem. The second problem is that the Python API and the 'info types' command can be used to query all type symbols. As such, GDB needs to be able to find all the CUs which contain a given type. Especially as it is possible that a type might be defined differently within different CUs. NOTE: Obviously a program doing this (defining a type differently in different CUs) would need to be mindful of the One Definition Rule, but so long as the type doesn't escape outside of a single CU then reusing a type name isn't, as I understand it, wrong. And even if it is, the fact that it compiles, and could be a source of bugs, means (in my opinion) that GDB should handle this case to enable debugging of it. Even something as simple as 'info types ....' relies on GDB being able to find multiple entries for a given type in different CUs. If the index only contains a single type entry, then this means GDB will see different things depending on which CUs happen to have been expanded. Given all of the above, I think that any attempt to remove type entries from the gdb-index is unsafe and can result in GDB behaving differently when using the gdb-index compared to using no index. The solution is to remove the de-duplication code, which is what this patch does. Now that we no longer need to sort declarations to the end of the entry list, I've removed all the code related to the special use of GDB_INDEX_SYMBOL_KIND_UNUSED5 (which is how we marked declarations), this cleans things up a little bit. I've also renamed some of the functions away from minimize, now that there's no minimization being done. A problem was revealed by this change. When running the test gdb.cp/stub-array-size.exp with the --target_board=cc-with-gdb-index, I was seeing a failure using gcc 15.1.0. This test has two CUs, and a type 'A'. The test description says: Test size of arrays of stubbed types (structures where the full definition is not immediately available). Which I don't really understand given the test's source code. The type 'A' is defined in a header, which is included in both CUs. However, the test description does seem to be accurate; in one CU the type looks like this: <1><4a>: Abbrev Number: 8 (DW_TAG_structure_type) <4b> DW_AT_name : A <4d> DW_AT_declaration : 1 <4d> DW_AT_sibling : <0x6d> <2><51>: Abbrev Number: 9 (DW_TAG_subprogram) <52> DW_AT_external : 1 <52> DW_AT_name : ~A <55> DW_AT_decl_file : 2 <56> DW_AT_decl_line : 20 <57> DW_AT_decl_column : 11 <58> DW_AT_linkage_name: (indirect string, offset: 0x103): _ZN1AD4Ev <5c> DW_AT_virtuality : 1 (virtual) <5d> DW_AT_containing_type: <0x4a> <61> DW_AT_declaration : 1 <61> DW_AT_object_pointer: <0x66> <65> DW_AT_inline : 0 (not inlined) <3><66>: Abbrev Number: 10 (DW_TAG_formal_parameter) <67> DW_AT_type : <0x8c> <6b> DW_AT_artificial : 1 <3><6b>: Abbrev Number: 0 <2><6c>: Abbrev Number: 0 while in the second CU, the type looks like this: <1><178>: Abbrev Number: 4 (DW_TAG_structure_type) <179> DW_AT_name : A <17b> DW_AT_byte_size : 8 <17c> DW_AT_decl_file : 2 <17d> DW_AT_decl_line : 18 <17e> DW_AT_decl_column : 8 <17f> DW_AT_containing_type: <0x178> <183> DW_AT_sibling : <0x1ac> <2><187>: Abbrev Number: 5 (DW_TAG_member) <188> DW_AT_name : (indirect string, offset: 0x19e): _vptr.A <18c> DW_AT_type : <0x1be> <190> DW_AT_data_member_location: 0 <191> DW_AT_artificial : 1 <2><191>: Abbrev Number: 6 (DW_TAG_subprogram) <192> DW_AT_external : 1 <192> DW_AT_name : ~A <195> DW_AT_decl_file : 1 <196> DW_AT_decl_line : 20 <197> DW_AT_decl_column : 1 <198> DW_AT_linkage_name: (indirect string, offset: 0x103): _ZN1AD4Ev <19c> DW_AT_virtuality : 1 (virtual) <19d> DW_AT_containing_type: <0x178> <1a1> DW_AT_declaration : 1 <1a1> DW_AT_object_pointer: <0x1a5> <3><1a5>: Abbrev Number: 7 (DW_TAG_formal_parameter) <1a6> DW_AT_type : <0x1cd> <1aa> DW_AT_artificial : 1 <3><1aa>: Abbrev Number: 0 <2><1ab>: Abbrev Number: 0 So, for reasons that I don't understand, the type, despite (as far as I can see) having its full definition available, is recorded only as declared in one CU. The test then performs some actions that rely on 'sizeof(A)' and expects GDB to correctly figure out the size. This requires GDB to find, and expand the CU containing the real definition of 'A'. Prior to this patch GDB would sort the two type entries for 'A', placing the declaration second, and then record only one entry, the definition. When it came to expansion there was only one thing to expand, and this is the declaration we needed. It happens that in this test the definition is in the second CU, that is, the CU with the biggest offset. This means that, if all index entries were considered equal, the definition entry would be second. However, currently, due to the way GDB forces definitions to the front, the entry for the second CU, the definition, is placed first in the index, and with de-duplication, this is the only entry added to the index. After this patch, both the declaration and the definition are placed in the index, and as the declaration is in the CU at offset 0, the declaration is added first to the index. This should be fine. When looking for 'A' GDB should expand the CU containing the declaration, see that all we have is a declaration, and so continue, next expanding the definition, at which point we're done. However, in read-gdb-index.c, in the function mapped_gdb_index::build_name_components, there is a work around for gold bug PR gold/15646. Ironically, the bug here is that gold was not removing duplicate index entries, and it is noted that this has a performance impact on GDB. A work around for this was added to GDB in commit: commit8943b87476Date: Tue Nov 12 09:43:17 2013 -0800 Work around gold/15646. A test for this was added in: commit40d22035a7Date: Tue May 26 11:35:32 2020 +0200 [gdb/testsuite] Add test-case gold-gdb-index.exp And the fix was tweaked in commit: commitf030440daaDate: Thu May 28 17:26:22 2020 +0200 [gdb/symtab] Make gold index workaround more precise The problem specifically called out in the bug report is that namespaces can appear in multiple CUs, and that trying to complete 'ns::misspelled' would expand every CU containing namespace 'ns' due to the duplicate 'ns' type symbols. The work around that was added in8943b87476was to ignore duplicate global symbols when expanding entries from the index. In commitf030440daathis work around was restricted to only ignore duplicate type entries. This restriction was required to allow the earlier de-duplication patchaef36dee93to function correctly. Now that I'm taking the work started inaef36dee93to its logical conclusion, and allowing duplicate type entries, the work around of ignoring duplicate global type symbols is no longer needed, and can be removed. The associated test for this, added in40d22035a7, is also removed in this commit. To be clear; the performance issue mentioned in PR gold/15646 is now back again. But my claim is that gold was right all along to include the duplicate index entries, and any performance hit we see as a result, though unfortunate, is just a consequence of doing it right. That doesn't mean there's not room for optimisation and improvement in the future, though I don't have any immediate ideas, or plans in this area. It's just we can't throw out a bunch of index entries that are critical, and claim this as a performance optimisation. I am seeing some failure with this patch when using the board file dwarf5-fission-debug-types. These failures all have the error: DWARF Error: wrong unit_type in unit header (is DW_UT_skeleton, should be DW_UT_type) [in module ....] However, I ran the whole testsuite with this board, and this error crops up often, so I don't think this is something specific to my patch, so I'm choosing to ignore this. Bug: https://sourceware.org/bugzilla/show_bug.cgi?id=15646 Bug: https://sourceware.org/bugzilla/show_bug.cgi?id=15035 Approved-By: Tom Tromey <tom@tromey.com>
824 lines
24 KiB
C
824 lines
24 KiB
C
/* Reading code for .gdb_index
|
|
|
|
Copyright (C) 2023-2025 Free Software Foundation, Inc.
|
|
|
|
This file is part of GDB.
|
|
|
|
This program is free software; you can redistribute it and/or modify
|
|
it under the terms of the GNU General Public License as published by
|
|
the Free Software Foundation; either version 3 of the License, or
|
|
(at your option) any later version.
|
|
|
|
This program is distributed in the hope that it will be useful,
|
|
but WITHOUT ANY WARRANTY; without even the implied warranty of
|
|
MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the
|
|
GNU General Public License for more details.
|
|
|
|
You should have received a copy of the GNU General Public License
|
|
along with this program. If not, see <http://www.gnu.org/licenses/>. */
|
|
|
|
#include "read-gdb-index.h"
|
|
|
|
#include "cli/cli-cmds.h"
|
|
#include "cli/cli-style.h"
|
|
#include "complaints.h"
|
|
#include "dwarf2/index-common.h"
|
|
#include "dwz.h"
|
|
#include "event-top.h"
|
|
#include "gdb/gdb-index.h"
|
|
#include "gdbsupport/gdb-checked-static-cast.h"
|
|
#include "cooked-index.h"
|
|
#include "read.h"
|
|
#include "extract-store-integer.h"
|
|
#include "cp-support.h"
|
|
#include "symtab.h"
|
|
#include "gdbsupport/selftest.h"
|
|
#include "tag.h"
|
|
|
|
/* When true, do not reject deprecated .gdb_index sections. */
|
|
static bool use_deprecated_index_sections = false;
|
|
|
|
struct dwarf2_gdb_index : public cooked_index_functions
|
|
{
|
|
/* This dumps minimal information about the index.
|
|
It is called via "mt print objfiles".
|
|
One use is to verify .gdb_index has been loaded by the
|
|
gdb.dwarf2/gdb-index.exp testcase. */
|
|
void dump (struct objfile *objfile) override;
|
|
};
|
|
|
|
/* This is a cooked index as ingested from .gdb_index. */
|
|
|
|
class cooked_gdb_index : public cooked_index
|
|
{
|
|
public:
|
|
|
|
cooked_gdb_index (cooked_index_worker_up worker,
|
|
int version)
|
|
: cooked_index (std::move (worker)),
|
|
version (version)
|
|
{ }
|
|
|
|
/* This can't be used to write an index. */
|
|
cooked_index *index_for_writing () override
|
|
{ return nullptr; }
|
|
|
|
quick_symbol_functions_up make_quick_functions () const override
|
|
{ return quick_symbol_functions_up (new dwarf2_gdb_index); }
|
|
|
|
bool version_check () const override
|
|
{
|
|
return version >= 8;
|
|
}
|
|
|
|
/* Index data format version. */
|
|
int version;
|
|
};
|
|
|
|
/* See above. */
|
|
|
|
void
|
|
dwarf2_gdb_index::dump (struct objfile *objfile)
|
|
{
|
|
dwarf2_per_objfile *per_objfile = get_dwarf2_per_objfile (objfile);
|
|
|
|
cooked_gdb_index *index = (gdb::checked_static_cast<cooked_gdb_index *>
|
|
(per_objfile->per_bfd->index_table.get ()));
|
|
gdb_printf (".gdb_index: version %d\n", index->version);
|
|
cooked_index_functions::dump (objfile);
|
|
}
|
|
|
|
/* This is a view into the index that converts from bytes to an
|
|
offset_type, and allows indexing. Unaligned bytes are specifically
|
|
allowed here, and handled via unpacking. */
|
|
|
|
class offset_view
|
|
{
|
|
public:
|
|
offset_view () = default;
|
|
|
|
explicit offset_view (gdb::array_view<const gdb_byte> bytes)
|
|
: m_bytes (bytes)
|
|
{
|
|
}
|
|
|
|
/* Extract the INDEXth offset_type from the array. */
|
|
offset_type operator[] (size_t index) const
|
|
{
|
|
const gdb_byte *bytes = &m_bytes[index * sizeof (offset_type)];
|
|
return (offset_type) extract_unsigned_integer (bytes,
|
|
sizeof (offset_type),
|
|
BFD_ENDIAN_LITTLE);
|
|
}
|
|
|
|
/* Return the number of offset_types in this array. */
|
|
size_t size () const
|
|
{
|
|
return m_bytes.size () / sizeof (offset_type);
|
|
}
|
|
|
|
/* Return true if this view is empty. */
|
|
bool empty () const
|
|
{
|
|
return m_bytes.empty ();
|
|
}
|
|
|
|
private:
|
|
/* The underlying bytes. */
|
|
gdb::array_view<const gdb_byte> m_bytes;
|
|
};
|
|
|
|
/* A worker for reading .gdb_index. The file format is described in
|
|
the manual. */
|
|
|
|
struct mapped_gdb_index
|
|
{
|
|
/* Index data format version. */
|
|
int version = 0;
|
|
|
|
/* Compile units followed by type units, in the order as found in the
|
|
index. Indices found in index entries can index directly into this. */
|
|
std::vector<dwarf2_per_cu *> units;
|
|
|
|
/* The address table data. */
|
|
gdb::array_view<const gdb_byte> address_table;
|
|
|
|
/* The symbol table, implemented as a hash table. */
|
|
offset_view symbol_table;
|
|
|
|
/* A pointer to the constant pool. */
|
|
gdb::array_view<const gdb_byte> constant_pool;
|
|
|
|
/* The shortcut table data. */
|
|
gdb::array_view<const gdb_byte> shortcut_table;
|
|
|
|
/* An address map that maps from PC to dwarf2_per_cu. */
|
|
addrmap_fixed *index_addrmap = nullptr;
|
|
|
|
/* The name of 'main', or nullptr if not known. */
|
|
const char *main_name = nullptr;
|
|
|
|
/* The language of 'main', if known. */
|
|
enum language main_lang = language_minimal;
|
|
|
|
/* The result we're constructing. */
|
|
cooked_index_worker_result result;
|
|
|
|
/* Return the index into the constant pool of the name of the IDXth
|
|
symbol in the symbol table. */
|
|
offset_type symbol_name_index (offset_type idx) const
|
|
{
|
|
return symbol_table[2 * idx];
|
|
}
|
|
|
|
/* Return the index into the constant pool of the CU vector of the
|
|
IDXth symbol in the symbol table. */
|
|
offset_type symbol_vec_index (offset_type idx) const
|
|
{
|
|
return symbol_table[2 * idx + 1];
|
|
}
|
|
|
|
/* Return whether the name at IDX in the symbol table should be
|
|
ignored. */
|
|
bool symbol_name_slot_invalid (offset_type idx) const
|
|
{
|
|
return symbol_name_index (idx) == 0 && symbol_vec_index (idx) == 0;
|
|
}
|
|
|
|
/* Convenience method to get at the name of the symbol at IDX in the
|
|
symbol table. */
|
|
const char *symbol_name_at (offset_type idx,
|
|
dwarf2_per_objfile *per_objfile) const
|
|
{
|
|
return (const char *) (this->constant_pool.data ()
|
|
+ symbol_name_index (idx));
|
|
}
|
|
|
|
size_t symbol_name_count () const
|
|
{ return this->symbol_table.size () / 2; }
|
|
|
|
/* Set the name and language of the main function from the shortcut
|
|
table. */
|
|
void set_main_name (dwarf2_per_objfile *per_objfile);
|
|
|
|
/* Build the symbol name component sorted vector, if we haven't
|
|
yet. */
|
|
void build_name_components (dwarf2_per_objfile *per_objfile);
|
|
};
|
|
|
|
/* See declaration. */
|
|
|
|
void
|
|
mapped_gdb_index::build_name_components (dwarf2_per_objfile *per_objfile)
|
|
{
|
|
std::vector<std::pair<std::string_view, std::vector<cooked_index_entry *>>>
|
|
need_parents;
|
|
gdb::unordered_map<std::string_view, cooked_index_entry *> by_name;
|
|
|
|
auto count = this->symbol_name_count ();
|
|
for (offset_type idx = 0; idx < count; idx++)
|
|
{
|
|
if (this->symbol_name_slot_invalid (idx))
|
|
continue;
|
|
|
|
const char *name = this->symbol_name_at (idx, per_objfile);
|
|
|
|
/* This code only knows how to break apart components of C++
|
|
symbol names (and other languages that use '::' as
|
|
namespace/module separator) and Ada symbol names.
|
|
|
|
It's unfortunate that we need the language, but since it is
|
|
really only used to rebuild full names, pairing it with the
|
|
split method is fine. */
|
|
enum language lang;
|
|
std::vector<std::string_view> components;
|
|
if (strstr (name, "::") != nullptr)
|
|
{
|
|
components = split_name (name, split_style::CXX);
|
|
lang = language_cplus;
|
|
}
|
|
else if (strchr (name, '<') != nullptr)
|
|
{
|
|
/* Guess that this is a template and so a C++ name. */
|
|
components.emplace_back (name);
|
|
lang = language_cplus;
|
|
}
|
|
else if (strstr (name, "__") != nullptr)
|
|
{
|
|
/* The Ada case is handled during finalization, because gdb
|
|
does not write the synthesized package names into the
|
|
index. */
|
|
components.emplace_back (name);
|
|
lang = language_ada;
|
|
}
|
|
else
|
|
{
|
|
components = split_name (name, split_style::DOT_STYLE);
|
|
/* Mark ordinary names as having an unknown language. This
|
|
is a hack to avoid problems with some Ada names. */
|
|
lang = (components.size () == 1) ? language_unknown : language_go;
|
|
}
|
|
|
|
std::vector<cooked_index_entry *> these_entries;
|
|
offset_view vec (constant_pool.slice (symbol_vec_index (idx)));
|
|
offset_type vec_len = vec[0];
|
|
for (offset_type vec_idx = 0; vec_idx < vec_len; ++vec_idx)
|
|
{
|
|
offset_type cu_index_and_attrs = vec[vec_idx + 1];
|
|
gdb_index_symbol_kind symbol_kind
|
|
= GDB_INDEX_SYMBOL_KIND_VALUE (cu_index_and_attrs);
|
|
/* Only use a symbol if the attributes are present. Indices
|
|
prior to version 7 don't record them, and indices >= 7
|
|
may elide them for certain symbols (gold does this). */
|
|
if (symbol_kind == GDB_INDEX_SYMBOL_KIND_NONE)
|
|
continue;
|
|
|
|
int is_static = GDB_INDEX_SYMBOL_STATIC_VALUE (cu_index_and_attrs);
|
|
|
|
int cu_index = GDB_INDEX_CU_VALUE (cu_index_and_attrs);
|
|
/* Don't crash on bad data. */
|
|
if (cu_index >= units.size ())
|
|
{
|
|
complaint (_(".gdb_index entry has bad CU index"
|
|
" [in module %s]"),
|
|
objfile_name (per_objfile->objfile));
|
|
continue;
|
|
}
|
|
dwarf2_per_cu *per_cu = units[cu_index];
|
|
|
|
enum language this_lang = lang;
|
|
dwarf_tag tag;
|
|
switch (symbol_kind)
|
|
{
|
|
case GDB_INDEX_SYMBOL_KIND_VARIABLE:
|
|
tag = DW_TAG_variable;
|
|
break;
|
|
case GDB_INDEX_SYMBOL_KIND_FUNCTION:
|
|
tag = DW_TAG_subprogram;
|
|
break;
|
|
case GDB_INDEX_SYMBOL_KIND_TYPE:
|
|
if (is_static)
|
|
tag = (dwarf_tag) DW_TAG_GDB_INDEX_TYPE;
|
|
else
|
|
{
|
|
tag = DW_TAG_structure_type;
|
|
this_lang = language_cplus;
|
|
}
|
|
break;
|
|
/* The "default" should not happen, but we mention it to
|
|
avoid an uninitialized warning. */
|
|
default:
|
|
case GDB_INDEX_SYMBOL_KIND_OTHER:
|
|
tag = (dwarf_tag) DW_TAG_GDB_INDEX_OTHER;
|
|
break;
|
|
}
|
|
|
|
cooked_index_flag flags = 0;
|
|
if (is_static)
|
|
flags |= IS_STATIC;
|
|
if (main_name != nullptr
|
|
&& tag == DW_TAG_subprogram
|
|
&& strcmp (name, main_name) == 0)
|
|
{
|
|
flags |= IS_MAIN;
|
|
this_lang = main_lang;
|
|
/* Don't bother looking for another. */
|
|
main_name = nullptr;
|
|
}
|
|
|
|
/* Note that this assumes the final component ends in \0. */
|
|
cooked_index_entry *entry = result.add (per_cu->sect_off (), tag,
|
|
flags, this_lang,
|
|
components.back ().data (),
|
|
nullptr, per_cu);
|
|
/* Don't bother pushing if we do not need a parent. */
|
|
if (components.size () > 1)
|
|
these_entries.push_back (entry);
|
|
|
|
/* We don't care exactly which entry ends up in this
|
|
map. */
|
|
by_name[std::string_view (name)] = entry;
|
|
}
|
|
|
|
if (components.size () > 1)
|
|
{
|
|
std::string_view penultimate = components[components.size () - 2];
|
|
std::string_view prefix (name, &penultimate.back () + 1 - name);
|
|
|
|
need_parents.emplace_back (prefix, std::move (these_entries));
|
|
}
|
|
}
|
|
|
|
for (const auto &[prefix, entries] : need_parents)
|
|
{
|
|
auto iter = by_name.find (prefix);
|
|
/* If we can't find the parent entry, just lose. It should
|
|
always be there. We could synthesize it from the components,
|
|
if we kept those, but that seems like overkill. */
|
|
if (iter != by_name.end ())
|
|
{
|
|
for (cooked_index_entry *entry : entries)
|
|
entry->set_parent (iter->second);
|
|
}
|
|
}
|
|
}
|
|
|
|
/* The worker that reads a mapped index and fills in a
|
|
cooked_index_worker_result. */
|
|
|
|
class gdb_index_worker : public cooked_index_worker
|
|
{
|
|
public:
|
|
|
|
gdb_index_worker (dwarf2_per_objfile *per_objfile,
|
|
std::unique_ptr<mapped_gdb_index> map)
|
|
: cooked_index_worker (per_objfile),
|
|
map (std::move (map))
|
|
{ }
|
|
|
|
void do_reading () override;
|
|
|
|
/* The map we're reading. */
|
|
std::unique_ptr<mapped_gdb_index> map;
|
|
};
|
|
|
|
void
|
|
gdb_index_worker::do_reading ()
|
|
{
|
|
complaint_interceptor complaint_handler;
|
|
map->build_name_components (m_per_objfile);
|
|
|
|
m_results.push_back (std::move (map->result));
|
|
m_results[0].done_reading (complaint_handler.release ());
|
|
|
|
/* No longer needed. */
|
|
map.reset ();
|
|
|
|
done_reading ();
|
|
|
|
bfd_thread_cleanup ();
|
|
}
|
|
|
|
/* A helper function that reads the .gdb_index from BUFFER and fills
|
|
in MAP. FILENAME is the name of the file containing the data;
|
|
it is used for error reporting. DEPRECATED_OK is true if it is
|
|
ok to use deprecated sections.
|
|
|
|
CU_LIST, CU_LIST_ELEMENTS, TYPES_LIST, and TYPES_LIST_ELEMENTS are
|
|
out parameters that are filled in with information about the CU and
|
|
TU lists in the section.
|
|
|
|
Returns true if all went well, false otherwise. */
|
|
|
|
static bool
|
|
read_gdb_index_from_buffer (const char *filename,
|
|
bool deprecated_ok,
|
|
gdb::array_view<const gdb_byte> buffer,
|
|
mapped_gdb_index *map,
|
|
const gdb_byte **cu_list,
|
|
offset_type *cu_list_elements,
|
|
const gdb_byte **types_list,
|
|
offset_type *types_list_elements)
|
|
{
|
|
const gdb_byte *addr = &buffer[0];
|
|
offset_view metadata (buffer);
|
|
|
|
/* Version check. */
|
|
offset_type version = metadata[0];
|
|
/* GDB now requires the symbol attributes, which were added in
|
|
version 7. */
|
|
if (version < 7)
|
|
{
|
|
static int warning_printed = 0;
|
|
if (!warning_printed)
|
|
{
|
|
warning (_("Skipping obsolete .gdb_index section in %s."),
|
|
filename);
|
|
warning_printed = 1;
|
|
}
|
|
return 0;
|
|
}
|
|
/* Version 7 indices generated by gold refer to the CU for a symbol instead
|
|
of the TU (for symbols coming from TUs),
|
|
http://sourceware.org/bugzilla/show_bug.cgi?id=15021.
|
|
Plus gold-generated indices can have duplicate entries for global symbols,
|
|
http://sourceware.org/bugzilla/show_bug.cgi?id=15646.
|
|
These are just performance bugs, and we can't distinguish gdb-generated
|
|
indices from gold-generated ones, so issue no warning here. */
|
|
|
|
/* Indexes with higher version than the one supported by GDB may be no
|
|
longer backward compatible. */
|
|
if (version > 9)
|
|
return 0;
|
|
|
|
map->version = version;
|
|
|
|
int i = 1;
|
|
*cu_list = addr + metadata[i];
|
|
*cu_list_elements = (metadata[i + 1] - metadata[i]) / 8;
|
|
++i;
|
|
|
|
*types_list = addr + metadata[i];
|
|
*types_list_elements = (metadata[i + 1] - metadata[i]) / 8;
|
|
++i;
|
|
|
|
const gdb_byte *address_table = addr + metadata[i];
|
|
const gdb_byte *address_table_end = addr + metadata[i + 1];
|
|
map->address_table
|
|
= gdb::array_view<const gdb_byte> (address_table, address_table_end);
|
|
++i;
|
|
|
|
const gdb_byte *symbol_table = addr + metadata[i];
|
|
const gdb_byte *symbol_table_end = addr + metadata[i + 1];
|
|
map->symbol_table
|
|
= offset_view (gdb::array_view<const gdb_byte> (symbol_table,
|
|
symbol_table_end));
|
|
|
|
++i;
|
|
|
|
if (version >= 9)
|
|
{
|
|
const gdb_byte *shortcut_table = addr + metadata[i];
|
|
const gdb_byte *shortcut_table_end = addr + metadata[i + 1];
|
|
map->shortcut_table
|
|
= gdb::array_view<const gdb_byte> (shortcut_table, shortcut_table_end);
|
|
++i;
|
|
}
|
|
|
|
map->constant_pool = buffer.slice (metadata[i]);
|
|
|
|
if (map->constant_pool.empty () && !map->symbol_table.empty ())
|
|
{
|
|
/* An empty constant pool implies that all symbol table entries are
|
|
empty. Make map->symbol_table.empty () == true. */
|
|
map->symbol_table
|
|
= offset_view (gdb::array_view<const gdb_byte> (symbol_table,
|
|
symbol_table));
|
|
}
|
|
|
|
return 1;
|
|
}
|
|
|
|
/* A helper for create_cus_from_gdb_index that handles a given list of
|
|
CUs. */
|
|
|
|
static void
|
|
create_cus_from_gdb_index_list (dwarf2_per_bfd *per_bfd,
|
|
const gdb_byte *cu_list, offset_type n_elements,
|
|
struct dwarf2_section_info *section,
|
|
int is_dwz, std::vector<dwarf2_per_cu *> &units)
|
|
{
|
|
for (offset_type i = 0; i < n_elements; i += 2)
|
|
{
|
|
static_assert (sizeof (ULONGEST) >= 8);
|
|
|
|
sect_offset sect_off
|
|
= (sect_offset) extract_unsigned_integer (cu_list, 8, BFD_ENDIAN_LITTLE);
|
|
ULONGEST length = extract_unsigned_integer (cu_list + 8, 8, BFD_ENDIAN_LITTLE);
|
|
cu_list += 2 * 8;
|
|
|
|
dwarf2_per_cu_up per_cu = per_bfd->allocate_per_cu (section, sect_off,
|
|
length, is_dwz);
|
|
units.emplace_back (per_cu.get ());
|
|
per_bfd->all_units.emplace_back (std::move (per_cu));
|
|
}
|
|
}
|
|
|
|
/* Read the CU list from the mapped index, and use it to create all
|
|
the CU objects for PER_BFD. */
|
|
|
|
static void
|
|
create_cus_from_gdb_index (dwarf2_per_bfd *per_bfd,
|
|
const gdb_byte *cu_list, offset_type cu_list_elements,
|
|
std::vector<dwarf2_per_cu *> &units,
|
|
const gdb_byte *dwz_list, offset_type dwz_elements)
|
|
{
|
|
gdb_assert (per_bfd->all_units.empty ());
|
|
per_bfd->all_units.reserve ((cu_list_elements + dwz_elements) / 2);
|
|
|
|
create_cus_from_gdb_index_list (per_bfd, cu_list, cu_list_elements,
|
|
&per_bfd->infos[0], 0, units);
|
|
|
|
if (dwz_elements == 0)
|
|
return;
|
|
|
|
dwz_file *dwz = per_bfd->get_dwz_file ();
|
|
create_cus_from_gdb_index_list (per_bfd, dwz_list, dwz_elements,
|
|
&dwz->info, 1, units);
|
|
}
|
|
|
|
/* Create the signatured type hash table from the index. */
|
|
|
|
static void
|
|
create_signatured_type_table_from_gdb_index
|
|
(dwarf2_per_bfd *per_bfd, struct dwarf2_section_info *section,
|
|
const gdb_byte *bytes, offset_type elements,
|
|
std::vector<dwarf2_per_cu *> &units)
|
|
{
|
|
signatured_type_set sig_types_hash;
|
|
|
|
for (offset_type i = 0; i < elements; i += 3)
|
|
{
|
|
static_assert (sizeof (ULONGEST) >= 8);
|
|
sect_offset sect_off
|
|
= (sect_offset) extract_unsigned_integer (bytes, 8, BFD_ENDIAN_LITTLE);
|
|
cu_offset type_offset_in_tu
|
|
= (cu_offset) extract_unsigned_integer (bytes + 8, 8,
|
|
BFD_ENDIAN_LITTLE);
|
|
ULONGEST signature
|
|
= extract_unsigned_integer (bytes + 16, 8, BFD_ENDIAN_LITTLE);
|
|
bytes += 3 * 8;
|
|
|
|
/* The length of the type unit is unknown at this time. It gets
|
|
(presumably) set by a cutu_reader when it gets expanded later. */
|
|
signatured_type_up sig_type
|
|
= per_bfd->allocate_signatured_type (section, sect_off, 0 /* length */,
|
|
false /* is_dwz */, signature);
|
|
sig_type->type_offset_in_tu = type_offset_in_tu;
|
|
|
|
sig_types_hash.emplace (sig_type.get ());
|
|
units.emplace_back (sig_type.get ());
|
|
per_bfd->all_units.emplace_back (sig_type.release ());
|
|
}
|
|
|
|
per_bfd->signatured_types = std::move (sig_types_hash);
|
|
}
|
|
|
|
/* Read the address map data from the mapped GDB index. Return true if no
|
|
errors were found, otherwise return false. */
|
|
|
|
static bool
|
|
create_addrmap_from_gdb_index (dwarf2_per_objfile *per_objfile,
|
|
mapped_gdb_index *index)
|
|
{
|
|
objfile *objfile = per_objfile->objfile;
|
|
|
|
addrmap_mutable mutable_map;
|
|
|
|
/* Build an unrelocated address map of the sections in this objfile. */
|
|
addrmap_mutable sect_map;
|
|
for (obj_section &s : objfile->sections ())
|
|
{
|
|
if (s.addr_unrel () >= s.endaddr_unrel ())
|
|
continue;
|
|
|
|
CORE_ADDR start = CORE_ADDR (s.addr_unrel ());
|
|
CORE_ADDR end_inclusive = CORE_ADDR (s.endaddr_unrel ()) - 1;
|
|
sect_map.set_empty (start, end_inclusive, &s);
|
|
}
|
|
|
|
auto find_section
|
|
= [&] (ULONGEST addr, struct obj_section *&cached_section)
|
|
{
|
|
if (cached_section != nullptr
|
|
&& cached_section->contains (unrelocated_addr (addr)))
|
|
return cached_section;
|
|
|
|
cached_section = (struct obj_section *) sect_map.find (addr);
|
|
return cached_section;
|
|
};
|
|
|
|
auto invalid_range_warning = [&] (ULONGEST lo, ULONGEST hi)
|
|
{
|
|
warning (_(".gdb_index address table has invalid range (%s - %s),"
|
|
" ignoring .gdb_index"),
|
|
hex_string (lo), hex_string (hi));
|
|
return false;
|
|
};
|
|
|
|
/* Cache the section for possible re-use on the next entry. */
|
|
struct obj_section *prev_sect = nullptr;
|
|
|
|
const gdb_byte *iter = index->address_table.data ();
|
|
const gdb_byte *end = iter + index->address_table.size ();
|
|
while (iter < end)
|
|
{
|
|
ULONGEST hi, lo, cu_index;
|
|
lo = extract_unsigned_integer (iter, 8, BFD_ENDIAN_LITTLE);
|
|
iter += 8;
|
|
hi = extract_unsigned_integer (iter, 8, BFD_ENDIAN_LITTLE);
|
|
iter += 8;
|
|
cu_index = extract_unsigned_integer (iter, 4, BFD_ENDIAN_LITTLE);
|
|
iter += 4;
|
|
|
|
if (lo >= hi)
|
|
return invalid_range_warning (lo, hi);
|
|
|
|
if (cu_index >= index->units.size ())
|
|
{
|
|
warning (_(".gdb_index address table has invalid CU number %u,"
|
|
" ignoring .gdb_index"),
|
|
(unsigned) cu_index);
|
|
return false;
|
|
}
|
|
|
|
/* Variable hi is the exclusive upper bound, get the inclusive one. */
|
|
CORE_ADDR hi_incl = hi - 1;
|
|
|
|
struct obj_section *lo_sect = find_section (lo, prev_sect);
|
|
struct obj_section *hi_sect = find_section (hi_incl, prev_sect);
|
|
if (lo_sect == nullptr || hi_sect == nullptr)
|
|
return invalid_range_warning (lo, hi);
|
|
|
|
bool full_range_p
|
|
= mutable_map.set_empty (lo, hi_incl, index->units[cu_index]);
|
|
if (!full_range_p)
|
|
{
|
|
warning (_(".gdb_index address table has a range (%s - %s) that"
|
|
" overlaps with an earlier range, ignoring .gdb_index"),
|
|
hex_string (lo), hex_string (hi));
|
|
return false;
|
|
}
|
|
}
|
|
|
|
index->result.set_addrmap (std::move (mutable_map));
|
|
|
|
return true;
|
|
}
|
|
|
|
void
|
|
mapped_gdb_index::set_main_name (dwarf2_per_objfile *per_objfile)
|
|
{
|
|
const auto expected_size = 2 * sizeof (offset_type);
|
|
if (this->shortcut_table.size () < expected_size)
|
|
/* The data in the section is not present, is corrupted or is in a version
|
|
we don't know about. Regardless, we can't make use of it. */
|
|
return;
|
|
|
|
auto ptr = this->shortcut_table.data ();
|
|
const auto dw_lang = extract_unsigned_integer (ptr, 4, BFD_ENDIAN_LITTLE);
|
|
if (dw_lang >= DW_LANG_hi_user)
|
|
{
|
|
complaint (_(".gdb_index shortcut table has invalid main language %u"),
|
|
(unsigned) dw_lang);
|
|
return;
|
|
}
|
|
if (dw_lang == 0)
|
|
{
|
|
/* Don't bother if the language for the main symbol was not known or if
|
|
there was no main symbol at all when the index was built. */
|
|
return;
|
|
}
|
|
ptr += 4;
|
|
|
|
main_lang = dwarf_lang_to_enum_language (dw_lang);
|
|
const auto name_offset = extract_unsigned_integer (ptr,
|
|
sizeof (offset_type),
|
|
BFD_ENDIAN_LITTLE);
|
|
main_name = (const char *) (this->constant_pool.data () + name_offset);
|
|
}
|
|
|
|
/* See read-gdb-index.h. */
|
|
|
|
bool
|
|
dwarf2_read_gdb_index
|
|
(dwarf2_per_objfile *per_objfile,
|
|
get_gdb_index_contents_ftype get_gdb_index_contents,
|
|
get_gdb_index_contents_dwz_ftype get_gdb_index_contents_dwz)
|
|
{
|
|
const gdb_byte *cu_list, *types_list, *dwz_list = NULL;
|
|
offset_type cu_list_elements, types_list_elements, dwz_list_elements = 0;
|
|
struct objfile *objfile = per_objfile->objfile;
|
|
dwarf2_per_bfd *per_bfd = per_objfile->per_bfd;
|
|
scoped_remove_all_units remove_all_units (*per_bfd);
|
|
|
|
gdb::array_view<const gdb_byte> main_index_contents
|
|
= get_gdb_index_contents (objfile, per_bfd);
|
|
|
|
if (main_index_contents.empty ())
|
|
return false;
|
|
|
|
auto map = std::make_unique<mapped_gdb_index> ();
|
|
if (!read_gdb_index_from_buffer (objfile_name (objfile),
|
|
use_deprecated_index_sections,
|
|
main_index_contents, map.get (), &cu_list,
|
|
&cu_list_elements, &types_list,
|
|
&types_list_elements))
|
|
return false;
|
|
|
|
/* Don't use the index if it's empty. */
|
|
if (map->symbol_table.empty ())
|
|
return false;
|
|
|
|
/* If there is a .dwz file, read it so we can get its CU list as
|
|
well. */
|
|
dwz_file *dwz = per_bfd->get_dwz_file ();
|
|
if (dwz != NULL)
|
|
{
|
|
mapped_gdb_index dwz_map;
|
|
const gdb_byte *dwz_types_ignore;
|
|
offset_type dwz_types_elements_ignore;
|
|
|
|
gdb::array_view<const gdb_byte> dwz_index_content
|
|
= get_gdb_index_contents_dwz (objfile, dwz);
|
|
|
|
if (dwz_index_content.empty ())
|
|
return false;
|
|
|
|
if (!read_gdb_index_from_buffer (dwz->filename (),
|
|
1, dwz_index_content, &dwz_map,
|
|
&dwz_list, &dwz_list_elements,
|
|
&dwz_types_ignore,
|
|
&dwz_types_elements_ignore))
|
|
{
|
|
warning (_("could not read '.gdb_index' section from %s; skipping"),
|
|
dwz->filename ());
|
|
return false;
|
|
}
|
|
}
|
|
|
|
create_cus_from_gdb_index (per_bfd, cu_list, cu_list_elements, map->units,
|
|
dwz_list, dwz_list_elements);
|
|
|
|
if (types_list_elements)
|
|
{
|
|
/* We can only handle a single .debug_info and .debug_types when we have
|
|
an index. */
|
|
if (per_bfd->infos.size () > 1
|
|
|| per_bfd->types.size () > 1)
|
|
return false;
|
|
|
|
dwarf2_section_info *section
|
|
= (per_bfd->types.size () == 1
|
|
? &per_bfd->types[0]
|
|
: &per_bfd->infos[0]);
|
|
|
|
create_signatured_type_table_from_gdb_index (per_bfd, section, types_list,
|
|
types_list_elements,
|
|
map->units);
|
|
}
|
|
|
|
finalize_all_units (per_bfd);
|
|
|
|
if (!create_addrmap_from_gdb_index (per_objfile, map.get ()))
|
|
return false;
|
|
|
|
map->set_main_name (per_objfile);
|
|
|
|
int version = map->version;
|
|
auto worker = std::make_unique<gdb_index_worker> (per_objfile,
|
|
std::move (map));
|
|
auto idx = std::make_unique<cooked_gdb_index> (std::move (worker),
|
|
version);
|
|
|
|
per_bfd->start_reading (std::move (idx));
|
|
remove_all_units.disable ();
|
|
|
|
return true;
|
|
}
|
|
|
|
INIT_GDB_FILE (read_gdb_index)
|
|
{
|
|
add_setshow_boolean_cmd ("use-deprecated-index-sections",
|
|
no_class, &use_deprecated_index_sections, _("\
|
|
Set whether to use deprecated gdb_index sections."), _("\
|
|
Show whether to use deprecated gdb_index sections."), _("\
|
|
When enabled, deprecated .gdb_index sections are used anyway.\n\
|
|
Normally they are ignored either because of a missing feature or\n\
|
|
performance issue.\n\
|
|
Warning: This option must be enabled before gdb reads the file."),
|
|
NULL,
|
|
NULL,
|
|
&setlist, &showlist);
|
|
}
|