binutils-gdb/gdb/testsuite/gdb.dwarf2/dw2-empty-inline-ranges.exp
Andrew Burgess bade3fecaf gdb: handle empty ranges for inline subroutines
The work in this patch is based on changes found in this series:

  https://inbox.sourceware.org/gdb-patches/AS1PR01MB946510286FBF2497A6F03E83E4922@AS1PR01MB9465.eurprd01.prod.exchangelabs.com

That series has the fixes here merged along with other changes, and
takes a different approach for how to handle the issue addressed here.

Credit for identifying the original issue belongs with Bernd, the
author of the original patch, who I have included as a co-author on
this patch.  A brief description of how the approach taken in this
patch differs from the approach Bernd took can be found at the end of
this commit message.

When compiling with optimisation, it can often happen that gcc will
emit an inline function instance with an empty range associated.  This
can happen in two ways.  The inline function might have a DW_AT_low_pc
and DW_AT_high_pc, where the high-pc is an offset from the low-pc, but
the high-pc offset is given as 0 by gcc.

Alternatively, the inline function might have a DW_AT_ranges, and one
of the sub-ranges might be empty, though usually in this case, other
ranges will be non-empty.

The second case is made worse in that sometimes gcc will specify a
DW_AT_entry_pc value which points to the address of the empty
sub-range.

My understanding of the DWARF spec is that empty ranges as seen in
these examples indicate that no instructions are associated with the
inline function, and indeed, this is how GDB handles these cases,
rejecting blocks and sub-ranges which are empty.

  DWARF-5, 2.17.2, Contiguous Address Range:
    The value of the DW_AT_low_pc attribute is the address of the
    first instruction associated with the entity. If the value of the
    DW_AT_high_pc is of class address, it is the address of the first
    location past the last instruction associated with the entity...

  DWARF-5, 2.17.3, Non-Contiguous Address Ranges:
    A bounded range entry whose beginning and ending address offsets
    are equal (including zero) indicates an empty range and may be
    ignored.
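
The strict reading quoted above can be sketched in a few lines of Tcl
(the language of the test below).  This is only an illustration: the
proc name drop_empty_ranges is hypothetical, and GDB's real reader is
written in C++.  It takes a flat list of start/end pairs and keeps
only the pairs whose bounds differ.

```tcl
# Hypothetical helper for illustration only, not part of GDB.
# RANGES is a flat list: start0 end0 start1 end1 ...
proc drop_empty_ranges { ranges } {
    set result {}
    foreach { start end } $ranges {
        # A range whose start equals its end covers no instructions.
        if { $start != $end } {
            lappend result $start $end
        }
    }
    return $result
}
```

With this sketch, `drop_empty_ranges {0x10 0x10 0x20 0x30}` returns
`0x20 0x30`: the empty first pair is discarded.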

As a consequence, an attempt by the user to place a breakpoint on an
inline function with an empty low/high address range will trigger
GDB's pending breakpoint message:

  (gdb) b foo
  Function "foo" not defined.
  Make breakpoint pending on future shared library load? (y or [n]) n

Meanwhile, having the entry-pc point at an empty range forces GDB to
ignore the given entry-pc and select a suitable alternative.

If, instead of ignoring these empty ranges, we teach GDB to treat
them as non-empty, what we find is that, in all the cases I've seen,
the debug experience is improved.

As a minimum, in the low/high case, GDB now knows about the inline
function, and can place breakpoints that will be hit.  Further, in
most cases, local variables from the inline function can be accessed.

If we do start treating empty address ranges as non-empty then we are
deviating from the DWARF spec.  It is not clear if we are working
around a gcc bug (I suspect so), or if gcc actually considers the
inline function gone, and we're just getting lucky that the debug
experience seems improved.

My proposed strategy for handling these empty address ranges is to
only perform this workaround if the compiler is gcc.  So far I've not
seen this issue with Clang (the only other compiler I've tested),
though extending this to other compilers in the future would be
trivial.

Additionally, I only apply the workaround for
DW_TAG_inlined_subroutine DIEs, as I've only seen the issue for
inline functions.

If we find a suitable empty address range then the fix-up is to give
the address range a length of 1 byte.
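
A minimal sketch of that fix-up, written in Tcl to match the test
below rather than GDB's C++.  The proc name fixup_empty_range and its
parameters are hypothetical, and treating any producer string that
starts with "GNU " as gcc is an assumption made for the sketch, not a
statement about how GDB identifies the compiler.

```tcl
# Hypothetical helper for illustration only, not GDB's actual code.
# PRODUCER is the DW_AT_producer string, TAG is the DIE's tag name,
# and START/END are the range bounds (integers or hex strings).
proc fixup_empty_range { producer tag start end } {
    # Assumption: a producer string beginning "GNU " means gcc.
    set is_gcc [string match "GNU *" $producer]

    if { $is_gcc
         && $tag eq "DW_TAG_inlined_subroutine"
         && $start == $end } {
        # Widen the empty range so that it covers a single byte.
        set end [format "0x%x" [expr {$end + 1}]]
    }

    return [list $start $end]
}
```

For example, `fixup_empty_range "GNU C 14.1.0"
DW_TAG_inlined_subroutine 0x401000 0x401000` returns
`0x401000 0x401001`, while a non-empty range, a different tag, or a
non-gcc producer leaves the bounds untouched.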

Now clearly, in most cases, 1 byte isn't even going to cover a single
instruction, but so far this doesn't seem to be a problem.  An
alternative to using a 1-byte range would be to disassemble the code
at the given address, calculate the instruction length, and use that
length instead.  But this means that the DWARF parser now needs to
make use of the disassembler, which feels like a big change that I'd
rather avoid if possible.

The other alternative is to allow blocks to be created with zero
length address ranges and then change the rest of GDB to allow for
lookup of zero sized blocks to succeed.  This is the approach taken by
the original patch series that I linked above.

The results achieved by the original patch are impressive, and Bernd,
the original patch author, makes a good argument that at least some of
the problems relating to empty ranges are a result of deficiencies in
the DWARF specification rather than issues with gcc.

However, I remain unconvinced.  But even if I accept that the issue is
with DWARF itself rather than gcc, the question still remains: should
we fix the problem by synthesising new DWARF attributes and/or accept
non-standard DWARF during the dwarf2/read.c phase, and then update GDB
to handle the new reality, or should we modify the incoming DWARF as
we read it to make it fit GDB's existing algorithms?

The original patch, I believe, took the former approach, while I
favour the latter, and so, for now, I propose that the single byte
range proposal is good enough, at least until we find counter examples
where this doesn't work.

This leaves just one question: what about the remaining work in the
original patch?  That work deals with problems around the end address
of non-empty ranges.  The original patch handled that case using the
same algorithm changes, which is neat, but I think there are
alternative solutions that should be investigated.  If the
alternatives don't end up working out, then it's trivial to revert
this patch in the future and adopt the original proposal.

For testing I have two approaches.  The first is C/C++ tests compiled
with optimisation that show the problems discussed.  These are good because
they show that these issues do crop up in compiled code.  But they are
bad in that the next compiler version might change the way the test is
optimised such that the problem no longer shows.

And so I've backed up the real code tests with DWARF assembler tests
which reproduce each issue.

The DWARF assembler tests are not really impacted by which gcc version
is used, but I've run all of these tests using gcc versions 8.4.0,
9.5.0, 10.5.0, 11.5.0, 12.2.0, and 14.2.0.  I see failures in all of
the new tests when using an unpatched GDB, and no failures when using
a patched GDB.

Bug: https://sourceware.org/bugzilla/show_bug.cgi?id=25987
Co-Authored-By: Bernd Edlinger <bernd.edlinger@hotmail.de>
2024-12-04 14:03:25 +00:00


# Copyright 2024 Free Software Foundation, Inc.
#
# This program is free software; you can redistribute it and/or modify
# it under the terms of the GNU General Public License as published by
# the Free Software Foundation; either version 3 of the License, or
# (at your option) any later version.
#
# This program is distributed in the hope that it will be useful,
# but WITHOUT ANY WARRANTY; without even the implied warranty of
# MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the
# GNU General Public License for more details.
#
# You should have received a copy of the GNU General Public License
# along with this program. If not, see <http://www.gnu.org/licenses/>.

# Define an inline function `foo` within the function `main`. The
# function `foo` uses DW_AT_ranges to define its ranges. One of the
# sub-ranges for foo will be empty.
#
# An empty sub-range should indicate that there is no code associated
# with `foo` at that address, however, with gcc versions at least
# between 8.x and 14.x (latest at the time of writing this comment),
# it is observed that when these empty sub-ranges are created for an
# inline function, if GDB treats the sub-range as non-empty, and stops
# at that location, then this generally gives a better debug
# experience. It is often still possible to read local variables at
# that address.
#
# This test defines an inline function, places a breakpoint on its
# entry-pc, and then runs and expects GDB to stop, and report the stop
# as being inside the inline function.
#
# We then check that the next outer frame is `main` as expected, and
# that the block for `foo` has the expected sub-ranges.
#
# We compile a variety of different configurations, broadly there are
# two variables, the location of the empty sub-range, and whether the
# entry-pc points at the empty sub-range or not.
#
# For the empty sub-range location, the empty sub-range can be the
# sub-range at the lowest address, highest address, or can be
# somewhere between a block's low and high addresses.

load_lib dwarf.exp

require dwarf2_support

standard_testfile .c .S

# Lines we reference in the generated DWARF.
set main_decl_line [gdb_get_line_number "main decl line"]
set foo_call_line [gdb_get_line_number "foo call line"]

get_func_info main

# Compile the source file and load the executable into GDB so we can
# extract some addresses needed for creating the DWARF.
if { [prepare_for_testing "failed to prepare" ${testfile} \
          [list ${srcfile}]] } {
    return -1
}

if {![runto_main]} {
    return -1
}

# Some addresses that we need when generating the DWARF.
for { set i 0 } { $i < 9 } { incr i } {
    set main_$i [get_hexadecimal_valueof "&main_$i" "UNKNOWN" \
                     "get address for main_$i"]
}

# Create the DWARF assembler file into ASM_FILE, using DWARF_VERSION
# to define which style of ranges to create. FUNC_RANGES is a list of
# 6 entries, each of which is an address, used to create the ranges
# for the inline function DIE. The ENTRY_PC is also an address and is
# used for the DW_AT_entry_pc of the inlined function.
proc write_asm_file { asm_file dwarf_version func_ranges entry_pc } {
    Dwarf::assemble $asm_file {
        upvar entry_label entry_label
        upvar dwarf_version dwarf_version
        upvar func_ranges func_ranges
        upvar entry_pc entry_pc

        declare_labels lines_table inline_func ranges_label

        cu { version $dwarf_version } {
            compile_unit {
                {producer "GNU C 14.1.0"}
                {language @DW_LANG_C}
                {name $::srcfile}
                {comp_dir /tmp}
                {low_pc 0 addr}
                {DW_AT_stmt_list $lines_table DW_FORM_sec_offset}
            } {
                inline_func: subprogram {
                    {name foo}
                    {inline @DW_INL_declared_inlined}
                }
                subprogram {
                    {name main}
                    {decl_file 1 data1}
                    {decl_line $::main_decl_line data1}
                    {decl_column 1 data1}
                    {low_pc $::main_start addr}
                    {high_pc $::main_len data4}
                    {external 1 flag}
                } {
                    inlined_subroutine {
                        {abstract_origin %$inline_func}
                        {call_file 1 data1}
                        {call_line $::foo_call_line data1}
                        {entry_pc $entry_pc addr}
                        {ranges $ranges_label DW_FORM_sec_offset}
                    }
                }
            }
        }

        lines {version 2} lines_table {
            include_dir "$::srcdir/$::subdir"
            file_name "$::srcfile" 1
        }

        if { $dwarf_version == 5 } {
            rnglists {} {
                table {} {
                    ranges_label: list_ {
                        start_end [lindex $func_ranges 0] [lindex $func_ranges 1]
                        start_end [lindex $func_ranges 2] [lindex $func_ranges 3]
                        start_end [lindex $func_ranges 4] [lindex $func_ranges 5]
                    }
                }
            }
        } else {
            ranges { } {
                ranges_label: sequence {
                    range [lindex $func_ranges 0] [lindex $func_ranges 1]
                    range [lindex $func_ranges 2] [lindex $func_ranges 3]
                    range [lindex $func_ranges 4] [lindex $func_ranges 5]
                }
            }
        }
    }
}

# Global used to give each generated binary a unique name.
set test_id 0

proc run_test { dwarf_version empty_loc entry_pc_type } {
    incr ::test_id

    set this_testfile $::testfile-$::test_id

    set asm_file [standard_output_file $this_testfile.S]

    if { $empty_loc eq "start" } {
        set ranges [list \
                        $::main_1 $::main_1 \
                        $::main_3 $::main_4 \
                        $::main_6 $::main_7]
        set entry_pc_choices [list $::main_1 $::main_3]
    } elseif { $empty_loc eq "middle" } {
        set ranges [list \
                        $::main_1 $::main_2 \
                        $::main_4 $::main_4 \
                        $::main_6 $::main_7]
        set entry_pc_choices [list $::main_4 $::main_1]
    } elseif { $empty_loc eq "end" } {
        set ranges [list \
                        $::main_1 $::main_2 \
                        $::main_4 $::main_5 \
                        $::main_7 $::main_7]
        set entry_pc_choices [list $::main_7 $::main_1]
    } else {
        error "unknown location for empty range '$empty_loc'"
    }

    if { $entry_pc_type eq "empty" } {
        set entry_pc [lindex $entry_pc_choices 0]
    } elseif { $entry_pc_type eq "non_empty" } {
        set entry_pc [lindex $entry_pc_choices 1]
    } else {
        error "unknown entry-pc type '$entry_pc_type'"
    }

    write_asm_file $asm_file $dwarf_version $ranges $entry_pc

    if {[prepare_for_testing "failed to prepare" $this_testfile \
             [list $::srcfile $asm_file] {nodebug}]} {
        return
    }

    if {![runto_main]} {
        return
    }

    # Continue until we stop in 'foo'.
    gdb_breakpoint foo
    gdb_test "continue" \
        "Breakpoint $::decimal, $::hex in foo \\(\\)" \
        "continue to b/p in foo"

    # Check we stopped at the entry-pc.
    set pc [get_hexadecimal_valueof "\$pc" "*UNKNOWN*" \
                "get \$pc at breakpoint"]
    gdb_assert { $pc == $entry_pc } "stopped at entry-pc"

    # The block's expected overall low/high addresses.
    set block_start [lindex $ranges 0]
    set block_end [lindex $ranges 5]

    # Setup variables r{0,1,2}s, r{0,1,2}e, to represent ranges start
    # and end addresses. These are extracted from the RANGES
    # variable. However, RANGES includes the empty ranges, so spot
    # the empty ranges and update the end address as GDB does.
    #
    # Also, if the empty range is at the end of the block, then the
    # block's overall end address also needs adjusting.
    for { set i 0 } { $i < 3 } { incr i } {
        set start [lindex $ranges [expr $i * 2]]
        set end [lindex $ranges [expr $i * 2 + 1]]
        if { $start == $end } {
            set end [format "0x%x" [expr $end + 1]]
        }
        if { $block_end == $start } {
            set block_end $end
        }
        set r${i}s $start
        set r${i}e $end
    }

    # Check the block 'foo' has the expected ranges.
    gdb_test "maintenance info blocks" \
        [multi_line \
             "\\\[\\(block \\*\\) $::hex\\\] $block_start\\.\\.$block_end" \
             " entry pc: $entry_pc" \
             " inline function: foo" \
             " symbol count: $::decimal" \
             " address ranges:" \
             " $r0s\\.\\.$r0e" \
             " $r1s\\.\\.$r1e" \
             " $r2s\\.\\.$r2e"] \
        "block for foo has some content"

    # Check the outer frame is 'main' as expected.
    gdb_test "frame 1" \
        [multi_line \
             "#1 main \\(\\) at \[^\r\n\]+/$::srcfile:$::foo_call_line" \
             "$::foo_call_line\\s+\[^\r\n\]+/\\* foo call line \\*/"] \
        "frame 1 is for main"
}

foreach_with_prefix dwarf_version { 4 5 } {
    foreach_with_prefix empty_loc { start middle end } {
        foreach_with_prefix entry_pc_type { empty non_empty } {
            run_test $dwarf_version $empty_loc $entry_pc_type
        }
    }
}