forked from Imagelibrary/binutils-gdb
f7f94f99273176f5f6b64f69e01673f935d54cad
1321 Commits
| Author | SHA1 | Message | Date | |
|---|---|---|---|---|
|
|
9e69a2e127 |
Introduce "command" styling
This adds a new "command" style that is used when styling the name of a gdb command. Note that not every instance of a command name that is output by gdb is changed here. There is currently no way to style error() strings, and there is no way to mark up command help strings. Bug: https://sourceware.org/bugzilla/show_bug.cgi?id=31747 Reviewed-By: Eli Zaretskii <eliz@gnu.org> Reviewed-By: Keith Seitz <keiths@redhat.com> Approved-By: Andrew Burgess <aburgess@redhat.com> |
||
|
|
ac51afb51c |
[gdb/contrib] Add two rules in common-misspellings.txt
Eli mentioned [1] that given that we use US English spelling in our documentation, we should use "behavior" instead of "behaviour". In wikipedia-common-misspellings.txt there's a rule: ... behavour->behavior, behaviour ... which leaves this as a choice. Add an overriding rule to hardcode the choice to common-misspellings.txt: ... behavour->behavior ... and add a rule to rewrite behaviour into behavior: ... behaviour->behavior ... and re-run spellcheck.sh on gdb*. Tested on x86_64-linux. [1] https://sourceware.org/pipermail/gdb-patches/2024-November/213371.html |
||
|
|
5d9887ffa2 |
gdb: stepping between inline functions with multiple ranges
I (Andrew) have split this small change from a larger patch which was posted here: https://inbox.sourceware.org/gdb-patches/AS1PR01MB9465608EBD5D62642C51C428E4922@AS1PR01MB9465.eurprd01.prod.exchangelabs.com And I have written the stand alone test for this issue. The original patch included this paragraph to explain this change (I've fixed one typo in this text replacing 'program' with 'function'): ... it may happen that the infrun machinery steps from one inline range to another inline range of the same inline function. That can look like jumping back and forth from the calling function to the inline function, while really the inline function just jumps from a hot to a cold section of the code, i.e. error handling. The important thing that happens here is that both the outer function and the inline function must both have multiple ranges. When the inferior is within the inline function and moves from one range to another it is critical that the address we stop at is the start of a range in both the outer function and the inline function. The diagram below represents how the functions are split and aligned: (A) (B) bar: |------------| |---| foo: |------------------| |--------| The inferior is stepping through 'bar' and eventually reaches point (A) at which point control passes to point (B). Currently, when the inferior stops, GDB notices that both 'foo' and 'bar' start at address (B), and so GDB uses the inline frame mechanism to skip 'bar' and tells the user that the inferior is in 'foo'. However, as we were in 'bar' before the step then it makes sense that we should be in 'bar' after the step, and this is what the patch does. There are two tests using the DWARF assembler, the first checks the above situation and ensures that GDB reports 'bar' after the step. The second test is similar, but after the step we enter a new range where a different inline function starts, something like this: (A) (B) bar: |------------| baz: |---| foo: |------------------| |--------| In this case as we step at (A) and land at (B) we leave 'bar' and expect to stop in 'foo', GDB shouldn't automatically enter 'baz' as that is a completely different inline function. And this is, indeed, what we see. Co-Authored-By: Andrew Burgess <aburgess@redhat.com> |
||
|
|
0891970109 |
Change message when reaching end of reverse history.
In a record session, when we move backward, GDB switches from normal execution to simulation. Moving forward again, the emulation continues until the end of the reverse history. When the end is reached, the execution stops, and a warning message is shown. This message has been modified to indicate that the forward emulation has reached the end, but the execution can continue as normal, and the recording will also continue. Before this patch, the warning message shown in that case was the same as in the reverse case. This meant that when the end of history was reached in either backward or forward emulation, the same message was displayed: "No more reverse-execution history." This message has changed for these two cases. Backward emulation: "Reached end of recorded history; stopping. Backward execution from here not possible." Forward emulation: "Reached end of recorded history; stopping. Following forward execution will be added to history." The reason for this change is that the initial message was deceiving, for the forward case, making the user believe that forward debugging could not continue. Bug: https://sourceware.org/bugzilla/show_bug.cgi?id=31224 Reviewed-By: Markus T. Metzger <markus.t.metzger@intel.com> (btrace) Approved-By: Guinevere Larsen <blarsen@redhat.com> |
||
|
|
a078084852 |
gdb: Fix printing frame when reversing out of a recursive call with clang
Commit
|
||
|
|
b464e193d1 |
[gdb] Notice when stepping into different file
Consider the following test-case:
...
$ cat -n test.c
1 int var;
2
3 int
4 foo (void)
5 {
6 var = 1;
7 #include "test.h"
8 }
9
10 int
11 main ()
12 {
13 return foo ();
14 }
$ cat -n test.h
1 return 1;
$ gcc test.c -g
...
When stepping through the test-case, gdb doesn't make it explicit that line 1
is not in test.c:
...
Temporary breakpoint 1, main () at test.c:13
13 return foo ();
(gdb) step
foo () at test.c:6
6 var = 1;
(gdb) n
1 return 1;
(gdb)
8 }
(gdb)
...
which makes it easy to misinterpret the output.
This is with the default "print frame-info" == auto, with documented
behaviour [1]:
...
stepi will switch between source-line and source-and-location depending on the
program counter.
...
What is actually implemented is that source-line is used unless stepping into
or out of a function.
The problem can be worked around by using
"set print frame-info source-and-location", but that's a bit verbose.
Instead, change the behaviour of "print frame-info" == auto to also use
source-and-location when stepping into another file, which gets us:
...
(gdb) n
foo () at test.h:1
1 return 1;
...
Tested on x86_64-linux.
Reviewed-By: Kevin Buettner <kevinb@redhat.com>
Reviewed-By: Kévin Le Gouguec <legouguec@adacore.com>
PR gdb/32011
Bug: https://sourceware.org/bugzilla/show_bug.cgi?id=32011
[1] https://sourceware.org/gdb/current/onlinedocs/gdb.html/Print-Settings.html#index-set-print-frame_002dinfo
|
||
|
|
b8c9d0de90 |
gdb: pass program space to no_shared_libraries
Make the current program space reference bubble up one level. Pass `current_program_space` everywhere, except in some cases where we can get the pspace another way, and it's relatively obvious that it's the same as the current program space. Change-Id: Id86b79f1e44f92a398f49d137d57457174dfa96d Approved-By: Tom Tromey <tom@tromey.com> Reviewed-By: Thiago Jung Bauermann <thiago.bauermann@linaro.org> |
||
|
|
89dc60d957 |
gdb: split no_shared_libraries, command vs implementation
The `no_shared_libraries` function is currently used to implement the `nosharedlibrary` command, but it also used internally by other functions. This does not make a very good internal API. Add the `no_shared_libraries_command` function to implement the CLI command. Remove the unused parameters from `no_shared_libraries`. Remove the `from_tty` parameter of `target_pre_inferior`, since it's now unused. Change-Id: I4fcba5ee1e0f7d250aab1a7b62b9ea16265fe962 Approved-By: Tom Tromey <tom@tromey.com> Reviewed-By: Thiago Jung Bauermann <thiago.bauermann@linaro.org> |
||
|
|
05d9d66d92 |
gdb: remove unused includes in utils.h
Remove some includes reported as unused by clangd. Add some includes in other files that were previously relying on the transitive include. Change-Id: Ibdd0a998b04d21362a20d0ca8e5267e21e2e133e |
||
|
|
354f8d0a1f |
Remove unnecessary get_current_frame calls from infrun.c
Since the frame variable is now a frame_info_ptr, the issue with the dangling frame pointer is apparently no longer there. So remove the re-fetch code and the corresponding meanwhile misleading comments. Approved-By: Tom Tromey <tom@tromey.com> |
||
|
|
eb97e68430 |
gdb: remove unused include in infrun.c
Remove the gdbcmd.h, which is reported as unused by clangd. Add cli/cli-cmds.h instead, to get access to `cmdlist` and friends. Change-Id: Ic0c60d2f6d3618f1bd9fd80b95ffd7c33c692a04 |
||
|
|
18d2988e5d |
gdb, gdbserver, gdbsupport: remove includes of early headers
Now that defs.h, server.h and common-defs.h are included via the `-include` option, it is no longer necessary for source files to include them. Remove all the inclusions of these files I could find. Update the generation scripts where relevant. Change-Id: Ia026cff269c1b7ae7386dd3619bc9bb6a5332837 Approved-By: Pedro Alves <pedro@palves.net> |
||
|
|
3df7843699 |
gdb: fix b/p conditions with infcalls in multi-threaded inferiors
This commit fixes bug PR 28942, that is, creating a conditional breakpoint in a multi-threaded inferior, where the breakpoint condition includes an inferior function call. Currently, when a user tries to create such a breakpoint, then GDB will fail with: (gdb) break infcall-from-bp-cond-single.c:61 if (return_true ()) Breakpoint 2 at 0x4011fa: file /tmp/build/gdb/testsuite/../../../src/gdb/testsuite/gdb.threads/infcall-from-bp-cond-single.c, line 61. (gdb) continue Continuing. [New Thread 0x7ffff7c5d700 (LWP 2460150)] [New Thread 0x7ffff745c700 (LWP 2460151)] [New Thread 0x7ffff6c5b700 (LWP 2460152)] [New Thread 0x7ffff645a700 (LWP 2460153)] [New Thread 0x7ffff5c59700 (LWP 2460154)] Error in testing breakpoint condition: Couldn't get registers: No such process. An error occurred while in a function called from GDB. Evaluation of the expression containing the function (return_true) will be abandoned. When the function is done executing, GDB will silently stop. Selected thread is running. (gdb) Or, in some cases, like this: (gdb) break infcall-from-bp-cond-simple.c:56 if (is_matching_tid (arg, 1)) Breakpoint 2 at 0x401194: file /tmp/build/gdb/testsuite/../../../src/gdb/testsuite/gdb.threads/infcall-from-bp-cond-simple.c, line 56. (gdb) continue Continuing. [New Thread 0x7ffff7c5d700 (LWP 2461106)] [New Thread 0x7ffff745c700 (LWP 2461107)] ../../src.release/gdb/nat/x86-linux-dregs.c:146: internal-error: x86_linux_update_debug_registers: Assertion `lwp_is_stopped (lwp)' failed. A problem internal to GDB has been detected, further debugging may prove unreliable. The precise error depends on the exact thread state; so there's race conditions depending on which threads have fully started, and which have not. But the underlying problem is always the same; when GDB tries to execute the inferior function call from within the breakpoint condition, GDB will, incorrectly, try to resume threads that are already running - GDB doesn't realise that some threads might already be running. The solution proposed in this patch requires an additional member variable thread_info::in_cond_eval. This flag is set to true (in breakpoint.c) when GDB is evaluating a breakpoint condition. In user_visible_resume_ptid (infrun.c), when the in_cond_eval flag is true, then GDB will only try to resume the current thread, that is, the thread for which the breakpoint condition is being evaluated. This solves the problem of GDB trying to resume threads that are already running. The next problem is that inferior function calls are assumed to be synchronous, that is, GDB doesn't expect to start an inferior function call in thread #1, then receive a stop from thread #2 for some other, unrelated reason. To prevent GDB responding to an event from another thread, we update fetch_inferior_event and do_target_wait in infrun.c, so that, when an inferior function call (on behalf of a breakpoint condition) is in progress, we only wait for events from the current thread (the one evaluating the condition). In do_target_wait I had to change the inferior_matches lambda function, which is used to select which inferior to wait on. Previously the logic was this: auto inferior_matches = [&wait_ptid] (inferior *inf) { return (inf->process_target () != nullptr && ptid_t (inf->pid).matches (wait_ptid)); }; This compares the pid of the inferior against the complete ptid we want to wait on. Before this commit wait_ptid was only ever minus_one_ptid (which is special, and means any process), and so every inferior would match. After this commit though wait_ptid might represent a specific thread in a specific inferior. If we compare the pid of the inferior to a specific ptid then these will not match. The fix is to compare against the pid extracted from the wait_ptid, not against the complete wait_ptid itself. In fetch_inferior_event, after receiving the event, we only want to stop all the other threads, and call inferior_event_handler with INF_EXEC_COMPLETE, if we are not evaluating a conditional breakpoint. If we are, then all the other threads should be left doing whatever they were before. The inferior_event_handler call will be performed once the breakpoint condition has finished being evaluated, and GDB decides to stop or not. The final problem that needs solving relates to GDB's commit-resume mechanism, which allows GDB to collect resume requests into a single packet in order to reduce traffic to a remote target. The problem is that the commit-resume mechanism will not send any resume requests for an inferior if there are already events pending on the GDB side. Imagine an inferior with two threads. Both threads hit a breakpoint, maybe the same conditional breakpoint. At this point there are two pending events, one for each thread. GDB selects one of the events and spots that this is a conditional breakpoint, GDB evaluates the condition. The condition includes an inferior function call, so GDB sets up for the call and resumes the one thread, the resume request is added to the commit-resume queue. When the commit-resume queue is committed GDB sees that there is a pending event from another thread, and so doesn't send any resume requests to the actual target, GDB is assuming that when we wait we will select the event from the other thread. However, as this is an inferior function call for a condition evaluation, we will not select the event from the other thread, we only care about events from the thread that is evaluating the condition - and the resume for this thread was never sent to the target. And so, GDB hangs, waiting for an event from a thread that was never fully resumed. To fix this issue I have added the concept of "forcing" the commit-resume queue. When enabling commit resume, if the force flag is true, then any resumes will be committed to the target, even if there are other threads with pending events. A note on authorship: this patch was based on some work done by Natalia Saiapova and Tankut Baris Aktemur from Intel[1]. I have made some changes to their work in this version. Bug: https://sourceware.org/bugzilla/show_bug.cgi?id=28942 [1] https://sourceware.org/pipermail/gdb-patches/2020-October/172454.html Co-authored-by: Natalia Saiapova <natalia.saiapova@intel.com> Co-authored-by: Tankut Baris Aktemur <tankut.baris.aktemur@intel.com> Reviewed-By: Tankut Baris Aktemur <tankut.baris.aktemur@intel.com> Tested-By: Luis Machado <luis.machado@arm.com> Tested-By: Keith Seitz <keiths@redhat.com> |
||
|
|
07505b613a |
Revert "gdb: remove unnecessary parameter wait_ptid from do_target_wait"
This reverts commit
|
||
|
|
8480a37e14 |
gdb: pass frames as const frame_info_ptr &
We currently pass frames to function by value, as `frame_info_ptr`.
This is somewhat expensive:
- the size of `frame_info_ptr` is 64 bytes, which is a bit big to pass
by value
- the constructors and destructor link/unlink the object in the global
`frame_info_ptr::frame_list` list. This is an `intrusive_list`, so
it's not so bad: it's just assigning a few points, there's no memory
allocation as if it was `std::list`, but still it's useless to do
that over and over.
As suggested by Tom Tromey, change many function signatures to accept
`const frame_info_ptr &` instead of `frame_info_ptr`.
Some functions reassign their `frame_info_ptr` parameter, like:
void
the_func (frame_info_ptr frame)
{
for (; frame != nullptr; frame = get_prev_frame (frame))
{
...
}
}
I wondered what to do about them, do I leave them as-is or change them
(and need to introduce a separate local variable that can be
re-assigned). I opted for the later for consistency. It might not be
clear why some functions take `const frame_info_ptr &` while others take
`frame_info_ptr`. Also, if a function took a `frame_info_ptr` because
it did re-assign its parameter, I doubt that we would think to change it
to `const frame_info_ptr &` should the implementation change such that
it doesn't need to take `frame_info_ptr` anymore. It seems better to
have a simple rule and apply it everywhere.
Change-Id: I59d10addef687d157f82ccf4d54f5dde9a963fd0
Approved-By: Andrew Burgess <aburgess@redhat.com>
|
||
|
|
f592870204 |
gdb: add inferior parameter to breakpoint_init_inferior
By inspection, I believe that breakpoint_init_inferior doesn't call anything that relies on the current program space or inferior. So, add an inferior parameter, to make the current inferior / program space references bubble up one level. Change-Id: Ib07b7a6d360e324f6ae1aa502dd314b8cce421b7 Approved-By: Andrew Burgess <aburgess@redhat.com> |
||
|
|
c72348e3b4 |
gdb: add program_space parameter to mark_breakpoints_out
Make the current_program_space reference bubble up one level. Change-Id: Idc8ed78d23bf3bb2969f6963d8cc049f26901c29 Approved-By: Andrew Burgess <aburgess@redhat.com> |
||
|
|
9c175474a8 |
gdb: remove some unnecessary frame_info_ptr resets
This code was probably needed before we had reinflatable frame_info_ptrs, it's not necessary anymore. Change-Id: I5474c6081ee1e39624c9266b05dbe01351a130b5 Approved-By: Tom Tromey <tom@tromey.com> |
||
|
|
ccf41c2487 |
Use domain_search_flags in lookup_symbol et al
This changes lookup_symbol and associated APIs to accept domain_search_flags rather than a domain_enum. Note that this introduces some new constants to Python and Guile. I chose to break out the documentation patch for this, because the internals here do not change until a later patch, and it seemed simpler to patch the docs just once, rather than twice. |
||
|
|
5266f5c25b |
gdb/infrun: lazily load curr_frame_id in process_event_stop_test
A recent(ish) change in gdb/infrun.c made process_event_stop_test load
debug information where it would not have done so previously. The
change is:
commit
|
||
|
|
519d634396 |
gdb: Buffer output streams during events that might download debuginfo
Introduce new ui_file buffering_file to temporarily collect output
written to gdb_std* output streams during print_thread, print_frame_info
and print_stop_event.
This ensures that output during these functions is not interrupted
by debuginfod progress messages.
With the addition of deferred debuginfo downloading it is possible
for download progress messages to print during these events.
Without any intervention we can end up with poorly formatted output:
(gdb) backtrace
[...]
#8 0x00007fbe8af7d7cf in pygi_invoke_c_callable (Downloading separate debug info for /lib64/libpython3.11.so.1.0
function_cache=0x561221b224d0, state=<optimized out>...
To fix this we buffer writes to gdb_std* output streams while allowing
debuginfod progress messages to skip the buffers and print to the
underlying output streams immediately. Buffered output is then written
to the output streams. This ensures that progress messages print first,
followed by uninterrupted frame/thread/stop info:
(gdb) backtrace
[...]
Downloading separate debug info for /lib64/libpython3.11.so.1.0
#8 0x00007fbe8af7d7cf in pygi_invoke_c_callable (function_cache=0x561221b224d0, state=<optimized out>...
Co-Authored-By: Andrew Burgess <aburgess@redhat.com>
Approved-By: Andrew Burgess <aburgess@redhat.com>
|
||
|
|
1d506c26d9 |
Update copyright year range in header of all files managed by GDB
This commit is the result of the following actions:
- Running gdb/copyright.py to update all of the copyright headers to
include 2024,
- Manually updating a few files the copyright.py script told me to
update, these files had copyright headers embedded within the
file,
- Regenerating gdbsupport/Makefile.in to refresh it's copyright
date,
- Using grep to find other files that still mentioned 2023. If
these files were updated last year from 2022 to 2023 then I've
updated them this year to 2024.
I'm sure I've probably missed some dates. Feel free to fix them up as
you spot them.
|
||
|
|
fe6356def6 |
PowerPC and aarch64: Fix reverse stepping failure
When running GDB's testsuite on aarch64-linux/Ubuntu 20.04 (also spotted on
the ppc backend), there are failures in gdb.reverse/solib-precsave.exp and
gdb.reverse/solib-reverse.exp.
The failure happens around the following code:
38 b[1] = shr2(17); /* middle part two */
40 b[0] = 6; b[1] = 9; /* generic statement, end part two */
42 shr1 ("message 1\n"); /* shr1 one */
Normal execution:
- step from line 38 will land on line 40.
- step from line 40 will land on line 42.
Reverse execution:
- step from line 42 will land on line 40.
- step from line 40 will land on line 40.
- step from line 40 will land on line 38.
The problem here is that line 40 contains two contiguous but distinct
PC ranges in the line table, like so:
Line 40 - [0x7ec ~ 0x7f4]
Line 40 - [0x7f4 ~ 0x7fc]
The two distinct ranges are generated because GCC started outputting source
column information, which GDB doesn't take into account at the moment.
When stepping forward from line 40, we skip both of these ranges and land on
line 42. When stepping backward from line 42, we stop at the start PC of the
second (or first, going backwards) range of line 40.
Since we've reached ecs->event_thread->control.step_range_start, we stop
stepping backwards.
The above issues were fixed by introducing a new function that looks for
adjacent PC ranges for the same line, until we notice a line change. Then
we take that as the start PC of the range. The new start PC for the range
is used for the control.step_range_start when setting up a step range.
The test case gdb.reverse/map-to-same-line.exp is added to test the fix
for the above reverse step issues.
Patch has been tested on PowerPC, X86 and AArch64 with no regressions.
|
||
|
|
45fd40cf54 |
Step over thread exit, always delete the thread non-silently
With AMD GPU debugging, I noticed that when stepping over a breakpoint placed on top of the s_endpgm instruction inline (displaced=off), GDB would behave differently -- it wouldn't print the wave exit. E.g: With displaced stepping, or no breakpoint at all: stepi [AMDGPU Wave 1:4:1:1 (0,0,0)/0 exited] Command aborted, thread exited. (gdb) With inline stepping: stepi Command aborted, thread exited. (gdb) In the cases we see the "exited" notification, handle_thread_exit is what first called delete_thread on the exiting thread, which is non-silent. With inline stepping, however, handle_thread_exit ends up in update_thread_list (via restart_threads) before any delete_thread call. Thus, amd_dbgapi_target::update_thread_list notices that the wave is gone and deletes it with delete_thread_silent. This commit fixes it, by making handle_thread_exited call set_thread_exited (with the default silent=false) early, which emits the user-visible notification. Approved-By: Simon Marchi <simon.marchi@efficios.com> Change-Id: I22ab3145e18d07c99dace45576307b9f9d5d966f |
||
|
|
249d081287 |
displaced_step_finish: Don't fetch the regcache of exited threads
displaced_step_finish can be called with event_status.kind == TARGET_WAITKIND_THREAD_EXITED, and in that case it is not possible to get at the already-exited thread's registers. This patch moves the get_thread_regcache calls to branches that actually need it, where we know the thread is still alive. It also adds an assertion to get_thread_regcache, to help catching these broken cases sooner. Approved-By: Simon Marchi <simon.marchi@efficios.com> Change-Id: I63b5eacb3e02a538fc5087c270d8025adfda88c3 |
||
|
|
d0b5914979 |
Ensure selected thread after thread exit stop
While making step over thread exit work properly on AMDGPU, I noticed that if there's a breakpoint on top of the exit syscall, and, displaced stepping is off, then when GDB reports "Command aborted, thread exited.", GDB also switches focus to a random thread, instead of leaving the exited thread as selected: (gdb) thread [Current thread is 6, lane 0 (AMDGPU Lane 1:4:1:1/0 (0,0,0)[0,0,0])] (gdb) si Command aborted, thread exited. (gdb) thread [Current thread is 5 (Thread 0x7ffff626f640 (LWP 3248392))] (gdb) The previous patch extended gdb.threads/step-over-thread-exit.exp to exercise this on GNU/Linux (on the CPU side), and there, after that "si", we always end up with the exiting thread as selected even without this fix, but that's just a concidence, there's a code path that happens to select the exiting thread for an unrelated reason. This commit add the explict switch, fixing the latent problem for GNU/Linux, and the actual problem on AMDGPU. I wrote a gdb.rocm/ testcase for this, but it can't be upstreamed yet, until more pieces of the DWARF machinery are upstream as well. Approved-By: Simon Marchi <simon.marchi@efficios.com> Change-Id: I6ff57a79514ac0142bba35c749fe83d53d9e4e51 |
||
|
|
14414227bf |
[gdb] Fix segfault in for_each_block, part 2
The previous commit describes PR gdb/30547, a segfault when running test-case
gdb.base/vfork-follow-parent.exp on powerpc64 (likewise on s390x).
The root cause for the segmentation fault is that linux_is_uclinux gives an
incorrect result: it returns true instead of false.
So, why does linux_is_uclinux:
...
int
linux_is_uclinux (void)
{
CORE_ADDR dummy;
return (target_auxv_search (AT_NULL, &dummy) > 0
&& target_auxv_search (AT_PAGESZ, &dummy) == 0);
...
return true?
This is because ppc_linux_target_wordsize returns 4 instead of 8, causing
ppc_linux_nat_target::auxv_parse to misinterpret the auxv vector.
So, why does ppc_linux_target_wordsize:
...
int
ppc_linux_target_wordsize (int tid)
{
int wordsize = 4;
/* Check for 64-bit inferior process. This is the case when the host is
64-bit, and in addition the top bit of the MSR register is set. */
long msr;
errno = 0;
msr = (long) ptrace (PTRACE_PEEKUSER, tid, PT_MSR * 8, 0);
if (errno == 0 && ppc64_64bit_inferior_p (msr))
wordsize = 8;
return wordsize;
}
...
return 4?
Specifically, we get this result because because tid == 0, so we get
errno == ESRCH.
The tid == 0 is caused by the switch_to_no_thread in
handle_vfork_child_exec_or_exit:
...
/* Switch to no-thread while running clone_program_space, so
that clone_program_space doesn't want to read the
selected frame of a dead process. */
scoped_restore_current_thread restore_thread;
switch_to_no_thread ();
inf->pspace = new program_space (maybe_new_address_space ());
...
but moving the maybe_new_address_space call to before that gives us the
same result. The tid is no longer 0, but we still get ESRCH because the
thread has exited.
Fix this in handle_vfork_child_exec_or_exit by doing the
maybe_new_address_space call in the context of the vfork parent.
Tested on top of trunk on x86_64-linux and ppc64le-linux.
Tested on top of gdb-14-branch on ppc64-linux.
Co-Authored-By: Simon Marchi <simon.marchi@polymtl.ca>
PR gdb/30547
Bug: https://sourceware.org/bugzilla/show_bug.cgi?id=30547
|
||
|
|
f9582a22db |
[gdb] Fix segfault in for_each_block, part 1
When running test-case gdb.base/vfork-follow-parent.exp on powerpc64 (likewise
on s390x), I run into:
...
(gdb) PASS: gdb.base/vfork-follow-parent.exp: \
exec_file=vfork-follow-parent-exit: target-non-stop=on: non-stop=off: \
resolution_method=schedule-multiple: print unblock_parent = 1
continue^M
Continuing.^M
Reading symbols from vfork-follow-parent-exit...^M
^M
^M
Fatal signal: Segmentation fault^M
----- Backtrace -----^M
0x1027d3e7 gdb_internal_backtrace_1^M
src/gdb/bt-utils.c:122^M
0x1027d54f _Z22gdb_internal_backtracev^M
src/gdb/bt-utils.c:168^M
0x1057643f handle_fatal_signal^M
src/gdb/event-top.c:889^M
0x10576677 handle_sigsegv^M
src/gdb/event-top.c:962^M
0x3fffa7610477 ???^M
0x103f2144 for_each_block^M
src/gdb/dcache.c:199^M
0x103f235b _Z17dcache_invalidateP13dcache_struct^M
src/gdb/dcache.c:251^M
0x10bde8c7 _Z24target_dcache_invalidatev^M
src/gdb/target-dcache.c:50^M
...
or similar.
The root cause for the segmentation fault is that linux_is_uclinux gives an
incorrect result: it should always return false, given that we're running on a
regular linux system, but instead it returns first true, then false.
In more detail, the segmentation fault happens as follows:
- a program space with an address space is created
- a second program space is about to be created. maybe_new_address_space
is called, and because linux_is_uclinux returns true, maybe_new_address_space
returns false, and no new address space is created
- a second program space with the same address space is created
- a program space is deleted. Because linux_is_uclinux now returns false,
gdbarch_has_shared_address_space (current_inferior ()->arch ()) returns
false, and the address space is deleted
- when gdb uses the address space of the remaining program space, we run into
the segfault, because the address space is deleted.
Hardcoding linux_is_uclinux to false makes the test-case pass.
We leave addressing the root cause for the following commit in this series.
For now, prevent the segmentation fault by making the address space a refcounted
object.
This was already suggested here [1]:
...
A better solution might be to have the address spaces be reference counted
...
Tested on top of trunk on x86_64-linux and ppc64le-linux.
Tested on top of gdb-14-branch on ppc64-linux.
Co-Authored-By: Simon Marchi <simon.marchi@polymtl.ca>
PR gdb/30547
Bug: https://sourceware.org/bugzilla/show_bug.cgi?id=30547
[1] https://sourceware.org/pipermail/gdb-patches/2023-October/202928.html
|
||
|
|
f087eb2765 |
gdb: make catch_syscall_enabled return bool
Make it return a bool and adjust a few comparisons where it's used. Change-Id: Ic77d23b0dcfcfc9195dfe65e4c7ff9cf3229f6fb |
||
|
|
6b09f1342c |
gdb: Replace gdb::optional with std::optional
Since GDB now requires C++17, we don't need the internally maintained gdb::optional implementation. This patch does the following replacing: - gdb::optional -> std::optional - gdb::in_place -> std::in_place - #include "gdbsupport/gdb_optional.h" -> #include <optional> This change has mostly been done automatically. One exception is gdbsupport/thread-pool.* which did not use the gdb:: prefix as it already lives in the gdb namespace. Change-Id: I19a92fa03e89637bab136c72e34fd351524f65e9 Approved-By: Tom Tromey <tom@tromey.com> Approved-By: Pedro Alves <pedro@palves.net> |
||
|
|
fb84fbf8a5 |
gdb/infrun: simplify process_event_stop_test
The previous commit introduced some local variables to make some if statements simpler. This commit uses them more liberally throughout the process_event_stop_test in order to simplify the function a little more. No functional changes are expected. Approved-By: Tom Tromey <tom@tromey.com> |
||
|
|
bf2813aff8 |
gdb/record: print frame information when exiting a recursive call
Currently, when GDB is reverse stepping out of a function into the same function due to a recursive call, it doesn't print frame information, as reported by PR record/29178. This happens because when the inferior leaves the current frame, GDB decides to refresh the step information, clobbering the original step_frame_id, making it impossible to figure out later on that the frame has been changed. This commit changes GDB so that, if we notice we're in this exact situation, we won't refresh the step information. Because of implementation details, this change can cause some debug information to be read when it normally wouldn't before, which showed up as a regression on gdb.dwarf2/dw2-out-of-range-end-of-seq. Since that isn't a problem, the test was changed to allow for the new output. Bug: https://sourceware.org/bugzilla/show_bug.cgi?id=29178 Approved-By: Tom Tromey <tom@tromey.com> |
||
|
|
4133662031 |
gdb: pass address_space to target dcache functions
A simple refactor to make the reference to current_program_space bubble up one level. No behavior changes expected. Change-Id: I237cf2f45ae73c35bcb433ce40e3c03cef6b87e2 |
||
|
|
9c742269ec |
gdb: remove get_current_regcache
Remove get_current_regcache, inlining the call to get_thread_regcache in callers. When possible, pass the right thread_info object known from the local context. Otherwise, fall back to passing `inferior_thread ()`. This makes the reference to global context bubble up one level, a small step towards the long term goal of reducing the number of references to global context (or rather, moving those references as close as possible to the top of the call tree). No behavior change expected. Change-Id: Ifa6980c88825d803ea586546b6b4c633c33be8d6 |
||
|
|
7438771288 |
gdb: remove regcache's address space
While looking at the regcache code, I noticed that the address space (passed to regcache when constructing it, and available through regcache::aspace) wasn't relevant for the regcache itself. Callers of regcache::aspace use that method because it appears to be a convenient way of getting the address space for a thread, if you already have the regcache. But there is always another way to get the address space, as the callers pretty much always know which thread they are dealing with. The regcache code itself doesn't use the address space. This patch removes anything related to address_space from the regcache code, and updates callers to get it from the thread in context. This removes a bit of unnecessary complexity from the regcache code. The current get_thread_arch_regcache function gets an address_space for the given thread using the target_thread_address_space function (which calls the target_ops::thread_address_space method). This suggest that there might have been the intention of supporting per-thread address spaces. But digging through the history, I did not find any such case. Maybe this method was just added because we needed a way to get an address space from a ptid (because constructing a regcache required an address space), and this seemed like the right way to do it, I don't know. The only implementations of thread_address_space and process_stratum_target::thread_address_space and linux_nat_target::thread_address_space, which essentially just return the inferior's address space. And thread_address_space is only used in the current get_thread_arch_regcache, which gets removed. So, I think that the thread_address_space target method can be removed, and we can assume that it's fine to use the inferior's address space everywhere. Callers of regcache::aspace are updated to get the address space from the relevant inferior, either using some context they already know about, or in last resort using the current global context. So, to summarize: - remove everything in regcache related to address spaces - in particular, remove get_thread_arch_regcache, and rename get_thread_arch_aspace_regcache to get_thread_arch_regcache - remove target_ops::thread_address_space, and target_thread_address_space - adjust all users of regcache::aspace to get the address space another way Change-Id: I04fd41b22c83fe486522af7851c75bcfb31c88c7 |
||
|
|
9488c32734 |
Cancel execution command on thread exit, when stepping, nexting, etc.
If your target has no support for TARGET_WAITKIND_NO_RESUMED events (and no way to support them, such as the yet-unsubmitted AMDGPU target), and you step over thread exit with scheduler-locking on, this is what you get: (gdb) n [Thread ... exited] *hang* Getting back the prompt by typing Ctrl-C may not even work, since no inferior thread is running to receive the SIGINT. Even if it works, it seems unnecessarily harsh. If you started an execution command for which there's a clear thread of interest (step, next, until, etc.), and that thread disappears, then I think it's more user friendly if GDB just detects the situation and aborts the command, giving back the prompt. That is what this commit implements. It does this by explicitly requesting the target to report thread exit events whenever the main resumed thread has a thread_fsm. Note that unlike stepping over a breakpoint, we don't need to enable clone events in this case. With this patch, we get: (gdb) n [Thread 0x7ffff7d89700 (LWP 3961883) exited] Command aborted, thread exited. (gdb) Reviewed-By: Andrew Burgess <aburgess@redhat.com> Change-Id: I901ab64c91d10830590b2dac217b5264635a2b95 |
||
|
|
7ac958f267 |
Don't resume new threads if scheduler-locking is in effect
If scheduler-locking is in effect, e.g., with "set scheduler-locking on", and you step over a function that spawns a new thread, the new thread is allowed to run free, at least until some event is hit, at which point, whether the new thread is re-resumed depends on a number of seemingly random factors. E.g., if the target is all-stop, and the parent thread hits a breakpoint, and GDB decides the breakpoint isn't interesting to report to the user, then the parent thread is resumed, but the new thread is left stopped. I think that letting the new threads run with scheduler-locking enabled is a defect. This commit fixes that, making use of the new clone events on Linux, and of target_thread_events() on targets where new threads have no connection to the thread that spawned them. Testcase and documentation changes included. Approved-By: Eli Zaretskii <eliz@gnu.org> Reviewed-By: Andrew Burgess <aburgess@redhat.com> Change-Id: Ie12140138b37534b7fc1d904da34f0f174aa11ce |
||
|
|
d828dbed9c |
stop_all_threads: (re-)enable async before waiting for stops
Running the gdb.threads/step-over-thread-exit-while-stop-all-threads.exp testcase added later in the series against gdbserver, after the TARGET_WAITKIND_NO_RESUMED fix from the following patch, would run into an infinite loop in stop_all_threads, leading to a timeout: FAIL: gdb.threads/step-over-thread-exit-while-stop-all-threads.exp: displaced-stepping=off: target-non-stop=on: iter 0: continue (timeout) The is really a latent bug, and it is about the fact that stop_all_threads stops listening to events from a target as soon as it sees a TARGET_WAITKIND_NO_RESUMED, ignoring that TARGET_WAITKIND_NO_RESUMED may be delayed. handle_no_resumed knows how to handle delayed no-resumed events, but stop_all_threads was never taught to. In more detail, here's what happens with that testcase: #1 - Multiple threads report breakpoint hits to gdb. #2 - gdb picks one events, and it's for thread 1. All other stops are left pending. thread 1 needs to move past a breakpoint, so gdb stops all threads to start an inline step over for thread 1. While stopping threads, some of the threads that were still running report events that are also left pending. #2 - gdb steps thread 1 #3 - Thread 1 exits while stepping (it steps over an exit syscall), gdbserver reports thread exit for thread 1 #4 - Thread 1 was the last resumed thread, so gdbserver also reports no-resumed: [remote] Notification received: Stop:w0;p3445d0.3445d3 [remote] Sending packet: $vStopped#55 [remote] Packet received: N [remote] Sending packet: $vStopped#55 [remote] Packet received: OK #5 - gdb processes the thread exit for thread 1, finishes the step over and restarts threads. #6 - gdb picks the next event to process out of one of the resumed threads with pending events: [infrun] random_resumed_with_pending_wait_status: Found 32 events, selecting #11 #7 - This is again a breakpoint hit and the breakpoint needs to be stepped over too, so gdb starts a step-over dance again. #8 - We reach stop_all_threads, which finds that some threads need to be stopped. #9 - wait_one finally consumes the no-resumed event queue by #4. Seeing this, wait_one disable target async, to stop listening for events out of the remote target. #10 - We still haven't seen all the stops expected, so stop_all_threads tries another iteration. #11 - Because the remote target is no longer async, and there are no other targets, wait_one return no-resumed immediately without polling the remote target. #12 - We still haven't seen all the stops expected, so stop_all_threads tries another iteration. goto #11, looping forever. Fix this by explicitly enabling/re-enabling target async on targets that can async, before waiting for stops. Reviewed-By: Andrew Burgess <aburgess@redhat.com> Change-Id: Ie3ffb0df89635585a6631aa842689cecc989e33f |
||
|
|
21d4830415 |
gdb: clear step over information on thread exit (PR gdb/27338)
GDB doesn't handle correctly the case where a thread steps over a
breakpoint (using either in-line or displaced stepping), and the
executed instruction causes the thread to exit.
Using the test program included later in the series, this is what it
looks like with displaced-stepping, on x86-64 Linux, where we have two
displaced-step buffers:
$ ./gdb -q -nx --data-directory=data-directory build/binutils-gdb/gdb/testsuite/outputs/gdb.threads/step-over-thread-exit/step-over-thread-exit -ex "b my_exit_syscall" -ex r
Reading symbols from build/binutils-gdb/gdb/testsuite/outputs/gdb.threads/step-over-thread-exit/step-over-thread-exit...
Breakpoint 1 at 0x123c: file src/binutils-gdb/gdb/testsuite/lib/my-syscalls.S, line 68.
Starting program: build/binutils-gdb/gdb/testsuite/outputs/gdb.threads/step-over-thread-exit/step-over-thread-exit
[Thread debugging using libthread_db enabled]
Using host libthread_db library "/usr/lib/../lib/libthread_db.so.1".
[New Thread 0x7ffff7c5f640 (LWP 2915510)]
[Switching to Thread 0x7ffff7c5f640 (LWP 2915510)]
Thread 2 "step-over-threa" hit Breakpoint 1, my_exit_syscall () at src/binutils-gdb/gdb/testsuite/lib/my-syscalls.S:68
68 syscall
(gdb) c
Continuing.
[New Thread 0x7ffff7c5f640 (LWP 2915524)]
[Thread 0x7ffff7c5f640 (LWP 2915510) exited]
[Switching to Thread 0x7ffff7c5f640 (LWP 2915524)]
Thread 3 "step-over-threa" hit Breakpoint 1, my_exit_syscall () at src/binutils-gdb/gdb/testsuite/lib/my-syscalls.S:68
68 syscall
(gdb) c
Continuing.
[New Thread 0x7ffff7c5f640 (LWP 2915616)]
[Thread 0x7ffff7c5f640 (LWP 2915524) exited]
[Switching to Thread 0x7ffff7c5f640 (LWP 2915616)]
Thread 4 "step-over-threa" hit Breakpoint 1, my_exit_syscall () at src/binutils-gdb/gdb/testsuite/lib/my-syscalls.S:68
68 syscall
(gdb) c
Continuing.
... hangs ...
The first two times we do "continue", we displaced-step the syscall
instruction that causes the thread to exit. When the thread exits,
the main thread, waiting on pthread_join, is unblocked. It spawns a
new thread, which hits the breakpoint on the syscall again. However,
infrun was never notified that the displaced-stepping threads are done
using the displaced-step buffer, so now both buffers are marked as
used. So when we do the third continue, there are no buffers
available to displaced-step the syscall, so the thread waits forever
for its turn.
When trying the same but with in-line step over (displaced-stepping
disabled):
$ ./gdb -q -nx --data-directory=data-directory \
build/binutils-gdb/gdb/testsuite/outputs/gdb.threads/step-over-thread-exit/step-over-thread-exit \
-ex "b my_exit_syscall" -ex "set displaced-stepping off" -ex r
Reading symbols from build/binutils-gdb/gdb/testsuite/outputs/gdb.threads/step-over-thread-exit/step-over-thread-exit...
Breakpoint 1 at 0x123c: file src/binutils-gdb/gdb/testsuite/lib/my-syscalls.S, line 68.
Starting program: build/binutils-gdb/gdb/testsuite/outputs/gdb.threads/step-over-thread-exit/step-over-thread-exit
[Thread debugging using libthread_db enabled]
Using host libthread_db library "/usr/lib/../lib/libthread_db.so.1".
[New Thread 0x7ffff7c5f640 (LWP 2928290)]
[Switching to Thread 0x7ffff7c5f640 (LWP 2928290)]
Thread 2 "step-over-threa" hit Breakpoint 1, my_exit_syscall () at src/binutils-gdb/gdb/testsuite/lib/my-syscalls.S:68
68 syscall
(gdb) c
Continuing.
[Thread 0x7ffff7c5f640 (LWP 2928290) exited]
No unwaited-for children left.
(gdb) i th
Id Target Id Frame
1 Thread 0x7ffff7c60740 (LWP 2928285) "step-over-threa" 0x00007ffff7f7c9b7 in __pthread_clockjoin_ex () from /usr/lib/libpthread.so.0
The current thread <Thread ID 2> has terminated. See `help thread'.
(gdb) thread 1
[Switching to thread 1 (Thread 0x7ffff7c60740 (LWP 2928285))]
#0 0x00007ffff7f7c9b7 in __pthread_clockjoin_ex () from /usr/lib/libpthread.so.0
(gdb) c
Continuing.
^C^C
... hangs ...
The "continue" causes an in-line step to occur, meaning the main
thread is stopped while we step the syscall. The stepped thread exits
when executing the syscall, the linux-nat target notices there are no
more resumed threads to be waited for, so returns
TARGET_WAITKIND_NO_RESUMED, which causes the prompt to return. But
infrun never clears the in-line step over info. So if we try
continuing the main thread, GDB doesn't resume it, because it thinks
there's an in-line step in progress that we need to wait for to
finish, and we are stuck there.
To fix this, infrun needs to be informed when a thread doing a
displaced or in-line step over exits. We can do that with the new
target_set_thread_options mechanism which is optimal for only enabling
exit events of the thread that needs it; or, if that is not supported,
by using target_thread_events, which enables thread exit events for
all threads. This is done by this commit.
This patch then modifies handle_inferior_event in infrun.c to clean up
any step-over the exiting thread might have been doing at the time of
the exit. The cases to consider are:
- the exiting thread was doing an in-line step-over with an all-stop
target
- the exiting thread was doing an in-line step-over with a non-stop
target
- the exiting thread was doing a displaced step-over with a non-stop
target
The displaced-stepping buffer implementation in displaced-stepping.c
is modified to account for the fact that it's possible that we
"finish" a displaced step after a thread exit event. The buffer that
the exiting thread was using is marked as available again and the
original instructions under the scratch pad are restored. However, it
skips applying the fixup, which wouldn't make sense since the thread
does not exist anymore.
Another case that needs handling is if a displaced-stepping thread
exits, and the event is reported while we are in stop_all_threads. We
should call displaced_step_finish in the handle_one function, in that
case. It was already called in other code paths, just not the "thread
exited" path.
This commit doesn't make infrun ask the target to report the
TARGET_WAITKIND_THREAD_EXITED events yet, that'll be done later in the
series.
Note that "stop_print_frame = false;" line is moved to normal_stop,
because TARGET_WAITKIND_THREAD_EXITED can also end up with the event
transmorphed into TARGET_WAITKIND_NO_RESUMED. Moving it to
normal_stop keeps it centralized.
Co-authored-by: Simon Marchi <simon.marchi@efficios.com>
Bug: https://sourceware.org/bugzilla/show_bug.cgi?id=27338
Reviewed-By: Andrew Burgess <aburgess@redhat.com>
Change-Id: I745c6955d7ef90beb83bcf0ff1d1ac8b9b6285a5
|
||
|
|
d8d96409c8 |
Introduce GDB_THREAD_OPTION_EXIT thread option, fix step-over-thread-exit
When stepping over a breakpoint with displaced stepping, GDB needs to be informed if the stepped thread exits, otherwise the displaced stepping buffer that was allocated to that thread leaks, and this can result in deadlock, with other threads waiting for their turn to displaced step, but their turn never comes. Similarly, when stepping over a breakpoint in line, GDB also needs to be informed if the stepped thread exits, so that is can clear the step over state and re-resume threads. This commit makes it possible for GDB to ask the target to report thread exit events for a given thread, using the new "thread options" mechanism introduced by a previous patch. This only adds the core bits. Following patches in the series will teach the Linux backends (native & gdbserver) to handle the GDB_THREAD_OPTION_EXIT option, and then a later patch will make use of these thread exit events to clean up displaced stepping and inline stepping state properly. Reviewed-By: Andrew Burgess <aburgess@redhat.com> Change-Id: I96b719fdf7fee94709e98bb3a90751d8134f3a38 Bug: https://sourceware.org/bugzilla/show_bug.cgi?id=27338 |
||
|
|
7730e5c6c2 |
Move deleting thread on TARGET_WAITKIND_THREAD_EXITED to core
Currently, infrun assumes that when TARGET_WAITKIND_THREAD_EXITED is reported, the corresponding GDB thread has already been removed from the GDB thread list. Later in the series, that will no longer work, as infrun will need to refer to the thread's thread_info when it processes TARGET_WAITKIND_THREAD_EXITED. As preparation, this patch makes deleting the GDB thread responsibility of infrun, instead of the target. Reviewed-By: Andrew Burgess <aburgess@redhat.com> Change-Id: I013d87f61ffc9aaca49f0d6ce2a43e3ea69274de |
||
|
|
65c459abeb |
Thread options & clone events (core + remote)
A previous patch taught GDB about a new TARGET_WAITKIND_THREAD_CLONED event kind, and made the Linux target report clone events. A following patch will teach Linux GDBserver to do the same thing. However, for remote debugging, it wouldn't be ideal for GDBserver to report every clone event to GDB, when GDB only cares about such events in some specific situations. Reporting clone events all the time would be potentially chatty. We don't enable thread create/exit events all the time for the same reason. Instead we have the QThreadEvents packet. QThreadEvents is target-wide, though. This patch makes GDB instead explicitly request that the target reports clone events or not, on a per-thread basis. In order to be able to do that with GDBserver, we need a new remote protocol feature. Since a following patch will want to enable thread exit events on per-thread basis too, the packet introduced here is more generic than just for clone events. It lets you enable/disable a set of options at once, modelled on Linux ptrace's PTRACE_SETOPTIONS. IOW, this commit introduces a new QThreadOptions packet, that lets you specify a set of per-thread event options you want to enable. The packet accepts a list of options/thread-id pairs, similarly to vCont, processed left to right, with the options field being a number interpreted as a bit mask of options. The only option defined in this commit is GDB_THREAD_OPTION_CLONE (0x1), which ask the remote target to report clone events. Another patch later in the series will introduce another option. For example, this packet sets option "1" (clone events) on thread p1000.2345: QThreadOptions;1:p1000.2345 and this clears options for all threads of process 1000, and then sets option "1" (clone events) on thread p1000.2345: QThreadOptions;0:p1000.-1;1:p1000.2345 This clears options of all threads of all processes: QThreadOptions;0 The target reports the set of supported options by including "QThreadOptions=<supported options>" in its qSupported response. infrun is then tweaked to enable GDB_THREAD_OPTION_CLONE when stepping over a breakpoint. Unlike PTRACE_SETOPTIONS, fork/vfork/clone children do NOT inherit their parent's thread options. This is so that GDB can send e.g., "QThreadOptions;0;1:TID" without worrying about threads it doesn't know about yet. Documentation for this new remote protocol feature is included in a documentation patch later in the series. Bug: https://sourceware.org/bugzilla/show_bug.cgi?id=19675 Bug: https://sourceware.org/bugzilla/show_bug.cgi?id=27830 Reviewed-By: Andrew Burgess <aburgess@redhat.com> Change-Id: Ie41e5093b2573f14cf6ac41b0b5804eba75be37e |
||
|
|
0d36baa9af |
Step over clone syscall w/ breakpoint, TARGET_WAITKIND_THREAD_CLONED
(A good chunk of the problem statement in the commit log below is
Andrew's, adjusted for a different solution, and for covering
displaced stepping too. The testcase is mostly Andrew's too.)
This commit addresses bugs gdb/19675 and gdb/27830, which are about
stepping over a breakpoint set at a clone syscall instruction, one is
about displaced stepping, and the other about in-line stepping.
Currently, when a new thread is created through a clone syscall, GDB
sets the new thread running. With 'continue' this makes sense
(assuming no schedlock):
- all-stop mode, user issues 'continue', all threads are set running,
a newly created thread should also be set running.
- non-stop mode, user issues 'continue', other pre-existing threads
are not affected, but as the new thread is (sort-of) a child of the
thread the user asked to run, it makes sense that the new threads
should be created in the running state.
Similarly, if we are stopped at the clone syscall, and there's no
software breakpoint at this address, then the current behaviour is
fine:
- all-stop mode, user issues 'stepi', stepping will be done in place
(as there's no breakpoint to step over). While stepping the thread
of interest all the other threads will be allowed to continue. A
newly created thread will be set running, and then stopped once the
thread of interest has completed its step.
- non-stop mode, user issues 'stepi', stepping will be done in place
(as there's no breakpoint to step over). Other threads might be
running or stopped, but as with the continue case above, the new
thread will be created running. The only possible issue here is
that the new thread will be left running after the initial thread
has completed its stepi. The user would need to manually select
the thread and interrupt it, this might not be what the user
expects. However, this is not something this commit tries to
change.
The problem then is what happens when we try to step over a clone
syscall if there is a breakpoint at the syscall address.
- For both all-stop and non-stop modes, with in-line stepping:
+ user issues 'stepi',
+ [non-stop mode only] GDB stops all threads. In all-stop mode all
threads are already stopped.
+ GDB removes s/w breakpoint at syscall address,
+ GDB single steps just the thread of interest, all other threads
are left stopped,
+ New thread is created running,
+ Initial thread completes its step,
+ [non-stop mode only] GDB resumes all threads that it previously
stopped.
There are two problems in the in-line stepping scenario above:
1. The new thread might pass through the same code that the initial
thread is in (i.e. the clone syscall code), in which case it will
fail to hit the breakpoint in clone as this was removed so the
first thread can single step,
2. The new thread might trigger some other stop event before the
initial thread reports its step completion. If this happens we
end up triggering an assertion as GDB assumes that only the
thread being stepped should stop. The assert looks like this:
infrun.c:5899: internal-error: int finish_step_over(execution_control_state*): Assertion `ecs->event_thread->control.trap_expected' failed.
- For both all-stop and non-stop modes, with displaced stepping:
+ user issues 'stepi',
+ GDB starts the displaced step, moves thread's PC to the
out-of-line scratch pad, maybe adjusts registers,
+ GDB single steps the thread of interest, [non-stop mode only] all
other threads are left as they were, either running or stopped.
In all-stop, all other threads are left stopped.
+ New thread is created running,
+ Initial thread completes its step, GDB re-adjusts its PC,
restores/releases scratchpad,
+ [non-stop mode only] GDB resumes the thread, now past its
breakpoint.
+ [all-stop mode only] GDB resumes all threads.
There is one problem with the displaced stepping scenario above:
3. When the parent thread completed its step, GDB adjusted its PC,
but did not adjust the child's PC, thus that new child thread
will continue execution in the scratch pad, invoking undefined
behavior. If you're lucky, you see a crash. If unlucky, the
inferior gets silently corrupted.
What is needed is for GDB to have more control over whether the new
thread is created running or not. Issue #1 above requires that the
new thread not be allowed to run until the breakpoint has been
reinserted. The only way to guarantee this is if the new thread is
held in a stopped state until the single step has completed. Issue #3
above requires that GDB is informed of when a thread clones itself,
and of what is the child's ptid, so that GDB can fixup both the parent
and the child.
When looking for solutions to this problem I considered how GDB
handles fork/vfork as these have some of the same issues. The main
difference between fork/vfork and clone is that the clone events are
not reported back to core GDB. Instead, the clone event is handled
automatically in the target code and the child thread is immediately
set running.
Note we have support for requesting thread creation events out of the
target (TARGET_WAITKIND_THREAD_CREATED). However, those are reported
for the new/child thread. That would be sufficient to address in-line
stepping (issue #1), but not for displaced-stepping (issue #3). To
handle displaced-stepping, we need an event that is reported to the
_parent_ of the clone, as the information about the displaced step is
associated with the clone parent. TARGET_WAITKIND_THREAD_CREATED
includes no indication of which thread is the parent that spawned the
new child. In fact, for some targets, like e.g., Windows, it would be
impossible to know which thread that was, as thread creation there
doesn't work by "cloning".
The solution implemented here is to model clone on fork/vfork, and
introduce a new TARGET_WAITKIND_THREAD_CLONED event. This event is
similar to TARGET_WAITKIND_FORKED and TARGET_WAITKIND_VFORKED, except
that we end up with a new thread in the same process, instead of a new
thread of a new process. Like FORKED and VFORKED, THREAD_CLONED
waitstatuses have a child_ptid property, and the child is held stopped
until GDB explicitly resumes it. This addresses the in-line stepping
case (issues #1 and #2).
The infrun code that handles displaced stepping fixup for the child
after a fork/vfork event is thus reused for THREAD_CLONE, with some
minimal conditions added, addressing the displaced stepping case
(issue #3).
The native Linux backend is adjusted to unconditionally report
TARGET_WAITKIND_THREAD_CLONED events to the core.
Following the follow_fork model in core GDB, we introduce a
target_follow_clone target method, which is responsible for making the
new clone child visible to the rest of GDB.
Subsequent patches will add clone events support to the remote
protocol and gdbserver.
displaced_step_in_progress_thread becomes unused with this patch, but
a new use will reappear later in the series. To avoid deleting it and
readding it back, this patch marks it with attribute unused, and the
latter patch removes the attribute again. We need to do this because
the function is static, and with no callers, the compiler would warn,
(error with -Werror), breaking the build.
This adds a new gdb.threads/stepi-over-clone.exp testcase, which
exercises stepping over a clone syscall, with displaced stepping vs
inline stepping, and all-stop vs non-stop. We already test stepping
over clone syscalls with gdb.base/step-over-syscall.exp, but this test
uses pthreads, while the other test uses raw clone, and this one is
more thorough. The testcase passes on native GNU/Linux, but fails
against GDBserver. GDBserver will be fixed by a later patch in the
series.
Co-authored-by: Andrew Burgess <aburgess@redhat.com>
Bug: https://sourceware.org/bugzilla/show_bug.cgi?id=19675
Bug: https://sourceware.org/bugzilla/show_bug.cgi?id=27830
Change-Id: I95c06024736384ae8542a67ed9fdf6534c325c8e
Reviewed-By: Andrew Burgess <aburgess@redhat.com>
|
||
|
|
6a534f85cb |
gdb/linux: Delete all other LWPs immediately on ptrace exec event
I noticed that on an Ubuntu 20.04 system, after a following patch
("Step over clone syscall w/ breakpoint,
TARGET_WAITKIND_THREAD_CLONED"), the gdb.threads/step-over-exec.exp
was passing cleanly, but still, we'd end up with four new unexpected
GDB core dumps:
=== gdb Summary ===
# of unexpected core files 4
# of expected passes 48
That said patch is making the pre-existing
gdb.threads/step-over-exec.exp testcase (almost silently) expose a
latent problem in gdb/linux-nat.c, resulting in a GDB crash when:
#1 - a non-leader thread execs
#2 - the post-exec program stops somewhere
#3 - you kill the inferior
Instead of #3 directly, the testcase just returns, which ends up in
gdb_exit, tearing down GDB, which kills the inferior, and is thus
equivalent to #3 above.
Vis (after said patch is applied):
$ gdb --args ./gdb /home/pedro/gdb/build/gdb/testsuite/outputs/gdb.threads/step-over-exec/step-over-exec-execr-thread-other-diff-text-segs-true
...
(top-gdb) r
...
(gdb) b main
...
(gdb) r
...
Breakpoint 1, main (argc=1, argv=0x7fffffffdb88) at /home/pedro/gdb/build/gdb/testsuite/../../../src/gdb/testsuite/gdb.threads/step-over-exec.c:69
69 argv0 = argv[0];
(gdb) c
Continuing.
[New Thread 0x7ffff7d89700 (LWP 2506975)]
Other going in exec.
Exec-ing /home/pedro/gdb/build/gdb/testsuite/outputs/gdb.threads/step-over-exec/step-over-exec-execr-thread-other-diff-text-segs-true-execd
process 2506769 is executing new program: /home/pedro/gdb/build/gdb/testsuite/outputs/gdb.threads/step-over-exec/step-over-exec-execr-thread-other-diff-text-segs-true-execd
Thread 1 "step-over-exec-" hit Breakpoint 1, main () at /home/pedro/gdb/build/gdb/testsuite/../../../src/gdb/testsuite/gdb.threads/step-over-exec-execd.c:28
28 foo ();
(gdb) k
...
Thread 1 "gdb" received signal SIGSEGV, Segmentation fault.
0x000055555574444c in thread_info::has_pending_waitstatus (this=0x0) at ../../src/gdb/gdbthread.h:393
393 return m_suspend.waitstatus_pending_p;
(top-gdb) bt
#0 0x000055555574444c in thread_info::has_pending_waitstatus (this=0x0) at ../../src/gdb/gdbthread.h:393
#1 0x0000555555a884d1 in get_pending_child_status (lp=0x5555579b8230, ws=0x7fffffffd130) at ../../src/gdb/linux-nat.c:1345
#2 0x0000555555a8e5e6 in kill_unfollowed_child_callback (lp=0x5555579b8230) at ../../src/gdb/linux-nat.c:3564
#3 0x0000555555a92a26 in gdb::function_view<int (lwp_info*)>::bind<int, lwp_info*>(int (*)(lwp_info*))::{lambda(gdb::fv_detail::erased_callable, lwp_info*)#1}::operator()(gdb::fv_detail::erased_callable, lwp_info*) const (this=0x0, ecall=..., args#0=0x5555579b8230) at ../../src/gdb/../gdbsupport/function-view.h:284
#4 0x0000555555a92a51 in gdb::function_view<int (lwp_info*)>::bind<int, lwp_info*>(int (*)(lwp_info*))::{lambda(gdb::fv_detail::erased_callable, lwp_info*)#1}::_FUN(gdb::fv_detail::erased_callable, lwp_info*) () at ../../src/gdb/../gdbsupport/function-view.h:278
#5 0x0000555555a91f84 in gdb::function_view<int (lwp_info*)>::operator()(lwp_info*) const (this=0x7fffffffd210, args#0=0x5555579b8230) at ../../src/gdb/../gdbsupport/function-view.h:247
#6 0x0000555555a87072 in iterate_over_lwps(ptid_t, gdb::function_view<int (lwp_info*)>) (filter=..., callback=...) at ../../src/gdb/linux-nat.c:864
#7 0x0000555555a8e732 in linux_nat_target::kill (this=0x55555653af40 <the_amd64_linux_nat_target>) at ../../src/gdb/linux-nat.c:3590
#8 0x0000555555cfdc11 in target_kill () at ../../src/gdb/target.c:911
...
The root of the problem is that when a non-leader LWP execs, it just
changes its tid to the tgid, replacing the pre-exec leader thread,
becoming the new leader. There's no thread exit event for the execing
thread. It's as if the old pre-exec LWP vanishes without trace. The
ptrace man page says:
"PTRACE_O_TRACEEXEC (since Linux 2.5.46)
Stop the tracee at the next execve(2). A waitpid(2) by the
tracer will return a status value such that
status>>8 == (SIGTRAP | (PTRACE_EVENT_EXEC<<8))
If the execing thread is not a thread group leader, the thread
ID is reset to thread group leader's ID before this stop.
Since Linux 3.0, the former thread ID can be retrieved with
PTRACE_GETEVENTMSG."
When the core of GDB processes an exec events, it deletes all the
threads of the inferior. But, that is too late -- deleting the thread
does not delete the corresponding LWP, so we end leaving the pre-exec
non-leader LWP stale in the LWP list. That's what leads to the crash
above -- linux_nat_target::kill iterates over all LWPs, and after the
patch in question, that code will look for the corresponding
thread_info for each LWP. For the pre-exec non-leader LWP still
listed, won't find one.
This patch fixes it, by deleting the pre-exec non-leader LWP (and
thread) from the LWP/thread lists as soon as we get an exec event out
of ptrace.
GDBserver does not need an equivalent fix, because it is already doing
this, as side effect of mourning the pre-exec process, in
gdbserver/linux-low.cc:
else if (event == PTRACE_EVENT_EXEC && cs.report_exec_events)
{
...
/* Delete the execing process and all its threads. */
mourn (proc);
switch_to_thread (nullptr);
The crash with gdb.threads/step-over-exec.exp is not observable on
newer systems, which postdate the glibc change to move "libpthread.so"
internals to "libc.so.6", because right after the exec, GDB traps a
load event for "libc.so.6", which leads to GDB trying to open
libthread_db for the post-exec inferior, and, on such systems that
succeeds. When we load libthread_db, we call
linux_stop_and_wait_all_lwps, which, as the name suggests, stops all
lwps, and then waits to see their stops. While doing this, GDB
detects that the pre-exec stale LWP is gone, and deletes it.
If we use "catch exec" to stop right at the exec before the
"libc.so.6" load event ever happens, and issue "kill" right there,
then GDB crashes on newer systems as well. So instead of tweaking
gdb.threads/step-over-exec.exp to cover the fix, add a new
gdb.threads/threads-after-exec.exp testcase that uses "catch exec".
The test also uses the new "maint info linux-lwps" command if testing
on Linux native, which also exposes the stale LWP problem with an
unfixed GDB.
Also tweak a comment in infrun.c:follow_exec referring to how
linux-nat.c used to behave, as it would become stale otherwise.
Reviewed-By: Andrew Burgess <aburgess@redhat.com>
Change-Id: I21ec18072c7750f3a972160ae6b9e46590376643
|
||
|
|
1bd70cb9f8 |
rs6000, Fix Linux DWARF register mapping
Overview of issues fixed by the patch.
The primary issue this patch fixes is the DWARF register mapping for
Linux. The changes in ppc-linux-tdep.c fix the DWARF register mapping
issues. The register mapping issue is responsible for two of the
five regression bugs seen in gdb.base/store.exp.
Once the register mapping was fixed, an underlying issue with the unwinding
of the signal trampoline in common-code in ifrun.c was found. This
underlying bug is best described by Ulrich in the following description.
The unwinder bug shows up on platforms where the kernel uses a trampoline
to dispatch "calls to" the signal handler (not just *returns from* the
signal handler). Many platforms use a trampoline for signal return, and
that is working fine, but the only platform I'm (Ulrich) aware of that
uses a trampoline for signal handler calls is (recent kernels for)
PowerPC. I believe the rationale for using a trampoline here
is to improve performance by avoiding unbalancing of the
branch predictor's call/return stack.
However, on PowerPC the bug is dormant as well as it is hidden
by *another* bug that prevents correct unwinding out of the
signal trampoline. This is because the custom CFI for the
trampoline uses a register number (VSCR) that is not ever used
by compiler-generated CFI, and that particular register is
mapped to an invalid number by the current PowerPC DWARF mapper.
The underlying unwinder bug is exposed by the "new" regression failures
in gdb.base/sigstep.exp. These failures were previously masked by
the fact that GDB was not seeing a valid frame when it tried to unwind
the frames. The sigstep.exp test is specifically testing stepping into
a signal handler. With the correct DWARF register mapping in place,
specifically the VSCR mapping, the signal trampoline code now unwinds to a
valid frame exposing the pre-existing bug in how the signal handler on
PowerPC works. The one line change infrun.c fixes the exiting bug in
the common-code for platforms that use a trampoline to dispatch calls
to the signal handler by not stopping in the SIGTRAMP_FRAME.
Detailed description of the DWARF register mapping fix.
The PowerPC DWARF register mapping is the same for the .eh_frame and
.debug_frame on Linux. PowerPC uses different mapping for .eh_frame and
.debug_frame on other operating systems. The current GDB support for
mapping the DWARF registers in rs6000_linux_dwarf2_reg_to_regnum and
rs6000_adjust_frame_regnum file gdb/rs6000-tdep.c is not correct for Linux.
The files have some legacy mappings for spe_acc, spefscr, EV which was
removed from GCC in 2017.
This patch adds a two new functions rs6000_linux_dwarf2_reg_to_regnum,
and rs6000_linux_adjust_frame_regnum in file gdb/ppc-linux-tdep.c to handle
the DWARF register mappings on Linux. Function
rs6000_linux_dwarf2_reg_to_regnum is installed for both gdb_dwarf_to_regnum
and gdbarch_stab_reg_to_regnum since the mappings are the same.
The ppc_linux_init_abi function in gdb/ppc-linux-tdep.c is updated to
call set_gdbarch_dwarf2_reg_to_regnum map the new function
rs6000_linux_dwarf2_reg_to_regnum for the architecture. Similarly,
dwarf2_frame_set_adjust_regnum is called to map
rs6000_linux_adjust_frame_regnum into the architecture.
Additional detail on the signal handling fix.
The specific sequence of events for handling a signal on most
architectures is as follows:
1) Some code is running when a signal arrives.
2) The kernel handles the signal and dispatches to the handler.
...
However on PowerPC the sequence of events is:
1) Some code is running when a signal arrives.
2) The kernel handles the signal and dispatches to the trampoline.
3) The trampoline performs a normal function call to the handler.
...
We want the "nexti" to step into, not over, signal handlers invoked by
the kernel. This is the case for most platforms as the kernel puts a
signal trampoline frame onto the stack to handle proper return after the
handler. However, on some platforms such as PowerPC, the kernel actually
uses a trampoline to handle *invocation* of the handler. We do not
want GDB to stop in the SIGTRAMP_FRAME. The issue is fixed in function
process_event_stop_test by adding a check that the frame is not a
SIGTRAMP_FRAME to the if statement to stop in a subroutine call. This
prevents GDB from erroneously detecting the trampoline invocation as a
subroutine call.
This patch fixes two regression test failures in gdb.base/store.exp.
The patch then fixes an exposed, dormant, signal handling issue that
is exposed in the signal handling test gdb.base/sigstep.exp.
The patch has been tested on Power 8 LE/BE, Power 9 LE/BE, Power 10 with
no new regressions. Note, only two of the five failures in store.exp
are fixed. The remaining three failures are fixed in a following
patch.
|
||
|
|
99d9c3b92c |
gdb: remove target_gdbarch
This function is just a wrapper around the current inferior's gdbarch. I find that having that wrapper just obscures where the arch is coming from, and that it's often used as "I don't know which arch to use so I'll use this magical target_gdbarch function that gets me an arch" when the arch should in fact come from something in the context (a thread, objfile, symbol, etc). I think that removing it and inlining `current_inferior ()->arch ()` everywhere will make it a bit clearer where that arch comes from and will trigger people into reflecting whether this is the right place to get the arch or not. Change-Id: I79f14b4e4934c88f91ca3a3155f5fc3ea2fadf6b Reviewed-By: John Baldwin <jhb@FreeBSD.org> Approved-By: Andrew Burgess <aburgess@redhat.com> |
||
|
|
27b1f19f8f |
gdb: add inferior::{arch, set_arch}
Make the inferior's gdbarch field private, and add getters and setters. This helped me by allowing putting breakpoints on set_arch to know when the inferior's arch was set. A subsequent patch in this series also adds more things in set_arch. Change-Id: I0005bd1ef4cd6b612af501201cec44e457998eec Reviewed-By: John Baldwin <jhb@FreeBSD.org> Approved-By: Andrew Burgess <aburgess@redhat.com> |
||
|
|
9324bfeab9 |
gdb: remove the silent parameter from exit_inferior_1 and cleanup
After the previous commit, exit_inferior_1 no longer makes use of the silent parameter. This commit removes this parameter and cleans up the callers. After doing this exit_inferior_1, exit_inferior, and exit_inferior_silent are all equivalent, so rename exit_inferior_1 to exit_inferior and delete exit_inferior_silent, update all the callers. Also I spotted the declaration exit_inferior_num_silent in inferior.h, but this function is not defined anywhere, so I deleted the declaration. There should be no user visible changes after this commit. |
||
|
|
05e1cac249 |
gdb: fix vfork regressions when target-non-stop is off
It was pointed out on the mailing list[1] that after this commit: commit |
||
|
|
b1c0ab2080 |
gdb: avoid double stop after failed breakpoint condition check
This commit replaces this earlier commit:
commit
|