Compare commits

...

10 Commits

Author SHA1 Message Date
Pedro Alves
a99047e605 Make scoped_restore_current_thread's cdtors exception free (RFC)
If the remote target closes while we're reading registers/memory for
restoring the selected frame in scoped_restore_current_thread's dtor,
the corresponding TARGET_CLOSE_ERROR error is swallowed by the
scoped_restore_current_thread's dtor, because letting exceptions
escape from a dtor is bad.  It isn't great to lose that errors like
that, though.  I've been thinking about how to avoid it, and I came up
with this patch.

The idea here is to make scoped_restore_current_thread's dtor do as
little as possible, to avoid any work that might throw in the first
place.  And to do that, instead of having the dtor call
restore_selected_frame, which re-finds the previously selected frame,
just record the frame_id/level of the desired selected frame, and have
get_selected_frame find the frame the next time it is called.  In
effect, this implements most of Cagney's suggestion, here:

  /* On demand, create the selected frame and then return it.  If the
     selected frame can not be created, this function prints then throws
     an error.  When MESSAGE is non-NULL, use it for the error message,
     otherwize use a generic error message.  */
  /* FIXME: cagney/2002-11-28: At present, when there is no selected
     frame, this function always returns the current (inner most) frame.
     It should instead, when a thread has previously had its frame
     selected (but not resumed) and the frame cache invalidated, find
     and then return that thread's previously selected frame.  */
  extern struct frame_info *get_selected_frame (const char *message);

The only thing missing to fully implement that would be to make
reinit_frame_cache just clear selected_frame instead of calling
select_frame(NULL), and the call select_frame(NULL) explicitly in the
places where we really wanted reinit_frame_cache to go back to the
current frame too.  That can done separately, though, I'm not
proposing to do that in this patch.

Note that this patch renames restore_selected_frame to
lookup_selected_frame, and adds a new restore_selected_frame function
that doesn't throw, to be paired with the also-new save_selected_frame
function.

lookup_selected_frame should really move from thread.c to frame.c, but
I didn't do that here, just to avoid churn in the patch while it
collects comments.  I did make it extern and declared it in frame.h
already, preparing for the move.  I will do the move as a follow up
patch if people agree with this approach.

Incidentally, this patch alone would fix the crashes fixed by the
previous patches in the series, because with this,
scoped_restore_current_thread's constructor doesn't throw either.

gdb/ChangeLog:

	* blockframe.c (block_innermost_frame): Use get_selected_frame.
	* frame.c
	(scoped_restore_selected_frame::scoped_restore_selected_frame):
	Use save_selected_frame.  Save language as well.
	(scoped_restore_selected_frame::~scoped_restore_selected_frame):
	Use restore_selected_frame, and restore language as well.
	(selected_frame_id, selected_frame_level): New.
	(selected_frame): Update comments.
	(save_selected_frame, restore_selected_frame): New.
	(get_selected_frame): Use lookup_selected_frame.
	(get_selected_frame_if_set): Delete.
	(select_frame): Record selected_frame_level and selected_frame_id.
	* frame.h (scoped_restore_selected_frame) <m_level, m_lang>: New
	fields.
	(get_selected_frame_if_set): Delete declaration.
	(select_frame): Update comments.
	(save_selected_frame, restore_selected_frame)
	(lookup_selected_frame): Declare.
	* gdbthread.h (scoped_restore_current_thread) <m_lang>: New field.
	* stack.c (select_frame_command_core, frame_command_core): Use
	get_selected_frame.
	* thread.c (restore_selected_frame): Rename to ...
	(lookup_selected_frame): ... this and make extern.  Select the
	current frame if the frame level is -1.
	(scoped_restore_current_thread::restore): Also restore the
	language.
	(scoped_restore_current_thread::~scoped_restore_current_thread):
	Don't try/catch.
	(scoped_restore_current_thread::scoped_restore_current_thread):
	Save the language as well.  Use save_selected_frame.
2020-07-09 12:13:11 +01:00
Pedro Alves
1ad36a4b89 Fix crash if connection drops in scoped_restore_current_thread's ctor, part 2
Running the testsuite against an Asan-enabled build of GDB makes
gdb.base/multi-target.exp expose this bug.

scoped_restore_current_thread's ctor calls get_frame_id to record the
selected frame's ID to restore later.  If the frame ID hasn't been
computed yet, it will be computed on the spot, and that will usually
require accessing the target's memory and registers.  If the remote
connection closes, while we're computing the frame ID, the remote
target exits its inferiors, unpushes itself, and throws a
TARGET_CLOSE_ERROR error.  Exiting the inferiors deletes the
inferior's threads.

scoped_restore_current_thread increments the current thread's refcount
to prevent the thread from being deleted from under its feet.
However, the code that does that isn't considering the case of the
thread being deleted from within get_frame_id.  It only increments the
refcount _after_ get_frame_id returns.  So if the current thread is
indeed deleted, the

     tp->incref ();

statement references a stale TP pointer.

Incrementing the refcounts earlier fixes it.

We should probably also let the TARGET_CLOSE_ERROR error propagate in
this case.  That alone would fix it, though it seems better to tweak
the refcount handling too.

gdb/ChangeLog:

	* thread.c
	(scoped_restore_current_thread::scoped_restore_current_thread):
	Incref the thread before calling get_frame_id instead of after.
	Let TARGET_CLOSE_ERROR propagate.
2020-07-09 00:29:08 +01:00
Pedro Alves
84c63c819e Fix crash if connection drops in scoped_restore_current_thread's ctor, part 1
Running the testsuite against an Asan-enabled build of GDB makes
gdb.base/multi-target.exp expose this bug.

scoped_restore_current_thread's ctor calls get_frame_id to record the
selected frame's ID to restore later.  If the frame ID hasn't been
computed yet, it will be computed on the spot, and that will usually
require accessing the target's memory and registers, which requires
remote accesses.  If the remote connection closes while we're
computing the frame ID, the remote target exits its inferiors,
unpushes itself, and throws a TARGET_CLOSE_ERROR error.

If that happens, GDB can currently crash, here:

> ==18555==ERROR: AddressSanitizer: heap-use-after-free on address 0x621004670aa8 at pc 0x0000007ab125 bp 0x7ffdecaecd20 sp 0x7ffdecaecd10
> READ of size 4 at 0x621004670aa8 thread T0
>     #0 0x7ab124 in dwarf2_frame_this_id src/binutils-gdb/gdb/dwarf2/frame.c:1228
>     #1 0x983ec5 in compute_frame_id src/binutils-gdb/gdb/frame.c:550
>     #2 0x9841ee in get_frame_id(frame_info*) src/binutils-gdb/gdb/frame.c:582
>     #3 0x1093faa in scoped_restore_current_thread::scoped_restore_current_thread() src/binutils-gdb/gdb/thread.c:1462
>     #4 0xaee5ba in fetch_inferior_event(void*) src/binutils-gdb/gdb/infrun.c:3968
>     #5 0xaa990b in inferior_event_handler(inferior_event_type, void*) src/binutils-gdb/gdb/inf-loop.c:43
>     #6 0xea61b6 in remote_async_serial_handler src/binutils-gdb/gdb/remote.c:14161
>     #7 0xefca8a in run_async_handler_and_reschedule src/binutils-gdb/gdb/ser-base.c:137
>     #8 0xefcd23 in fd_event src/binutils-gdb/gdb/ser-base.c:188
>     #9 0x15a7416 in handle_file_event src/binutils-gdb/gdbsupport/event-loop.cc:548
>     #10 0x15a7c36 in gdb_wait_for_event src/binutils-gdb/gdbsupport/event-loop.cc:673
>     #11 0x15a5dbb in gdb_do_one_event() src/binutils-gdb/gdbsupport/event-loop.cc:215
>     #12 0xbfe62d in start_event_loop src/binutils-gdb/gdb/main.c:356
>     #13 0xbfe935 in captured_command_loop src/binutils-gdb/gdb/main.c:416
>     #14 0xc01d39 in captured_main src/binutils-gdb/gdb/main.c:1253
>     #15 0xc01dc9 in gdb_main(captured_main_args*) src/binutils-gdb/gdb/main.c:1268
>     #16 0x414ddd in main src/binutils-gdb/gdb/gdb.c:32
>     #17 0x7f590110b82f in __libc_start_main ../csu/libc-start.c:291
>     #18 0x414bd8 in _start (build/binutils-gdb/gdb/gdb+0x414bd8)

What happens is that above, we're in dwarf2_frame_this_id, just after
the dwarf2_frame_cache call.  The "cache" variable that the
dwarf2_frame_cache function returned is already stale.  It's been
released here, from within the dwarf2_frame_cache:

(top-gdb) bt
#0  reinit_frame_cache () at src/gdb/frame.c:1855
#1  0x00000000014ff7b0 in switch_to_no_thread () at src/gdb/thread.c:1301
#2  0x0000000000f66d3e in switch_to_inferior_no_thread (inf=0x615000338180) at src/gdb/inferior.c:626
#3  0x00000000012f3826 in remote_unpush_target (target=0x6170000c5900) at src/gdb/remote.c:5521
#4  0x00000000013097e0 in remote_target::readchar (this=0x6170000c5900, timeout=2) at src/gdb/remote.c:9137
#5  0x000000000130be4d in remote_target::getpkt_or_notif_sane_1 (this=0x6170000c5900, buf=0x6170000c5918, forever=0, expecting_notif=0, is_notif=0x0) at src/gdb/remote.c:9683
#6  0x000000000130c8ab in remote_target::getpkt_sane (this=0x6170000c5900, buf=0x6170000c5918, forever=0) at src/gdb/remote.c:9790
#7  0x000000000130bc0d in remote_target::getpkt (this=0x6170000c5900, buf=0x6170000c5918, forever=0) at src/gdb/remote.c:9623
#8  0x000000000130838e in remote_target::remote_read_bytes_1 (this=0x6170000c5900, memaddr=0x7fffffffcdc0, myaddr=0x6080000ad3bc "", len_units=64, unit_size=1, xfered_len_units=0x7fff6a29b9a0) at src/gdb/remote.c:8860
#9  0x0000000001308bd2 in remote_target::remote_read_bytes (this=0x6170000c5900, memaddr=0x7fffffffcdc0, myaddr=0x6080000ad3bc "", len=64, unit_size=1, xfered_len=0x7fff6a29b9a0) at src/gdb/remote.c:8987
#10 0x0000000001311ed1 in remote_target::xfer_partial (this=0x6170000c5900, object=TARGET_OBJECT_MEMORY, annex=0x0, readbuf=0x6080000ad3bc "", writebuf=0x0, offset=140737488342464, len=64, xfered_len=0x7fff6a29b9a0) at src/gdb/remote.c:10988
#11 0x00000000014ba969 in raw_memory_xfer_partial (ops=0x6170000c5900, readbuf=0x6080000ad3bc "", writebuf=0x0, memaddr=140737488342464, len=64, xfered_len=0x7fff6a29b9a0) at src/gdb/target.c:918
#12 0x00000000014bb720 in target_xfer_partial (ops=0x6170000c5900, object=TARGET_OBJECT_RAW_MEMORY, annex=0x0, readbuf=0x6080000ad3bc "", writebuf=0x0, offset=140737488342464, len=64, xfered_len=0x7fff6a29b9a0) at src/gdb/target.c:1148
#13 0x00000000014bc3b5 in target_read_partial (ops=0x6170000c5900, object=TARGET_OBJECT_RAW_MEMORY, annex=0x0, buf=0x6080000ad3bc "", offset=140737488342464, len=64, xfered_len=0x7fff6a29b9a0) at src/gdb/target.c:1380
#14 0x00000000014bc593 in target_read (ops=0x6170000c5900, object=TARGET_OBJECT_RAW_MEMORY, annex=0x0, buf=0x6080000ad3bc "", offset=140737488342464, len=64) at src/gdb/target.c:1419
#15 0x00000000014bbd4d in target_read_raw_memory (memaddr=0x7fffffffcdc0, myaddr=0x6080000ad3bc "", len=64) at src/gdb/target.c:1252
#16 0x0000000000bf27df in dcache_read_line (dcache=0x6060001eddc0, db=0x6080000ad3a0) at src/gdb/dcache.c:336
#17 0x0000000000bf2b72 in dcache_peek_byte (dcache=0x6060001eddc0, addr=0x7fffffffcdd8, ptr=0x6020001231b0 "") at src/gdb/dcache.c:403
#18 0x0000000000bf3103 in dcache_read_memory_partial (ops=0x6170000c5900, dcache=0x6060001eddc0, memaddr=0x7fffffffcdd8, myaddr=0x6020001231b0 "", len=8, xfered_len=0x7fff6a29bf20) at src/gdb/dcache.c:484
#19 0x00000000014bafe9 in memory_xfer_partial_1 (ops=0x6170000c5900, object=TARGET_OBJECT_STACK_MEMORY, readbuf=0x6020001231b0 "", writebuf=0x0, memaddr=140737488342488, len=8, xfered_len=0x7fff6a29bf20) at src/gdb/target.c:1034
#20 0x00000000014bb212 in memory_xfer_partial (ops=0x6170000c5900, object=TARGET_OBJECT_STACK_MEMORY, readbuf=0x6020001231b0 "", writebuf=0x0, memaddr=140737488342488, len=8, xfered_len=0x7fff6a29bf20) at src/gdb/target.c:1076
#21 0x00000000014bb6b3 in target_xfer_partial (ops=0x6170000c5900, object=TARGET_OBJECT_STACK_MEMORY, annex=0x0, readbuf=0x6020001231b0 "", writebuf=0x0, offset=140737488342488, len=8, xfered_len=0x7fff6a29bf20) at src/gdb/target.c:1133
#22 0x000000000164564d in read_value_memory (val=0x60f000029440, bit_offset=0, stack=1, memaddr=0x7fffffffcdd8, buffer=0x6020001231b0 "", length=8) at src/gdb/valops.c:956
#23 0x0000000001680fff in value_fetch_lazy_memory (val=0x60f000029440) at src/gdb/value.c:3764
#24 0x0000000001681efd in value_fetch_lazy (val=0x60f000029440) at src/gdb/value.c:3910
#25 0x0000000001676143 in value_optimized_out (value=0x60f000029440) at src/gdb/value.c:1411
#26 0x0000000000e0fcb8 in frame_register_unwind (next_frame=0x6210066bfde0, regnum=16, optimizedp=0x7fff6a29c200, unavailablep=0x7fff6a29c240, lvalp=0x7fff6a29c2c0, addrp=0x7fff6a29c300, realnump=0x7fff6a29c280, bufferp=0x7fff6a29c3a0 "@\304)j\377\177") at src/gdb/frame.c:1144
#27 0x0000000000e10418 in frame_unwind_register (next_frame=0x6210066bfde0, regnum=16, buf=0x7fff6a29c3a0 "@\304)j\377\177") at src/gdb/frame.c:1196
#28 0x0000000000f00431 in i386_unwind_pc (gdbarch=0x6210043d0110, next_frame=0x6210066bfde0) at src/gdb/i386-tdep.c:1969
#29 0x0000000000e39724 in gdbarch_unwind_pc (gdbarch=0x6210043d0110, next_frame=0x6210066bfde0) at src/gdb/gdbarch.c:3056
#30 0x0000000000c2ea90 in dwarf2_tailcall_sniffer_first (this_frame=0x6210066bfde0, tailcall_cachep=0x6210066bfee0, entry_cfa_sp_offsetp=0x0) at src/gdb/dwarf2/frame-tailcall.c:423
#31 0x0000000000c36bdb in dwarf2_frame_cache (this_frame=0x6210066bfde0, this_cache=0x6210066bfdf8) at src/gdb/dwarf2/frame.c:1198
#32 0x0000000000c36eb3 in dwarf2_frame_this_id (this_frame=0x6210066bfde0, this_cache=0x6210066bfdf8, this_id=0x6210066bfe40) at src/gdb/dwarf2/frame.c:1226

Note that remote_target::readchar in frame #3 throws
TARGET_CLOSE_ERROR after that remote_unpush_target in frame #3
returns.

The problem is that that TARGET_CLOSE_ERROR is swallowed by
value_optimized_out in frame #25.

If we fix that one, then we run into dwarf2_tailcall_sniffer_first
swallowing the exception in frame #30 too.

The attached patch fixes it by making those spots swallow fewer kinds
of errors.

gdb/ChangeLog:

	* frame-tailcall.c (dwarf2_tailcall_sniffer_first): Only swallow
	NO_ENTRY_VALUE_ERROR.
	* value.c (value_optimized_out): Only swallow OPTIMIZED_OUT_ERROR.
2020-07-08 15:43:02 +01:00
Simon Marchi
6f07d888c0 Fix GDB busy loop when interrupting non-stop program (PR 26199)
When interrupting a program in non-stop, the program gets interrupted
correctly, but GDB busy loops (the event loop is always woken up).

Here is how to reproduce it:

 1. Start GDB: ./gdb -nx --data-directory=data-directory -ex "set non-stop 1" --args  /bin/sleep 60
 2. Run the program with "run"
 3. Interrupt with ^C.
 4. Look into htop, see GDB taking 100% CPU

Debugging `handle_file_event`, we see that the event source that wakes
up the event loop is the linux-nat one:

 (top-gdb) p file_ptr.proc
 $5 = (handler_func *) 0xb9cccd <handle_target_event(int, gdb_client_data)>
				 ^^^^^^^^^^^^^^^^^^^
					 |
					 \-- the linux-nat callback

Debugging fetch_inferior_event and do_target_wait, we see that we
don't actually call `wait` on the linux-nat target, because
inferior_matches returns false:

 auto inferior_matches = [&wait_ptid] (inferior *inf)
   {
     return (inf->process_target () != NULL
	     && (threads_are_executing (inf->process_target ())
		 || threads_are_resumed_pending_p (inf))
	     && ptid_t (inf->pid).matches (wait_ptid));
   };

because `threads_are_executing` is false.

What happens is:

 1. User types ctrl-c, that writes in the linux-nat pipe, waking up
    the event source.

 2. linux-nat's wait gets called, the SIGINT event is returned, but
    before returning, it marks the pipe again, in order for wait to
    get called again:

    /* If we requested any event, and something came out, assume there
       may be more.  If we requested a specific lwp or process, also
       assume there may be more.  */
    if (target_is_async_p ()
	&& ((ourstatus->kind != TARGET_WAITKIND_IGNORE
	     && ourstatus->kind != TARGET_WAITKIND_NO_RESUMED)
	    || ptid != minus_one_ptid))
      async_file_mark ();

 3. The SIGINT event is handled, the program is stopped, the stop
    notification is printed.

 4. The event loop is woken up again because of the `async_file_mark`
    of step 2.

 5. Because `inferior_matches` returns false, we never call
    linux-nat's wait, so the pipe stays readable.

 6. Goto 4.

Pedro says:

This commit fixes it by letting do_target_wait call target_wait even
if threads_are_executing is false.  This will normally result in the
target returning TARGET_WAITKIND_NO_RESUMED, and _not_ marking its
event source again.  This results in infrun only calling into the
target only once (i.e., breaking the busy loop).

Note that the busy loop bug didn't trigger in all-stop mode because
all-stop handles this by unregistering the target from the event loop
as soon as it was all stopped -- see
inf-loop.c:inferior_event_handler's INF_EXEC_COMPLETE handling.  If we
remove that non-stop check from inferior_event_handler, and replace
the target_has_execution check for threads_are_executing instead, it
also fixes the issue for non-stop.  I considered that as the final
solution, but decided that the solution proposed here instead is just
simpler and more future-proof design.  With the
TARGET_WAITKIND_NO_RESUMED handling fixes done in the previous
patches, I think it should be possible to always keep the target
registered in the event loop, meaning we could eliminate the
target_async(0) call from inferior_event_handler as well as most of
the target_async(1) calls in the target backends.  That would allow in
the future e.g., the remote target reporting asynchronous
notifications even if all threads are stopped.  I haven't attempted
that, though.

gdb/ChangeLog:
yyyy-mm-dd  Simon Marchi  <simon.marchi@polymtl.ca>
	    Pedro Alves  <pedro@palves.net>

	PR gdb/26199
	* infrun.c (threads_are_resumed_pending_p): Delete.
	(do_target_wait): Remove threads_are_executing and
	threads_are_resumed_pending_p checks from the inferior_matches
	lambda.  Update comments.
2020-07-06 19:58:21 +01:00
Pedro Alves
419e460add Testcase for previous handle_no_resumed fixes
This adds a testcase that covers the scenarios described in the
previous two commits.

gdb/testsuite/ChangeLog:

	PR gdb/26199
	* gdb.multi/multi-target.c (exit_thread): New.
	(thread_start): Break loop if EXIT_THREAD.
	* gdb.multi/multi-target.exp (test_no_unwaited_for): New proc.
	(top level) Call test_no_resumed.
2020-07-06 19:58:09 +01:00
Pedro Alves
367aff81be Make handle_no_resumed transfer terminal
Let's consider the same use case as in the previous commit:

Say you have two inferiors 1 and 2, each connected to a different
target, A and B.

Now say you set inferior 2 running, with "continue &".

Now you select a thread of inferior 1, say thread 1.2, and continue in
the foreground.  All other threads of inferior 1 are left stopped.
Thread 1.2 exits, and thus target A has no other resumed thread, so it
reports TARGET_WAITKIND_NO_RESUMED.

At this point, because the threads of inferior 2 are still executing
the TARGET_WAITKIND_NO_RESUMED event is ignored.

Now, the user types Ctrl-C.  Because GDB had previously put inferior 1
in the foreground, the kernel sends the SIGINT to that inferior.
However, no thread in that inferior is executing right now, so ptrace
never intercepts the SIGINT -- it is never dequeued by any thread.
The result is that GDB's CLI is stuck.  There's no way to get back the
prompt (unless inferior 2 happens to report some event).

The fix in this commit is to make handle_no_resumed give the terminal
to some other inferior that still has threads executing so that a
subsequent Ctrl-C reaches that target first (and then GDB intercepts
the SIGINT).  This is a bit hacky, but seems like the best we can do
with the current design.

I think that putting all native inferiors in their own session would
help fixing this in a clean way, since with that a Ctrl-C on GDB's
terminal will _always_ reach GDB first, and then GDB can decide how to
pause the inferior.  But that's a much larger change.

The testcase added by the following patch needs this fix.

gdb/ChangeLog:

	PR gdb/26199
	* infrun.c (handle_no_resumed): Transfer terminal to inferior with
	executing threads.
2020-07-06 19:57:57 +01:00
Pedro Alves
85dae34612 Fix handle_no_resumed w/ multiple targets
handle_no_resumed is currently not considering multiple targets.

Say you have two inferiors 1 and 2, each connected to a different
target, A and B.

Now say you set inferior 2 running, with "continue &".

Now you select a thread of inferior 1, say thread 1.2, and continue in
the foreground.  All other threads of inferior 1 are left stopped.
Thread 1.2 exits, and thus target A has no other resumed thread, so it
reports TARGET_WAITKIND_NO_RESUMED.

At this point, if both inferiors were running in the same target,
handle_no_resumed would realize that threads of inferior 2 are still
executing, so the TARGET_WAITKIND_NO_RESUMED event should be ignored.
But because handle_no_resumed only walks the threads of the current
target, it misses noticing that threads of inferior 2 are still
executing.  The fix is just to walk over all threads of all targets.

A testcase covering the use case above will be added in a following
patch.  It can't be added yet because it depends on yet another fix to
handle_no_resumed not included here.

gdb/ChangeLog:

	PR gdb/26199
	* infrun.c (handle_no_resumed): Handle multiple targets.
2020-07-06 19:57:46 +01:00
Pedro Alves
f7bf4732bb Avoid constant stream of TARGET_WAITKIND_NO_RESUMED
If we hit the synchronous execution command case described by
handle_no_resumed, and handle_no_resumed determines that the event
should be ignored, because it found a thread that is executing, we end
up in prepare_to_wait.

There, if the current target is not registered in the event loop right
now, we call mark_infrun_async_event_handler.  With that event handler
marked, the event loop calls again into fetch_inferior_event, which
calls target_wait, which returns TARGET_WAITKIND_NO_RESUMED, and we
end up in handle_no_resumed, again ignoring the event and marking
infrun_async_event_handler.  The result is that GDB is now always
keeping the CPU 100% busy in this loop, even though it continues to be
able to react to input and to real target events, because we still go
through the event-loop.

The problem is that marking of the infrun_async_event_handler in
prepare_to_wait.  That is there to handle targets that don't support
asynchronous execution.  So the correct predicate is whether async
execution is supported, not whether the target is async right now.

gdb/ChangeLog:

	PR gdb/26199
	* infrun.c (prepare_to_wait): Check target_can_async_p instead of
	target_is_async_p.
2020-07-06 19:57:34 +01:00
Pedro Alves
ce502e47c0 Fix latent bug in target_pass_ctrlc
We were checking the thr->executing of an exited thread.

gdb/ChangeLog:

	PR gdb/26199
	* target.c (target_pass_ctrlc): Looking at the inferiors
	non-exited threads, not all threads.
2020-07-06 19:57:20 +01:00
Pedro Alves
14b98a10ca Fix spurious unhandled remote %Stop notifications
In non-stop mode, remote targets mark an async event source whose
callback is supposed to result in calling remote_target::wait_ns to
either process the event queue, or acknowledge an incoming %Stop
notification.

The callback in question is remote_async_inferior_event_handler, where
we call inferior_event_handler, to end up in fetch_inferior_event ->
target_wait -> remote_target::wait -> remote_target::wait_ns.

A problem here however is that when debugging multiple targets,
fetch_inferior_event can pull events out of any target picked at
random, for event fairness.  This means that when
remote_async_inferior_event_handler returns, remote_target::wait may
have not been called at all, and thus pending notifications may have
not been acked.  Because async event sources auto-clear, when
remote_async_inferior_event_handler returns the async event handler is
no longer marked, so the event loop won't automatically call
remote_async_inferior_event_handler again to try to process the
pending remote notifications/queue.  The result is that stop events
may end up not processed, e.g., "interrupt -a" seemingly not managing
to stop all threads.

Fix this by making remote_async_inferior_event_handler mark the event
handler again before returning, if necessary.

Maybe a better fix would be to make async event handlers not
auto-clear themselves, make that the responsibility of the callback,
so that the event loop would keep calling the callback automatically.
Or, we could try making so that fetch_inferior_event would optionally
handle events only for the target that it got passed down via
parameter.  However, I don't think now just before branching is the
time to try to do any such change.

gdb/ChangeLog:

	PR gdb/26199
	* remote.c (remote_target::open_1): Pass remote target pointer as
	data to create_async_event_handler.
	(remote_async_inferior_event_handler): Mark async event handler
	before returning if the remote target still has either pending
	events or unacknowledged notifications.
2020-07-04 13:02:11 +01:00
13 changed files with 343 additions and 129 deletions

View File

@@ -448,14 +448,10 @@ find_gnu_ifunc_target_type (CORE_ADDR resolver_funaddr)
struct frame_info *
block_innermost_frame (const struct block *block)
{
struct frame_info *frame;
if (block == NULL)
return NULL;
frame = get_selected_frame_if_set ();
if (frame == NULL)
frame = get_current_frame ();
frame_info *frame = get_selected_frame (NULL);
while (frame != NULL)
{
const struct block *frame_block = get_frame_block (frame, NULL);

View File

@@ -377,7 +377,6 @@ dwarf2_tailcall_sniffer_first (struct frame_info *this_frame,
get_frame_address_in_block will decrease it by 1 in such case. */
this_pc = get_frame_address_in_block (this_frame);
/* Catch any unwinding errors. */
try
{
int sp_regnum;
@@ -439,7 +438,22 @@ dwarf2_tailcall_sniffer_first (struct frame_info *this_frame,
{
if (entry_values_debug)
exception_print (gdb_stdout, except);
return;
switch (except.error)
{
case NO_ENTRY_VALUE_ERROR:
/* Thrown by call_site_find_chain. */
case MEMORY_ERROR:
case OPTIMIZED_OUT_ERROR:
case NOT_AVAILABLE_ERROR:
/* These can normally happen when we try to access an
optimized out or unavailable register, either in a
physical register or spilled to memory. */
return;
}
/* Let unexpected errors propagate. */
throw;
}
/* Ambiguous unwind or unambiguous unwind verified as matching. */

View File

@@ -297,17 +297,15 @@ frame_stash_invalidate (void)
/* See frame.h */
scoped_restore_selected_frame::scoped_restore_selected_frame ()
{
m_fid = get_frame_id (get_selected_frame (NULL));
m_lang = current_language->la_language;
save_selected_frame (&m_fid, &m_level);
}
/* See frame.h */
scoped_restore_selected_frame::~scoped_restore_selected_frame ()
{
frame_info *frame = frame_find_by_id (m_fid);
if (frame == NULL)
warning (_("Unable to restore previously selected frame."));
else
select_frame (frame);
restore_selected_frame (m_fid, m_level);
set_language (m_lang);
}
/* Flag to control debugging. */
@@ -1641,10 +1639,51 @@ get_current_frame (void)
}
/* The "selected" stack frame is used by default for local and arg
access. May be zero, for no selected frame. */
access. */
/* The "single source of truth" for the selected frame is the
SELECTED_FRAME_ID / SELECTED_FRAME_LEVEL pair. Frame IDs can be
saved/restored across reinitializing the frame cache, while
frame_info pointers can't (frame_info objects are invalidated). If
we know the corresponding frame_info object, it is cached in
SELECTED_FRAME. If SELECTED_FRAME_ID / SELECTED_FRAME_LEVEL are
null_ptid / -1, and the target has stack and is stopped, the
selected frame is the current (innermost) frame, otherwise there's
no selected frame. */
static frame_id selected_frame_id = null_frame_id;
static int selected_frame_level = -1;
/* The cached frame_info object pointing to the selected frame.
Looked up on demand by get_selected_frame. */
static struct frame_info *selected_frame;
/* See frame.h. */
void
save_selected_frame (frame_id *frame_id, int *frame_level)
noexcept
{
*frame_id = selected_frame_id;
*frame_level = selected_frame_level;
}
/* See frame.h. */
void
restore_selected_frame (frame_id a_frame_id, int frame_level)
noexcept
{
/* save_selected_frame never returns level == 0, so we shouldn't see
it here either. */
gdb_assert (frame_level != 0);
selected_frame_id = a_frame_id;
selected_frame_level = frame_level;
/* Will be looked up latter by get_seleted_frame. */
selected_frame = nullptr;
}
int
has_stack_frames (void)
{
@@ -1682,24 +1721,14 @@ get_selected_frame (const char *message)
{
if (message != NULL && !has_stack_frames ())
error (("%s"), message);
/* Hey! Don't trust this. It should really be re-finding the
last selected frame of the currently selected thread. This,
though, is better than nothing. */
select_frame (get_current_frame ());
lookup_selected_frame (selected_frame_id, selected_frame_level);
}
/* There is always a frame. */
gdb_assert (selected_frame != NULL);
return selected_frame;
}
/* If there is a selected frame, return it. Otherwise, return NULL. */
struct frame_info *
get_selected_frame_if_set (void)
{
return selected_frame;
}
/* This is a variant of get_selected_frame() which can be called when
the inferior does not have a frame; in that case it will return
NULL instead of calling error(). */
@@ -1712,12 +1741,42 @@ deprecated_safe_get_selected_frame (void)
return get_selected_frame (NULL);
}
/* Select frame FI (or NULL - to invalidate the current frame). */
/* Select frame FI (or NULL - to invalidate the selected frame). */
void
select_frame (struct frame_info *fi)
{
selected_frame = fi;
selected_frame_level = frame_relative_level (fi);
if (selected_frame_level == 0)
{
/* Treat the current frame especially -- we want to always
save/restore it without warning, even if the frame ID changes
(see lookup_selected_frame). E.g.:
// The current frame is selected, the target had just stopped.
{
scoped_restore_selected_frame restore_frame;
some_operation_that_changes_the_stack ();
}
// scoped_restore_selected_frame's dtor runs, but the
// original frame_id can't be found. No matter whether it
// is found or not, we still end up with the now-current
// frame selected. Warning in lookup_selected_frame in this
// case seems pointless.
Also get_frame_id may access the target's registers/memory,
and thus skipping get_frame_id optimizes the common case.
Saving the selected frame this way makes get_selected_frame
and restore_current_frame return/re-select whatever frame is
the innermost (current) then. */
selected_frame_level = -1;
selected_frame_id = null_frame_id;
}
else
selected_frame_id = get_frame_id (fi);
/* NOTE: cagney/2002-05-04: FI can be NULL. This occurs when the
frame is being invalidated. */

View File

@@ -181,8 +181,14 @@ public:
private:
/* The ID of the previously selected frame. */
/* The ID and level of the previously selected frame. */
struct frame_id m_fid;
int m_level;
/* Save/restore the language as well, because selecting a frame
changes the current language to the frame's language if "set
language auto". */
enum language m_lang;
};
/* Methods for constructing and comparing Frame IDs. */
@@ -329,13 +335,33 @@ extern void reinit_frame_cache (void);
and then return that thread's previously selected frame. */
extern struct frame_info *get_selected_frame (const char *message);
/* If there is a selected frame, return it. Otherwise, return NULL. */
extern struct frame_info *get_selected_frame_if_set (void);
/* Select a specific frame. NULL, apparently implies re-select the
inner most frame. */
/* Select a specific frame. NULL implies re-select the inner most
frame. */
extern void select_frame (struct frame_info *);
/* Save the frame ID and frame level of the selected frame in FRAME_ID
and FRAME_LEVEL, to be restored later with restore_selected_frame.
This is preferred over getting the same info out of
get_selected_frame directly because this function does not create
the selected-frame's frame_info object if it hasn't been created
yet, and thus doesn't throw. */
extern void save_selected_frame (frame_id *frame_id, int *frame_level)
noexcept;
/* Restore selected frame as saved with save_selected_frame. Does not
try to find the corresponding frame_info object. Instead the next
call to get_selected_frame will look it up and cache the result.
This function does not throw, it is designed to be safe to called
from the destructors of RAII types. */
extern void restore_selected_frame (frame_id frame_id, int frame_level)
noexcept;
/* Lookup the frame_info object for the selected frame FRAME_ID /
FRAME_LEVEL and cache the result. If FRAME_LEVEL > 0 and the
originally selected frame isn't found, warn and select the
innermost (current) frame. */
extern void lookup_selected_frame (frame_id frame_id, int frame_level);
/* Given a FRAME, return the next (more inner, younger) or previous
(more outer, older) frame. */
extern struct frame_info *get_prev_frame (struct frame_info *);

View File

@@ -667,6 +667,10 @@ private:
frame_id m_selected_frame_id;
int m_selected_frame_level;
bool m_was_stopped;
/* Save/restore the language as well, because selecting a frame
changes the current language to the frame's language if "set
language auto". */
enum language m_lang;
};
/* Returns a pointer into the thread_info corresponding to

View File

@@ -3601,23 +3601,9 @@ do_target_wait_1 (inferior *inf, ptid_t ptid,
return event_ptid;
}
/* Returns true if INF has any resumed thread with a status
pending. */
static bool
threads_are_resumed_pending_p (inferior *inf)
{
for (thread_info *tp : inf->non_exited_threads ())
if (tp->resumed
&& tp->suspend.waitstatus_pending_p)
return true;
return false;
}
/* Wrapper for target_wait that first checks whether threads have
pending statuses to report before actually asking the target for
more events. Polls for events from all inferiors/targets. */
more events. Polls for events from all inferiors/targets. */
static bool
do_target_wait (ptid_t wait_ptid, execution_control_state *ecs, int options)
@@ -3625,20 +3611,18 @@ do_target_wait (ptid_t wait_ptid, execution_control_state *ecs, int options)
int num_inferiors = 0;
int random_selector;
/* For fairness, we pick the first inferior/target to poll at
random, and then continue polling the rest of the inferior list
starting from that one in a circular fashion until the whole list
is polled once. */
/* For fairness, we pick the first inferior/target to poll at random
out of all inferiors that may report events, and then continue
polling the rest of the inferior list starting from that one in a
circular fashion until the whole list is polled once. */
auto inferior_matches = [&wait_ptid] (inferior *inf)
{
return (inf->process_target () != NULL
&& (threads_are_executing (inf->process_target ())
|| threads_are_resumed_pending_p (inf))
&& ptid_t (inf->pid).matches (wait_ptid));
};
/* First see how many resumed inferiors we have. */
/* First see how many matching inferiors we have. */
for (inferior *inf : all_inferiors ())
if (inferior_matches (inf))
num_inferiors++;
@@ -3649,7 +3633,7 @@ do_target_wait (ptid_t wait_ptid, execution_control_state *ecs, int options)
return false;
}
/* Now randomly pick an inferior out of those that were resumed. */
/* Now randomly pick an inferior out of those that matched. */
random_selector = (int)
((num_inferiors * (double) rand ()) / (RAND_MAX + 1.0));
@@ -3658,7 +3642,7 @@ do_target_wait (ptid_t wait_ptid, execution_control_state *ecs, int options)
"infrun: Found %d inferiors, starting at #%d\n",
num_inferiors, random_selector);
/* Select the Nth inferior that was resumed. */
/* Select the Nth inferior that matched. */
inferior *selected = nullptr;
@@ -3670,7 +3654,7 @@ do_target_wait (ptid_t wait_ptid, execution_control_state *ecs, int options)
break;
}
/* Now poll for events out of each of the resumed inferior's
/* Now poll for events out of each of the matching inferior's
targets, starting from the selected one. */
auto do_wait = [&] (inferior *inf)
@@ -3680,8 +3664,8 @@ do_target_wait (ptid_t wait_ptid, execution_control_state *ecs, int options)
return (ecs->ws.kind != TARGET_WAITKIND_IGNORE);
};
/* Needed in all-stop+target-non-stop mode, because we end up here
spuriously after the target is all stopped and we've already
/* Needed in 'all-stop + target-non-stop' mode, because we end up
here spuriously after the target is all stopped and we've already
reported the stop to the user, polling for events. */
scoped_restore_current_thread restore_thread;
@@ -5068,23 +5052,77 @@ handle_no_resumed (struct execution_control_state *ecs)
have resumed threads _now_. In the example above, this removes
thread 3 from the thread list. If thread 2 was re-resumed, we
ignore this event. If we find no thread resumed, then we cancel
the synchronous command show "no unwaited-for " to the user. */
update_thread_list ();
the synchronous command and show "no unwaited-for " to the
user. */
for (thread_info *thread : all_non_exited_threads (ecs->target))
inferior *curr_inf = current_inferior ();
scoped_restore_current_thread restore_thread;
for (auto *target : all_non_exited_process_targets ())
{
if (thread->executing
|| thread->suspend.waitstatus_pending_p)
switch_to_target_no_thread (target);
update_thread_list ();
}
/* If:
- the current target has no thread executing, and
- the current inferior is native, and
- the current inferior is the one which has the terminal, and
- we did nothing,
then a Ctrl-C from this point on would remain stuck in the
kernel, until a thread resumes and dequeues it. That would
result in the GDB CLI not reacting to Ctrl-C, not able to
interrupt the program. To address this, if the current inferior
no longer has any thread executing, we give the terminal to some
other inferior that has at least one thread executing. */
bool swap_terminal = true;
/* Whether to ignore this TARGET_WAITKIND_NO_RESUMED event, or
whether to report it to the user. */
bool ignore_event = false;
for (thread_info *thread : all_non_exited_threads ())
{
if (swap_terminal && thread->executing)
{
/* There were no unwaited-for children left in the target at
some point, but there are now. Just ignore. */
if (thread->inf != curr_inf)
{
target_terminal::ours ();
switch_to_thread (thread);
target_terminal::inferior ();
}
swap_terminal = false;
}
if (!ignore_event
&& (thread->executing
|| thread->suspend.waitstatus_pending_p))
{
/* Either there were no unwaited-for children left in the
target at some point, but there are now, or some target
other than the eventing one has unwaited-for children
left. Just ignore. */
if (debug_infrun)
fprintf_unfiltered (gdb_stdlog,
"infrun: TARGET_WAITKIND_NO_RESUMED "
"(ignoring: found resumed)\n");
prepare_to_wait (ecs);
return 1;
ignore_event = true;
}
if (ignore_event && !swap_terminal)
break;
}
if (ignore_event)
{
switch_to_inferior_no_thread (curr_inf);
prepare_to_wait (ecs);
return 1;
}
/* Go ahead and report the event. */
@@ -8116,7 +8154,11 @@ prepare_to_wait (struct execution_control_state *ecs)
ecs->wait_some_more = 1;
if (!target_is_async_p ())
/* If the target can't async, emulate it by marking the infrun event
handler such that as soon as we get back to the event-loop, we
immediately end up in fetch_inferior_event again calling
target_wait. */
if (!target_can_async_p ())
mark_infrun_async_event_handler ();
}

View File

@@ -5605,7 +5605,7 @@ remote_target::open_1 (const char *name, int from_tty, int extended_p)
/* Register extra event sources in the event loop. */
rs->remote_async_inferior_event_token
= create_async_event_handler (remote_async_inferior_event_handler, NULL);
= create_async_event_handler (remote_async_inferior_event_handler, remote);
rs->notif_state = remote_notif_state_allocate (remote);
/* Reset the target state; these things will be queried either by
@@ -14164,6 +14164,19 @@ static void
remote_async_inferior_event_handler (gdb_client_data data)
{
inferior_event_handler (INF_REG_EVENT);
remote_target *remote = (remote_target *) data;
remote_state *rs = remote->get_remote_state ();
/* inferior_event_handler may have consumed an event pending on the
infrun side without calling target_wait on the REMOTE target, or
may have pulled an event out of a different target. Keep trying
for this remote target as long it still has either pending events
or unacknowledged notifications. */
if (rs->notif_state->pending_event[notif_client_stop.id] != NULL
|| !rs->stop_reply_queue.empty ())
mark_async_event_handler (rs->remote_async_inferior_event_token);
}
int

View File

@@ -1836,9 +1836,9 @@ trailing_outermost_frame (int count)
static void
select_frame_command_core (struct frame_info *fi, bool ignored)
{
struct frame_info *prev_frame = get_selected_frame_if_set ();
frame_info *prev_frame = get_selected_frame (NULL);
select_frame (fi);
if (get_selected_frame_if_set () != prev_frame)
if (get_selected_frame (NULL) != prev_frame)
gdb::observers::user_selected_context_changed.notify (USER_SELECTED_FRAME);
}
@@ -1857,10 +1857,9 @@ select_frame_for_mi (struct frame_info *fi)
static void
frame_command_core (struct frame_info *fi, bool ignored)
{
struct frame_info *prev_frame = get_selected_frame_if_set ();
frame_info *prev_frame = get_selected_frame (nullptr);
select_frame (fi);
if (get_selected_frame_if_set () != prev_frame)
if (get_selected_frame (nullptr) != prev_frame)
gdb::observers::user_selected_context_changed.notify (USER_SELECTED_FRAME);
else
print_selected_thread_frame (current_uiout, USER_SELECTED_FRAME);

View File

@@ -3274,7 +3274,7 @@ target_pass_ctrlc (void)
if (proc_target == NULL)
continue;
for (thread_info *thr : inf->threads ())
for (thread_info *thr : inf->non_exited_threads ())
{
/* A thread can be THREAD_STOPPED and executing, while
running an infcall. */

View File

@@ -26,12 +26,14 @@
static pthread_barrier_t barrier;
volatile int exit_thread;
static void *
thread_start (void *arg)
{
pthread_barrier_wait (&barrier);
while (1)
while (!exit_thread)
sleep (1);
return NULL;
}

View File

@@ -439,6 +439,77 @@ proc test_info_inferiors {multi_process} {
}
}
# Test that when there's a foreground execution command in progress, a
# TARGET_WAITKIND_NO_RESUMED for a particular target is ignored when
# other targets are still resumed.
proc test_no_resumed {} {
proc test_no_resumed_infs {inf_A inf_B} {
global gdb_prompt
if {![setup "off"]} {
untested "setup failed"
return
}
gdb_test "thread $inf_A.2" "Switching to thread $inf_A\.2 .*" \
"select thread of target A"
gdb_test_no_output "set scheduler-locking on"
gdb_test_multiple "continue &" "" {
-re "Continuing.*$gdb_prompt " {
pass $gdb_test_name
}
}
gdb_test "thread $inf_B.2" "Switching to thread $inf_B\.2 .*" \
"select thread of target B"
gdb_test "p exit_thread = 1" " = 1" \
"set the thread to exit on resumption"
# Wait 3 seconds. If we see any response from GDB, such as
# "No unwaited-for children left." it's a bug.
gdb_test_multiple "continue" "continue" {
-timeout 3
timeout {
pass $gdb_test_name
}
}
# Now stop the program (all targets).
send_gdb "\003"
gdb_test_multiple "" "send_gdb control C" {
-re "received signal SIGINT.*$gdb_prompt $" {
pass $gdb_test_name
}
}
gdb_test_multiple "info threads" "all threads stopped" {
-re "\\\(running\\\).*$gdb_prompt $" {
fail $gdb_test_name
}
-re "$gdb_prompt $" {
pass $gdb_test_name
}
}
}
# inferior 1 -> native
# inferior 2 -> extended-remote 1
# inferior 5 -> extended-remote 2
set inferiors {1 2 5}
foreach_with_prefix inf_A $inferiors {
foreach_with_prefix inf_B $inferiors {
if {$inf_A == $inf_B} {
continue
}
test_no_resumed_infs $inf_A $inf_B
}
}
}
# Make a core file with two threads upfront. Several tests load the
# same core file.
prepare_core
@@ -467,4 +538,9 @@ with_test_prefix "info-inferiors" {
}
}
# Test TARGET_WAITKIND_NO_RESUMED handling with multiple targets.
with_test_prefix "no-resumed" {
test_no_resumed
}
cleanup_gdbservers

View File

@@ -1325,20 +1325,25 @@ switch_to_thread (process_stratum_target *proc_target, ptid_t ptid)
switch_to_thread (thr);
}
static void
restore_selected_frame (struct frame_id a_frame_id, int frame_level)
/* See frame.h. */
void
lookup_selected_frame (struct frame_id a_frame_id, int frame_level)
{
struct frame_info *frame = NULL;
int count;
/* This means there was no selected frame. */
/* This either means there was no selected frame, or the selected
frame was the innermost (the current frame). */
if (frame_level == -1)
{
select_frame (NULL);
select_frame (get_current_frame ());
return;
}
gdb_assert (frame_level >= 0);
/* select_frame never saves 0 in SELECTED_FRAME_LEVEL, so we
shouldn't see it here. */
gdb_assert (frame_level > 0);
/* Restore by level first, check if the frame id is the same as
expected. If that fails, try restoring by frame id. If that
@@ -1409,22 +1414,14 @@ scoped_restore_current_thread::restore ()
&& target_has_stack
&& target_has_memory)
restore_selected_frame (m_selected_frame_id, m_selected_frame_level);
set_language (m_lang);
}
scoped_restore_current_thread::~scoped_restore_current_thread ()
{
if (!m_dont_restore)
{
try
{
restore ();
}
catch (const gdb_exception &ex)
{
/* We're in a dtor, there's really nothing else we can do
but swallow the exception. */
}
}
restore ();
if (m_thread != NULL)
m_thread->decref ();
@@ -1433,46 +1430,21 @@ scoped_restore_current_thread::~scoped_restore_current_thread ()
scoped_restore_current_thread::scoped_restore_current_thread ()
{
m_thread = NULL;
m_inf = current_inferior ();
m_inf->incref ();
m_lang = current_language->la_language;
if (inferior_ptid != null_ptid)
{
thread_info *tp = inferior_thread ();
struct frame_info *frame;
m_thread = inferior_thread ();
m_thread->incref ();
m_was_stopped = tp->state == THREAD_STOPPED;
if (m_was_stopped
&& target_has_registers
&& target_has_stack
&& target_has_memory)
{
/* When processing internal events, there might not be a
selected frame. If we naively call get_selected_frame
here, then we can end up reading debuginfo for the
current frame, but we don't generally need the debuginfo
at this point. */
frame = get_selected_frame_if_set ();
}
else
frame = NULL;
try
{
m_selected_frame_id = get_frame_id (frame);
m_selected_frame_level = frame_relative_level (frame);
}
catch (const gdb_exception_error &ex)
{
m_selected_frame_id = null_frame_id;
m_selected_frame_level = -1;
}
tp->incref ();
m_thread = tp;
m_was_stopped = m_thread->state == THREAD_STOPPED;
save_selected_frame (&m_selected_frame_id, &m_selected_frame_level);
}
m_inf->incref ();
else
m_thread = NULL;
}
/* See gdbthread.h. */

View File

@@ -1412,7 +1412,18 @@ value_optimized_out (struct value *value)
}
catch (const gdb_exception_error &ex)
{
/* Fall back to checking value->optimized_out. */
switch (ex.error)
{
case MEMORY_ERROR:
case OPTIMIZED_OUT_ERROR:
case NOT_AVAILABLE_ERROR:
/* These can normally happen when we try to access an
optimized out or unavailable register, either in a
physical register or spilled to memory. */
break;
default:
throw;
}
}
}