gdb: enable target_async around stop_all_threads call in process_initial_stop_replies

The following scenario hangs:

 - maint set target-non-stop on
 - `gdbserver --attach`
 - a multi-threaded program

For example:

Terminal 1:

    $ gnome-calculator&
    [1] 495731
    $ ../gdbserver/gdbserver --once --attach :1234 495731
    Attached; pid = 495731
    Listening on port 1234

Terminal 2:

    $ ./gdb -nx -q --data-directory=data-directory /usr/bin/gnome-calculator -ex "maint set target-non-stop on" -ex "tar rem :1234"
    Reading symbols from /usr/bin/gnome-calculator...
    (No debugging symbols found in /usr/bin/gnome-calculator)
    Remote debugging using :1234
    * hangs *

What happens is:

 - The protocol between gdb and gdbserver is in non-stop mode, but the
   user-visible behavior is all-stop
 - On connect, gdbserver sends one stop reply for one thread that is
   stops, the others stay running
 - In process_initial_stop_replies, gdb calls stop_all_threads to stop
   these other threads, because we are using the all-stop user-visible
   mode
 - stop_all_threads sends a stop request for all the running threads and
   then waits for resulting events
 - At this point, the remote target is in target_async(0) mode, which
   makes stop_all_threads not consider it for events
 - stop_all_threads loops indefinitely (it does not even block
   indefinitely, it is in an infinite busy loop) because there are no
   event sources.  wait_one_event returns a TARGET_WAITKIND_NO_RESUMED
   wait status.

Fix that by making the remote target async around the stop_all_threads
call.

I haven't implemented it because I'm not sure how to do it, but I think
it would be a good idea to have, in stop_all_threads / wait_one /
handle_one, an assert to check that if we are expecting one or more
event, then there are some targets that are in a state where they can
supply some events.  Otherwise, we'll necessarily be stuck in this
infinite loop, and it's probably due to a bug in GDB.  I'm not too sure
where to put this or how to express it though.  Perhaps in
stop_all_threads, here:

	  for (int i = 0; i < waits_needed; i++)
	    {
	      wait_one_event event = wait_one ();
	      *here*
	      if (handle_one (event))
		break;
	    }

If at that point, the returned event is TARGET_WAITKIND_NO_RESUMED,
there's a problem.  We expect some event, because we've asked some
threads to stop, but all targets are answering that they won't have any
events for us.  That's a contradiction, and a sign that something has
gone wrong.  It could perhaps event be:

    gdb_assert (event.ws.kind != TARGET_WAITKIND_NO_RESUMED);

in handle_one, as the idea is the same in prepare_for_detach.

A bit more sophisticated would be: we know which targets we are
expecting waits from, since we know which threads we have asked to
stop.  So if any of these targets returns TARGET_WAITKIND_NO_RESUMED,
something is fishy.

Add a test that tests attaching with gdbserver's --attach flag to a
multi-threaded program, and then connecting to it.  Without the fix, the
test reproduces the hang.

Change-Id: If6f6690a4887ca66693ef1af64791dda4c65f24f
This commit is contained in:
Simon Marchi
2021-06-07 13:15:05 -04:00
parent f08d6b8e02
commit abe8cab7cb
3 changed files with 117 additions and 1 deletions

View File

@@ -4609,7 +4609,15 @@ remote_target::process_initial_stop_replies (int from_tty)
the inferiors. */
if (!non_stop)
{
stop_all_threads ();
{
/* At this point, the remote target is not async. It needs to be for
the poll in stop_all_threads to consider events from it, so enable
it temporarily. */
gdb_assert (!this->is_async_p ());
SCOPE_EXIT { target_async (0); };
target_async (1);
stop_all_threads ();
}
/* If all threads of an inferior were already stopped, we
haven't setup the inferior yet. */

View File

@@ -0,0 +1,29 @@
#include <pthread.h>
#include <unistd.h>
static const int NTHREADS = 10;
static pthread_barrier_t barrier;
static void *
thread_func (void *p)
{
pthread_barrier_wait (&barrier);
return NULL;
}
int
main (void)
{
alarm (60);
pthread_t threads[NTHREADS];
pthread_barrier_init (&barrier, NULL, NTHREADS + 2);
for (int i = 0; i < NTHREADS; i++)
pthread_create (&threads[i], NULL, thread_func, NULL);
pthread_barrier_wait (&barrier);
for (int i = 0; i < NTHREADS; i++)
pthread_join (threads[i], NULL);
}

View File

@@ -0,0 +1,79 @@
# This testcase is part of GDB, the GNU debugger.
# Copyright 2021 Free Software Foundation, Inc.
# This program is free software; you can redistribute it and/or modify
# it under the terms of the GNU General Public License as published by
# the Free Software Foundation; either version 3 of the License, or
# (at your option) any later version.
#
# This program is distributed in the hope that it will be useful,
# but WITHOUT ANY WARRANTY; without even the implied warranty of
# MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the
# GNU General Public License for more details.
#
# You should have received a copy of the GNU General Public License
# along with this program. If not, see <http://www.gnu.org/licenses/>.
# Test attaching to a multi-threaded process using gdbserver's --attach flag.
load_lib gdbserver-support.exp
standard_testfile
if { [skip_gdbserver_tests] } {
return 0
}
if {![can_spawn_for_attach]} {
return 0
}
# Start the test program, attach to it using gdbserver's --attach flag, connect
# to it with GDB, check that what we see makes sense.
proc run_one_test { non-stop target-non-stop } {
save_vars { ::GDBFLAGS } {
# If GDB and GDBserver are both running locally, set the sysroot to avoid
# reading files via the remote protocol.
if { ![is_remote host] && ![is_remote target] } {
set ::GDBFLAGS "$::GDBFLAGS -ex \"set sysroot\""
}
if { [prepare_for_testing "failed to prepare" $::testfile $::srcfile \
{debug pthreads}] } {
return -1
}
}
# Make sure we're disconnected, in case we're testing with an
# extended-remote board, therefore already connected.
gdb_test "disconnect" ".*"
set target_exec [gdbserver_download_current_prog]
set test_spawn_id [spawn_wait_for_attach $::binfile]
set testpid [spawn_id_get_pid $test_spawn_id]
lassign [gdbserver_start "" "--attach $testpid"] unused gdbserver_address
gdb_test_no_output "set non-stop ${non-stop}"
gdb_test_no_output "maint set target-non-stop ${target-non-stop}"
gdb_target_cmd "remote" $gdbserver_address
# There should be 11 threads.
gdb_test "thread 11" "Switching to thread 11.*"
kill_wait_spawned_process $test_spawn_id
gdbserver_exit 0
}
foreach_with_prefix non-stop {0 1} {
foreach_with_prefix target-non-stop {0 1} {
# This combination does not make sense.
if { ${non-stop} == 1 && ${target-non-stop} == 0} {
continue
}
run_one_test ${non-stop} ${target-non-stop}
}
}