I would like to use gdb::parallel_for_each to implement the parallelism
of the DWARF unit indexing. However, the existing implementation of
gdb::parallel_for_each is blocking, which doesn't fit the
asynchronous, callback-based model used by the DWARF indexer.
Add an asynchronous version of gdb::parallel_for_each that will be
suitable for this task.
This new version accepts a callback that is invoked when the parallel
for each is complete.
This function uses the same strategy as gdb::task_group to invoke the
"done" callback: worker threads have a shared_ptr reference to some
object. The last worker thread to drop its reference causes the object
to be deleted, which invokes the callback.
Unlike for the sync version of gdb::parallel_for_each, it's not possible
to keep any state in the calling thread's stack, because that disappears
immediately after starting the workers. So all the state is kept in
that same shared object.
There is a limitation that the sync version doesn't have, regarding the
arguments you can pass to the worker objects: it's not possible to rely
on references. There are more details in a comment in the code.
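To illustrate the mechanism, here is a minimal sketch of the
completion trick, using plain std::thread instead of GDB's thread
pool.  All names are illustrative, not the actual gdbsupport API:

  #include <memory>
  #include <thread>
  #include <utility>

  /* Sketch: the "done" callback runs when the last worker drops its
     reference to the shared state.  */

  template<typename Work, typename Done>
  void
  spawn_workers_sketch (int n_threads, Work work, Done done)
  {
    struct shared_state
    {
      explicit shared_state (Done d) : m_done (std::move (d)) {}

      /* Run by whichever worker drops the last reference; this is
         what invokes the "done" callback.  */
      ~shared_state () { m_done (); }

      Done m_done;
    };

    auto state = std::make_shared<shared_state> (std::move (done));

    for (int i = 0; i < n_threads; ++i)
      /* Each detached worker copies STATE, keeping it alive.  */
      std::thread ([state, work, i] () { work (i); }).detach ();

    /* The caller returns immediately, so its stack frame is gone
       while the workers run -- hence all state lives in the shared
       object, and WORK must not capture references to locals.  */
  }

Note that no worker ever calls the callback explicitly; destruction
order of the shared_ptr copies takes care of it.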
It would be possible to implement the sync version of
gdb::parallel_for_each on top of the async version, but I decided not to
do it to avoid the unnecessary dynamic allocation of the shared object,
and to avoid adding the limitations on passing references I mentioned
just above. But if we judge that it would be an acceptable cost to
avoid the duplication, we could do it.
Add a self test for the new function.
Change-Id: I6173defb1e09856d137c1aa05ad51cbf521ea0b0
Approved-By: Tom Tromey <tom@tromey.com>
In preparation for a following patch that will re-use the shared work
queue algorithm, move it to a separate class.
Change-Id: Id05cf8898a5d162048fa8fa056fbf7e0441bfb78
Approved-By: Tom Tromey <tom@tromey.com>
I think it would be convenient for parallel_for_each to pass an
iterator_range to the worker function, instead of separate begin and end
parameters. This allows using a range-based for loop directly.
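For illustration, a worker written against this kind of interface
could look as follows; iterator_range_sketch is a stand-in for gdb's
class, not the real thing:

  #include <vector>

  /* Minimal stand-in for gdb's iterator_range, just enough to show
     why a range-based for loop works in the worker.  */
  template<typename It>
  struct iterator_range_sketch
  {
    It m_begin, m_end;

    It begin () const { return m_begin; }
    It end () const { return m_end; }
  };

  /* A worker receiving a range instead of separate begin/end
     iterators.  */
  void
  worker (iterator_range_sketch<std::vector<int>::iterator> range)
  {
    for (int &item : range)   /* no explicit iterator handling */
      item *= 2;
  }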
Change-Id: I8f9681da65b0eb00b738379dfd2f4dc6fb1ee612
Approved-By: Tom Tromey <tom@tromey.com>
gdb::parallel_for_each uses static partitioning of the workload, meaning
that each worker thread receives a similar number of work items. Change
it to use dynamic partitioning, where worker threads pull work items
from a shared work queue when they need to.
Note that gdb::parallel_for_each is currently only used for processing
minimal symbols in GDB. I am looking at improving the startup
performance of GDB, where minimal symbol processing is one step.
With static partitioning, there is a risk of workload imbalance if some
threads receive "easier" work than others: some threads sit idle while
others are still working through their share. This is not desirable,
because gdb::parallel_for_each takes as long as the slowest thread.
When loading a file with a lot of minimal symbols (~600k) in GDB, with
"maint set per-command time on", I observe some imbalance:
Time for "minsyms install worker": wall 0.732, user 0.550, sys 0.041, user+sys 0.591, 80.7 % CPU
Time for "minsyms install worker": wall 0.881, user 0.722, sys 0.071, user+sys 0.793, 90.0 % CPU
Time for "minsyms install worker": wall 2.107, user 1.804, sys 0.147, user+sys 1.951, 92.6 % CPU
Time for "minsyms install worker": wall 2.351, user 2.003, sys 0.151, user+sys 2.154, 91.6 % CPU
Time for "minsyms install worker": wall 2.611, user 2.322, sys 0.235, user+sys 2.557, 97.9 % CPU
Time for "minsyms install worker": wall 3.074, user 2.729, sys 0.203, user+sys 2.932, 95.4 % CPU
Time for "minsyms install worker": wall 3.486, user 3.074, sys 0.260, user+sys 3.334, 95.6 % CPU
Time for "minsyms install worker": wall 3.927, user 3.475, sys 0.336, user+sys 3.811, 97.0 % CPU
^
----´
The fastest thread took 0.732 seconds to complete its work (and then sat
still), while the slowest took 3.927 seconds. This means the
parallel_for_each took a bit less than 4 seconds.
Even if the number of minimal symbols assigned to each worker is the
same, I suppose that some symbols (e.g. those that need demangling) take
longer to process, which could explain the imbalance.
With this patch, things are much more balanced:
Time for "minsym install worker": wall 2.807, user 2.222, sys 0.144, user+sys 2.366, 84.3 % CPU
Time for "minsym install worker": wall 2.808, user 2.073, sys 0.131, user+sys 2.204, 78.5 % CPU
Time for "minsym install worker": wall 2.804, user 1.994, sys 0.151, user+sys 2.145, 76.5 % CPU
Time for "minsym install worker": wall 2.808, user 1.977, sys 0.135, user+sys 2.112, 75.2 % CPU
Time for "minsym install worker": wall 2.808, user 2.061, sys 0.142, user+sys 2.203, 78.5 % CPU
Time for "minsym install worker": wall 2.809, user 2.012, sys 0.146, user+sys 2.158, 76.8 % CPU
Time for "minsym install worker": wall 2.809, user 2.178, sys 0.137, user+sys 2.315, 82.4 % CPU
Time for "minsym install worker": wall 2.820, user 2.141, sys 0.170, user+sys 2.311, 82.0 % CPU
^
----´
In this version, the parallel_for_each took about 2.8 seconds,
representing a reduction of ~1.2 seconds for this step. Not
life-changing, but it's still good I think.
Note that this patch helps when loading big programs. My go-to test
program for this is telegram-desktop that I built from source. For
small programs (including loading gdb itself), it makes no perceptible
difference.
Now the technical bits:
- One impact that this change has on the minimal symbol processing
specifically is that not all calls to compute_and_set_names (a
critical region guarded by a mutex) are done at the end of each
worker thread's task anymore.
Before this patch, each thread would compute the names and hash values for
all the minimal symbols it has been assigned, and then would call
compute_and_set_names for all of them, while holding the mutex (thus
preventing other threads from doing this same step).
With the shared work queue approach, each thread grabs a batch of
minimal symbols, computes the names and hash values for them, and
then calls compute_and_set_names (with the mutex held) for this batch
only. It then repeats that until the work queue is empty.
There are therefore more, smaller, spread-out compute_and_set_names
critical sections, instead of just one per worker thread at the end.
Given that before this patch the work was not well balanced among worker
threads, I guess that threads would enter that critical region at
different times, causing little contention.
In the "with this patch" results, the CPU utilization numbers are not
as good, suggesting that there is some contention. But I don't know
if it's contention due to the compute_and_set_names critical section
or the shared work queue critical section. That can be investigated
later. In any case, what ultimately counts is the wall time, which
improves.
- One choice I had to make was to decide how many work items (in this
case minimal symbols) each worker should pop when getting work from
the shared queue. The general wisdom is that:
- pop too few items, and the synchronization overhead becomes
significant, increasing the total processing time
- pop too many items, and we get some imbalance back, again
increasing the total processing time
I experimented using a dynamic batch size proportional to the number
of remaining work items. It worked well in some cases but not
always. So I decided to keep it simple, with a fixed batch size.
That can always be tweaked later.
- I want to still be able to use scoped_time_it to measure the time
that each worker thread spent working on the task. I find it really
handy when measuring the performance impact of changes.
Unfortunately, the current interface of gdb::parallel_for_each, which
receives a simple callback, is not well-suited for that, once I
introduce the dynamic partitioning. The callback would get called
once for each work item batch (multiple times per worker thread),
so it's not possible to maintain a per-worker thread object for the
duration of the parallel for.
To allow this, I changed gdb::parallel_for_each to receive a worker
type as a template parameter. Each worker thread creates one local
instance of that type, and calls operator() on it for each work item
batch. By having a scoped_time_it object as a field of that worker,
we can get the timings per worker thread.
The drawback of this approach is that we must now define the
parallel task in a separate class and manually capture any context we
need as fields of that class. A sketch of the queue and worker
interface follows this list.
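Here is a rough sketch of how these pieces fit together; the names
and the batch size are illustrative, not the actual gdbsupport code:

  #include <algorithm>
  #include <cstddef>
  #include <mutex>
  #include <utility>

  /* Sketch of the shared work queue: workers pull fixed-size batches
     under a mutex until the queue is empty.  */
  template<typename It>
  class work_queue_sketch
  {
  public:
    work_queue_sketch (It first, It last) : m_next (first), m_end (last) {}

    /* Pop up to BATCH_SIZE items; an empty range means "done".  */
    std::pair<It, It> pop_batch ()
    {
      std::lock_guard<std::mutex> guard (m_mutex);
      std::size_t n = std::min<std::size_t> (batch_size, m_end - m_next);
      It first = m_next;
      m_next += n;
      return { first, m_next };
    }

  private:
    /* Fixed batch size -- the "keep it simple" choice above.  */
    static constexpr std::size_t batch_size = 10;

    std::mutex m_mutex;
    It m_next, m_end;
  };

  /* Each worker thread creates one local instance of the Worker
     template parameter and calls it once per batch.  Per-thread
     state (e.g. a scoped_time_it member of Worker) therefore lives
     for the whole parallel for.  */
  template<typename Worker, typename It>
  void
  worker_thread_loop (work_queue_sketch<It> &queue)
  {
    Worker worker;
    for (;;)
      {
        std::pair<It, It> batch = queue.pop_batch ();
        if (batch.first == batch.second)
          break;
        worker (batch.first, batch.second);
      }
  }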
Change-Id: Ibf1fea65c91f76a95b9ed8f706fd6fa5ef52d9cf
Approved-By: Tom Tromey <tom@tromey.com>
I started working on this patch because I noticed that this
parallel_for_each test:
/* Check that if there are fewer tasks than threads, then we won't
end up with a null result. */
is not really checking anything. And then, this patch ended up with
several changes, leading to a general refactor of the whole file.
This test verifies, using std::all_of, that no entry in the intresults
vector is nullptr. However, nothing ever adds anything to intresults.
Since the vector is always empty, std::all_of is always true. This
state probably dates back to afdd136635 ("Back out some
parallel_for_each features"), which removed the ability for
parallel_for_each to return a vector of results. That commit removed
some tests, but left this one in, I'm guessing as an oversight.
One good idea in this test is to check that the worker never receives
empty ranges. I think we should always test for that. I think it's
also a good idea to test with exactly one item; that's a good edge case.
To achieve this without adding more code duplication, factor out
the core functionality of the test into yet another test_one function (I'm
running out of ideas for names). In there, check that the range
received by the worker is not empty. Doing this pointed out that the
worker is actually called with empty ranges in some cases, necessitating
some minor changes in parallel-for.h.
Then, instead of only checking that the sum of the ranges received by
worker functions is the right count, save the elements received as part
of those ranges (in a vector), and check that this vector contains each
expected element exactly once. This should make the test a bit more
robust (otherwise we could have the right number of calls, but on the
wrong items).
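A sketch of that strengthened check, with illustrative names rather
than the exact selftest code:

  #include <algorithm>
  #include <mutex>
  #include <vector>

  /* Collect every element the workers receive, then verify that the
     sorted result equals the (sorted) input, i.e. each element was
     seen exactly once.  SELF_CHECK is gdb's selftest assertion.  */
  std::mutex seen_mutex;
  std::vector<int> seen;

  auto worker = [&] (std::vector<int>::iterator first,
                     std::vector<int>::iterator last)
    {
      std::lock_guard<std::mutex> guard (seen_mutex);
      seen.insert (seen.end (), first, last);
    };

  /* ... run the parallel for over INPUT with WORKER ... */

  std::sort (seen.begin (), seen.end ());
  SELF_CHECK (seen == input);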
Then, a subsequent patch in this series changes the interface of
parallel_for_each to use iterator_range. The only hiccup is that it
doesn't really work if the "RandomIt" type of the parallel_for_each is
"int". iterator_range<int>::size wouldn't work, as std::distance
doesn't work on two ints. Fix this in the test right away by building
an std::vector<int> to use as input.
Finally, run the test with the default thread pool thread count in
addition to the currently tested counts of 0, 1 and 3. I'm thinking that it
doesn't hurt to test parallel_for_each in the configuration that it's
actually used with.
Change-Id: I5adf3d61e6ffe3bc249996660f0a34b281490d54
Approved-By: Tom Tromey <tom@tromey.com>
This value will likely never change at runtime, so we might as well make
it a template parameter. This has the "advantage" of being able to
remove the unnecessary param from gdb::sequential_for_each.
Change-Id: Ia172ab8e08964e30d4e3378a95ccfa782abce674
Approved-By: Tom Tromey <tom@tromey.com>
This updates the copyright headers to include 2025. I did this by
running gdb/copyright.py and then manually modifying a few files as
noted by the script.
Approved-By: Eli Zaretskii <eliz@gnu.org>
Eli mentioned [1] that given that we use US English spelling in our
documentation, we should use "behavior" instead of "behaviour".
In wikipedia-common-misspellings.txt there's a rule:
...
behavour->behavior, behaviour
...
which leaves this as a choice.
Add an overriding rule to hardcode the choice to common-misspellings.txt:
...
behavour->behavior
...
and add a rule to rewrite behaviour into behavior:
...
behaviour->behavior
...
and re-run spellcheck.sh on gdb*.
Tested on x86_64-linux.
[1] https://sourceware.org/pipermail/gdb-patches/2024-November/213371.html
This commit is the result of the following actions:
- Running gdb/copyright.py to update all of the copyright headers to
include 2024,
- Manually updating a few files the copyright.py script told me to
update, these files had copyright headers embedded within the
file,
- Regenerating gdbsupport/Makefile.in to refresh its copyright
date,
- Using grep to find other files that still mentioned 2023. If
these files were updated last year from 2022 to 2023 then I've
updated them this year to 2024.
I'm sure I've probably missed some dates. Feel free to fix them up as
you spot them.
Now that the DWARF reader does not use parallel_for_each, we can
remove some of the features that were added just for it: return values
and task sizing.
The thread_pool typed tasks feature could also be removed, but I
haven't done so here. This one seemed less intrusive and perhaps more
likely to be needed at some point.
Given that GDB now requires C++17, we can replace gdb::invoke_result
with std::invoke_result which is provided by <type_traits>.
This patch also removes gdbsupport/invoke-result.h as it is not used
anymore.
Change-Id: I7e567356d38d6b3d85d8797d61cfc83f6f933f22
Approved-By: Tom Tromey <tom@tromey.com>
Approved-By: Pedro Alves <pedro@palves.net>
I found that parallel_for_each would submit empty tasks to the thread
pool. For example, this can happen if the number of tasks is smaller
than the number of available threads. In the DWARF reader, this
resulted in the cooked index containing empty sub-indices. This patch
arranges to instead shrink the result vector and process the trailing
entries in the calling thread.
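A sketch of the trailing-entries part of that idea; names are
illustrative, not the actual parallel-for.h code:

  #include <cstddef>

  /* Post only non-empty chunks to the pool; whatever remains is
     processed directly in the calling thread, so no worker ever
     sees an empty range.  */
  template<class RandomIt, class Fn, class Pool>
  void
  parallel_chunks_sketch (RandomIt first, RandomIt last, Fn fn,
                          Pool &pool, int n_threads,
                          std::size_t elts_per_thread)
  {
    while (n_threads > 0
           && static_cast<std::size_t> (last - first) > elts_per_thread)
      {
        RandomIt end = first + elts_per_thread;
        pool.post_task ([=] () { fn (first, end); });
        first = end;
        n_threads--;
      }

    /* Trailing entries, including the "fewer tasks than threads"
       case where the loop above runs zero times.  */
    if (first != last)
      fn (first, last);
  }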
This commit is the result of running the gdb/copyright.py script,
which automated the update of the copyright year range for all
source files managed by the GDB project to be updated to include
year 2023.
When building with Clang 14 (using gcc 12 libstdc++ headers), I get:
CXX dwarf2/read.o
In file included from /home/simark/src/binutils-gdb/gdb/dwarf2/read.c:94:
/home/simark/src/binutils-gdb/gdb/../gdbsupport/parallel-for.h:142:21: error: 'result_of<(lambda at /home/simark/src/binutils-gdb/gdb/dwarf2/read.c:7124:5) (__gnu_debug::_Safe_iterator<__gnu_cxx::__normal_iterator<std::unique_ptr<dwarf2_per_cu_data, dwarf2_per_cu_data_deleter> *, std::__cxx1998::vector<std::unique_ptr<dwarf2_per_cu_data, dwarf2_per_cu_data_deleter>>>, std::vector<std::unique_ptr<dwarf2_per_cu_data, dwarf2_per_cu_data_deleter>>, std::random_access_iterator_tag>, __gnu_debug::_Safe_iterator<__gnu_cxx::__normal_iterator<std::unique_ptr<dwarf2_per_cu_data, dwarf2_per_cu_data_deleter> *, std::__cxx1998::vector<std::unique_ptr<dwarf2_per_cu_data, dwarf2_per_cu_data_deleter>>>, std::vector<std::unique_ptr<dwarf2_per_cu_data, dwarf2_per_cu_data_deleter>>, std::random_access_iterator_tag>)>' is deprecated: use 'std::invoke_result' instead [-Werror,-Wdeprecated-declarations]
= typename std::result_of<RangeFunction (RandomIt, RandomIt)>::type;
^
/home/simark/src/binutils-gdb/gdb/dwarf2/read.c:7122:14: note: in instantiation of function template specialization 'gdb::parallel_for_each<__gnu_debug::_Safe_iterator<__gnu_cxx::__normal_iterator<std::unique_ptr<dwarf2_per_cu_data, dwarf2_per_cu_data_deleter> *, std::__cxx1998::vector<std::unique_ptr<dwarf2_per_cu_data, dwarf2_per_cu_data_deleter>>>, std::vector<std::unique_ptr<dwarf2_per_cu_data, dwarf2_per_cu_data_deleter>>, std::random_access_iterator_tag>, (lambda at /home/simark/src/binutils-gdb/gdb/dwarf2/read.c:7124:5)>' requested here
= gdb::parallel_for_each (1, per_bfd->all_comp_units.begin (),
^
/usr/bin/../lib64/gcc/x86_64-pc-linux-gnu/12.1.1/../../../../include/c++/12.1.1/type_traits:2597:9: note: 'result_of<(lambda at /home/simark/src/binutils-gdb/gdb/dwarf2/read.c:7124:5) (__gnu_debug::_Safe_iterator<__gnu_cxx::__normal_iterator<std::unique_ptr<dwarf2_per_cu_data, dwarf2_per_cu_data_deleter> *, std::__cxx1998::vector<std::unique_ptr<dwarf2_per_cu_data, dwarf2_per_cu_data_deleter>>>, std::vector<std::unique_ptr<dwarf2_per_cu_data, dwarf2_per_cu_data_deleter>>, std::random_access_iterator_tag>, __gnu_debug::_Safe_iterator<__gnu_cxx::__normal_iterator<std::unique_ptr<dwarf2_per_cu_data, dwarf2_per_cu_data_deleter> *, std::__cxx1998::vector<std::unique_ptr<dwarf2_per_cu_data, dwarf2_per_cu_data_deleter>>>, std::vector<std::unique_ptr<dwarf2_per_cu_data, dwarf2_per_cu_data_deleter>>, std::random_access_iterator_tag>)>' has been explicitly marked deprecated here
{ } _GLIBCXX17_DEPRECATED_SUGGEST("std::invoke_result");
^
/usr/bin/../lib64/gcc/x86_64-pc-linux-gnu/12.1.1/../../../../include/c++/12.1.1/x86_64-pc-linux-gnu/bits/c++config.h:120:45: note: expanded from macro '_GLIBCXX17_DEPRECATED_SUGGEST'
# define _GLIBCXX17_DEPRECATED_SUGGEST(ALT) _GLIBCXX_DEPRECATED_SUGGEST(ALT)
^
/usr/bin/../lib64/gcc/x86_64-pc-linux-gnu/12.1.1/../../../../include/c++/12.1.1/x86_64-pc-linux-gnu/bits/c++config.h:96:19: note: expanded from macro '_GLIBCXX_DEPRECATED_SUGGEST'
__attribute__ ((__deprecated__ ("use '" ALT "' instead")))
^
It complains about the use of std::result_of, which is deprecated in
C++17 and removed in C++20:
https://en.cppreference.com/w/cpp/types/result_of
Given we'll have to transition to std::invoke_result eventually, make a
GDB wrapper to mimic std::invoke_result, which uses std::invoke_result
for C++ >= 17 and std::result_of otherwise. This way, it will be easy
to remove the wrapper in the future: just replace gdb:: with std::.
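Such a wrapper can be a simple conditional alias, along these lines
(a sketch of the approach, not necessarily the exact contents of
gdbsupport/invoke-result.h):

  #include <type_traits>

  namespace gdb
  {

  #if __cplusplus >= 201703L
  /* C++17 and later: defer to std::invoke_result.  */
  template<typename Fn, typename... Args>
  using invoke_result = std::invoke_result<Fn, Args...>;
  #else
  /* Pre-C++17: fall back to the deprecated std::result_of.  */
  template<typename Fn, typename... Args>
  using invoke_result = std::result_of<Fn (Args...)>;
  #endif

  } /* namespace gdb */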
Tested by building with gcc 12 in -std=c++11 and -std=c++17 mode, and
clang in -std=c++17 mode (I did not test fully with clang in -std=c++11
mode because there are other unrelated issues).
Change-Id: I50debde0a3307a7bc67fcf8fceefda51860efc1d
Add a task_size parameter to parallel_for_each, defaulting to nullptr, and use
the task size to distribute similarly-sized chunks to the threads.
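A sketch of what size-aware chunking amounts to; dispatch is a
hypothetical stand-in for handing a chunk to a worker thread:

  #include <cstddef>

  /* Cut a chunk each time the accumulated weight reaches a
     per-thread share of the total.  */
  template<class RandomIt, class SizeFn, class Dispatch>
  void
  chunk_by_size_sketch (RandomIt first, RandomIt last, SizeFn task_size,
                        Dispatch dispatch, std::size_t n_threads)
  {
    std::size_t total = 0;
    for (RandomIt it = first; it != last; ++it)
      total += task_size (it);
    std::size_t share = total / n_threads;

    RandomIt chunk_begin = first;
    std::size_t acc = 0;
    for (RandomIt it = first; it != last; ++it)
      {
        acc += task_size (it);
        if (acc >= share)
          {
            dispatch (chunk_begin, it + 1);
            chunk_begin = it + 1;
            acc = 0;
          }
      }
    if (chunk_begin != last)
      dispatch (chunk_begin, last);
  }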
Tested on x86_64-linux.
When I changed the initialization of parallel_for_each_debug from 0 to false,
I forgot to change the type from int to bool. Fix this.
Tested by rebuilding on x86_64-linux.
When running a task using parallel_for_each, we get the following
distribution:
...
Parallel for: n_elements: 7271
Parallel for: minimum elements per thread: 10
Parallel for: elts_per_thread: 1817
Parallel for: elements on worker thread 0 : 1817
Parallel for: elements on worker thread 1 : 1817
Parallel for: elements on worker thread 2 : 1817
Parallel for: elements on worker thread 3 : 0
Parallel for: elements on main thread : 1820
...
Note that there are 4 active threads, and scheduling elts_per_thread on each
of those handles 4 * 1817 == 7268, leaving 3 "left over" elements.
These leftovers are currently handled in the main thread.
That doesn't seem to matter much for this example, but for say 10 threads and
99 elements, you'd have 9 threads handling 9 elements and 1 thread handling 18
elements.
Instead, distribute the leftover elements over the worker threads, such that
we have:
...
Parallel for: elements on worker thread 0 : 1818
Parallel for: elements on worker thread 1 : 1818
Parallel for: elements on worker thread 2 : 1818
Parallel for: elements on worker thread 3 : 0
Parallel for: elements on main thread : 1817
...
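The arithmetic behind that distribution might look like this sketch,
where dispatch is a hypothetical stand-in for assigning a half-open
range of indices to a thread:

  #include <cstddef>

  /* The first LEFTOVER threads each take one extra element, so no
     single thread absorbs the whole remainder.  */
  template<class Dispatch>
  void
  distribute_sketch (std::size_t n_elements, std::size_t n_threads,
                     Dispatch dispatch)
  {
    std::size_t elts_per_thread = n_elements / n_threads;
    std::size_t leftover = n_elements % n_threads;

    std::size_t begin = 0;
    for (std::size_t i = 0; i < n_threads; ++i)
      {
        std::size_t count = elts_per_thread + (i < leftover ? 1 : 0);
        dispatch (begin, begin + count);
        begin += count;
      }
  }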
Tested on x86_64-linux.
Add a parallel_for_each_debug variable, set to false by default.
With an a.out compiled from hello world, we get with
parallel_for_each_debug == true:
...
$ gdb -q -batch a.out -ex start
...
Parallel for: n_elements: 7271
Parallel for: minimum elements per thread: 10
Parallel for: elts_per_thread: 1817
Parallel for: elements on worker thread 0 : 1817
Parallel for: elements on worker thread 1 : 1817
Parallel for: elements on worker thread 2 : 1817
Parallel for: elements on worker thread 3 : 0
Parallel for: elements on main thread : 1820
Temporary breakpoint 1, main () at /home/vries/hello.c:6
6 printf ("hello\n");
...
Tested on x86_64-linux.
Add a sequential_for_each alongside the parallel_for_each, which can be used
as a drop-in replacement.
This can be useful when debugging multi-threading behaviour, and you want to
limit multi-threading in a fine-grained way.
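A drop-in sequential variant can be as simple as this sketch,
assuming the (first, last, callback) shape parallel_for_each had at
the time:

  /* Process the whole range with a single callback invocation, in
     the calling thread.  Illustrative, not the exact gdbsupport
     code.  */
  template<class RandomIt, class RangeFunction>
  void
  sequential_for_each_sketch (RandomIt first, RandomIt last,
                              RangeFunction callback)
  {
    if (first != last)
      callback (first, last);
  }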
Tested on x86_64-linux, by using it instead of the parallel_for_each in
dwarf2_build_psymtabs_hard.
PR build/29110 points out that GDB fails to build on mingw when the
"win32" thread model is in use. It turns out that the Fedora cross
tools using the "posix" thread model, which somehow manages to support
std::future, whereas the win32 model does not.
While looking into this, I found that the configuring with
--disable-threading will also cause a build failure.
This patch fixes this build by introducing a compatibility wrapper for
std::future.
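One way to picture such a wrapper: on a threadless host, tasks run
eagerly, so the "future" merely stores the already-computed value.
A sketch, not the actual gdbsupport wrapper:

  #include <utility>

  /* Minimal stand-in for std::future on hosts without thread
     support.  */
  template<typename T>
  class future_sketch
  {
  public:
    explicit future_sketch (T value) : m_value (std::move (value)) {}

    /* The task already ran synchronously, so there is nothing to
       wait for.  */
    void wait () const {}
    T get () { return std::move (m_value); }

  private:
    T m_value;
  };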
I am not able to test the win32 thread model build, but I'm going to
ask the reporter to try this patch.
Bug: https://sourceware.org/bugzilla/show_bug.cgi?id=29110
When building with -std=c++11, I get:
In file included from /home/smarchi/src/binutils-gdb/gdb/unittests/parallel-for-selftests.c:22: /home/smarchi/src/binutils-gdb/gdb/../gdbsupport/parallel-for.h:134:10: error: ‘result_of_t’ is not a member of ‘std’; did you mean ‘result_of’?
134 | std::result_of_t<RangeFunction (RandomIt, RandomIt)>
| ^~~~~~~~~~~
| result_of
This is because result_of_t was only introduced in C++14. Use the
equivalent result_of<...>::type instead.
result_of and result_of_t have been removed in C++20 though, so I think
we'll need some patches eventually to make the code use invoke_result
instead, depending on the C++ version.
Change-Id: I4817f361c0ebcdd4b32976898fc368bb302b61b9
This changes gdb::parallel_for_each to return a vector of the results.
However, if the passed-in function returns void, the return type
remains 'void'. This functionality is used later, to parallelize the
new indexer.
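A sketch of that void/non-void dispatch, using C++17 "if constexpr"
for brevity (the original predates C++17 and achieved this
differently), with a single call standing in for the per-chunk calls:

  #include <type_traits>
  #include <vector>

  template<class RandomIt, class Fn>
  auto
  parallel_map_sketch (RandomIt first, RandomIt last, Fn fn)
  {
    using result_type = std::invoke_result_t<Fn, RandomIt, RandomIt>;

    if constexpr (std::is_void_v<result_type>)
      {
        /* FN returns void: just run it, return nothing.  */
        fn (first, last);
      }
    else
      {
        /* FN returns a value: collect one result per chunk.  */
        std::vector<result_type> results;
        results.push_back (fn (first, last));
        return results;
      }
  }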
parallel_for_each currently requires each thread to process at least
10 elements. However, when indexing, it's fine for a thread to handle
just a single CU. This patch parameterizes this, and updates the one
user.
thread-pool.h requires CXX_STD_THREAD in order to even be included.
However, there's no deep reason for this, and during review we found
that one patch in the new DWARF indexer series unconditionally
requires the thread pool.
Because the thread pool already allows a task to be run in the calling
thread (for example if it is configured to have no threads in the
pool), it seemed straightforward to make this code ok to use when host
threads aren't available at all.
This patch implements this idea. I built it on a thread-less host
(mingw, before my recent configure patch) and verified that the result
builds.
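Conceptually, the fallback amounts to this sketch (not the actual
thread-pool code; enqueue_on_worker is hypothetical):

  #include <functional>
  #include <utility>

  /* Run TASK on a worker thread when host threads exist, otherwise
     directly in the calling thread.  */
  void
  post_task_sketch (std::function<void ()> task)
  {
  #if CXX_STD_THREAD
    enqueue_on_worker (std::move (task));  /* hand off to a worker */
  #else
    task ();                               /* threadless host: run inline */
  #endif
  }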
After the thread-pool change, parallel-for.h no longer needs any
CXX_STD_THREAD checks at all, so this patch removes these as well.
This commit brings all the changes made by running gdb/copyright.py
as per GDB's Start of New Year Procedure.
For the avoidance of doubt, all changes in this commit were
performed by the script.
This commits the result of running gdb/copyright.py as per our Start
of New Year procedure...
gdb/ChangeLog
Update copyright year range in copyright header of all GDB files.
This patch moves the gdbsupport directory to the top level. This is
the next step in the ongoing project to move gdbserver to the top
level.
The bulk of this patch was created by "git mv gdb/gdbsupport gdbsupport".
This patch then adds a build system to gdbsupport and wires it into
the top level. Then it changes gdb to use the top-level build.
gdbserver, on the other hand, is not yet changed. It still does its
own build of gdbsupport.
ChangeLog
2020-01-14 Tom Tromey <tom@tromey.com>
* src-release.sh (GDB_SUPPORT_DIRS): Add gdbsupport.
* MAINTAINERS: Add gdbsupport.
* configure: Rebuild.
* configure.ac (configdirs): Add gdbsupport.
* gdbsupport: New directory, move from gdb/gdbsupport.
* Makefile.def (host_modules, dependencies): Add gnulib.
* Makefile.in: Rebuild.
gdb/ChangeLog
2020-01-14 Tom Tromey <tom@tromey.com>
* nat/x86-linux-dregs.c: Include configh.h.
* nat/linux-ptrace.c: Include configh.h.
* nat/linux-btrace.c: Include configh.h.
* defs.h: Include config.h, bfd.h.
* configure.ac: Don't source common.host.
(CONFIG_OBS, CONFIG_SRCS): Remove gdbsupport files.
* configure: Rebuild.
* acinclude.m4: Update path.
* Makefile.in (SUPPORT, LIBSUPPORT, INCSUPPORT): New variables.
(CONFIG_SRC_SUBDIR): Remove gdbsupport.
(INTERNAL_CFLAGS_BASE): Add INCSUPPORT.
(CLIBS): Add LIBSUPPORT.
(CDEPS): Likewise.
(COMMON_SFILES): Remove gdbsupport files.
(HFILES_NO_SRCDIR): Likewise.
(stamp-version): Update path to create-version.sh.
(ALLDEPFILES): Remove gdbsupport files.
gdb/gdbserver/ChangeLog
2020-01-14 Tom Tromey <tom@tromey.com>
* server.h: Include config.h.
* gdbreplay.c: Include config.h.
* configure: Rebuild.
* configure.ac: Don't source common.host.
* acinclude.m4: Update path.
* Makefile.in (INCSUPPORT): New variable.
(INCLUDE_CFLAGS): Add INCSUPPORT.
(SFILES): Update paths.
(version-generated.c): Update path to create-version.sh.
(gdbsupport/%-ipa.o, gdbsupport/%.o): Update paths.
gdbsupport/ChangeLog
2020-01-14 Tom Tromey <tom@tromey.com>
* common-defs.h: Add GDBSERVER case. Update includes.
* acinclude.m4, aclocal.m4, config.in, configure, configure.ac,
Makefile.am, Makefile.in, README: New files.
* Moved from ../gdb/gdbsupport/
Change-Id: I07632e7798635c1bab389bf885971e584fb4bb78