Infinity updates

In case you hadn’t noticed, I8C 0.0.1 was released last week. As well as the note compiler, this package contains I8X, an interpreter for unit testing compiled notes.

In case you had noticed and you’ve been working with it, please note there’s some compatibility-breaking source language and note format changes in the pipeline.

Infinity has a mailing list, and I want you to subscribe to it. Infinity isn’t just my project–anybody who might either produce or consume Infinity notes has a stake in this:

If you’re working on a debugger or other tool that might consume Infinity notes then you have a stake.
If you’re working on a library that might expose some API via Infinity notes then you have a stake.

I don’t know every issue that Infinity will have to solve, and you know things I need to know. So don’t sit on the sidelines! Send an empty message to infinity-subscribe@sourceware.org and get involved.

Infinity compiler update

I had hoped to make a first release of i8c today but I still have one open issue so you’ll have to wait until next week. Until then here’s some note source for you to ponder:

define test::factorial returns int
    argument int x
    extern func int (int) factorial

    swap
    dup
    load 1
    bne not_done_yet
    return

not_done_yet:
    dup
    load 1
    sub
    swap
    rot
    swap
    call
    mul

As hints I will a) point you at section 2.5 of the DWARF spec, and b) mention that argument and extern directives instruct the caller and runtime respectively to push entries onto the stack.

Your homework is to see how much of the language and runtime you can infer from this one function.

You may ask questions.

Infinity compiler

When I said I’d never written a compiler, that’s actually a lie: I did some work on the Ceylon compiler a few years back. It wasn’t a very enriching experience but at least I know what ASTs and visitors are.

Anyway, the C-like syntax I talked about Friday might have been a little optimistic. I wanted the source to not be full of stack-shuffling but that’s not going to happen. Partly because I’d been assuming that the problem of compiling to a stack-based instruction set was solved: I was thinking of Java, but of course Java has local variables so it’s not exactly a stack-based instruction set, it’s stack and 65,536 registers–kind of the opposite problem! I didn’t find much online about compiling to stack-based instruction sets other than “it’s hard”.

I could probably manage something, but given that the total output of the compiler for glibc will likely be a hundred or so bytes of instructions, it doesn’t make sense to spend a huge amount of time on it.

The Infinity function compiler is therefore going to be something of a helpful assembler.

Infinity function compiler

Yesterday I reworked the note format to allow functions as first-class objects, and then I had an idea about how to implement the note compiler.

I’d already thought I could have the compiler emit assembly to avoid having to think about endianness, but I couldn’t figure out how to do other platform-specific stuff like 32- vs 64-bit, or choosing which of the three LWP ID to thread descriptor mapping styles to use. My idea was to not only output to the assembler but to input from the C preprocessor as well. That way I can do all the conditional stuff in there, driven by glibc’s include files and CPPFLAGS.

Yeah, I think I can write a compiler that knows nothing about the platform it’s compiling for.

So, I spent all morning messing around with glibc’s build system. I’ve come up with this:

Each machine’s tls.h will have some Infinity definitions following the ones for libthread_db. Here’s the one for x86_64:

/* Magic for libthread_db to know how to do THREAD_SELF.  */
#define DB_THREAD_SELF_INCLUDE   /* For the FS constant.  */
#define DB_THREAD_SELF CONST_THREAD_AREA (64, FS)

/* Magic for Infinity to know how to do THREAD_SELF.  */
#define I8_THREAD_SELF I8_TS_CONST_THREAD_AREA
#define I8_TS_CTA_VALUE FS

I could probably generate the libthread_db ones from the Infinity ones… but… later.

glibc has a script to generate assembler include files, to get constants and offsets from C headers into the assembler. I wrote a tls-infinity.sym file that the build system turns into tls-infinity.h for me.
The source for map_lwp2thr is infinity_map_lwp2thr.i8. That includes tls-infinity.h and, on x86_64, conditionally includes infinity_lookup_th_unique_cta.i8 to perform the mapping in the correct way for that platform.
infinity_map_lwp2thr.i8 gets fed into the C preprocessor which a) does all the conditionality and b) expands all the constants (as a total bonus!)

This leaves me with this lovely file to write a compiler for.

I’ve never written a compiler.

Infinity progress update

Yesterday I had my first Infinity milestone. I have a glibc with notes implementing libthread_db’s td_ta_map_lwp2thr and a GDB that loads and executes them. I did a side-by-side test calling the Infinity libpthread::map_lwp2thr function every time libthread_db’s td_ta_map_lwp2thr was called and comparing the results… and it worked! You could see the DWARF if you’re interested.

I’ve had something of a look at all the five functions I need to implement now and I’m pretty sure I have a handle on what infrastructure is needed to support them. The one thing I’m missing at present is function calls (infinity-infinity and infinity-other), which I’ll likely handle with a new instruction (DW_OP_GNU_i8call or something). That’s probably the only extension I need. It’s definitely necessary because one of the arguments to td_ta_thr_iter is a function that td_ta_thr_iter then calls. Infinity basically needs first-class functions for this.

I also have a pretty good idea of how to hook it into GDB’s Linux thread_db code. I’ll write a bunch of functions in GDB’s common code with the same signatures as libthread_db’s functions and pass them to the existing thread_db code by pointer. Doing it that way means the existing thread_db code in GDB and gdbserver can be left unmodified. I’ll implement the bookkeeping stuff there too, probably using the thread_agent type to pass around whatever context I need.

At the moment I’m using GDB’s existing DWARF interpreter to execute the notes, but I may well end up writing a new one for it. Firstly because the existing interpreter is deeply embedded into the GDB side: it uses gdbarch, values, types, etc, none of which are currently in common code. And secondly because the existing interpreter is written to evaluate lots of different DWARF expressions once (or, a small number of times). It’s input is raw DWARF, which it decodes instruction-by-instruction. The use case for the Infinity interpreter is executing a small number of expressions lots of times, so some pre-analysis would be beneficial. The instructions could be expanded, with operands decoded and possibly translated, and common multi-instruction cases could be combined. Stuff like calls to builtin functions (ps_get_thread_area, ps_getpid) could be replaced with intrinsics. And, with some restrictions on the bytecode, I could pre-compute the stack depth and do one overflow check at function entry rather than overflow checks for every push. That kind of thing.

I’m expecting the addition of a second DWARF interpreter to GDB to be contentious, but they’ll optimized for different things and doing different things. For example, Infinity could work better with some type tracking (and will likely need it to make function calls secure) but it’s different from what GDB’s existing interpreter needs and it’s difficult to see how to combine the two without ending up with something that’s not very good at either. Not to mention that getting it to a point it can be moved to common code would likely slow it down a ton.

Anyway, that’s where I’m at.

td_ta_map_lwp2thr

To debug live processes on modern Linux GDB needs four libthread_db functions:

td_ta_map_lwp2thr (required for initial attach)
td_thr_get_info (required for initial attach)
td_thr_tls_get_addr (not required for initial attach, but required for “p errno” on regular executables)
td_thr_tlsbase (not required for initial attach, but required for “p errno” for -static -pthread executables)

To debug a corefile on modern Linux GDB needs one more libthread_db function:

td_ta_thr_iter

GDB makes some other libthread_db calls too, but these are bookkeeping that won’t be required with the replacement. So, the order of work will be:

Implement replacements for the four core functions.
Get those approved and committed in GDB, BFD and glibc (and in binutils, coreutils readelf).
Replace td_ta_thr_iter too, and get that committed.
Implement runtime-linker interface stuff to allow GDB to follow dlmopen.

The first (non-bookkeeping) function GDB calls is td_ta_map_lwp2thr and it’s a pig. If I can do td_ta_map_lwp2thr I can do anything.

When you call it, td_ta_map_lwp2thr has four ways it can proceed:

If __pthread_initialize_minimal has not gotten far enough we can’t rely on whatever’s in the thread registers. If this is the case, td_ta_map_lwp2thr checks that the LWP is the initial thread and sets th->th_unique to NULL. (Other bits of libthread_db spot this NULL and act accordingly.) td_ta_map_lwp2thr decides whether __pthread_initialize_minimal has gotten far enough by examining __stack_user.next in the inferior. If it’s NULL then __pthread_initialize_minimal has not gotten far enough.
On ta_howto_const_thread_area architectures (x86_64, aarch64, arm)
[glibc/sysdeps/*/nptl/tls.h has
#define DB_THREAD_SELF CONST_THREAD_AREA(bits, value)
which exports
const uint32_t _thread_db_const_thread_area = value;
from glibc/nptl_db/db_info.c]:
- td_ta_map_lwp2thr will call ps_get_thread_area with value
to set th->th_unique.

ps_get_thread_area (in GDB) does different things for different
architectures:
1. on x86_64, value is a register number (FS or GS)
  ps_get_thread_area returns the contents of that register.
2. on arm, GDB uses PTRACE_GET_THREAD_AREA, NULL and subtracts value from the result.
3. on aarch64, GDB uses PTRACE_GETREGSET, NT_ARM_TLS and subtracts value from the result.
On ta_howto_reg architectures (ppc*, s390*)
[glibc/sysdeps/*/nptl/tls.h has
  #define DB_THREAD_SELF REGISTER(bits, size, regofs, bias)...
which exports
  const uint32_t _thread_db_register32[3] = {size, regofs, bias};
and/or
  const uint32_t _thread_db_register64[3] = {size, regofs, bias};
from glibc/nptl_db/db_info.c]:

td_ta_map_lwp2thr will:
- call ps_lgetregs to get the inferior’s registers
- get the contents of the specified register (with _td_fetch_value_local)
  and
- SUBTRACT bias from the register’s contents
to set th->unique.
On ta_howto_reg_thread_area architectures (i386)
[glibc/sysdeps/*/nptl/tls.h has
  #define DB_THREAD_SELF REGISTER_THREAD_AREA(bits, size, regofs, bias)...
which exports
  const uint32_t _thread_db_register32_thread_area[3] = {size, regofs, bias};
and/or
  const uint32_t _thread_db_register64_thread_area[3] = {size, regofs, bias};
from glibc/nptl_db/db_info.c]:

td_ta_map_lwp2thr will:
- call ps_lgetregs to get the inferior’s registers
- get the contents of the specified register (with _td_fetch_value_local)
- RIGHT SHIFT the register’s contents by bias
  and
- call ps_get_thread_area with that number
to set th->unique.

ps_get_thread_area (in GDB) does different things for different
architectures:
1. on i386, GDB uses PTRACE_GET_THREAD_AREA, VALUE and returns the second element of the result.

Cases 2, 3, and 4 will obviously be hardwired into the specific architecture’s libpthread. But… yeah.

Infinity

I’m writing a replacement for libthread_db. It’s called Infinity.

Why? Because libthread_db is a pain in the ass for debuggers. GDB has to watch for inferiors loading thread libraries. It has to know that, for example, on GNU/Linux, when the inferior loads libpthread.so then GDB has to locate the corresponding libthread_db.so into itself and use that to inspect libpthread’s internal structures. How does GDB know where libthread_db is? It doesn’t, it has to search for it. How does it know, when it finds it, that the libthread_db it found is compatible with the libpthread the inferior loaded? It doesn’t, it has to load it to see, then unload it if it didn’t work. How does GDB know that the libthread_db it found is compatible with itself? It doesn’t, it has to load it and, erm, crash if it isn’t. How does GDB manage when the inferior (and its libthread_db) has a different ABI to GDB? Well, it doesn’t.

libthread_db means you can’t debug an application in a RHEL 6 container with a GDB in a RHEL 7 container. Probably. Not safely. Not without using gdbserver, anyway–and there’s no reason you should have to use gdbserver to debug what is essentially a native process.

So. Infinity. In Infinity, inspection functions for debuggers will be shipped as bytecode in ELF notes in the same file as the code they pertain to. libpthread.so, for example, will contain a bunch of Infinity notes, each representing some bit of functionality that GDB currently gets from libthread_db. When the inferior starts or loads libraries GDB will find the notes in the files it already loaded and register their functions. If GDB notices it has, for example, the full set of functions it requires for thread support then, boom, thread support switches on. This happens regardless of whether libpthread was dynamically or statically linked.

(If you’re using gdbserver, gdbserver gives GDB a list of Infinity functions it’s interested in. When GDB finds these functions it fires the (slightly rewritten) bytecode over to gdbserver and gdbserver takes it from there.)

Concrete things I have are: a bytecode format (but not the bytecode itself), an executable with a couple of handwritten notes (with some junk where the bytecode should be), a readelf that can decode the notes, a BFD that extracts the notes and a GDB that picks them up.

What I’m doing right now is rewriting a function I don’t understand (td_ta_map_lwp2thr) in a language I’m inventing as I go along (i8) that’ll be compiled with a compiler that barely exists (i8c) into a bytecode that’s totally undefined to be executed by an interpreter that doesn’t exist.

(The compiler’s going to be written in Python, and it’ll emit assembly language. It’s more of an assembler, really. Emitting assembler rather than going straight to bytecode simplifies things (e.g. the compiler won’t need to understand numbers!) at the expense of emitting some slightly crappy code (e.g. instruction sequences that add zero). I’m thinking GDB will eventually JIT the bytecode so this won’t matter. GDB will have to JIT if it’s to cope with millions of threads, but jitted Infinity should be faster than libthread_db. None of this is possible now, but it might be sooner than you thing with the GDB/GCC integration work that’s happening. Besides, I can think of about five different ways to make an interpreter skip null operations in zero time.)

Future applications for Infinity notes include the run-time linker interface.

Watch this space.

Remote debugging with GDB

→ originally posted on developers.redhat.com

This past few weeks I’ve been working on making remote debugging in GDB easier to use. What’s remote debugging? It’s where you run GDB on one machine and the program being debugged on another. To do this you need something to allow GDB to control the program being debugged, and that something is called the remote stub. GDB ships with a remote stub called gdbserver, but other remote stubs exist. You can write them into your own program too, which is handy if you’re using minimal or unusual hardware that cannot run regular applications… cellphone masts, satellites, that kind of thing. I bet you didn’t know GDB could do that!

If you’ve used remote debugging in GDB you’ll know it requires a certain amount of setup. You need to tell GDB how to access to your program’s binaries with a set sysroot command, you need to obtain a local copy of the main executable and supply that to GDB with a file command, and you need to tell GDB to commence remote debugging with a target remote command.

Until now. Now all you need is the target remote command.

This new code is really new. It’s not in any GDB release yet, let alone in RHEL or Fedora. It’s not even in the nightly GDB snapshot, it’s that fresh. So, with the caveat that none of these examples will work today unless you’re using a Git build, here’s some things you can do with gdbserver using the new code.

Here’s an example of a traditional remote debugging session, with the things you type in bold. In one window:

abc$ ssh xyz.example.com
xyz$ gdbserver :9999 --attach 5312
Attached; pid = 5312
Listening on port 9999

gdbserver attached to process 5312, stopped it, and is waiting for GDB to talk to it on TCP port 9999. Now, in another window:

abc$ gdb -q
(gdb) target remote xyz.example.com:9999
Remote debugging using xyz.example.com:9999
...lots of messages you can ignore...
(gdb) bt
#0 0x00000035b5edf098 in *__GI___poll (fds=0x27467a0, nfds=8,
timeout=<optimized out>) at ../sysdeps/unix/sysv/linux/poll.c:83
#1 0x00000035b76449f9 in ?? () from target:/lib64/libglib-2.0.so.0
#2 0x00000035b76451a5 in g_main_loop_run ()
from target:/lib64/libglib-2.0.so.0
#3 0x0000003dfd34dd17 in gtk_main ()
from target:/usr/lib64/libgtk-x11-2.0.so.0
#4 0x000000000040913d in main ()

Now you have GDB on one machine (abc) controlling process 5312 on another machine (xyz) via gdbserver. Here I did a backtrace, but you can do pretty much anything you can with regular, non-remote GDB.

I called that a “traditional” remote debugging session because that’s how a lot of people use this, but there’s a more flexible way of doing things if you’re using gdbserver as your stub. GDB and gdbserver can communicate over stdio pipes, so you can chain commands, and the new code to remove all the setup you used to need makes this really nice. Lets do that first example again, with pipes this time:

abc$ gdb -q
(gdb) target remote | ssh -T xyz.example.com gdbserver - --attach 5312
Remote debugging using | ssh -T xyz.example.com gdbserver - --attach 5312
Attached; pid = 5312
Remote debugging using stdio
...lots of messages...
(gdb)

The “-” in gdbserver’s argument list replaces the “:9999” in the previous example. It tells gdbserver we’re using stdio pipes rather than TCP port 9999. As well as configuring everything with single command, this has the advantage that the communication is through ssh; there’s no security in GDB’s remote protocol, so it’s not the kind of thing you want to do over the open internet.

What else can you do with this? Anything you can do through stdio pipes! You can enter Docker containers:

(gdb) target remote | sudo docker exec -i e0c1afa81e1d gdbserver - --attach 58
Remote debugging using | sudo docker exec -i e0c1afa81e1d gdbserver - --attach 58
Attached; pid = 58
Remote debugging using stdio
...

Notice how I slipped sudo in there too. Anything you can do over stdio pipes, remember? If you’re using Kubernetes you can use kubectl exec, or with OpenShift osc exec.

gdbserver can do more than just attach, you can start programs with it too:

(gdb) target remote | sudo docker exec -i e0c1afa81e1d gdbserver - /bin/sh
Remote debugging using | sudo docker exec -i e0c1afa81e1d gdbserver - /bin/sh
Process /bin/sh created; pid = 89
stdin/stdout redirected
Remote debugging using stdio
...

Or you can start it without any specific program, and then tell it what do do from within GDB. This is by far the most flexible way to use gdbserver. You can control more than one process, for example:

(gdb) target extended-remote | ssh -T root@xyz.example.com gdbserver --multi -
Remote debugging using | gdbserver --multi -
Remote debugging using stdio
(gdb) attach 774
...messages...
(gdb) add-inferior
Added inferior 2
(gdb) inferior 2
[Switching to inferior 2 [<null>] (<noexec>)]
(gdb) attach 871
...messages...
(gdb) info inferiors
Num Description Executable
* 2 process 871 target:/usr/sbin/httpd
  1 process 774 target:/usr/libexec/mysqld

Ready to debug that connection issue between your webserver and database?

Judgement Day

GDB will be the weapon we fight with if we accidentally build Skynet.

libiberty demangler fuzzer

I wrote a quick fuzzer for libiberty’s demangler this morning. Here’s how to get and use it:

$ mkdir workdir
$ cd workdir
$ git clone git@github.com:gbenson/binutils-gdb.git src
  ...
$ cd src
$ git co demangler
$ cd ..
$ mkdir build
$ cd build
$ CFLAGS="-g -O0" ../src/configure --with-separate-debug-dir=/usr/lib/debug
  ...
$ make
  ...
$ ../src/fuzzer.sh 
  ...
+ /home/gary/workdir/build/fuzzer
../src/fuzzer.sh: line 12: 22482 Segmentation fault      (core dumped) $fuzzer
# copy the executable and PID from the two lines above
$ gdb -batch /home/gary/workdir/build/fuzzer core.22482 -ex "bt"
...
Core was generated by `/home/gary/workdir/build/fuzzer'.
Program terminated with signal SIGSEGV, Segmentation fault.
#0  0x000000000040d5f2 in op_is_new_cast (op=0x7ffffac19930) at libiberty/cp-demangle.c:3064
3064	  const char *code = op->u.s_operator.op->code;
#0  0x000000000040d5f2 in op_is_new_cast (op=0x7ffffac19930) at libiberty/cp-demangle.c:3064
#1  0x000000000040dc6f in d_expression_1 (di=0x7ffffac19c90) at libiberty/cp-demangle.c:3232
#2  0x000000000040dfbc in d_expression (di=0x7ffffac19c90) at libiberty/cp-demangle.c:3315
#3  0x000000000040cff6 in d_array_type (di=0x7ffffac19c90) at libiberty/cp-demangle.c:2821
#4  0x000000000040c260 in cplus_demangle_type (di=0x7ffffac19c90) at libiberty/cp-demangle.c:2330
#5  0x000000000040cd70 in d_parmlist (di=0x7ffffac19c90) at libiberty/cp-demangle.c:2718
#6  0x000000000040ceb8 in d_bare_function_type (di=0x7ffffac19c90, has_return_type=0) at libiberty/cp-demangle.c:2772
#7  0x000000000040a9b2 in d_encoding (di=0x7ffffac19c90, top_level=1) at libiberty/cp-demangle.c:1287
#8  0x000000000040a6be in cplus_demangle_mangled_name (di=0x7ffffac19c90, top_level=1) at libiberty/cp-demangle.c:1164
#9  0x0000000000413131 in d_demangle_callback (mangled=0x7ffffac19ea0 "_Z1-Av23*;cG~Wo2Vu", options=259, callback=0x40ed11 , opaque=0x7ffffac19d70)
    at libiberty/cp-demangle.c:5862
#10 0x0000000000413267 in d_demangle (mangled=0x7ffffac19ea0 "_Z1-Av23*;cG~Wo2Vu", options=259, palc=0x7ffffac19dd8) at libiberty/cp-demangle.c:5913
#11 0x00000000004132d6 in cplus_demangle_v3 (mangled=0x7ffffac19ea0 "_Z1-Av23*;cG~Wo2Vu", options=259) at libiberty/cp-demangle.c:6070
#12 0x0000000000401a87 in cplus_demangle (mangled=0x7ffffac19ea0 "_Z1-Av23*;cG~Wo2Vu", options=259) at libiberty/cplus-dem.c:858
#13 0x0000000000400ea0 in main (argc=1, argv=0x7ffffac1a0a8) at /home/gary/workdir/src/fuzzer.c:26

The symbol that crashed it is one frame up from the bottom of the backtrace.