I’m writing a replacement for libthread_db. It’s called Infinity.
Why? Because libthread_db is a pain in the ass for debuggers. GDB has to watch for inferiors loading thread libraries. It has to know that, for example, on GNU/Linux, when the inferior loads
libpthread.so then GDB has to locate the corresponding
libthread_db.so into itself and use that to inspect libpthread’s internal structures. How does GDB know where libthread_db is? It doesn’t, it has to search for it. How does it know, when it finds it, that the libthread_db it found is compatible with the libpthread the inferior loaded? It doesn’t, it has to load it to see, then unload it if it didn’t work. How does GDB know that the libthread_db it found is compatible with itself? It doesn’t, it has to load it and, erm, crash if it isn’t. How does GDB manage when the inferior (and its libthread_db) has a different ABI to GDB? Well, it doesn’t.
libthread_db means you can’t debug an application in a RHEL 6 container with a GDB in a RHEL 7 container. Probably. Not safely. Not without using gdbserver, anyway–and there’s no reason you should have to use gdbserver to debug what is essentially a native process.
So. Infinity. In Infinity, inspection functions for debuggers will be shipped as bytecode in ELF notes in the same file as the code they pertain to. libpthread.so, for example, will contain a bunch of Infinity notes, each representing some bit of functionality that GDB currently gets from libthread_db. When the inferior starts or loads libraries GDB will find the notes in the files it already loaded and register their functions. If GDB notices it has, for example, the full set of functions it requires for thread support then, boom, thread support switches on. This happens regardless of whether libpthread was dynamically or statically linked.
(If you’re using gdbserver, gdbserver gives GDB a list of Infinity functions it’s interested in. When GDB finds these functions it fires the (slightly rewritten) bytecode over to gdbserver and gdbserver takes it from there.)
Concrete things I have are: a bytecode format (but not the bytecode itself), an executable with a couple of handwritten notes (with some junk where the bytecode should be), a readelf that can decode the notes, a BFD that extracts the notes and a GDB that picks them up.
What I’m doing right now is rewriting a function I don’t understand (
td_ta_map_lwp2thr) in a language I’m inventing as I go along (
i8) that’ll be compiled with a compiler that barely exists (
i8c) into a bytecode that’s totally undefined to be executed by an interpreter that doesn’t exist.
(The compiler’s going to be written in Python, and it’ll emit assembly language. It’s more of an assembler, really. Emitting assembler rather than going straight to bytecode simplifies things (e.g. the compiler won’t need to understand numbers!) at the expense of emitting some slightly crappy code (e.g. instruction sequences that add zero). I’m thinking GDB will eventually JIT the bytecode so this won’t matter. GDB will have to JIT if it’s to cope with millions of threads, but jitted Infinity should be faster than libthread_db. None of this is possible now, but it might be sooner than you thing with the GDB/GCC integration work that’s happening. Besides, I can think of about five different ways to make an interpreter skip null operations in zero time.)
Future applications for Infinity notes include the run-time linker interface.
Watch this space.