| # Debugging |
| |
| This page details various ways to debug LLDB itself and other LLDB tools. If |
| you want to know how to use LLDB in general, please refer to |
| {doc}`/use/tutorial`. |
| |
| As LLDB is generally split into 2 tools, `lldb` and `lldb-server` |
| (`debugserver` on Mac OS), the techniques shown here will not always apply to |
| both. With some knowledge of them all, you can mix and match as needed. |
| |
| In this document we refer to the initial `lldb` as the "debugger" and the |
| program being debugged as the "inferior". |
| |
| ## Building For Debugging |
| |
| To build LLDB with debugging information add the following to your CMake |
| configuration: |
| |
| ``` |
| -DCMAKE_BUILD_TYPE=Debug \ |
| -DLLDB_EXPORT_ALL_SYMBOLS=ON |
| ``` |
| |
| Note that the `lldb` you will use to do the debugging does not itself need to |
| have debug information. |
| |
| Then build as you normally would according to {doc}`/resources/build`. |
| |
| If you are going to debug in a way that doesn't need debug info (printf, strace, |
| etc.) we recommend adding `LLVM_ENABLE_ASSERTIONS=ON` to Release build |
| configurations. This will make LLDB fail earlier instead of continuing with |
| invalid state (assertions are enabled by default for Debug builds). |
| |
| ## Debugging `lldb` |
| |
| The simplest scenario is where we want to debug a local execution of `lldb` |
| like this one: |
| |
| ``` |
| ./bin/lldb test_program |
| ``` |
| |
| LLDB is like any other program, so you can use the same approach. |
| |
| ``` |
| ./bin/lldb -- ./bin/lldb /tmp/test.o |
| ``` |
| |
| That's it. At least, that's the minimum. There's nothing special about LLDB |
| being a debugger that means you can't attach another debugger to it like any |
| other program. |
| |
| What can be an issue is that both debuggers have command line interfaces which |
| makes it very confusing which one is which: |
| |
| ``` |
| (the debugger) |
| (lldb) run |
| Process 1741640 launched: '<...>/bin/lldb' (aarch64) |
| Process 1741640 stopped and restarted: thread 1 received signal: SIGCHLD |
| |
| (the inferior) |
| (lldb) target create "/tmp/test.o" |
| Current executable set to '/tmp/test.o' (aarch64). |
| ``` |
| |
| Another issue is that when you resume the inferior, it will not print the |
| `(lldb)` prompt because as far as it knows it hasn't changed state. A quick |
| way around that is to type something that is clearly not a command and hit |
| enter. |
| |
| ``` |
| (lldb) Process 1742266 stopped and restarted: thread 1 received signal: SIGCHLD |
| Process 1742266 stopped |
| * thread #1, name = 'lldb', stop reason = signal SIGSTOP |
| frame #0: 0x0000ffffed5bfbf0 libc.so.6`__GI___libc_read at read.c:26:10 |
| (lldb) c |
| Process 1742266 resuming |
| notacommand |
| error: 'notacommand' is not a valid command. |
| (lldb) |
| ``` |
| |
| You could just remember whether you are in the debugger or the inferior but |
| it's more for you to remember, and for interrupt based events you simply may not |
| be able to know. |
| |
| Here are some better approaches. First, you could use another debugger like GDB |
| to debug LLDB. Perhaps an IDE like Xcode or Visual Studio Code. Something which |
| runs LLDB under the hood so you don't have to type in commands to the debugger |
| yourself. |
| |
| Or you could change the prompt text for the debugger and/or inferior. |
| |
| ``` |
| $ ./bin/lldb -o "settings set prompt \"(lldb debugger) \"" -- \ |
| ./bin/lldb -o "settings set prompt \"(lldb inferior) \"" /tmp/test.o |
| <...> |
| (lldb) settings set prompt "(lldb debugger) " |
| (lldb debugger) run |
| <...> |
| (lldb) settings set prompt "(lldb inferior) " |
| (lldb inferior) |
| ``` |
| |
| If you want spacial separation you can run the inferior in one terminal then |
| attach to it in another. Remember that while paused in the debugger, the inferior |
| will not respond to input so you will have to `continue` in the debugger |
| first. |
| |
| ``` |
| (in terminal A) |
| $ ./bin/lldb /tmp/test.o |
| |
| (in terminal B) |
| $ ./bin/lldb ./bin/lldb --attach-pid $(pidof lldb) |
| ``` |
| |
| ### Placing Breakpoints |
| |
| Generally you will want to hit some breakpoint in the inferior `lldb`. To place |
| that breakpoint you must first stop the inferior. |
| |
| If you're debugging from another window this is done with `process interrupt`. |
| The inferior will stop, you place the breakpoint and then `continue`. Go back |
| to the inferior and input the command that should trigger the breakpoint. |
| |
| If you are running debugger and inferior in the same window, input `ctrl+c` |
| instead of `process interrupt` and then follow the rest of the steps. |
| |
| If you are doing this with `lldb-server` and find your breakpoint is never |
| hit, check that you are breaking in code that is actually run by |
| `lldb-server`. There are cases where code only used by `lldb` ends up |
| linked into `lldb-server`, so the debugger can break there but the breakpoint |
| will never be hit. |
| |
| ## Debugging `lldb-server` |
| |
| Note: If you are on MacOS you are likely using `debugserver` instead of |
| `lldb-server`. The spirit of these instructions applies but the specifics will |
| be different. |
| |
| We suggest you read {doc}`/use/remote` before attempting to debug `lldb-server` |
| as working out exactly what you want to debug requires that you understand its |
| various modes and behaviour. While you may not be literally debugging on a |
| remote target, think of your host machine as the "remote" in this scenario. |
| |
| The `lldb-server` options for your situation will depend on what part of it |
| or mode you are interested in. To work out what those are, recreate the scenario |
| first without any extra debugging layers. Let's say we want to debug |
| `lldb-server` during the following command: |
| |
| ``` |
| $ ./bin/lldb /tmp/test.o |
| ``` |
| |
| We can treat `lldb-server` as we treated `lldb` before, running it under |
| `lldb`. The equivalent to having `lldb` launch the `lldb-server` for us is |
| to start `lldb-server` in the `gdbserver` mode. |
| |
| The following commands recreate that, while debugging `lldb-server`: |
| |
| ``` |
| $ ./bin/lldb -- ./bin/lldb-server gdbserver :1234 /tmp/test.o |
| (lldb) target create "./bin/lldb-server" |
| Current executable set to '<...>/bin/lldb-server' (aarch64). |
| <...> |
| Process 1742485 launched: '<...>/bin/lldb-server' (aarch64) |
| Launched '/tmp/test.o' as process 1742586... |
| |
| (in another terminal) |
| $ ./bin/lldb /tmp/test.o -o "gdb-remote 1234" |
| ``` |
| |
| Note that the first `lldb` is the one debugging `lldb-server`. The second |
| `lldb` is debugging `/tmp/test.o` and is only used to trigger the |
| interesting code path in `lldb-server`. |
| |
| This is another case where you may want to layout your terminals in a |
| predictable way, or change the prompt of one or both copies of `lldb`. |
| |
| If you are debugging a scenario where the `lldb-server` starts in `platform` |
| mode, but you want to debug the `gdbserver` mode you'll have to work out what |
| subprocess it's starting for the `gdbserver` part. One way is to look at the |
| list of running processes and take the command line from there. |
| |
| In theory it should be possible to use LLDB's |
| `target.process.follow-fork-mode` or GDB's `follow-fork-mode` to |
| automatically debug the `gdbserver` process as it's created. However this |
| author has not been able to get either to work in this scenario so we suggest |
| making a more specific command wherever possible instead. |
| |
| Another option is to let `lldb-server` start up, then attach to the process |
| that's interesting to you. It's less automated and won't work if the bug occurs |
| during startup. However it is a good way to know you've found the right one, |
| then you can take its command line and run that directly. |
| |
| ### Output From `lldb-server` |
| |
| As `lldb-server` often launches subprocesses, output messages may be hidden |
| if they are emitted from the child processes. |
| |
| You can tell it to enable logging using the `--log-channels` option. For |
| example `--log-channels "posix ptrace"`. However that is not passed on to the |
| child processes. |
| |
| The same goes for `printf`. If it's called in a child process you won't see |
| the output. |
| |
| In these cases consider interactive debugging `lldb-server` or |
| working out a more specific command such that it does not have to spawn a |
| subprocess. For example if you start with `platform` mode, work out what |
| `gdbserver` mode process it spawns and run that command instead. |
| |
| Another option if you have `strace` available is to trace the whole process |
| tree and inspect the logs after the session has ended. |
| |
| ``` |
| $ strace -ff -o log -p $(pidof lldb-server) |
| ``` |
| |
| This will log all syscalls made by `lldb-server` and processes that it forks. |
| `-ff` tells `strace` to trace child processes and write the results to a |
| separate file for each process, named using the prefix given by `-o`. |
| |
| Search the log files for specific terms to find the process you're interested |
| in. For example, to find a process that acted as a `gdbserver` instance: |
| |
| ``` |
| $ grep "gdbserver" log.* |
| log.<N>:execve("<...>/lldb-server", [<...> "gdbserver", <...>) = 0 |
| ``` |
| |
| ## Remote Debugging |
| |
| If you want to debug part of LLDB running on a remote machine, the principles |
| are the same but we will have to start debug servers, then attach debuggers to |
| those servers. |
| |
| In the example below we're debugging an `lldb-server` `gdbserver` mode |
| command running on a remote machine. |
| |
| For simplicity we'll use the same `lldb-server` as the debug server |
| and the inferior, but it doesn't need to be that way. You can use `gdbserver` |
| (as in, GDB's debug server program) or a system installed `lldb-server` if you |
| suspect your local copy is not stable. As is the case in many of these |
| scenarios. |
| |
| ``` |
| $ <...>/bin/lldb-server gdbserver 0.0.0.0:54322 -- \ |
| <...>/bin/lldb-server gdbserver 0.0.0.0:54321 -- /tmp/test.o |
| ``` |
| |
| Now we have a debug server listening on port 54322 of our remote (`0.0.0.0` |
| means it's listening for external connections). This is where we will connect |
| `lldb` to, to debug the second `lldb-server`. |
| |
| To trigger behaviour in the second `lldb-server`, we will connect a second |
| `lldb` to port 54321 of the remote. |
| |
| This is the final configuration: |
| |
| ``` |
| Host | Remote |
| --------------------------------------------|-------------------- |
| lldb A debugs lldb-server on port 54322 -> | lldb-server A |
| | (which runs) |
| lldb B debugs /tmp/test.o on port 54321 -> | lldb-server B |
| | (which runs) |
| | /tmp/test.o |
| ``` |
| |
| You would use `lldb A` to place a breakpoint in the code you're interested in, |
| then `lldb B` to trigger `lldb-server B` to go into that code and hit the |
| breakpoint. `lldb-server A` is only here to let us debug `lldb-server B` |
| remotely. |
| |
| ## Debugging The Remote Protocol |
| |
| LLDB mostly follows the [GDB Remote Protocol](https://sourceware.org/gdb/onlinedocs/gdb/Remote-Protocol.html) |
| . Where there are differences it tries to handle both LLDB and GDB behaviour. |
| |
| LLDB does have extensions to the protocol which are documented in |
| [lldb-gdb-remote.txt](https://github.com/llvm/llvm-project/blob/main/lldb/docs/lldb-gdb-remote.txt) |
| and [lldb/docs/lldb-platform-packets.txt](https://github.com/llvm/llvm-project/blob/main/lldb/docs/lldb-platform-packets.txt). |
| |
| ### Logging Packets |
| |
| If you just want to observe packets, you can enable the `gdb-remote packets` |
| log channel. |
| |
| ``` |
| (lldb) log enable gdb-remote packets |
| (lldb) run |
| lldb < 1> send packet: + |
| lldb history[1] tid=0x264bfd < 1> send packet: + |
| lldb < 19> send packet: $QStartNoAckMode#b0 |
| lldb < 1> read packet: + |
| ``` |
| |
| You can do this on the `lldb-server` end as well by passing the option |
| `--log-channels "gdb-remote packets"`. Then you'll see both sides of the |
| connection. |
| |
| Some packets may be printed in a nicer way than others. For example XML packets |
| will print the literal XML, some binary packets may be decoded. Others will just |
| be printed unmodified. So do check what format you expect, a common one is hex |
| encoded bytes. |
| |
| You can enable this logging even when you are connecting to an `lldb-server` |
| in platform mode, this protocol is used for that too. |
| |
| ### Debugging Packet Exchanges |
| |
| Say you want to make `lldb` send a packet to `lldb-server`, then debug |
| how the latter builds its response. Maybe even see how `lldb` handles it once |
| it's sent back. |
| |
| That all takes time, so LLDB will likely time out and think the remote has gone |
| away. You can change the `plugin.process.gdb-remote.packet-timeout` setting |
| to prevent this. |
| |
| Here's an example, first we'll start an `lldb-server` being debugged by |
| `lldb`. Placing a breakpoint on a packet handler we know will be hit once |
| another `lldb` connects. |
| |
| ``` |
| $ lldb -- lldb-server gdbserver :1234 -- /tmp/test.o |
| <...> |
| (lldb) b GDBRemoteCommunicationServerCommon::Handle_qSupported |
| Breakpoint 1: where = <...> |
| (lldb) run |
| <...> |
| ``` |
| |
| Next we connect another `lldb` to this, with a timeout of 5 minutes: |
| |
| ``` |
| $ lldb /tmp/test.o |
| <...> |
| (lldb) settings set plugin.process.gdb-remote.packet-timeout 300 |
| (lldb) gdb-remote 1234 |
| ``` |
| |
| Doing so triggers the breakpoint in `lldb-server`, bringing us back into |
| `lldb`. Now we've got 5 minutes to do whatever we need before LLDB decides |
| the connection has failed. |
| |
| ``` |
| * thread #1, name = 'lldb-server', stop reason = breakpoint 1.1 |
| frame #0: 0x0000aaaaaacc6848 lldb-server<...> |
| lldb-server`lldb_private::process_gdb_remote::GDBRemoteCommunicationServerCommon::Handle_qSupported: |
| -> 0xaaaaaacc6848 <+0>: sub sp, sp, #0xc0 |
| <...> |
| (lldb) |
| ``` |
| |
| Once you're done simply `continue` the `lldb-server`. Back in the other |
| `lldb`, the connection process will continue as normal. |
| |
| ``` |
| Process 2510266 stopped |
| * thread #1, name = 'test.o', stop reason = signal SIGSTOP |
| frame #0: 0x0000fffff7fcd100 ld-2.31.so`_start |
| ld-2.31.so`_start: |
| -> 0xfffff7fcd100 <+0>: mov x0, sp |
| <...> |
| (lldb) |
| ``` |
| |
| ## Reducing Bugs |
| |
| This section covers reducing a bug that happens in LLDB itself, or where you |
| suspect that LLDB causes something else to behave abnormally. |
| |
| Since bugs vary wildly, the advice here is general and incomplete. Let your |
| instincts guide you and don't feel the need to try everything before reporting |
| an issue or asking for help. This is simply inspiration. |
| |
| ### Reduction |
| |
| The first step is to reduce unneeded complexity where it is cheap to do so. If |
| something is easily removed or frozen to a certain value, do so. The goal is to |
| keep the failure mode the same, with fewer dependencies. |
| |
| This includes, but is not limited to: |
| |
| - Removing test cases that don't crash. |
| - Replacing dynamic lookups with constant values. |
| - Replace supporting functions with stubs that do nothing. |
| - Moving the test case to less unique system. If your machine has an exotic |
| extension, try it on a readily available commodity machine. |
| - Removing irrelevant parts of the test program. |
| - Reproducing the issue without using the LLDB test runner. |
| - Converting a remote debugging scenario into a local one. |
| |
| Now we hopefully have a smaller reproducer than we started with. Next we need to |
| find out what components of the software stack might be failing. |
| |
| Some examples are listed below with suggestions for how to investigate them. |
| |
| - Debugger |
| |
| - Use a [released version of LLDB](https://github.com/llvm/llvm-project/releases). |
| - If on MacOS, try the system `lldb`. |
| - Try GDB or any other system debugger you might have e.g. Microsoft Visual |
| Studio. |
| |
| - Kernel |
| |
| - Start a virtual machine running a different version. `qemu-system` is |
| useful here. |
| - Try a different physical system running a different version. |
| - Remember that for most kernels, userspace crashing the kernel is always a |
| kernel bug. Even if the userspace program is doing something unconventional. |
| So it could be a bug in the application and the kernel. |
| |
| - Compiler and compiler options |
| |
| - Try other versions of the same compiler or your system compiler. |
| - Emit older versions of DWARF info, particularly DWARFv4 to v5, some tools |
| did/do not understand the new constructs. |
| - Reduce optimisation options as much as possible. |
| - Try all the language modes e.g. C++17/20 for C++. |
| - Link against LLVM's libcxx if you suspect a bug involving the system C++ |
| library. |
| - For languages other than C/C++ e.g. Rust, try making an equivalent program |
| in C/C++. LLDB tends to try to fit other languages into a C/C++ mould, so |
| porting the program can make triage and reporting much easier. |
| |
| - Operating system |
| |
| - Use docker to try various versions of Linux. |
| - Use `qemu-system` to emulate other operating systems e.g. FreeBSD. |
| |
| - Architecture |
| |
| - Use [QEMU user space emulation](https://www.qemu.org/docs/master/user/main.html) |
| to quickly test other architectures. Note that `lldb-server` cannot be used |
| with this as the ptrace APIs are not emulated. |
| - If you need to test a big endian system use QEMU to emulate s390x (user |
| space emulation for just `lldb`, `qemu-system` for testing |
| `lldb-server`). |
| |
| :::{note} |
| When using QEMU you may need to use the built in GDB stub, instead of |
| `lldb-server`. For example if you wanted to debug `lldb` running |
| inside `qemu-user-s390x` you would connect to the GDB stub provided |
| by QEMU. |
| |
| The same applies if you want to see how `lldb` would debug a test |
| program that is running on s390x. It's not totally accurate because |
| you're not using `lldb-server`, but this is fine for features that |
| are mostly implemented in `lldb`. |
| |
| If you are running a full system using `qemu-system`, you likely |
| want to connect to the `lldb-server` running within the userspace |
| of that system. |
| |
| If your test program is bare metal (meaning it requires no supporting |
| operating system) then connect to the built in GDB stub. This can be |
| useful when testing embedded systems or kernel debugging. |
| ::: |
| |
| ### Reducing Ptrace Related Bugs |
| |
| This section is written Linux specific but the same can likely be done on |
| other Unix or Unix like operating systems. |
| |
| Sometimes you will find `lldb-server` doing something with ptrace that causes |
| a problem. Your reproducer involves running `lldb` as well, this is not going |
| to go over well with kernel and is generally more difficult to explain if you |
| want to get help with it. |
| |
| If you think you can get your point across without this, no need. If you're |
| pretty sure you have for example found a Linux Kernel bug, doing this greatly |
| increases the chances it'll get fixed. |
| |
| We'll remove the LLDB dependency by making a smaller standalone program that |
| does the same actions. Starting with a skeleton program that forks and debugs |
| the inferior process. |
| |
| The program presented [here](https://eli.thegreenplace.net/2011/01/23/how-debuggers-work-part-1) |
| ([source](https://github.com/eliben/code-for-blog/blob/master/2011/simple_tracer.c)) |
| is a great starting point. There is also an AArch64 specific example in |
| [the LLDB examples folder](https://github.com/llvm/llvm-project/tree/main/lldb/examples/ptrace_example.c). |
| |
| For either, you'll need to modify that to fit your architecture. A tip for this |
| is to take any constants used in it, find in which function(s) they are used in |
| LLDB and then you'll find the equivalent constants in the same LLDB functions |
| for your architecture. |
| |
| Once that is running as expected we can convert `lldb-server`'s into calls in |
| this program. To get a log of those, run `lldb-server` with |
| `--log-channels "posix ptrace"`. You'll see output like: |
| |
| ``` |
| $ lldb-server gdbserver :1234 --log-channels "posix ptrace" -- /tmp/test.o |
| 1694099878.829990864 <...> ptrace(16896, 2659963, 0x0000000000000000, 0x000000000000007E, 0)=0x0 |
| 1694099878.830722332 <...> ptrace(16900, 2659963, 0x0000FFFFD14BF7CC, 0x0000FFFFD14BF7D0, 16)=0x0 |
| 1694099878.831967115 <...> ptrace(16900, 2659963, 0x0000FFFFD14BF66C, 0x0000FFFFD14BF630, 16)=0xffffffffffffffff |
| 1694099878.831982136 <...> ptrace() failed: Invalid argument |
| Launched '/tmp/test.o' as process 2659963... |
| ``` |
| |
| Each call is logged with its parameters and its result as the `=` on the end. |
| |
| From here you will need to use a combination of the [ptrace documentation](https://man7.org/linux/man-pages/man2/ptrace.2.html) |
| and Linux Kernel headers (`uapi/linux/ptrace.h` mainly) to figure out what |
| the calls are. |
| |
| The most important parameter is the first, which is the request number. In the |
| example above `16896`, which is hex `0x4200`, is `PTRACE_SETOPTIONS`. |
| |
| Luckily, you don't usually have to figure out all those early calls. Our |
| skeleton program will be doing all that, successfully we hope. |
| |
| What you should do is record just the interesting bit to you. Let's say |
| something odd is happening when you read the `tpidr` register (this is an |
| AArch64 register, just for example purposes). |
| |
| First, go to the `lldb-server` terminal and press enter a few times to put |
| some blank lines after the last logging output. |
| |
| Then go to your `lldb` and: |
| |
| ``` |
| (lldb) register read tpidr |
| tpidr = 0x0000fffff7fef320 |
| ``` |
| |
| You'll see this from `lldb-server`: |
| |
| ``` |
| <...> ptrace(16900, 2659963, 0x0000FFFFD14BF6CC, 0x0000FFFFD14BF710, 8)=0x0 |
| ``` |
| |
| If you don't see that, it may be because `lldb` has cached it. The easiest way |
| to clear that cache is to step. Remember that some registers are read every |
| step, so you'll have to adjust depending on the situation. |
| |
| Assuming you've got that line, you would look up what `116900` is. This is |
| `0x4204` in hex, which is `PTRACE_GETREGSET`. As we expected. |
| |
| The following parameters are not as we might expect because what we log is a bit |
| different from the literal ptrace call. See your platform's definition of |
| `PtraceWrapper` for the exact form. |
| |
| The point of all this is that by doing a single action you can get a few |
| isolated ptrace calls and you can then fill in the blanks and write |
| equivalent calls in the skeleton program. |
| |
| The final piece of this is likely breakpoints. Assuming your bug does not |
| require a hardware breakpoint, you can get software breakpoints by inserting |
| a break instruction into the inferior's code at compile time. Usually by using |
| an architecture specific assembly statement, as you will need to know exactly |
| how many instructions to overwrite later. |
| |
| Doing it this way instead of exactly copying what LLDB does will save a few |
| ptrace calls. The AArch64 example program shows how to do this. |
| |
| - The inferior contains `BRK #0` then `NOP`. |
| - 2 4-byte instructions means 8 bytes of data to replace, which matches the |
| minimum size you can write with `PTRACE_POKETEXT`. |
| - The inferior runs to the `BRK`, which brings us into the debugger. |
| - The debugger reads `PC` and writes `NOP` then `NOP` to the location |
| pointed to by `PC`. |
| - The debugger then single steps the inferior to the next instruction |
| (this is not required in this specific scenario, you could just continue but |
| it is included because this more closely matches what `lldb` does). |
| - The debugger then continues the inferior. |
| - The inferior exits, and the whole program exits. |
| |
| Using this technique you can emulate the usual "run to main, do a thing" type |
| reproduction steps. |
| |
| Finally, that "thing" is the ptrace calls you got from the `lldb-server` logs. |
| Add those to the debugger function and you now have a reproducer that doesn't |
| need any part of LLDB. |
| |
| ## Debugging Tests |
| |
| See {doc}`/resources/test`. |