Stack unwinding

For Frysk I am working to add .debug_frame support to libunwind. Since I didn’t know that much about stack unwinding, and since some of the details are hard to grasp without historical context, I wrote the following to summarize all the relevant documentation pointers that I could find. Please do let me know if I got any details wrong. (The following is mostly written with x86/x86_64 in mind, since that is what I am currently interested in.)

Unwinding the call stack used to be something only a debugger would do. It relied on the executable keeping a frame pointer in a dedicated register, pointing to the bottom of the stack frame of the current function, which also contains the return address (and, by convention, the saved frame pointer of the caller). Having a frame pointer allows you to quickly walk the call stack and collect all the return addresses. If you can map those addresses to the names of the functions they fall in, you have a nice backtrace for the user.
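A frame pointer walk can be sketched as follows. This is a minimal simulation, not real debugger code: the `memory` dictionary stands in for the stack, the addresses and the `symbols` table are made up for illustration, and the layout assumed is the conventional x86_64 one (saved frame pointer at `[fp]`, return address at `[fp + 8]`).

```python
WORD = 8  # size of a stack slot on x86_64


def walk_frame_pointers(memory, pc, fp, max_frames=64):
    """Return the code addresses on the call stack, innermost first."""
    addresses = [pc]
    while fp != 0 and len(addresses) < max_frames:
        caller_fp = memory[fp]                # saved frame pointer of the caller
        addresses.append(memory[fp + WORD])   # return address into the caller
        # The stack grows down, so a caller's frame must be at a higher address.
        if caller_fp != 0 and caller_fp <= fp:
            break
        fp = caller_fp
    return addresses


# Hypothetical call chain: _start -> main -> parse -> read_token
memory = {
    0x7000: 0x7020, 0x7008: 0x401150,  # frame of read_token, returns into parse
    0x7020: 0x7040, 0x7028: 0x401090,  # frame of parse, returns into main
    0x7040: 0x0,    0x7048: 0x401010,  # frame of main, returns into _start
}
# Hypothetical symbol table: function start address -> name
symbols = {0x401000: "_start", 0x401080: "main",
           0x401100: "parse", 0x401200: "read_token"}


def symbolize(address):
    """Map an address to the function whose start most closely precedes it."""
    return symbols[max(start for start in symbols if start <= address)]


backtrace = [symbolize(a) for a in walk_frame_pointers(memory, 0x401234, 0x7000)]
print(backtrace)
```

The sanity check on `caller_fp` matters in practice: once one frame in the chain was compiled without a frame pointer, the register holds an arbitrary value and the walk has to stop rather than follow it.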

If you want to get more of the state in each call frame then you could rely on each function having a prologue and epilogue that saved and restored the registers of the caller (some architectures like x86 even have special instructions to help push and pop the relevant registers on the stack on function entry and return). Given a calling convention for a particular architecture you could use these to reliably find the original registers on the stack, which in turn with some debug info would give you the values of variables and arguments of the functions on the call stack.

Unfortunately compilers got smart. Optimized code might not keep a frame pointer (which frees up one more register) and might schedule the function prologue and epilogue instructions in between the other instructions of the function. All of this makes it pretty hard for an unwinder to reconstruct the previous call frames on the stack. In particular, x86_64 does away with a standard frame pointer. You can still get some information back by conservatively analyzing the instructions in the function and guessing how the various registers are stored, but this becomes pretty messy pretty quickly.

To help debuggers still get all the information needed to unwind a stack and restore all needed registers, the debugging information (DWARF) generated by compilers was extended to include Call Frame Information (CFI), which allows a debugger to reconstruct the calling pc and registers of a function (see the DWARF 3 spec, section 6.4). This information is stored in the .debug_frame section of an ELF file. It uses a simplified version of the DWARF instructions (not all operations are relevant for reconstructing the registers). This section is not guaranteed to be available: it is not necessarily loaded into memory, and in some distributions it is even split off into its own debug info file.
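To make the idea of CFI more concrete, here is a minimal sketch of an interpreter for a handful of call frame instructions. The opcode values and the x86_64 register numbers (6 for rbp, 7 for rsp, 16 for the return address) are taken from the DWARF and x86_64 ABI specs; everything else (the function names, the `row` tuple, the byte strings, the register and memory values) is made up for illustration. It assumes a code alignment factor of 1 and a data alignment factor of -8, and handles only the opcodes a simple `push %rbp; mov %rsp,%rbp` prologue needs.

```python
# Subset of the DWARF call frame instruction opcodes.
DW_CFA_advance_loc = 0x40       # high 2 bits; low 6 bits hold the code delta
DW_CFA_offset = 0x80            # high 2 bits; low 6 bits hold the register
DW_CFA_nop = 0x00
DW_CFA_def_cfa = 0x0C
DW_CFA_def_cfa_register = 0x0D
DW_CFA_def_cfa_offset = 0x0E

RBP, RSP, RA = 6, 7, 16         # x86_64 DWARF register numbers


def read_uleb128(data, pos):
    """Decode an unsigned LEB128 value, returning (value, next position)."""
    value = shift = 0
    while True:
        byte = data[pos]
        pos += 1
        value |= (byte & 0x7F) << shift
        shift += 7
        if (byte & 0x80) == 0:
            return value, pos


def run_cfi(program, row, target, data_align=-8):
    """Run CFI bytes and return the unwind row covering code offset `target`.

    A row is (cfa_register, cfa_offset, {register: offset from the CFA}).
    """
    cfa_reg, cfa_off, saved = row[0], row[1], dict(row[2])
    loc = pos = 0
    while pos < len(program):
        op = program[pos]
        pos += 1
        if (op & 0xC0) == DW_CFA_advance_loc:
            loc += op & 0x3F
            if loc > target:    # later rows describe code beyond the target
                break
        elif (op & 0xC0) == DW_CFA_offset:
            factored, pos = read_uleb128(program, pos)
            saved[op & 0x3F] = factored * data_align
        elif op == DW_CFA_def_cfa:
            cfa_reg, pos = read_uleb128(program, pos)
            cfa_off, pos = read_uleb128(program, pos)
        elif op == DW_CFA_def_cfa_register:
            cfa_reg, pos = read_uleb128(program, pos)
        elif op == DW_CFA_def_cfa_offset:
            cfa_off, pos = read_uleb128(program, pos)
        elif op != DW_CFA_nop:
            raise ValueError(f"opcode {op:#x} not handled in this sketch")
    return cfa_reg, cfa_off, saved


def unwind_step(row, regs, memory):
    """Recover the caller's registers from the current ones and an unwind row."""
    cfa_reg, cfa_off, saved = row
    cfa = regs[cfa_reg] + cfa_off
    caller = dict(regs)
    for reg, offset in saved.items():
        caller[reg] = memory[cfa + offset]
    caller[RSP] = cfa           # the CFA is the caller's rsp at the call site
    return caller


# CIE initial instructions: CFA = rsp + 8, return address saved at CFA - 8.
cie = bytes([0x0C, 0x07, 0x08, 0x90, 0x01])
# FDE for "push %rbp; mov %rsp,%rbp": after 1 byte CFA = rsp + 16 and rbp is
# saved at CFA - 16; after 3 more bytes the CFA is computed from rbp instead.
fde = bytes([0x41, 0x0E, 0x10, 0x86, 0x02, 0x43, 0x0D, 0x06])

initial = run_cfi(cie, (None, 0, {}), target=0)
row = run_cfi(fde, initial, target=4)

regs = {RBP: 0x7FE0, RSP: 0x7FC0}               # hypothetical register values
memory = {0x7FE0: 0x8010, 0x7FE8: 0x401090}     # hypothetical stack contents
caller = unwind_step(row, regs, memory)
print({reg: hex(value) for reg, value in caller.items()})
```

Note that no instruction of the function itself is inspected: the table alone says where the caller's registers are at each code offset, which is what makes this approach work for optimized code without a frame pointer.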

At the same time different languages got constructs (exceptions, continuations, global gotos, asynchronous garbage collectors, etc.) which required some sort of reliable unwinding (and in some cases rewinding) of the call stack. Since some optimizations and some newer architectures also did away with a standard frame pointer, another way to reliably unwind the stack was needed. This became the exception handler framework (.eh_frame), which is based on the DWARF CFI work but is slightly different. Unfortunately nobody seems to have documented the precise differences between the formats, so you will have to carefully read both the DWARF standard and the LSB core specification Exception Frames side-by-side.
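Two differences that can be read off the two specifications directly concern how entries identify themselves. In .debug_frame (32-bit DWARF format) a CIE has an id field of 0xffffffff and an FDE stores the section offset of its CIE. In .eh_frame a CIE has an id field of 0, an FDE stores the distance back from its own CIE pointer field to the CIE, and a zero length marks a terminator. The following is a minimal sketch of that distinction only, assuming little-endian data and 32-bit lengths; the function name and the synthetic byte strings are made up for illustration, and this is by no means a complete list of differences.

```python
import struct


def classify_entry(section, data, offset):
    """Classify the entry at `offset`, returning (kind, offset of its CIE)."""
    (length,) = struct.unpack_from("<I", data, offset)
    if length == 0xFFFFFFFF:
        raise NotImplementedError("64-bit entries are not handled in this sketch")
    if length == 0 and section == ".eh_frame":
        return "terminator", None
    id_offset = offset + 4                  # the id field follows the length
    (ident,) = struct.unpack_from("<I", data, id_offset)
    if section == ".eh_frame":
        if ident == 0:
            return "CIE", None
        return "FDE", id_offset - ident     # relative to the id field itself
    if section == ".debug_frame":
        if ident == 0xFFFFFFFF:
            return "CIE", None
        return "FDE", ident                 # absolute offset into the section
    raise ValueError(f"unexpected section {section!r}")


# Synthetic sections holding only the length and id fields of each entry.
eh_frame = (struct.pack("<II", 4, 0)        # CIE at offset 0
            + struct.pack("<II", 4, 12)     # FDE at offset 8, CIE is 12 back
            + struct.pack("<I", 0))         # terminator
debug_frame = (struct.pack("<II", 4, 0xFFFFFFFF)    # CIE at offset 0
               + struct.pack("<II", 4, 0))          # FDE pointing at offset 0

print(classify_entry(".eh_frame", eh_frame, 8))
print(classify_entry(".debug_frame", debug_frame, 8))
```

The same id value of 0 means "this is a CIE" in one section and "my CIE is at offset 0" in the other, so a parser has to know which section it is reading before it can interpret a single entry.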

Note that a debugger that wants to walk a stack and recover all registers might need more information than some of these language constructs, which might only need unwind information for specific call sites. Depending on optimizations, architecture and language compiled (and sometimes specific distribution default choices) no, full or partial exception handler unwind information and/or frame pointers are generated (see the GCC options -funwind-tables and -fasynchronous-unwind-tables).

Both the DWARF and the exception handler specs are architecture neutral. But since you do need a mapping between the actual registers and the register numbers used in the specs, you also need to consult the relevant architecture ABI that defines this mapping. Sometimes these architecture ABI specs also define some DWARF/EH extensions. See for example the x86_64 ABI spec (Sections 3.6 and 3.7).

Note that in practice what gcc generates overrides any of the above specs, and if a discrepancy is found the spec usually gets updated. Also be careful about bugs in the old DWARF 2 spec and about the extensions of DWARF specified by the LSB (which mostly augment DWARF 2 to be like DWARF 3, at least for the exception handler sections).

If an .eh_frame section is available in an ELF file it is guaranteed to be loaded into memory. But depending on the architecture and the language being compiled, it might not be available at all (and neither might the frame pointer or the .debug_frame section). This also means that unwind information might be stored differently for different components linked together into a program, if they were compiled with different flags or from different source languages, making cross component/language unwinding an interesting exercise.

One Comment

  1. Mitesh Shah says:

    Thanks for the very clear explanation of the unwinding information. I understand that the .eh_frame section is guaranteed to be loaded into memory, but what if there is no .eh_frame section and only .debug_frame is present? How can I make sure that .debug_frame gets loaded into memory? (Mainly, how can I set its flag to alloc during section generation?)

    Thanks