| # Variable Formatting |
| |
| LLDB has a data formatters subsystem that allows users to define custom display |
| options for their variables. |
| |
| Usually, when you type `frame variable` or run some expression LLDB will |
| automatically choose the way to display your results on a per-type basis, as in |
| the following example: |
| |
| ``` |
| (lldb) frame variable |
| (uint8_t) x = 'a' |
| (intptr_t) y = 124752287 |
| ``` |
| |
| Note: `frame variable` without additional arguments prints the list of |
| variables of the current frame. |
| |
| However, in certain cases, you may want to associate a different style to the |
| display for certain datatypes. To do so, you need to give hints to the debugger |
| as to how variables should be displayed. The LLDB type command allows you to do |
| just that. |
| |
| Using it you can change your visualization to look like this: |
| |
| ``` |
| (lldb) frame variable |
| (uint8_t) x = chr='a' dec=65 hex=0x41 |
| (intptr_t) y = 0x76f919f |
| ``` |
| |
| In addition, some data structures can encode their data in a way that is not |
| easily readable to the user, in which case a data formatter can be used to |
| show the data in a human readable way. For example, without a formatter, |
| printing a `std::deque<int>` with the elements `{2, 3, 4, 5, 6}` would |
| result in something like: |
| |
| ``` |
| (lldb) frame variable a_deque |
| (std::deque<Foo, std::allocator<int> >) $0 = { |
| std::_Deque_base<Foo, std::allocator<int> > = { |
| _M_impl = { |
| _M_map = 0x000000000062ceb0 |
| _M_map_size = 8 |
| _M_start = { |
| _M_cur = 0x000000000062cf00 |
| _M_first = 0x000000000062cf00 |
| _M_last = 0x000000000062d2f4 |
| _M_node = 0x000000000062cec8 |
| } |
| _M_finish = { |
| _M_cur = 0x000000000062d300 |
| _M_first = 0x000000000062d300 |
| _M_last = 0x000000000062d6f4 |
| _M_node = 0x000000000062ced0 |
| } |
| } |
| } |
| } |
| ``` |
| |
| which is very hard to understand. |
| |
| Note: `frame variable <var>` prints out the variable `<var>` in the current |
| frame. |
| |
| On the other hand, a proper formatter is able to produce the following output: |
| |
| ``` |
| (lldb) frame variable a_deque |
| (std::deque<Foo, std::allocator<int> >) $0 = size=5 { |
| [0] = 2 |
| [1] = 3 |
| [2] = 4 |
| [3] = 5 |
| [4] = 6 |
| } |
| ``` |
| |
| which is what the user would expect from a good debugger. |
| |
| Note: you can also use `v <var>` instead of `frame variable <var>`. |
| |
| It's worth mentioning that the `size=5` string is produced by a summary |
| provider and the list of children is produced by a synthetic child provider. |
| More information about these providers is available in {ref}`type-summary` and |
| {ref}`synthetic-children`, respectively. |
| |
| There are several features related to data visualization: formats, summaries, |
| filters, synthetic children. |
| |
| To reflect this, the type command has five subcommands: |
| |
| ``` |
| type format |
| type summary |
| type filter |
| type synthetic |
| type category |
| ``` |
| |
| These commands are meant to bind printing options to types. When variables are |
| printed, LLDB will first check if custom printing options have been associated |
| to a variable's type and, if so, use them instead of picking the default |
| choices. |
| |
| Each of the commands (except `type category`) has four subcommands available: |
| |
| - `add`: associates a new printing option to one or more types |
| - `delete`: deletes an existing association |
| - `list`: provides a listing of all associations |
| - `clear`: deletes all associations |
| |
| ## Type Format |
| |
| Type formats enable you to quickly override the default format for displaying |
| primitive types (the usual basic C/C++/ObjC types: int, float, char, ...). |
| |
| If for some reason you want all int variables in your program to print out as |
| hex, you can add a format to the int type. |
| |
| This is done by typing |
| |
| ``` |
| (lldb) type format add --format hex int |
| ``` |
| |
| at the LLDB command line. |
| |
| The `--format` (which you can shorten to `-f`) option accepts a [format |
| name](#format-name). Then, you provide one or more types to which you want the |
| new format applied. |
| |
| A frequent scenario is that your program has a typedef for a numeric type that |
| you know represents something that must be printed in a certain way. Again, you |
| can add a format just to that typedef by using type format add with the name |
| alias. |
| |
| But things can quickly get hierarchical. Let's say you have a situation like |
| the following: |
| |
| ``` |
| typedef int A; |
| typedef A B; |
| typedef B C; |
| typedef C D; |
| ``` |
| |
| and you want to show all A's as hex, all C's as byte arrays and leave the |
| defaults untouched for other types (albeit its contrived look, the example is |
| far from unrealistic in large software systems). |
| |
| If you simply type |
| |
| ``` |
| (lldb) type format add -f hex A |
| (lldb) type format add -f uint8_t[] C |
| ``` |
| |
| values of type B will be shown as hex and values of type D as byte arrays, as in: |
| |
| ``` |
| (lldb) frame variable -T |
| (A) a = 0x00000001 |
| (B) b = 0x00000002 |
| (C) c = {0x03 0x00 0x00 0x00} |
| (D) d = {0x04 0x00 0x00 0x00} |
| ``` |
| |
| This is because by default LLDB cascades formats through typedef chains. In |
| order to prevent cascading, use the option `-C` with the value `no` when |
| defining the type format: |
| |
| ``` |
| (lldb) type format add -C no -f hex A |
| (lldb) type format add -C no -f uint8_t[] C |
| ``` |
| |
| Without cascading, the same command gives different results: |
| |
| ``` |
| (lldb) frame variable -T |
| (A) a = 0x00000001 |
| (B) b = 2 |
| (C) c = {0x03 0x00 0x00 0x00} |
| (D) d = 4 |
| ``` |
| |
| Note that qualifiers such as `const` and `volatile` will be stripped when |
| matching types. For example: |
| |
| ``` |
| (lldb) frame var x y z |
| (int) x = 1 |
| (const int) y = 2 |
| (volatile int) z = 4 |
| (lldb) type format add -f hex int |
| (lldb) frame var x y z |
| (int) x = 0x00000001 |
| (const int) y = 0x00000002 |
| (volatile int) z = 0x00000004 |
| ``` |
| |
| Type formats *can* be applied to pointers and references by using the |
| `pointer` "adjective" before the type in the `type format` command. |
| However, they specify the format to be applied to the pointer or reference |
| and not the value being referenced or pointed to. Use `--skip-pointers` (`-p`) |
| and `--skip-references` (`-r`) to change this behavior. These two |
| options prevent LLDB from applying a format defined for type `T` to |
| values of type `T*` and `T&`, respectively. |
| |
| ``` |
| (lldb) type format add -f float32[] int |
| (lldb) frame variable pointer *pointer -T |
| (int *) pointer = {1.46991e-39 1.4013e-45} |
| (int) *pointer = {1.53302e-42} |
| (lldb) type format add -f float32[] int -p |
| (lldb) frame variable pointer *pointer -T |
| (int *) pointer = 0x0000000100100180 |
| (int) *pointer = {1.53302e-42} |
| ``` |
| |
| If you need to delete a custom format type, use `type format delete` followed |
| by the name of the type for which to delete the format. |
| |
| ``` |
| (lldb) type format delete int |
| ``` |
| |
| To delete *all* formats, use `type format clear`. |
| |
| To see all the formats defined, use `type format list`. |
| |
| Instead of installing a type format for the entire debugging session, you |
| can specify type formats in an ad hoc manner using the `-f` |
| option to the `frame variable` command. For example: |
| |
| ``` |
| (lldb) frame variable counter -f hex |
| ``` |
| |
| Will display the value of counter as an hexadecimal number. |
| |
| ### Type Formats |
| |
| (format-name)= |
| |
| ```{eval-rst} |
| +-----------------------------------------------+------------------+--------------------------------------------------------------------------+ |
| | **Format name** | **Abbreviation** | **Description** | |
| +-----------------------------------------------+------------------+--------------------------------------------------------------------------+ |
| | ``default`` | | the default LLDB algorithm is used to pick a format | |
| +-----------------------------------------------+------------------+--------------------------------------------------------------------------+ |
| | ``boolean`` | B | show this as a true/false boolean, using the customary rule that 0 is | |
| | | | false and everything else is true | |
| +-----------------------------------------------+------------------+--------------------------------------------------------------------------+ |
| | ``binary`` | b | show this as a sequence of bits | |
| +-----------------------------------------------+------------------+--------------------------------------------------------------------------+ |
| | ``bytes`` | y | show the bytes one after the other | |
| +-----------------------------------------------+------------------+--------------------------------------------------------------------------+ |
| | ``bytes with ASCII`` | Y | show the bytes, but try to display them as ASCII characters as well | |
| +-----------------------------------------------+------------------+--------------------------------------------------------------------------+ |
| | ``character`` | c | show the bytes as ASCII characters | |
| +-----------------------------------------------+------------------+--------------------------------------------------------------------------+ |
| | ``printable character`` | C | show the bytes as printable ASCII characters | |
| +-----------------------------------------------+------------------+--------------------------------------------------------------------------+ |
| | ``complex float`` | F | interpret this value as the real and imaginary part of a complex | |
| | | | floating-point number | |
| +-----------------------------------------------+------------------+--------------------------------------------------------------------------+ |
| | ``c-string`` | s | show this as a 0-terminated C string | |
| +-----------------------------------------------+------------------+--------------------------------------------------------------------------+ |
| | ``decimal`` | d | show this as a signed integer number (this does not perform a cast, it | |
| | | | simply shows the bytes as an integer with sign) | |
| +-----------------------------------------------+------------------+--------------------------------------------------------------------------+ |
| | ``enumeration`` | E | show this as an enumeration, printing the | |
| | | | value's name if available or the integer value otherwise | |
| +-----------------------------------------------+------------------+--------------------------------------------------------------------------+ |
| | ``hex`` | x | show this as in hexadecimal notation (this does | |
| | | | not perform a cast, it simply shows the bytes as hex) | |
| +-----------------------------------------------+------------------+--------------------------------------------------------------------------+ |
| | ``float`` | f | show this as a floating-point number (this does not perform a cast, it | |
| | | | simply interprets the bytes as an IEEE754 floating-point value) | |
| +-----------------------------------------------+------------------+--------------------------------------------------------------------------+ |
| | ``octal`` | o | show this in octal notation | |
| +-----------------------------------------------+------------------+--------------------------------------------------------------------------+ |
| | ``OSType`` | O | show this as a MacOS OSType | |
| +-----------------------------------------------+------------------+--------------------------------------------------------------------------+ |
| | ``unicode16`` | U | show this as UTF-16 characters | |
| +-----------------------------------------------+------------------+--------------------------------------------------------------------------+ |
| | ``unicode32`` | | show this as UTF-32 characters | |
| +-----------------------------------------------+------------------+--------------------------------------------------------------------------+ |
| | ``unsigned decimal`` | u | show this as an unsigned integer number (this does not perform a cast, | |
| | | | it simply shows the bytes as unsigned integer) | |
| +-----------------------------------------------+------------------+--------------------------------------------------------------------------+ |
| | ``pointer`` | p | show this as a native pointer (unless this is really a pointer, the | |
| | | | resulting address will probably be invalid) | |
| +-----------------------------------------------+------------------+--------------------------------------------------------------------------+ |
| | ``char[]`` | | show this as an array of characters | |
| +-----------------------------------------------+------------------+--------------------------------------------------------------------------+ |
| | ``int8_t[], uint8_t[]`` | | show this as an array of the corresponding integer type | |
| | ``int16_t[], uint16_t[]`` | | | |
| | ``int32_t[], uint32_t[]`` | | | |
| | ``int64_t[], uint64_t[]`` | | | |
| | ``uint128_t[]`` | | | |
| +-----------------------------------------------+------------------+--------------------------------------------------------------------------+ |
| | ``float32[], float64[]`` | | show this as an array of the corresponding | |
| | | | floating-point type | |
| +-----------------------------------------------+------------------+--------------------------------------------------------------------------+ |
| | ``complex integer`` | I | interpret this value as the real and imaginary part of a complex integer | |
| | | | number | |
| +-----------------------------------------------+------------------+--------------------------------------------------------------------------+ |
| | ``character array`` | a | show this as a character array | |
| +-----------------------------------------------+------------------+--------------------------------------------------------------------------+ |
| | ``address`` | A | show this as an address target (symbol/file/line + offset), possibly | |
| | | | also the string this address is pointing to | |
| +-----------------------------------------------+------------------+--------------------------------------------------------------------------+ |
| | ``hex float`` | | show this as hexadecimal floating point | |
| +-----------------------------------------------+------------------+--------------------------------------------------------------------------+ |
| | ``instruction`` | i | show this as an disassembled opcode | |
| +-----------------------------------------------+------------------+--------------------------------------------------------------------------+ |
| | ``void`` | v | don't show anything | |
| +-----------------------------------------------+------------------+--------------------------------------------------------------------------+ |
| ``` |
| |
| (type-summary)= |
| |
| ## Type Summary |
| |
| Type formats specify the way to format a variable/value with a fundamental type. |
| When you want to display a variable/value for a user-defined type,[^footnote-1] you need |
| another tool, type summaries. |
| |
| Type summaries work by extracting information from aggregate types and arranging it |
| in a user-defined format. Consider the following example: |
| |
| Before adding a type summary, for a C++ program with a `struct` declared like |
| |
| ```C++ |
| struct Person { |
| std::string name; |
| int age; |
| }; |
| ``` |
| |
| LLDB will format a variable that is an instance of the `Person` type as |
| |
| ``` |
| (lldb) frame variable -T p |
| (Person) p = { |
| (std::string) name = "Jaime" |
| (int) age = 44 |
| } |
| ``` |
| |
| Using a type summary, the LLDB user can specify that an instance of |
| the `Person` type be formatted as |
| |
| ``` |
| (lldb) frame variable -T p |
| (Person) p = The person is named "Jaime" and is 44 years old. |
| ``` |
| |
| There are two methods for specifying type summaries: |
| 1\. by binding a Summary String to the type; and |
| 2\. by writing and registering a Python script that returns the Summary String for a type. |
| |
| Method (1) was used to format the `Person` type shown above: |
| |
| ``` |
| (lldb) type summary add --summary-string "The person is named ${var.name} and is ${var.age} years old." Person |
| ``` |
| |
| ## Summary Format Matching On Pointers |
| |
| A summary formatter for a type `T` may not be appropriate to use |
| for pointers to that type. |
| |
| - If the formatter is only appropriate for the type and |
| not its pointers, use the `-p` option to `type summary` to restrict it to match |
| values of type `T`. |
| - If the formatter is appropriate for the type and its pointers, |
| use the `-d <number>` option to `type summary` to specify the number of |
| pointer indirections the formatter should match. The default value is 1. If you want |
| to also match `T**` set `-d` to 2, and so on. |
| |
| In all cases, the value passed to the summary formatter for interpolation |
| into the Summary String (see {ref}`summary-strings`) will be dereferenced to the type `T` |
| if the type `T` is reached in at most the number of dereferences you specify |
| to the `-d` option (or once (1), when using the default). |
| |
| For example, assume that the `Person` struct is in the scope of a C++ |
| program with the following declarations: |
| |
| ```C++ |
| Person p{.name = "Jaime", .age = 44}; |
| Person *q{new Person{.name = "Jaime", .age = 44}}; |
| Person **r{&q}; |
| ``` |
| |
| And the following Summary String has been bound to `Person` type using the `type summary` |
| command shown here: |
| |
| ``` |
| (lldb) type summary add --summary-string "The person is named ${var.name} and is ${var.age} years old." Person |
| ``` |
| |
| Then LLDB will produce output which, at first, is unexpected: |
| |
| ``` |
| (lldb) frame var p |
| (Person) The person is named "Jaime" and is 44 years old. |
| (lldb) frame var q |
| (Person *) 0x00000000004172b0 The person is named "Jaime" and is 44 years old. |
| (lldb) frame var r |
| (Person **) 0x00007fffffffd658 error: summary string parsing error |
| ``` |
| |
| Upon closer inspection, LLDB is doing exactly as instructed. `p` has type `T` |
| and needs to be dereferenced 0 times to reach type `T`. `q` |
| has type `*T` and needs to be dereferenced 1 time to reach type `T`. Because |
| the Summary String was installed with the default `-d` value of 1, the summary formatter |
| will use `*q` for interpolation. |
| |
| However, `r` has type `**T` and needs to be dereferenced 2 times to reach `T`. Because |
| the Summary String was installed with the default `-d` value of 1, the summary formatter |
| will use `*r` for interpolation. Because `*r` has type `*T` (and not `T`), the error |
| above is reasonable. |
| |
| (summary-strings)= |
| |
| ## Summary Strings |
| |
| A Summary String contains a sequence of tokens that are processed by LLDB to |
| generate the summary for a user-defined type. Summary Strings are written in |
| an LLDB-specific control language.[^footnote-2] |
| |
| Summary Strings can contain |
| |
| - references to the instance of the user-defined type being formatted by the |
| Summary String |
| - plain text, |
| - control characters, and |
| - special variables that have access to information about the current object |
| and the overall program state. |
| |
| `${var}` can be used refer to the variable being formatted by the Summary String. |
| |
| Plain text is any sequence of characters that doesn't contain a `{`, `}`, `$`, |
| or `\` character, which are the syntax control characters. |
| |
| Special variables can be used between `${` prefix, and `}` suffix. For example, |
| `var` could be used to refer to the instance of the variable being |
| formatted by the Summary String. Children of variables can be accessed using the |
| variable itself (e.g. `${var.child.otherchild}` or `${file.basename}`). A variable |
| can also be prefixed or suffixed with other symbols to change the way its value is |
| handled. For example, `${*var.int_pointer[0-3]}`. |
| |
| The simplest thing you can do is access a member variable of a class or structure |
| by typing its expression path. The expression path for a field named `y` in a |
| user-defined type is simply `.y`. Thus, to ask the Summary String to display |
| field `y` of an instance of `struct A` (below) you would type `${var.y}`. |
| |
| If you have code in a C++ program with the following declarations: |
| |
| ```C++ |
| struct A { |
| int x; |
| int y; |
| }; |
| struct B { |
| A x; |
| A y; |
| int *z; |
| }; |
| ``` |
| |
| And were writing a Summary String to format an instance of `B`, the expression |
| path for the `y` member of the `x` member would be `.x.y` and you would |
| use `${var.x.y}` in the Summary String. |
| |
| By default, a summary defined for type `T`, also works for types `T*` and `T&` |
| (as mentioned above, you can disable this behavior if desired). For this reason, |
| expression paths do not differentiate between `.` and `->`, and the above |
| expression path `.x.y` would be valid for displaying a value with type `B*`, |
| `B` or even if the actual declaration of `B` were instead: |
| |
| ```C++ |
| struct B { |
| A *x; |
| A y; |
| int *z; |
| }; |
| ``` |
| |
| This behavior differs from `frame variable` which does enforce the distinction |
| between `T` and `T*`. The rationale for this choice is that ignoring this |
| distinction enables you to write a Summary String once for type `T` and use |
| it for both `T` and `T*` instances. As a Summary String is mostly about |
| extracting nested members' information, a pointer to an object is just as good |
| as the object itself for that purpose. |
| |
| If you need to access the value of the integer pointed to by `B::z`, your |
| Summary String cannot simply use expression path `.z` because that symbol |
| refers to the pointer `z`. To access the value of the integer pointed to by |
| `B::z` in your Summary String, you should use the same expression path but |
| dereference it: `${*var.z}`. The `*` tells LLDB to get the object that |
| the expression path leads to and then dereference it. In this example, it is |
| equivalent to `*(bObject.z)` in C/C++ syntax. Because `.` and `->` |
| operators can be used interchangeably, there is no need to have dereferences |
| in the middle of an expression path. For example, you do not need to type |
| `${*(var.x).x}` to read `A::x` as contained in `*(B::x)`. Instead, you |
| can simply write `${var.x->x}`, or even `${var.x.x}`. The `*` operator |
| only binds to the result of the whole expression path, rather than piecewise, |
| and there is no way to use parentheses to change that behavior. |
| |
| ## Formatting Summary Elements |
| |
| An expression path can include formatting codes. Much like the type formats |
| discussed previously, you can also customize the way variables are displayed in |
| Summary Strings, regardless of the format they have applied to their types. To |
| do that, you can use %format inside an expression path, as in \${var.x->x%u}, |
| which would display the value of x as an unsigned integer. |
| |
| Additionally, custom output can be achieved by using an LLVM format string, |
| commencing with the `:` marker. To illustrate, compare `${var.byte%x}` and |
| `${var.byte:x-}`. The former uses LLDB's builtin hex formatting (`x`), |
| which unconditionally inserts a `0x` prefix, and also zero pads the value to |
| match the size of the type. The latter uses `llvm::formatv` formatting |
| (`:x-`), and will print only the hex value, with no `0x` prefix, and no |
| padding. This raw control is useful when composing multiple pieces into a |
| larger whole. |
| |
| You can also use some other special format markers, not available for formats |
| themselves, but which carry a special meaning when used in this context: |
| |
| ```{eval-rst} |
| +------------+--------------------------------------------------------------------------+ |
| | **Symbol** | **Description** | |
| +------------+--------------------------------------------------------------------------+ |
| | ``Symbol`` | ``Description`` | |
| +------------+--------------------------------------------------------------------------+ |
| | ``%S`` | Use this object's summary (the default for aggregate types) | |
| +------------+--------------------------------------------------------------------------+ |
| | ``%V`` | Use this object's value (the default for non-aggregate types) | |
| +------------+--------------------------------------------------------------------------+ |
| | ``%@`` | Use a language-runtime specific description (for C++ this does nothing, | |
| | | for Objective-C it calls the NSPrintForDebugger API) | |
| +------------+--------------------------------------------------------------------------+ |
| | ``%L`` | Use this object's location (memory address, register name, ...) | |
| +------------+--------------------------------------------------------------------------+ |
| | ``%#`` | Use the count of the children of this object | |
| +------------+--------------------------------------------------------------------------+ |
| | ``%T`` | Use this object's datatype name | |
| +------------+--------------------------------------------------------------------------+ |
| | ``%N`` | Print the variable's basename | |
| +------------+--------------------------------------------------------------------------+ |
| | ``%>`` | Print the expression path for this item | |
| +------------+--------------------------------------------------------------------------+ |
| ``` |
| |
| Since LLDB 3.7.0, you can also specify `${script.var:pythonFuncName}`. |
| |
| It is expected that the function name you use specifies a function whose |
| signature is the same as a Python summary function. The return string from the |
| function will be placed verbatim in the output. |
| |
| You cannot use element access, or formatting symbols, in combination with this |
| syntax. For example the following: |
| |
| ``` |
| ${script.var.element[0]:myFunctionName%@} |
| ``` |
| |
| is not valid and will cause the summary to fail to evaluate. |
| |
| ## Element Inlining |
| |
| Option --inline-children (-c) to type summary add tells LLDB not to look for a Summary String, but instead to just print a listing of all the object's children on one line. |
| |
| As an example, given a type pair: |
| |
| ``` |
| (lldb) frame variable --show-types a_pair |
| (pair) a_pair = { |
| (int) first = 1; |
| (int) second = 2; |
| } |
| ``` |
| |
| If one types the following commands: |
| |
| ``` |
| (lldb) type summary add --inline-children pair |
| ``` |
| |
| the output becomes: |
| |
| ``` |
| (lldb) frame variable a_pair |
| (pair) a_pair = (first=1, second=2) |
| ``` |
| |
| Of course, one can obtain the same effect by typing |
| |
| ``` |
| (lldb) type summary add pair --summary-string "(first=${var.first}, second=${var.second})" |
| ``` |
| |
| While the final result is the same, using --inline-children can often save |
| time. If one does not need to see the names of the variables, but just their |
| values, the option --omit-names (-O, uppercase letter o), can be combined with |
| --inline-children to obtain: |
| |
| ``` |
| (lldb) frame variable a_pair |
| (pair) a_pair = (1, 2) |
| ``` |
| |
| which is of course the same as typing |
| |
| ``` |
| (lldb) type summary add pair --summary-string "(${var.first}, ${var.second})" |
| ``` |
| |
| ## Bitfields And Array Syntax |
| |
| Sometimes, a basic type's value actually represents several different values |
| packed together in a bitfield. |
| |
| With the classical view, there is no way to look at them. Hexadecimal display |
| can help, but if the bits actually span nibble boundaries, the help is limited. |
| |
| Binary view would show it all without ambiguity, but is often too detailed and |
| hard to read for real-life scenarios. |
| |
| To cope with the issue, LLDB supports native bitfield formatting in summary |
| strings. If your expression paths leads to a so-called scalar type (the usual |
| int, float, char, double, short, long, long long, double, long double and |
| unsigned variants), you can ask LLDB to only access some bits out of the value |
| and display them in any format you like. If you only need one bit you can use |
| the [n], just like indexing an array. To extract multiple bits, you can use a |
| slice-like syntax: [n-m], e.g. |
| |
| ``` |
| (lldb) frame variable float_point |
| (float) float_point = -3.14159 |
| ``` |
| |
| ``` |
| (lldb) type summary add --summary-string "Sign: ${var[31]%B} Exponent: ${var[30-23]%x} Mantissa: ${var[0-22]%u}" float |
| (lldb) frame variable float_point |
| (float) float_point = -3.14159 Sign: true Exponent: 0x00000080 Mantissa: 4788184 |
| ``` |
| |
| In this example, LLDB shows the internal representation of a float variable by |
| extracting bitfields out of a float object. |
| |
| When typing a range, the extremes n and m are always included, and the order of |
| the indices is irrelevant. |
| |
| LLDB also allows to use a similar syntax to display array members inside a Summary String. |
| For instance, you may want to display all arrays of a given type using a more compact |
| notation than the default, and then just delve into individual array members that prove |
| interesting to your debugging task. You can tell LLDB to format arrays in special ways, |
| possibly independent of the way the array members' datatype is formatted. For example: |
| |
| ``` |
| (lldb) frame variable sarray |
| (Simple [3]) sarray = { |
| [0] = { |
| x = 1 |
| y = 2 |
| z = '\x03' |
| } |
| [1] = { |
| x = 4 |
| y = 5 |
| z = '\x06' |
| } |
| [2] = { |
| x = 7 |
| y = 8 |
| z = '\t' |
| } |
| } |
| |
| (lldb) type summary add --summary-string "${var[].x}" "Simple [3]" |
| |
| (lldb) frame variable sarray |
| (Simple [3]) sarray = [1,4,7] |
| ``` |
| |
| The [] symbol amounts to: if `var` is an array and I know its size, apply this Summary String |
| to every element of the array. Here, we are asking LLDB to display `.x` for every element of |
| the array, and in fact this is what happens. If you find some of those integers anomalous, |
| you can then inspect that one item in greater detail, without the array format getting in the way: |
| |
| ``` |
| (lldb) frame variable sarray[1] |
| (Simple) sarray[1] = { |
| x = 4 |
| y = 5 |
| z = '\x06' |
| } |
| ``` |
| |
| You can also ask LLDB to only print a subset of the array range by using the |
| same syntax used to extract bit for bitfields: |
| |
| ``` |
| (lldb) type summary add --summary-string "${var[1-2].x}" "Simple [3]" |
| |
| (lldb) frame variable sarray |
| (Simple [3]) sarray = [4,7] |
| ``` |
| |
| If you are dealing with a pointer that you know is an array, you can use this |
| syntax to display the elements contained in the pointed array instead of just |
| the pointer value. However, because pointers have no notion of their size, the |
| empty brackets [] operator does not work, and you must explicitly provide |
| higher and lower bounds. |
| |
| In general, LLDB needs the square brackets `operator []` in order to handle |
| arrays and pointers correctly, and for pointers it also needs a range. However, |
| a few special cases are defined to make your life easier: |
| |
| you can print a 0-terminated string (C-string) using the %s format, omitting |
| square brackets, as in: |
| |
| ``` |
| (lldb) type summary add --summary-string "${var%s}" "char *" |
| ``` |
| |
| This syntax works for char\* as well as for char[] because LLDB can rely on the |
| final 0 terminator to know when the string has ended. |
| |
| LLDB has default Summary Strings for char\* and char[] that use this special |
| case. On debugger startup, the following are defined automatically: |
| |
| ``` |
| (lldb) type summary add --summary-string "${var%s}" "char *" |
| (lldb) type summary add --summary-string "${var%s}" -x "char \[[0-9]+]" |
| ``` |
| |
| any of the array formats (int8_t[], float32{}, ...), and the y, Y and a formats |
| work to print an array of a non-aggregate type, even if square brackets are |
| omitted. |
| |
| ``` |
| (lldb) type summary add --summary-string "${var%int32_t[]}" "int [10]" |
| ``` |
| |
| This feature, however, is not enabled for pointers because there is no way for |
| LLDB to detect the end of the pointed data. |
| |
| This also does not work for other formats (e.g. boolean), and you must specify |
| the square brackets operator to get the expected output. |
| |
| ## Python Scripting |
| |
| Most of the times, Summary Strings prove good enough for the job of summarizing |
| the contents of a variable. However, as soon as you need to do more than |
| picking some values and rearranging them for display, Summary Strings stop |
| being an effective tool. This is because Summary Strings lack the power to |
| actually perform any kind of computation on the value of variables. |
| |
| To solve this issue, you can bind some Python scripting code as a summary for |
| your datatype, and that script has the ability to both extract child |
| variables as the Summary Strings do, and to perform active computation on the |
| extracted values. As a small example, let's say we have a Rectangle class: |
| |
| ``` |
| class Rectangle |
| { |
| private: |
| int height; |
| int width; |
| public: |
| Rectangle() : height(3), width(5) {} |
| Rectangle(int H) : height(H), width(H*2-1) {} |
| Rectangle(int H, int W) : height(H), width(W) {} |
| int GetHeight() { return height; } |
| int GetWidth() { return width; } |
| }; |
| ``` |
| |
| Summary Strings are effective to reduce the screen real estate used by the |
| default viewing mode, but are not effective if we want to display the area and |
| perimeter of Rectangle objects |
| |
| To obtain this, we can simply attach a small Python script to the Rectangle |
| class, as shown in this example: |
| |
| ``` |
| (lldb) type summary add -P Rectangle |
| Enter your Python command(s). Type 'DONE' to end. |
| def function (valobj,internal_dict,options): |
| height_val = valobj.GetChildMemberWithName('height') |
| width_val = valobj.GetChildMemberWithName('width') |
| height = height_val.GetValueAsUnsigned(0) |
| width = width_val.GetValueAsUnsigned(0) |
| area = height*width |
| perimeter = 2*(height + width) |
| return 'Area: ' + str(area) + ', Perimeter: ' + str(perimeter) |
| DONE |
| (lldb) frame variable |
| (Rectangle) r1 = Area: 20, Perimeter: 18 |
| (Rectangle) r2 = Area: 72, Perimeter: 36 |
| (Rectangle) r3 = Area: 16, Perimeter: 16 |
| ``` |
| |
| In order to write effective summary scripts, you need to know the LLDB public |
| API, which is the way Python code can access the LLDB object model. For further |
| details on the API you should look at the LLDB API reference documentation. |
| |
| As a brief introduction, your script is encapsulated into a function that is |
| passed two parameters: `valobj` and `internal_dict`. |
| |
| `internal_dict` is an internal support parameter used by LLDB and you should |
| not touch it. |
| |
| `valobj` is the object encapsulating the actual variable being displayed, and |
| its type is {any}`SBValue`. Out of the many possible operations on an {any}`SBValue`, the |
| basic one is retrieve the children objects it contains (essentially, the fields |
| of the object wrapped by it), by calling `GetChildMemberWithName()`, passing |
| it the child's name as a string. |
| |
| If the variable has a value, you can ask for it, and return it as a string |
| using `GetValue()`, or as a signed/unsigned number using |
| `GetValueAsSigned()`, `GetValueAsUnsigned()`. It is also possible to |
| retrieve an {any}`SBData` object by calling `GetData()` and then read the object's |
| contents out of the {any}`SBData`. |
| |
| If you need to delve into several levels of hierarchy, as you can do with |
| Summary Strings, you can use the method `GetValueForExpressionPath()`, |
| passing it an expression path just like those you could use for Summary Strings |
| (one of the differences is that dereferencing a pointer does not occur by |
| prefixing the path with a `` *` ``, but by calling the `Dereference()` method |
| on the returned {any}`SBValue`). If you need to access array slices, you cannot do |
| that (yet) via this method call, and you must use `GetChildAtIndex()` |
| querying it for the array items one by one. Also, handling custom formats is |
| something you have to deal with on your own. |
| |
| `options` Python summary formatters can optionally define this |
| third argument, which is an object of type `lldb.SBTypeSummaryOptions`, |
| allowing for a few customizations of the result. The decision to |
| adopt or not this third argument - and the meaning of options |
| thereof - is up to the individual formatter's writer. |
| |
| Other than interactively typing a Python script there are two other ways for |
| you to input a Python script as a summary: |
| |
| - using the --python-script option to type summary add and typing the script |
| code as an option argument; as in: |
| |
| ``` |
| (lldb) type summary add --python-script "height = valobj.GetChildMemberWithName('height').GetValueAsUnsigned(0);width = valobj.GetChildMemberWithName('width').GetValueAsUnsigned(0); return 'Area: %d' % (height*width)" Rectangle |
| ``` |
| |
| - using the --python-function (-F) option to type summary add and giving the |
| name of a Python function with the correct prototype. Most probably, you will |
| define (or have already defined) the function in the interactive interpreter, |
| or somehow loaded it from a file, using the command script import command. |
| LLDB will emit a warning if it is unable to find the function you passed, but |
| will still register the binding. |
| - using the `@lldb.summary` decorator on a function definition. This combines |
| the definition and registration into a single step. See |
| {ref}`decorator-formatters` for details. |
| |
| ## Regular Expression Typenames |
| |
| As you noticed, in order to associate the custom Summary String to the array |
| types, one must give the array size as part of the typename. This can long |
| become tiresome when using arrays of different sizes, Simple [3], Simple [9], |
| Simple [12], ... |
| |
| If you use the -x option, type names are treated as regular expressions instead |
| of type names. This would let you rephrase the above example for arrays of type |
| Simple [3] as: |
| |
| ``` |
| (lldb) type summary add --summary-string "${var[].x}" -x "Simple \[[0-9]+\]" |
| (lldb) frame variable |
| (Simple [3]) sarray = [1,4,7] |
| (Simple [2]) sother = [3,6] |
| ``` |
| |
| The above scenario works for Simple [3] as well as for any other array of |
| Simple objects. |
| |
| While this feature is mostly useful for arrays, you could also use regular |
| expressions to catch other type sets grouped by name. However, as regular |
| expression matching is slower than normal name matching, LLDB will first try to |
| match by name in any way it can, and only when this fails, will it resort to |
| regular expression matching. |
| |
| One of the ways LLDB uses this feature internally, is to match the names of STL |
| container classes, regardless of the template arguments provided. The details |
| for this are found at FormatManager.cpp |
| |
| The regular expression language used by LLDB is the POSIX extended language, as |
| defined by the Single UNIX Specification, of which macOS is a compliant |
| implementation. |
| |
| ## Names Summaries |
| |
| For a given type, there may be different meaningful summary representations. |
| However, currently, only one summary can be associated to a type at each |
| moment. If you need to temporarily override the association for a variable, |
| without changing the Summary String for to its type, you can use named |
| summaries. |
| |
| Named summaries work by attaching a name to a summary when creating it. Then, |
| when there is a need to attach the summary to a variable, the frame variable |
| command, supports a --summary option that tells LLDB to use the named summary |
| given instead of the default one. |
| |
| ``` |
| (lldb) type summary add --summary-string "x=${var.integer}" --name NamedSummary |
| (lldb) frame variable one |
| (i_am_cool) one = int = 3, float = 3.14159, char = 69 |
| (lldb) frame variable one --summary NamedSummary |
| (i_am_cool) one = x=3 |
| ``` |
| |
| When defining a named summary, binding it to one or more types becomes |
| optional. Even if you bind the named summary to a type, and later change the |
| Summary String for that type, the named summary will not be changed by that. |
| You can delete named summaries by using the type summary delete command, as if |
| the summary name was the datatype that the summary is applied to |
| |
| A summary attached to a variable using the --summary option, has the same |
| semantics that a custom format attached using the -f option has: it stays |
| attached till you attach a new one, or till you let your program run again. |
| |
| (synthetic-children)= |
| |
| ## Synthetic Children |
| |
| Summaries work well when one is able to navigate through an expression path. In |
| order for LLDB to do so, appropriate debugging information must be available. |
| |
| Some types are opaque, i.e. no knowledge of their internals is provided. When |
| that's the case, expression paths do not work correctly. |
| |
| In other cases, the internals are available to use in expression paths, but |
| they do not provide a user-friendly representation of the object's value. |
| |
| For instance, consider an STL vector, as implemented by the GNU C++ Library: |
| |
| ``` |
| (lldb) frame variable numbers -T |
| (std::vector<int>) numbers = { |
| (std::_Vector_base<int, std::allocator<int> >) std::_Vector_base<int, std::allocator<int> > = { |
| (std::_Vector_base<int, std::allocator&tl;int> >::_Vector_impl) _M_impl = { |
| (int *) _M_start = 0x00000001001008a0 |
| (int *) _M_finish = 0x00000001001008a8 |
| (int *) _M_end_of_storage = 0x00000001001008a8 |
| } |
| } |
| } |
| ``` |
| |
| Here, you can see how the type is implemented, and you can write a summary for |
| that implementation but that is not going to help you infer what items are |
| actually stored in the vector. |
| |
| What you would like to see is probably something like: |
| |
| ``` |
| (lldb) frame variable numbers -T |
| (std::vector<int>) numbers = { |
| (int) [0] = 1 |
| (int) [1] = 12 |
| (int) [2] = 123 |
| (int) [3] = 1234 |
| } |
| ``` |
| |
| Synthetic children are a way to get that result. |
| |
| The feature is based upon the idea of providing a new set of children for a |
| variable that replaces the ones available by default through the debug |
| information. In the example, we can use synthetic children to provide the |
| vector items as children for the std::vector object. |
| |
| In order to create synthetic children, you need to provide a Python class that |
| adheres to a given interface (the word is italicized because Python has no |
| explicit notion of interface, by that word we mean a given set of methods must |
| be implemented by the Python class): |
| |
| ```python |
| class SyntheticChildrenProvider: |
| def __init__(self, valobj: lldb.SBValue, internal_dict): |
| """" |
| This call should initialize the Python object using valobj as the |
| variable to provide synthetic children for. |
| """" |
| |
| def num_children(self, max_children: int) -> int: |
| """ |
| This call should return the number of children that you want your |
| object to have[1]. |
| """ |
| |
| def get_child_index(self, name: str) -> int: |
| """ |
| This call should return the index of the synthetic child whose name is |
| given as the argument. Array subscripting, names in the form "[N]", is |
| automatically supported. |
| Return -1 if there is no child at the index. |
| """ |
| |
| def get_child_at_index(self, index: int) -> lldb.SBValue | None: |
| """" |
| This call should return a new LLDB SBValue object representing the |
| child at the index given as argument. |
| """ |
| |
| def update(self) -> bool: |
| """" |
| This call should be used to update the internal state of this Python |
| object whenever the state of the variables in LLDB changes.[2] |
| Also, this method is invoked before any other method in the interface. |
| """ |
| |
| def has_children(self) -> bool: |
| """ |
| This call should return True if this object might have children, and |
| False if this object can be guaranteed not to have children.[3] |
| """ |
| |
| def get_value(self) -> lldb.SBValue | None: |
| """ |
| This call can return an SBValue to be presented as the value of the |
| synthetic value under consideration.[4] |
| """" |
| ``` |
| |
| As a warning, exceptions that are thrown by python formatters are caught |
| silently by LLDB and should be handled appropriately by the formatter itself. |
| Being more specific, in case of exceptions, LLDB might assume that the given |
| object has no children or it might skip printing some children, as they are |
| printed one by one. |
| |
| [1] The `max_children` argument is optional (since LLDB 3.8.0) and indicates the |
| maximum number of children that LLDB is interested in (at this moment). If the |
| computation of the number of children is expensive (for example, requires |
| traversing a linked list to determine its size) your implementation may return |
| `max_children` rather than the actual number. If the computation is cheap (e.g., the |
| number is stored as a field of the object), then you can always return the true |
| number of children (that is, ignore the `max_children` argument). |
| |
| [2] This method is optional. Also, a boolean value must be returned (since LLDB |
| 3.1.0). If `False` is returned, then whenever the process reaches a new stop, |
| this method will be invoked again to generate an updated list of the children |
| for a given variable. Otherwise, if `True` is returned, then the value is |
| cached and this method won't be called again, effectively freezing the state of |
| the value in subsequent stops. Beware that returning `True` incorrectly could |
| show misleading information to the user. |
| |
| [3] This method is optional (since LLDB 3.2.0). While implementing it in terms |
| of num_children is acceptable, implementors are encouraged to look for |
| optimized coding alternatives whenever reasonable. |
| |
| [4] This method is optional (since LLDB 3.5.2). The {any}`SBValue` you return here |
| will most likely be a numeric type (int, float, ...) as its value bytes will be |
| used as-if they were the value of the root {any}`SBValue` proper. As a shortcut for |
| this, you can inherit from lldb.SBSyntheticValueProvider, and just define |
| get_value as other methods are defaulted in the superclass as returning default |
| no-children responses. |
| |
| If a synthetic child provider supplies a special child named |
| `$$dereference$$` then it will be used when evaluating `operator *` and |
| `operator ->` in the frame variable command and related SB API |
| functions. It is possible to declare this synthetic child without |
| including it in the range of children displayed by LLDB. For example, |
| this subset of a synthetic children provider class would allow the |
| synthetic value to be dereferenced without actually showing any |
| synthetic children in the UI: |
| |
| ```python |
| class SyntheticChildrenProvider: |
| [...] |
| def num_children(self) -> int: |
| return 0 |
| |
| def get_child_index(self, name: str) -> int: |
| if name == '$$dereference$$': |
| return 0 |
| return -1 |
| |
| def get_child_at_index(self, index: int) -> lldb.SBValue | None: |
| if index == 0: |
| return <valobj resulting from dereference> |
| return None |
| ``` |
| |
| For examples of how synthetic children are created, you are encouraged to look |
| at examples/synthetic in the LLDB trunk. Please, be aware that the code in |
| those files (except bitfield/) is legacy code and is not maintained. You may |
| especially want to begin looking at this example to get a feel for this |
| feature, as it is a very easy and well commented example. |
| |
| The design pattern consistently used in synthetic providers shipping with LLDB |
| is to use the \_\_init\_\_ to store the {any}`SBValue` instance as a part of self. The |
| update function is then used to perform the actual initialization. Once a |
| synthetic children provider is written, one must load it into LLDB before it |
| can be used. Currently, one can use the LLDB script command to type Python code |
| interactively, or use the command script import fileName command to load Python |
| code from a Python module (ordinary rules apply to importing modules this way). |
| A third option is to type the code for the provider class interactively while |
| adding it. |
| |
| For example, let's pretend we have a class Foo for which a synthetic children |
| provider class Foo_Provider is available, in a Python module contained in file |
| \~/Foo_Tools.py. The following interaction sets Foo_Provider as a synthetic |
| children provider in LLDB: |
| |
| ``` |
| (lldb) command script import ~/Foo_Tools.py |
| (lldb) type synthetic add Foo --python-class Foo_Tools.Foo_Provider |
| (lldb) frame variable a_foo |
| (Foo) a_foo = { |
| x = 1 |
| y = "Hello world" |
| } |
| ``` |
| |
| As an alternative to the two-step import-then-register pattern above, you can |
| use the `@lldb.synthetic` decorator to combine both steps. See |
| {ref}`decorator-formatters` for details. |
| |
| LLDB has synthetic children providers for a core subset of STL classes, both in |
| the version provided by libstdcpp and by libcxx, as well as for several |
| Foundation classes. |
| |
| Synthetic children extend Summary Strings by enabling a new special variable: |
| `${svar`. |
| |
| This symbol tells LLDB to refer expression paths to the synthetic children |
| instead of the real ones. For instance, |
| |
| ``` |
| (lldb) type summary add --expand -x "std::vector<" --summary-string "${svar%#} items" |
| (lldb) frame variable numbers |
| (std::vector<int>) numbers = 4 items { |
| (int) [0] = 1 |
| (int) [1] = 12 |
| (int) [2] = 123 |
| (int) [3] = 1234 |
| } |
| ``` |
| |
| It's important to mention that LLDB invokes the synthetic child provider before |
| invoking the Summary String provider, which allows the latter to have access to |
| the actual displayable children. This applies to both inlined Summary Strings |
| and python-based summary providers. |
| |
| As a warning, when programmatically accessing the children or children count of |
| a variable that has a synthetic child provider, notice that LLDB hides the |
| actual raw children. For example, suppose we have a `std::vector`, which has |
| an actual in-memory property `__begin` marking the beginning of its data. |
| After the synthetic child provider is executed, the `std::vector` variable |
| won't show `__begin` as child anymore, even through the SB API. It will have |
| instead the children calculated by the provider. In case the actual raw |
| children are needed, a call to `value.GetNonSyntheticValue()` is enough to |
| get a raw version of the value. It is import to remember this when implementing |
| Summary String providers, as they run after the synthetic child provider. |
| |
| In some cases, if LLDB is unable to use the real object to get a child |
| specified in an expression path, it will automatically refer to the synthetic |
| children. While in summaries it is best to always use \${svar to make your |
| intentions clearer, interactive debugging can benefit from this behavior, as |
| in: |
| |
| ``` |
| (lldb) frame variable numbers[0] numbers[1] |
| (int) numbers[0] = 1 |
| (int) numbers[1] = 12 |
| ``` |
| |
| Unlike many other visualization features, however, the access to synthetic |
| children only works when using frame variable, and is not supported in |
| expression: |
| |
| ``` |
| (lldb) expression numbers[0] |
| Error [IRForTarget]: Call to a function '_ZNSt33vector<int, std::allocator<int> >ixEm' that is not present in the target |
| error: Couldn't convert the expression to DWARF |
| ``` |
| |
| The reason for this is that classes might have an overloaded `operator []`, |
| or other special provisions and the expression command chooses to ignore |
| synthetic children in the interest of equivalency with code you asked to have |
| compiled from source. |
| |
| (decorator-formatters)= |
| |
| ## Decorator-Based Formatter Registration |
| |
| Beginning in version 23, LLDB provides Python decorators that combine the |
| definition and registration of formatters into a single step. When a module |
| using these decorators is loaded via `command script import`, the formatters |
| are automatically registered. |
| |
| ### `@lldb.summary` |
| |
| The `@lldb.summary` decorator registers a Python function as a type summary |
| provider: |
| |
| ```python |
| import lldb |
| |
| @lldb.summary("Person") |
| def PersonSummary(valobj: lldb.SBValue, _) -> str: |
| name = valobj.GetChildMemberWithName("name").GetSummary() |
| age = valobj.GetChildMemberWithName("age").GetValueAsUnsigned() |
| return f"name={name}, age={age}" |
| ``` |
| |
| This is equivalent to defining the function and then running: |
| |
| ``` |
| (lldb) type summary add --python-function module.PersonSummary Person |
| ``` |
| |
| ### `@lldb.synthetic` |
| |
| The `@lldb.synthetic` decorator registers a Python class as a synthetic child |
| provider: |
| |
| ```python |
| import lldb |
| |
| @lldb.synthetic("Person") |
| class PersonChildren: |
| def __init__(self, valobj: lldb.SBValue, _) -> None: |
| self.valobj = valobj |
| |
| def update(self) -> bool: |
| self.name = self.valobj.GetChildMemberWithName("name") |
| self.age = self.valobj.GetChildMemberWithName("age") |
| return True |
| |
| def num_children(self) -> int: |
| return 2 |
| |
| def get_child_at_index(self, index) -> lldb.SBValue: |
| if index == 0: |
| return self.name |
| if index == 1: |
| return self.age |
| return lldb.SBValue() |
| ``` |
| |
| This is equivalent to defining the class and then running: |
| |
| ``` |
| (lldb) type synthetic add --python-class module.PersonChildren Person |
| ``` |
| |
| ### Passing Options |
| |
| Both decorators accept keyword arguments that map to command-line flags of the |
| corresponding `type summary add` or `type synthetic add` commands. |
| Underscores in keyword names are converted to hyphens. A value of `True` |
| produces a bare flag; any other value is passed as the flag's argument. |
| |
| For example, to register a summary for a regex type pattern with the expand |
| flag: |
| |
| ```python |
| @lldb.summary("^std::vector<.+>$", regex=True, expand=True) |
| def VectorSummary(valobj, internal_dict): |
| ... |
| ``` |
| |
| This is equivalent to: |
| |
| ``` |
| (lldb) type summary add --python-function module.VectorSummary --regex --expand "^std::vector<.+>$" |
| ``` |
| |
| Other commonly used options include `category` to place the formatter in a |
| specific category. |
| |
| ## Filters |
| |
| Filters are a solution to the display of complex classes. At times, classes |
| have many member variables but not all of these are actually necessary for the |
| user to see. |
| |
| A filter will solve this issue by only letting the user see those member |
| variables they care about. Of course, the equivalent of a filter can be |
| implemented easily using synthetic children, but a filter lets you get the job |
| done without having to write Python code. |
| |
| For instance, if your class Foobar has member variables named A thru Z, but you |
| only need to see the ones named B, H and Q, you can define a filter: |
| |
| ``` |
| (lldb) type filter add Foobar --child B --child H --child Q |
| (lldb) frame variable a_foobar |
| (Foobar) a_foobar = { |
| (int) B = 1 |
| (char) H = 'H' |
| (std::string) Q = "Hello world" |
| } |
| ``` |
| |
| ## Callback-based type matching |
| |
| Even though regular expression matching works well for the vast majority of data |
| formatters (you normally know the name of the type you're writing a formatter |
| for), there are some cases where it's useful to look at the type before deciding |
| what formatter to apply. |
| |
| As an example scenario, imagine we have a code generator that produces some |
| classes that inherit from a common `GeneratedObject` class, and we have a |
| summary function and a synthetic child provider that work for all |
| `GeneratedObject` instances (they all follow the same pattern). However, there |
| is no common pattern in the name of these classes, so we can't register the |
| formatter neither by name nor by regular expression. |
| |
| In that case, you can write a recognizer function like this: |
| |
| ```python |
| def is_generated_object(sbtype: lldb.SBType, internal_dict) -> bool: |
| for base in sbtype.get_bases_array(): |
| if base.GetName() == "GeneratedObject" |
| return True |
| return False |
| ``` |
| |
| And pass this function to `type summary add` and `type synthetic add` using |
| the flag `--recognizer-function`. |
| |
| ``` |
| (lldb) type summary add --expand --python-function my_summary_function --recognizer-function is_generated_object |
| (lldb) type synthetic add --python-class my_child_provider --recognizer-function is_generated_object |
| ``` |
| |
| ## Objective-C Dynamic Type Discovery |
| |
| When doing Objective-C development, you may notice that some of your variables |
| come out as of type id (for instance, items extracted from NSArray). By |
| default, LLDB will not show you the real type of the object. it can actually |
| dynamically discover the type of an Objective-C variable, much like the runtime |
| itself does when invoking a selector. In order to be shown the result of that |
| discovery that, however, a special option to frame variable or expression is |
| required: `--dynamic-type`. |
| |
| `--dynamic-type` can have one of three values: |
| |
| - `no-dynamic-values`: the default, prevents dynamic type discovery |
| - `no-run-target`: enables dynamic type discovery as long as running code on |
| the target is not required |
| - `run-target`: enables code execution on the target in order to perform |
| dynamic type discovery |
| |
| If you specify a value of either no-run-target or run-target, LLDB will detect |
| the dynamic type of your variables and show the appropriate formatters for |
| them. As an example: |
| |
| ``` |
| (lldb) expr @"Hello" |
| (NSString *) $0 = 0x00000001048000b0 @"Hello" |
| (lldb) expr -d no-run @"Hello" |
| (__NSCFString *) $1 = 0x00000001048000b0 @"Hello" |
| ``` |
| |
| Because LLDB uses a detection algorithm that does not need to invoke any |
| functions on the target process, no-run-target is enough for this to work. |
| |
| As a side note, the summary for NSString shown in the example is built right |
| into LLDB. It was initially implemented through Python (the code is still |
| available for reference at CFString.py). However, this is out of sync with the |
| current implementation of the NSString formatter (which is a C++ function |
| compiled into the LLDB core). |
| |
| ## Categories |
| |
| Categories are a way to group related formatters. For instance, LLDB itself |
| groups the formatters for STL types in a category named cpluspus. Basically, |
| categories act like containers in which to store formatters for a same library |
| or OS release. |
| |
| By default, several categories are created in LLDB: |
| |
| - default: this is the category where every formatter ends up, unless another category is specified |
| - objc: formatters for basic and common Objective-C types that do not specifically depend on macOS |
| - cplusplus: formatters for STL types (currently only libc++ and libstdc++ are supported). Enabled when debugging C++ targets. |
| - system: truly basic types for which a formatter is required |
| - AppKit: Cocoa classes |
| - CoreFoundation: CF classes |
| - CoreGraphics: CG classes |
| - CoreServices: CS classes |
| - VectorTypes: compact display for several vector types |
| |
| If you want to use a custom category for your formatters, all the `type ... add` |
| provide a `--category` (`-w`) option, that names the category to add the formatter |
| to. To delete the formatter, you then have to specify the correct category. |
| |
| Categories can be in one of two states: enabled and disabled. A category is |
| initially disabled, and can be enabled using the `type category enable` command. |
| To disable an enabled category, the command to use is `type category disable`. |
| |
| The order in which categories are enabled or disabled is significant, in that |
| LLDB uses that order when looking for formatters. Therefore, when you enable a |
| category, it becomes the second one to be searched (after default, which always |
| stays on top of the list). The default categories are enabled in such a way |
| that the search order is: |
| |
| - default |
| - objc |
| - CoreFoundation |
| - AppKit |
| - CoreServices |
| - CoreGraphics |
| - cplusplus |
| - VectorTypes |
| - system |
| |
| As said, cplusplus contain formatters for C++ STL data types. |
| system contains formatters for char\* and char[], which reflect the behavior of |
| older versions of LLDB which had built-in formatters for these types. Because |
| now these are formatters, you can even replace them with your own if so you |
| wish. |
| |
| There is no special command to create a category. When you place a formatter in |
| a category, if that category does not exist, it is automatically created. For |
| instance, |
| |
| ``` |
| (lldb) type summary add Foobar --summary-string "a foobar" --category newcategory |
| ``` |
| |
| automatically creates a (disabled) category named newcategory. |
| |
| Another way to create a new (empty) category, is to enable it, as in: |
| |
| ``` |
| (lldb) type category enable newcategory |
| ``` |
| |
| However, in this case LLDB warns you that enabling an empty category has no |
| effect. If you add formatters to the category after enabling it, they will be |
| honored. But an empty category per se does not change the way any type is |
| displayed. The reason the debugger warns you is that enabling an empty category |
| might be a typo, and you effectively wanted to enable a similarly-named but |
| not-empty category. |
| |
| ## Finding Formatters 101 |
| |
| Searching for a formatter (including formats, since LLDB 3.4.0) given a |
| variable goes through a rather intricate set of rules. Namely, what happens is |
| that LLDB starts looking in each enabled category, according to the order in |
| which they were enabled (latest enabled first). In each category, LLDB does the |
| following: |
| |
| - If there is a formatter for the type of the variable, use it |
| - If this object is a pointer, and there is a formatter for the pointee type |
| that does not skip pointers, use it |
| - If this object is a reference, and there is a formatter for the referred type |
| that does not skip references, use it |
| - If this object is an Objective-C class and dynamic types are enabled, look |
| for a formatter for the dynamic type of the object. If dynamic types are |
| disabled, or the search failed, look for a formatter for the declared type of |
| the object |
| - If this object's type is a typedef, go through typedef hierarchy (LLDB might |
| not be able to do this if the compiler has not emitted enough information. If |
| the required information to traverse typedef hierarchies is missing, type |
| cascading will not work. The clang compiler, part of the LLVM project, emits |
| the correct debugging information for LLDB to cascade). If at any level of |
| the hierarchy there is a valid formatter that can cascade, use it. |
| - If everything has failed, repeat the above search, looking for regular |
| expressions instead of exact matches |
| |
| If any of those attempts returned a valid formatter to be used, that one is |
| used, and the search is terminated (without going to look in other categories). |
| If nothing was found in the current category, the next enabled category is |
| scanned according to the same algorithm. If there are no more enabled |
| categories, the search has failed. |
| |
| **Warning**: previous versions of LLDB defined cascading to mean not only going |
| through typedef chains, but also through inheritance chains. This feature has |
| been removed since it significantly degrades performance. You need to set up |
| your formatters for every type in inheritance chains to which you want the |
| formatter to apply. |
| |
| [^footnote-1]: These types of variables go by different names depending on the language. In C++, in |
| particular, they are known as compound types. |
| |
| [^footnote-2]: If you are familiar with the syntax for Frame and Thread Formatting |
| you will feel right at home with the syntax for Summary Strings. |