| ======================================== | 
 | Machine IR (MIR) Format Reference Manual | 
 | ======================================== | 
 |  | 
 | .. contents:: | 
 |    :local: | 
 |  | 
 | .. warning:: | 
 |   This is a work in progress. | 
 |  | 
 | Introduction | 
 | ============ | 
 |  | 
 | This document is a reference manual for the Machine IR (MIR) serialization | 
 | format. MIR is a human readable serialization format that is used to represent | 
 | LLVM's :ref:`machine specific intermediate representation | 
 | <machine code representation>`. | 
 |  | 
 | The MIR serialization format is designed to be used for testing the code | 
 | generation passes in LLVM. | 
 |  | 
 | Overview | 
 | ======== | 
 |  | 
 | The MIR serialization format uses a YAML container. YAML is a standard | 
 | data serialization language, and the full YAML language spec can be read at | 
 | `yaml.org | 
 | <http://www.yaml.org/spec/1.2/spec.html#Introduction>`_. | 
 |  | 
 | A MIR file is split up into a series of `YAML documents`_. The first document | 
 | can contain an optional embedded LLVM IR module, and the rest of the documents | 
 | contain the serialized machine functions. | 
 |  | 
 | .. _YAML documents: http://www.yaml.org/spec/1.2/spec.html#id2800132 | 
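
As a rough illustration of this layout, the Python sketch below splits the
text of a MIR file into its YAML documents by looking for the ``---`` start
marker and the ``...`` end marker. The function name ``split_mir_documents``
and the sample text are hypothetical. This is not how the MIR parser is
implemented, and a real YAML parser handles many cases that this sketch
ignores.

```python
def split_mir_documents(text):
    """Split MIR text into YAML documents (simplified; not a YAML parser)."""
    documents = []
    current = None  # lines of the document being collected, if any
    for line in text.splitlines():
        if line.startswith("---"):
            # A line starting with "---" opens a new document.
            current = [line]
            documents.append(current)
        elif current is not None:
            if line.rstrip() == "...":
                # "..." closes the current document.
                current = None
            else:
                current.append(line)
    return ["\n".join(lines) for lines in documents]


# Hypothetical input: an embedded IR module followed by one machine function.
SAMPLE = """\
--- |
  define void @f() { ret void }
...
---
name: f
body: |
  bb.0:
    RETQ
...
"""

docs = split_mir_documents(SAMPLE)
```

With this input, the first element of ``docs`` holds the embedded IR module
and the second holds the machine function ``f``.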
 |  | 
 | MIR Testing Guide | 
 | ================= | 
 |  | 
 | You can use the MIR format for testing in two different ways: | 
 |  | 
 | - You can write MIR tests that invoke a single code generation pass using the | 
 |   ``-run-pass`` option in llc. | 
 |  | 
 | - You can use llc's ``-stop-after`` option with existing or new LLVM assembly | 
 |   tests and check the MIR output of a specific code generation pass. | 
 |  | 
 | Testing Individual Code Generation Passes | 
 | ----------------------------------------- | 
 |  | 
 | The ``-run-pass`` option in llc allows you to create MIR tests that invoke just | 
 | a single code generation pass. When this option is used, llc will parse an | 
 | input MIR file, run the specified code generation pass(es), and output the | 
 | resulting MIR code. | 
 |  | 
 | You can generate an input MIR file for the test by using the ``-stop-after`` or | 
 | ``-stop-before`` option in llc. For example, if you would like to write a test | 
 | for the post register allocation pseudo instruction expansion pass, you can | 
 | specify the machine copy propagation pass in the ``-stop-after`` option, as it | 
 | runs just before the pass that we are trying to test: | 
 |  | 
 |    ``llc -stop-after=machine-cp bug-trigger.ll -o test.mir`` | 
 |  | 
 | If the same pass is run multiple times, a run index can be appended to the | 
 | pass name, separated by a comma: | 
 |  | 
 |    ``llc -stop-after=dead-mi-elimination,1 bug-trigger.ll -o test.mir`` | 
 |  | 
 | After generating the input MIR file, you'll have to add a run line that uses | 
 | the ``-run-pass`` option to it. In order to test the post register allocation | 
 | pseudo instruction expansion pass on X86-64, a run line like the one shown | 
 | below can be used: | 
 |  | 
 |     ``# RUN: llc -o - %s -mtriple=x86_64-- -run-pass=postrapseudos | FileCheck %s`` | 
 |  | 
 | The MIR files are target dependent, so they have to be placed in the target | 
 | specific test directories (``test/CodeGen/TARGETNAME``). They also need to | 
 | specify a target triple or a target architecture either in the run line or in | 
 | the embedded LLVM IR module. | 
 |  | 
 | Simplifying MIR files | 
 | ^^^^^^^^^^^^^^^^^^^^^ | 
 |  | 
 | The MIR code coming out of ``-stop-after``/``-stop-before`` is very verbose. | 
 | Tests are more accessible and future-proof when simplified: | 
 |  | 
 | - Use the ``-simplify-mir`` option with llc. | 
 |  | 
 | - Machine function attributes often have default values or the test works just | 
 |   as well with default values. Typical candidates for this are: ``alignment:``, | 
 |   ``exposesReturnsTwice``, ``legalized``, ``regBankSelected``, ``selected``. | 
 |   The whole ``frameInfo`` section is often unnecessary if there is no special | 
 |   frame usage in the function. ``tracksRegLiveness``, on the other hand, is | 
 |   often necessary for passes that care about block livein lists. | 
 |  | 
 | - The (global) ``liveins:`` list is typically only interesting for early | 
 |   instruction selection passes and can be removed when testing later passes. | 
 |   The per-block ``liveins:`` lists, on the other hand, are necessary if | 
 |   ``tracksRegLiveness`` is true. | 
 |  | 
 | - Branch probability data in block ``successors:`` lists can be dropped if the | 
 |   test doesn't depend on it. Example: | 
 |   ``successors: %bb.1(0x40000000), %bb.2(0x40000000)`` can be replaced with | 
 |   ``successors: %bb.1, %bb.2``. | 
 |  | 
 | - MIR code contains a whole IR module. This is necessary because there are | 
 |   no equivalents in MIR for global variables, references to external functions, | 
 |   function attributes, metadata, or debug info. Instead, some MIR data | 
 |   references the IR constructs. You can often remove them if the test doesn't | 
 |   depend on them. | 
 |  | 
 | - Alias analysis is performed on IR values. These are referenced by memory | 
 |   operands in MIR. Example: ``:: (load 8 from %ir.foobar, !alias.scope !9)``. | 
 |   If the test doesn't depend on (good) alias analysis, the references can be | 
 |   dropped: ``:: (load 8)``. | 
 |  | 
 | - MIR blocks can reference IR blocks for debug printing, profile information | 
 |   or debug locations. Example: ``bb.42.myblock`` in MIR references the IR block | 
 |   ``myblock``. It is usually possible to drop the ``.myblock`` reference and | 
 |   simply use ``bb.42``. | 
 |  | 
 | - If there are no memory operands or blocks referencing the IR then the | 
 |   IR function can be replaced by a parameterless dummy function like | 
 |   ``define void @func() { ret void }``. | 
 |  | 
 | - It is possible to drop the whole IR section of the MIR file if it only | 
 |   contains dummy functions (see above). The .mir loader will create the | 
 |   IR functions automatically in this case. | 
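
The ``successors:`` simplification described above is mechanical enough to
script. The following Python sketch removes the parenthesized values from a
``successors:`` line. The helper name ``strip_branch_probabilities`` and the
regular expression are assumptions made for illustration; no such tool ships
with LLVM, and the pattern only covers the hexadecimal and decimal forms shown
in this document.

```python
import re

# Matches a block reference followed by a parenthesized value,
# e.g. "%bb.1(0x40000000)" or "%bb.1.then(32)".
_WEIGHTED_SUCCESSOR = re.compile(
    r"(%bb\.\d+[^\s,(]*)\((?:0x[0-9a-fA-F]+|\d+)\)"
)


def strip_branch_probabilities(line):
    """Drop the parenthesized values from a ``successors:`` line."""
    if not line.lstrip().startswith("successors:"):
        # Leave every other line untouched.
        return line
    return _WEIGHTED_SUCCESSOR.sub(r"\1", line)


simplified = strip_branch_probabilities(
    "  successors: %bb.1(0x40000000), %bb.2(0x40000000)"
)
```

Only apply such a rewrite when the test does not depend on the branch
probabilities.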
 |  | 
 | .. _limitations: | 
 |  | 
 | Limitations | 
 | ----------- | 
 |  | 
 | Currently the MIR format has several limitations in terms of which state it | 
 | can serialize: | 
 |  | 
 | - The target-specific state in the target-specific ``MachineFunctionInfo`` | 
 |   subclasses isn't serialized at the moment. | 
 |  | 
 | - The target-specific ``MachineConstantPoolValue`` subclasses (in the ARM and | 
 |   SystemZ backends) aren't serialized at the moment. | 
 |  | 
 | - The ``MCSymbol`` machine operands don't support temporary or local symbols. | 
 |  | 
 | - A lot of the state in ``MachineModuleInfo`` isn't serialized - only the CFI | 
 |   instructions and the variable debug information from MMI is serialized right | 
 |   now. | 
 |  | 
 | These limitations restrict what you can test with the MIR format. For now, | 
 | tests whose behaviour depends on the state of temporary or local ``MCSymbol`` | 
 | operands, or on the exception handling state in MMI, can't use the MIR format. | 
 | Likewise, tests whose behaviour depends on the state of the target-specific | 
 | ``MachineFunctionInfo`` or ``MachineConstantPoolValue`` subclasses can't use | 
 | the MIR format at the moment. | 
 |  | 
 | High Level Structure | 
 | ==================== | 
 |  | 
 | .. _embedded-module: | 
 |  | 
 | Embedded Module | 
 | --------------- | 
 |  | 
 | When the first YAML document contains a `YAML block literal string`_, the MIR | 
 | parser will treat this string as an LLVM assembly language string that | 
 | represents an embedded LLVM IR module. | 
 | Here is an example of an LLVM module that can be embedded this way (the | 
 | enclosing ``--- |`` and ``...`` YAML document markers are not shown): | 
 |  | 
 | .. code-block:: llvm | 
 |  | 
 |        define i32 @inc(ptr %x) { | 
 |        entry: | 
 |          %0 = load i32, ptr %x | 
 |          %1 = add i32 %0, 1 | 
 |          store i32 %1, ptr %x | 
 |          ret i32 %1 | 
 |        } | 
 |  | 
 | .. _YAML block literal string: http://www.yaml.org/spec/1.2/spec.html#id2795688 | 
 |  | 
 | Machine Functions | 
 | ----------------- | 
 |  | 
 | The remaining YAML documents contain the machine functions. This is an example | 
 | of such a YAML document: | 
 |  | 
 | .. code-block:: text | 
 |  | 
 |      --- | 
 |      name:            inc | 
 |      tracksRegLiveness: true | 
 |      liveins: | 
 |        - { reg: '$rdi' } | 
 |      callSites: | 
 |        - { bb: 0, offset: 3, fwdArgRegs: | 
 |            - { arg: 0, reg: '$edi' } } | 
 |      body: | | 
 |        bb.0.entry: | 
 |          liveins: $rdi | 
 |  | 
 |          $eax = MOV32rm $rdi, 1, _, 0, _ | 
 |          $eax = INC32r killed $eax, implicit-def dead $eflags | 
 |          MOV32mr killed $rdi, 1, _, 0, _, $eax | 
 |          CALL64pcrel32 @foo <regmask...> | 
 |          RETQ $eax | 
 |      ... | 
 |  | 
 | The document above consists of attributes that represent the various | 
 | properties and data structures in a machine function. | 
 |  | 
 | The attribute ``name`` is required, and its value should be identical to the | 
 | name of a function that this machine function is based on. | 
 |  | 
 | The attribute ``body`` is a `YAML block literal string`_. Its value represents | 
 | the function's machine basic blocks and their machine instructions. | 
 |  | 
 | The attribute ``callSites`` is a representation of call site information which | 
 | keeps track of call instructions and registers used to transfer call arguments. | 
 |  | 
 | Machine Instructions Format Reference | 
 | ===================================== | 
 |  | 
 | The machine basic blocks and their instructions are represented using a custom, | 
 | human readable serialization language. This language is used in the | 
 | `YAML block literal string`_ that corresponds to the machine function's body. | 
 |  | 
 | A source string that uses this language contains a list of machine basic | 
 | blocks, which are described in the section below. | 
 |  | 
 | Machine Basic Blocks | 
 | -------------------- | 
 |  | 
 | A machine basic block is defined in a single block definition source construct | 
 | that contains the block's ID. | 
 | The example below defines two blocks with the IDs zero and one: | 
 |  | 
 | .. code-block:: text | 
 |  | 
 |     bb.0: | 
 |       <instructions> | 
 |     bb.1: | 
 |       <instructions> | 
 |  | 
 | A machine basic block can also have a name. It should be specified after the ID | 
 | in the block's definition: | 
 |  | 
 | .. code-block:: text | 
 |  | 
 |     bb.0.entry:       ; This block's name is "entry" | 
 |        <instructions> | 
 |  | 
 | The block's name should be identical to the name of the IR block that this | 
 | machine block is based on. | 
 |  | 
 | .. _block-references: | 
 |  | 
 | Block References | 
 | ^^^^^^^^^^^^^^^^ | 
 |  | 
 | The machine basic blocks are identified by their ID numbers. Individual | 
 | blocks are referenced using the following syntax: | 
 |  | 
 | .. code-block:: text | 
 |  | 
 |     %bb.<id> | 
 |  | 
 | Example: | 
 |  | 
 | .. code-block:: llvm | 
 |  | 
 |     %bb.0 | 
 |  | 
 | The following syntax is also supported, but the former syntax is preferred for | 
 | block references: | 
 |  | 
 | .. code-block:: text | 
 |  | 
 |     %bb.<id>[.<name>] | 
 |  | 
 | Example: | 
 |  | 
 | .. code-block:: llvm | 
 |  | 
 |     %bb.1.then | 
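
Both reference forms can be recognized with a single pattern. The Python
sketch below is a simplified illustration, not the parser's actual
implementation; the helper name ``parse_block_ref`` is hypothetical. It
returns the block ID together with the optional name.

```python
import re

# "%bb.<id>" optionally followed by ".<name>".
_BLOCK_REF = re.compile(r"%bb\.(\d+)(?:\.(.+))?")


def parse_block_ref(ref):
    """Return (id, name) for a block reference; name is None if absent."""
    match = _BLOCK_REF.fullmatch(ref)
    if match is None:
        raise ValueError(f"not a block reference: {ref!r}")
    return int(match.group(1)), match.group(2)


short_form = parse_block_ref("%bb.0")
long_form = parse_block_ref("%bb.1.then")
```

Here ``short_form`` is ``(0, None)`` and ``long_form`` is ``(1, "then")``.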
 |  | 
 | Successors | 
 | ^^^^^^^^^^ | 
 |  | 
 | The machine basic block's successors have to be specified before any of the | 
 | instructions: | 
 |  | 
 | .. code-block:: text | 
 |  | 
 |     bb.0.entry: | 
 |       successors: %bb.1.then, %bb.2.else | 
 |       <instructions> | 
 |     bb.1.then: | 
 |       <instructions> | 
 |     bb.2.else: | 
 |       <instructions> | 
 |  | 
 | The branch weights can be specified in brackets after the successor blocks. | 
 | The example below defines a block that has two successors with branch weights | 
 | of 32 and 16: | 
 |  | 
 | .. code-block:: text | 
 |  | 
 |     bb.0.entry: | 
 |       successors: %bb.1.then(32), %bb.2.else(16) | 
 |  | 
 | .. _bb-liveins: | 
 |  | 
 | Live In Registers | 
 | ^^^^^^^^^^^^^^^^^ | 
 |  | 
 | The machine basic block's live in registers have to be specified before any of | 
 | the instructions: | 
 |  | 
 | .. code-block:: text | 
 |  | 
 |     bb.0.entry: | 
 |       liveins: $edi, $esi | 
 |  | 
 | The list of live in registers and successors can be empty. The language also | 
 | allows multiple live in register and successor lists - they are combined into | 
 | one list by the parser. | 
 |  | 
 | Miscellaneous Attributes | 
 | ^^^^^^^^^^^^^^^^^^^^^^^^ | 
 |  | 
 | The attributes ``IsAddressTaken``, ``IsLandingPad``, | 
 | ``IsInlineAsmBrIndirectTarget`` and ``Alignment`` can be specified in brackets | 
 | after the block's definition: | 
 |  | 
 | .. code-block:: text | 
 |  | 
 |     bb.0.entry (address-taken): | 
 |       <instructions> | 
 |     bb.2.else (align 4): | 
 |       <instructions> | 
 |     bb.3(landing-pad, align 4): | 
 |       <instructions> | 
 |     bb.4 (inlineasm-br-indirect-target): | 
 |       <instructions> | 
 |  | 
 | .. TODO: Describe the way the reference to an unnamed LLVM IR block can be | 
 |    preserved. | 
 |  | 
 | ``Alignment`` is specified in bytes, and must be a power of two. | 
 |  | 
 | .. _mir-instructions: | 
 |  | 
 | Machine Instructions | 
 | -------------------- | 
 |  | 
 | A machine instruction is composed of a name, | 
 | :ref:`machine operands <machine-operands>`, | 
 | :ref:`instruction flags <instruction-flags>`, and machine memory operands. | 
 |  | 
 | The instruction's name is usually specified before the operands. The example | 
 | below shows an instance of the X86 ``RETQ`` instruction with a single machine | 
 | operand: | 
 |  | 
 | .. code-block:: text | 
 |  | 
 |     RETQ $eax | 
 |  | 
 | However, if the machine instruction has one or more explicitly defined register | 
 | operands, the instruction's name has to be specified after them. The example | 
 | below shows an instance of the AArch64 ``LDPXpost`` instruction with three | 
 | defined register operands: | 
 |  | 
 | .. code-block:: text | 
 |  | 
 |     $sp, $fp, $lr = LDPXpost $sp, 2 | 
 |  | 
 | The instruction names are serialized using the exact definitions from the | 
 | target's ``*InstrInfo.td`` files, and they are case sensitive. This means that | 
 | similar instruction names like ``TSTri`` and ``tSTRi`` represent different | 
 | machine instructions. | 
 |  | 
 | .. _instruction-flags: | 
 |  | 
 | Instruction Flags | 
 | ^^^^^^^^^^^^^^^^^ | 
 |  | 
 | The flag ``frame-setup`` or ``frame-destroy`` can be specified before the | 
 | instruction's name: | 
 |  | 
 | .. code-block:: text | 
 |  | 
 |     $fp = frame-setup ADDXri $sp, 0, 0 | 
 |  | 
 | .. code-block:: text | 
 |  | 
 |     $x21, $x20 = frame-destroy LDPXi $sp | 
 |  | 
 | .. _registers: | 
 |  | 
 | Bundled Instructions | 
 | ^^^^^^^^^^^^^^^^^^^^ | 
 |  | 
 | The syntax for bundled instructions is the following: | 
 |  | 
 | .. code-block:: text | 
 |  | 
 |     BUNDLE implicit-def $r0, implicit-def $r1, implicit $r2 { | 
 |       $r0 = SOME_OP $r2 | 
 |       $r1 = ANOTHER_OP internal $r0 | 
 |     } | 
 |  | 
 | The first instruction is often a bundle header. The instructions between ``{`` | 
 | and ``}`` are bundled with the first instruction. | 
 |  | 
 | .. _mir-registers: | 
 |  | 
 | Registers | 
 | --------- | 
 |  | 
 | Registers are one of the key primitives in the machine instructions | 
 | serialization language. They are primarily used in the | 
 | :ref:`register machine operands <register-operands>`, | 
 | but they can also be used in a number of other places, like the | 
 | :ref:`basic block's live in list <bb-liveins>`. | 
 |  | 
 | The physical registers are identified by their name and by the '$' prefix sigil. | 
 | They use the following syntax: | 
 |  | 
 | .. code-block:: text | 
 |  | 
 |     $<name> | 
 |  | 
 | The example below shows three X86 physical registers: | 
 |  | 
 | .. code-block:: text | 
 |  | 
 |     $eax | 
 |     $r15 | 
 |     $eflags | 
 |  | 
 | The virtual registers are identified by their ID number and by the '%' sigil. | 
 | They use the following syntax: | 
 |  | 
 | .. code-block:: text | 
 |  | 
 |     %<id> | 
 |  | 
 | Example: | 
 |  | 
 | .. code-block:: text | 
 |  | 
 |     %0 | 
 |  | 
 | The null registers are represented using an underscore ('``_``'). They can also be | 
 | represented using a '``$noreg``' named register, although the former syntax | 
 | is preferred. | 
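
The three register forms described above can be told apart by their first
character. The Python sketch below is only an illustration of that rule: the
helper name ``classify_register`` is hypothetical, and the function covers
just the syntax described in this section.

```python
import re


def classify_register(token):
    """Classify a register token as 'null', 'virtual' or 'physical'."""
    if token in ("_", "$noreg"):
        # Both spellings denote the null register.
        return "null"
    if re.fullmatch(r"%\d+", token):
        # '%' sigil followed by the virtual register's ID number.
        return "virtual"
    if token.startswith("$") and len(token) > 1:
        # '$' sigil followed by the physical register's name.
        return "physical"
    raise ValueError(f"not a register: {token!r}")


kinds = [classify_register(t) for t in ("$eax", "%0", "_", "$noreg")]
```

The null register check comes first because ``$noreg`` would otherwise be
classified as a physical register.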
 |  | 
 | .. _machine-operands: | 
 |  | 
 | Machine Operands | 
 | ---------------- | 
 |  | 
 | There are eighteen different kinds of machine operands, and all of them can be | 
 | serialized. | 
 |  | 
 | Immediate Operands | 
 | ^^^^^^^^^^^^^^^^^^ | 
 |  | 
 | The immediate machine operands are untyped, 64-bit signed integers. The | 
 | example below shows an instance of the X86 ``MOV32ri`` instruction that has an | 
 | immediate machine operand ``-42``: | 
 |  | 
 | .. code-block:: text | 
 |  | 
 |     $eax = MOV32ri -42 | 
 |  | 
 | An immediate operand is also used to represent a subregister index when the | 
 | machine instruction has one of the following opcodes: | 
 |  | 
 | - ``EXTRACT_SUBREG`` | 
 |  | 
 | - ``INSERT_SUBREG`` | 
 |  | 
 | - ``REG_SEQUENCE`` | 
 |  | 
 | - ``SUBREG_TO_REG`` | 
 |  | 
 | In that case, the machine operand is printed using the target's name for the | 
 | subregister index. | 
 |  | 
 | For example: | 
 |  | 
 | In AArch64RegisterInfo.td: | 
 |  | 
 | .. code-block:: text | 
 |  | 
 |   def sub_32 : SubRegIndex<32>; | 
 |  | 
 | If the third operand is an immediate with the value ``15`` (target-dependent | 
 | value), based on the instruction's opcode and the operand's index the operand | 
 | will be printed as ``%subreg.sub_32``: | 
 |  | 
 | .. code-block:: text | 
 |  | 
 |     %1:gpr64 = SUBREG_TO_REG 0, %0, %subreg.sub_32 | 
 |  | 
 | For integers wider than 64 bits, we use a special machine operand, | 
 | ``MO_CImmediate``, which stores the immediate in a ``ConstantInt`` using an | 
 | ``APInt`` (LLVM's arbitrary precision integer type). | 
 |  | 
 | .. TODO: Describe the FPIMM immediate operands. | 
 |  | 
 | .. _register-operands: | 
 |  | 
 | Register Operands | 
 | ^^^^^^^^^^^^^^^^^ | 
 |  | 
 | The :ref:`register <mir-registers>` primitive is used to represent the register | 
 | machine operands. The register operands can also have optional | 
 | :ref:`register flags <register-flags>`, | 
 | :ref:`a subregister index <subregister-indices>`, | 
 | and a reference to the tied register operand. | 
 | The full syntax of a register operand is shown below: | 
 |  | 
 | .. code-block:: text | 
 |  | 
 |     [<flags>] <register> [ :<subregister-idx-name> ] [ (tied-def <tied-op>) ] | 
 |  | 
 | This example shows an instance of the X86 ``XOR32rr`` instruction that has | 
 | 5 register operands with different register flags: | 
 |  | 
 | .. code-block:: text | 
 |  | 
 |   dead $eax = XOR32rr undef $eax, undef $eax, implicit-def dead $eflags, implicit-def $al | 
 |  | 
 | .. _register-flags: | 
 |  | 
 | Register Flags | 
 | ~~~~~~~~~~~~~~ | 
 |  | 
 | The table below shows all of the possible register flags along with the | 
 | corresponding internal ``llvm::RegState`` representation: | 
 |  | 
 | .. | 
 |    Keep this in sync with MachineInstrBuilder.h | 
 |  | 
 | .. list-table:: | 
 |    :header-rows: 1 | 
 |  | 
 |    * - Flag | 
 |      - Internal Value | 
 |      - Meaning | 
 |  | 
 |    * - ``implicit`` | 
 |      - ``RegState::Implicit`` | 
 |      - Not emitted register (e.g. carry, or temporary result). | 
 |  | 
 |    * - ``implicit-def`` | 
 |      - ``RegState::ImplicitDefine`` | 
 |      - ``implicit`` and ``def`` | 
 |  | 
 |    * - ``def`` | 
 |      - ``RegState::Define`` | 
 |      - Register definition. | 
 |  | 
 |    * - ``dead`` | 
 |      - ``RegState::Dead`` | 
 |      - Unused definition. | 
 |  | 
 |    * - ``killed`` | 
 |      - ``RegState::Kill`` | 
 |      - The last use of a register. | 
 |  | 
 |    * - ``undef`` | 
 |      - ``RegState::Undef`` | 
 |      - Value of the register doesn't matter. | 
 |  | 
 |    * - ``internal`` | 
 |      - ``RegState::InternalRead`` | 
 |      - Register reads a value that is defined inside the same instruction or bundle. | 
 |  | 
 |    * - ``early-clobber`` | 
 |      - ``RegState::EarlyClobber`` | 
 |      - Register definition happens before uses. | 
 |  | 
 |    * - ``debug-use`` | 
 |      - ``RegState::Debug`` | 
 |      - Register 'use' is for debugging purposes. | 
 |  | 
 |    * - ``renamable`` | 
 |      - ``RegState::Renamable`` | 
 |      - Register that may be renamed. | 
 |  | 
 | .. _subregister-indices: | 
 |  | 
 | Subregister Indices | 
 | ~~~~~~~~~~~~~~~~~~~ | 
 |  | 
 | The register machine operands can reference a portion of a register by using | 
 | the subregister indices. The example below shows an instance of the ``COPY`` | 
 | pseudo instruction that uses the X86 ``sub_8bit`` subregister index to copy the | 
 | lower 8 bits from the 32-bit virtual register 0 to the 8-bit virtual register 1: | 
 |  | 
 | .. code-block:: text | 
 |  | 
 |     %1 = COPY %0:sub_8bit | 
 |  | 
 | The names of the subregister indices are target specific, and are typically | 
 | defined in the target's ``*RegisterInfo.td`` file. | 
 |  | 
 | Constant Pool Indices | 
 | ^^^^^^^^^^^^^^^^^^^^^ | 
 |  | 
 | A constant pool index (CPI) operand is printed using its index in the | 
 | function's ``MachineConstantPool`` and an offset. | 
 |  | 
 | For example, a CPI with the index 1 and offset 8: | 
 |  | 
 | .. code-block:: text | 
 |  | 
 |     %1:gr64 = MOV64ri %const.1 + 8 | 
 |  | 
 | For a CPI with the index 0 and offset -12: | 
 |  | 
 | .. code-block:: text | 
 |  | 
 |     %1:gr64 = MOV64ri %const.0 - 12 | 
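
The two examples above follow the same pattern: the index comes first, then a
signed offset. The Python sketch below reproduces that textual form. The
function name ``format_cpi`` is hypothetical, and the sketch assumes that a
zero offset is simply omitted, which this document does not state explicitly.

```python
def format_cpi(index, offset=0):
    """Format a constant pool index operand as '%const.<index> [+|- <offset>]'."""
    text = f"%const.{index}"
    if offset > 0:
        text += f" + {offset}"
    elif offset < 0:
        # The sign is printed as a separate '-' token.
        text += f" - {-offset}"
    return text


first = format_cpi(1, 8)
second = format_cpi(0, -12)
```

Here ``first`` and ``second`` match the operands used in the two ``MOV64ri``
examples.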
 |  | 
 | A constant pool entry is bound to an LLVM IR ``Constant`` or a target-specific | 
 | ``MachineConstantPoolValue``. When serializing all the function's constants, | 
 | the following format is used: | 
 |  | 
 | .. code-block:: text | 
 |  | 
 |     constants: | 
 |       - id:               <index> | 
 |         value:            <value> | 
 |         alignment:        <alignment> | 
 |         isTargetSpecific: <target-specific> | 
 |  | 
 | where: | 
 |   - ``<index>`` is a 32-bit unsigned integer; | 
 |   - ``<value>`` is an `LLVM IR Constant | 
 |     <https://www.llvm.org/docs/LangRef.html#constants>`_; | 
 |   - ``<alignment>`` is a 32-bit unsigned integer specified in bytes, and must be | 
 |     a power of two; | 
 |   - ``<target-specific>`` is either true or false. | 
 |  | 
 | Example: | 
 |  | 
 | .. code-block:: text | 
 |  | 
 |     constants: | 
 |       - id:               0 | 
 |         value:            'double 3.250000e+00' | 
 |         alignment:        8 | 
 |       - id:               1 | 
 |         value:            'g-(LPC0+8)' | 
 |         alignment:        4 | 
 |         isTargetSpecific: true | 
 |  | 
 | Global Value Operands | 
 | ^^^^^^^^^^^^^^^^^^^^^ | 
 |  | 
 | The global value machine operands reference the global values from the | 
 | :ref:`embedded LLVM IR module <embedded-module>`. | 
 | The example below shows an instance of the X86 ``MOV64rm`` instruction that has | 
 | a global value operand named ``G``: | 
 |  | 
 | .. code-block:: text | 
 |  | 
 |     $rax = MOV64rm $rip, 1, _, @G, _ | 
 |  | 
 | The named global values are represented using an identifier with the '@' prefix. | 
 | If the identifier doesn't match the regular expression | 
 | ``[-a-zA-Z$._][-a-zA-Z$._0-9]*``, then this identifier must be quoted. | 
 |  | 
 | The unnamed global values are represented using an unsigned numeric value with | 
 | the '@' prefix, like in the following examples: ``@0``, ``@989``. | 
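
The quoting rule for named global values can be checked directly against the
regular expression given above. The Python sketch below is an illustration
only: the helper name ``format_global_name`` is hypothetical, and escaping of
special characters inside the quotes is deliberately left out.

```python
import re

# The regular expression for identifiers that need no quoting.
_UNQUOTED_NAME = re.compile(r"[-a-zA-Z$._][-a-zA-Z$._0-9]*")


def format_global_name(name):
    """Format a named global value, quoting the name when required."""
    if _UNQUOTED_NAME.fullmatch(name):
        return "@" + name
    # Simplification: no escaping is applied to the quoted name.
    return '@"' + name + '"'


plain = format_global_name("G")
quoted = format_global_name("my var")
```

A name that starts with a digit, such as ``9lives``, also ends up quoted,
which keeps it distinct from the unnamed global values ``@0``, ``@989``, etc.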
 |  | 
 | Target-dependent Index Operands | 
 | ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^ | 
 |  | 
 | A target index operand is a target-specific index and an offset. The | 
 | target-specific index is printed using target-specific names and a positive or | 
 | negative offset. | 
 |  | 
 | For example, the ``amdgpu-constdata-start`` name is associated with the index | 
 | ``0`` in the AMDGPU backend. A target index operand with the index 0 and the | 
 | offset 8 is therefore printed as: | 
 |  | 
 | .. code-block:: text | 
 |  | 
 |     $sgpr2 = S_ADD_U32 _, target-index(amdgpu-constdata-start) + 8, implicit-def _, implicit-def _ | 
 |  | 
 | Jump-table Index Operands | 
 | ^^^^^^^^^^^^^^^^^^^^^^^^^ | 
 |  | 
 | A jump-table index operand with the index 0 is printed as follows: | 
 |  | 
 | .. code-block:: text | 
 |  | 
 |     tBR_JTr killed $r0, %jump-table.0 | 
 |  | 
 | A machine jump-table entry contains a list of ``MachineBasicBlocks``. When | 
 | serializing all the function's jump-table entries, the following format is | 
 | used: | 
 |  | 
 | .. code-block:: text | 
 |  | 
 |     jumpTable: | 
 |       kind:             <kind> | 
 |       entries: | 
 |         - id:             <index> | 
 |           blocks:         [ <bbreference>, <bbreference>, ... ] | 
 |  | 
 | where ``<kind>`` describes how the jump table is represented and emitted | 
 | (plain address, relocations, PIC, etc.), each ``<index>`` is a 32-bit unsigned | 
 | integer, and ``blocks`` contains a list of | 
 | :ref:`machine basic block references <block-references>`. | 
 |  | 
 | Example: | 
 |  | 
 | .. code-block:: text | 
 |  | 
 |     jumpTable: | 
 |       kind:             inline | 
 |       entries: | 
 |         - id:             0 | 
 |           blocks:         [ '%bb.3', '%bb.9', '%bb.4.d3' ] | 
 |         - id:             1 | 
 |           blocks:         [ '%bb.7', '%bb.7', '%bb.4.d3', '%bb.5' ] | 
 |  | 
 | External Symbol Operands | 
 | ^^^^^^^^^^^^^^^^^^^^^^^^^ | 
 |  | 
 | An external symbol operand is represented using an identifier with the ``&`` | 
 | prefix. The identifier is surrounded by double quotes and escaped if it | 
 | contains any special non-printable characters. | 
 |  | 
 | Example: | 
 |  | 
 | .. code-block:: text | 
 |  | 
 |     CALL64pcrel32 &__stack_chk_fail, csr_64, implicit $rsp, implicit-def $rsp | 
 |  | 
 | MCSymbol Operands | 
 | ^^^^^^^^^^^^^^^^^ | 
 |  | 
 | An MCSymbol operand holds a pointer to an ``MCSymbol``. For the limitations | 
 | of this operand in MIR, see :ref:`limitations <limitations>`. | 
 |  | 
 | The syntax is: | 
 |  | 
 | .. code-block:: text | 
 |  | 
 |     EH_LABEL <mcsymbol Ltmp1> | 
 |  | 
 | Debug Instruction Reference Operands | 
 | ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^ | 
 |  | 
 | A debug instruction reference operand is a pair of indices, referring to an | 
 | instruction and an operand within that instruction respectively; see | 
 | :ref:`Instruction referencing locations <instruction-referencing-locations>`. | 
 |  | 
 | The example below uses a reference to Instruction 1, Operand 0: | 
 |  | 
 | .. code-block:: text | 
 |  | 
 |     DBG_INSTR_REF !123, !DIExpression(DW_OP_LLVM_arg, 0), dbg-instr-ref(1, 0), debug-location !456 | 
 |  | 
 | CFIIndex Operands | 
 | ^^^^^^^^^^^^^^^^^ | 
 |  | 
 | A CFI Index operand holds an index into a per-function side-table, | 
 | ``MachineFunction::getFrameInstructions()``, which references all the frame | 
 | instructions in a ``MachineFunction``. A ``CFI_INSTRUCTION`` may look like it | 
 | contains multiple operands, but the only operand it contains is the CFI Index. | 
 | The other operands are tracked by the ``MCCFIInstruction`` object. | 
 |  | 
 | The syntax is: | 
 |  | 
 | .. code-block:: text | 
 |  | 
 |     CFI_INSTRUCTION offset $w30, -16 | 
 |  | 
 | which may be emitted later in the MC layer as: | 
 |  | 
 | .. code-block:: text | 
 |  | 
 |     .cfi_offset w30, -16 | 
 |  | 
 | IntrinsicID Operands | 
 | ^^^^^^^^^^^^^^^^^^^^ | 
 |  | 
 | An Intrinsic ID operand contains a generic intrinsic ID or a target-specific ID. | 
 |  | 
 | The syntax for the ``returnaddress`` intrinsic is: | 
 |  | 
 | .. code-block:: text | 
 |  | 
 |    $x0 = COPY intrinsic(@llvm.returnaddress) | 
 |  | 
 | Predicate Operands | 
 | ^^^^^^^^^^^^^^^^^^ | 
 |  | 
 | A Predicate operand contains an IR predicate from ``CmpInst::Predicate``, like | 
 | ``ICMP_EQ``, etc. | 
 |  | 
 | For an int eq predicate ``ICMP_EQ``, the syntax is: | 
 |  | 
 | .. code-block:: text | 
 |  | 
 |    %2:gpr(s32) = G_ICMP intpred(eq), %0, %1 | 
 |  | 
 | .. TODO: Describe the parsers default behaviour when optional YAML attributes | 
 |    are missing. | 
 | .. TODO: Describe the syntax for virtual register YAML definitions. | 
 | .. TODO: Describe the machine function's YAML flag attributes. | 
 | .. TODO: Describe the syntax for the register mask machine operands. | 
 | .. TODO: Describe the frame information YAML mapping. | 
 | .. TODO: Describe the syntax of the stack object machine operands and their | 
 |    YAML definitions. | 
 | .. TODO: Describe the syntax of the block address machine operands. | 
 | .. TODO: Describe the syntax of the metadata machine operands, and the | 
 |    instructions debug location attribute. | 
 | .. TODO: Describe the syntax of the register live out machine operands. | 
 | .. TODO: Describe the syntax of the machine memory operands. | 
 |  | 
 | Comments | 
 | ^^^^^^^^ | 
 |  | 
 | Machine operands can have C/C++ style comments, which are annotations enclosed | 
 | between ``/*`` and ``*/`` to improve readability of e.g. immediate operands. | 
 | In the example below, the immediate operands ``14`` and ``0`` of the ARM | 
 | instructions ``tEOR`` and ``t2Bcc`` have been annotated with the names of their | 
 | condition codes (CC), i.e. the ``always`` and ``eq`` condition codes: | 
 |  | 
 | .. code-block:: text | 
 |  | 
 |   dead renamable $r2, $cpsr = tEOR killed renamable $r2, renamable $r1, 14 /* CC::always */, $noreg | 
 |   t2Bcc %bb.4, 0 /* CC::eq */, killed $cpsr | 
 |  | 
 | As these annotations are comments, they are ignored by the MI parser. | 
 | Comments can be added or customized by overriding InstrInfo's hook | 
 | ``createMIROperandComment()``. | 
 |  | 
 | Debug-Info constructs | 
 | --------------------- | 
 |  | 
 | Most of the debugging information in a MIR file is to be found in the metadata | 
 | of the embedded module. Within a machine function, that metadata is referred to | 
 | by various constructs to describe source locations and variable locations. | 
 |  | 
 | Source locations | 
 | ^^^^^^^^^^^^^^^^ | 
 |  | 
 | Every MIR instruction may optionally have a trailing reference to a | 
 | ``DILocation`` metadata node, after all operands and symbols, but before | 
 | memory operands: | 
 |  | 
 | .. code-block:: text | 
 |  | 
 |    $rbp = MOV64rr $rdi, debug-location !12 | 
 |  | 
 | The source location attachment is synonymous with the ``!dbg`` metadata | 
 | attachment in LLVM-IR. The absence of a source location attachment will be | 
 | represented by an empty ``DebugLoc`` object in the machine instruction. | 
 |  | 
 | Fixed variable locations | 
 | ^^^^^^^^^^^^^^^^^^^^^^^^ | 
 |  | 
 | There are several ways of specifying variable locations. The simplest is | 
 | describing a variable that is permanently located on the stack. In the ``stack`` | 
 | or ``fixedStack`` attribute of the machine function, the variable, scope, and | 
 | any qualifying location modifier are provided: | 
 |  | 
 | .. code-block:: text | 
 |  | 
 |     - { id: 0, name: offset.addr, offset: -24, size: 8, alignment: 8, stack-id: default, | 
 |         debug-info-variable: '!1', debug-info-expression: '!DIExpression()', | 
 |         debug-info-location: '!2' } | 
 |  | 
 | Where: | 
 |  | 
 | - ``debug-info-variable`` identifies a ``DILocalVariable`` metadata node, | 
 |  | 
 | - ``debug-info-expression`` adds qualifiers to the variable location, | 
 |  | 
 | - ``debug-info-location`` identifies a ``DILocation`` metadata node. | 
 |  | 
 | These metadata attributes correspond to the operands of a ``#dbg_declare`` | 
 | IR debug record, see the :ref:`source level | 
 | debugging<debug_records>` documentation. | 
 |  | 
 | Varying variable locations | 
 | ^^^^^^^^^^^^^^^^^^^^^^^^^^ | 
 |  | 
 | Variables that are not always on the stack or change location are specified | 
 | with the ``DBG_VALUE`` meta machine instruction. It is synonymous with the | 
 | ``#dbg_value`` IR record, and is written: | 
 |  | 
 | .. code-block:: text | 
 |  | 
 |     DBG_VALUE $rax, $noreg, !123, !DIExpression(), debug-location !456 | 
 |  | 
 | The operands are, respectively: | 
 |  | 
 | 1. A machine location such as a register, immediate, or frame index. | 
 |  | 
 | 2. Either ``$noreg``, or the immediate value zero if an extra level of | 
 |    indirection is to be added to the first operand. | 
 |  | 
 | 3. A ``DILocalVariable`` metadata node. | 
 |  | 
 | 4. An expression qualifying the variable location, either inline or as a | 
 |    metadata node reference. | 
 |  | 
 | The source location identifies the ``DILocation`` for the scope of the | 
 | variable. The second operand (``IsIndirect``) is deprecated and is to be | 
 | deleted. All additional qualifiers for the variable location should be made | 
 | through the expression metadata. | 
 |  | 
 | .. _instruction-referencing-locations: | 
 |  | 
 | Instruction referencing locations | 
 | ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^ | 
 |  | 
 | This experimental feature aims to separate the specification of variable | 
 | *values* from the program point where a variable takes on that value. Changes | 
 | in variable value occur in the same manner as ``DBG_VALUE`` meta instructions | 
 | but using ``DBG_INSTR_REF``. Variable values are identified by a pair of | 
 | instruction number and operand number. Consider the example below: | 
 |  | 
 | .. code-block:: text | 
 |  | 
 |     $rbp = MOV64ri 0, debug-instr-number 1, debug-location !12 | 
 |     DBG_INSTR_REF !123, !DIExpression(DW_OP_LLVM_arg, 0), dbg-instr-ref(1, 0), debug-location !456 | 
 |  | 
 | Instruction numbers are directly attached to machine instructions with an | 
 | optional ``debug-instr-number`` attachment, before the optional | 
 | ``debug-location`` attachment. The value defined in ``$rbp`` in the code | 
 | above would be identified by the pair ``<1, 0>``. | 
 |  | 
 | The third operand of the ``DBG_INSTR_REF`` above records the instruction | 
 | and operand number ``<1, 0>``, identifying the value defined by the ``MOV64ri``. | 
 | The first two operands to ``DBG_INSTR_REF`` are identical to ``DBG_VALUE_LIST``, | 
 | and the position of the ``DBG_INSTR_REF`` records where the variable takes on | 
 | the designated value in the same way. | 
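
To make the pairing concrete, the Python sketch below pulls the instruction
number out of a defining instruction and the ``<instruction, operand>`` pairs
out of a ``DBG_INSTR_REF``, using the textual forms shown in the example
above. The helper names ``find_instr_number`` and ``find_instr_refs`` are
hypothetical, and the regular expressions are a simplification rather than
the MIR parser's grammar.

```python
import re

_INSTR_NUMBER = re.compile(r"debug-instr-number\s+(\d+)")
_INSTR_REF = re.compile(r"dbg-instr-ref\(\s*(\d+)\s*,\s*(\d+)\s*\)")


def find_instr_number(line):
    """Return the debug-instr-number attached to an instruction, or None."""
    match = _INSTR_NUMBER.search(line)
    return int(match.group(1)) if match else None


def find_instr_refs(line):
    """Return all (instruction, operand) pairs referenced on a line."""
    return [(int(i), int(o)) for i, o in _INSTR_REF.findall(line)]


definition = "$rbp = MOV64ri 0, debug-instr-number 1, debug-location !12"
reference = (
    "DBG_INSTR_REF !123, !DIExpression(DW_OP_LLVM_arg, 0), "
    "dbg-instr-ref(1, 0), debug-location !456"
)

number = find_instr_number(definition)
refs = find_instr_refs(reference)
```

The reference resolves to the ``MOV64ri`` because the first element of the
pair equals the instruction number attached to it.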
 |  | 
 | More information about how these constructs are used is available in | 
 | :doc:`InstrRefDebugInfo`. The related documents :doc:`SourceLevelDebugging` and | 
 | :doc:`HowToUpdateDebugInfo` may be useful as well. |