| ISASPEC - XML Based ISA Specification |
| ===================================== |
| |
| isaspec provides a mechanism to describe an instruction set in XML, and |
| generate a disassembler and assembler. The intention is |
| to describe the instruction set more formally than hand-coded assembler |
| and disassembler, and better decouple the shader compiler from the |
| underlying instruction encoding to simplify dealing with instruction |
| encoding differences between generations of GPU. |
| |
| Benefits of a formal ISA description, compared to hand-coded assemblers |
| and disassemblers, include easier detection of new bit combinations that |
| were not seen before in previous generations due to more rigorous |
| description of bits that are expected to be '0' or '1' or 'x' (dontcare) |
| and verification that different encodings don't have conflicting bits |
| (i.e. that the specification cannot result in more than one valid |
| interpretation of any bit pattern). |
| |
| The isaspec tool and XML schema are intended to be generic (not specific |
| to ir3), although there are currently a couple of limitations due to |
| shortcuts taken to get things up and running (which are mostly not |
| inherent to the XML schema, and should not be too difficult to remove |
| from the py and decode/disasm utility): |
| |
| * Maximum "field" size is 64b |
| * Fixed instruction size |
| |
| Often, especially when new functionality is added in later gens |
| while retaining (or at least mostly retaining) backwards compatibility |
| with encodings used in earlier generations, the actual encoding can be |
| rather messy to describe. To support this, isaspec provides several |
| flexible mechanisms, such as conditional overrides and derived fields. |
| This not only allows for describing an irregular instruction encoding, |
| but also allows matching an existing disasm syntax (which might not have |
| been designed around the idea of disassembly based on a formal ISA |
| description). |
| |
| Bitsets |
| ------- |
| |
| The fundamental concept behind matching a bit-pattern to an instruction |
| decoding/encoding is a hierarchical tree of bitsets. |
| This is intended to match how the HW decodes instructions, where certain |
| bits describe the instruction (and sub-encoding, and so on), and other |
| bits describe various operands to the instruction. |
| |
| Bitsets can also be used recursively as the type of a field described |
| in another bitset. |
| |
| The leaves of the tree of instruction bitsets represent every possible |
| instruction. Deciding which instruction a bit pattern encodes amounts to: |
| |
| .. code-block:: c |
| |
| m = (val & bitsets[n]->mask) & ~bitsets[n]->dontcare; |
| |
| if (m == bitsets[n]->match) { |
| /* we've found the instruction description */ |
| } |
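| |
| As a rough illustration of how that test could be applied across a table |
| of leaf bitsets, the sketch below stops at the first match. The |
| ``struct bitset_desc`` layout, ``find_bitset()`` and ``example_bitsets`` |
| are hypothetical names for this example, not the generated decoder's API. |
| It assumes that ``mask`` covers every ``<pattern>`` bit (including 'x' |
| bits) and that pattern strings are written most-significant bit first: |
| |

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical flattened description of one leaf bitset. */
struct bitset_desc {
    const char *name;
    uint64_t match;    /* pattern bits that must be '1' */
    uint64_t dontcare; /* pattern bits declared as 'x' */
    uint64_t mask;     /* all pattern bits, 'x' bits included (assumption) */
};

/* Return the first entry whose pattern matches val, or NULL if none does. */
static const struct bitset_desc *
find_bitset(const struct bitset_desc *bitsets, size_t count, uint64_t val)
{
    for (size_t n = 0; n < count; n++) {
        uint64_t m = (val & bitsets[n].mask) & ~bitsets[n].dontcare;

        if (m == bitsets[n].match)
            return &bitsets[n];
    }
    return NULL;
}

/* Bits taken from the absneg.f example: "010" in 61..63, "000110" in
 * 53..58, and 'x' in 16..31 and 48..50. */
#define EX_DONTCARE ((0xffffULL << 16) | (0x7ULL << 48))
#define EX_MASK     ((0x7ULL << 61) | (0x3fULL << 53) | EX_DONTCARE)

static const struct bitset_desc example_bitsets[] = {
    { "absneg.f",        (0x2ULL << 61) | (0x06ULL << 53), EX_DONTCARE, EX_MASK },
    /* Hypothetical second leaf, only here to show a non-matching entry. */
    { "hypothetical.op", (0x2ULL << 61) | (0x07ULL << 53), EX_DONTCARE, EX_MASK },
};
```

| With this table, a word whose bits 61..63 are ``010`` and whose bits |
| 53..58 are ``000110`` resolves to ``absneg.f`` regardless of its operand |
| bits or of any '1' bits in the 'x' positions. |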
| |
| For example, the starting point to decode an ir3 instruction is a 64b |
| bitset: |
| |
| .. code-block:: xml |
| |
| <bitset name="#instruction" size="64"> |
| <doc> |
| Encoding of an ir3 instruction. All instructions are 64b. |
| </doc> |
| </bitset> |
| |
| In the first level of instruction encoding hierarchy, the high three bits |
| group things into instruction "categories": |
| |
| .. code-block:: xml |
| |
| <bitset name="#instruction-cat2" extends="#instruction"> |
| <field name="DST" low="32" high="39" type="#reg-gpr"/> |
| <field name="REPEAT" low="40" high="41" type="#rptN"/> |
| <field name="SAT" pos="42" type="bool" display="(sat)"/> |
| <field name="SS" pos="44" type="bool" display="(ss)"/> |
| <field name="UL" pos="45" type="bool" display="(ul)"/> |
| <field name="DST_CONV" pos="46" type="bool"> |
| <doc> |
| Destination register is opposite precision as source, i.e. |
| if {FULL} is true then destination is half precision, and |
| vice versa. |
| </doc> |
| </field> |
| <derived name="DST_HALF" expr="#dest-half" type="bool" display="h"/> |
| <field name="EI" pos="47" type="bool" display="(ei)"/> |
| <field name="FULL" pos="52" type="bool"> |
| <doc>Full precision source registers</doc> |
| </field> |
| <field name="JP" pos="59" type="bool" display="(jp)"/> |
| <field name="SY" pos="60" type="bool" display="(sy)"/> |
| <pattern low="61" high="63">010</pattern> <!-- cat2 --> |
| <!-- |
| NOTE, both SRC1_R and SRC2_R are defined at this level because |
| SRC2_R is still a valid bit for (nopN) (REPEAT==0) for cat2 |
| instructions with only a single src |
| --> |
| <field name="SRC1_R" pos="43" type="bool" display="(r)"/> |
| <field name="SRC2_R" pos="51" type="bool" display="(r)"/> |
| <derived name="ZERO" expr="#zero" type="bool" display=""/> |
| </bitset> |
| |
| The ``<pattern>`` elements are the part(s) that determine which leaf-node |
| bitset matches against a given bit pattern. The leaf node's match/mask/ |
| dontcare bitmasks are a combination of those defined at the leaf node |
| and, recursively, at each parent bitset. |
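| |
| One way to picture this combination is to walk from the leaf to the root |
| and OR together the bits contributed at each level. The sketch below does |
| that for the ``absneg.f`` chain used in this document; ``struct |
| bitset_node``, ``resolve_pattern()`` and the ``ex_*`` objects are |
| hypothetical names, and the bit values assume pattern strings are written |
| most-significant bit first: |
| |

```c
#include <stddef.h>
#include <stdint.h>

struct pattern_bits {
    uint64_t match;    /* bits that must be '1' */
    uint64_t dontcare; /* bits declared as 'x' */
    uint64_t mask;     /* all bits covered by a <pattern> */
};

/* Hypothetical node of the bitset inheritance tree. */
struct bitset_node {
    const struct bitset_node *parent; /* NULL at the root */
    struct pattern_bits own;          /* bits defined at this level only */
};

/* Combine the leaf's own pattern bits with those of every ancestor. */
static struct pattern_bits
resolve_pattern(const struct bitset_node *leaf)
{
    struct pattern_bits p = { 0, 0, 0 };

    for (const struct bitset_node *b = leaf; b != NULL; b = b->parent) {
        p.match |= b->own.match;
        p.dontcare |= b->own.dontcare;
        p.mask |= b->own.mask;
    }
    return p;
}

/* #instruction: no patterns of its own. */
static const struct bitset_node ex_root = { NULL, { 0, 0, 0 } };
/* #instruction-cat2: "010" in bits 61..63. */
static const struct bitset_node ex_cat2 = {
    &ex_root, { 0x2ULL << 61, 0, 0x7ULL << 61 }
};
/* #instruction-cat2-1src: 'x' in bits 16..31 and 48..50. */
static const struct bitset_node ex_cat2_1src = {
    &ex_cat2,
    { 0, (0xffffULL << 16) | (0x7ULL << 48), (0xffffULL << 16) | (0x7ULL << 48) }
};
/* absneg.f: "000110" in bits 53..58. */
static const struct bitset_node ex_absneg_f = {
    &ex_cat2_1src, { 0x06ULL << 53, 0, 0x3fULL << 53 }
};
```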
| |
| For example, cat2 instructions (ALU instructions with up to two src |
| registers) can have either one or two source registers: |
| |
| .. code-block:: xml |
| |
| <bitset name="#instruction-cat2-1src" extends="#instruction-cat2"> |
| <override expr="#cat2-cat3-nop-encoding"> |
| <display> |
| {SY}{SS}{JP}{SAT}(nop{NOP}) {UL}{NAME} {EI}{DST_HALF}{DST}, {SRC1} |
| </display> |
| <derived name="NOP" expr="#cat2-cat3-nop-value" type="uint"/> |
| <field name="SRC1" low="0" high="15" type="#multisrc"> |
| <param name="ZERO" as="SRC_R"/> |
| <param name="FULL"/> |
| </field> |
| </override> |
| <display> |
| {SY}{SS}{JP}{SAT}{REPEAT}{UL}{NAME} {EI}{DST_HALF}{DST}, {SRC1} |
| </display> |
| <pattern low="16" high="31">xxxxxxxxxxxxxxxx</pattern> |
| <pattern low="48" high="50">xxx</pattern> <!-- COND --> |
| <field name="SRC1" low="0" high="15" type="#multisrc"> |
| <param name="SRC1_R" as="SRC_R"/> |
| <param name="FULL"/> |
| </field> |
| </bitset> |
| |
| <bitset name="absneg.f" extends="#instruction-cat2-1src"> |
| <pattern low="53" high="58">000110</pattern> |
| </bitset> |
| |
| In this example, ``absneg.f`` is a concrete cat2 instruction (leaf node of |
| the bitset inheritance tree) which has a single src register. At the |
| ``#instruction-cat2-1src`` level, bits that are used for the 2nd src arg |
| and condition code (for cat2 instructions which use a condition code) are |
| defined as 'x' (dontcare), which matches our understanding of the hardware |
| (but also lets the disassembler flag cases where '1' bits show up in places |
| we don't expect, which may signal a new instruction (sub)encoding). |
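| |
| The "flagging" mentioned above can be as simple as checking whether any |
| dontcare position holds a '1'. A minimal sketch, where |
| ``unexpected_bits()`` and ``CAT2_1SRC_DONTCARE`` are hypothetical names |
| and the mask is derived from the two 'x' patterns shown above: |
| |

```c
#include <stdint.h>

/* 'x' bits of #instruction-cat2-1src: bits 16..31 and bits 48..50. */
#define CAT2_1SRC_DONTCARE ((0xffffULL << 16) | (0x7ULL << 48))

/* Return the bits of val that are '1' in positions declared as 'x'.
 * A non-zero result may hint at an undocumented (sub)encoding. */
static uint64_t
unexpected_bits(uint64_t val, uint64_t dontcare)
{
    return val & dontcare;
}
```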
| |
| You'll notice that ``SRC1`` refers back to a different bitset hierarchy |
| that describes the various src register encodings (used for cat2 and |
| cat4 instructions), i.e. GPR vs CONST vs relative GPR/CONST. For fields |
| which have bitset types, parameters can be "passed" in via ``<param>`` |
| elements, which can be referred to by the display template string, and/or |
| expressions. For example, this helps to deal with cases where other fields |
| outside of that bitset control the encoding/decoding, such as in the |
| ``#multisrc`` example: |
| |
| .. code-block:: xml |
| |
| <bitset name="#multisrc" size="16"> |
| <doc> |
| Encoding for instruction source which can be GPR/CONST/IMMED |
| or relative GPR/CONST. |
| </doc> |
| </bitset> |
| |
| ... |
| |
| <bitset name="#multisrc-gpr" extends="#multisrc"> |
| <display> |
| {ABSNEG}{SRC_R}{HALF}{SRC} |
| </display> |
| <derived name="HALF" expr="#multisrc-half" type="bool" display="h"/> |
| <field name="SRC" low="0" high="7" type="#reg-gpr"/> |
| <pattern low="8" high="13">000000</pattern> |
| <field name="ABSNEG" low="14" high="15" type="#absneg"/> |
| </bitset> |
| |
| At some level in the bitset inheritance hierarchy, there is expected to be a |
| ``<display>`` element specifying a template string used during bitset |
| decoding. The display template consists of references to fields (which may |
| be derived fields) specified as ``{FIELDNAME}`` and other characters |
| which are just echoed through to the resulting decoded bitset. |
| |
| It is possible to define a line column alignment value per field to influence |
| the visual output. It needs to be specified as ``{FIELDNAME:align=xx}``. |
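| |
| To make the template mechanism concrete, here is a minimal sketch of |
| ``{FIELDNAME}`` expansion. It is not the decoder's implementation: |
| ``struct field_text``, ``expand_template()`` and ``example_fields`` are |
| hypothetical, fields are assumed to be already rendered to strings, |
| unknown names expand to nothing, and the ``:align=`` suffix is not |
| handled: |
| |

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical pairing of a field name with its already-decoded text. */
struct field_text {
    const char *name;
    const char *text;
};

/* Expand {NAME} references in tmpl into out (at most outsz bytes,
 * always NUL-terminated). Other characters are echoed through. */
static char *
expand_template(const char *tmpl, const struct field_text *fields,
                size_t nfields, char *out, size_t outsz)
{
    size_t len = 0;

    if (outsz == 0)
        return out;
    out[0] = '\0';

    while (*tmpl && len + 1 < outsz) {
        const char *close = (*tmpl == '{') ? strchr(tmpl, '}') : NULL;

        if (!close) {
            /* Literal character: echo it through. */
            out[len++] = *tmpl++;
            out[len] = '\0';
            continue;
        }

        size_t namelen = (size_t)(close - tmpl) - 1;

        for (size_t i = 0; i < nfields; i++) {
            if (strlen(fields[i].name) == namelen &&
                strncmp(fields[i].name, tmpl + 1, namelen) == 0) {
                size_t n = strlen(fields[i].text);
                size_t room = outsz - len - 1;

                if (n > room)
                    n = room; /* truncate rather than overflow */
                memcpy(out + len, fields[i].text, n);
                len += n;
                out[len] = '\0';
                break;
            }
        }
        tmpl = close + 1;
    }
    return out;
}

/* Example values; the register names are illustrative only. */
static const struct field_text example_fields[] = {
    { "SY", "(sy)" },
    { "NAME", "absneg.f" },
    { "DST_HALF", "h" },
    { "DST", "r0.x" },
    { "SRC1", "r0.y" },
};
```

| Expanding ``{SY}{SS}{NAME} {DST_HALF}{DST}, {SRC1}`` against |
| ``example_fields`` yields ``(sy)absneg.f hr0.x, r0.y``; ``{SS}`` has no |
| entry and so contributes nothing. |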
| |
| The ``<override>`` element will be described in the next section, but it |
| provides for both different decoded instruction syntax/mnemonics (when |
| simply providing a different display template string) as well as instruction |
| encoding where different ranges of bits have a different meaning based on |
| some other bitfield (or combination of bitfields). In this example it is |
| used to cover the cases where ``SRCn_R`` has a different meaning and a |
| different disassembly syntax depending on whether ``REPEAT`` equals zero. |
| |
| Overrides |
| --------- |
| |
| In many cases, a bitset is not convenient for describing the expected |
| disasm syntax, and/or interpretation of some range of bits differs based |
| on some other field or combination of fields. These *could* be modeled |
| as different derived bitsets, at the expense of a combinatorial explosion |
| of the size of the bitset inheritance tree. For example, *every* cat2 |
| (and cat3) instruction has a ``(nopN)`` interpretation in addition to |
| the ``(rptN)`` interpretation. |
| |
| An ``<override>`` in a bitset allows the display string and/or field |
| definitions to be redefined relative to the default case. If the |
| override's expr(ession) evaluates to non-zero, its ``<display>``, |
| ``<field>``, and ``<derived>`` elements take precedence over what is |
| defined in the toplevel of the bitset (i.e. the default case). |
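| |
| The precedence rule can be sketched as a lookup that consults the |
| overrides before falling back to the default. Everything below is a |
| hypothetical model (``struct cat2_fields``, ``struct override_desc``, |
| ``select_display()``), and checking overrides in declaration order is an |
| assumption of the sketch; the two display strings are the ones from the |
| ``#instruction-cat2-1src`` example: |
| |

```c
#include <stddef.h>

/* Hypothetical decoded field values needed by the example expression. */
struct cat2_fields {
    unsigned repeat;
    unsigned src1_r;
    unsigned src2_r;
};

struct override_desc {
    int (*expr)(const struct cat2_fields *f); /* non-zero: override applies */
    const char *display;
};

struct bitset_display {
    const struct override_desc *overrides;
    size_t num_overrides;
    const char *default_display;
};

/* An applicable override takes precedence over the default display. */
static const char *
select_display(const struct bitset_display *b, const struct cat2_fields *f)
{
    for (size_t i = 0; i < b->num_overrides; i++) {
        if (b->overrides[i].expr(f))
            return b->overrides[i].display;
    }
    return b->default_display;
}

/* Mirrors the #cat2-cat3-nop-encoding expression. */
static int
nop_encoding(const struct cat2_fields *f)
{
    return ((f->src1_r != 0) || (f->src2_r != 0)) && (f->repeat == 0);
}

static const char nop_display[] =
    "{SY}{SS}{JP}{SAT}(nop{NOP}) {UL}{NAME} {EI}{DST_HALF}{DST}, {SRC1}";
static const char default_display[] =
    "{SY}{SS}{JP}{SAT}{REPEAT}{UL}{NAME} {EI}{DST_HALF}{DST}, {SRC1}";

static const struct override_desc cat2_1src_overrides[] = {
    { nop_encoding, nop_display },
};

static const struct bitset_display cat2_1src = {
    cat2_1src_overrides, 1, default_display,
};
```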
| |
| Expressions |
| ----------- |
| |
| Both ``<override>`` and ``<derived>`` fields make use of ``<expr>`` elements, |
| either defined inline, or defined and named at the top level and referred to |
| by name in multiple other places. An expression is a simple 'C' expression |
| which can reference fields (including other derived fields) with the same |
| ``{FIELDNAME}`` syntax as display template strings. For example: |
| |
| .. code-block:: xml |
| |
| <expr name="#cat2-cat3-nop-encoding"> |
| (({SRC1_R} != 0) || ({SRC2_R} != 0)) && ({REPEAT} == 0) |
| </expr> |
| |
| In the case of ``<override>`` elements, the override applies if the expression |
| evaluates to non-zero. In the case of ``<derived>`` fields, the expression |
| evaluates to the value of the derived field. |
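| |
| For intuition, the ``#cat2-cat3-nop-encoding`` expression above could be |
| evaluated directly on a raw 64b word as follows. This is a hand-written |
| sketch, not generated code: ``bits()`` and ``cat2_cat3_nop_encoding()`` |
| are hypothetical helpers, and the field positions are the cat2 ones shown |
| earlier (``SRC1_R`` at bit 43, ``SRC2_R`` at bit 51, ``REPEAT`` in bits |
| 40..41): |
| |

```c
#include <stdbool.h>
#include <stdint.h>

/* Extract bits low..high (inclusive) of a 64b word. */
static uint64_t
bits(uint64_t val, unsigned low, unsigned high)
{
    unsigned width = high - low + 1;
    uint64_t mask = (width == 64) ? ~0ULL : ((1ULL << width) - 1);

    return (val >> low) & mask;
}

/* (({SRC1_R} != 0) || ({SRC2_R} != 0)) && ({REPEAT} == 0) */
static bool
cat2_cat3_nop_encoding(uint64_t instr)
{
    uint64_t src1_r = bits(instr, 43, 43);
    uint64_t src2_r = bits(instr, 51, 51);
    uint64_t repeat = bits(instr, 40, 41);

    return ((src1_r != 0) || (src2_r != 0)) && (repeat == 0);
}
```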
| |
| Encoding |
| -------- |
| |
| To facilitate instruction encoding, ``<encode>`` elements can be provided |
| to teach the generated instruction packing code how to map from data structures |
| representing the IR to fields. For example: |
| |
| .. code-block:: xml |
| |
| <bitset name="#instruction" size="64"> |
| <doc> |
| Encoding of an ir3 instruction. All instructions are 64b. |
| </doc> |
| <gen min="300"/> |
| <encode type="struct ir3_instruction *" case-prefix="OPC_"> |
| <!-- |
| Define mapping from encode src to individual fields, |
| which are common across all instruction categories |
| at the root instruction level |
| |
| Not all of these apply to all instructions, but we |
| can define mappings here for anything that is used |
| in more than one instruction category. For things |
| that are specific to a single instruction category, |
| mappings should be defined at that level instead. |
| --> |
| <map name="DST">src->regs[0]</map> |
| <map name="SRC1">src->regs[1]</map> |
| <map name="SRC2">src->regs[2]</map> |
| <map name="SRC3">src->regs[3]</map> |
| <map name="REPEAT">src->repeat</map> |
| <map name="SS">!!(src->flags & IR3_INSTR_SS)</map> |
| <map name="JP">!!(src->flags & IR3_INSTR_JP)</map> |
| <map name="SY">!!(src->flags & IR3_INSTR_SY)</map> |
| <map name="UL">!!(src->flags & IR3_INSTR_UL)</map> |
| <map name="EQ">0</map> <!-- We don't use this (yet) --> |
| <map name="SAT">!!(src->flags & IR3_INSTR_SAT)</map> |
| </encode> |
| </bitset> |
| |
| The ``type`` attribute specifies that the input to encoding an instruction |
| is a ``struct ir3_instruction *``. In the case of bitset hierarchies with |
| multiple possible leaf nodes, a ``case-prefix`` attribute should be supplied |
| along with a function that maps the bitset encode source to an enum value |
| with the specified prefix prepended to the uppercased leaf node name. For |
| example, in this case, "add.f" becomes ``OPC_ADD_F``. |
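| |
| The naming rule can be illustrated with a small helper. ``enum_name()`` |
| is hypothetical and only covers what the example shows: the prefix is |
| prepended, letters are uppercased and ``.`` becomes ``_``. How other |
| characters are treated by the real generator is not assumed here: |
| |

```c
#include <ctype.h>
#include <stddef.h>

/* Write prefix + uppercased leaf name into out, mapping '.' to '_'.
 * The result is truncated to fit outsz and always NUL-terminated. */
static char *
enum_name(const char *prefix, const char *leaf, char *out, size_t outsz)
{
    size_t len = 0;

    if (outsz == 0)
        return out;

    for (const char *p = prefix; *p && len + 1 < outsz; p++)
        out[len++] = *p;
    for (const char *p = leaf; *p && len + 1 < outsz; p++)
        out[len++] = (*p == '.') ? '_' : (char)toupper((unsigned char)*p);

    out[len] = '\0';
    return out;
}
```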
| |
| Individual ``<map>`` elements teach the encoder how to map from the encode |
| source to fields in the encoded instruction. |
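| |
| Conceptually, each ``<map>`` contributes one "evaluate the expression, |
| then place the result at the field's bit position" step. The sketch below |
| shows that idea for a few of the cat2 fields listed earlier. It is not |
| the generated packing code: ``struct example_instr`` and the ``EX_INSTR_*`` |
| flags are hypothetical stand-ins for the real ir3 types, and |
| ``pack_field()`` / ``encode_common_cat2()`` are made-up helper names: |
| |

```c
#include <stdint.h>

/* Hypothetical stand-ins for the IR; not the real ir3 definitions. */
enum {
    EX_INSTR_SS  = 1 << 0,
    EX_INSTR_JP  = 1 << 1,
    EX_INSTR_SY  = 1 << 2,
    EX_INSTR_UL  = 1 << 3,
    EX_INSTR_SAT = 1 << 4,
};

struct example_instr {
    unsigned repeat;
    unsigned flags;
};

/* OR value into bits low..high (inclusive) of word. */
static uint64_t
pack_field(uint64_t word, unsigned low, unsigned high, uint64_t value)
{
    unsigned width = high - low + 1;
    uint64_t mask = (width == 64) ? ~0ULL : ((1ULL << width) - 1);

    return word | ((value & mask) << low);
}

/* Apply the REPEAT/SAT/SS/UL/JP/SY mappings at their cat2 positions. */
static uint64_t
encode_common_cat2(const struct example_instr *src)
{
    uint64_t word = 0;

    word = pack_field(word, 40, 41, src->repeat);
    word = pack_field(word, 42, 42, !!(src->flags & EX_INSTR_SAT));
    word = pack_field(word, 44, 44, !!(src->flags & EX_INSTR_SS));
    word = pack_field(word, 45, 45, !!(src->flags & EX_INSTR_UL));
    word = pack_field(word, 59, 59, !!(src->flags & EX_INSTR_JP));
    word = pack_field(word, 60, 60, !!(src->flags & EX_INSTR_SY));
    return word;
}
```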