docs/drivers/freedreno/isaspec.rst - third_party/mesa - Git at Google

 ISASPEC - XML Based ISA Specification
 =====================================

 isaspec provides a mechanism to describe an instruction set in XML, and
 generate a disassembler and assembler.  The intention is
 to describe the instruction set more formally than hand-coded assembler
 and disassembler, and better decouple the shader compiler from the
 underlying instruction encoding to simplify dealing with instruction
 encoding differences between generations of GPU.

 Benefits of a formal ISA description, compared to hand-coded assemblers
 and disassemblers, include easier detection of new bit combinations that
 were not seen before in previous generations due to more rigorous
 description of bits that are expect to be '0' or '1' or 'x' (dontcare)
 and verification that different encodings don't have conflicting bits
 (i.e. that the specification cannot result in more than one valid
 interpretation of any bit pattern).

 The isaspec tool and XML schema are intended to be generic (not specific
 to ir3), although there are currently a couple limitations due to short-
 cuts taken to get things up and running (which are mostly not inherent to
 the XML schema, and should not be too difficult to remove from the py and
 decode/disasm utility):

 * Maximum "field" size is 64b
 * Fixed instruction size

 Often times, especially when new functionality is added in later gens
 while retaining (or at least mostly retaining) backwards compatibility
 with encodings used in earlier generations, the actual encoding can be
 rather messy to describe.  To support this, isaspec provides many flexible
 mechanism, such as conditional overrides and derived fields.  This not
 only allows for describing an irregular instruction encoding, but also
 allows matching an existing disasm syntax (which might not have been
 design around the idea of disassembly based on a formal ISA description).

 Bitsets
 -------

 The fundamental concept of matching a bit-pattern to an instruction
 decoding/encoding is the concept of a hierarchical tree of bitsets.
 This is intended to match how the HW decodes instructions, where certain
 bits describe the instruction (and sub-encoding, and so on), and other
 bits describe various operands to the instruction.

 Bitsets can also be used recursively as the type of a field described
 in another bitset.

 The leaves of the tree of instruction bitsets represent every possible
 instruction.  Deciding which instruction a bitpattern is amounts to:

 .. code-block:: c

    m = (val & bitsets[n]->mask) & ~bitsets[n]->dontcare;

    if (m == bitsets[n]->match) {
       /* we've found the instruction description */
    }

 For example, the starting point to decode an ir3 instruction is a 64b
 bitset:

 .. code-block:: xml

    <bitset name="#instruction" size="64">
    	<doc>
    		Encoding of an ir3 instruction.  All instructions are 64b.
    	</doc>
    </bitset>

 In the first level of instruction encoding hierarchy, the high three bits
 group things into instruction "categories":

 .. code-block:: xml

    <bitset name="#instruction-cat2" extends="#instruction">
    	<field name="DST" low="32" high="39" type="#reg-gpr"/>
    	<field name="REPEAT" low="40" high="41" type="#rptN"/>
    	<field name="SAT" pos="42" type="bool" display="(sat)"/>
    	<field name="SS" pos="44" type="bool" display="(ss)"/>
    	<field name="UL" pos="45" type="bool" display="(ul)"/>
    	<field name="DST_CONV" pos="46" type="bool">
    		<doc>
    			Destination register is opposite precision as source, i.e.
    			if {FULL} is true then destination is half precision, and
    			visa versa.
    		</doc>
    	</field>
    	<derived name="DST_HALF" expr="#dest-half" type="bool" display="h"/>
    	<field name="EI" pos="47" type="bool" display="(ei)"/>
    	<field name="FULL" pos="52" type="bool">
    		<doc>Full precision source registers</doc>
    	</field>
    	<field name="JP" pos="59" type="bool" display="(jp)"/>
    	<field name="SY" pos="60" type="bool" display="(sy)"/>
    	<pattern low="61" high="63">010</pattern>  <!-- cat2 -->
    	<!--
    		NOTE, both SRC1_R and SRC2_R are defined at this level because
    		SRC2_R is still a valid bit for (nopN) (REPEAT==0) for cat2
    		instructions with only a single src
    	 -->
    	<field name="SRC1_R" pos="43" type="bool" display="(r)"/>
    	<field name="SRC2_R" pos="51" type="bool" display="(r)"/>
    	<derived name="ZERO" expr="#zero" type="bool" display=""/>
    </bitset>

 The ``<pattern>`` elements are the part(s) that determine which leaf-node
 bitset matches against a given bit pattern.  The leaf node's match/mask/
 dontcare bitmasks are a combination of those defined at the leaf node and
 recursively each parent bitclass.

 For example, cat2 instructions (ALU instructions with up to two src
 registers) can have either one or two source registers:

 .. code-block:: xml

    <bitset name="#instruction-cat2-1src" extends="#instruction-cat2">
    	<override expr="#cat2-cat3-nop-encoding">
    		<display>
    			{SY}{SS}{JP}{SAT}(nop{NOP}) {UL}{NAME} {EI}{DST_HALF}{DST}, {SRC1}
    		</display>
    		<derived name="NOP" expr="#cat2-cat3-nop-value" type="uint"/>
    		<field name="SRC1" low="0" high="15" type="#multisrc">
    			<param name="ZERO" as="SRC_R"/>
    			<param name="FULL"/>
    		</field>
    	</override>
    	<display>
    		{SY}{SS}{JP}{SAT}{REPEAT}{UL}{NAME} {EI}{DST_HALF}{DST}, {SRC1}
    	</display>
    	<pattern low="16" high="31">xxxxxxxxxxxxxxxx</pattern>
    	<pattern low="48" high="50">xxx</pattern>  <!-- COND -->
    	<field name="SRC1" low="0" high="15" type="#multisrc">
    		<param name="SRC1_R" as="SRC_R"/>
    		<param name="FULL"/>
    	</field>
    </bitset>

    <bitset name="absneg.f" extends="#instruction-cat2-1src">
    	<pattern low="53" high="58">000110</pattern>
    </bitset>

 In this example, ``absneg.f`` is a concrete cat2 instruction (leaf node of
 the bitset inheritance tree) which has a single src register.  At the
 ``#instruction-cat2-1src`` level, bits that are used for the 2nd src arg
 and condition code (for cat2 instructions which use a condition code) are
 defined as 'x' (dontcare), which matches our understanding of the hardware
 (but also lets the disassembler flag cases where '1' bits show up in places
 we don't expect, which may signal a new instruction (sub)encoding).

 You'll notice that ``SRC1`` refers back to a different bitset hierarchy
 that describes various different src register encoding (used for cat2 and
 cat4 instructions), i.e. GPR vs CONST vs relative GPR/CONST.  For fields
 which have bitset types, parameters can be "passed" in via ``<param>``
 elements, which can be referred to by the display template string, and/or
 expressions.  For example, this helps to deal with cases where other fields
 outside of that bitset control the encoding/decoding, such as in the
 ``#multisrc`` example:

 .. code-block:: xml

    <bitset name="#multisrc" size="16">
    	<doc>
    		Encoding for instruction source which can be GPR/CONST/IMMED
    		or relative GPR/CONST.
    	</doc>
    </bitset>

    ...

    <bitset name="#multisrc-gpr" extends="#multisrc">
    	<display>
    		{ABSNEG}{SRC_R}{HALF}{SRC}
    	</display>
    	<derived name="HALF" expr="#multisrc-half" type="bool" display="h"/>
    	<field name="SRC" low="0" high="7" type="#reg-gpr"/>
    	<pattern low="8" high="13">000000</pattern>
    	<field name="ABSNEG" low="14" high="15" type="#absneg"/>
    </bitset>

 At some level in the bitset inheritance hierarchy, there is expected to be a
 ``<display>`` element specifying a template string used during bitset
 decoding.  The display template consists of references to fields (which may
 be derived fields) specified as ``{FIELDNAME}`` and other characters
 which are just echoed through to the resulting decoded bitset.

 It is possible to define a line column alignment value per field to influence
 the visual output. It needs to be specified as ``{FIELDNAME:align=xx}``.

 The ``<override>`` element will be described in the next section, but it
 provides for both different decoded instruction syntax/mnemonics (when
 simply providing a different display template string) as well as instruction
 encoding where different ranges of bits have a different meaning based on
 some other bitfield (or combination of bitfields).  In this example it is
 used to cover the cases where ``SRCn_R`` has a different meaning and a
 different disassembly syntax depending on whether ``REPEAT`` equals zero.

 Overrides
 ---------

 In many cases, a bitset is not convenient for describing the expected
 disasm syntax, and/or interpretation of some range of bits differs based
 on some other field or combination of fields.  These *could* be modeled
 as different derived bitsets, at the expense of a combinatorical explosion
 of the size of the bitset inheritance tree.  For example, *every* cat2
 (and cat3) instruction has both a ``(nopN)`` interpretation in addition to
 the ``(rptN`)`` interpretation.

 An ``<override>`` in a bitset allows to redefine the display string, and/or
 field definitions from the default case.  If the override's expr(ession)
 evaluates to non-zero, ``<display>``, ``<field>``, and ``<derived>``
 elements take precedence over what is defined in the toplevel of the
 bitset (i.e. the default case).

 Expressions
 -----------

 Both ``<override>`` and ``<derived>`` fields make use of ``<expr>`` elements,
 either defined inline, or defined and named at the top level and referred to
 by name in multiple other places.  An expression is a simple 'C' expression
 which can reference fields (including other derived fields) with the same
 ``{FIELDNAME}`` syntax as display template strings.  For example:

 .. code-block:: xml

    <expr name="#cat2-cat3-nop-encoding">
    	(({SRC1_R} != 0) || ({SRC2_R} != 0)) &amp;&amp; ({REPEAT} == 0)
    </expr>

 In the case of ``<override>`` elements, the override applies if the expression
 evaluates to non-zero.  In the case of ``<derived>`` fields, the expression
 evaluates to the value of the derived field.

 Encoding
 --------

 To facilitate instruction encoding, ``<encode>`` elements can be provided
 to teach the generated instruction packing code how to map from data structures
 representing the IR to fields.  For example:

 .. code-block:: xml

    <bitset name="#instruction" size="64">
    	<doc>
    		Encoding of an ir3 instruction.  All instructions are 64b.
    	</doc>
    	<gen min="300"/>
    	<encode type="struct ir3_instruction *" case-prefix="OPC_">
    		<!--
    			Define mapping from encode src to individual fields,
    			which are common across all instruction categories
    			at the root instruction level

    			Not all of these apply to all instructions, but we
    			can define mappings here for anything that is used
    			in more than one instruction category.  For things
    			that are specific to a single instruction category,
    			mappings should be defined at that level instead.
    		 -->
    		<map name="DST">src->regs[0]</map>
    		<map name="SRC1">src->regs[1]</map>
    		<map name="SRC2">src->regs[2]</map>
    		<map name="SRC3">src->regs[3]</map>
    		<map name="REPEAT">src->repeat</map>
    		<map name="SS">!!(src->flags &amp; IR3_INSTR_SS)</map>
    		<map name="JP">!!(src->flags &amp; IR3_INSTR_JP)</map>
    		<map name="SY">!!(src->flags &amp; IR3_INSTR_SY)</map>
    		<map name="UL">!!(src->flags &amp; IR3_INSTR_UL)</map>
    		<map name="EQ">0</map>  <!-- We don't use this (yet) -->
    		<map name="SAT">!!(src->flags &amp; IR3_INSTR_SAT)</map>
    	</encode>
    </bitset>

 The ``type`` attribute specifies that the input to encoding an instruction
 is a ``struct ir3_instruction *``.  In the case of bitset hierarchies with
 multiple possible leaf nodes, a ``case-prefix`` attribute should be supplied
 along with a function that maps the bitset encode source to an enum value
 with the specified prefix prepended to uppercase'd leaf node name.  I.e. in
 this case, "add.f" becomes ``OPC_ADD_F``.

 Individual ``<map>`` elements teach the encoder how to map from the encode
 source to fields in the encoded instruction.
	ISASPEC - XML Based ISA Specification
	=====================================

	isaspec provides a mechanism to describe an instruction set in XML, and
	generate a disassembler and assembler. The intention is
	to describe the instruction set more formally than hand-coded assembler
	and disassembler, and better decouple the shader compiler from the
	underlying instruction encoding to simplify dealing with instruction
	encoding differences between generations of GPU.

	Benefits of a formal ISA description, compared to hand-coded assemblers
	and disassemblers, include easier detection of new bit combinations that
	were not seen before in previous generations due to more rigorous
	description of bits that are expect to be '0' or '1' or 'x' (dontcare)
	and verification that different encodings don't have conflicting bits
	(i.e. that the specification cannot result in more than one valid
	interpretation of any bit pattern).

	The isaspec tool and XML schema are intended to be generic (not specific
	to ir3), although there are currently a couple limitations due to short-
	cuts taken to get things up and running (which are mostly not inherent to
	the XML schema, and should not be too difficult to remove from the py and
	decode/disasm utility):

	* Maximum "field" size is 64b
	* Fixed instruction size

	Often times, especially when new functionality is added in later gens
	while retaining (or at least mostly retaining) backwards compatibility
	with encodings used in earlier generations, the actual encoding can be
	rather messy to describe. To support this, isaspec provides many flexible
	mechanism, such as conditional overrides and derived fields. This not
	only allows for describing an irregular instruction encoding, but also
	allows matching an existing disasm syntax (which might not have been
	design around the idea of disassembly based on a formal ISA description).

	Bitsets
	-------

	The fundamental concept of matching a bit-pattern to an instruction
	decoding/encoding is the concept of a hierarchical tree of bitsets.
	This is intended to match how the HW decodes instructions, where certain
	bits describe the instruction (and sub-encoding, and so on), and other
	bits describe various operands to the instruction.

	Bitsets can also be used recursively as the type of a field described
	in another bitset.

	The leaves of the tree of instruction bitsets represent every possible
	instruction. Deciding which instruction a bitpattern is amounts to:

	.. code-block:: c

	m = (val & bitsets[n]->mask) & ~bitsets[n]->dontcare;

	if (m == bitsets[n]->match) {
	/* we've found the instruction description */
	}

	For example, the starting point to decode an ir3 instruction is a 64b
	bitset:

	.. code-block:: xml

	<bitset name="#instruction" size="64">
	<doc>
	Encoding of an ir3 instruction. All instructions are 64b.
	</doc>
	</bitset>

	In the first level of instruction encoding hierarchy, the high three bits
	group things into instruction "categories":

	.. code-block:: xml

	<bitset name="#instruction-cat2" extends="#instruction">
	<field name="DST" low="32" high="39" type="#reg-gpr"/>
	<field name="REPEAT" low="40" high="41" type="#rptN"/>
	<field name="SAT" pos="42" type="bool" display="(sat)"/>
	<field name="SS" pos="44" type="bool" display="(ss)"/>
	<field name="UL" pos="45" type="bool" display="(ul)"/>
	<field name="DST_CONV" pos="46" type="bool">
	<doc>
	Destination register is opposite precision as source, i.e.
	if {FULL} is true then destination is half precision, and
	visa versa.
	</doc>
	</field>
	<derived name="DST_HALF" expr="#dest-half" type="bool" display="h"/>
	<field name="EI" pos="47" type="bool" display="(ei)"/>
	<field name="FULL" pos="52" type="bool">
	<doc>Full precision source registers</doc>
	</field>
	<field name="JP" pos="59" type="bool" display="(jp)"/>
	<field name="SY" pos="60" type="bool" display="(sy)"/>
	<pattern low="61" high="63">010</pattern> <!-- cat2 -->
	<!--
	NOTE, both SRC1_R and SRC2_R are defined at this level because
	SRC2_R is still a valid bit for (nopN) (REPEAT==0) for cat2
	instructions with only a single src
	-->
	<field name="SRC1_R" pos="43" type="bool" display="(r)"/>
	<field name="SRC2_R" pos="51" type="bool" display="(r)"/>
	<derived name="ZERO" expr="#zero" type="bool" display=""/>
	</bitset>

	The ``<pattern>`` elements are the part(s) that determine which leaf-node
	bitset matches against a given bit pattern. The leaf node's match/mask/
	dontcare bitmasks are a combination of those defined at the leaf node and
	recursively each parent bitclass.

	For example, cat2 instructions (ALU instructions with up to two src
	registers) can have either one or two source registers:

	.. code-block:: xml

	<bitset name="#instruction-cat2-1src" extends="#instruction-cat2">
	<override expr="#cat2-cat3-nop-encoding">
	<display>
	{SY}{SS}{JP}{SAT}(nop{NOP}) {UL}{NAME} {EI}{DST_HALF}{DST}, {SRC1}
	</display>
	<derived name="NOP" expr="#cat2-cat3-nop-value" type="uint"/>
	<field name="SRC1" low="0" high="15" type="#multisrc">
	<param name="ZERO" as="SRC_R"/>
	<param name="FULL"/>
	</field>
	</override>
	<display>
	{SY}{SS}{JP}{SAT}{REPEAT}{UL}{NAME} {EI}{DST_HALF}{DST}, {SRC1}
	</display>
	<pattern low="16" high="31">xxxxxxxxxxxxxxxx</pattern>
	<pattern low="48" high="50">xxx</pattern> <!-- COND -->
	<field name="SRC1" low="0" high="15" type="#multisrc">
	<param name="SRC1_R" as="SRC_R"/>
	<param name="FULL"/>
	</field>
	</bitset>

	<bitset name="absneg.f" extends="#instruction-cat2-1src">
	<pattern low="53" high="58">000110</pattern>
	</bitset>

	In this example, ``absneg.f`` is a concrete cat2 instruction (leaf node of
	the bitset inheritance tree) which has a single src register. At the
	``#instruction-cat2-1src`` level, bits that are used for the 2nd src arg
	and condition code (for cat2 instructions which use a condition code) are
	defined as 'x' (dontcare), which matches our understanding of the hardware
	(but also lets the disassembler flag cases where '1' bits show up in places
	we don't expect, which may signal a new instruction (sub)encoding).

	You'll notice that ``SRC1`` refers back to a different bitset hierarchy
	that describes various different src register encoding (used for cat2 and
	cat4 instructions), i.e. GPR vs CONST vs relative GPR/CONST. For fields
	which have bitset types, parameters can be "passed" in via ``<param>``
	elements, which can be referred to by the display template string, and/or
	expressions. For example, this helps to deal with cases where other fields
	outside of that bitset control the encoding/decoding, such as in the
	``#multisrc`` example:

	.. code-block:: xml

	<bitset name="#multisrc" size="16">
	<doc>
	Encoding for instruction source which can be GPR/CONST/IMMED
	or relative GPR/CONST.
	</doc>
	</bitset>

	...

	<bitset name="#multisrc-gpr" extends="#multisrc">
	<display>
	{ABSNEG}{SRC_R}{HALF}{SRC}
	</display>
	<derived name="HALF" expr="#multisrc-half" type="bool" display="h"/>
	<field name="SRC" low="0" high="7" type="#reg-gpr"/>
	<pattern low="8" high="13">000000</pattern>
	<field name="ABSNEG" low="14" high="15" type="#absneg"/>
	</bitset>

	At some level in the bitset inheritance hierarchy, there is expected to be a
	``<display>`` element specifying a template string used during bitset
	decoding. The display template consists of references to fields (which may
	be derived fields) specified as ``{FIELDNAME}`` and other characters
	which are just echoed through to the resulting decoded bitset.

	It is possible to define a line column alignment value per field to influence
	the visual output. It needs to be specified as ``{FIELDNAME:align=xx}``.

	The ``<override>`` element will be described in the next section, but it
	provides for both different decoded instruction syntax/mnemonics (when
	simply providing a different display template string) as well as instruction
	encoding where different ranges of bits have a different meaning based on
	some other bitfield (or combination of bitfields). In this example it is
	used to cover the cases where ``SRCn_R`` has a different meaning and a
	different disassembly syntax depending on whether ``REPEAT`` equals zero.

	Overrides
	---------

	In many cases, a bitset is not convenient for describing the expected
	disasm syntax, and/or interpretation of some range of bits differs based
	on some other field or combination of fields. These could be modeled
	as different derived bitsets, at the expense of a combinatorical explosion
	of the size of the bitset inheritance tree. For example, every cat2
	(and cat3) instruction has both a ``(nopN)`` interpretation in addition to
	the ``(rptN`)`` interpretation.

	An ``<override>`` in a bitset allows to redefine the display string, and/or
	field definitions from the default case. If the override's expr(ession)
	evaluates to non-zero, ``<display>``, ``<field>``, and ``<derived>``
	elements take precedence over what is defined in the toplevel of the
	bitset (i.e. the default case).

	Expressions
	-----------

	Both ``<override>`` and ``<derived>`` fields make use of ``<expr>`` elements,
	either defined inline, or defined and named at the top level and referred to
	by name in multiple other places. An expression is a simple 'C' expression
	which can reference fields (including other derived fields) with the same
	``{FIELDNAME}`` syntax as display template strings. For example:

	.. code-block:: xml

	<expr name="#cat2-cat3-nop-encoding">
	(({SRC1_R} != 0) \|\| ({SRC2_R} != 0)) && ({REPEAT} == 0)
	</expr>

	In the case of ``<override>`` elements, the override applies if the expression
	evaluates to non-zero. In the case of ``<derived>`` fields, the expression
	evaluates to the value of the derived field.

	Encoding
	--------

	To facilitate instruction encoding, ``<encode>`` elements can be provided
	to teach the generated instruction packing code how to map from data structures
	representing the IR to fields. For example:

	.. code-block:: xml

	<bitset name="#instruction" size="64">
	<doc>
	Encoding of an ir3 instruction. All instructions are 64b.
	</doc>
	<gen min="300"/>
	<encode type="struct ir3_instruction *" case-prefix="OPC_">
	<!--
	Define mapping from encode src to individual fields,
	which are common across all instruction categories
	at the root instruction level

	Not all of these apply to all instructions, but we
	can define mappings here for anything that is used
	in more than one instruction category. For things
	that are specific to a single instruction category,
	mappings should be defined at that level instead.
	-->
	<map name="DST">src->regs[0]</map>
	<map name="SRC1">src->regs[1]</map>
	<map name="SRC2">src->regs[2]</map>
	<map name="SRC3">src->regs[3]</map>
	<map name="REPEAT">src->repeat</map>
	<map name="SS">!!(src->flags & IR3_INSTR_SS)</map>
	<map name="JP">!!(src->flags & IR3_INSTR_JP)</map>
	<map name="SY">!!(src->flags & IR3_INSTR_SY)</map>
	<map name="UL">!!(src->flags & IR3_INSTR_UL)</map>
	<map name="EQ">0</map> <!-- We don't use this (yet) -->
	<map name="SAT">!!(src->flags & IR3_INSTR_SAT)</map>
	</encode>
	</bitset>

	The ``type`` attribute specifies that the input to encoding an instruction
	is a ``struct ir3_instruction *``. In the case of bitset hierarchies with
	multiple possible leaf nodes, a ``case-prefix`` attribute should be supplied
	along with a function that maps the bitset encode source to an enum value
	with the specified prefix prepended to uppercase'd leaf node name. I.e. in
	this case, "add.f" becomes ``OPC_ADD_F``.

	Individual ``<map>`` elements teach the encoder how to map from the encode
	source to fields in the encoded instruction.