 .\"Generated by db2man.xsl. Don't modify this, modify the source.
 .de Sh \" Subsection
 .br
 .if t .Sp
 .ne 5
 .PP
 \fB\\$1\fR
 .PP
 ..
 .de Sp \" Vertical space (when we can't use .PP)
 .if t .sp .5v
 .if n .sp
 ..
 .de Ip \" List item
 .br
 .ie \\n(.$>=3 .ne \\$3
 .el .ne 3
 .IP "\\$1" \\$2
 ..
 .TH "YASM_ARCH" 7 "September 2004" "YASM" "YASM Architectures"
 .SH NAME
 yasm_arch \- YASM Architectures
 .SH "SYNOPSIS"
 .ad l
 .hy 0
 .HP 5
 \fByasm\fR \fB\-a\ \fIarch\fR\fR [\fB\-m\ \fImachine\fR\fR] \fB\fI\&.\&.\&.\fR\fR
 .ad
 .hy

 .SH "DESCRIPTION"

 .PP
 The standard YASM distribution includes a number of loadable modules for different target architectures\&. Additional target architectures may be installed as third\-party modules\&. Each target architecture can support one or more machine architectures\&.

 .PP
 The architecture and machine are selected on the \fByasm\fR(1) command line with the \fB\-a \fIarch\fR\fR and \fB\-m \fImachine\fR\fR options, respectively\&.
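
 .PP
 For example, assuming a hypothetical source file named \fIprog\&.asm\fR, the ``amd64'' machine of the ``x86'' architecture might be selected as follows:

 .IP
 yasm \-a x86 \-m amd64 prog\&.asm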

 .SH "X86 ARCHITECTURE"

 .PP
 The ``x86'' architecture supports the IA\-32 instruction set and its derivatives as well as the AMD64 instruction set\&. It consists of two machines: ``x86'' (for IA\-32 and derivatives) and ``amd64'' (for AMD64 and derivatives)\&. The default machine for the ``x86'' architecture is the ``x86'' machine\&.

 .SS "BITS Setting"

 .PP
 The x86 architecture BITS setting tells YASM the processor mode in which the generated code is intended to execute\&. x86 processors can run in three major execution modes: 16\-bit, 32\-bit, and, on AMD64\-supporting processors, 64\-bit\&. Because parts of the x86 instruction set behave differently depending on the execution mode (such as the operand\-size and address\-size override prefixes), YASM cannot assemble x86 instructions correctly unless the user tells it in what processor mode the code will execute\&.

 .PP
 The BITS setting can be changed in a variety of ways\&. When using the NASM\-compatible parser, the BITS setting can be changed directly via the use of the \fBBITS xx\fR assembler directive\&. The default BITS setting is determined by the object format in use\&.
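
 .PP
 As a brief sketch in NASM syntax (the instruction shown is only illustrative), a source file intended to execute in 64\-bit mode might begin with:

 .IP
 BITS 64
 .IP
 mov rax, rbx  ; assembled for 64\-bit mode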

 .SS "BITS 64 Extensions"

 .PP
 When an AMD64\-supporting processor is executing in 64\-bit mode, a number of additional extensions are available, including extra general purpose registers, extra SSE2 registers, and RIP\-relative addressing\&.

 .PP
 The additional 64\-bit general purpose registers are named r8\-r15\&. There are also 8\-bit (rXb), 16\-bit (rXw), and 32\-bit (rXd) subregisters that map to the least significant 8, 16, or 32 bits of the 64\-bit register\&. The original 8 general purpose registers have also been extended to 64 bits: eax, edx, ecx, ebx, esi, edi, esp, and ebp have new 64\-bit versions called rax, rdx, rcx, rbx, rsi, rdi, rsp, and rbp respectively\&. The old 32\-bit registers map to the least significant 32 bits of the new 64\-bit registers\&.
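
 .PP
 For instance, in NASM syntax (an illustrative sketch; the immediate values are arbitrary):

 .IP
 mov r8b, 1   ; least significant 8 bits of r8
 .IP
 mov r9w, 2   ; least significant 16 bits of r9
 .IP
 mov r10d, 3  ; least significant 32 bits of r10
 .IP
 mov r11, 4   ; full 64\-bit register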

 .PP
 New 8\-bit registers are also available that map to the 8 least significant bits of rsi, rdi, rsp, and rbp\&. These are called sil, dil, spl, and bpl respectively\&. Unfortunately, due to the way instructions are encoded, these new 8\-bit registers share their encodings with the old 8\-bit registers ah, dh, ch, and bh\&. The processor distinguishes the two sets by the presence of the new REX prefix, which is also used to specify the other extended registers\&. It is therefore illegal to use ah, dh, ch, or bh in an instruction that requires the REX prefix for other reasons\&. For instance:

 .IP
 add ah, [r10]
 .PP
 (NASM syntax) is not a legal instruction because the use of r10 requires a REX prefix, making it impossible to use ah\&.
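
 .PP
 By contrast, the following forms (NASM syntax, shown as an illustrative sketch) avoid the conflict:

 .IP
 add al, [r10]  ; legal: al may be combined with a REX prefix
 .IP
 add ah, [rbx]  ; legal: no REX prefix is required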

 .PP
 In 64\-bit mode, an additional 8 SSE2 registers are also available\&. These are named xmm8\-xmm15\&.
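
 .PP
 These are used in the same way as the original eight registers; for example, in NASM syntax (an illustrative sketch):

 .IP
 paddd xmm8, xmm15  ; SSE2 packed add using two of the extended registers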

 .PP
 By default, most operations in 64\-bit mode remain 32\-bit; operations that are 64\-bit usually require a REX prefix (one bit in the REX prefix determines whether an operation is 64\-bit or 32\-bit)\&. Thus, essentially all 32\-bit instructions have a 64\-bit version, and the 64\-bit versions of instructions can use extended registers ``for free'' (as the REX prefix is already present)\&. Examples in NASM syntax:

 .IP
 mov eax, 1  ; 32\-bit instruction
 .IP
 mov rcx, 1  ; 64\-bit instruction
 .PP
 Instructions that modify the stack (push, pop, call, ret, enter, and leave) are implicitly 64\-bit\&. Their 32\-bit counterparts are not available, but their 16\-bit counterparts are\&. Examples in NASM syntax:

 .IP
 push eax  ; illegal instruction
 .IP
 push rbx  ; 1\-byte instruction
 .IP
 push r11  ; 2\-byte instruction with REX prefix
 .PP
 Results of 32\-bit operations are implicitly zero\-extended to the upper 32 bits of the corresponding 64\-bit register\&. 16\-bit and 8\-bit operations, on the other hand, do not affect the upper bits of the register (just as in 32\-bit and 16\-bit modes)\&. This can be used to generate smaller code in some instances\&. Examples in NASM syntax:

 .IP
 mov ecx, 1  ; shorter than mov rcx, 1
 .IP
 and edx, 3  ; equivalent to and rdx, 3
 .PP
 For most instructions in 64\-bit mode, immediate values remain 32 bits; their value is sign\-extended into the upper 32 bits of the target register prior to being used\&. The exception is the mov instruction, which can take a 64\-bit immediate when the destination is a 64\-bit register\&. Examples in NASM syntax:

 .IP
 add rax, 1                  ; legal
 .IP
 add rax, 0xffffffff         ; sign\-extended
 .IP
 add rax, \-1                 ; same as above
 .IP
 add rax, 0xffffffffffffffff ; warning (>32 bit)
 .IP
 mov eax, 1                  ; 5 byte instruction
 .IP
 mov rax, 1                  ; 10 byte instruction
 .IP
 mov rbx, 0x1234567890abcdef ; 10 byte instruction
 .IP
 mov rcx, 0xffffffff         ; 10 byte instruction
 .IP
 mov ecx, \-1 ; 5 byte instruction equivalent to above
 .PP
 Just like immediates, displacements, for the most part, remain 32 bits and are sign\-extended prior to use\&. Again, the exception is one restricted form of the mov instruction: between the al/ax/eax/rax register and a 64\-bit absolute address (no registers allowed in the effective address)\&. In NASM syntax, use of the 64\-bit absolute form requires \fB[qword]\fR\&. Examples in NASM syntax:

 .IP
 mov eax, [1]    ; 32 bit, with sign extension
 .IP
 mov al, [rax\-1] ; 32 bit, with sign extension
 .IP
 mov al, [qword 0x1122334455667788] ; 64\-bit absolute
 .IP
 mov al, [0x1122334455667788] ; truncated to 32\-bit (warning)
 .PP
 In 64\-bit mode, a new form of effective addressing is available to make it easier to write position\-independent code\&. Any memory reference may be made RIP relative (RIP is the instruction pointer register, which contains the address of the location immediately following the current instruction)\&.

 .PP
 In NASM syntax, there are two ways to specify RIP\-relative addressing:

 .IP
 mov dword [rip+10], 1
 .PP
 stores the value 1 ten bytes after the end of the instruction\&. \fB10\fR can also be a symbolic constant, and will be treated the same way\&. On the other hand,

 .IP
 mov dword [symb wrt rip], 1
 .PP
 stores the value 1 into the address of symbol \fBsymb\fR\&. This is distinctly different from the behavior of:

 .IP
 mov dword [symb+rip], 1
 .PP
 which takes the address of the end of the instruction, adds the address of \fBsymb\fR to it, then stores the value 1 there\&. If \fBsymb\fR is a variable, this will NOT store the value 1 into the \fBsymb\fR variable!
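
 .PP
 A minimal sketch of position\-independent access to a variable in NASM syntax, assuming a hypothetical label \fBcounter\fR defined in the same file:

 .IP
 counter: dd 0
 .IP
 mov eax, [counter wrt rip]      ; loads the value stored at counter
 .IP
 mov dword [counter wrt rip], 1  ; stores 1 into counter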

 .SH "LC3B ARCHITECTURE"

 .PP
 The ``lc3b'' architecture supports the LC\-3b ISA as used in the ECE 312 (now ECE 411) course at the University of Illinois, Urbana\-Champaign, as well as other university courses\&. See \fIhttp://courses.ece.uiuc.edu/ece411/\fR for more details and example code\&. The ``lc3b'' architecture consists of only one machine: ``lc3b''\&.

 .SH "SEE ALSO"

 .PP
 \fByasm\fR(1)

 .SH "BUGS"

 .PP
 When using the ``x86'' architecture, it is all too easy to generate AMD64 code (using the \fBBITS 64\fR directive) yet produce a 32\-bit object file (by failing to specify \fB\-m amd64\fR on the command line)\&. Similarly, specifying \fB\-m amd64\fR does not default the BITS setting to 64\&.

 .SH AUTHOR
 Peter Johnson <peter@tortall\&.net>.