| This is as.info, produced by makeinfo version 4.3 from as.texinfo. |
| |
| START-INFO-DIR-ENTRY |
| * As: (as). The GNU assembler. |
| * Gas: (as). The GNU assembler. |
| END-INFO-DIR-ENTRY |
| |
| This file documents the GNU Assembler "as". |
| |
| Copyright (C) 1991, 92, 93, 94, 95, 96, 97, 98, 99, 2000, 2001, 2002 |
| Free Software Foundation, Inc. |
| |
| Permission is granted to copy, distribute and/or modify this document |
| under the terms of the GNU Free Documentation License, Version 1.1 or |
| any later version published by the Free Software Foundation; with no |
| Invariant Sections, with no Front-Cover Texts, and with no Back-Cover |
| Texts. A copy of the license is included in the section entitled "GNU |
| Free Documentation License". |
| |
| |
| File: as.info, Node: Comments, Next: Symbol Intro, Prev: Whitespace, Up: Syntax |
| |
| Comments |
| ======== |
| |
| There are two ways of rendering comments to `as'. In both cases the |
| comment is equivalent to one space. |
| |
| Anything from `/*' through the next `*/' is a comment. This means |
| you may not nest these comments. |
| |
| /* |
| The only way to include a newline ('\n') in a comment |
| is to use this sort of comment. |
| */ |
| |
| /* This sort of comment does not nest. */ |
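
The rule above can be modeled in a few lines of Python. This is only a sketch, not how `as' is implemented; the function name `strip_block_comments` is hypothetical, and the sketch does not treat `/*' inside string literals specially. Each comment is replaced by one space, and because matching stops at the first `*/', comments cannot nest.

```python
import re


def strip_block_comments(source: str) -> str:
    """Replace each /* ... */ comment with a single space."""
    # Non-greedy match: a comment ends at the first "*/", so nesting is impossible.
    # re.DOTALL lets "." match newlines, so a comment may span several lines.
    return re.sub(r"/\*.*?\*/", " ", source, flags=re.DOTALL)
```

For example, `strip_block_comments("a/* /* x */ b */")` leaves the trailing ` b */` in place, because the comment ended at the first `*/`.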
| |
| Anything from the "line comment" character to the next newline is |
| considered a comment and is ignored. The line comment character is `;' |
| for the AMD 29K family; `;' on the ARC; `@' on the ARM; `;' for the |
| H8/300 family; `!' for the H8/500 family; `;' for the HPPA; `#' on the |
| i386 and x86-64; `#' on the i960; `;' for the PDP-11; `;' for picoJava; |
| `;' for Motorola PowerPC; `!' for the Renesas / SuperH SH; `!' on the |
| SPARC; `#' on the ip2k; `#' on the m32r; `|' on the 680x0; `#' on the |
| 68HC11 and 68HC12; `;' on the M880x0; `#' on the Vax; `!' for the Z8000; |
| `#' on the V850; `#' for Xtensa systems; see *Note Machine |
| Dependencies::. |
| |
| On some machines there are two different line comment characters. |
| One character only begins a comment if it is the first non-whitespace |
| character on a line, while the other always begins a comment. |
| |
| The V850 assembler also supports a double dash as starting a comment |
| that extends to the end of the line. |
| |
| `--'; |
| |
| To be compatible with past assemblers, lines that begin with `#' |
| have a special interpretation. Following the `#' should be an absolute |
| expression (*note Expressions::): the logical line number of the _next_ |
| line. Then a string (*note Strings: Strings.) is allowed: if present |
| it is a new logical file name. The rest of the line, if any, should be |
| whitespace. |
| |
| If the first non-whitespace characters on the line are not numeric, |
| the line is ignored. (Just like a comment.) |
| |
| # This is an ordinary comment. |
| # 42-6 "new_file_name" # New logical file name |
| # This is logical line # 36. |
| |
| This feature is deprecated, and may disappear from future versions |
| of `as'. |
| |
| |
| File: as.info, Node: Symbol Intro, Next: Statements, Prev: Comments, Up: Syntax |
| |
| Symbols |
| ======= |
| |
| A "symbol" is one or more characters chosen from the set of all |
| letters (both upper and lower case), digits and the two characters |
| `_.'. On most machines, you can also use `$' in symbol names; |
| exceptions are noted in *Note Machine Dependencies::. No symbol may |
| begin with a digit. Case is significant. There is no length limit: |
| all characters are significant. Symbols are delimited by characters |
| not in that set, or by the beginning of a file (since the source |
| program must end with a newline, the end of a file is not a possible |
| symbol delimiter). *Note Symbols::. |
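
As an illustration, the character rules above fit in one regular expression. This is a sketch under two assumptions: letters are ASCII only, and the target is one of the machines that allow `$'. The names `SYMBOL_RE` and `is_symbol` are hypothetical.

```python
import re

# Letters, digits, "_", "." and (on most machines) "$";
# the first character may not be a digit.
SYMBOL_RE = re.compile(r"[A-Za-z_.$][A-Za-z0-9_.$]*")


def is_symbol(text: str) -> bool:
    """Return True if text is a well-formed symbol under the rules above."""
    return SYMBOL_RE.fullmatch(text) is not None
```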
| |
| |
| File: as.info, Node: Statements, Next: Constants, Prev: Symbol Intro, Up: Syntax |
| |
| Statements |
| ========== |
| |
| A "statement" ends at a newline character (`\n') or line separator |
| character. (The line separator is usually `;', unless this conflicts |
| with the comment character; *note Machine Dependencies::.) The newline |
| or separator character is considered part of the preceding statement. |
| Newlines and separators within character constants are an exception: |
| they do not end statements. |
| |
| It is an error to end any statement with end-of-file: the last |
| character of any input file should be a newline. |
| |
| An empty statement is allowed, and may include whitespace. It is |
| ignored. |
| |
| A statement begins with zero or more labels, optionally followed by a |
| key symbol which determines what kind of statement it is. The key |
| symbol determines the syntax of the rest of the statement. If the |
| symbol begins with a dot `.' then the statement is an assembler |
| directive: typically valid for any computer. If the symbol begins with |
| a letter the statement is an assembly language "instruction": it |
| assembles into a machine language instruction. Different versions of |
| `as' for different computers recognize different instructions. In |
| fact, the same symbol may represent a different instruction in a |
| different computer's assembly language. |
| |
| A label is a symbol immediately followed by a colon (`:'). |
| Whitespace before a label or after a colon is permitted, but you may not |
| have whitespace between a label's symbol and its colon. *Note Labels::. |
| |
| For HPPA targets, labels need not be immediately followed by a |
| colon, but the definition of a label must begin in column zero. This |
| also implies that only one label may be defined on each line. |
| |
| label: .directive followed by something |
| another_label: # This is an empty statement. |
| instruction operand_1, operand_2, ... |
| |
| |
| File: as.info, Node: Constants, Prev: Statements, Up: Syntax |
| |
| Constants |
| ========= |
| |
| A constant is a number, written so that its value is known by |
| inspection, without knowing any context. Like this: |
| .byte 74, 0112, 092, 0x4A, 0X4a, 'J, '\J # All the same value. |
| .ascii "Ring the bell\7" # A string constant. |
| .octa 0x123456789abcdef0123456789ABCDEF0 # A bignum. |
| .float 0f-314159265358979323846264338327\ |
| 95028841971.693993751E-40 # - pi, a flonum. |
| |
| * Menu: |
| |
| * Characters:: Character Constants |
| * Numbers:: Number Constants |
| |
| |
| File: as.info, Node: Characters, Next: Numbers, Up: Constants |
| |
| Character Constants |
| ------------------- |
| |
| There are two kinds of character constants. A "character" stands |
| for one character in one byte and its value may be used in numeric |
| expressions. String constants (properly called string _literals_) are |
| potentially many bytes and their values may not be used in arithmetic |
| expressions. |
| |
| * Menu: |
| |
| * Strings:: Strings |
| * Chars:: Characters |
| |
| |
| File: as.info, Node: Strings, Next: Chars, Up: Characters |
| |
| Strings |
| ....... |
| |
| A "string" is written between double-quotes. It may contain |
| double-quotes or null characters. The way to get special characters |
| into a string is to "escape" these characters: precede them with a |
| backslash `\' character. For example `\\' represents one backslash: |
| the first `\' is an escape which tells `as' to interpret the second |
| character literally as a backslash (which prevents `as' from |
| recognizing the second `\' as an escape character). The complete list |
| of escapes follows. |
| |
| `\b' |
| Mnemonic for backspace; for ASCII this is octal code 010. |
| |
| `\f' |
| Mnemonic for FormFeed; for ASCII this is octal code 014. |
| |
| `\n' |
| Mnemonic for newline; for ASCII this is octal code 012. |
| |
| `\r' |
| Mnemonic for carriage-Return; for ASCII this is octal code 015. |
| |
| `\t' |
| Mnemonic for horizontal Tab; for ASCII this is octal code 011. |
| |
| `\ DIGIT DIGIT DIGIT' |
| An octal character code. The numeric code is 3 octal digits. For |
| compatibility with other Unix systems, 8 and 9 are accepted as |
| digits: for example, `\008' has the value 010, and `\009' the |
| value 011. |
| |
| `\`x' HEX-DIGITS...' |
| A hex character code. All trailing hex digits are combined. |
| Either upper or lower case `x' works. |
| |
| `\\' |
| Represents one `\' character. |
| |
| `\"' |
| Represents one `"' character. Needed in strings to represent this |
| character, because an unescaped `"' would end the string. |
| |
| `\ ANYTHING-ELSE' |
| Any other character when escaped by `\' gives a warning, but |
| assembles as if the `\' was not present. The idea is that if you |
| used an escape sequence you clearly didn't want the literal |
| interpretation of the following character. However `as' has no |
| other interpretation, so `as' knows it is giving you the wrong |
| code and warns you of the fact. |
| |
| Which characters are escapable, and what those escapes represent, |
| varies widely among assemblers. The current set is what we think the |
| BSD 4.2 assembler recognizes, and is a subset of what most C compilers |
| recognize. If you are in doubt, do not use an escape sequence. |
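
The escape list above can be sketched as a small decoder. This is an illustration rather than the code `as' uses. It assumes ASCII input, reads at most three digits for a numeric escape, and keeps only the low eight bits of a hex escape; the last two choices are assumptions, and the name `decode_string` is hypothetical.

```python
import warnings

SIMPLE = {"b": "\b", "f": "\f", "n": "\n", "r": "\r", "t": "\t",
          "\\": "\\", '"': '"'}
DIGITS = "0123456789"
HEX_DIGITS = "0123456789abcdefABCDEF"


def decode_string(body: str) -> bytes:
    """Decode the text found between the double quotes of a string."""
    out = bytearray()
    i = 0
    while i < len(body):
        ch = body[i]
        i += 1
        if ch != "\\":
            out.append(ord(ch))          # ordinary character (ASCII assumed)
            continue
        if i >= len(body):
            raise ValueError("backslash at end of string")
        esc = body[i]
        i += 1
        if esc in SIMPLE:
            out.append(ord(SIMPLE[esc]))
        elif esc in DIGITS:
            # Up to three digits read in base 8; 8 and 9 are tolerated,
            # so \008 has the value 010 (octal).
            value, count = int(esc), 1
            while count < 3 and i < len(body) and body[i] in DIGITS:
                value = value * 8 + int(body[i])
                i += 1
                count += 1
            out.append(value & 0xFF)
        elif esc in "xX":
            # All trailing hex digits are combined.
            value = 0
            while i < len(body) and body[i] in HEX_DIGITS:
                value = value * 16 + int(body[i], 16)
                i += 1
            out.append(value & 0xFF)
        else:
            # Any other escape: warn, then act as if the backslash were absent.
            warnings.warn(f"unknown escape \\{esc}")
            out.append(ord(esc))
    return bytes(out)
```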
| |
| |
| File: as.info, Node: Chars, Prev: Strings, Up: Characters |
| |
| Characters |
| .......... |
| |
| A single character may be written as a single quote immediately |
| followed by that character. The same escapes apply to characters as to |
| strings. So if you want to write the character backslash, you must |
| write `'\\' where the first `\' escapes the second `\'. As you can |
| see, the quote is an acute accent, not a grave accent. A newline |
| immediately following an acute accent is taken as a literal character |
| and does not count as the end of a statement. The value of a character |
| constant in a numeric expression is the machine's byte-wide code for |
| that character. `as' assumes your character code is ASCII: `'A' means |
| 65, `'B' means 66, and so on. |
| |
| |
| File: as.info, Node: Numbers, Prev: Characters, Up: Constants |
| |
| Number Constants |
| ---------------- |
| |
| `as' distinguishes three kinds of numbers according to how they are |
| stored in the target machine. _Integers_ are numbers that would fit |
| into an `int' in the C language. _Bignums_ are integers, but they are |
| stored in more than 32 bits. _Flonums_ are floating point numbers, |
| described below. |
| |
| * Menu: |
| |
| * Integers:: Integers |
| * Bignums:: Bignums |
| * Flonums:: Flonums |
| |
| |
| File: as.info, Node: Integers, Next: Bignums, Up: Numbers |
| |
| Integers |
| ........ |
| |
| A binary integer is `0b' or `0B' followed by zero or more of the |
| binary digits `01'. |
| |
| An octal integer is `0' followed by zero or more of the octal digits |
| (`01234567'). |
| |
| A decimal integer starts with a non-zero digit followed by zero or |
| more digits (`0123456789'). |
| |
| A hexadecimal integer is `0x' or `0X' followed by one or more |
| hexadecimal digits chosen from `0123456789abcdefABCDEF'. |
| |
| Integers have the usual values. To denote a negative integer, use |
| the prefix operator `-' discussed under expressions (*note Prefix |
| Operators: Prefix Ops.). |
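
The four forms above can be told apart by their prefix. The following sketch follows the grammar in this node strictly, so it rejects `092' and does not model the leniency toward the digits 8 and 9 shown in the Constants example. The name `parse_integer` is hypothetical.

```python
def parse_integer(text: str) -> int:
    """Return the value of an unsigned integer literal, or raise ValueError."""
    prefix = text[:2].lower()
    if prefix == "0b":      # binary: zero or more binary digits
        digits, base, allowed, minimum = text[2:], 2, "01", 0
    elif prefix == "0x":    # hexadecimal: one or more hex digits
        digits, base, allowed, minimum = text[2:], 16, "0123456789abcdefABCDEF", 1
    elif text.startswith("0"):  # octal: zero or more octal digits
        digits, base, allowed, minimum = text[1:], 8, "01234567", 0
    elif text and text[0] in "123456789":  # decimal: starts with a non-zero digit
        digits, base, allowed, minimum = text, 10, "0123456789", 1
    else:
        raise ValueError(f"not an integer literal: {text!r}")
    if len(digits) < minimum or any(c not in allowed for c in digits):
        raise ValueError(f"bad digits in {text!r}")
    return int(digits, base) if digits else 0
```

Negative values are not handled here, since the sign is a prefix operator applied to the literal rather than part of it.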
| |
| |
| File: as.info, Node: Bignums, Next: Flonums, Prev: Integers, Up: Numbers |
| |
| Bignums |
| ....... |
| |
| A "bignum" has the same syntax and semantics as an integer except |
| that the number (or its negative) takes more than 32 bits to represent |
| in binary. The distinction is made because in some places integers are |
| permitted while bignums are not. |
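
One way to state the 32-bit test above in code is shown below. This is a sketch; the name `is_bignum` is hypothetical, and the check looks only at the magnitude of the value, matching the phrase "the number (or its negative)".

```python
def is_bignum(value: int) -> bool:
    """True if value (or its negative) needs more than 32 bits in binary."""
    return abs(value).bit_length() > 32
```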
| |
| |
| File: as.info, Node: Flonums, Prev: Bignums, Up: Numbers |
| |
| Flonums |
| ....... |
| |
| A "flonum" represents a floating point number. The translation is |
| indirect: a decimal floating point number from the text is converted by |
| `as' to a generic binary floating point number of more than sufficient |
| precision. This generic floating point number is converted to a |
| particular computer's floating point format (or formats) by a portion |
| of `as' specialized to that computer. |
| |
| A flonum is written by writing (in order) |
| * The digit `0'. (`0' is optional on the HPPA.) |
| |
| * A letter, to tell `as' the rest of the number is a flonum. `e' is |
| recommended. Case is not important. |
| |
| On the H8/300, H8/500, Renesas / SuperH SH, and AMD 29K |
| architectures, the letter must be one of the letters `DFPRSX' (in |
| upper or lower case). |
| |
| On the ARC, the letter must be one of the letters `DFRS' (in upper |
| or lower case). |
| |
| On the Intel 960 architecture, the letter must be one of the |
| letters `DFT' (in upper or lower case). |
| |
| On the HPPA architecture, the letter must be `E' (upper case only). |
| |
| * An optional sign: either `+' or `-'. |
| |
| * An optional "integer part": zero or more decimal digits. |
| |
| * An optional "fractional part": `.' followed by zero or more |
| decimal digits. |
| |
| * An optional exponent, consisting of: |
| |
| * An `E' or `e'. |
| |
| * Optional sign: either `+' or `-'. |
| |
| * One or more decimal digits. |
| |
| |
| At least one of the integer part or the fractional part must be |
| present. The floating point number has the usual base-10 value. |
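
The ordered parts listed above translate directly into a regular expression. This sketch checks syntax only and does not compute a value. It accepts any letter after the leading `0', so the per-target letter restrictions and the optional `0' on the HPPA are not modeled. The names `FLONUM_RE` and `is_flonum` are hypothetical.

```python
import re

FLONUM_RE = re.compile(
    r"0"                      # the digit 0
    r"[A-Za-z]"               # a letter marking the number as a flonum
    r"[+-]?"                  # optional sign
    r"(?P<int>[0-9]*)"        # optional integer part
    r"(?P<frac>\.[0-9]*)?"    # optional fractional part
    r"(?:[Ee][+-]?[0-9]+)?"   # optional exponent
)


def is_flonum(text: str) -> bool:
    """Return True if text has the flonum syntax described above."""
    m = FLONUM_RE.fullmatch(text)
    # At least one of the integer part or the fractional part must be present.
    return m is not None and bool(m.group("int") or m.group("frac"))
```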
| |
| `as' does all processing using integers. Flonums are computed |
| independently of any floating point hardware in the computer running |
| `as'. |
| |
| |
| File: as.info, Node: Sections, Next: Symbols, Prev: Syntax, Up: Top |
| |
| Sections and Relocation |
| *********************** |
| |
| * Menu: |
| |
| * Secs Background:: Background |
| * Ld Sections:: Linker Sections |
| * As Sections:: Assembler Internal Sections |
| * Sub-Sections:: Sub-Sections |
| * bss:: bss Section |
| |
| |
| File: as.info, Node: Secs Background, Next: Ld Sections, Up: Sections |
| |
| Background |
| ========== |
| |
| Roughly, a section is a range of addresses, with no gaps; all data |
| "in" those addresses is treated the same for some particular purpose. |
| For example there may be a "read only" section. |
| |
| The linker `ld' reads many object files (partial programs) and |
| combines their contents to form a runnable program. When `as' emits an |
| object file, the partial program is assumed to start at address 0. |
| `ld' assigns the final addresses for the partial program, so that |
| different partial programs do not overlap. This is actually an |
| oversimplification, but it suffices to explain how `as' uses sections. |
| |
| `ld' moves blocks of bytes of your program to their run-time |
| addresses. These blocks slide to their run-time addresses as rigid |
| units; their length does not change and neither does the order of bytes |
| within them. Such a rigid unit is called a _section_. Assigning |
| run-time addresses to sections is called "relocation". It includes the |
| task of adjusting mentions of object-file addresses so they refer to |
| the proper run-time addresses. For the H8/300 and H8/500, and for the |
| Renesas / SuperH SH, `as' pads sections if needed to ensure they end on |
| a word (sixteen bit) boundary. |
| |
| An object file written by `as' has at least three sections, any of |
| which may be empty. These are named "text", "data" and "bss" sections. |
| |
| When it generates COFF or ELF output, `as' can also generate |
| whatever other named sections you specify using the `.section' |
| directive (*note `.section': Section.). If you do not use any |
| directives that place output in the `.text' or `.data' sections, these |
| sections still exist, but are empty. |
| |
| When `as' generates SOM or ELF output for the HPPA, `as' can also |
| generate whatever other named sections you specify using the `.space' |
| and `.subspace' directives. See `HP9000 Series 800 Assembly Language |
| Reference Manual' (HP 92432-90001) for details on the `.space' and |
| `.subspace' assembler directives. |
| |
| Additionally, `as' uses different names for the standard text, data, |
| and bss sections when generating SOM output. Program text is placed |
| into the `$CODE$' section, data into `$DATA$', and BSS into `$BSS$'. |
| |
| Within the object file, the text section starts at address `0', the |
| data section follows, and the bss section follows the data section. |
| |
| When generating either SOM or ELF output files on the HPPA, the text |
| section starts at address `0', the data section at address `0x4000000', |
| and the bss section follows the data section. |
| |
| To let `ld' know which data changes when the sections are relocated, |
| and how to change that data, `as' also writes to the object file |
| details of the relocation needed. To perform relocation `ld' must |
| know, each time an address in the object file is mentioned: |
| * Where in the object file is the beginning of this reference to an |
| address? |
| |
| * How long (in bytes) is this reference? |
| |
| * Which section does the address refer to? What is the numeric |
| value of |
| (ADDRESS) - (START-ADDRESS OF SECTION)? |
| |
| * Is the reference to an address "Program-Counter relative"? |
| |
| In fact, every address `as' ever uses is expressed as |
| (SECTION) + (OFFSET INTO SECTION) |
| |
| Further, most expressions `as' computes have this section-relative |
| nature. (For some object formats, such as SOM for the HPPA, some |
| expressions are symbol-relative instead.) |
| |
| In this manual we use the notation {SECNAME N} to mean "offset N |
| into section SECNAME." |
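
The questions `ld' must answer for each reference can be pictured as a record with one field per question. The sketch below is a simplified model, not an actual object-file format. It assumes little-endian storage, overwrites the field rather than adding to a stored addend, and measures a PC-relative reference from the start of the field. All names (`Relocation`, `apply_relocation`, `section_bases`, `load_address`) are hypothetical.

```python
from dataclasses import dataclass


@dataclass
class Relocation:
    where: int          # offset in the object file where the reference begins
    length: int         # length of the reference in bytes
    section: str        # section the address refers to
    offset: int         # (ADDRESS) - (START-ADDRESS OF SECTION)
    pc_relative: bool   # is the reference program-counter relative?


def apply_relocation(image: bytearray, reloc: Relocation,
                     section_bases: dict, load_address: int) -> None:
    """Patch one address reference in image once section addresses are known."""
    # Every address is (SECTION) + (OFFSET INTO SECTION).
    value = section_bases[reloc.section] + reloc.offset
    if reloc.pc_relative:
        # Assumed convention: relative to the run-time address of the field.
        value -= load_address + reloc.where
    mask = (1 << (8 * reloc.length)) - 1
    patched = (value & mask).to_bytes(reloc.length, "little")
    image[reloc.where:reloc.where + reloc.length] = patched
```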
| |
| Apart from text, data and bss sections you need to know about the |
| "absolute" section. When `ld' mixes partial programs, addresses in the |
| absolute section remain unchanged. For example, address `{absolute 0}' |
| is "relocated" to run-time address 0 by `ld'. Although the linker |
| never arranges two partial programs' data sections with overlapping |
| addresses after linking, _by definition_ their absolute sections must |
| overlap. Address `{absolute 239}' in one part of a program is always |
| the same address when the program is running as address `{absolute |
| 239}' in any other part of the program. |
| |
| The idea of sections is extended to the "undefined" section. Any |
| address whose section is unknown at assembly time is by definition |
| rendered {undefined U}--where U is filled in later. Since numbers are |
| always defined, the only way to generate an undefined address is to |
| mention an undefined symbol. A reference to a named common block would |
| be such a symbol: its value is unknown at assembly time so it has |
| section _undefined_. |
| |
| By analogy the word _section_ is used to describe groups of sections |
| in the linked program. `ld' puts all partial programs' text sections |
| in contiguous addresses in the linked program. It is customary to |
| refer to the _text section_ of a program, meaning all the addresses of |
| all partial programs' text sections. Likewise for data and bss |
| sections. |
| |
| Some sections are manipulated by `ld'; others are invented for use |
| of `as' and have no meaning except during assembly. |
| |
| |
| File: as.info, Node: Ld Sections, Next: As Sections, Prev: Secs Background, Up: Sections |
| |
| Linker Sections |
| =============== |
| |
| `ld' deals with just four kinds of sections, summarized below. |
| |
| *named sections* |
| *text section* |
| *data section* |
| These sections hold your program. `as' and `ld' treat them as |
| separate but equal sections. Anything you can say of one section |
| is true of another. When the program is running, however, it is |
| customary for the text section to be unalterable. The text |
| section is often shared among processes: it contains instructions, |
| constants and the like. The data section of a running program is |
| usually alterable: for example, C variables would be stored in the |
| data section. |
| |
| *bss section* |
| This section contains zeroed bytes when your program begins |
| running. It is used to hold uninitialized variables or common |
| storage. The length of each partial program's bss section is |
| important, but because it starts out containing zeroed bytes there |
| is no need to store explicit zero bytes in the object file. The |
| bss section was invented to eliminate those explicit zeros from |
| object files. |
| |
| *absolute section* |
| Address 0 of this section is always "relocated" to runtime address |
| 0. This is useful if you want to refer to an address that `ld' |
| must not change when relocating. In this sense we speak of |
| absolute addresses being "unrelocatable": they do not change |
| during relocation. |
| |
| *undefined section* |
| This "section" is a catch-all for address references to objects |
| not in the preceding sections. |
| |
| An idealized example of three relocatable sections follows. The |
| example uses the traditional section names `.text' and `.data'. Memory |
| addresses are on the horizontal axis. |
| |
| +-----+----+--+ |
| partial program # 1: |ttttt|dddd|00| |
| +-----+----+--+ |
| |
| text data bss |
| seg. seg. seg. |
| |
| +---+---+---+ |
| partial program # 2: |TTT|DDD|000| |
| +---+---+---+ |
| |
| +--+---+-----+--+----+---+-----+~~ |
| linked program: | |TTT|ttttt| |dddd|DDD|00000| |
| +--+---+-----+--+----+---+-----+~~ |
| |
| addresses: 0 ... |
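
The grouping shown in the diagram can be sketched as follows. This is an idealized model only: it places partial programs in input order and leaves no gaps, whereas a real linker may reorder and pad. The function name `layout` and its data shapes are hypothetical.

```python
def layout(programs):
    """Assign start addresses so that like-named sections are contiguous.

    programs: list of dicts mapping "text"/"data"/"bss" to a size in bytes.
    Returns a dict mapping (program index, section name) to a start address.
    """
    addresses = {}
    cursor = 0
    for name in ("text", "data", "bss"):
        # All text sections first, then all data sections, then all bss sections.
        for index, sizes in enumerate(programs):
            addresses[(index, name)] = cursor
            cursor += sizes.get(name, 0)
    return addresses
```

With the section sizes from the diagram (5, 4 and 2 bytes for the first partial program; 3, 3 and 3 for the second), both text sections occupy addresses 0 through 7, and the first data section starts at address 8.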
| |
| |
| File: as.info, Node: As Sections, Next: Sub-Sections, Prev: Ld Sections, Up: Sections |
| |
| Assembler Internal Sections |
| =========================== |
| |
| These sections are meant only for the internal use of `as'. They |
| have no meaning at run-time. You do not really need to know about these |
| sections for most purposes; but they can be mentioned in `as' warning |
| messages, so it might be helpful to have an idea of their meanings to |
| `as'. These sections are used to permit the value of every expression |
| in your assembly language program to be a section-relative address. |
| |
| ASSEMBLER-INTERNAL-LOGIC-ERROR! |
| An internal assembler logic error has been found. This means |
| there is a bug in the assembler. |
| |
| expr section |
| The assembler stores complex expressions internally as combinations |
| of symbols. When it needs to represent an expression as a symbol, |
| it puts it in the expr section. |
| |
| |
| File: as.info, Node: Sub-Sections, Next: bss, Prev: As Sections, Up: Sections |
| |
| Sub-Sections |
| ============ |
| |
| Assembled bytes conventionally fall into two sections: text and data. |
| You may have separate groups of data in named sections that you want to |
| end up near to each other in the object file, even though they are not |
| contiguous in the assembler source. `as' allows you to use |
| "subsections" for this purpose. Within each section, there can be |
| numbered subsections with values from 0 to 8192. Objects assembled |
| into the same subsection go into the object file together with other |
| objects in the same subsection. For example, a compiler might want to |
| store constants in the text section, but might not want to have them |
| interspersed with the program being assembled. In this case, the |
| compiler could issue a `.text 0' before each section of code being |
| output, and a `.text 1' before each group of constants being output. |
| |
| Subsections are optional. If you do not use subsections, everything |
| goes in subsection number zero. |
| |
| Each subsection is zero-padded up to a multiple of four bytes. |
| (Subsections may be padded a different amount on different flavors of |
| `as'.) |
| |
| Subsections appear in your object file in numeric order, lowest |
| numbered to highest. (All this to be compatible with other people's |
| assemblers.) The object file contains no representation of |
| subsections; `ld' and other programs that manipulate object files see |
| no trace of them. They just see all your text subsections as a text |
| section, and all your data subsections as a data section. |
| |
| To specify which subsection you want subsequent statements assembled |
| into, use a numeric argument to specify it, in a `.text EXPRESSION' or |
| a `.data EXPRESSION' statement. When generating COFF or ELF output, you |
| can also use an extra subsection argument with arbitrary named |
| sections: `.section NAME, EXPRESSION'. EXPRESSION should be an |
| absolute expression. (*Note Expressions::.) If you just say `.text' |
| then `.text 0' is assumed. Likewise `.data' means `.data 0'. Assembly |
| begins in `text 0'. For instance: |
| .text 0 # The default subsection is text 0 anyway. |
| .ascii "This lives in the first text subsection. *" |
| .text 1 |
| .ascii "But this lives in the second text subsection." |
| .data 0 |
| .ascii "This lives in the data section," |
| .ascii "in the first data subsection." |
| .text 0 |
| .ascii "This lives in the first text subsection," |
| .ascii "immediately following the asterisk (*)." |
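
The effect of the example above can be modeled by collecting bytes per (section, subsection) pair and emitting them in numeric order. This is a sketch, not the assembler's implementation: the padding of each subsection is left out, and the function name `assemble` and its tuple-based input are hypothetical.

```python
def assemble(statements):
    """Model subsection ordering.

    statements: list of (op, arg) pairs, where op is "text" or "data"
    (arg is the subsection number) or "ascii" (arg is the string).
    Returns a dict mapping section name to its assembled bytes.
    """
    chunks = {}               # (section, subsection) -> bytearray
    current = ("text", 0)     # assembly begins in text 0
    for op, arg in statements:
        if op in ("text", "data"):
            current = (op, arg)
        elif op == "ascii":
            chunks.setdefault(current, bytearray()).extend(arg.encode("ascii"))
        else:
            raise ValueError(f"unsupported statement: {op}")
    output = {}
    # Within each section, subsections appear lowest-numbered first.
    for section, number in sorted(chunks):
        output.setdefault(section, bytearray()).extend(chunks[(section, number)])
    return {name: bytes(data) for name, data in output.items()}
```

In the returned dictionary no trace of the subsection numbers remains, which mirrors what `ld' sees in the object file.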
| |
| Each section has a "location counter" incremented by one for every |
| byte assembled into that section. Because subsections are merely a |
| convenience restricted to `as' there is no concept of a subsection |
| location counter. There is no way to directly manipulate a location |
| counter--but the `.align' directive changes it, and any label |
| definition captures its current value. The location counter of the |
| section where statements are being assembled is said to be the "active" |
| location counter. |
| |
| |
| File: as.info, Node: bss, Prev: Sub-Sections, Up: Sections |
| |
| bss Section |
| =========== |
| |
| The bss section is used for local common variable storage. You may |
| allocate address space in the bss section, but you may not dictate data |
| to load into it before your program executes. When your program starts |
| running, all the contents of the bss section are zeroed bytes. |
| |
| The `.lcomm' pseudo-op defines a symbol in the bss section; see |
| *Note `.lcomm': Lcomm. |
| |
| The `.comm' pseudo-op may be used to declare a common symbol, which |
| is another form of uninitialized symbol; see *Note `.comm': Comm. |
| |
| When assembling for a target which supports multiple sections, such |
| as ELF or COFF, you may switch into the `.bss' section and define |
| symbols as usual; see *Note `.section': Section. You may only assemble |
| zero values into the section. Typically the section will only contain |
| symbol definitions and `.skip' directives (*note `.skip': Skip.). |
| |
| |
| File: as.info, Node: Symbols, Next: Expressions, Prev: Sections, Up: Top |
| |
| Symbols |
| ******* |
| |
| Symbols are a central concept: the programmer uses symbols to name |
| things, the linker uses symbols to link, and the debugger uses symbols |
| to debug. |
| |
| _Warning:_ `as' does not place symbols in the object file in the |
| same order they were declared. This may break some debuggers. |
| |
| * Menu: |
| |
| * Labels:: Labels |
| * Setting Symbols:: Giving Symbols Other Values |
| * Symbol Names:: Symbol Names |
| * Dot:: The Special Dot Symbol |
| * Symbol Attributes:: Symbol Attributes |
| |
| |
| File: as.info, Node: Labels, Next: Setting Symbols, Up: Symbols |
| |
| Labels |
| ====== |
| |
| A "label" is written as a symbol immediately followed by a colon |
| `:'. The symbol then represents the current value of the active |
| location counter, and is, for example, a suitable instruction operand. |
| You are warned if you use the same symbol to represent two different |
| locations: the first definition overrides any other definitions. |
| |
| On the HPPA, the usual form for a label need not be immediately |
| followed by a colon, but instead must start in column zero. Only one |
| label may be defined on a single line. To work around this, the HPPA |
| version of `as' also provides a special directive `.label' for defining |
| labels more flexibly. |
| |
| |
| File: as.info, Node: Setting Symbols, Next: Symbol Names, Prev: Labels, Up: Symbols |
| |
| Giving Symbols Other Values |
| =========================== |
| |
| A symbol can be given an arbitrary value by writing a symbol, |
| followed by an equals sign `=', followed by an expression (*note |
| Expressions::). This is equivalent to using the `.set' directive. |
| *Note `.set': Set. |
| |
| |
| File: as.info, Node: Symbol Names, Next: Dot, Prev: Setting Symbols, Up: Symbols |
| |
| Symbol Names |
| ============ |
| |
| Symbol names begin with a letter or with one of `._'. On most |
| machines, you can also use `$' in symbol names; exceptions are noted in |
| *Note Machine Dependencies::. That character may be followed by any |
| string of digits, letters, dollar signs (unless otherwise noted in |
| *Note Machine Dependencies::), and underscores. For the AMD 29K |
| family, `?' is also allowed in the body of a symbol name, though not at |
| its beginning. |
| |
| Case of letters is significant: `foo' is a different symbol name |
| than `Foo'. |
| |
| Each symbol has exactly one name. Each name in an assembly language |
| program refers to exactly one symbol. You may use that symbol name any |
| number of times in a program. |
| |
| Local Symbol Names |
| ------------------ |
| |
| Local symbols help compilers and programmers use names temporarily. |
| They create symbols which are guaranteed to be unique over the entire |
| scope of the input source code and which can be referred to by a simple |
| notation. To define a local symbol, write a label of the form `N:' |
| (where N represents any non-negative integer). To refer to the most recent |
| previous definition of that symbol write `Nb', using the same number as |
| when you defined the label. To refer to the next definition of a local |
| label, write `Nf'. The `b' stands for "backwards" and the `f' stands |
| for "forwards". |
| |
| There is no restriction on how you can use these labels, and you can |
| reuse them too. It is therefore possible to define the same local |
| label (using the same number `N') repeatedly, although you can only |
| refer to the most recently defined local label of that number (for a |
| backwards reference) or the next definition of that local label (for |
| a forward reference). It is also worth noting that the first 10 local |
| labels (`0:'...`9:') are implemented in a slightly more efficient |
| manner than the others. |
| |
| Here is an example: |
| |
| 1: branch 1f |
| 2: branch 1b |
| 1: branch 2f |
| 2: branch 1b |
| |
| Which is the equivalent of: |
| |
| label_1: branch label_3 |
| label_2: branch label_1 |
| label_3: branch label_4 |
| label_4: branch label_3 |
| |
| Local symbol names are only a notational device. They are |
| immediately transformed into more conventional symbol names before the |
| assembler uses them. The transformed names are stored in the symbol |
| table, appear in error messages, and are optionally emitted to the |
| object file. |
| The names are constructed using these parts: |
| |
| `L' |
| All local labels begin with `L'. Normally both `as' and `ld' |
| forget symbols that start with `L'. These labels are used for |
| symbols you are never intended to see. If you use the `-L' option |
| then `as' retains these symbols in the object file. If you also |
| instruct `ld' to retain these symbols, you may use them in |
| debugging. |
| |
| `NUMBER' |
| This is the number that was used in the local label definition. |
| So if the label is written `55:' then the number is `55'. |
| |
| `C-B' |
| This unusual character is included so you do not accidentally |
| invent a symbol of the same name. The character has ASCII value |
| of `\002' (control-B). |
| |
| `_ordinal number_' |
| This is a serial number to keep the labels distinct. The first |
| definition of `0:' gets the number `1'. The 15th definition of |
| `0:' gets the number `15', and so on. Likewise the first |
| definition of `1:' gets the number `1' and its 15th definition gets |
| `15' as well. |
| |
| So for example, the first `1:' is named `L1C-B1', the 44th `3:' is |
| named `L3C-B44'. |
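
The naming scheme and the `Nb'/`Nf' lookup can be sketched together. The code below is an illustration, not the assembler's implementation. It handles one local label definition per line and recognizes references with a simple pattern, so a bare `0b' would be taken for a reference. The names `local_label_name` and `expand_local_labels` are hypothetical.

```python
import re


def local_label_name(number: int, ordinal: int) -> str:
    """Build the transformed name: L, the number, control-B, the ordinal."""
    return f"L{number}\x02{ordinal}"


def expand_local_labels(lines):
    """Rewrite `N:' definitions and `Nb'/`Nf' references in each line."""
    counts = {}   # label number -> definitions seen so far

    def reference(match):
        number, direction = int(match.group(1)), match.group(2)
        seen = counts.get(number, 0)
        if direction == "b":
            if seen == 0:
                raise ValueError(f"no previous definition of {number}:")
            return local_label_name(number, seen)     # most recent definition
        return local_label_name(number, seen + 1)     # next definition

    result = []
    for line in lines:
        label, rest = "", line
        m = re.match(r"\s*(\d+):", line)
        if m:
            number = int(m.group(1))
            counts[number] = counts.get(number, 0) + 1
            label = local_label_name(number, counts[number]) + ":"
            rest = line[m.end():]
        result.append(label + re.sub(r"\b(\d+)([bf])\b", reference, rest))
    return result
```

Applied to the four-line example earlier in this node, the first line becomes a definition of `L1C-B1' that branches to `L1C-B2'.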
| |
| Dollar Local Labels |
| ------------------- |
| |
| `as' also supports an even more local form of local labels called |
| dollar labels. These labels go out of scope (i.e., they become undefined) |
| as soon as a non-local label is defined. Thus they remain valid for |
| only a small region of the input source code. Normal local labels, by |
| contrast, remain in scope for the entire file, or until they are |
| redefined by another occurrence of the same local label. |
| |
| Dollar labels are defined in exactly the same way as ordinary local |
| labels, except that instead of being terminated by a colon, they are |
| terminated by a dollar sign, e.g. `55$'. |
| |
| They can also be distinguished from ordinary local labels by their |
| transformed name which uses ASCII character `\001' (control-A) as the |
| magic character to distinguish them from ordinary labels. Thus the 5th |
| definition of `6$' is named `L6C-A5'. |
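| |
| The naming scheme for both kinds of local label can be summarized in |
| a few lines of Python. This is a sketch of the rules stated above, not |
| code taken from `as'; the names `local_label_name' and `printable' are |
| hypothetical. |

```python
# Sketch of the transformed-name scheme: `L', the label number, a magic
# control character, then the ordinal of the definition.

CTRL_A = "\x01"  # magic character for dollar labels such as `6$'
CTRL_B = "\x02"  # magic character for ordinary local labels such as `1:'


def local_label_name(number, ordinal, dollar=False):
    """Build the internal symbol name for one local label definition."""
    magic = CTRL_A if dollar else CTRL_B
    return "L%d%s%d" % (number, magic, ordinal)


def printable(name):
    """Render the control characters the way this manual writes them."""
    return name.replace(CTRL_A, "C-A").replace(CTRL_B, "C-B")
```

| |
| With these definitions, `printable(local_label_name(3, 44))' yields |
| `L3C-B44', the name given above for the 44th `3:'. |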
| |
| |
| File: as.info, Node: Dot, Next: Symbol Attributes, Prev: Symbol Names, Up: Symbols |
| |
| The Special Dot Symbol |
| ====================== |
| |
| The special symbol `.' refers to the current address that `as' is |
| assembling into. Thus, the expression `melvin: .long .' defines |
| `melvin' to contain its own address. Assigning a value to `.' is |
| treated the same as a `.org' directive. Thus, the expression `.=.+4' |
| is the same as saying `.space 4'. |
| |
| |
| File: as.info, Node: Symbol Attributes, Prev: Dot, Up: Symbols |
| |
| Symbol Attributes |
| ================= |
| |
| Every symbol has, as well as its name, the attributes "Value" and |
| "Type". Depending on output format, symbols can also have auxiliary |
| attributes. |
| |
| If you use a symbol without defining it, `as' assumes zero for all |
| these attributes, and probably won't warn you. This makes the symbol |
| an externally defined symbol, which is generally what you would want. |
| |
| * Menu: |
| |
| * Symbol Value:: Value |
| * Symbol Type:: Type |
| |
| |
| * a.out Symbols:: Symbol Attributes: `a.out' |
| |
| * COFF Symbols:: Symbol Attributes for COFF |
| |
| * SOM Symbols:: Symbol Attributes for SOM |
| |
| |
| File: as.info, Node: Symbol Value, Next: Symbol Type, Up: Symbol Attributes |
| |
| Value |
| ----- |
| |
| The value of a symbol is (usually) 32 bits. For a symbol which |
| labels a location in the text, data, bss or absolute sections the value |
| is the number of addresses from the start of that section to the label. |
| Naturally for text, data and bss sections the value of a symbol changes |
| as `ld' changes section base addresses during linking. Absolute |
| symbols' values do not change during linking: that is why they are |
| called absolute. |
| |
| The value of an undefined symbol is treated in a special way. If it |
| is 0 then the symbol is not defined in this assembler source file, and |
| `ld' tries to determine its value from other files linked into the same |
| program. You make this kind of symbol simply by mentioning a symbol |
| name without defining it. A non-zero value represents a `.comm' common |
| declaration. The value is how much common storage to reserve, in bytes |
| (addresses). The symbol refers to the first address of the allocated |
| storage. |
| |
| |
| File: as.info, Node: Symbol Type, Next: a.out Symbols, Prev: Symbol Value, Up: Symbol Attributes |
| |
| Type |
| ---- |
| |
| The type attribute of a symbol contains relocation (section) |
| information, any flag settings indicating that a symbol is external, and |
| (optionally), other information for linkers and debuggers. The exact |
| format depends on the object-code output format in use. |
| |
| |
| File: as.info, Node: a.out Symbols, Next: COFF Symbols, Prev: Symbol Type, Up: Symbol Attributes |
| |
| Symbol Attributes: `a.out' |
| -------------------------- |
| |
| * Menu: |
| |
| * Symbol Desc:: Descriptor |
| * Symbol Other:: Other |
| |
| |
| File: as.info, Node: Symbol Desc, Next: Symbol Other, Up: a.out Symbols |
| |
| Descriptor |
| .......... |
| |
| This is an arbitrary 16-bit value. You may establish a symbol's |
| descriptor value by using a `.desc' statement (*note `.desc': Desc.). |
| A descriptor value means nothing to `as'. |
| |
| |
| File: as.info, Node: Symbol Other, Prev: Symbol Desc, Up: a.out Symbols |
| |
| Other |
| ..... |
| |
| This is an arbitrary 8-bit value. It means nothing to `as'. |
| |
| |
| File: as.info, Node: COFF Symbols, Next: SOM Symbols, Prev: a.out Symbols, Up: Symbol Attributes |
| |
| Symbol Attributes for COFF |
| -------------------------- |
| |
| The COFF format supports a multitude of auxiliary symbol attributes; |
| like the primary symbol attributes, they are set between `.def' and |
| `.endef' directives. |
| |
| Primary Attributes |
| .................. |
| |
| The symbol name is set with `.def'; the value and type, |
| respectively, with `.val' and `.type'. |
| |
| Auxiliary Attributes |
| .................... |
| |
| The `as' directives `.dim', `.line', `.scl', `.size', and `.tag' can |
| generate auxiliary symbol table information for COFF. |
| |
| |
| File: as.info, Node: SOM Symbols, Prev: COFF Symbols, Up: Symbol Attributes |
| |
| Symbol Attributes for SOM |
| ------------------------- |
| |
| The SOM format for the HPPA supports a multitude of symbol |
| attributes set with the `.EXPORT' and `.IMPORT' directives. |
| |
| The attributes are described in `HP9000 Series 800 Assembly Language |
| Reference Manual' (HP 92432-90001) under the `IMPORT' and `EXPORT' |
| assembler directive documentation. |
| |
| |
| File: as.info, Node: Expressions, Next: Pseudo Ops, Prev: Symbols, Up: Top |
| |
| Expressions |
| *********** |
| |
| An "expression" specifies an address or numeric value. Whitespace |
| may precede and/or follow an expression. |
| |
| The result of an expression must be an absolute number, or else an |
| offset into a particular section. If an expression is not absolute, |
| and there is not enough information when `as' sees the expression to |
| know its section, a second pass over the source program might be |
| necessary to interpret the expression--but the second pass is currently |
| not implemented. `as' aborts with an error message in this situation. |
| |
| * Menu: |
| |
| * Empty Exprs:: Empty Expressions |
| * Integer Exprs:: Integer Expressions |
| |
| |
| File: as.info, Node: Empty Exprs, Next: Integer Exprs, Up: Expressions |
| |
| Empty Expressions |
| ================= |
| |
| An empty expression has no value: it is just whitespace or null. |
| Wherever an absolute expression is required, you may omit the |
| expression, and `as' assumes a value of (absolute) 0. This is |
| compatible with other assemblers. |
| |
| |
| File: as.info, Node: Integer Exprs, Prev: Empty Exprs, Up: Expressions |
| |
| Integer Expressions |
| =================== |
| |
| An "integer expression" is one or more _arguments_ delimited by |
| _operators_. |
| |
| * Menu: |
| |
| * Arguments:: Arguments |
| * Operators:: Operators |
| * Prefix Ops:: Prefix Operators |
| * Infix Ops:: Infix Operators |
| |
| |
| File: as.info, Node: Arguments, Next: Operators, Up: Integer Exprs |
| |
| Arguments |
| --------- |
| |
| "Arguments" are symbols, numbers or subexpressions. In other |
| contexts arguments are sometimes called "arithmetic operands". In this |
| manual, to avoid confusing them with the "instruction operands" of the |
| machine language, we use the term "argument" to refer to parts of |
| expressions only, reserving the word "operand" to refer only to machine |
| instruction operands. |
| |
| Symbols are evaluated to yield {SECTION NNN} where SECTION is one of |
| text, data, bss, absolute, or undefined. NNN is a signed, 2's |
| complement 32 bit integer. |
| |
| Numbers are usually integers. |
| |
| A number can be a flonum or bignum. In this case, you are warned |
| that only the low order 32 bits are used, and `as' pretends these 32 |
| bits are an integer. You may write integer-manipulating instructions |
| that act on exotic constants, compatible with other assemblers. |
| |
| Subexpressions are a left parenthesis `(' followed by an integer |
| expression, followed by a right parenthesis `)'; or a prefix operator |
| followed by an argument. |
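| |
| The truncation of a bignum to its low order 32 bits, described |
| above, amounts to a single mask operation. The Python sketch below |
| only illustrates the arithmetic; the name `low_32_bits' is |
| hypothetical. |

```python
# Sketch: keep only the low order 32 bits of a large constant.

def low_32_bits(value):
    """Return the low order 32 bits of value as an unsigned integer."""
    return value & 0xFFFFFFFF


truncated = low_32_bits(0x123456789ABCDEF0)
```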
| |
| |
| File: as.info, Node: Operators, Next: Prefix Ops, Prev: Arguments, Up: Integer Exprs |
| |
| Operators |
| --------- |
| |
| "Operators" are arithmetic functions, like `+' or `%'. Prefix |
| operators are followed by an argument. Infix operators appear between |
| their arguments. Operators may be preceded and/or followed by |
| whitespace. |
| |
| |
| File: as.info, Node: Prefix Ops, Next: Infix Ops, Prev: Operators, Up: Integer Exprs |
| |
| Prefix Operators |
| ---------------- |
| |
| `as' has the following "prefix operators". They each take one |
| argument, which must be absolute. |
| |
| `-' |
| "Negation". Two's complement negation. |
| |
| `~' |
| "Complementation". Bitwise not. |
| |
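| The two prefix operators are related by the usual two's complement |
| identity, `~x' equals `-x - 1'. The Python sketch below models both on |
| 32 bit values; the 32 bit width is an assumption taken from the |
| "(usually) 32 bits" stated for symbol values, and the function names |
| are hypothetical. |

```python
# Sketch: negation and complementation on 32 bit two's complement values.

MASK = 0xFFFFFFFF


def to_signed(x):
    """Interpret the low 32 bits of x as a signed integer."""
    x &= MASK
    return x - (1 << 32) if x & 0x80000000 else x


def negate(x):
    """Prefix `-': two's complement negation."""
    return to_signed(-x)


def complement(x):
    """Prefix `~': bitwise not."""
    return to_signed(~x)
```
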
| |
| File: as.info, Node: Infix Ops, Prev: Prefix Ops, Up: Integer Exprs |
| |
| Infix Operators |
| --------------- |
| |
| "Infix operators" take two arguments, one on either side. Operators |
| have precedence, but operations with equal precedence are performed left |
| to right. Apart from `+' or `-', both arguments must be absolute, and |
| the result is absolute. |
| |
| 1. Highest Precedence |
| |
| `*' |
| "Multiplication". |
| |
| `/' |
| "Division". Truncation is the same as the C operator `/'. |
| |
| `%' |
| "Remainder". |
| |
| `<' |
| `<<' |
| "Shift Left". Same as the C operator `<<'. |
| |
| `>' |
| `>>' |
| "Shift Right". Same as the C operator `>>'. |
| |
| 2. Intermediate precedence |
| |
| `|' |
| "Bitwise Inclusive Or". |
| |
| `&' |
| "Bitwise And". |
| |
| `^' |
| "Bitwise Exclusive Or". |
| |
| `!' |
| "Bitwise Or Not". |
| |
| 3. Low Precedence |
| |
| `+' |
| "Addition". If either argument is absolute, the result has |
| the section of the other argument. You may not add together |
| arguments from different sections. |
| |
| `-' |
| "Subtraction". If the right argument is absolute, the result |
| has the section of the left argument. If both arguments are |
| in the same section, the result is absolute. You may not |
| subtract arguments from different sections. |
| |
| `==' |
| "Is Equal To" |
| |
| `<>' |
| "Is Not Equal To" |
| |
| `<' |
| "Is Less Than" |
| |
| `>' |
| "Is Greater Than" |
| |
| `>=' |
| "Is Greater Than Or Equal To" |
| |
| `<=' |
| "Is Less Than Or Equal To" |
| |
| The comparison operators can be used as infix operators. A |
| true result has a value of -1 whereas a false result has a |
| value of 0. Note that these operators perform signed |
| comparisons. |
| |
| 4. Lowest Precedence |
| |
| `&&' |
| "Logical And". |
| |
| `||' |
| "Logical Or". |
| |
| These two logical operations can be used to combine the |
| results of subexpressions. Note that, unlike the comparison |
| operators, a true result returns a value of 1, but a false |
| result still returns 0. Also note that the logical or |
| operator has a slightly lower precedence than logical and. |
| |
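| The different truth values of the comparison and logical operators |
| are easy to confuse. The Python sketch below models them; it is an |
| illustration only, the function names are hypothetical, and treating |
| any non-zero argument of `&&' and `||' as true is an assumption. |

```python
# Sketch: truth values produced by comparison and logical operators.
import operator

COMPARISONS = {
    "==": operator.eq, "<>": operator.ne,
    "<": operator.lt, ">": operator.gt,
    ">=": operator.ge, "<=": operator.le,
}


def compare(op, left, right):
    """Signed comparison: -1 when true, 0 when false."""
    return -1 if COMPARISONS[op](left, right) else 0


def logical_and(left, right):
    """`&&': 1 when both arguments are non-zero (assumed), else 0."""
    return 1 if (left != 0 and right != 0) else 0


def logical_or(left, right):
    """`||': 1 when either argument is non-zero (assumed), else 0."""
    return 1 if (left != 0 or right != 0) else 0
```
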
| |
| In short, it's only meaningful to add or subtract the _offsets_ in an |
| address; you can only have a defined section in one of the two |
| arguments. |
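| |
| The section rules for `+' and `-' can be modelled on {SECTION NNN} |
| pairs. The Python sketch below is an illustration of the rules in this |
| node, not code from `as'; the tuple representation and the function |
| names are assumptions. Following the summary above, adding two |
| non-absolute arguments is rejected even when their sections match. |

```python
# Sketch: section bookkeeping for `+' and `-' on (section, value) pairs.

ABS = "absolute"


def add(left, right):
    """`+': if either argument is absolute, keep the other's section."""
    (lsec, lval), (rsec, rval) = left, right
    if lsec == ABS:
        return (rsec, lval + rval)
    if rsec == ABS:
        return (lsec, lval + rval)
    raise ValueError("cannot add two non-absolute arguments")


def subtract(left, right):
    """`-': absolute right keeps the left section; same section gives absolute."""
    (lsec, lval), (rsec, rval) = left, right
    if rsec == ABS:
        return (lsec, lval - rval)
    if lsec == rsec:
        return (ABS, lval - rval)
    raise ValueError("cannot subtract arguments from different sections")
```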
| |
| |
| File: as.info, Node: Pseudo Ops, Next: Machine Dependencies, Prev: Expressions, Up: Top |
| |
| Assembler Directives |
| ******************** |
| |
| All assembler directives have names that begin with a period (`.'). |
| The rest of the name is letters, usually in lower case. |
| |
| This chapter discusses directives that are available regardless of |
| the target machine configuration for the GNU assembler. Some machine |
| configurations provide additional directives. *Note Machine |
| Dependencies::. |
| |
| * Menu: |
| |
| * Abort:: `.abort' |
| |
| * ABORT:: `.ABORT' |
| |
| * Align:: `.align ABS-EXPR , ABS-EXPR' |
| * Ascii:: `.ascii "STRING"'... |
| * Asciz:: `.asciz "STRING"'... |
| * Balign:: `.balign ABS-EXPR , ABS-EXPR' |
| * Byte:: `.byte EXPRESSIONS' |
| * Comm:: `.comm SYMBOL , LENGTH ' |
| * Data:: `.data SUBSECTION' |
| |
| * Def:: `.def NAME' |
| |
| * Desc:: `.desc SYMBOL, ABS-EXPRESSION' |
| |
| * Dim:: `.dim' |
| |
| * Double:: `.double FLONUMS' |
| * Eject:: `.eject' |
| * Else:: `.else' |
| * Elseif:: `.elseif' |
| * End:: `.end' |
| |
| * Endef:: `.endef' |
| |
| * Endfunc:: `.endfunc' |
| * Endif:: `.endif' |
| * Equ:: `.equ SYMBOL, EXPRESSION' |
| * Equiv:: `.equiv SYMBOL, EXPRESSION' |
| * Err:: `.err' |
| * Exitm:: `.exitm' |
| * Extern:: `.extern' |
| * Fail:: `.fail' |
| |
| * File:: `.file STRING' |
| |
| * Fill:: `.fill REPEAT , SIZE , VALUE' |
| * Float:: `.float FLONUMS' |
| * Func:: `.func' |
| * Global:: `.global SYMBOL', `.globl SYMBOL' |
| |
| * Hidden:: `.hidden NAMES' |
| |
| * hword:: `.hword EXPRESSIONS' |
| * Ident:: `.ident' |
| * If:: `.if ABSOLUTE EXPRESSION' |
| * Incbin:: `.incbin "FILE"[,SKIP[,COUNT]]' |
| * Include:: `.include "FILE"' |
| * Int:: `.int EXPRESSIONS' |
| |
| * Internal:: `.internal NAMES' |
| |
| * Irp:: `.irp SYMBOL,VALUES'... |
| * Irpc:: `.irpc SYMBOL,VALUES'... |
| * Lcomm:: `.lcomm SYMBOL , LENGTH' |
| * Lflags:: `.lflags' |
| |
| * Line:: `.line LINE-NUMBER' |
| |
| * Ln:: `.ln LINE-NUMBER' |
| * Linkonce:: `.linkonce [TYPE]' |
| * List:: `.list' |
| * Long:: `.long EXPRESSIONS' |
| |
| * Macro:: `.macro NAME ARGS'... |
| * MRI:: `.mri VAL' |
| * Nolist:: `.nolist' |
| * Octa:: `.octa BIGNUMS' |
| * Org:: `.org NEW-LC , FILL' |
| * P2align:: `.p2align ABS-EXPR , ABS-EXPR' |
| |
| * PopSection:: `.popsection' |
| * Previous:: `.previous' |
| |
| * Print:: `.print STRING' |
| |
| * Protected:: `.protected NAMES' |
| |
| * Psize:: `.psize LINES, COLUMNS' |
| * Purgem:: `.purgem NAME' |
| |
| * PushSection:: `.pushsection NAME' |
| |
| * Quad:: `.quad BIGNUMS' |
| * Rept:: `.rept COUNT' |
| * Sbttl:: `.sbttl "SUBHEADING"' |
| |
| * Scl:: `.scl CLASS' |
| |
| * Section:: `.section NAME' |
| |
| * Set:: `.set SYMBOL, EXPRESSION' |
| * Short:: `.short EXPRESSIONS' |
| * Single:: `.single FLONUMS' |
| |
| * Size:: `.size [NAME , EXPRESSION]' |
| |
| * Skip:: `.skip SIZE , FILL' |
| * Sleb128:: `.sleb128 EXPRESSIONS' |
| * Space:: `.space SIZE , FILL' |
| |
| * Stab:: `.stabd, .stabn, .stabs' |
| |
| * String:: `.string "STR"' |
| * Struct:: `.struct EXPRESSION' |
| |
| * SubSection:: `.subsection' |
| * Symver:: `.symver NAME,NAME2@NODENAME' |
| |
| |
| * Tag:: `.tag STRUCTNAME' |
| |
| * Text:: `.text SUBSECTION' |
| * Title:: `.title "HEADING"' |
| |
| * Type:: `.type <INT | NAME , TYPE DESCRIPTION>' |
| |
| * Uleb128:: `.uleb128 EXPRESSIONS' |
| |
| * Val:: `.val ADDR' |
| |
| |
| * Version:: `.version "STRING"' |
| * VTableEntry:: `.vtable_entry TABLE, OFFSET' |
| * VTableInherit:: `.vtable_inherit CHILD, PARENT' |
| * Weak:: `.weak NAMES' |
| |
| * Word:: `.word EXPRESSIONS' |
| * Deprecated:: Deprecated Directives |
| |
| |
| File: as.info, Node: Abort, Next: ABORT, Up: Pseudo Ops |
| |
| `.abort' |
| ======== |
| |
| This directive stops the assembly immediately. It is for |
| compatibility with other assemblers. The original idea was that the |
| assembly language source would be piped into the assembler. If the |
| sender of the source quit, it could use this directive to tell `as' to |
| quit also. One day `.abort' will not be supported. |
| |
| |
| File: as.info, Node: ABORT, Next: Align, Prev: Abort, Up: Pseudo Ops |
| |
| `.ABORT' |
| ======== |
| |
| When producing COFF output, `as' accepts this directive as a synonym |
| for `.abort'. |
| |
| When producing `b.out' output, `as' accepts this directive, but |
| ignores it. |
| |
| |
| File: as.info, Node: Align, Next: Ascii, Prev: ABORT, Up: Pseudo Ops |
| |
| `.align ABS-EXPR, ABS-EXPR, ABS-EXPR' |
| ===================================== |
| |
| Pad the location counter (in the current subsection) to a particular |
| storage boundary. The first expression (which must be absolute) is the |
| alignment required, as described below. |
| |
| The second expression (also absolute) gives the fill value to be |
| stored in the padding bytes. It (and the comma) may be omitted. If it |
| is omitted, the padding bytes are normally zero. However, on some |
| systems, if the section is marked as containing code and the fill value |
| is omitted, the space is filled with no-op instructions. |
| |
| The third expression is also absolute, and is also optional. If it |
| is present, it is the maximum number of bytes that should be skipped by |
| this alignment directive. If doing the alignment would require |
| skipping more bytes than the specified maximum, then the alignment is |
| not done at all. You can omit the fill value (the second argument) |
| entirely by simply using two commas after the required alignment; this |
| can be useful if you want the alignment to be filled with no-op |
| instructions when appropriate. |
| |
| The way the required alignment is specified varies from system to |
| system. For the a29k, hppa, m68k, m88k, w65, sparc, Xtensa, |
| Renesas / SuperH SH, and i386 using ELF format, the first expression is |
| the alignment request in bytes. For example `.align 8' advances the |
| location counter until it is a multiple of 8. If the location counter |
| is already a multiple of 8, no change is needed. |
| |
| For other systems, including the i386 using a.out format, and the |
| arm and strongarm, it is the number of low-order zero bits the location |
| counter must have after advancement. For example `.align 3' advances |
| the location counter until it is a multiple of 8. If the location counter |
| is already a multiple of 8, no change is needed. |
| |
| This inconsistency is due to the different behaviors of the various |
| native assemblers for these systems which GAS must emulate. GAS also |
| provides `.balign' and `.p2align' directives, described later, which |
| have a consistent behavior across all architectures (but are specific |
| to GAS). |
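| |
| The two interpretations of the first expression, and the effect of |
| the optional maximum, can be compared with a little arithmetic. The |
| Python sketch below is an illustration only; the function names are |
| hypothetical and the fill value is ignored. |

```python
# Sketch: how far `.align' advances the location counter.

def padding_needed(location, boundary, max_skip=None):
    """Bytes to skip to reach a multiple of boundary.

    Returns 0 when more than max_skip bytes would have to be skipped,
    because the alignment is then not done at all.
    """
    padding = (-location) % boundary
    if max_skip is not None and padding > max_skip:
        return 0
    return padding


def align_bytes(location, n, max_skip=None):
    """First expression is the alignment in bytes (e.g. i386 ELF)."""
    return location + padding_needed(location, n, max_skip)


def align_power_of_two(location, n, max_skip=None):
    """First expression is the number of low-order zero bits (e.g. arm)."""
    return location + padding_needed(location, 1 << n, max_skip)
```

| |
| Under this model `align_bytes(13, 8)' and `align_power_of_two(13, 3)' |
| both advance the location counter from 13 to 16. |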
| |
| |
| File: as.info, Node: Ascii, Next: Asciz, Prev: Align, Up: Pseudo Ops |
| |
| `.ascii "STRING"'... |
| ==================== |
| |
| `.ascii' expects zero or more string literals (*note Strings::) |
| separated by commas. It assembles each string (with no automatic |
| trailing zero byte) into consecutive addresses. |
| |
| |
| File: as.info, Node: Asciz, Next: Balign, Prev: Ascii, Up: Pseudo Ops |
| |
| `.asciz "STRING"'... |
| ==================== |
| |
| `.asciz' is just like `.ascii', but each string is followed by a |
| zero byte. The "z" in `.asciz' stands for "zero". |
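| |
| The difference between the two directives is easiest to see in the |
| bytes they produce. The Python sketch below models that output for |
| plain ASCII strings without escape sequences; it is an illustration |
| only and the function names are hypothetical. |

```python
# Sketch: bytes assembled by `.ascii' and `.asciz' for simple strings.

def ascii_bytes(*strings):
    """`.ascii': the strings back to back, no trailing zero byte."""
    return b"".join(s.encode("ascii") for s in strings)


def asciz_bytes(*strings):
    """`.asciz': every string followed by one zero byte."""
    return b"".join(s.encode("ascii") + b"\0" for s in strings)
```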
| |