| This is ld.info, produced by makeinfo version 4.3 from ./ld.texinfo. |
| |
| START-INFO-DIR-ENTRY |
| * Ld: (ld). The GNU linker. |
| END-INFO-DIR-ENTRY |
| |
| This file documents the GNU linker LD version 2.13.2.1. |
| |
| Copyright (C) 1991, 92, 93, 94, 95, 96, 97, 98, 99, 2000, 2001, 2002 |
| Free Software Foundation, Inc. |
| |
| |
| File: ld.info, Node: Overlay Description, Prev: Output Section Attributes, Up: SECTIONS |
| |
| Overlay description |
| ------------------- |
| |
| An overlay description provides an easy way to describe sections |
| which are to be loaded as part of a single memory image but are to be |
| run at the same memory address. At run time, some sort of overlay |
| manager will copy the overlaid sections in and out of the runtime |
| memory address as required, perhaps by simply manipulating addressing |
| bits. This approach can be useful, for example, when a certain region |
| of memory is faster than another. |
| |
| Overlays are described using the `OVERLAY' command. The `OVERLAY' |
| command is used within a `SECTIONS' command, like an output section |
| description. The full syntax of the `OVERLAY' command is as follows: |
| OVERLAY [START] : [NOCROSSREFS] [AT ( LDADDR )] |
| { |
| SECNAME1 |
| { |
| OUTPUT-SECTION-COMMAND |
| OUTPUT-SECTION-COMMAND |
| ... |
| } [:PHDR...] [=FILL] |
| SECNAME2 |
| { |
| OUTPUT-SECTION-COMMAND |
| OUTPUT-SECTION-COMMAND |
| ... |
| } [:PHDR...] [=FILL] |
| ... |
| } [>REGION] [:PHDR...] [=FILL] |
| |
| Everything is optional except `OVERLAY' (a keyword), and each |
| section must have a name (SECNAME1 and SECNAME2 above). The section |
| definitions within the `OVERLAY' construct are identical to those |
| within the general `SECTIONS' construct (*note SECTIONS::), except that |
| no addresses and no memory regions may be defined for sections within |
| an `OVERLAY'. |
| |
| The sections are all defined with the same starting address. The |
| load addresses of the sections are arranged such that they are |
| consecutive in memory starting at the load address used for the |
| `OVERLAY' as a whole (as with normal section definitions, the load |
| address is optional, and defaults to the start address; the start |
| address is also optional, and defaults to the current value of the |
| location counter). |
| |
| If the `NOCROSSREFS' keyword is used, and there are any references among |
| the sections, the linker will report an error. Since the sections all |
| run at the same address, it normally does not make sense for one |
| section to refer directly to another. *Note NOCROSSREFS: Miscellaneous |
| Commands. |
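| |
| As a minimal sketch of where the keyword goes, the fragment below adds |
| `NOCROSSREFS' to an overlay with two sections. The addresses and the |
| `o1' and `o2' input directories are illustrative assumptions; like any |
| `OVERLAY', the fragment belongs inside a `SECTIONS' command. |
| |
|      /* Sketch only: addresses and input paths are assumed. */ |
|      OVERLAY 0x1000 : NOCROSSREFS AT (0x4000) |
|      { |
|        .text0 { o1/*.o(.text) } |
|        .text1 { o2/*.o(.text) } |
|      } |
| |
| With this form, a direct reference from code in `.text0' to a symbol in |
| `.text1' (or the reverse) should be reported as an error at link time. |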
| |
| For each section within the `OVERLAY', the linker automatically |
| defines two symbols. The symbol `__load_start_SECNAME' is defined as |
| the starting load address of the section. The symbol |
| `__load_stop_SECNAME' is defined as the final load address of the |
| section. Any characters within SECNAME which are not legal within C |
| identifiers are removed. C (or assembler) code may use these symbols |
| to move the overlaid sections around as necessary. |
| |
| At the end of the overlay, the value of the location counter is set |
| to the start address of the overlay plus the size of the largest |
| section. |
| |
| Here is an example. Remember that this would appear inside a |
| `SECTIONS' construct. |
| OVERLAY 0x1000 : AT (0x4000) |
| { |
| .text0 { o1/*.o(.text) } |
| .text1 { o2/*.o(.text) } |
| } |
| |
| This will define both `.text0' and `.text1' to start at address 0x1000. |
| `.text0' will be loaded at address 0x4000, and `.text1' will be loaded |
| immediately after `.text0'. The following symbols will be defined: |
| `__load_start_text0', `__load_stop_text0', `__load_start_text1', |
| `__load_stop_text1'. |
| |
| C code to copy overlay `.text1' into the overlay area might look |
| like the following. |
| |
| extern char __load_start_text1, __load_stop_text1; |
| memcpy ((char *) 0x1000, &__load_start_text1, |
| &__load_stop_text1 - &__load_start_text1); |
| |
| Note that the `OVERLAY' command is just syntactic sugar, since |
| everything it does can be done using the more basic commands. The above |
| example could have been written equivalently as follows. |
| |
| .text0 0x1000 : AT (0x4000) { o1/*.o(.text) } |
| __load_start_text0 = LOADADDR (.text0); |
| __load_stop_text0 = LOADADDR (.text0) + SIZEOF (.text0); |
| .text1 0x1000 : AT (0x4000 + SIZEOF (.text0)) { o2/*.o(.text) } |
| __load_start_text1 = LOADADDR (.text1); |
| __load_stop_text1 = LOADADDR (.text1) + SIZEOF (.text1); |
| . = 0x1000 + MAX (SIZEOF (.text0), SIZEOF (.text1)); |
| |
| |
| File: ld.info, Node: MEMORY, Next: PHDRS, Prev: SECTIONS, Up: Scripts |
| |
| MEMORY command |
| ============== |
| |
| The linker's default configuration permits allocation of all |
| available memory. You can override this by using the `MEMORY' command. |
| |
| The `MEMORY' command describes the location and size of blocks of |
| memory in the target. You can use it to describe which memory regions |
| may be used by the linker, and which memory regions it must avoid. You |
| can then assign sections to particular memory regions. The linker will |
| set section addresses based on the memory regions, and will warn about |
| regions that become too full. The linker will not shuffle sections |
| around to fit into the available regions. |
| |
| A linker script may contain at most one use of the `MEMORY' command. |
| However, you can define as many blocks of memory within it as you |
| wish. The syntax is: |
| MEMORY |
| { |
| NAME [(ATTR)] : ORIGIN = ORIGIN, LENGTH = LEN |
| ... |
| } |
| |
| The NAME is a name used in the linker script to refer to the region. |
| The region name has no meaning outside of the linker script. Region |
| names are stored in a separate name space, and will not conflict with |
| symbol names, file names, or section names. Each memory region must |
| have a distinct name. |
| |
| The ATTR string is an optional list of attributes that specify |
| whether to use a particular memory region for an input section which is |
| not explicitly mapped in the linker script. As described in *Note |
| SECTIONS::, if you do not specify an output section for some input |
| section, the linker will create an output section with the same name as |
| the input section. If you define region attributes, the linker will use |
| them to select the memory region for the output section that it creates. |
| |
| The ATTR string must consist only of the following characters: |
| `R' |
| Read-only section |
| |
| `W' |
| Read/write section |
| |
| `X' |
| Executable section |
| |
| `A' |
| Allocatable section |
| |
| `I' |
| Initialized section |
| |
| `L' |
| Same as `I' |
| |
| `!' |
| Invert the sense of any of the preceding attributes |
| |
| If an unmapped section matches any of the listed attributes other than |
| `!', it will be placed in the memory region. The `!' attribute |
| reverses this test, so that an unmapped section will be placed in the |
| memory region only if it does not match any of the listed attributes. |
| |
| The ORIGIN is an expression for the start address of the memory |
| region. The expression must evaluate to a constant before memory |
| allocation is performed, which means that you may not use any section |
| relative symbols. The keyword `ORIGIN' may be abbreviated to `org' or |
| `o' (but not, for example, `ORG'). |
| |
| The LEN is an expression for the size in bytes of the memory region. |
| As with the ORIGIN expression, the expression must evaluate to a |
| constant before memory allocation is performed. The keyword `LENGTH' |
| may be abbreviated to `len' or `l'. |
| |
| In the following example, we specify that there are two memory |
| regions available for allocation: one starting at `0' for 256 kilobytes, |
| and the other starting at `0x40000000' for four megabytes. The linker |
| will place into the `rom' memory region every section which is not |
| explicitly mapped into a memory region, and is either read-only or |
| executable. The linker will place other sections which are not |
| explicitly mapped into a memory region into the `ram' memory region. |
| |
| MEMORY |
| { |
| rom (rx) : ORIGIN = 0, LENGTH = 256K |
| ram (!rx) : org = 0x40000000, l = 4M |
| } |
| |
| Once you define a memory region, you can direct the linker to place |
| specific output sections into that memory region by using the `>REGION' |
| output section attribute. For example, if you have a memory region |
| named `mem', you would use `>mem' in the output section definition. |
| *Note Output Section Region::. If no address was specified for the |
| output section, the linker will set the address to the next available |
| address within the memory region. If the combined output sections |
| directed to a memory region are too large for the region, the linker |
| will issue an error message. |
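| |
| The following sketch shows region assignment in context. It reuses the |
| `rom' and `ram' regions from the earlier example; the choice of which |
| sections go into which region is an assumption made for illustration. |
| |
|      MEMORY |
|      { |
|        rom (rx)  : ORIGIN = 0, LENGTH = 256K |
|        ram (!rx) : ORIGIN = 0x40000000, LENGTH = 4M |
|      } |
| |
|      SECTIONS |
|      { |
|        /* No addresses are given, so each section is placed at */ |
|        /* the next available address within its region. */ |
|        .text : { *(.text) } >rom |
|        .data : { *(.data) } >ram |
|        .bss  : { *(.bss) }  >ram |
|      } |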
| |
| |
| File: ld.info, Node: PHDRS, Next: VERSION, Prev: MEMORY, Up: Scripts |
| |
| PHDRS Command |
| ============= |
| |
| The ELF object file format uses "program headers", also known as |
| "segments". The program headers describe how the program should be |
| loaded into memory. You can print them out by using the `objdump' |
| program with the `-p' option. |
| |
| When you run an ELF program on a native ELF system, the system loader |
| reads the program headers in order to figure out how to load the |
| program. This will only work if the program headers are set correctly. |
| This manual does not describe the details of how the system loader |
| interprets program headers; for more information, see the ELF ABI. |
| |
| The linker will create reasonable program headers by default. |
| However, in some cases, you may need to specify the program headers more |
| precisely. You may use the `PHDRS' command for this purpose. When the |
| linker sees the `PHDRS' command in the linker script, it will not |
| create any program headers other than the ones specified. |
| |
| The linker only pays attention to the `PHDRS' command when |
| generating an ELF output file. In other cases, the linker will simply |
| ignore `PHDRS'. |
| |
| This is the syntax of the `PHDRS' command. The words `PHDRS', |
| `FILEHDR', `AT', and `FLAGS' are keywords. |
| |
| PHDRS |
| { |
| NAME TYPE [ FILEHDR ] [ PHDRS ] [ AT ( ADDRESS ) ] |
| [ FLAGS ( FLAGS ) ] ; |
| } |
| |
| The NAME is used only for reference in the `SECTIONS' command of the |
| linker script. It is not put into the output file. Program header |
| names are stored in a separate name space, and will not conflict with |
| symbol names, file names, or section names. Each program header must |
| have a distinct name. |
| |
| Certain program header types describe segments of memory which the |
| system loader will load from the file. In the linker script, you |
| specify the contents of these segments by placing allocatable output |
| sections in the segments. You use the `:PHDR' output section attribute |
| to place a section in a particular segment. *Note Output Section |
| Phdr::. |
| |
| It is normal to put certain sections in more than one segment. This |
| merely implies that one segment of memory contains another. You may |
| repeat `:PHDR', using it once for each segment which should contain the |
| section. |
| |
| If you place a section in one or more segments using `:PHDR', then |
| the linker will place all subsequent allocatable sections which do not |
| specify `:PHDR' in the same segments. This is for convenience, since |
| generally a whole set of contiguous sections will be placed in a single |
| segment. You can use `:NONE' to override the default segment and tell |
| the linker to not put the section in any segment at all. |
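| |
| A sketch of this defaulting rule follows. The `.scratch' section name |
| is hypothetical, standing in for any allocatable section that should |
| stay out of every segment. |
| |
|      PHDRS |
|      { |
|        text PT_LOAD ; |
|      } |
| |
|      SECTIONS |
|      { |
|        .text    : { *(.text) } :text |
|        .rodata  : { *(.rodata) }          /* inherits :text */ |
|        .scratch : { *(.scratch) } :NONE   /* placed in no segment */ |
|      } |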
| |
| The `FILEHDR' and `PHDRS' keywords may appear after the |
| program header type to further describe the contents of the segment. |
| The `FILEHDR' keyword means that the segment should include the ELF |
| file header. The `PHDRS' keyword means that the segment should include |
| the ELF program headers themselves. |
| |
| The TYPE may be one of the following. The numbers indicate the |
| value of the keyword. |
| |
| `PT_NULL' (0) |
| Indicates an unused program header. |
| |
| `PT_LOAD' (1) |
| Indicates that this program header describes a segment to be |
| loaded from the file. |
| |
| `PT_DYNAMIC' (2) |
| Indicates a segment where dynamic linking information can be found. |
| |
| `PT_INTERP' (3) |
| Indicates a segment where the name of the program interpreter may |
| be found. |
| |
| `PT_NOTE' (4) |
| Indicates a segment holding note information. |
| |
| `PT_SHLIB' (5) |
| A reserved program header type, defined but not specified by the |
| ELF ABI. |
| |
| `PT_PHDR' (6) |
| Indicates a segment where the program headers may be found. |
| |
| EXPRESSION |
| An expression giving the numeric type of the program header. This |
| may be used for types not defined above. |
| |
| You can specify that a segment should be loaded at a particular |
| address in memory by using an `AT' expression. This is identical to the |
| `AT' command used as an output section attribute (*note Output Section |
| LMA::). The `AT' command for a program header overrides the output |
| section attribute. |
| |
| The linker will normally set the segment flags based on the sections |
| which comprise the segment. You may use the `FLAGS' keyword to |
| explicitly specify the segment flags. The value of FLAGS must be an |
| integer. It is used to set the `p_flags' field of the program header. |
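| |
| As a sketch, assuming the standard ELF permission bits for `p_flags' |
| (execute = 1, write = 2, read = 4), the following gives a code segment |
| read and execute permission and a data segment read and write |
| permission. The header names `text' and `data' are illustrative. |
| |
|      PHDRS |
|      { |
|        text PT_LOAD FLAGS (5) ;   /* read (4) + execute (1) */ |
|        data PT_LOAD FLAGS (6) ;   /* read (4) + write (2) */ |
|      } |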
| |
| Here is an example of `PHDRS'. This shows a typical set of program |
| headers used on a native ELF system. |
| |
| PHDRS |
| { |
| headers PT_PHDR PHDRS ; |
| interp PT_INTERP ; |
| text PT_LOAD FILEHDR PHDRS ; |
| data PT_LOAD ; |
| dynamic PT_DYNAMIC ; |
| } |
| |
| SECTIONS |
| { |
| . = SIZEOF_HEADERS; |
| .interp : { *(.interp) } :text :interp |
| .text : { *(.text) } :text |
| .rodata : { *(.rodata) } /* defaults to :text */ |
| ... |
| . = . + 0x1000; /* move to a new page in memory */ |
| .data : { *(.data) } :data |
| .dynamic : { *(.dynamic) } :data :dynamic |
| ... |
| } |
| |
| |
| File: ld.info, Node: VERSION, Next: Expressions, Prev: PHDRS, Up: Scripts |
| |
| VERSION Command |
| =============== |
| |
| The linker supports symbol versions when using ELF. Symbol versions |
| are only useful when using shared libraries. The dynamic linker can use |
| symbol versions to select a specific version of a function when it runs |
| a program that may have been linked against an earlier version of the |
| shared library. |
| |
| You can include a version script directly in the main linker script, |
| or you can supply the version script as an implicit linker script. You |
| can also use the `--version-script' linker option. |
| |
| The syntax of the `VERSION' command is simply |
| VERSION { version-script-commands } |
| |
| The format of the version script commands is identical to that used |
| by Sun's linker in Solaris 2.5. The version script defines a tree of |
| version nodes. You specify the node names and interdependencies in the |
| version script. You can specify which symbols are bound to which |
| version nodes, and you can reduce a specified set of symbols to local |
| scope so that they are not globally visible outside of the shared |
| library. |
| |
| The easiest way to demonstrate the version script language is with a |
| few examples. |
| |
| VERS_1.1 { |
| global: |
| foo1; |
| local: |
| old*; |
| original*; |
| new*; |
| }; |
| |
| VERS_1.2 { |
| foo2; |
| } VERS_1.1; |
| |
| VERS_2.0 { |
| bar1; bar2; |
| } VERS_1.2; |
| |
| This example version script defines three version nodes. The first |
| version node defined is `VERS_1.1'; it has no other dependencies. The |
| script binds the symbol `foo1' to `VERS_1.1'. It reduces a number of |
| symbols to local scope so that they are not visible outside of the |
| shared library; this is done using wildcard patterns, so that any |
| symbol whose name begins with `old', `original', or `new' is matched. |
| The wildcard patterns available are the same as those used in the shell |
| when matching filenames (also known as "globbing"). |
| |
| Next, the version script defines node `VERS_1.2'. This node depends |
| upon `VERS_1.1'. The script binds the symbol `foo2' to the version |
| node `VERS_1.2'. |
| |
| Finally, the version script defines node `VERS_2.0'. This node |
| depends upon `VERS_1.2'. The script binds the symbols `bar1' and |
| `bar2' to the version node `VERS_2.0'. |
| |
| When the linker finds a symbol defined in a library which is not |
| specifically bound to a version node, it will effectively bind it to an |
| unspecified base version of the library. You can bind all otherwise |
| unspecified symbols to a given version node by using `global: *;' |
| somewhere in the version script. |
| |
| The names of the version nodes have no specific meaning other than |
| what they might suggest to the person reading them. The `2.0' version |
| could just as well have appeared in between `1.1' and `1.2'. However, |
| this would be a confusing way to write a version script. |
| |
| The node name can be omitted, provided it is the only version node in |
| the version script. Such a version script does not assign any versions |
| to symbols; it only selects which symbols will be globally visible and |
| which will not. |
| |
| { global: foo; bar; local: *; }; |
| |
| When you link an application against a shared library that has |
| versioned symbols, the application itself knows which version of each |
| symbol it requires, and it also knows which version nodes it needs from |
| each shared library it is linked against. Thus at runtime, the dynamic |
| loader can make a quick check to make sure that the libraries you have |
| linked against do in fact supply all of the version nodes that the |
| application will need to resolve all of the dynamic symbols. In this |
| way it is possible for the dynamic linker to know with certainty that |
| all external symbols that it needs will be resolvable without having to |
| search for each symbol reference. |
| |
| Symbol versioning is in effect a much more sophisticated form of the |
| minor version checking that SunOS does. The fundamental problem |
| that is being addressed here is that typically references to external |
| functions are bound on an as-needed basis, and are not all bound when |
| the application starts up. If a shared library is out of date, a |
| required interface may be missing; when the application tries to use |
| that interface, it may suddenly and unexpectedly fail. With symbol |
| versioning, the user will get a warning when they start their program if |
| the libraries being used with the application are too old. |
| |
| There are several GNU extensions to Sun's versioning approach. The |
| first of these is the ability to bind a symbol to a version node in the |
| source file where the symbol is defined instead of in the versioning |
| script. This was done mainly to reduce the burden on the library |
| maintainer. You can do this by putting something like: |
| __asm__(".symver original_foo,foo@VERS_1.1"); |
| |
| in the C source file. This renames the function `original_foo' to be |
| an alias for `foo' bound to the version node `VERS_1.1'. The `local:' |
| directive can be used to prevent the symbol `original_foo' from being |
| exported. A `.symver' directive takes precedence over a version script. |
| |
| The second GNU extension is to allow multiple versions of the same |
| function to appear in a given shared library. In this way you can make |
| an incompatible change to an interface without increasing the major |
| version number of the shared library, while still allowing applications |
| linked against the old interface to continue to function. |
| |
| To do this, you must use multiple `.symver' directives in the source |
| file. Here is an example: |
| |
| __asm__(".symver original_foo,foo@"); |
| __asm__(".symver old_foo,foo@VERS_1.1"); |
| __asm__(".symver old_foo1,foo@VERS_1.2"); |
| __asm__(".symver new_foo,foo@@VERS_2.0"); |
| |
| In this example, `foo@' represents the symbol `foo' bound to the |
| unspecified base version of the symbol. The source file that contains |
| this example would define 4 C functions: `original_foo', `old_foo', |
| `old_foo1', and `new_foo'. |
| |
| When you have multiple definitions of a given symbol, there needs to |
| be some way to specify a default version to which external references to |
| this symbol will be bound. You can do this with the `foo@@VERS_2.0' |
| type of `.symver' directive. You can only declare one version of a |
| symbol as the default in this manner; otherwise you would effectively |
| have multiple definitions of the same symbol. |
| |
| If you wish to bind a reference to a specific version of the symbol |
| within the shared library, you can use the aliases of convenience (e.g. |
| `old_foo'), or you can use the `.symver' directive to specifically bind |
| to an external version of the function in question. |
| |
| You can also specify the language in the version script: |
| |
| VERSION extern "lang" { version-script-commands } |
| |
| The supported `lang's are `C', `C++', and `Java'. The linker will |
| iterate over the list of symbols at link time and demangle them |
| according to `lang' before matching them to the patterns specified in |
| `version-script-commands'. |
| |
| |
| File: ld.info, Node: Expressions, Next: Implicit Linker Scripts, Prev: VERSION, Up: Scripts |
| |
| Expressions in Linker Scripts |
| ============================= |
| |
| The syntax for expressions in the linker script language is |
| identical to that of C expressions. All expressions are evaluated as |
| integers. All expressions are evaluated in the same size, which is 32 |
| bits if both the host and target are 32 bits, and is otherwise 64 bits. |
| |
| You can use and set symbol values in expressions. |
| |
| The linker defines several special purpose builtin functions for use |
| in expressions. |
| |
| * Menu: |
| |
| * Constants:: Constants |
| * Symbols:: Symbol Names |
| * Location Counter:: The Location Counter |
| * Operators:: Operators |
| * Evaluation:: Evaluation |
| * Expression Section:: The Section of an Expression |
| * Builtin Functions:: Builtin Functions |
| |
| |
| File: ld.info, Node: Constants, Next: Symbols, Up: Expressions |
| |
| Constants |
| --------- |
| |
| All constants are integers. |
| |
| As in C, the linker considers an integer beginning with `0' to be |
| octal, and an integer beginning with `0x' or `0X' to be hexadecimal. |
| The linker considers other integers to be decimal. |
| |
| In addition, you can use the suffixes `K' and `M' to scale a |
| constant by `1024' or `1024*1024' respectively. For example, the |
| following all refer to the same quantity: |
| _fourk_1 = 4K; |
| _fourk_2 = 4096; |
| _fourk_3 = 0x1000; |
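| |
| The `M' suffix and octal notation follow the same rules. As a further |
| sketch (the symbol names are arbitrary), the first three assignments |
| below all yield one megabyte, and the last is `4096' written in octal: |
| |
|      _onemeg_1 = 1M; |
|      _onemeg_2 = 1048576; |
|      _onemeg_3 = 0x100000; |
|      _fourk_octal = 010000;   /* leading 0 means octal */ |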
| |
| |
| File: ld.info, Node: Symbols, Next: Location Counter, Prev: Constants, Up: Expressions |
| |
| Symbol Names |
| ------------ |
| |
| Unless quoted, symbol names start with a letter, underscore, or |
| period and may include letters, digits, underscores, periods, and |
| hyphens. Unquoted symbol names must not conflict with any keywords. |
| You can specify a symbol which contains odd characters or has the same |
| name as a keyword by surrounding the symbol name in double quotes: |
| "SECTION" = 9; |
| "with a space" = "also with a space" + 10; |
| |
| Since symbols can contain many non-alphabetic characters, it is |
| safest to delimit symbols with spaces. For example, `A-B' is one |
| symbol, whereas `A - B' is an expression involving subtraction. |
| |
| |
| File: ld.info, Node: Location Counter, Next: Operators, Prev: Symbols, Up: Expressions |
| |
| The Location Counter |
| -------------------- |
| |
| The special linker variable "dot" `.' always contains the current |
| output location counter. Since the `.' always refers to a location in |
| an output section, it may only appear in an expression within a |
| `SECTIONS' command. The `.' symbol may appear anywhere that an |
| ordinary symbol is allowed in an expression. |
| |
| Assigning a value to `.' will cause the location counter to be |
| moved. This may be used to create holes in the output section. The |
| location counter may never be moved backwards. |
| |
| SECTIONS |
| { |
| output : |
| { |
| file1(.text) |
| . = . + 1000; |
| file2(.text) |
| . += 1000; |
| file3(.text) |
| } = 0x12345678; |
| } |
| |
| In the previous example, the `.text' section from `file1' is located at |
| the beginning of the output section `output'. It is followed by a 1000 |
| byte gap. Then the `.text' section from `file2' appears, also with a |
| 1000 byte gap following before the `.text' section from `file3'. The |
| notation `= 0x12345678' specifies what data to write in the gaps (*note |
| Output Section Fill::). |
| |
| Note: `.' actually refers to the byte offset from the start of the |
| current containing object. Normally this is the `SECTIONS' statement, |
| whose start address is 0, hence `.' can be used as an absolute address. |
| If `.' is used inside a section description however, it refers to the |
| byte offset from the start of that section, not an absolute address. |
| Thus in a script like this: |
| |
| SECTIONS |
| { |
| . = 0x100; |
| .text : { |
| *(.text) |
| . = 0x200; |
| } |
| . = 0x500; |
| .data : { |
| *(.data) |
| . += 0x600; |
| } |
| } |
| } |
| |
| The `.text' section will be assigned a starting address of 0x100 and |
| a size of exactly 0x200 bytes, even if there is not enough data in the |
| `.text' input sections to fill this area. (If there is too much data, |
| an error will be produced because this would be an attempt to move `.' |
| backwards). The `.data' section will start at 0x500 and it will have |
| an extra 0x600 bytes worth of space after the end of the values from |
| the `.data' input sections and before the end of the `.data' output |
| section itself. |
| |
| |
| File: ld.info, Node: Operators, Next: Evaluation, Prev: Location Counter, Up: Expressions |
| |
| Operators |
| --------- |
| |
| The linker recognizes the standard C set of arithmetic operators, |
| with the standard bindings and precedence levels: |
| precedence   associativity   Operators                 Notes |
| (highest) |
| 1            left            !  -  ~                   (1) |
| 2            left            *  /  % |
| 3            left            +  - |
| 4            left            >>  << |
| 5            left            ==  !=  >  <  <=  >= |
| 6            left            & |
| 7            left            | |
| 8            left            && |
| 9            left            || |
| 10           right           ? : |
| 11           right           &=  +=  -=  *=  /=        (2) |
| (lowest) |
| Notes: (1) Prefix operators (2) *Note Assignments::. |
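| |
| Because the bindings match those of C, the following assignments |
| (with arbitrary symbol names) should evaluate as the comments show. |
| This is a sketch for checking one's reading of the precedence levels, |
| not an exhaustive test. |
| |
|      _a = 1 + 2 * 3;         /* 1 + (2 * 3)  = 7 */ |
|      _b = 1 << 2 + 1;        /* 1 << (2 + 1) = 8 */ |
|      _c = 6 & 3 | 8;         /* (6 & 3) | 8  = 10 */ |
|      _d = 2 > 1 ? 10 : 20;   /* condition is true, so 10 */ |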
| |
| |
| File: ld.info, Node: Evaluation, Next: Expression Section, Prev: Operators, Up: Expressions |
| |
| Evaluation |
| ---------- |
| |
| The linker evaluates expressions lazily. It only computes the value |
| of an expression when absolutely necessary. |
| |
| The linker needs some information, such as the value of the start |
| address of the first section, and the origins and lengths of memory |
| regions, in order to do any linking at all. These values are computed |
| as soon as possible when the linker reads in the linker script. |
| |
| However, other values (such as symbol values) are not known or needed |
| until after storage allocation. Such values are evaluated later, when |
| other information (such as the sizes of output sections) is available |
| for use in the symbol assignment expression. |
| |
| The sizes of sections cannot be known until after allocation, so |
| assignments dependent upon these are not performed until after |
| allocation. |
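| |
| For instance, in the sketch below the assignment to `_text_size' (an |
| arbitrary name) depends on the size of `.text', so the linker cannot |
| evaluate it until `.text' has been allocated. `SIZEOF' is the same |
| builtin used in the expanded form of the `OVERLAY' example. |
| |
|      SECTIONS |
|      { |
|        .text : { *(.text) } |
|        _text_size = SIZEOF (.text);   /* evaluated after allocation */ |
|      } |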
| |
| Some expressions, such as those depending upon the location counter |
| `.', must be evaluated during section allocation. |
| |
| If the result of an expression is required, but the value is not |
| available, then an error results. For example, a script like the |
| following |
| SECTIONS |
| { |
| .text 9+this_isnt_constant : |
| { *(.text) } |
| } |
| |
| will cause the error message `non constant expression for initial |
| address'. |
| |
| |
| File: ld.info, Node: Expression Section, Next: Builtin Functions, Prev: Evaluation, Up: Expressions |
| |
| The Section of an Expression |
| ---------------------------- |
| |
| When the linker evaluates an expression, the result is either |
| absolute or relative to some section. A relative expression is |
| expressed as a fixed offset from the base of a section. |
| |
| The position of the expression within the linker script determines |
| whether it is absolute or relative. An expression which appears within |
| an output section definition is relative to the base of the output |
| section. An expression which appears elsewhere will be absolute. |
| |
| A symbol set to a relative expression will be relocatable if you |
| request relocatable output using the `-r' option. That means that a |
| further link operation may change the value of the symbol. The symbol's |
| section will be the section of the relative expression. |
| |
| A symbol set to an absolute expression will retain the same value |
| through any further link operation. The symbol will be absolute, and |
| will not have any particular associated section. |
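| |
| The distinction can be sketched as follows, using arbitrary symbol |
| names. The first assignment appears outside any output section |
| definition, the second inside one. |
| |
|      SECTIONS |
|      { |
|        _abs_marker = 0x1000;   /* outside a section: absolute */ |
|        .data : |
|        { |
|          _data_begin = .;      /* inside .data: relative to .data */ |
|          *(.data) |
|        } |
|      } |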
| |
| You can use the builtin function `ABSOLUTE' to force an expression |
| to be absolute when it would otherwise be relative. For example, to |
| create an absolute symbol set to the address of the end of the output |
| section `.data': |
| SECTIONS |
| { |
| .data : { *(.data) _edata = ABSOLUTE(.); } |
| } |
| |
| If `ABSOLUTE' were not used, `_edata' would be relative to the `.data' |
| section. |
| |
| |
| File: ld.info, Node: Builtin Functions, Prev: Expression Section, Up: Expressions |
| |
| Builtin Functions |
| ----------------- |
| |
| The linker script language includes a number of builtin functions for |
| use in linker script expressions. |
| |
| `ABSOLUTE(EXP)' |
| Return the absolute (non-relocatable, as opposed to non-negative) |
| value of the expression EXP. Primarily useful to assign an |
| absolute value to a symbol within a section definition, where |
| symbol values are normally section relative. *Note Expression |
| Section::. |
| |
| `ADDR(SECTION)' |
| Return the absolute address (the VMA) of the named SECTION. Your |
| script must previously have defined the location of that section. |
| In the following example, `symbol_1' and `symbol_2' are assigned |
| identical values: |
| SECTIONS { ... |
| .output1 : |
| { |
| start_of_output_1 = ABSOLUTE(.); |
| ... |
| } |
| .output : |
| { |
| symbol_1 = ADDR(.output1); |
| symbol_2 = start_of_output_1; |
| } |
| ... } |
| |
| `ALIGN(EXP)' |
| Return the location counter (`.') aligned to the next EXP boundary. |
| `ALIGN' doesn't change the value of the location counter--it just |
| does arithmetic on it. Here is an example which aligns the output |
| `.data' section to the next `0x2000' byte boundary after the |
| preceding section and sets a variable within the section to the |
| next `0x8000' boundary after the input sections: |
| SECTIONS { ... |
| .data ALIGN(0x2000): { |
| *(.data) |
| variable = ALIGN(0x8000); |
| } |
| ... } |
| |
| The first use of `ALIGN' in this example specifies the location of |
| a section because it is used as the optional ADDRESS attribute of |
| a section definition (*note Output Section Address::). The second |
| use of `ALIGN' defines the value of a symbol. |
| |
| The builtin function `NEXT' is closely related to `ALIGN'. |
| |
| `BLOCK(EXP)' |
| This is a synonym for `ALIGN', for compatibility with older linker |
| scripts. It is most often seen when setting the address of an |
| output section. |
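| |
| A sketch of that usage, with an assumed alignment of `0x1000': |
| |
|      SECTIONS |
|      { |
|        /* Intended to match ALIGN(0x1000) as the section address. */ |
|        .text BLOCK(0x1000) : { *(.text) } |
|      } |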
| |
| `DATA_SEGMENT_ALIGN(MAXPAGESIZE, COMMONPAGESIZE)' |
| This is equivalent to either |
| (ALIGN(MAXPAGESIZE) + (. & (MAXPAGESIZE - 1))) |
| or |
| (ALIGN(MAXPAGESIZE) + (. & (MAXPAGESIZE - COMMONPAGESIZE))) |
| |
| The linker chooses the second form when it uses fewer |
| COMMONPAGESIZE-sized pages for the data segment (the area between |
| the result of this expression and `DATA_SEGMENT_END') than the |
| first form; otherwise it chooses the first form. When the second |
| form is used, COMMONPAGESIZE bytes of runtime memory are saved at |
| the expense of up to COMMONPAGESIZE wasted bytes in the on-disk file. |
| |
| This expression can only be used directly in `SECTIONS' commands, |
| not in any output section description, and only once in the linker |
| script. COMMONPAGESIZE should be less than or equal to MAXPAGESIZE |
| and should be the system page size the object wants to be optimized |
| for (while still working on system page sizes up to MAXPAGESIZE). |
| |
| Example: |
| . = DATA_SEGMENT_ALIGN(0x10000, 0x2000); |
| |
| `DATA_SEGMENT_END(EXP)' |
| This defines the end of the data segment for `DATA_SEGMENT_ALIGN' |
| evaluation purposes. |
| |
| . = DATA_SEGMENT_END(.); |
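| |
| As a minimal sketch of how the two functions pair up (the page |
| sizes and section names here are illustrative assumptions, not |
| taken from any particular target), `DATA_SEGMENT_ALIGN' might open |
| the data segment and `DATA_SEGMENT_END' close it: |
| |
| SECTIONS { ... |
|   /* Illustrative page sizes; real values depend on the target. */ |
|   . = DATA_SEGMENT_ALIGN(0x10000, 0x2000); |
|   .data : { *(.data) } |
|   .bss : { *(.bss) *(COMMON) } |
|   /* Mark where the data segment ends. */ |
|   . = DATA_SEGMENT_END(.); |
| ... } |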
| |
| `DEFINED(SYMBOL)' |
| Return 1 if SYMBOL is in the linker global symbol table and is |
| defined, otherwise return 0. You can use this function to provide |
| default values for symbols. For example, the following script |
| fragment shows how to set a global symbol `begin' to the first |
| location in the `.text' section--but if a symbol called `begin' |
| already existed, its value is preserved: |
| |
| SECTIONS { ... |
| .text : { |
| begin = DEFINED(begin) ? begin : . ; |
| ... |
| } |
| ... |
| } |
| |
| `LOADADDR(SECTION)' |
| Return the absolute LMA of the named SECTION. This is normally |
| the same as `ADDR', but it may be different if the `AT' attribute |
| is used in the output section definition (*note Output Section |
| LMA::). |
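| |
| The following sketch assumes a hypothetical `.mdata' section that |
| runs at `0x2000' but is loaded immediately after `.text'; the |
| symbol names `_data_run' and `_data_load' are invented for |
| illustration. Here `ADDR' yields the run-time address and |
| `LOADADDR' yields the load address: |
| |
| SECTIONS { ... |
|   .text 0x1000 : { *(.text) } |
|   .mdata 0x2000 : AT ( ADDR(.text) + SIZEOF(.text) ) |
|   { |
|     *(.data) |
|   } |
|   _data_run = ADDR(.mdata);       /* VMA: 0x2000 */ |
|   _data_load = LOADADDR(.mdata);  /* LMA: end of .text */ |
| ... } |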
| |
| `MAX(EXP1, EXP2)' |
| Return the maximum of EXP1 and EXP2. |
| |
| `MIN(EXP1, EXP2)' |
| Return the minimum of EXP1 and EXP2. |
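| |
| As an illustrative sketch (the symbol name `scratch_size' and the |
| `0x400' and `0x4000' limits are assumptions made for this example), |
| `MAX' and `MIN' can be combined to clamp a computed value between |
| a lower and an upper bound: |
| |
| SECTIONS { ... |
|   .bss : { *(.bss) *(COMMON) } |
|   /* At least 0x400 bytes, at most 0x4000 bytes. */ |
|   scratch_size = MIN(MAX(SIZEOF(.bss), 0x400), 0x4000); |
| ... } |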
| |
| `NEXT(EXP)' |
| Return the next unallocated address that is a multiple of EXP. |
| This function is closely related to `ALIGN(EXP)'; unless you use |
| the `MEMORY' command to define discontinuous memory for the output |
| file, the two functions are equivalent. |
| |
| `SIZEOF(SECTION)' |
| Return the size in bytes of the named SECTION, if that section has |
| been allocated. If the section has not been allocated when this is |
| evaluated, the linker will report an error. In the following |
| example, `symbol_1' and `symbol_2' are assigned identical values: |
| SECTIONS{ ... |
| .output : { |
| .start = . ; |
| ... |
| .end = . ; |
| } |
| symbol_1 = .end - .start ; |
| symbol_2 = SIZEOF(.output); |
| ... } |
| |
| `SIZEOF_HEADERS' |
| `sizeof_headers' |
| Return the size in bytes of the output file's headers. This is |
| information which appears at the start of the output file. You |
| can use this number when setting the start address of the first |
| section, if you choose, to facilitate paging. |
| |
| When producing an ELF output file, if the linker script uses the |
| `SIZEOF_HEADERS' builtin function, the linker must compute the |
| number of program headers before it has determined all the section |
| addresses and sizes. If the linker later discovers that it needs |
| additional program headers, it will report an error `not enough |
| room for program headers'. To avoid this error, you must avoid |
| using the `SIZEOF_HEADERS' function, or you must rework your linker |
| script to avoid forcing the linker to use additional program |
| headers, or you must define the program headers yourself using the |
| `PHDRS' command (*note PHDRS::). |
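| |
| A minimal sketch of the last approach, assuming a simple ELF |
| target; the program header names, the `0x10000' base address, and |
| the section layout are illustrative only: |
| |
| PHDRS |
| { |
|   headers PT_PHDR PHDRS ; |
|   text PT_LOAD FILEHDR PHDRS ; |
|   data PT_LOAD ; |
| } |
| SECTIONS |
| { |
|   /* Leave room for the file and program headers. */ |
|   . = 0x10000 + SIZEOF_HEADERS; |
|   .text : { *(.text) } :text |
|   .data : { *(.data) } :data |
| } |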
| |
| |
| File: ld.info, Node: Implicit Linker Scripts, Prev: Expressions, Up: Scripts |
| |
| Implicit Linker Scripts |
| ======================= |
| |
| If you specify a linker input file which the linker cannot |
| recognize as an object file or an archive file, it will try to read the |
| file as a linker script. If the file cannot be parsed as a linker |
| script, the linker will report an error. |
| |
| An implicit linker script will not replace the default linker script. |
| |
| Typically an implicit linker script would contain only symbol |
| assignments, or the `INPUT', `GROUP', or `VERSION' commands. |
| |
| Any input files read because of an implicit linker script will be |
| read at the position in the command line where the implicit linker |
| script was read. This can affect archive searching. |
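| |
| As a hypothetical illustration, suppose a file named `extra.ld' |
| (the file, library, and symbol names here are invented) contains |
| only the following: |
| |
| /* Pull in an archive and define one symbol. */ |
| INPUT(libfoo.a) |
| stack_size = 0x1000; |
| |
| Naming `extra.ld' on the command line after `main.o' would cause |
| `libfoo.a' to be read at that position, after `main.o', and the |
| default linker script would remain in effect. |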
| |
| |
| File: ld.info, Node: Machine Dependent, Next: BFD, Prev: Scripts, Up: Top |
| |
| Machine Dependent Features |
| ************************** |
| |
| `ld' has additional features on some platforms; the following |
| sections describe them. Machines where `ld' has no additional |
| functionality are not listed. |
| |
| * Menu: |
| |
| * H8/300:: `ld' and the H8/300 |
| * i960:: `ld' and the Intel 960 family |
| * ARM:: `ld' and the ARM family |
| * HPPA ELF32:: `ld' and HPPA 32-bit ELF |
| |
| * MMIX:: `ld' and MMIX |
| |
| * TI COFF:: `ld' and TI COFF |
| |
| |
| File: ld.info, Node: H8/300, Next: i960, Up: Machine Dependent |
| |
| `ld' and the H8/300 |
| =================== |
| |
| For the H8/300, `ld' can perform these global optimizations when you |
| specify the `--relax' command-line option. |
| |
| _relaxing address modes_ |
| `ld' finds all `jsr' and `jmp' instructions whose targets are |
| within eight bits, and turns them into eight-bit program-counter |
| relative `bsr' and `bra' instructions, respectively. |
| |
| _synthesizing instructions_ |
| `ld' finds all `mov.b' instructions which use the sixteen-bit |
| absolute address form, but refer to the top page of memory, and |
| changes them to use the eight-bit address form. (That is: the |
| linker turns `mov.b @AA:16' into `mov.b @AA:8' whenever the |
| address AA is in the top page of memory.) |
| |
| |
| File: ld.info, Node: i960, Next: ARM, Prev: H8/300, Up: Machine Dependent |
| |
| `ld' and the Intel 960 family |
| ============================= |
| |
| You can use the `-AARCHITECTURE' command line option to specify one |
| of the two-letter names identifying members of the 960 family; the |
| option specifies the desired output target, and warns of any |
| incompatible instructions in the input files. It also modifies the |
| linker's search strategy for archive libraries, to support the use of |
| libraries specific to each particular architecture, by including in the |
| search loop names suffixed with the string identifying the architecture. |
| |
| For example, if your `ld' command line included `-ACA' as well as |
| `-ltry', the linker would look (in its built-in search paths, and in |
| any paths you specify with `-L') for a library with the names |
| |
| try |
| libtry.a |
| tryca |
| libtryca.a |
| |
| The first two possibilities would be considered in any event; the last |
| two are due to the use of `-ACA'. |
| |
| You can meaningfully use `-A' more than once on a command line, since |
| the 960 architecture family allows combination of target architectures; |
| each use will add another pair of name variants to search for when `-l' |
| specifies a library. |
| |
| `ld' supports the `--relax' option for the i960 family. If you |
| specify `--relax', `ld' finds all `balx' and `calx' instructions whose |
| targets are within 24 bits, and turns them into 24-bit program-counter |
| relative `bal' and `cal' instructions, respectively. `ld' also turns |
| `cal' instructions into `bal' instructions when it determines that the |
| target subroutine is a leaf routine (that is, the target subroutine does |
| not itself call any subroutines). |
| |
| |
| File: ld.info, Node: ARM, Next: HPPA ELF32, Prev: i960, Up: Machine Dependent |
| |
| `ld''s support for interworking between ARM and Thumb code |
| ========================================================== |
| |
| For the ARM, `ld' will generate code stubs to allow function calls |
| between ARM and Thumb code. These stubs only work with code that has |
| been compiled and assembled with the `-mthumb-interwork' command line |
| option. If it is necessary to link with old ARM object files or |
| libraries which have not been compiled with the `-mthumb-interwork' |
| option, then the `--support-old-code' command line switch should be |
| given to the linker. This will make it generate larger stub functions |
| which will work with non-interworking aware ARM code. Note, however, |
| that the linker does not support generating stubs for function calls |
| to non-interworking aware Thumb code. |
| |
| The `--thumb-entry' switch works like the generic `--entry' switch, |
| in that it sets the program's starting address. It also sets the |
| bottom bit of the address, so that it can be branched to using a BX |
| instruction, and the program will start executing in Thumb mode |
| straight away. |
| |
| |
| File: ld.info, Node: HPPA ELF32, Next: MMIX, Prev: ARM, Up: Machine Dependent |
| |
| `ld' and HPPA 32-bit ELF support |
| ================================ |
| |
| When generating a shared library, `ld' will by default generate |
| import stubs suitable for use with a single sub-space application. The |
| `--multi-subspace' switch causes `ld' to generate export stubs, and |
| different (larger) import stubs suitable for use with multiple |
| sub-spaces. |
| |
| Long branch stubs and import/export stubs are placed by `ld' in stub |
| sections located between groups of input sections. `--stub-group-size=N' |
| specifies the maximum size N of a group of input sections handled by one |
| stub section. Since branch offsets are signed, a stub section may |
| serve two groups of input sections, one group before the stub section, |
| and one group after it. However, when using conditional branches that |
| require stubs, it may be better (for branch prediction) that stub |
| sections only serve one group of input sections. A negative value for |
| `N' chooses this scheme, ensuring that branches to stubs always use a |
| negative offset. Two special values of `N' are recognized, `1' and |
| `-1'. These both instruct `ld' to automatically size input section |
| groups for the branch types detected, with the same behaviour regarding |
| stub placement as other positive or negative values of `N' respectively. |
| |
| Note that `--stub-group-size' does not split input sections. A |
| single input section larger than the group size specified will of course |
| create a larger group (of one section). If input sections are too |
| large, it may not be possible for a branch to reach its stub. |
| |
| |
| File: ld.info, Node: MMIX, Next: TI COFF, Prev: HPPA ELF32, Up: Machine Dependent |
| |
| `ld' and MMIX |
| ============= |
| |
| For MMIX, there is a choice of generating `ELF' object files or `mmo' |
| object files when linking. The simulator `mmix' understands the `mmo' |
| format. The binutils `objcopy' utility can translate between the two |
| formats. |
| |
| There is one special section, the `.MMIX.reg_contents' section. |
| The contents of this section are assumed to correspond to those of the |
| global registers, and symbols referring to it are translated to special |
| symbols, equal to registers. In a final link, the start address of the |
| `.MMIX.reg_contents' section corresponds to the first allocated global |
| register multiplied by 8. Register `$255' is not included in this |
| section; it is always set to the program entry, which is at the symbol |
| `Main' for `mmo' files. |
| |
| Symbols with the prefix `__.MMIX.start.', for example |
| `__.MMIX.start..text' and `__.MMIX.start..data', are special; there |
| must be only one of each, even if they are local. The default linker script |
| uses these to set the default start address of a section. |
| |
| Initial and trailing multiples of zero-valued 32-bit words in a |
| section are left out of an `mmo' file. |
| |
| |
| File: ld.info, Node: TI COFF, Prev: MMIX, Up: Machine Dependent |
| |
| `ld''s support for various TI COFF versions |
| =========================================== |
| |
| The `--format' switch allows selection of one of the various TI COFF |
| versions. The latest as of this writing is 2; versions 0 and 1 are also |
| supported. The TI COFF versions also vary in header byte-order format; |
| `ld' will read any version or byte order, but the output header format |
| depends on the default specified by the specific target. |
| |
| |
| File: ld.info, Node: BFD, Next: Reporting Bugs, Prev: Machine Dependent, Up: Top |
| |
| BFD |
| *** |
| |
| The linker accesses object and archive files using the BFD libraries. |
| These libraries allow the linker to use the same routines to operate on |
| object files whatever the object file format. A different object file |
| format can be supported simply by creating a new BFD back end and adding |
| it to the library. To conserve runtime memory, however, the linker and |
| associated tools are usually configured to support only a subset of the |
| object file formats available. You can use `objdump -i' (*note |
| objdump: (binutils.info)objdump.) to list all the formats available for |
| your configuration. |
| |
| As with most implementations, BFD is a compromise between several |
| conflicting requirements. The major factor influencing BFD design was |
| efficiency: any time used converting between formats is time which |
| would not have been spent had BFD not been involved. This is partly |
| offset by abstraction payback; since BFD simplifies applications and |
| back ends, more time and care may be spent optimizing algorithms for |
| greater speed. |
| |
| One minor artifact of the BFD solution which you should bear in mind |
| is the potential for information loss. There are two places where |
| useful information can be lost using the BFD mechanism: during |
| conversion and during output. *Note BFD information loss::. |
| |
| * Menu: |
| |
| * BFD outline:: How it works: an outline of BFD |
| |
| |
| File: ld.info, Node: BFD outline, Up: BFD |
| |
| How it works: an outline of BFD |
| =============================== |
| |
| When an object file is opened, BFD subroutines automatically |
| determine the format of the input object file. They then build a |
| descriptor in memory with pointers to routines that will be used to |
| access elements of the object file's data structures. |
| |
| As different information from the object files is required, BFD |
| reads from different sections of the file and processes them. For |
| example, a very common operation for the linker is processing symbol |
| tables. Each BFD back end provides a routine for converting between |
| the object file's representation of symbols and an internal canonical |
| format. When the linker asks for the symbol table of an object file, it |
| calls through a memory pointer to the routine from the relevant BFD |
| back end which reads and converts the table into a canonical form. The |
| linker then operates upon the canonical form. When the link is finished |
| and the linker writes the output file's symbol table, another BFD back |
| end routine is called to take the newly created symbol table and |
| convert it into the chosen output format. |
| |
| * Menu: |
| |
| * BFD information loss:: Information Loss |
| * Canonical format:: The BFD canonical object-file format |
| |
| |
| File: ld.info, Node: BFD information loss, Next: Canonical format, Up: BFD outline |
| |
| Information Loss |
| ---------------- |
| |
| _Information can be lost during output._ The output formats |
| supported by BFD do not provide identical facilities, and information |
| which can be described in one form has nowhere to go in another format. |
| One example of this is alignment information in `b.out'. There is |
| nowhere in an `a.out' format file to store alignment information on the |
| contained data, so when a file is linked from `b.out' and an `a.out' |
| image is produced, alignment information will not propagate to the |
| output file. (The linker will still use the alignment information |
| internally, so the link is performed correctly.) |
| |
| Another example is COFF section names. COFF files may contain an |
| unlimited number of sections, each one with a textual section name. If |
| the target of the link is a format which does not have many sections |
| (e.g., `a.out') or has sections without names (e.g., the Oasys format), |
| the link cannot be done simply. You can circumvent this problem by |
| describing the desired input-to-output section mapping with the linker |
| command language. |
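| |
| For instance, a script along the following lines might fold |
| arbitrarily named input sections into the three sections that |
| `a.out' provides. This is only a sketch: the input section names |
| `.init' and `.rdata' are assumptions about what the COFF input |
| files happen to contain. |
| |
| SECTIONS |
| { |
|   .text : { *(.text) *(.init) } |
|   .data : { *(.data) *(.rdata) } |
|   .bss : { *(.bss) *(COMMON) } |
| } |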
| |
| _Information can be lost during canonicalization._ The BFD internal |
| canonical form of the external formats is not exhaustive; there are |
| structures in input formats for which there is no direct representation |
| internally. This means that the BFD back ends cannot maintain all |
| possible data richness through the transformation from external to |
| internal and back to external formats. |
| |
| This limitation is only a problem when an application reads one |
| format and writes another. Each BFD back end is responsible for |
| maintaining as much data as possible, and the internal BFD canonical |
| form has structures which are opaque to the BFD core, and exported only |
| to the back ends. When a file is read in one format, the canonical form |
| is generated for BFD and the application. At the same time, the back |
| end saves away any information which may otherwise be lost. If the data |
| is then written back in the same format, the back end routine will be |
| able to use the canonical form provided by the BFD core as well as the |
| information it prepared earlier. Since there is a great deal of |
| commonality between back ends, there is no information lost when |
| linking or copying big endian COFF to little endian COFF, or `a.out' to |
| `b.out'. When a mixture of formats is linked, the information is only |
| lost from the files whose format differs from the destination. |
| |