AUTOSAR from introduction to mastery - understanding ld linker script files in one article

Posted by mkarabulut on Thu, 25 Nov 2021 00:07:05 +0100

1 linker script

The main purpose of a linker script is to describe how the sections in the input files are mapped into the output file, and to control the memory layout of the output file. Most linker scripts do nothing more than this. However, when necessary, a linker script can also direct the linker to perform many other operations, using the commands described below.

The linker always uses a linker script. If you do not supply one yourself, the linker uses a default script that is compiled into the linker executable. You can use the '--verbose' command line option to display the default linker script. Certain command line options, such as '-r' or '-N', affect the default linker script.

You can supply your own linker script by using the '-T' command line option. When you do this, your linker script replaces the default linker script.

You can also use linker scripts implicitly by naming them as input files to the linker, as though they were files to be linked.
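As a rough illustration of the implicit case, such a script usually contains little more than symbol assignments or INPUT/GROUP commands. The file name, the library name and the symbol below are hypothetical and only serve the example.

```ld
/* extra.ld: hypothetical file, named on the command line next to the object files */
INPUT(libboard.a)        /* hypothetical library added to the link */
_board_revision = 3;     /* hypothetical symbol made visible to the program */
```

When the linker cannot recognize an input file as an object file or an archive, it tries to read it as a linker script. A script used this way adds to the default linker script instead of replacing it.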

2 basic linker script concepts

We need to define some basic concepts and vocabulary in order to describe the linker script language.

The linker combines multiple input files into a single output file. The output file and each input file are in a special data format known as an 'object file format', and each file is called an 'object file'. The output file is often called an 'executable', but for our purposes we will also call it an object file. Each object file has, among other things, a list of sections. We sometimes refer to a section in an input file as an input section; similarly, a section in the output file is an output section.

Each section in an object file has a name and a size. Most sections also have an associated block of data, known as the section contents. A section may be marked 'loadable', which means that the contents should be loaded into memory when the output file is run. A section with no contents may be 'allocatable', which means that an area in memory should be set aside for it, but nothing in particular is loaded there (in some cases this memory must be zeroed out). A section that is neither loadable nor allocatable typically contains some sort of debugging information.

Each loadable or allocatable output section has two addresses. The first is the 'VMA', or virtual memory address. This is the address the section has when the output file is run. The second is the 'LMA', or load memory address. This is the address at which the section is loaded. In most cases the two addresses are the same. An example of when they differ is a data section that is loaded into ROM and then copied into RAM when the program starts up (this technique is often used to initialize global variables in a ROM-based system). In this case, the ROM address is the LMA and the RAM address is the VMA.
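The ROM/RAM case can be sketched in a linker script roughly as follows. The region names, addresses, sizes and the symbols '_data_start', '_data_end' and '_data_load' are assumptions made up for this illustration; they are not taken from any particular board.

```ld
MEMORY
{
    rom (rx)  : ORIGIN = 0x00000000, LENGTH = 256K   /* assumed ROM */
    ram (rwx) : ORIGIN = 0x20000000, LENGTH = 64K    /* assumed RAM */
}

SECTIONS
{
    .text : { *(.text) } > rom

    /* VMA is taken from 'ram', LMA is taken from 'rom' */
    .data :
    {
        _data_start = .;         /* VMA of the first byte of .data */
        *(.data)
        _data_end = .;           /* VMA just past the last byte of .data */
    } > ram AT> rom

    _data_load = LOADADDR(.data);   /* LMA: where the initial values sit in ROM */
}
```

Startup code would then copy the bytes from '_data_load' to the range '_data_start' .. '_data_end' before the program uses any initialized global variable.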

You can view the sections in an object file by using the 'objdump' program with the '-h' option.

Every object file also has a list of symbols, known as the 'symbol table'. A symbol may be defined or undefined.

Each symbol has a name, and each defined symbol has an address, among other information. If you compile a C or C++ program into an object file, you get a defined symbol for every defined function and every global or static variable. Every function or global variable that is only referenced in the input file becomes an undefined symbol.

You can use the 'nm' program to see the symbols in an object file, or use the 'objdump' program with the '-t' option.

3 linker script format

A linker script is a text file.

You write a linker script as a series of commands. Each command is either a keyword, possibly followed by arguments, or an assignment to a symbol. You can separate commands with semicolons. Whitespace is generally ignored.

Strings such as file names or format names can normally be entered directly. If a file name contains a character such as a comma, which would otherwise serve to separate file names, you can put the file name in double quotes. There is no way to use a double quote character inside a file name.

You can include comments in linker scripts just as in C, delimited by '/*' and '*/'. As in C, comments are syntactically equivalent to whitespace.
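The format rules above can be seen together in a small fragment. This is only a sketch: the entry symbol, the output format, the file name and the two symbols are placeholders chosen for the example.

```ld
/* Comments use the C block-comment form. */
ENTRY(_start)                      /* keyword command with one argument */
OUTPUT_FORMAT(elf32-littlearm)     /* a format name typed in directly */
INPUT("odd,name.o")                /* quoted because the name contains a comma */
_stack_size = 0x400; _heap_size = 0x800;   /* two assignments, each ended by ';' */
```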

4 simple linker script example

Many linker scripts are fairly simple.

The simplest possible linker script has just one command: 'SECTIONS'. You use the 'SECTIONS' command to describe the memory layout of the output file.

'SECTIONS' is a very powerful command. Here we describe a simple use of it. Let's assume your program consists only of code, initialized data, and uninitialized data. These will be in the '.text', '.data' and '.bss' sections, respectively. Let's assume further that these are the only sections that appear in your input files.

For this example, let's say that the code should be loaded at address '0x10000' and that the data should start at address '0x8000000'. Here is a linker script that does this:

 
SECTIONS
{
    . = 0x10000;
    .text : { *(.text) }
    . = 0x8000000;
    .data : { *(.data) }
    .bss : { *(.bss) }
}

You write the 'SECTIONS' command as the keyword 'SECTIONS', followed by a series of symbol assignments and output section descriptions enclosed in curly braces.

In the example above, the first line inside the 'SECTIONS' command assigns a value to the special symbol '.', which is the location counter. If you do not specify the address of an output section in some other way (other ways are described later), the address is set from the current value of the location counter. The location counter is then incremented by the size of the output section. At the start of the 'SECTIONS' command, the location counter has the value '0'.

The second line defines an output section, '.text'. The colon is required syntax and can be ignored for now. Within the curly braces after the output section name, you list the names of the input sections that should be placed into this output section. The '*' is a wildcard that matches any file name. The expression '*(.text)' means all '.text' input sections in all input files.

Since the location counter is '0x10000' when the output section '.text' is defined, the linker sets the address of the '.text' section in the output file to '0x10000'.

The remaining lines define the '.data' and '.bss' sections in the output file. The linker places the '.data' output section at address '0x8000000'. After the linker places the '.data' output section, the value of the location counter is '0x8000000' plus the size of the '.data' output section. The effect is that the linker places the '.bss' output section immediately after the '.data' output section in memory.

The linker ensures that each output section has the required alignment, by increasing the location counter if necessary. In this example, the addresses specified for the '.text' and '.data' sections will probably satisfy any alignment constraints, but the linker may have to create a small gap between the '.data' and '.bss' sections.

That's it! This is a simple but complete linker script.
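As a variation on the same script, the sketch below records the location counter in symbols and makes the gap before '.bss' explicit. The symbol names '_etext' and '_end' and the 16-byte alignment are assumptions chosen for the illustration.

```ld
SECTIONS
{
    . = 0x10000;
    .text : { *(.text) }
    _etext = .;          /* location counter after .text: 0x10000 + size of .text */

    . = 0x8000000;
    .data : { *(.data) }
    . = ALIGN(16);       /* assumed alignment; advances '.' to a multiple of 16 */
    .bss : { *(.bss) }
    _end = .;            /* first address past .bss */
}
```

Because the alignment is requested explicitly, any padding between '.data' and '.bss' no longer depends on what the input sections happen to require.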

Every link is controlled by a 'linker script'. This script is written in the linker command language.

5 MEMORY command

The linker's default configuration permits allocation of all available memory. You can override this by using the 'MEMORY' command.

The 'MEMORY' command describes the location and size of blocks of memory on the target platform. You can use it to describe which memory regions may be used by the linker and which memory regions it must avoid. You can then assign sections to particular memory regions. The linker sets section addresses based on the memory regions and warns about regions that become too full. The linker does not shuffle sections around to fit them into the available regions.

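Older ld manuals state that a linker script may contain the MEMORY command at most once; newer binutils releases accept several MEMORY commands and treat all the blocks as if they had been defined in a single one. Either way, you can define as many memory blocks in the command as you want. The syntax is as follows: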

MEMORY
  {
    NAME [(ATTR)] : ORIGIN = ORIGIN, LENGTH = LEN
    ...
  }

NAME is the name used in the linker script to refer to the memory region. The region name has no meaning outside the linker script. Region names are stored in a separate namespace and do not conflict with symbol names, file names or section names. Each memory region must have a distinct name.

The ATTR string is an optional list of attributes that specify whether to use a particular memory region for an input section that is not explicitly mapped in the linker script. If you do not specify an output section for some input section, the linker creates an output section with the same name as the input section. If you define region attributes, the linker uses them to select the memory region for the output section it creates.

The ATTR string must consist only of the following characters:
'R'  Read-only section.
'W'  Read/write section.
'X'  Executable section.
'A'  Allocatable section.
'I'  Initialized section.
'L'  Same as 'I'.
'!'  Inverts the sense of any of the attributes that follow.

If an unmapped section matches any of the listed attributes other than '!', it is placed in the memory region. The '!' attribute reverses the test for the characters that follow it, so an unmapped section is placed in the memory region only if it does not match any of the attributes listed after the '!'.

ORIGIN is an expression for the start address of the memory region. The expression must evaluate to a constant before memory allocation is performed, which means that you cannot use any section-relative symbols. The keyword 'ORIGIN' may be abbreviated to 'org' or 'o' (but not, for example, 'ORG').

LEN is an expression for the size of the memory region in bytes. As with the ORIGIN expression, it must evaluate to a constant before memory allocation is performed. The keyword 'LENGTH' may be abbreviated to 'len' or 'l'.
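The abbreviations and constant expressions can be sketched as follows. The region names, addresses and sizes are invented for this illustration and do not describe a real device.

```ld
MEMORY
{
    /* full keywords, plain constants */
    boot (rx)  : ORIGIN = 0x08000000,       LENGTH = 16K

    /* abbreviated keywords, constant arithmetic with K/M suffixes */
    app  (rx)  : org    = 0x08000000 + 16K, len    = 1M - 16K

    /* shortest abbreviations */
    ram  (!rx) : o      = 0x20000000,       l      = 128K
}
```

Both expressions in the 'app' line are built only from literals, so they can be evaluated before any section has been placed.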

In the following example, we specify two memory regions available for allocation: one starting at '0' with a size of 256 kilobytes, and the other starting at '0x40000000' with a size of 4 megabytes. The linker places into the 'rom' memory region every section that is not explicitly mapped into a memory region and is either read-only or executable. It places other sections that are not explicitly mapped into the 'ram' memory region.

MEMORY
  {
    rom (rx)  : ORIGIN = 0, LENGTH = 256K
    ram (!rx) : org = 0x40000000, l = 4M
  }

Once you have defined a memory region, you can direct the linker to place specific output sections into that region by using the '>REGION' output section attribute. For example, if you have a memory region named 'mem', you would use '>mem' in the output section definition. If no address is specified for the output section, the linker sets the address to the next available address within the memory region. If the combined output sections directed to a memory region are too large for the region, the linker issues an error message.
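A minimal sketch of the '>mem' case described above might look like this. The origin and length of 'mem' are placeholder values.

```ld
MEMORY
{
    mem (rwx) : ORIGIN = 0x20000000, LENGTH = 64K   /* assumed address and size */
}

SECTIONS
{
    .data : { *(.data) } > mem   /* placed at the first free address in 'mem' */
    .bss  : { *(.bss) }  > mem   /* placed right after .data, subject to alignment */
}
```

Neither output section names an address, so both are laid out from the next available address in 'mem'.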

6 input sections and garbage collection

When the '--gc-sections' option is used on the linker command line, the linker may discard sections that it considers unused. If specific sections must be retained anyway, wrap their input section descriptions in the KEEP() keyword, as in KEEP(*(.text)) or KEEP(SORT(*)(.text)).
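As a small sketch of how KEEP() is typically combined with ordinary input section descriptions, the fragment below protects an interrupt vector table. The section name '.isr_vector' is only an assumed example of a section that no code references directly.

```ld
SECTIONS
{
    .text :
    {
        KEEP(*(.isr_vector))   /* retained even if nothing references it */
        *(.text .text.*)       /* still eligible for removal under --gc-sections */
    }
}
```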

7 a complete .lds file example

 
 
/*
    GNU linker script for STM32F405
*/

/* Specify the memory areas */
MEMORY
{
    FLASH (rx)      : ORIGIN = 0x08000000, LENGTH = 0x100000 /* entire flash, 1 MiB */
    FLASH_ISR (rx)  : ORIGIN = 0x08000000, LENGTH = 0x004000 /* sector 0, 16 KiB */
    FLASH_TEXT (rx) : ORIGIN = 0x08020000, LENGTH = 0x080000 /* sectors 5,6,7,8, 4*128KiB = 512 KiB (could increase it more) */
    CCMRAM (xrw)    : ORIGIN = 0x10000000, LENGTH = 0x010000 /* 64 KiB */
    RAM (xrw)       : ORIGIN = 0x20000000, LENGTH = 0x020000 /* 128 KiB */
}

/* top end of the stack */
_estack = ORIGIN(RAM) + LENGTH(RAM);

/* RAM extents for the garbage collector */
_ram_end = ORIGIN(RAM) + LENGTH(RAM);
_heap_end = 0x2001c000; /* tunable */

/* define output sections */
SECTIONS
{
    /* The startup code goes first into FLASH */
    .isr_vector :
    {
        . = ALIGN(4);
        KEEP(*(.isr_vector)) /* Startup code */

        . = ALIGN(4);
    } >FLASH_ISR

    /* The program code and other data goes into FLASH */
    .text :
    {
        . = ALIGN(4);
        *(.text)           /* .text sections (code) */
        *(.text*)          /* .text* sections (code) */
        *(.rodata)         /* .rodata sections (constants, strings, etc.) */
        *(.rodata*)        /* .rodata* sections (constants, strings, etc.) */
    /*  *(.glue_7)   */    /* glue arm to thumb code */
    /*  *(.glue_7t)  */    /* glue thumb to arm code */

        . = ALIGN(4);
        _etext = .;        /* define a global symbol at end of code */
        _sidata = _etext;  /* This is used by the startup in order to initialize the .data section */
    } >FLASH_TEXT

    /*
    .ARM.extab :
    {
        *(.ARM.extab* .gnu.linkonce.armextab.*)
    } >FLASH

    .ARM :
    {
        __exidx_start = .;
        *(.ARM.exidx*)
        __exidx_end = .;
    } >FLASH
    */

    /* This is the initialized data section
    The program executes knowing that the data is in the RAM
    but the loader puts the initial values in the FLASH (inidata).
    It is one task of the startup to copy the initial values from FLASH to RAM. */
    .data : AT ( _sidata )
    {
        . = ALIGN(4);
        _sdata = .;        /* create a global symbol at data start; used by startup code in order to initialise the .data section in RAM */
        _ram_start = .;    /* create a global symbol at ram start for garbage collector */
        *(.data)           /* .data sections */
        *(.data*)          /* .data* sections */

        . = ALIGN(4);
        _edata = .;        /* define a global symbol at data end; used by startup code in order to initialise the .data section in RAM */
    } >RAM

    /* Uninitialized data section */
    .bss :
    {
        . = ALIGN(4);
        _sbss = .;         /* define a global symbol at bss start; used by startup code */
        *(.bss)
        *(.bss*)
        *(COMMON)

        . = ALIGN(4);
        _ebss = .;         /* define a global symbol at bss end; used by startup code */
    } >RAM

    /* this is to define the start of the heap, and make sure we have a minimum size */
    .heap :
    {
        . = ALIGN(4);
        _heap_start = .;    /* define a global symbol at heap start */
    } >RAM

    /* this just checks there is enough RAM for the stack */
    .stack :
    {
        . = ALIGN(4);
    } >RAM

    /* Remove information from the standard libraries */
    /*
    /DISCARD/ :
    {
        libc.a ( * )
        libm.a ( * )
        libgcc.a ( * )
    }
    */

    .ARM.attributes 0 : { *(.ARM.attributes) }
}
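In the listing above, the '.heap' and '.stack' output sections reserve no space, so they cannot by themselves guarantee a minimum heap or stack size. One common way to make such a check effective is to advance the location counter by the required amounts, as in the sketch below. The symbols '_minimum_heap_size' and '_minimum_stack_size' and their values are hypothetical and not part of the script above.

```ld
MEMORY
{
    RAM (xrw) : ORIGIN = 0x20000000, LENGTH = 0x020000 /* 128 KiB */
}

_minimum_stack_size = 2K;    /* hypothetical value */
_minimum_heap_size  = 16K;   /* hypothetical value */

SECTIONS
{
    .heap :
    {
        . = ALIGN(4);
        _heap_start = .;              /* define a global symbol at heap start */
        . = . + _minimum_heap_size;   /* reserve the minimum heap */
    } >RAM

    .stack :
    {
        . = ALIGN(4);
        . = . + _minimum_stack_size;  /* reserve the minimum stack */
        . = ALIGN(4);
    } >RAM
}
```

With space reserved this way, the linker reports an error that the RAM region overflowed whenever the data, bss, minimum heap and minimum stack together no longer fit.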

Topics: Embedded system AUTOSAR