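I. Purpose

The earlier posts "GNU LD Script Command Language (Part 1)" and "GNU LD Script Command Language (Part 2)" covered the basics of GNU linker scripts, so you should already have a working understanding of SECTION, REGION, and the relationship between load addresses and execution addresses.

This post explains how symbols defined in a linker script can be accessed from C code.

II. Introduction

Analyzing a real-world problem

If you have read the STM32H750 bootloader code in RT-Thread Studio, you have almost certainly come across the following snippet: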


RT_WEAK void rt_hw_board_init()
{
    extern void hw_board_init(char *clock_src, int32_t clock_src_freq, int32_t clock_target_freq);

    /* Heap initialization */
#if defined(RT_USING_HEAP)
    rt_system_heap_init((void *) HEAP_BEGIN, (void *) HEAP_END);  //①
#endif

    hw_board_init(BSP_CLOCK_SOURCE, BSP_CLOCK_SOURCE_FREQ_MHZ, BSP_CLOCK_SYSTEM_FREQ_MHZ);

    /* Set the shell console output device */
#if defined(RT_USING_DEVICE) && defined(RT_USING_CONSOLE)
    rt_console_set_device(RT_CONSOLE_DEVICE_NAME);
#endif

    /* Board underlying hardware initialization */
#ifdef RT_USING_COMPONENTS_INIT
    rt_components_board_init();
#endif

}

① Note the two macros used on this line, HEAP_BEGIN and HEAP_END.

#define RAM_START              (0x24000000)
#define RAM_SIZE               (512)
#define RAM_END                (RAM_START + RAM_SIZE * 1024)

#if defined(__CC_ARM) || defined(__CLANG_ARM)
extern int Image$$RW_IRAM1$$ZI$$Limit;
#define HEAP_BEGIN      ((void *)&Image$$RW_IRAM1$$ZI$$Limit)
#elif __ICCARM__
#pragma section="CSTACK"
#define HEAP_BEGIN      (__segment_end("CSTACK"))
#else
extern int __bss_end;
#define HEAP_BEGIN      ((void *)&__bss_end)  //①
#endif

#define HEAP_END                       STM32_SRAM1_END  //②

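① HEAP_BEGIN evaluates to the address of the variable __bss_end. Note that __bss_end is only declared here, with extern; it is not defined in this file.

② HEAP_END is the end address of the STM32H750's SRAM, i.e. 0x24000000 + 512 * 1024 = 0x24080000. This is easy to understand and can be checked against the address map.

Searching the whole project turns up no definition of __bss_end in any source file. The symbol does, however, appear in the linker script and in the final map file.

In the linker script, the symbol __bss_end marks the end address of the bss section.

The map file shows that the bss section ends at 0x2400205c, so everything in SRAM from that address onward is used as the system heap.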
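The arithmetic behind that statement can be checked with a few lines of C. This is only a sketch: heap_size and the SRAM_* / BSS_END macro names are hypothetical and are not part of RT-Thread; the two addresses are the ones quoted in the text (0x2400205c from the map file, and 512 KiB of SRAM starting at 0x24000000).

```c
#include <assert.h>
#include <stdint.h>

/* Values quoted in the article; the macro names are illustrative only. */
#define SRAM_START   UINT32_C(0x24000000)
#define SRAM_SIZE_KB UINT32_C(512)
#define SRAM_END     (SRAM_START + SRAM_SIZE_KB * 1024u)
#define BSS_END      UINT32_C(0x2400205c)   /* address of __bss_end in the map file */

/* Hypothetical helper: number of bytes between heap_begin and heap_end. */
uint32_t heap_size(uint32_t heap_begin, uint32_t heap_end)
{
    assert(heap_begin <= heap_end);
    return heap_end - heap_begin;
}
```

With these numbers, heap_size(BSS_END, SRAM_END) gives the number of bytes that rt_system_heap_init would receive as the heap, under the assumption that HEAP_END equals the end of the 512 KiB SRAM.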

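So how does &__bss_end in the C code relate to __bss_end in the linker script?

Reference documentation

Source Code Reference (LD) (sourceware.org)

Variables can be defined in a linker script, and they can also be defined in source code (C/C++, Fortran).

A variable (or symbol) defined in a linker script has only an address and no value;

a variable defined in source code has both an address and a value.

Because C++ supports function overloading, the compiler mangles function names, so they look different in the compiled symbol table.

To keep the discussion uniform, we assume here that the compiler leaves symbol names unchanged.

Suppose the linker script defines the symbol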

_foo = 1000;

Then the symbol can be referenced from source code like this:

 extern int _foo;

Note that in source code we can only take the address of this symbol, because it has an address but no value.

(void *)&_foo;
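The same pattern can be tried on a desktop machine. The sketch below assumes a Linux host whose toolchain uses the default GNU ld linker script, which defines the symbols etext, edata and end (documented in `man 3 end`); no C source file defines them. The three accessor function names are hypothetical.

```c
#include <stdint.h>

/* Provided by the linker, not by any C file (assumption: Linux + GNU ld
 * default script). They are declared as char only so that & yields a byte
 * address; reading their "value" would be meaningless. */
extern char etext, edata, end;

/* End of the text segment. */
uintptr_t text_end_addr(void) { return (uintptr_t)&etext; }

/* End of the initialized data segment. */
uintptr_t data_end_addr(void) { return (uintptr_t)&edata; }

/* End of the bss segment, the host-side counterpart of __bss_end. */
uintptr_t bss_end_addr(void)  { return (uintptr_t)&end; }
```

Every access goes through &, exactly as with (void *)&_foo. The symbol end plays the same role on the host that __bss_end plays in the STM32 project.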

Key points

When a symbol is declared in a high level language such as C, two things happen. The first is that the compiler reserves enough space in the program’s memory to hold the value of the symbol. The second is that the compiler creates an entry in the program’s symbol table which holds the symbol’s address, i.e. the symbol table contains the address of the block of memory holding the symbol’s value.

When a program references a symbol the compiler generates code that first accesses the symbol table to find the address of the symbol’s memory block and then code to read the value from that memory block.

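In other words, declaring a symbol in a high-level language such as C does two things. First, the compiler reserves enough memory to hold the symbol's value. Second, it creates a new entry in the symbol table that stores the symbol's address. The symbol name leads to the address of the memory the symbol occupies, and that address leads to the value stored there.

Example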

int foo = 1000;  //①
foo = 1;  //②

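① defines a variable with an initial value of 1000; that is, the memory occupied by foo (four bytes) holds the number 1000.

② changes the value held in foo's memory to 1. Two operations are involved: first, the address that foo stands for is found through the symbol table; second, that address is used to reach the memory foo occupies, which is then set to 1.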
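The two steps in ② can be made explicit by going through a pointer. This is a minimal sketch: store_foo and load_foo are hypothetical helper names, and the generated code is equivalent to the plain assignment `foo = 1;`.

```c
/* A C definition: the compiler reserves storage for foo and records
 * the address of that storage under the name "foo". */
int foo = 1000;

/* Step 1: obtain the address foo stands for.
 * Step 2: write the new value into the memory at that address. */
void store_foo(int value)
{
    int *addr = &foo;
    *addr = value;
}

/* Reading follows the same path: address first, then the stored value. */
int load_foo(void)
{
    const int *addr = &foo;
    return *addr;
}
```

A symbol created in a linker script supports only step 1: &symbol is meaningful, but there is no reserved storage behind it for step 2 to use.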

Linker scripts symbol declarations, by contrast, create an entry in the symbol table but do not assign any memory to them. Thus they are an address without a value. So for example the linker script definition:

  foo = 1000;

creates an entry in the symbol table called ‘foo’ which holds the address of memory location 1000, but nothing special is stored at address 1000. This means that you cannot access the value of a linker script defined symbol - it has no value - all you can do is access the address of a linker script defined symbol.

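A symbol defined in a linker script only adds a new entry to the symbol table; no memory is allocated for it, so the symbol has an address but no value.

In the figure above, symbol A has only an address and no value.

Because the symbol has only an address, we can only take its address with &; we cannot read a value from it.

Suppose we want to copy the contents of the .ROM section into the .FLASH section from source code: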

/* Place these after the output sections .ROM and .FLASH have been defined. */
start_of_ROM   = ADDR(.ROM);
end_of_ROM     = ADDR(.ROM) + SIZEOF(.ROM);
start_of_FLASH = ADDR(.FLASH);

extern char start_of_ROM, end_of_ROM, start_of_FLASH; //①
memcpy (& start_of_FLASH, & start_of_ROM, & end_of_ROM - & start_of_ROM); //②

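①② Note the extern declaration and the address-of operator &.

What if the symbols are declared as arrays instead? An array name by itself stands for its address, so the following fragment does the same thing as the one above.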

extern char start_of_ROM[], end_of_ROM[], start_of_FLASH[];
memcpy (start_of_FLASH, start_of_ROM, end_of_ROM - start_of_ROM);
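The copy can be rehearsed on a host machine by replacing the linker-defined symbols with ordinary arrays. In this sketch copy_region, demo_copy, fake_rom and fake_flash are all hypothetical names; on a real target the three arguments would be the linker script symbols declared as `extern char name[];`.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Copies the bytes in [src_start, src_end) to dst and returns the count.
 * The length is computed from two addresses, just like
 * end_of_ROM - start_of_ROM in the fragment above. */
size_t copy_region(char *dst, const char *src_start, const char *src_end)
{
    assert(src_start <= src_end);
    size_t len = (size_t)(src_end - src_start);
    memcpy(dst, src_start, len);
    return len;
}

/* Host-side stand-ins for the two sections. */
static const char fake_rom[] = { 'b', 'o', 'o', 't', '!' };
static char fake_flash[sizeof fake_rom];

/* Returns 1 if fake_flash ends up byte-identical to fake_rom. */
int demo_copy(void)
{
    size_t n = copy_region(fake_flash, fake_rom, fake_rom + sizeof fake_rom);
    return n == sizeof fake_rom && memcmp(fake_flash, fake_rom, n) == 0;
}
```

With the array-style declarations, the target-side call would read copy_region(start_of_FLASH, start_of_ROM, end_of_ROM), with no & needed.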

And how is this done in assembly?

/* start address for the initialization values of the .data section. 
defined in linker script */
.word  _sidata
/* start address for the .data section. defined in linker script */  
.word  _sdata
/* end address for the .data section. defined in linker script */
.word  _edata
/* start address for the .bss section. defined in linker script */
.word  _sbss
/* end address for the .bss section. defined in linker script */
.word  _ebss
/* stack used for SystemInit_ExtMemCtl; always internal RAM used */

/**
 * @brief  This is the code that gets called when the processor first
 *          starts execution following a reset event. Only the absolutely
 *          necessary set is performed, after which the application
 *          supplied main() routine is called. 
 * @param  None
 * @retval : None
*/

    .section  .text.Reset_Handler
  .weak  Reset_Handler
  .type  Reset_Handler, %function
Reset_Handler:  
  ldr   sp, =_estack      /* set stack pointer */

/* Copy the data segment initializers from flash to SRAM */  
  movs  r1, #0
  b  LoopCopyDataInit

CopyDataInit:
  ldr  r3, =_sidata
  ldr  r3, [r3, r1]
  str  r3, [r0, r1]
  adds  r1, r1, #4
    
LoopCopyDataInit:
  ldr  r0, =_sdata  
  ldr  r3, =_edata
  adds  r2, r0, r1
  cmp  r2, r3
  bcc  CopyDataInit
  ldr  r2, =_sbss
  b  LoopFillZerobss
/* Zero fill the bss segment. */  
FillZerobss:
  movs  r3, #0
  str  r3, [r2], #4
    
LoopFillZerobss:
  ldr  r3, = _ebss
  cmp  r2, r3
  bcc  FillZerobss

/* Call the clock system initialization function.*/
  bl  SystemInit   
/* Call static constructors */
/* bl __libc_init_array */
/* Call the application's entry point.*/
  bl  entry
  bx  lr    
.size  Reset_Handler, .-Reset_Handler

Note that _sidata, _sdata, _edata and the other symbols used here are all defined in the linker script.
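For comparison, the two loops in Reset_Handler (copy .data, then zero .bss) can be written in C. This is a rough equivalent rather than the code the startup file actually uses: init_data_and_bss and demo_startup are hypothetical names, and the static arrays stand in for the memory ranges that the linker script symbols delimit on a real target.

```c
#include <stdint.h>

/* sidata: load address of the .data initializers (in flash)
 * sdata/edata: start and end of .data in RAM
 * sbss/ebss:   start and end of .bss in RAM */
void init_data_and_bss(const uint32_t *sidata,
                       uint32_t *sdata, uint32_t *edata,
                       uint32_t *sbss, uint32_t *ebss)
{
    /* Copy the .data initializers word by word, as CopyDataInit does. */
    while (sdata < edata) {
        *sdata++ = *sidata++;
    }
    /* Zero-fill .bss word by word, as FillZerobss does. */
    while (sbss < ebss) {
        *sbss++ = 0u;
    }
}

/* Host-side stand-ins for the flash image and the two RAM regions. */
static const uint32_t fake_image[3] = { 0x11111111u, 0x22222222u, 0x33333333u };
static uint32_t fake_data[3];
static uint32_t fake_bss[2] = { 0xdeadbeefu, 0xdeadbeefu };

/* Returns 1 if .data was copied and .bss was cleared. */
int demo_startup(void)
{
    init_data_and_bss(fake_image, fake_data, fake_data + 3,
                      fake_bss, fake_bss + 2);
    return fake_data[0] == 0x11111111u && fake_data[1] == 0x22222222u
        && fake_data[2] == 0x33333333u
        && fake_bss[0] == 0u && fake_bss[1] == 0u;
}
```

On the target, assuming the symbols are declared as `extern uint32_t _sidata[], _sdata[], _edata[], _sbss[], _ebss[];`, the call would be init_data_and_bss(_sidata, _sdata, _edata, _sbss, _ebss). As in the assembly, only the addresses of the symbols are ever used.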