Silly Invaders (#1) - Compiling and debugging code for Tiva

Compiling and debugging code for Tiva
Hardware Abstraction Layer
Debugging, heap, and display
Analog Digital Converter and Timers
Playing Nokia Tunes
Random Number Generator, Rendering Engine, and the Game
A real-time operating system

Intro

I have recently been playing with microcontrollers a lot. Among other things, I have worked through some of the labs from this course on edX. The material does not use much high-level code, so it gives a good overview of how the software interacts with the hardware. There are some "black box" components in there, though. For me, the best way to learn something well has always been building things from "first principles." I find black boxes frustrating. This post describes the first step toward making an Alien Invaders game from "scratch."

Compiling for Tiva

First, we need to be able to compile C code for Tiva. To this end, we will use GCC as a cross-compiler, so make sure you have the arm-none-eabi-gcc command available on your system. We will use the following flags to build Tiva-compatible binaries:

-mcpu=cortex-m4 - produce the code for ARM Cortex-M4 CPU
-mfpu=fpv4-sp-d16 - FPv4 single-precision floating point with the register bank seen by the software as 16 double-words
-mfloat-abi=hard - generate floating point instructions and use FPU-specific calling conventions
-mthumb - use the Thumb instruction set
-std=c11 - use the C11 standard
-O0 - don't perform any optimizations
-Wall and -pedantic - enable a broad set of warnings and flag code that strays from strict ISO C
-ffunction-sections and -fdata-sections - place every function and data item in a separate section in the resulting object file; it allows the optimizations removing all unused code and data to be performed at link-time
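
To make the last pair of flags more concrete, here is a minimal sketch; all the identifiers in it are hypothetical. With -ffunction-sections and -fdata-sections, GCC names the input sections after the objects they hold (.text.<name>, .data.<name>, .rodata.<name>, and so on), so a linker run with --gc-sections can drop whatever nothing references.

```c
#include <stdint.h>

/* Hypothetical names, for illustration only. */

/* Emitted into .data.blink_period_ms with -fdata-sections. */
uint32_t blink_period_ms = 500;

/* Emitted into .rodata.unused_table; nothing references it, so
   --gc-sections is free to discard it at link time. */
const uint8_t unused_table[4] = {1, 2, 3, 4};

/* Emitted into .text.half_period with -ffunction-sections. */
uint32_t half_period(void)
{
  return blink_period_ms / 2;
}

/* Emitted into .text.never_called; also a candidate for removal. */
uint32_t never_called(void)
{
  return 0xdeadbeef;
}
```

Running arm-none-eabi-objdump -h on the resulting object file should list the per-object sections.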

Object files

To generate a proper binary image, we need to have some basic understanding of object files produced by the compiler. In short, they consist of sections containing various pieces of compiled code and the corresponding data. These sections may be loadable, meaning that the contents of the section should be read from the object file and stored in memory. They may also be just allocatable, meaning that there is nothing to be loaded, but a chunk of memory needs to be put aside for them nonetheless. There are multiple sections in a typical ELF object file, but we need to know only four of them:

.text - contains the program code
.rodata - contains the constants (read-only data)
.data - contains the read-write data
.bss - contains statically allocated variables (initialized to zero)
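
As a rough guide to how the compiler sorts objects into these four sections, consider the sketch below. The identifiers are hypothetical, and the section names in the comments assume a typical non-PIC GCC build, so treat them as a guide rather than a rule.

```c
#include <stddef.h>

/* Hypothetical objects; the comments name the section each one
   would typically land in. */

int counter_init = 12;                /* .data: read-write, non-zero initial value */
int counter_zero;                     /* .bss (or COMMON with -fcommon) */
static int scratch[20];               /* .bss: only its size is stored in the file */
const int weights[4] = {7, 4, 2, 1};  /* .rodata: read-only data */
const char *label = "fox";            /* pointer in .data, string literal in .rodata */

/* The machine code of this function goes to .text. */
int weighted_sum(void)
{
  int sum = counter_init + counter_zero;
  for (size_t i = 0; i < sizeof(weights) / sizeof(weights[0]); i++)
    sum += weights[i] + scratch[i];
  return sum;
}
```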

Let's consider the following code:

#include <stdio.h>

int a = 12;
int b;
const char *c = "The quick brown fox jumps over the lazy dog.";
const char * const d = "The quick brown fox jumps over the lazy dog.";
int e[20];
const int f[4] = {7, 4, 2, 1};

int main(int argc, char **argv)
{
  printf("Hello world!\n");
  return 0;
}

After compiling it, we end up with an object file containing the following sections (most have been omitted for clarity):

]==> objdump -h test

test:     file format elf64-x86-64

Sections:
Idx Name          Size      VMA               LMA               File off  Algn
 13 .text         00000192  00000000004003f0  00000000004003f0  000003f0  2**4
                  CONTENTS, ALLOC, LOAD, READONLY, CODE
 15 .rodata       0000006d  0000000000400590  0000000000400590  00000590  2**4
                  CONTENTS, ALLOC, LOAD, READONLY, DATA
 24 .data         00000020  0000000000600948  0000000000600948  00000948  2**3
                  CONTENTS, ALLOC, LOAD, DATA
 25 .bss          00000090  0000000000600980  0000000000600980  00000968  2**5
                  ALLOC

As you can see, every section has two addresses:

VMA (virtual memory address) - This is the address at which the code expects to find the section when it runs.
LMA (load memory address) - This is the address at which the section is loaded.

These two addresses are in most cases the same, except in the situation that we care about here: an embedded system. In our binary image, we need to put the .data section in ROM because it contains initialized variables whose values would otherwise be lost on reset. The section's LMA, therefore, must point to a location in ROM. However, this data is not constant, so its final position at the program's runtime needs to be in RAM. Therefore, the VMA must point to a location in RAM. We will see an example later.

Tiva's memory layout

Tiva has 256K of ROM (range: 0x00000000-0x0003ffff) and 32K of RAM (range: 0x20000000-0x20007fff). See Table 2-4 on page 90 of the data sheet for details. The NVIC (interrupt) table needs to be located at address 0x00000000 (section 2.5 of the data sheet). We will create this table in C, put it in a separate object file section, and fill it with weak aliases of the default handler function. This approach lets the user redefine the interrupt handlers without having to edit the start-up code: the linker resolves the handler addresses to strong symbols if any are present.
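
The sizes translate into address ranges with a bit of arithmetic: 256K is 0x40000 bytes and 32K is 0x8000 bytes. A minimal sketch of that calculation follows; the macro and function names are hypothetical and not part of any vendor header.

```c
#include <stdbool.h>
#include <stdint.h>

/* Sizes and origins taken from the text; identifiers are hypothetical. */
#define FLASH_ORIGIN  0x00000000u
#define FLASH_LENGTH  0x00040000u  /* 256K */
#define RAM_ORIGIN    0x20000000u
#define RAM_LENGTH    0x00008000u  /* 32K */

/* Last valid byte address of a region. */
uint32_t region_last(uint32_t origin, uint32_t length)
{
  return origin + length - 1u;
}

/* True if addr falls inside the on-chip flash. */
bool in_flash(uint32_t addr)
{
  return addr - FLASH_ORIGIN < FLASH_LENGTH;
}

/* True if addr falls inside the on-chip RAM. */
bool in_ram(uint32_t addr)
{
  return addr >= RAM_ORIGIN && addr - RAM_ORIGIN < RAM_LENGTH;
}
```

Here region_last(RAM_ORIGIN, RAM_LENGTH) gives 0x20007fff, the last byte of RAM.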

So, we define a dummy interrupt handler that loops indefinitely:

void __int_handler(void)
{
  while(1);
}

and then create a bunch of weak aliases to this function:

#define DEFINE_HANDLER(NAME) void NAME ## _handler() __attribute__ ((weak, alias ("__int_handler")))

DEFINE_HANDLER(nmi);
DEFINE_HANDLER(hard_fault);
DEFINE_HANDLER(mman);
DEFINE_HANDLER(bus_fault);
DEFINE_HANDLER(usage_fault);
...
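
To override one of these weak aliases, the application only needs to provide an ordinary (strong) function with the same name in its own source file. The sketch below assumes that the start-up code declares an alias called systick_handler; the list above is truncated, so that name is a guess based on the NAME ## _handler pattern, and the counter is purely illustrative.

```c
#include <stdint.h>

/* Hypothetical application file. It assumes the start-up code declares
   a weak alias named systick_handler. */

static volatile uint32_t tick_count;

/* A strong definition: at link time it takes precedence over the weak
   alias that points to the default handler. */
void systick_handler(void)
{
  tick_count++;
}

/* Number of ticks seen so far. */
uint32_t ticks(void)
{
  return tick_count;
}
```

No change to the start-up file is needed; linking this object file alongside it should be enough.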

Finally, we construct the nvic_table, place it in the .nvic section in the resulting object file and fill it with handler addresses:

#define HANDLER(NAME) NAME ## _handler
void (*nvic_table[])(void) __attribute__ ((section (".nvic"))) = {
  HANDLER(reset),
  HANDLER(nmi),
  HANDLER(hard_fault),
  HANDLER(mman),
...
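
Each entry is a 32-bit address, and on Cortex-M the very first word of the vector table holds the initial stack pointer rather than a handler. Since nvic_table starts with the reset handler, that word has to be emitted in front of it. The sketch below shows the resulting byte offsets; the enum identifiers are hypothetical, while the numbering is the standard Cortex-M one.

```c
#include <stdint.h>

/* Standard Cortex-M exception numbers; identifiers are hypothetical. */
enum exception_number {
  EXC_INITIAL_SP  = 0,  /* not a handler: initial stack pointer value */
  EXC_RESET       = 1,
  EXC_NMI         = 2,
  EXC_HARD_FAULT  = 3,
  EXC_MMAN        = 4,
  EXC_BUS_FAULT   = 5,
  EXC_USAGE_FAULT = 6
};

/* Byte offset of an entry from the start of the vector table. */
uint32_t vector_offset(enum exception_number n)
{
  return 4u * (uint32_t)n;
}
```

With the table at 0x00000000, the reset handler address therefore sits at 0x00000004.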

Linker scripts

We will use linker scripts to set the VMAs and the LMAs to the values we like and to create some symbols whose addresses we can play with in the C code. We first need to define the memory layout:

MEMORY
{
  FLASH (rx)  : ORIGIN = 0x00000000, LENGTH = 0x00040000
  RAM   (rwx) : ORIGIN = 0x20000000, LENGTH = 0x00008000
}

We then need to tell the linker where to put each section in the final executable:

 1SECTIONS
 2{
 3  .text :
 4  {
 5    LONG(0x20007fff)
 6    KEEP(*(.nvic))
 7    *(.text*)
 8    *(.rodata*)
 9     __text_end_vma = .;
10  } > FLASH
11
12  .data :
13  {
14    __data_start_vma = .;
15    *(.data*)
16    *(vtable)
17    __data_end_vma = .;
18  } > RAM AT > FLASH
19
20  .bss :
21  {
22    __bss_start_vma = .;
23    *(.bss*)
24    *(COMMON)
25    __bss_end_vma = .;
26  } > RAM
27}

We start with the .text section and begin it with 0x20007fff. It is the initial value of the stack pointer (see the data sheet). Since the stack grows towards lower addresses, we initialize the top of the stack to the last byte of available RAM.
We then place the .nvic section. The KEEP command stops the linker from discarding this section when unused sections are garbage-collected at link time, even though nothing seems to reference it. The asterisk in *(.nvic) is a wildcard for an input object file name. Whatever is in the parentheses is a pattern for a section name.
We put all the code and read-only data from all of the input files in this section as well.
We define a new symbol, __text_end_vma, and set its address to the current VMA (the dot means the current VMA).
We put this section in FLASH: > FLASH at line 10.
We combine the .data* sections from all input files into one section and store it right after the .text section in FLASH. We set the VMAs to be in RAM: > RAM AT > FLASH.
Apparently TivaWare changes the value of the VTABLE register and needs to have the NVIC table in RAM, so we oblige: *(vtable).
We put .bss in RAM after .data.
We use asterisks in section names (e.g. .bss*) because the -ffunction-sections and -fdata-sections flags cause the compiler to generate a separate section for each function and data item.

Edit 02.04.2016: The initial stack pointer needs to be aligned to 8 bytes for the passing of 64-bit variadic arguments to work. Therefore, the first four bytes of the .text section should be: LONG(0x20007ff8). See this post for details.
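
The corrected value is simply the last RAM address rounded down to a multiple of 8. A minimal sketch of that rounding, with a hypothetical helper name:

```c
#include <stdint.h>

/* Round addr down to a multiple of align (align must be a power of two). */
uint32_t align_down(uint32_t addr, uint32_t align)
{
  return addr & ~(align - 1u);
}
```

For example, align_down(0x20007fff, 8) yields 0x20007ff8.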

See the binutils documentation for more details.

Start-up code

On the system start-up, we need to copy the contents of the .data section from FLASH to RAM ourselves before we can run any code. We do it by defining a reset handler:

/* Symbols defined by the linker script; only their addresses matter. */
extern unsigned long __text_end_vma;
extern unsigned long __data_start_vma;
extern unsigned long __data_end_vma;
extern unsigned long __bss_start_vma;
extern unsigned long __bss_end_vma;

extern void main();

void __rst_handler()
{
  /* The load image of .data starts right after .text in FLASH. */
  unsigned long *src = &__text_end_vma;
  unsigned long *dst = &__data_start_vma;

  /* Copy the initial values of .data from FLASH to RAM. */
  while(dst < &__data_end_vma) *dst++ = *src++;

  /* Zero the .bss section. */
  dst = &__bss_start_vma;
  while(dst < &__bss_end_vma) *dst++ = 0;

  main();
}

/* Weak, so that the user can supply a custom reset handler. */
void reset_handler() __attribute__ ((weak, alias ("__rst_handler")));

We first declare external symbols. They are put in the symbol table by the linker. The reset handler then copies the .data section from FLASH to RAM, zeroes the .bss section, and calls main.
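
The two loops can be tried out on a desktop machine by passing ordinary pointers in place of the linker-defined symbols. The following is a host-side model only, with hypothetical names; it is not the code that runs on the board.

```c
#include <stdint.h>

/* Host-side model of the reset handler's two loops. The pointers stand
   in for the linker-defined symbols; all names are hypothetical. */
void init_ram(const uint32_t *load_addr,  /* where .data sits in FLASH (LMA) */
              uint32_t *data_start,       /* where .data must end up (VMA) */
              uint32_t *data_end,
              uint32_t *bss_start,
              uint32_t *bss_end)
{
  /* Copy the initial values of .data word by word. */
  for (uint32_t *dst = data_start; dst < data_end; )
    *dst++ = *load_addr++;

  /* Clear .bss so that zero-initialized variables really start at zero. */
  for (uint32_t *dst = bss_start; dst < bss_end; )
    *dst++ = 0;
}
```

Calling it with one array playing the role of FLASH and another playing the role of RAM shows the initial values arriving in the first part of "RAM" and zeroes in the rest.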

A test

Let's put everything together. I wrote a short program that blinks an LED using the SysTick interrupt. The color of the LED depends on the switch pressed. The files are here:

Compile and link:

]==> arm-none-eabi-gcc -mcpu=cortex-m4 -mfpu=fpv4-sp-d16 -mfloat-abi=hard -mthumb -std=c11 -O0 -Wall -pedantic -ffunction-sections -fdata-sections -c main.c -g
]==> arm-none-eabi-gcc -mcpu=cortex-m4 -mfpu=fpv4-sp-d16 -mfloat-abi=hard -mthumb -std=c11 -O0 -Wall -pedantic -ffunction-sections -fdata-sections -c TM4C_startup.c -g
]==> arm-none-eabi-ld -T TM4C.ld TM4C_startup.o main.o -o main --gc-sections

Let's see what we have in the resulting binary:

]==> arm-none-eabi-objdump -h main

main:     file format elf32-littlearm

Sections:
Idx Name          Size      VMA       LMA       File off  Algn
  0 .text         00000484  00000000  00000000  00010000  2**2
                  CONTENTS, ALLOC, LOAD, READONLY, CODE
  1 .data         00000004  20000000  00000484  00020000  2**2
                  CONTENTS, ALLOC, LOAD, DATA
  2 .bss          00000004  20000004  00000488  00020004  2**2
                  ALLOC

The .text section starts at 0x00000000 for both the VMA and the LMA. The .data section starts at 0x00000484 LMA (in FLASH), but the code expects it to start at 0x20000000 VMA (in RAM). The symbol addresses seem to match the expectations as well:

]==> arm-none-eabi-objdump -t main | grep vma
20000004 g       .bss   00000000 __bss_start_vma
00000484 g       .text  00000000 __text_end_vma
20000008 g       .bss   00000000 __bss_end_vma
20000000 g       .data  00000000 __data_start_vma
20000004 g       .data  00000000 __data_end_vma
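
These symbol values are enough to work out how much the reset handler has to copy and zero, and how much FLASH the image occupies. A small sketch of the arithmetic, with the values copied from the listing above and hypothetical macro and function names:

```c
#include <stdint.h>

/* Symbol values copied from the objdump output; names are hypothetical. */
#define TEXT_END_VMA    0x00000484u
#define DATA_START_VMA  0x20000000u
#define DATA_END_VMA    0x20000004u
#define BSS_START_VMA   0x20000004u
#define BSS_END_VMA     0x20000008u

/* Bytes the reset handler copies from FLASH to RAM. */
uint32_t data_bytes(void) { return DATA_END_VMA - DATA_START_VMA; }

/* Bytes the reset handler zeroes. */
uint32_t bss_bytes(void) { return BSS_END_VMA - BSS_START_VMA; }

/* FLASH footprint: .text followed by the load image of .data. */
uint32_t flash_bytes(void) { return TEXT_END_VMA + data_bytes(); }
```

Here flash_bytes() comes to 0x488, that is 1160 bytes; .bss takes no space in FLASH.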

We now need to create a raw binary file that we can flash to the board. The arm-none-eabi-objcopy utility can take the relevant sections and put them in an output file aligned according to their LMAs.

]==> arm-none-eabi-objcopy -O binary main main.bin
]==> stat --format=%s main.bin
1160

The total size of the raw binary matches the sum of the sizes of the .text and .data sections (0x488 == 1160). Let's flash it and see if it works!

]==> lm4flash main.bin
Found ICDI device with serial: xxxxxxxx
ICDI version: 9270

Get the full code at GitHub.

2016-03-28
