

It's kind of silly, but I have wanted to build a drone that can evade being hit by a lightsaber ever since I first watched this scene in "Attack of the Clones":

Star Wars - Jedi Younglings

To gain an understanding of the ecosystem, I decided to buy parts semi-randomly on Amazon, build something that can just fly, and then iteratively improve on this basic design. People can do astounding things with drones these days. My ultimate goal is to be able to build stuff like that.

The hardware

Here's the list of things I have bought:

  • A carbon fiber quadrotor frame. (Amazon)
  • Brushless rotors. I cannot find the exact model anymore, but they were similar to the ones in the link. (Amazon)
  • Electronic Speed Controllers. (Amazon)
  • Clockwise and counter-clockwise propellers. You only need two of each kind, but they break easily if you do tuning in a confined space, so it's wise to buy more of them upfront. (Amazon)
  • A battery. The one in the link is large enough and still fits inside the frame. I wanted it inside so that the electronics stay easily accessible on the top - I will likely want to change them quite a bit later. (Amazon)
  • A power connection board. You can do without it, but there are quite a lot of connections, so soldering wires together and wrapping the joints in insulation tape is painful and looks ugly. (Amazon)
  • An autopilot board. I bought a cheap CC3D because I ultimately want to ditch it and build one myself. (Amazon)
  • A Raspberry Pi. I want the drone to fly by itself, so I did not buy any radio controller - it will be the task of a computer to do the steering. I used an RPi model 2 because I had one readily available at home. However, these days model 3 is cheaper, so it's probably a much better idea to buy that one. (Amazon)
  • A WiFi dongle. I want this first version to be controllable from a web browser via WiFi. A later version will send some telemetry and receive high-level commands via GSM. (Amazon)

Apart from all that, I used some electronics components to power things up and connect them. I had most of them at home, but I will put some links below nonetheless. You'll likely need these:

  • A prototyping board. It's nice to solder things together to something stable so that the components don't fly around attached to loose wires. The one in the following link should do. (Farnell)
  • A 5V voltage regulator. You will need one to power the Raspberry Pi up. The documentation of CC3D says that the board puts the unregulated output from the ESCs on the output of its serial ports. This output happens to be at 5V, so I initially used that for powering the Pi. Unfortunately, it needs to draw at least around 600 mA of current to work, so the ESC that powered the Raspberry got extremely hot and the motor it was controlling lagged behind the others. Make sure you buy a regulator with as stable output as possible. Some of the cheap ones will make the Raspberry reboot in the middle of the flight due to voltage oscillations. This, in turn, will make the autopilot think it lost the connection to the radio controller and it will go into failsafe mode. A TO-220-compatible heatsink for that regulator is not a bad idea either. You will also need two capacitors. I used 10 μF and 22 μF. Alternatively, you can get yourself a DC to DC converter, in which case you won't need capacitors or heatsinks, and it should be much easier on the battery. (Farnell) (Farnell) or (Farnell)
  • An NPN Transistor and two 10 kΩ resistors for a logic inverter with voltage level adjustment. (Farnell)
  • Header pins and jumper cables so that you can connect things nicely. (Amazon, Amazon)

I used some extra components, even though they are not necessary to make things work. I am not exactly sure where this project will take me, so it seemed prudent to plan far ahead.

  • A 3.3V voltage regulator. I will likely want to power a 3.3V-based microcontroller to act as an autopilot. It needs an extra 10μF capacitor. (Farnell)
  • Four NPN Transistors and eight 10 kΩ resistors for bi-directional voltage level adjusting.

Wiring things up

Wiring things up is not hugely complicated. I put the power connection board on the bottom side of the drone together with all the cables powering the ESCs. The ESC control cables and the power for the Raspberry Pi go from the bottom to the top in two bunches in the middle of each side of the drone. All the other electronics connections are on the top. The battery is inside the drone frame.

Drone Wiring

Pretty much the only thing to pay attention to at this stage is making sure that all the rotors are placed in the right positions and that they connect to the ESCs such that they spin in the right directions. Here's a great video on that. The image below was produced by the LibrePilot configuration wizard.

Drone Rotor Topology - produced by LibrePilot

You cannot connect the communication ports of the Raspberry Pi to the CC3D directly because there is a difference in the voltage levels at which the ports operate. The Pi's GPIO works at 3.3V and cannot tolerate 5V. The autopilot should, in principle, work at 3.3V with tolerance to 5V. However, in practice, I found that only 5V-based logic works. This is why I needed to build two voltage level converters out of transistors. I want to use them to send commands to and receive telemetry from the FlexiPort of the autopilot.

Voltage Leveling Circuit

As shown in the rotor topology diagram, the computer controls the autopilot using the S-Bus protocol. The protocol transmits data over UART with one quirk: the signal is logically inverted (every 1 in UART is a 0 in S-Bus and vice versa). We also need to take care of the voltage level difference in the high states. The best solution here is again to build a circuit that does the inversion. It is one half of the voltage leveling circuit:

Inverter Circuit

Here's what the resulting board together with the voltage regulators looks like in my case:

Complete board

Control and Telemetry

A massive bummer of the Raspberry Pi for me right now is that it only has one hardware UART controller. I will need many: at least one more to read telemetry data from the CC3D and, later on, one extra to talk to my WaveShare GSM modem. You can bitbang UART on GPIOs, and some external kernel modules out there can do that. The problem is that they rely on the kernel's hrtimers and are not reliable enough at higher speeds, especially if the system is under load. I use one for now at a low speed, but I am working on my own implementation that uses hardware timers to flip the GPIO states reliably and on time. The CC3D and the Raspberry Pi can talk telemetry over such a simulated serial port using the UAVTalk protocol. The LibrePilot source code provides Python bindings for that.

I used the one hardware UART port that the Pi has for the control link because it needs to operate at a high and non-standard speed. On RaspberryPis 1 and 2 this port is used as a Linux console output by default, so you will need to disable that in /boot/config.txt. RaspberryPis 0 and 3 use the hardware UART to control Bluetooth. This behavior may be disabled by installing the pi3-disable-bt device tree overlay. All the necessary details are here. Once you're done with that, you can connect pin 14 of the Pi to the input of the inverter and the yellow (orange) cable of the CC3D's main port to its output.

After doing all that, it's a matter of opening the serial port in the right mode and sending the protocol byte stream down the pipe. S-Bus expects a baud rate of 100000, one even parity bit, and two stop bits. Here's how to open the port in this mode using Python's pyserial:

import serial

port = serial.Serial('/dev/ttyAMA0', baudrate=100000,
                     parity=serial.PARITY_EVEN,
                     stopbits=serial.STOPBITS_TWO)

I found an excellent description of the S-Bus data frames here. Each frame is 25 bytes long and consists of: a start byte, 16 11-bit channels packed in the next 22 bytes, a byte containing flags and extra binary channels, and, finally, a stop byte. The controller is supposed to send a frame every 7ms, but after reading the code, I found that the LibrePilot firmware is fine as long as it receives a frame at least ten times per second (at least more often than every 102.4ms to be precise). You can see the code of my encoder here.
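To make the frame layout concrete, here is a minimal C sketch of the packing step. It is not the encoder linked above. The function name sbus_pack is made up for illustration, and the start byte (0x0f), the stop byte (0x00), and the least-significant-bit-first channel packing are assumptions based on how the protocol is commonly documented, so check them against the description you trust.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

#define SBUS_FRAME_LEN  25
#define SBUS_NUM_CH     16
#define SBUS_START_BYTE 0x0f  /* assumed value of the start byte */
#define SBUS_STOP_BYTE  0x00  /* assumed value of the stop byte */

/* Pack 16 channel values (0..2047) and a flags byte into a 25-byte frame. */
void sbus_pack(const uint16_t channels[SBUS_NUM_CH], uint8_t flags,
               uint8_t frame[SBUS_FRAME_LEN])
{
  assert(channels && frame);
  memset(frame, 0, SBUS_FRAME_LEN);
  frame[0] = SBUS_START_BYTE;

  /* Channels are laid out back to back, least significant bit first. */
  for(int ch = 0; ch < SBUS_NUM_CH; ++ch) {
    uint32_t value   = channels[ch] & 0x07ff; /* keep the low 11 bits */
    int      bit     = ch * 11;               /* bit offset in the payload */
    int      byte    = 1 + bit / 8;           /* the payload starts at frame[1] */
    uint32_t shifted = value << (bit % 8);    /* spans at most three bytes */

    frame[byte]     |= shifted & 0xff;
    frame[byte + 1] |= (shifted >> 8) & 0xff;
    if(shifted >> 16)                         /* third byte only when needed */
      frame[byte + 2] |= (shifted >> 16) & 0xff;
  }

  frame[23] = flags;
  frame[24] = SBUS_STOP_BYTE;
}
```

The resulting 25 bytes would then be written to the serial port opened above, at least ten times per second.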

I quickly got tired of putting these numbers in a terminal window, so I wrote a trivial controller that works in a browser and uses a bunch of sliders. The code is on GitHub.

Controller interface

Open/Libre Pilot

There seems to have been some disagreement in the Open/Libre Pilot community, and the project does not look like it's in great shape. I needed to make a bunch of trivial changes to the GCS source code to make it compile on my Debian Testing laptop. Furthermore, the firmware does not build with the cross-compiler toolchain they supply due to some GCC configuration issues. I managed to build the firmware using the stock Debian cross-compiler for ARM and modifying the Makefile so that it does not use the -Werror flag. The firmware code has plenty of unused variables that make the build process fail with this setting turned on. After building everything, the GCS crashes every other time you try to power cycle the board. As far as the CC3D boards themselves are concerned, I have two of them, and only one works in a more or less stable way. The other one does not load the configuration correctly or hangs in 3 out of every 5 boots.

I used the config wizard at the beginning but found it confusing, so I later decided to do the configuration manually. Here's a list of what I did screen-by-screen:

  • Hardware:
    • Receiver Port: Disabled+OneShot
    • Flexi Port: Telemetry
    • Main Port: S.Bus
    • USB HID Port: USBTelemetry
    • Telemetry Speed: 9600 - faster than that is not reliable with current implementations of software UART for RaspberryPi.
  • Vehicle - Multirotor:
    • Airframe Type: Quad X
    • Assigned the rotors to the appropriate channels
  • Input:
    • Remote Control Input:
      • All channels need to be assigned even though not all of them correspond to any inputs in the pipilot interface. Otherwise, you will get receiver warnings, and the copter won't arm. I figured that out the hard way by reading the firmware code. Here's to the great diagnostics!
      • Throttle is Channel 1, Yaw - Channel 2, Roll - Channel 3, Pitch - Channel 4.
      • Accessories are Channels 5 to 8.
      • You can assign other controls to whatever other channels you like.
      • S.Bus transmits 11bits worth of data per channel, so the minimum is at 0 and the maximum is at 2047.
    • Flight Mode Settings
      • Flight Mode Count: 1
      • Stabilized 1: Attitude, Attitude, Axis Lock, CruiseControl - CruiseControl is particularly important. If you set it to Manual, the copter will behave erratically.
    • Arming Settings:
      • Arm airframe using throttle off and: Yaw Left
  • Output:
  • Attitude:
    • You want to level your gyros
    • People say that there are two ways to combat the copter drifting while hovering:
      • Increase the amount of low-pass filtering.
      • Set the virtual rotation to compensate for the board not being completely flat. See this link.
    • Neither of these solutions helped me.


Flight Test #0

If you think it looks completely underwhelming, then I have to agree with you. The main problem is the drift while hovering. I tried virtual rotation, low-pass filtering, and PID tuning, but no amount of configuration tweaking alleviates the problem. The setup does not have any optical sensors and accelerometers, by definition, don't see drifting at a constant speed. On the other hand, the copter is stationary at the beginning and starts to move after the take-off, so the acceleration is not zero. It might be that the sensors are not sensitive enough to pick it up. That's something that I intend to investigate once I get the telemetry connection working reliably.

Next steps

Here's what I plan to do next:

  • Get my kernel soft UART module based on hardware timers to work. I have the timer interface finished and tested, but still need to do the byte encoder, the GPIO state changing logic, and the TTY interface.
  • Connect the telemetry at higher speed to see if the sensors see the drift.
  • If the sensors see the drift, either write a PID controller at the level of pipilot or see why the firmware does not compensate for it.

Medium-term plans include:

  • Attach the Crazyflie sensor and the IMUs directly to the Raspberry Pi.
  • Hack the CC3D firmware so that the Pi can control the motors directly.
  • See if Linux (a non-RTOS) on RPi is reliable enough to control the copter and keep it hovering stably.

Long-term plans:

  • Port everything to my FRDM-K64F board to see if things improve if implemented on top of "bare metal."
  • Start playing with more complex control and estimation math.
  • Add cameras, lidar, and implement some autonomy.
  • Perhaps write all the microcontroller code in Rust instead of C.

Board bring-up

I started playing with the FRDM-K64F board recently. I want to use it as a base for a bunch of hobby projects. The start-up code is not that different from the one for Tiva, which I describe here - it's the same Cortex-M4 architecture after all. Two additional things need to be taken care of, though: flash security and the COP watchdog.

The K64F MCU restricts external access to a bunch of resources by default. It's a great feature if you want to ship a product, but it makes debugging impossible. The Flash Configuration Field (see section 29.3.1 of the datasheet) defines the default security and boot settings.

static const struct {
  uint8_t backdoor_key[8];  // backdoor key
  uint8_t fprot[4];         // program flash protection (FPROT{0-3})
  uint8_t fsec;             // flash security (FSEC)
  uint8_t fopt;             // flash nonvolatile option (FOPT)
  uint8_t feprot;           // EEPROM protection (FEPROT)
  uint8_t fdprot;           // data flash protection (FDPROT)
} fcf  __attribute__ ((section (".fcf"))) = {
  {0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00},
  {0xff, 0xff, 0xff, 0xff}, // disable flash program protection
  0x02,                     // disable flash security
  0x01,                     // disable low-power boot (section 6.3.3)
  0x00,
  0x00
};

If flash protection (the fprot field) is not disabled, you won't be able to flash new code by copying it to the MBED partition and will have to run mass erase from OpenOCD every time:

interface cmsis-dap
set CHIPNAME k60
source [find target/kx.cfg]
kinetis mdm mass_erase

If the MCU is in the secured state (the fsec field), the debugger will have no access to memory.
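As a rough illustration of what the fsec byte encodes, the sketch below builds and decodes it on the host. The field layout (KEYEN in bits 7-6, MEEN in bits 5-4, FSLACC in bits 3-2, SEC in bits 1-0) and the meaning of the bit patterns are my reading of the Kinetis reference manual, and the helper names are hypothetical, so verify both against the manual for your part before relying on them.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Assemble an FSEC byte from its four 2-bit fields (assumed layout:
   KEYEN[7:6], MEEN[5:4], FSLACC[3:2], SEC[1:0]). */
uint8_t fsec_make(unsigned keyen, unsigned meen, unsigned fslacc, unsigned sec)
{
  assert(keyen < 4 && meen < 4 && fslacc < 4 && sec < 4);
  return (uint8_t)((keyen << 6) | (meen << 4) | (fslacc << 2) | sec);
}

/* Assumption: only SEC == 0b10 means "unsecure"; every other pattern,
   including the erased-flash value 0b11, keeps the debugger locked out. */
bool fsec_is_unsecured(uint8_t fsec)
{
  return (fsec & 0x03) == 0x02;
}

/* Assumption: mass erase is disabled only when MEEN == 0b10. */
bool fsec_mass_erase_enabled(uint8_t fsec)
{
  return ((fsec >> 4) & 0x03) != 0x02;
}
```

With these assumptions, the value 0x02 used above decodes as unsecured with mass erase still available, while a fully erased byte (0xff) decodes as secured.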

The structure listed above needs to end up in flash just after the interrupt vector. I use the linker script to make sure it happens. I define the appropriate memory block:

FLASH-FCF  (rx)  : ORIGIN = 0x00000400, LENGTH = 0x00000010

And then put the .fcf section in it:

.fcf :

See here.

I also disable the COP (computer operates properly) watchdog which resets the MCU if it is not serviced often enough.

WDOG_UNLOCK = 0xc520;        // unlock magic #1
WDOG_UNLOCK = 0xd928;        // unlock magic #2
for(int i = 0; i < 2; ++i);  // delay a couple of cycles
WDOG_STCTRLH &= ~0x0001;     // disable the watchdog

You can get the template code at GitHub.

Table of Contents

  1. Compiling and start-up code
  2. Hardware Abstraction Layer and UART
  3. Debugging, display, heap and fonts
  4. Timers and ADC
  5. DAC, Sound and Nokia Tunes
  6. Random Number Generator, Rendering Engine and the Game
  7. Operating System


The game code up until this point abuses timers a lot. It has a timer to handle rendering and to refresh the display, and a timer to change notes of a tune. These tasks are not very time sensitive. A couple of milliseconds delay here or there is not going to be noticeable to users. The timer interrupts are more appropriate for things like maintaining a sound wave of the proper frequency. A slight delay here lowers the quality of the user experience significantly.

We could, of course, do even more complex time management to handle both the graphics and the sound in one loop, but that would be painful. It's much nicer to have a scheduling system that can alternate between multiple threads of execution. It is what I will describe in this post.

Thread Control Block and the Stack

Since there's usually only one CPU, the threads need to share it. The easiest way to achieve time sharing is to have a fixed time slice at the end of which the system will switch to another thread. The systick interrupt is perfect for this purpose. Not only is it invoked periodically, but it can also be requested manually by manipulating a register. This property will be useful in the implementation of sleeping and blocking.

But first things first: we need to have a structure that will describe a thread, a.k.a. a Thread Control Block:

struct IO_sys_thread {
  uint32_t             *stack_ptr;
  uint32_t              flags;
  void (*func)();
  struct IO_sys_thread *next;
  uint32_t              sleep;
  IO_sys_semaphore     *blocker;
  uint8_t               priority;
};

  • stack_ptr - points to the top of the thread's stack
  • flags - properties describing the thread; we will need just one to indicate whether the thread used the floating-point coprocessor
  • func - thread's entry point
  • next - pointer to the next thread in the queue (used for scheduling)
  • sleep - number of milliseconds the thread still needs to sleep
  • blocker - a pointer to a semaphore blocking the thread (if any)
  • priority - thread's priority

When invoking an interrupt handler, the CPU saves most of the running state of the current thread to the stack. Therefore, the task of the interrupt handler boils down to switching the stack pointers. The CPU will then pop all the registers back from the new stack. This behavior means that we need to do some initialization first:

void IO_sys_stack_init(IO_sys_thread *thread, void (*func)(void *), void *arg,
  void *stack, uint32_t stack_size)
{
  uint32_t sp1 = (uint32_t)stack;
  uint32_t sp2 = (uint32_t)stack;
  sp2 += stack_size;
  sp2 = (sp2 >> 3) << 3;          // the stack base needs to be 8-aligned
  if(sp1 % 4)
    sp1 = ((sp1 >> 2) << 2) + 4;  // make the end of the stack 4-aligned
  stack_size = (sp2 - sp1) / 4;   // new size in 32-bit words

  uint32_t *sp = (uint32_t *)sp1;
  sp[stack_size-1] = 0x01000000;          // PSR with thumb bit
  sp[stack_size-2] = (uint32_t)func;      // program counter
  sp[stack_size-3] = 0xffffffff;          // link register
  sp[stack_size-8] = (uint32_t)arg;       // r0 - the argument
  thread->stack_ptr = &sp[stack_size-16]; // top of the stack
}

The ARM ABI requires that the top of the stack is 8-aligned and we will typically push and pop 4-byte words. The first part of the setup function makes sure that the stack boundaries are right. The second part sets the initial values of the registers. Have a look here for details.

  • the PSR register needs to have the Thumb bit switched on
  • we put the startup function address to the program counter
  • we put 0xffffffff to the link register to avoid confusing stack traces in GDB
  • r0 gets the argument to the startup function
  • the hardware and the context switcher together push 16 words worth of registers to the stack (r0-r3, r12, lr, pc, and psr, followed by r4-r11), so the initial value of the stack pointer needs to reflect that

This function is typically called as:

1 IO_sys_stack_init(thread, thread_wrapper, thread, stack, stack_size);

Note that we do not call the user thread function directly. Rather, we have a wrapper function that gets the TCB as its argument. This is because we need to remove the thread from the scheduling queue if the user-specified function returns.

The context switcher

Let's now have a look at the code that does the actual context switching. Since it needs to operate directly on the stack, it needs to be written in assembly. It is not very complicated, though. What it does is:

  • pushing some registers to the stack
  • storing the current stack pointer in the stack_ptr variable of the current TCB
  • calling the scheduler to select the next thread
  • loading the stack pointer from the new thread's TCB
  • popping some registers from the new stack

#define OFF_STACK_PTR 0
#define OFF_FLAGS     4
#define FLAG_FPU      0x01

  .thumb
  .syntax unified

  .global IO_sys_current
  .global IO_sys_schedule

  .text

  .global systick_handler
  .type systick_handler STT_FUNC
  .thumb_func
  .align  2
systick_handler:
  cpsid i                     @ disable interrupts
  push  {r4-r11}              @ push r4-r11
  ldr   r0, =IO_sys_current   @ r0 = address of IO_sys_current
  ldr   r1, [r0]              @ r1 = IO_sys_current (the current TCB)

  ubfx  r2, lr, #4, #1        @ extract bit 4 of EXC_RETURN (held in lr)
  cbnz  r2, .Lsave_stack      @ bit set: no FPU context to save
  vstmdb sp!, {s16-s31}       @ push the high FPU registers; this also
                              @ triggers the lazy stacking of s0-s15
  ldr   r2, [r1, #OFF_FLAGS]  @ load the flags
  orr   r2, r2, #FLAG_FPU     @ set the FPU context flag
  str   r2, [r1, #OFF_FLAGS]  @ store the flags

.Lsave_stack:
  str   sp, [r1, #OFF_STACK_PTR] @ store the stack pointer in the current TCB

  push  {r0, lr}              @ calling C code, so save r0 and the link
                              @ register
  bl    IO_sys_schedule       @ call the scheduler
  pop   {r0, lr}              @ restore r0 and lr

  ldr   r1, [r0]              @ load the new TCB pointer to r1
  ldr   sp, [r1, #OFF_STACK_PTR] @ get the stack pointer of the new thread

  orr   lr, lr, #0x10         @ assume no FPU frame: set bit 4 of EXC_RETURN
  ldr   r2, [r1, #OFF_FLAGS]  @ load the flags
  tst   r2, #FLAG_FPU         @ see if we have the FPU context
  beq   .Lrestore_regs        @ no FPU context
  vldmia sp!, {s16-s31}       @ pop the high FPU registers
  bic   lr, lr, #0x10         @ FPU frame present: clear bit 4 of EXC_RETURN

.Lrestore_regs:
  pop   {r4-r11}              @ restore r4-r11
  cpsie i                     @ enable interrupts
  bx    lr                    @ exit the interrupt, restore r0-r3, r12, lr, pc,
                              @ psr

The only complication here is that we sometimes need to store the floating point registers in addition to the regular ones. It is, however, only necessary if the thread used the FPU. Bit 4 of EXC_RETURN, the value in the LR register on exception entry, indicates the status of the FPU. Go here and here for more details. If the value of the bit is 0, we need to save the high floating-point registers to the stack and set the FPU flag in the TCB.

Also, after selecting the new thread, we check if its stack contains the FPU registers by checking the FPU flag in its TCB. If it does, we pop these registers and change EXC_RETURN accordingly.

Lazy stacking is taken care of simply by pushing and popping the high registers - doing so counts as an FPU operation.
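The bit test that the handler performs on EXC_RETURN can be written out in C for clarity. This is only a host-side sketch with made-up helper names. It assumes the usual Cortex-M4F encoding, in which every bit above bit 4 is set and bit 4 is cleared when the stacked frame includes the FPU registers.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

#define EXC_RETURN_BASIC_FRAME 0x10u  /* bit 4: set = no FPU registers stacked */

/* True if the exception frame described by exc_return holds FPU state. */
bool exc_return_has_fpu_frame(uint32_t exc_return)
{
  assert((exc_return >> 5) == 0x07ffffffu);  /* sanity check: 0xffffffXX shape */
  return (exc_return & EXC_RETURN_BASIC_FRAME) == 0;
}

/* Return exc_return adjusted for the frame type of the thread being resumed. */
uint32_t exc_return_set_fpu_frame(uint32_t exc_return, bool fpu_frame)
{
  if(fpu_frame)
    return exc_return & ~EXC_RETURN_BASIC_FRAME;
  return exc_return | EXC_RETURN_BASIC_FRAME;
}
```

For example, 0xfffffffd describes a return to thread mode without FPU state, and 0xffffffed is the same return with FPU state on the stack.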

Semaphores, sleeping and idling

We can now run threads and switch between them, but it would be useful to be able to put threads to sleep and make them wait for events.

Sleeping is easy. We just need to set the sleep field in the TCB of the current thread and make the scheduler ignore threads whenever their sleep field is not zero:

void IO_sys_sleep(uint32_t time)
{
  IO_sys_current->sleep = time;
  IO_sys_yield();
}

The ISR that handles the system time can loop over all threads and decrement this counter every millisecond.
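A minimal sketch of such a tick routine might look as follows. It uses a cut-down TCB with only the two fields it needs and a hypothetical function name, and it assumes the threads sit in a circular list, so it illustrates the idea rather than reproducing the actual handler.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Cut-down TCB holding only the fields this sketch needs. */
struct tcb {
  struct tcb *next;   /* next thread in the circular list */
  uint32_t    sleep;  /* milliseconds left to sleep */
};

/* Meant to be called once per millisecond; threads may be NULL. */
void tick_sleepers(struct tcb *threads)
{
  if(!threads)
    return;

  struct tcb *t = threads;
  do {
    assert(t->next && "the list must be circular");
    if(t->sleep)
      --t->sleep;     /* stops at zero, so it never wraps around */
    t = t->next;
  }
  while(t != threads);
}
```

A thread whose counter reaches zero becomes eligible again the next time the scheduler runs.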

Waiting for a semaphore works in a similar way. We mark the current thread as blocked:

void IO_sys_wait(IO_sys_semaphore *sem)
{
  IO_disable_interrupts();
  --*sem;
  if(*sem < 0) {
    IO_sys_current->blocker = sem;
    IO_sys_yield();
  }
  IO_enable_interrupts();
}

The purpose of IO_sys_yield is to indicate that the current thread does not need to run anymore and force a context switch. The function resets the systick counter and forces the interrupt:

void IO_sys_yield()
{
  STCURRENT_REG = 0;          // clear the systick counter
  INTCTRL_REG   = 0x04000000; // trigger systick
}

Waking a thread waiting for a semaphore is somewhat more complex:

void IO_sys_signal(IO_sys_semaphore *sem)
{
  IO_disable_interrupts();
  ++*sem;
  if(*sem <= 0 && threads) {
    IO_sys_thread *t;
    for(t = threads; t->blocker != sem; t = t->next);
    t->blocker = 0;
  }
  IO_enable_interrupts();
}

If the value of the semaphore was negative, we find a thread that it was blocking and unblock it. It will make the scheduler consider this thread for running in the future.

None of the user-defined threads may be runnable at the time the scheduler makes its decision. All of them may be either sleeping or waiting for a semaphore. In that case, we need to keep the CPU occupied with something, i.e., we need a fake thread:

static void iddle_thread_func(void *arg)
{
  (void)arg;
  while(1) IO_wait_for_interrupt();
}


The system maintains a circular linked list of TCBs called threads. The job of the scheduler is to loop over this list and select the next thread to run. It places its selection in a global variable called IO_sys_current so that other functions may access it.

void IO_sys_schedule()
{
  if(!threads) {
    IO_sys_current = &iddle_thread;
    return;
  }

  IO_sys_thread *stop = IO_sys_current->next;

  if(IO_sys_current == &iddle_thread)
    stop = threads;

  IO_sys_thread *cur  = stop;
  IO_sys_thread *sel  = 0;
  int            prio = 266;  // larger than any uint8_t priority

  do {
    if(!cur->sleep && !cur->blocker && cur->priority < prio) {
      sel = cur;
      prio = sel->priority;
    }
    cur = cur->next;
  }
  while(cur != stop);

  if(!sel)
    sel = &iddle_thread;

  IO_sys_current = sel;
}

This scheduler is simple:

  • whenever there is nothing to run, select the idle thread
  • otherwise select the next highest priority thread that is neither sleeping nor blocked on a semaphore

Starting up the beast

So how do we get this whole business running? We need to invoke the scheduler that will preempt the current thread and select the next one to run. The problem is that we're running using the stack provided by the bootstrap code and don't have a TCB. Nothing prevents us from creating a dummy one, though. We can create it on the current stack (it's useful only once) and point it to the beginning of our real queue of TCBs:

IO_sys_thread dummy;
dummy.next = threads;
IO_sys_current = &dummy;

We then set the systick up:

STCTRL_REG     = 0;            // turn off
STCURRENT_REG  = 0;            // reset
SYSPRI3_REG   |= 0xE0000000;   // priority 7
STRELOAD_REG   = time_slice-1; // reload value
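If time_slice is expressed in core clock cycles (an assumption, the snippet above does not say), the reload value for a slice of a given length can be derived as sketched below. The function name and its parameters are hypothetical. The 24-bit limit reflects the width of the SysTick counter.

```c
#include <assert.h>
#include <stdint.h>

#define SYSTICK_MAX_RELOAD 0x00ffffffu  /* the SysTick counter is 24 bits wide */

/* Reload value for a slice of slice_us microseconds at cpu_hz core clock. */
uint32_t systick_reload(uint32_t cpu_hz, uint32_t slice_us)
{
  uint64_t cycles = (uint64_t)cpu_hz * slice_us / 1000000u;

  /* The slice must last at least one cycle and fit in the 24-bit counter. */
  assert(cycles >= 1 && cycles - 1 <= SYSTICK_MAX_RELOAD);

  return (uint32_t)(cycles - 1);  /* counting N-1 down to 0 takes N cycles */
}
```

At 80 MHz, for instance, a 1 ms slice needs 80000 cycles, and the longest slice that still fits in the counter is a little under 210 ms.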

And force its interrupt:

IO_sys_yield();
IO_enable_interrupts();


Tests 11 and 12 run a dummy calculation for some time and then return. After this happens, the system can only run the idle thread. If we plug in the profiler code, we can observe the timings on a logic analyzer:

Test #11

Test 13 is more complicated than the two previous ones. Three threads are running in a loop, sleeping, and signaling semaphores. Two more threads are waiting for these semaphores, changing some local variables and signaling other semaphores. Finally, there is the writer thread that blocks on the last set of semaphores and displays the current state of the environment. The output from the logic analyzer shows that the writer thread needs around 3.3 time-slices to refresh the screen:

Test #13

Silly Invaders

How does all this make Silly Invaders better? The main advantage is that we don't need to calculate complex timings for multiple functions of the program. We create two threads, one for rendering the scene and another one for playing the music tune. Each thread cares about its own timing. Everything else takes care of itself with good enough time guarantees.

IO_sys_thread game_thread;
void game_thread_func()
{
  while(1) {
    SI_scene_render(&scenes[current_scene].scene, &display);
    IO_sys_sleep(1000/scenes[current_scene].scene.fps);
  }
}

IO_sys_thread sound_thread;
void sound_thread_func()
{
  IO_sound_player_run(&sound_player);
}

The threads are registered with the system in the main function:

IO_sys_thread_add(&game_thread,  game_thread_func,  2000, 255);
IO_sys_thread_add(&sound_thread, sound_thread_func, 1000, 255);
IO_sys_run(1000);

For the complete code see:


There is a great course on EdX called Realtime Bluetooth Networks that explains the topic in more detail. I highly recommend it.


It took some playing around, but I have finally managed to figure out how to build from source all the tools necessary to put Zephyr on Arduino 101. You may say that the effort is pointless because you could just use whatever is provided by the SDK. For me, however, the deal is more about what I can learn from the experience than about the result itself. There is enough open source code around to make things work reasonably well, but putting it all together is a bit of a challenge, so what follows is a short HOWTO.

Arduino 101 setup


Arduino 101 has a Quark core and an ARC EM core. The appropriate targets seem to be i586-none-elfiamcu and arc-none-elf for the former and the latter respectively. Since there is no pre-packaged toolchain for either of these in Debian, you'll need to build your own. You can use the vanilla binutils (version 2.27 worked for me) and the vanilla newlib (version did not require any patches). GCC is somewhat more problematic. Since apparently not all the necessary ARC patches have been accepted into the mainline yet, you'll need to download it from the Synopsys GitHub repo. GDB requires tweaking for both cores.


]==> mkdir binutils && cd binutils
]==> wget
]==> tar jxf binutils-2.27.tar.bz2
]==> mkdir i586-none-elfiamcu && cd i586-none-elfiamcu
]==> ../binutils-2.27/configure --prefix=/home/ljanyst/Apps/cross-compilers/i586-none-elfiamcu --target=i586-none-elfiamcu
]==> make -j12 && make install
]==> cd .. && mkdir arc-none-elf && cd arc-none-elf
]==> ../binutils-2.27/configure --prefix=/home/ljanyst/Apps/cross-compilers/arc-none-elf --target=arc-none-elf
]==> make -j12 && make install
]==> cd ../..


]==> mkdir gcc && cd gcc
]==> wget
]==> tar jxf gcc-6.2.0.tar.bz2
]==> git clone
]==> cd gcc && git checkout arc-4.8-dev && cd ..
]==> mkdir i586-none-elfiamcu && cd i586-none-elfiamcu
]==> ../gcc-6.2.0/configure --prefix=/home/ljanyst/Apps/cross-compilers/i586-none-elfiamcu --target=i586-none-elfiamcu --enable-languages=c --with-newlib
]==> make -j12 all-gcc && make install-gcc
]==> cd .. && mkdir arc-none-elf && cd arc-none-elf
]==> ../gcc/configure --prefix=/home/ljanyst/Apps/cross-compilers/arc-none-elf --target=arc-none-elf  --enable-languages=c --with-newlib --with-cpu=arc700
]==> make -j12 all-gcc && make install-gcc
]==> cd ../..


]==> mkdir newlib && cd newlib
]==> wget
]==> tar zxf newlib-
]==> mkdir i586-none-elfiamcu && cd i586-none-elfiamcu
]==> ../newlib- --prefix=/home/ljanyst/Apps/cross-compilers/i586-none-elfiamcu --target=i586-none-elfiamcu
]==> make -j12 && make install
]==> cd .. && mkdir arc-none-elf && cd arc-none-elf
]==> ../newlib- --prefix=/home/ljanyst/Apps/cross-compilers/arc-none-elf --target=arc-none-elf
]==> make -j12 && make install
]==> cd ../..


]==> cd gcc/i586-none-elfiamcu
]==> make -j12 all-target-libgcc && make install-target-libgcc
]==> cd ../arc-none-elf
]==> make -j12 all-target-libgcc && make install-target-libgcc
]==> cd ../..

GDB does not work for either platform out of the box. For Quark, it compiles the i386 version but does not recognize the iamcu architecture even though, according to Wikipedia, it's essentially the same as i586 and libbfd knows about it. After some poking around the code, it seems that initializing the i386 platform with the iamcu bfd architecture definition does the trick:

diff -Naur gdb-7.11.1.orig/gdb/i386-tdep.c gdb-7.11.1/gdb/i386-tdep.c
--- gdb-7.11.1.orig/gdb/i386-tdep.c     2016-06-01 02:36:15.000000000 +0200
+++ gdb-7.11.1/gdb/i386-tdep.c  2016-09-24 15:39:11.000000000 +0200
@@ -8890,6 +8890,7 @@
 _initialize_i386_tdep (void)
 {
   register_gdbarch_init (bfd_arch_i386, i386_gdbarch_init);
+  register_gdbarch_init (bfd_arch_iamcu, i386_gdbarch_init);
 
   /* Add the variable that controls the disassembly flavor.  */
   add_setshow_enum_cmd ("disassembly-flavor", no_class, valid_flavors,

For ARC the Synopsys open source repo provides a solution.

]==> mkdir gdb
]==> wget
]==> tar xf gdb-7.11.1.tar.xz
]==> cd gdb-7.11.1 && patch -Np1 -i ../iamcu-tdep.patch && cd ..
]==> git clone
]==> cd binutils-gdb && git checkout arc-2016.09-gdb && cd ..
]==> mkdir i586-none-elfiamcu && cd i586-none-elfiamcu
]==> ../gdb-7.11.1/configure --prefix=/home/ljanyst/Apps/cross-compilers/i586-none-elfiamcu --target=i586-none-elfiamcu
]==> make -j12 && make install
]==> cd .. && mkdir arc-none-elf && cd arc-none-elf
]==> ../binutils-gdb/configure --prefix=/home/ljanyst/Apps/cross-compilers/arc-none-elf --target=arc-none-elf 
]==> make -j12 all-gdb && make install-gdb
]==> cd ../..


There was no OpenOCD release for quite some time, and it does not seem to have any support for Quark SE. The situation is not much better if you look at the head of the master branch of their repo. Fortunately, both Intel and Synopsys provide some support for their parts of the platform and making it work with mainline openocd does not seem to be hard.

]==> git clone && cd openocd
]==> git checkout lj
]==> ./bootstrap
]==> ./configure --prefix=/home/ljanyst/Apps/openocd
]==> make -j12 && make install

Zephyr uses the following configuration for the Arduino (referred to as openocd.cfg below):

source [find interface/ftdi/flyswatter2.cfg]
source [find board/quark_se.cfg]

quark_se.quark configure -event gdb-attach {
        reset halt
        gdb_breakpoint_override hard
}

quark_se.quark configure -event gdb-detach {
        resume
        shutdown
}

You can use the following commands to run the GDB server, flash for Quark and flash for ARC respectively (this is what Zephyr does):

]==> openocd -s /home/ljanyst/Apps/openocd/share/openocd/scripts/ -f openocd.cfg  -c 'init' -c 'targets' -c 'reset halt'
]==> openocd -s /home/ljanyst/Apps/openocd/share/openocd/scripts/ -f openocd.cfg  -c 'init' -c 'targets' -c 'targets quark_se.arc-em' -c 'reset halt' -c 'load_image zephyr.bin 0x40010000' -c 'reset halt' -c 'verify_image zephyr.bin 0x40010000' -c 'reset run' -c 'shutdown'
]==> openocd -s /home/ljanyst/Apps/openocd/share/openocd/scripts/ -f openocd.cfg  -c 'init' -c 'targets' -c 'targets quark_se.arc-em' -c 'reset halt' -c 'load_image zephyr.bin 0x40034000' -c 'reset halt' -c 'verify_image zephyr.bin 0x40034000' -c 'reset run' -c 'shutdown'

Hello world!

You need to compile and flash Zephyr's Hello World sample. The two commands below do the trick for the compilation part:

make BOARD=arduino_101_factory CROSS_COMPILE=i586-none-elfiamcu- CFLAGS="-march=lakemont -mtune=lakemont -msoft-float -miamcu -O0"
make BOARD=arduino_101_sss_factory CROSS_COMPILE=arc-none-elf-

After flashing, you should see the following on your UART console:

]==> screen /dev/ttyUSB0 115200,cs8
ipm_console0: 'Hello World! arc'
Hello World! x86


If you follow the instructions from the Zephyr wiki, debugging for the Intel part works fine. I still have some trouble making breakpoints work for ARC and will try to write an update if I have time to figure it out.

]==> i586-none-elfiamcu-gdb outdir/zephyr.elf
(gdb) target remote :3333
Remote debugging using :3333
0x0000fff0 in ?? ()
(gdb) b main
Breakpoint 1 at 0x400100ed: file /home/ljanyst/Projects/zephyr/zephyr-project/samples/hello_world/nanokernel/src/main.c, line 37.
(gdb) c
target running
hit hardware breakpoint (hwreg=0) at 0x400100ed

Breakpoint 1, main () at /home/ljanyst/Projects/zephyr/zephyr-project/samples/hello_world/nanokernel/src/main.c:37
37              PRINT("Hello World! %s\n", CONFIG_ARCH);
(gdb) s
step done from EIP 0x400100ed to 0x400100f2
step done from EIP 0x400100f2 to 0x400100f7
step done from EIP 0x400100f7 to 0x40013129
target running
hit hardware breakpoint (hwreg=1) at 0x4001312f
printk (fmt=0x40013e04 "Hello World! %s\n") at /home/ljanyst/Projects/zephyr/zephyr-project/misc/printk.c:164
164             va_start(ap, fmt);
(gdb) s
step done from EIP 0x4001312f to 0x40013132
step done from EIP 0x40013132 to 0x40013135
165             _vprintk(fmt, ap);


My medium-term goal is to port my Silly Invaders game to a Real Time Operating System. Zephyr seems to be a good choice. It's open source, operates under the auspices of the Linux Foundation and has an active community with many developers from Intel committing the code.

They, unfortunately, do not support Tiva so I will need to port the OS before I can proceed with the application. I decided to buy the Freescale K64F board, which is supported, to familiarize myself a little with Zephyr before I start the porting work. The howto page for setting up K64F seems to be terribly complicated and requires a JTAG programmer. I summarize here a simpler way using cmsis-dap over USB.


I updated the MBED interface firmware following the instructions on this site. I also built my own OpenOCD from the head of the master branch using the following configuration options:

./configure --prefix=/home/ljanyst/Apps/openocd  --enable-cmsis-dap

Things may work fine with the stock firmware and the stock OpenOCD as well, but I did not try that. It's also probably a good idea to add the following udev rule so that you don't have to run things as root:

]==> cat /etc/udev/rules.d/99-openocd.rules
# frdm-k64f
ATTRS{idVendor}=="0d28", ATTRS{idProduct}=="0204", GROUP="plugdev", MODE="0660"
]==> sudo udevadm control --reload-rules

Hello world!

I use the ARM cross-compiler provided by Debian to compile Zephyr and then just copy the resulting binary to the MBED disk:

]==> cd samples/hello_world/nanokernel
]==> make BOARD=frdm_k64f CROSS_COMPILE=arm-none-eabi- CFLAGS=-O0
]==> cp outdir/zephyr.bin /media/ljanyst/MBED/

You can see the effects in the UART console using screen:

]==> screen /dev/ttyACM0 115200,cs8
Hello World!

I then run OpenOCD using the following script:

]==> cat k64f.cfg
set CHIPNAME k60
source [find target/kx.cfg]

$_TARGETNAME configure -event gdb-attach {

]==> openocd -s /home/ljanyst/Apps/openocd/share/openocd/scripts/ -c "interface cmsis-dap" -f k64f.cfg

And GDB:

]==> cat remote1.conf
target extended-remote :3333
monitor reset init
break main
]==> arm-none-eabi-gdb  --command=remote1.conf outdir/zephyr.elf
Breakpoint 1 at 0x129c: file /home/ljanyst/Projects/zephyr/zephyr-project/samples/hello_world/nanokernel/src/main.c, line 37.
Note: automatically using hardware breakpoints for read-only addresses.

Breakpoint 1, main () at /home/ljanyst/Projects/zephyr/zephyr-project/samples/hello_world/nanokernel/src/main.c:37
37              PRINT("Hello World!\n");
(gdb) s
printk (fmt=0x2c90 "Hello World!\n") at /home/ljanyst/Projects/zephyr/zephyr-project/misc/printk.c:164
164             va_start(ap, fmt);
(gdb) s
165             _vprintk(fmt, ap);
(gdb) s
_vprintk (fmt=0x2c90 "Hello World!\n", ap=...) at /home/ljanyst/Projects/zephyr/zephyr-project/misc/printk.c:75
75              int might_format = 0; /* 1 if encountered a '%' */
(gdb) where
#0  _vprintk (fmt=0x2c90 "Hello World!\n", ap=...) at /home/ljanyst/Projects/zephyr/zephyr-project/misc/printk.c:75
#1  0x00001b46 in printk (fmt=0x2c90 "Hello World!\n") at /home/ljanyst/Projects/zephyr/zephyr-project/misc/printk.c:165
#2  0x000012a2 in main () at /home/ljanyst/Projects/zephyr/zephyr-project/samples/hello_world/nanokernel/src/main.c:37

Table of Contents

  1. Compiling and start-up code
  2. Hardware Abstraction Layer and UART
  3. Debugging, display, heap and fonts
  4. Timers and ADC
  5. DAC, Sound and Nokia Tunes
  6. Random Number Generator, Rendering Engine and the Game
  7. Operating System

Random Number Generator

To make the game more engaging, we introduce some randomness into it. We don't need anything cryptographically secure, so a Linear Congruential Generator will do just fine. We count the time from the start-up in millisecond-long jiffies and wait for a first button press to select the seed.

void button_event(IO_io *io, uint16_t event)
{
  uint64_t btn;
  IO_get(io, &btn);
  if(btn)
    button_value = 1;

  if(!rng_initialized) {
    rng_initialized = 1;
    IO_rng_seed(IO_time());
  }
}
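A Linear Congruential Generator itself fits in a few lines. What follows is a minimal sketch and not the project's actual generator: the names lcg_seed, lcg_next and lcg_range are hypothetical, and the multiplier and increment are the well-known constants from Numerical Recipes.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical names; the real generator sits behind IO_rng_seed(). */
static uint32_t lcg_state = 1;

/* Seed the generator, e.g. with the jiffy count at the first button press. */
void lcg_seed(uint32_t seed)
{
  lcg_state = seed;
}

/* x(n+1) = a * x(n) + c (mod 2^32); the modulo comes for free from the
   wrap-around of unsigned 32-bit arithmetic. */
uint32_t lcg_next(void)
{
  lcg_state = 1664525u * lcg_state + 1013904223u;
  return lcg_state;
}

/* A number in [0, n). It scales the full 32-bit value instead of taking
   it modulo n because the low bits of an LCG are the least random ones. */
uint32_t lcg_range(uint32_t n)
{
  assert(n > 0);
  return (uint32_t)(((uint64_t)lcg_next() * n) >> 32);
}
```

Seeding from the time of the first button press works because the delay between power-up and the press differs from run to run.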

Rendering Engine

The rendering engine takes a scene descriptor, a display device, and a timer. Based on this information it computes new positions of objects, draws them on the screen if necessary and checks for collisions.

struct SI_scene {
  SI_object **objects;
  void      (*pre_render)(struct SI_scene *);
  void      (*collision)(SI_object *obj1, SI_object *obj2);
  uint8_t     fps;
  uint8_t     num_objects;
  uint8_t     flags;
};

void SI_scene_render(SI_scene *scene, IO_io *display, IO_io *timer);

Each SI_scene holds a list of "polymorphic" objects that should be rendered, a pointer to a pre_render function that calculates a new position of each object, and a pointer to a collision callback that is invoked when the scene renderer detects an overlap between two objects. The SI_scene_render function runs after every interrupt:

while(1) {
  SI_scene_render(&scenes[current_scene].scene, &display, &scene_timer);
  IO_wait_for_interrupt();
}

Whether it actually does anything depends on the flags field of the scene. If it's set to SI_SCENE_IGNORE, the renderer returns immediately. If it's set to SI_SCENE_RENDER, the renderer calls the pre_render callback, draws the objects on the screen, and computes the object overlaps, notifying the collision callback if necessary. After each frame, the scene is disabled (SI_SCENE_IGNORE). It is re-enabled by the timer interrupt after a time quantum that depends on the fps parameter.

See SI_scene.h and SI_scene.c.
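The interplay between the main loop and the frame timer can be sketched as follows. This is a simplified model under my own assumptions, not the code from SI_scene.c: scene_t, scene_render, scene_timer_tick and the SCENE_* constants are hypothetical stand-ins, and the actual drawing is reduced to a frame counter.

```c
#include <assert.h>
#include <stdint.h>

enum { SCENE_IGNORE = 0, SCENE_RENDER = 1 };  /* stand-ins for SI_SCENE_* */

typedef struct {
  uint8_t  fps;
  uint8_t  flags;
  uint32_t frames;  /* frames rendered so far, for illustration only */
} scene_t;

/* Called from the main loop after every interrupt. */
void scene_render(scene_t *scene)
{
  if(scene->flags == SCENE_IGNORE)
    return;                      /* woken up by an unrelated interrupt */
  scene->frames++;               /* pre_render, draw and collisions go here */
  scene->flags = SCENE_IGNORE;   /* wait for the next timer tick */
}

/* Called from the frame timer's event handler; returns the delay in
   nanoseconds after which the timer should fire again. */
uint32_t scene_timer_tick(scene_t *scene)
{
  assert(scene->fps > 0);
  scene->flags = SCENE_RENDER;
  return 1000000000u / scene->fps;
}
```

With fps set to 25, the timer is re-armed every 40 ms and at most one frame is rendered per tick, no matter how many other interrupts wake up the main loop in between.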

Each object has a draw function that enables the renderer to draw it on the screen. There are three types of objects: a generic object, a bitmap object, and a text object:

struct SI_object {
  uint16_t x;
  uint16_t y;
  uint16_t width;
  uint16_t height;
  uint8_t  flags;
  uint8_t  user_flags;
  void (*draw)(struct SI_object *this, IO_io *display);
};

struct SI_object_bitmap {
  SI_object        obj;
  const IO_bitmap *bmp;
};

struct SI_object_text {
  SI_object      obj;
  const char    *text;
  const IO_font *font;
};
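Since every object carries its position and size, detecting an overlap between two of them boils down to an axis-aligned bounding box test. The sketch below shows one way to do it; box_t and boxes_overlap are hypothetical names, and the renderer in SI_scene.c may well do it differently.

```c
#include <stdint.h>

/* Hypothetical subset of SI_object: just the bounding box. */
typedef struct {
  uint16_t x;
  uint16_t y;
  uint16_t width;
  uint16_t height;
} box_t;

/* Two boxes overlap when their extents intersect on both axes. Boxes
   that merely touch along an edge do not count as overlapping. The
   uint16_t operands are promoted to int, so the sums cannot wrap. */
int boxes_overlap(const box_t *a, const box_t *b)
{
  return a->x < b->x + b->width  && b->x < a->x + a->width &&
         a->y < b->y + b->height && b->y < a->y + a->height;
}
```

Running this test for every pair of visible objects is quadratic in the number of objects, which is fine for the handful of sprites that fit on a small display.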

The object array in the scene is initialized with the SI_object pointers:

static SI_object         score_obj;
static SI_object_bitmap  invader_obj[5];
scene->objects[1] = &score_obj;
scene->objects[i+5] = &invader_obj[i].obj;

See SI_scene_game.c.

The renderer calls the draw function of each SI_OBJECT_VISIBLE object:

obj->draw(obj, display);

Finally, each draw method uses the CONTAINER_OF macro to compute the pointer to the actual object of concrete type:

#define CONTAINER_OF(TYPE, MEMBER, MEMBER_ADDR) \
  ((TYPE *) ( (char *)MEMBER_ADDR - offsetof(TYPE, MEMBER)))

void SI_object_bitmap_draw(SI_object *obj, IO_io *display)
{
  SI_object_bitmap *this = CONTAINER_OF(SI_object_bitmap, obj, obj);
  IO_display_print_bitmap(display, obj->x, obj->y, this->bmp);
}

void SI_object_text_draw(SI_object *obj, IO_io *display)
{
  SI_object_text *this = CONTAINER_OF(SI_object_text, obj, obj);
  IO_display_set_font(display, this->font);
  IO_display_cursor_goto(display, obj->x, obj->y);
  IO_print(display, "%s", this->text);
}
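To see why the pointer arithmetic works, here is a self-contained sketch of the same trick. The structures shape and label are hypothetical stand-ins for SI_object and SI_object_text, and the embedded member is deliberately not the first one, so that the offset is non-zero.

```c
#include <assert.h>
#include <stddef.h>

#define CONTAINER_OF(TYPE, MEMBER, MEMBER_ADDR) \
  ((TYPE *)((char *)(MEMBER_ADDR) - offsetof(TYPE, MEMBER)))

/* Hypothetical "base class". */
struct shape {
  int x;
  int y;
};

/* Hypothetical "derived class" embedding the base by value. */
struct label {
  int          id;    /* pushes base away from offset 0 */
  struct shape base;
  const char  *text;
};

/* Gets a pointer to the embedded base and recovers the enclosing label
   by subtracting the offset of the member within the structure. */
const char *label_text(struct shape *s)
{
  struct label *l = CONTAINER_OF(struct label, base, s);
  assert(&l->base == s);
  return l->text;
}
```

The trick is only valid when the pointer really points into an object of the given type, which is why each draw function must be paired with objects of its own kind.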

The Game

All this seems to work pretty well when put together:

The Game

See silly-invaders.c.


Tiva does not have a DAC, but we'd like to have some sound effects while playing the game. Fortunately, it's easy to make a simple binary-weighted DAC using resistors and GPIO signals. It's not very accurate, but will do.

A binary-weighted DAC

As far as the software is concerned, we will simply take 4 GPIO pins and set them up as output. We will then get an appropriate bit-banded alias such that writing an integer to it is reflected only in the state of these four pins.

int32_t IO_dac_init(IO_io *io, uint8_t module)
{
  if(module > 0)
    return -IO_EINVAL;

  TM4C_gpio_port_init(GPIO_PORTD_NUM);
  TM4C_gpio_pin_init(GPIO_PORTD_NUM, GPIO_PIN0_NUM, 0, 0, 1);
  TM4C_gpio_pin_init(GPIO_PORTD_NUM, GPIO_PIN1_NUM, 0, 0, 1);
  TM4C_gpio_pin_init(GPIO_PORTD_NUM, GPIO_PIN2_NUM, 0, 0, 1);
  TM4C_gpio_pin_init(GPIO_PORTD_NUM, GPIO_PIN3_NUM, 0, 0, 1);

  uint32_t addr =  GPIO_REG_BASE + GPIO_PORTD;
  addr += GPIO_PIN0_BIT_OFFSET;
  addr += GPIO_PIN1_BIT_OFFSET;
  addr += GPIO_PIN2_BIT_OFFSET;
  addr += GPIO_PIN3_BIT_OFFSET;
  dac_data = (uint32_t*)addr;

  io->type    = IO_DAC;
  io->sync    = 0;
  io->channel = 0;
  io->flags   = 0;
  io->read    = 0;
  io->write   = dac_write;
  io->event   = 0;

  return 0;
}

See TM4C_platform01.c.
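For a rough idea of what such a DAC outputs, the sketch below computes the ideal voltage for a given 4-bit code. It assumes resistors of R, 2R, 4R and 8R (the smallest one on the most significant bit), pins that swing all the way between ground and the supply voltage, and no load on the output. The function name is hypothetical, and a real circuit will deviate from these numbers.

```c
#include <assert.h>
#include <stdint.h>

/* Ideal, unloaded output of a 4-bit binary-weighted resistor DAC in
   millivolts. The output node settles at the conductance-weighted
   average of the pin voltages, which works out to vcc * code / 15. */
uint32_t dac_ideal_mv(uint8_t code, uint32_t vcc_mv)
{
  assert(code < 16);
  return (vcc_mv * code) / 15;
}
```

With a 3.3 V supply, each step is therefore worth 220 mV, and a code of 5 yields about 1.1 V.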


We will create a virtual device consisting of a DAC and a timer. Using the timer, we will change the output of the DAC frequently enough to produce sound. Since the timer interrupt needs to be executed often and any delay makes the sound break, we need to assign the highest possible priority to this interrupt so that it does not get preempted.

int32_t IO_sound_init(IO_io *io, uint8_t module)
{
//...
  IO_dac_init(&snd_dac, 0);
  IO_timer_init(&snd_timer, 11);
  TM4C_enable_interrupt(104, 0); // adjust the interrupt priority for timer 11
  snd_timer.event = snd_timer_event;
//...
}

Writing to this virtual device sets the frequency of the tone that we want to play by adjusting the timer's firing rate accordingly.

static int32_t snd_write(IO_io *io, const void *data, uint32_t length)
{
  if(length != 1)
    return -IO_EINVAL;
  const uint64_t *val = data;

  if(!(*val)) {
    snd_interval = 0;
    return 1;
  }
  uint8_t turn_on = 0;
  if(!snd_interval)
    turn_on = 1;

  double interval = 1.0/(*val);
  interval /= 32.0;
  interval *= 1000000000;
  snd_interval = interval;

  if(turn_on)
    IO_set(&snd_timer, interval);
  return 1;
}

In reality, the timer fires 32 times more often than the frequency of the tone requires. This is because we use a table with 32 entries to simulate the actual sound wave. In principle, we could just use a sinusoid, but it turns out that the quality of the sound is not so great if we do so. I have found another waveform in the lab materials of EdX's Embedded Systems course that works much better.

static const uint8_t snd_trumpet[] = {
  10, 11, 11, 12, 10,  8,  3,  1,  8, 15, 15, 11, 10, 10, 11, 10, 10, 10, 10,
  10, 10, 10, 10, 11, 11, 11, 11, 11, 11, 10, 10, 10 };

static void snd_timer_event(IO_io *io, uint16_t event)
{
  IO_set(&snd_dac, snd_trumpet[snd_step++]);
  snd_step %= 32;
  if(snd_interval)
    IO_set(io, snd_interval);
}

See TM4C_platform01.c.
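The relation between the tone frequency and the timer interval is easy to check by hand. Here is a small integer-only sketch of the same arithmetic that snd_write does in floating point; tone_interval_ns and WAVE_STEPS are hypothetical names that are not part of the project.

```c
#include <assert.h>
#include <stdint.h>

#define WAVE_STEPS 32u  /* entries in the waveform table */

/* Nanoseconds between two timer ticks for a tone of freq_hz: one period
   of the tone has to cover all the entries of the waveform table. */
uint32_t tone_interval_ns(uint32_t freq_hz)
{
  assert(freq_hz > 0 && freq_hz <= UINT32_MAX / WAVE_STEPS);
  return 1000000000u / (freq_hz * WAVE_STEPS);
}
```

For the 440 Hz concert pitch, this gives 71022 ns, so the timer interrupt fires roughly 14000 times per second. That is why the interrupt needs a high priority.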

Nokia tunes

The tune player API consists of four functions:

int32_t IO_sound_play(IO_io *io, IO_io *timer, IO_tune *tune, uint16_t start);
int32_t IO_sound_stop(IO_io *io);
IO_tune *IO_sound_decode_RTTTL(const char *tune);
void IO_sound_free_tune(IO_tune *tune);

  • IO_sound_play uses a sound device and a timer to play a tune. It sends an IO_EVENT_DONE to the virtual sound device when the playback finishes.
  • IO_sound_stop stops the playback on the given device and returns the index of the last note it played so that it can be restarted from that point.
  • IO_sound_decode_RTTTL takes an RTTTL representation and produces the IO_tune structure that can be handled by the player.
  • IO_sound_free_tune frees the memory used by IO_sound_decode_RTTTL when it's no longer needed.
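An RTTTL string stores the tempo in beats (quarter notes) per minute and each note's length as a divisor of a whole note, so the player needs to convert these to time. The sketch below shows that conversion; rtttl_note_ms is a hypothetical helper, not a function from IO_sound.c.

```c
#include <assert.h>
#include <stdint.h>

/* Length of a note in milliseconds. bpm counts quarter notes per minute,
   divisor is the RTTTL duration (1 = whole note, 4 = quarter note, ...),
   and a dotted note lasts one and a half times as long. */
uint32_t rtttl_note_ms(uint32_t bpm, uint32_t divisor, int dotted)
{
  assert(bpm > 0 && divisor > 0);
  uint32_t whole_ms = 4u * 60000u / bpm;
  uint32_t ms = whole_ms / divisor;
  return dotted ? ms + ms / 2 : ms;
}
```

At b=125, for example, a quarter note lasts 480 ms and a dotted eighth note lasts 360 ms.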

There are plenty of tunes all over the Internet. The ones in the demo video are taken from here. The code of the player is based on this work.

It plays music! :)

See IO_sound.c.


Tiva has 12 timer modules that can be configured in various relatively complex ways. However, for the purpose of this game, we don't need anything fancy. We will, therefore, represent a timer as an IO_io device with the IO_set function (generalized from IO_gpio_set) setting and arming it. When it counts to 0, the IO_TICK event will be reported to the event handler.

void timer_event(IO_io *io, uint16_t event)
{
  IO_set(&timer, 500000000); // fire in half second
}

int main()
{
  IO_init();
  IO_timer_init(&timer, 0);
  timer.event = timer_event;
  IO_set(&timer, 500000000); // fire in half second

  while(1)
    IO_wait_for_interrupt();
}

See TM4C_timer.c and test-07-timer.c.


Similarly to the timers, the ADC sequencers on Tiva may be set up in fairly sophisticated ways. There are 12 analog pins and two modules with four sequencers each. Again, we don't need anything sophisticated here, so we will just use the first eight pins and assign each of them to a separate sequencer. In the blocking mode, IO_get initiates the readout and returns the value. In the non-blocking and asynchronous modes, IO_set requests sampling and IO_get returns the sample when it is ready. An IO_EVENT_DONE event is reported to the event handler if enabled.

IO_io slider;
IO_io timer;
uint64_t sliderR = 0;

void timer_event(IO_io *io, uint16_t event)
{
  IO_set(&slider, 0); // request a sample
}

void slider_event(IO_io *io, uint16_t event)
{
  IO_get(&slider, &sliderR);
  IO_set(&timer, 100000000); // fire in 0.1 sec
}

int main()
{
  IO_init();

  IO_timer_init(&timer, 0);
  IO_slider_init(&slider, 0, IO_ASYNC);

  timer.event     = timer_event;
  slider.event    = slider_event;

  IO_event_enable(&slider, IO_EVENT_DONE);

  IO_set(&slider, 0); // request a sample

  while(1)
    IO_wait_for_interrupt();
}

See TM4C_adc.c and test-08-input.c.

The game board

Everything works fine when soldered together as well.

Buttons and the Slider


To test and debug the SSI code, I connected two boards and made them talk to each other. It mostly worked. However, it turned out that, by default, you can run the OpenOCD-GDB duo only for one board at a time: the one that libusb enumerates first. There is a patch that lets OpenOCD choose the device to attach to by its serial number. The patch has not made it to any release yet, but applying it and recompiling the whole thing is relatively straightforward: clone the source, apply the patch and run the usual autotools combo. You will then need to create a config file for each board that specifies unique port numbers and defines the serial number of the device to attach to:

]==> cat board1.cfg
gdb_port 3333
telnet_port 4444
tcl_port 6666
interface hla
hla_serial 0Exxxxxx
source [find board/ek-tm4c123gxl.cfg]

]==> cat board2.cfg
gdb_port 3334
telnet_port 4445
tcl_port 6667
interface hla
hla_serial 0Exxxxxx
source [find board/ek-tm4c123gxl.cfg]

Separate GDB batch files come in handy as well:

]==> cat gdb-board1.conf
target extended-remote :3333
monitor reset init
break main

]==> cat gdb-board2.conf
target extended-remote :3334
monitor reset init
break main

Tweaking the linker script

GCC started generating .init and .fini sections that contain no-op functions:

]==> arm-none-eabi-objdump -d  test-06-display.axf
Disassembly of section .init:

00007af8 <_init>:
    7af8:       b5f8            push    {r3, r4, r5, r6, r7, lr}
    7afa:       bf00            nop

Disassembly of section .fini:

00007afc <_fini>:
    7afc:       b5f8            push    {r3, r4, r5, r6, r7, lr}
    7afe:       bf00            nop

We will discard this code by adding the following to the linker script:

/DISCARD/ :
  {
    *(.init*)
    *(.fini*)
  }

GCC also started generating the stack unwinding code and GDB gets confused in some places if it is not present, so we put it after the code in FLASH:

.ARM.exidx :
  {
    *(.ARM.exidx*)
    *(.gnu.linkonce.armexidx*)
  } > FLASH

See TM4C.ld.


We need both SSI and GPIO to control the Nokia display that we want to use for the game. Since, in the end, both these systems need to push and receive data, they fit well the generic interface used for UART. The SSI's initialization function needs many more parameters than the one for UART, so we pack them all in a struct. As far as GPIO is concerned, there are two helpers: IO_gpio_get_state and IO_gpio_set_state that just write the appropriate byte to the IO device. GPIO also comes with a new event type: IO_EVENT_CHANGE.

struct IO_ssi_attrs {
  uint8_t  master;        //!< 1 for master, 0 for slave
  uint8_t  slave_out;     //!< 1 slave output enabled, 0 slave output disabled
  uint32_t bandwidth;     //!< bandwidth in bps
  uint8_t  frame_format;  //!< frame format
  uint8_t  freescale_spo; //!< SPO value for freescale frames
  uint8_t  freescale_sph; //!< SPH value for freescale frames
  uint8_t  frame_size;    //!< size of the frame in bits
};

See TM4C_ssi.c and TM4C_gpio.c.

Platforms, the display interface, and fonts

All the devices that are not directly on the board may be connected in many different ways. To handle all these configurations with the same board, we split the driver into libtm4c.a (for the board specific stuff) and libtm4c_platform_01.a (for the particular configuration). For now, the only thing that the platform implements is the display interface. It passes the appropriate SSI module and GPIOs to the actual display driver. The user sees the usual IO_io structure that is initialized with IO_display_init and can be written to and synced. write renders the text to the back-buffer, while sync sends the back-buffer to the device for rendering. There's also a couple of specialized functions that have to do only with display devices:

int32_t IO_display_get_attrs(IO_io *io, IO_display_attrs *attrs);
int32_t IO_display_clear(IO_io *io);
int32_t IO_display_put_pixel(IO_io *io, uint16_t x, uint16_t y, uint32_t argb);
int32_t IO_display_print_bitmap(IO_io *io, uint16_t x, uint16_t y,
  const IO_bitmap *bitmap);
int32_t IO_display_set_font(IO_io *io, const IO_font *font);
int32_t IO_display_cursor_goto(IO_io *io, uint32_t x, uint32_t y);
int32_t IO_display_cursor_goto_text(IO_io *io, uint32_t line, uint32_t space);
int32_t IO_display_cursor_move(IO_io *io, int32_t dx, int32_t dy);
int32_t IO_display_cursor_move_text(IO_io *io, int32_t dline, int32_t dspace);

See IO_display.h.

Platform 01 provides one display device, a PCD8544, the one used in Nokia 5110. It translates and passes the interface calls to the lower-level driver. See pcd8544.c.

If you haven't noticed in the list of the functions above, the display interface supports multiple fonts. In fact, I wrote a script that rasterizes TrueType fonts and creates internal IO_font structures. These can then be used to render text on a display device. All you need to do is provide a TTF file, declare the font name and size in CMake, and then reference it in the font manager. The code comes with DejaVuSans10 and DejaVuSerif10 by default.

The heap

Malloc comes in handy from time to time, so I decided to implement one. It is extremely prone to fragmentation and never merges chunks, so using free is not advisable. Still, sometimes you just wish you had one, for instance, when you need to define a buffer for pixels and don't have a good way to ask for display parameters at compile time. For alignment reasons, the heap management code reserves a bit more than 4K for the stack. It then creates a 32-byte guard region protected by the MPU. Everything between the end of the .bss section and the guard region is handled by IO_malloc and IO_free.

void TM4C_heap_init()
{
  uint8_t *stack_start = (uint8_t *)0x20007ff8;
  uint8_t *stack_end   = stack_start-4120;
  uint8_t *stack_guard = stack_end-32;
  uint8_t *heap_start  = (uint8_t *)&__bss_end_vma;

  MPUCTRL_REG |= (uint32_t)0x05; // enable MPU and the background region

  uint32_t val = (uint32_t)stack_guard;
  val |= 0x10; // valid
  val |= 0x07; // highest priority region
  MPUBASE_REG &= ~0xfffffff7;
  MPUBASE_REG |= val;

  val = 0;
  val |= (1 << 28); // disable instruction fetches
  val |= (4 << 1);  // 0x04 == 32bytes
  val |= 1;         // enable the region
  MPUATTR_REG &= ~0x173fff3f;
  MPUATTR_REG |= val;

  IO_set_up_heap(heap_start, stack_guard);
}
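The 0x04 written to the size field above is not arbitrary. On ARMv7-M, an MPU region covers 2^(SIZE+1) bytes, and 32 bytes is the smallest region allowed. A small helper, hypothetical and not part of the project, makes the encoding explicit:

```c
#include <assert.h>
#include <stdint.h>

/* SIZE field for a region of the given length: the region covers
   2^(SIZE + 1) bytes, so SIZE is log2(bytes) - 1. The length has to be
   a power of two and at least 32 bytes. */
uint32_t mpu_size_field(uint32_t bytes)
{
  assert(bytes >= 32 && (bytes & (bytes - 1)) == 0);
  uint32_t bits = 0;
  while((bytes >> bits) != 1)
    ++bits;
  return bits - 1;
}
```

The value returned by this function is what gets shifted left by one bit before being written to the attribute register, just like the (4 << 1) in the listing.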

See TM4C.c and IO_malloc.c.
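To illustrate what "never merges chunks" means in practice, here is a minimal sketch of an allocator of this kind. It is written under my own assumptions and is not the code from IO_malloc.c: the chunk header layout and the names heap_setup, heap_alloc and heap_free are hypothetical.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical chunk header placed in front of every allocation. */
typedef struct {
  uint32_t size;  /* payload size in bytes, a multiple of 8 */
  uint32_t used;  /* non-zero when the chunk is allocated */
} chunk_t;

static uint8_t *heap_begin, *heap_brk, *heap_end;

void heap_setup(void *start, void *end)
{
  assert(((uintptr_t)start & 7) == 0 && (uint8_t *)start < (uint8_t *)end);
  heap_begin = heap_brk = start;
  heap_end   = end;
}

void *heap_alloc(uint32_t size)
{
  size = (size + 7) & ~7u;
  /* First fit among the chunks that were carved out earlier and freed. */
  uint8_t *p = heap_begin;
  while(p < heap_brk) {
    chunk_t *c = (chunk_t *)p;
    if(!c->used && c->size >= size) {
      c->used = 1;
      return c + 1;
    }
    p += sizeof(chunk_t) + c->size;
  }
  /* Otherwise carve a new chunk off the untouched part of the heap. */
  if((size_t)(heap_end - heap_brk) < sizeof(chunk_t) + size)
    return NULL;
  chunk_t *c = (chunk_t *)heap_brk;
  c->size = size;
  c->used = 1;
  heap_brk += sizeof(chunk_t) + size;
  return c + 1;
}

void heap_free(void *ptr)
{
  if(ptr)
    ((chunk_t *)ptr - 1)->used = 0;  /* never merged with its neighbours */
}
```

A freed chunk can only be reused for a request of the same or a smaller size, and it is never split or joined with its neighbours. After a few rounds of mixed-size allocations the heap ends up full of holes that nothing fits into, which is exactly the fragmentation problem mentioned earlier.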

A display test

The LCD demo works fine on the breadboard. As you can see, there is a text printed with two kinds of fonts: with and without serifs. Later, the code plays the Game of Life shooting gliders.

Glider Gun on a breadboard


Since the display works fine, it's safe to do some soldering. We'll use a 9-volt battery as a power source and an LM1086 power regulator to supply 3.3 volts to the microcontroller and other devices.

Soldered Display - Front

Soldered Display - Back

Glider Gun - Soldered

Hardware Abstraction Layer

I'd like the game to be as portable as possible. As far as the game logic is concerned, the actual interaction with the hardware is immaterial. Ideally, we just need means to write a pixel to a screen, blink an LED or check the state of a push-button. It means that hiding the hardware details behind a generic interface is desirable. This interface can then be re-implemented for a different kind of board, and the whole thing can act as a cool tool for getting to know new hardware.

In this project, we will use one static library (libio.a) to provide the interface. This library will implement all the hardware independent functions as well as the stubs for the driver (as weak symbols). Another library (libtm4c.a) will provide the real driver logic for Tiva and the strong symbols. This kind of approach enables us to use the linker to easily produce the final binary for other platforms in the future.

Initialization, PLL and FPU

To initialize the hardware platform, the user calls IO_init(). The stub for this function is provided by libio.a as follows:

int32_t __IO_init()
{
  return -IO_ENOSYS;
}

WEAK_ALIAS(__IO_init, IO_init);
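The definition of WEAK_ALIAS is not shown here. With GCC on an ELF target, a macro like this can be built from the weak and alias attributes, as in the sketch below. The macro body is my assumption and may differ from the project's, and ENOSYS from errno.h stands in for IO_ENOSYS.

```c
#include <errno.h>
#include <stdint.h>

/* Assumed definition: declares aliasname as a weak symbol that resolves
   to the same address as name. */
#define WEAK_ALIAS(name, aliasname) \
  extern __typeof__(name) aliasname __attribute__((weak, alias(#name)))

/* The stub: reports that the function is not implemented. */
int32_t __IO_init(void)
{
  return -ENOSYS;
}

WEAK_ALIAS(__IO_init, IO_init);
```

When nothing else defines IO_init, calls to it end up in the stub. As soon as another library provides a regular (strong) definition, the linker picks that one instead without complaining about a duplicate symbol.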

The actual implementation for Tiva in libtm4c.a initializes PLL to provide an 80MHz clock and turns on microDMA. It also sets the access permissions to the FPU by setting the appropriate bits in the CPAC register and resetting the pipeline in assembly. We will likely need the floating point in the game, and it comes in handy when calculating UART transmission speed parameters.

int32_t IO_init()
{
  TM4C_pll_init();
  TM4C_dma_init();

  // Enable the floating point coprocessor
  CPAC_REG |= (0x0f << 20);
  __asm__ volatile (
    "dsb\r\n"        // force memory writes to complete before continuing
    "isb\r\n" );     // reset the pipeline
  return 0;
}
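As an example of where the floating point helps, here is a sketch of the baud rate divisor calculation. It assumes that the UART is clocked directly from the system clock and uses the default 16x oversampling, in which case the divisor is the clock divided by 16 times the baud rate, split into an integer part and a 6-bit fraction. The function name is hypothetical, and the code in TM4C_uart.c may do this differently.

```c
#include <assert.h>
#include <stdint.h>

/* Computes the integer (ibrd) and fractional (fbrd) baud rate divisors
   for a UART clocked at sys_clk Hz. The fraction is expressed in 1/64ths
   and rounded to the nearest value. */
void uart_divisors(uint32_t sys_clk, uint32_t baud,
                   uint32_t *ibrd, uint32_t *fbrd)
{
  assert(baud > 0);
  double brd = (double)sys_clk / (16.0 * baud);
  *ibrd = (uint32_t)brd;
  *fbrd = (uint32_t)((brd - *ibrd) * 64.0 + 0.5);
}
```

For an 80MHz clock and 115200 bps, the divisor is about 43.403, which gives an integer part of 43 and a fractional part of 26.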

Simple read/write interface and functions

We provide an IO device abstraction called IO_io and implement four generic functions for accessing it:

int32_t IO_write(IO_io *io, const void *data, uint32_t length);
int32_t IO_print(IO_io *io, const char *format, ...);
int32_t IO_read(IO_io *io, void *data, uint32_t length);
int32_t IO_scan(IO_io *io, uint8_t type, void *data, uint32_t param);

IO_write and IO_read push bytes to and fetch bytes from the device, respectively. IO_print writes a formatted string to the device using the standard printf semantics. IO_scan reads a word (a stream of characters surrounded by whitespace) and tries to convert it to the requested type.

Each subsystem needs to provide its initialization function to fill the IO_io struct with the information required to perform the IO operations. For instance, the following function initializes UART:

int32_t IO_uart_init(IO_io *io, uint8_t module, uint16_t flags, uint32_t baud);

It needs to know which UART module to use, what the desired mode of operation is (non-blocking, asynchronous, DMA...) and what should be the speed of the link. This approach hides the hardware details from the user well and is very generic, see test-01-uart.c. For instance, you can write something like this:

IO_init();
IO_io uart0;
IO_uart_init(&uart0, 0, 0, 115200);
IO_print(&uart0, "Hello %s\r\n", "World");

Passing 0 as flags to the UART initialization routine creates a blocking device that is required for IO_print and IO_scan to work.

Non-blocking and asynchronous IO

A blocking IO device will cause the IO functions to return only after they have pushed or pulled all the data to or from the hardware. If, however, you configure a non-blocking (IO_NONBLOCKING) device, the functions will process as many bytes as they can and return. They return -IO_WOULDBLOCK if it is not possible to handle any data.
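The non-blocking contract can be illustrated with a toy model in which a small array plays the role of the hardware FIFO. Everything in this sketch is hypothetical (the names, the FIFO depth and the numeric value of the error code); it only demonstrates the return value convention described above.

```c
#include <stdint.h>

#define FIFO_SIZE  16   /* hypothetical hardware FIFO depth */
#define WOULDBLOCK 11   /* stand-in for IO_WOULDBLOCK */

static uint8_t  fifo[FIFO_SIZE];
static uint32_t fifo_used;

/* Non-blocking write: queue as many bytes as fit and return the count,
   or -WOULDBLOCK when not even a single byte can be queued. */
int32_t nb_write(const void *data, uint32_t length)
{
  const uint8_t *bytes = data;
  uint32_t n = 0;
  while(n < length && fifo_used < FIFO_SIZE)
    fifo[fifo_used++] = bytes[n++];
  if(n == 0 && length > 0)
    return -WOULDBLOCK;
  return (int32_t)n;
}
```

Writing 10 bytes twice to the empty FIFO returns 10 and then 6, and any further attempt returns -WOULDBLOCK until the hardware drains the queue. The caller is responsible for retrying with the remaining data.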

The IO_ASYNC flag makes the system notify the user about the device readiness for reading or writing. These events are received and processed by a user-defined call-back function:

void uart_event(IO_io *io, uint16_t event)
{
  if(event & IO_EVENT_READ) {
  }

  if(event & IO_EVENT_WRITE) {
  }
}

int main()
{
  IO_init();
  IO_io uart0;
  IO_uart_init(&uart0, 0, IO_NONBLOCKING|IO_ASYNC, 115200);
  uart0.event = uart_event;
  IO_event_enable(&uart0, IO_EVENT_READ|IO_EVENT_WRITE);
  while(1) IO_wait_for_interrupt();
}

See test-02-uart-async.c.


The DMA mode allows for transferring data between the peripheral and the main memory in the background. It uses the memory bus when the CPU does not need it for anything else. When in this mode, IO_read and IO_write only initiate a background transfer. As long as the current DMA operation is in progress, the next invocation will either block or return -IO_WOULDBLOCK, depending on other configuration flags. The memory buffer cannot be changed until the DMA transfer is done. Passing IO_ASYNC will generate completion events for DMA operations. It enables us to implement a pretty neat UART echo app:

#include <io/IO.h>
#include <io/IO_uart.h>

char buffer[30];

void uart_event(IO_io *io, uint16_t event)
{
  if(event & IO_EVENT_DMA_READ)
    IO_write(io, buffer, 30);

  if(event & IO_EVENT_DMA_WRITE)
    IO_read(io, buffer, 30);
}

int main()
{
  IO_init();
  IO_io uart0;
  IO_uart_init(&uart0, 0, IO_DMA|IO_ASYNC, 115200);
  uart0.event = uart_event;
  IO_read(&uart0, buffer, 30);
  while(1) IO_wait_for_interrupt();
}

See test-03-uart-dma.c.

The driver

There was nothing ultimately hard about writing the driver part. It all boils down to reading the data sheet and following the instructions contained therein. It took quite some time to put everything together into a coherent whole, though. See: TM4C_uart.c.

Table of Contents

  1. Compiling and start-up code
  2. Hardware Abstraction Layer and UART
  3. Debugging, display, heap and fonts
  4. Timers and ADC
  5. DAC, Sound and Nokia Tunes
  6. Random Number Generator, Rendering Engine and the Game
  7. Operating System


I have recently been playing with microcontrollers a lot. Among other things, I have worked through some of the labs from this course on EdX. The material does not use much high-level code, so it gives a good overview of how the software interacts with the hardware. There are some "black box" components in there, though. For me, the best way to learn something well has always been building things from "first principles." I find black boxes frustrating. This post describes the first step on my way to make an Alien Invaders game from "scratch."

Compiling for Tiva

First, we need to be able to compile C code for Tiva. To this end, we will use GCC as a cross-compiler, so make sure you have the arm-none-eabi-gcc command available on your system. We will use the following flags to build Tiva-compatible binaries:

  • -mcpu=cortex-m4 - produce the code for ARM Cortex-M4 CPU
  • -mfpu=fpv4-sp-d16 - FPv4 single-precision floating point with the register bank seen by the software as 16 double-words
  • -mfloat-abi=hard - generate floating point instructions and use FPU-specific calling conventions
  • -mthumb - use the Thumb instruction set
  • -std=c11 - use the C11 standard
  • -O0 - don't perform any optimizations
  • -Wall and -pedantic - warn about all the potential issues with the code
  • -ffunction-sections and -fdata-sections - place every function and data item in a separate section in the resulting object file; it allows the optimizations removing all unused code and data to be performed at link-time

Object files

To generate a proper binary image, we need to have some basic understanding of object files produced by the compiler. In short, they consist of sections containing various pieces of compiled code and the corresponding data. These sections may be loadable, meaning that the contents of the section should be read from the object file and stored in memory. They may also be just allocatable, meaning that there is nothing to be loaded, but a chunk of memory needs to be put aside for them nonetheless. There are multiple sections in a typical ELF object file, but we need to know only four of them:

  • .text - contains the program code
  • .rodata - contains the constants (read-only data)
  • .data - contains the read-write data
  • .bss - contains statically allocated variables (initialized to zero)

Let's consider the following code:

#include <stdio.h>

int a = 12;
int b;
const char *c = "The quick brown fox jumps over the lazy dog.";
const char * const d = "The quick brown fox jumps over the lazy dog.";
int e[20];
const int f[4] = {7, 4, 2, 1};

int main(int argc, char **argv)
{
  printf("Hello world!\n");
  return 0;
}

After compiling it, we end up with an object file containing the following sections (most have been omitted for clarity):

]==> objdump -h test

test:     file format elf64-x86-64

Idx Name          Size      VMA               LMA               File off  Algn
 13 .text         00000192  00000000004003f0  00000000004003f0  000003f0  2**4
 15 .rodata       0000006d  0000000000400590  0000000000400590  00000590  2**4
 24 .data         00000020  0000000000600948  0000000000600948  00000948  2**3
                  CONTENTS, ALLOC, LOAD, DATA
 25 .bss          00000090  0000000000600980  0000000000600980  00000968  2**5

As you can see, every section has two addresses:

  • VMA (virtual memory address) - This is the location of the section the code expects when it runs.
  • LMA (load memory address) - This is the location where the section is stored by the loader.

These two addresses are in most cases the same, except in the situation that we care about here: an embedded system. In our binary image, we need to put the .data section in ROM because it contains initialized variables whose values would otherwise be lost on reset. The section's LMA, therefore, must point to a location in ROM. However, this data is not constant, so its final position at the program's runtime needs to be in RAM. Therefore, the VMA must point to a location in RAM. We will see an example later.

Tiva's memory layout

Tiva has 256K of ROM (range: 0x00000000-0x0003ffff) and 32K of RAM (range: 0x20000000-0x20007fff). See table 2-4 on page 90 of the data sheet for details. The NVIC (Interrupt) table needs to be located at address 0x00000000 (section 2.5 of the data sheet). We will create this table in C, put it in a separate object file section, and fill it with weak aliases of the default handler function. This approach will enable the user to redefine the interrupt handlers without having to edit the start-up code. The linker will resolve the handler addresses to strong symbols if any are present.

So, we define a dummy interrupt handler that loops indefinitely:

void __int_handler(void)
{
  while(1);
}

and then create a bunch of weak aliases to this function:

#define DEFINE_HANDLER(NAME) void NAME ## _handler() __attribute__ ((weak, alias ("__int_handler")))

DEFINE_HANDLER(hard_fault);
DEFINE_HANDLER(bus_fault);
DEFINE_HANDLER(usage_fault);
...

Finally, we construct the nvic_table, place it in the .nvic section in the resulting object file and fill it with handler addresses:

#define HANDLER(NAME) NAME ## _handler
void (*nvic_table[])(void) __attribute__ ((section (".nvic"))) = {
  HANDLER(reset),
  HANDLER(nmi),
  HANDLER(hard_fault),
  HANDLER(mman),
...

Linker scripts

We will use linker scripts to set the VMAs and the LMAs to the values we like and to create some symbols whose addresses we can play with in the C code. We first need to define the memory layout:

MEMORY
{
  FLASH (rx)  : ORIGIN = 0x00000000, LENGTH = 0x00040000
  RAM   (rwx) : ORIGIN = 0x20000000, LENGTH = 0x00008000
}

We then need to tell the linker where to put the sections in the final executable:

 1 SECTIONS
 2 {
 3   .text :
 4   {
 5     LONG(0x20007fff)
 6     KEEP(*(.nvic))
 7     *(.text*)
 8     *(.rodata*)
 9      __text_end_vma = .;
10   } > FLASH
12   .data :
13   {
14     __data_start_vma = .;
15     *(.data*)
16     *(vtable)
17     __data_end_vma = .;
18   } > RAM AT > FLASH
20   .bss :
21   {
22     __bss_start_vma = .;
23     *(.bss*)
24     *(COMMON)
25     __bss_end_vma = .;
26   } > RAM
27 }

  1. We start with the .text section and begin it with 0x20007fff. It is the initial value of the stack pointer (see the data sheet). Since the stack grows towards lower addresses, we initialize the top of the stack to the last byte of available RAM.
  2. We then put the .nvic section. The KEEP function forces the linker to keep this section even when the link-time optimizations are enabled, and the section seems to be unused. The asterisk in *(.nvic) is a wildcard for an input object file name. Whatever is in the brackets is a wildcard for a section name.
  3. We put all the code and read-only data from all of the input files in this section as well.
  4. We define a new symbol: __text_end_vma and assign its address to the current VMA (the dot means the current VMA).
  5. We put this section in FLASH: > FLASH at line 10.
  6. We combine the .data* sections from all input files into one section and put it behind the .text section in FLASH. We set the VMAs to be in RAM: > RAM AT > FLASH.
  7. Apparently TivaWare changes the value of the VTABLE register and needs to have the NVIC table in RAM, so we oblige: *(vtable).
  8. We put .bss in RAM after .data.
  9. We use asterisks in section names (e.g. .bss*) because the -ffunction-sections and -fdata-sections parameters cause the compiler to generate a separate section for each function and data item.

Edit 02.04.2016: The initial stack pointer needs to be aligned to 8 bytes for passing of 64-bit long variadic parameters to work. Therefore, the value of the first four bytes in the text section should be: LONG(0x20007ff8). See this post for details.

See the binutils documentation for more details.
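The corrected stack pointer value from the edit note above can be reproduced by rounding the last RAM address down to an 8-byte boundary. This is only a sketch of the arithmetic; align_down and initial_stack_pointer are hypothetical helper names, and the origin and length are the ones from the MEMORY block:

```c
#include <assert.h>
#include <stdint.h>

/* RAM origin and length as given in the MEMORY block of the linker script. */
#define RAM_ORIGIN 0x20000000u
#define RAM_LENGTH 0x00008000u

/* Round addr down to a multiple of align (align must be a power of two). */
uint32_t align_down(uint32_t addr, uint32_t align)
{
  assert(align != 0 && (align & (align - 1)) == 0);
  return addr & ~(align - 1);
}

/* Initial stack pointer: last byte of RAM, rounded down to 8 bytes. */
uint32_t initial_stack_pointer(void)
{
  return align_down(RAM_ORIGIN + RAM_LENGTH - 1, 8);
}
```

With the values above, the last byte of RAM is 0x20007fff and the aligned result is 0x20007ff8, which is the number used in LONG(0x20007ff8).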

Start-up code

On the system start-up, we need to copy the contents of the .data section from FLASH to RAM ourselves before we can run any code. We do it by defining a reset handler:

extern unsigned long __text_end_vma;
extern unsigned long __data_start_vma;
extern unsigned long __data_end_vma;
extern unsigned long __bss_start_vma;
extern unsigned long __bss_end_vma;

extern void main();

void __rst_handler()
{
  unsigned long *src = &__text_end_vma;
  unsigned long *dst = &__data_start_vma;

  while(dst < &__data_end_vma) *dst++ = *src++;
  dst = &__bss_start_vma;
  while(dst < &__bss_end_vma) *dst++ = 0;

  main();
}

void reset_handler() __attribute__ ((weak, alias ("__rst_handler")));

We first declare external symbols. They are put in the symbol table by the linker. The reset handler then moves the .data section from FLASH to RAM, zeroes the .bss section, and calls main.
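To try the copy-and-zero logic on a desktop machine, the two loops can be pulled out into ordinary functions and pointed at plain arrays instead of linker symbols. This is a host-side sketch only; copy_data and zero_bss are hypothetical helper names that do not exist in the start-up file:

```c
#include <assert.h>
#include <stddef.h>

/* Copy the initial values of .data from their load address (FLASH) to
   their run-time address (RAM), one word at a time. */
void copy_data(const unsigned long *src, unsigned long *dst,
               const unsigned long *dst_end)
{
  assert(src != NULL && dst != NULL);
  while(dst < dst_end) *dst++ = *src++;
}

/* Clear .bss so that zero-initialized variables really start at zero. */
void zero_bss(unsigned long *dst, const unsigned long *dst_end)
{
  assert(dst != NULL);
  while(dst < dst_end) *dst++ = 0;
}
```

On the real board, the source pointer would be the address of __text_end_vma and the destination bounds would be the addresses of the __data_* and __bss_* symbols, exactly as in the reset handler.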

A test

Let's put everything together. I wrote a short program that blinks an LED using the SysTick interrupt. The color of the LED depends on the switch pressed. The files are here:

Compile and link:

]==> arm-none-eabi-gcc -mcpu=cortex-m4 -mfpu=fpv4-sp-d16 -mfloat-abi=hard -mthumb -std=c11 -O0 -Wall -pedantic -ffunction-sections -fdata-sections -c main.c -g
]==> arm-none-eabi-gcc -mcpu=cortex-m4 -mfpu=fpv4-sp-d16 -mfloat-abi=hard -mthumb -std=c11 -O0 -Wall -pedantic -ffunction-sections -fdata-sections -c TM4C_startup.c -g
]==> arm-none-eabi-ld -T TM4C.ld TM4C_startup.o main.o -o main --gc-sections

Let's see what we have in the resulting binary:

]==> arm-none-eabi-objdump -h main

main:     file format elf32-littlearm

Idx Name          Size      VMA       LMA       File off  Algn
  0 .text         00000484  00000000  00000000  00010000  2**2
  1 .data         00000004  20000000  00000484  00020000  2**2
                  CONTENTS, ALLOC, LOAD, DATA
  2 .bss          00000004  20000004  00000488  00020004  2**2

The .text section starts at 0x00000000 for both VMA and LMA. The .data section starts at 0x00000484 LMA (in FLASH), but the code expects it to start at 0x20000000 VMA (in RAM). The symbol addresses seem to match the expectations as well:

]==> arm-none-eabi-objdump -t main | grep vma
20000004 g       .bss   00000000 __bss_start_vma
00000484 g       .text  00000000 __text_end_vma
20000008 g       .bss   00000000 __bss_end_vma
20000000 g       .data  00000000 __data_start_vma
20000004 g       .data  00000000 __data_end_vma

We now need to create a raw binary file that we can flash to the board. The arm-none-eabi-objcopy utility can take the relevant sections and put them in an output file aligned according to their LMAs.

]==> arm-none-eabi-objcopy -O binary main main.bin
]==> stat --format=%s main.bin

The total size of the raw binary matches the sum of the sizes of the .text and .data sections (0x488 == 1160). Let's flash it and see if it works!

]==> lm4flash main.bin
Found ICDI device with serial: xxxxxxxx
ICDI version: 9270
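The size check above can also be done by hand. As far as I understand objcopy's binary output, the raw image starts at the lowest LMA among the sections that have contents and runs to the highest end address, so .bss does not count. Here is a minimal sketch of that arithmetic, with a hypothetical section struct and raw_image_size helper:

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical description of one loadable section. */
typedef struct {
  uint32_t lma;   /* load memory address */
  uint32_t size;  /* size in bytes */
} section;

/* Size of a raw image spanning from the lowest LMA to the highest end. */
uint32_t raw_image_size(const section *s, size_t count)
{
  assert(s != NULL && count > 0);
  uint32_t lo = s[0].lma;
  uint32_t hi = s[0].lma + s[0].size;
  for(size_t i = 1; i < count; ++i) {
    if(s[i].lma < lo) lo = s[i].lma;
    if(s[i].lma + s[i].size > hi) hi = s[i].lma + s[i].size;
  }
  return hi - lo;
}
```

Feeding it the .text (LMA 0x00000000, size 0x484) and .data (LMA 0x00000484, size 0x4) entries from the objdump listing gives 0x488, that is 1160 bytes.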


Get the full code at GitHub.

Edit 28.03.2016: There are more details about the startup code in this post.


I have recently started playing with the Tiva launchpad. It's a pity, though, that most of the tutorials and course material out there show you how to program it only using something or other on Windows. I have even gone as far as installing it on my old laptop to follow some of these tutorials. But, I have quickly re-discovered the reasons for my dislike of Windows.

There are some great resources available explaining how to use the Stellaris board on Linux. Stellaris is a predecessor of Tiva, and much of this advice applies to Tiva as well. Everyone seems to use Make, though. I don't like it because generating source file dependencies and discovering libraries with it involves black magic and blood of goats. I decided, then, to add my two cents and create a template for CMake (GitHub). It works fine both with and without TivaWare and uses my BSD-licensed start-up files. To use it for your project, all you need to do is:

#-------------------------------------------------------------------------------
# Some boilerplate
#-------------------------------------------------------------------------------
cmake_minimum_required(VERSION 3.4)
set(CMAKE_TOOLCHAIN_FILE ${CMAKE_SOURCE_DIR}/cmake/TM4C_toolchain.cmake)

include(Firmware)

#-------------------------------------------------------------------------------
# Configure your project
#-------------------------------------------------------------------------------
project(tm4c-template)
add_executable(tm4c-template.axf main.c tm4c/TM4C_startup.c)
add_raw_binary(tm4c-template.bin tm4c-template.axf)
target_link_libraries(tm4c-template.axf ${TIVAWARE_LIB})

And then:

]==> mkdir build
]==> cd build
]==> cmake ../
-- The CXX compiler identification is GNU 4.9.3
-- Check for working CXX compiler: /usr/bin/arm-none-eabi-c++
-- Check for working CXX compiler: /usr/bin/arm-none-eabi-c++ -- works
-- Detecting CXX compiler ABI info
-- Detecting CXX compiler ABI info - done
-- Detecting CXX compile features
-- Detecting CXX compile features - done
-- Configuring done
-- Generating done
-- Build files have been written to: /home/ljanyst/Temp/board/cmake/build
]==> make
Scanning dependencies of target tm4c-template.axf
[ 25%] Building C object CMakeFiles/tm4c-template.axf.dir/main.c.obj
[ 50%] Building C object CMakeFiles/tm4c-template.axf.dir/tm4c/TM4C_startup.c.obj
[ 75%] Linking C executable tm4c-template.axf
[ 75%] Built target tm4c-template.axf
Scanning dependencies of target tm4c-template.bin
[100%] Creating raw binary tm4c-template.bin
[100%] Built target tm4c-template.bin

Or, if you want TivaWare, do this instead:

]==> cmake .. -DTIVAWARE_PATH=/path/to/tivaware/


I wrote a short piece of code that lets you test things without the need for TivaWare. Go here and compile lm4flash; it needs libusb-1.0-0-dev on Debian.

]==> lm4flash tm4c-template.bin
Found ICDI device with serial: 0E21xxxx
ICDI version: 9270


You can slightly tweak the instructions from the tutorial to run a debugging session. Plug in the board and start an Open On-Chip Debugger session:

]==> openocd -f /usr/share/openocd/scripts/board/ek-tm4c123gxl.cfg
Open On-Chip Debugger 0.9.0 (2015-05-28-17:08)
Licensed under GNU GPL v2
For bug reports, read
Info : The selected transport took over low-level target control. The results might differ compared to plain JTAG/SWD
adapter speed: 500 kHz
Info : clock speed 32767 kHz
Info : ICDI Firmware version: 9270
Info : tm4c123gh6pm.cpu: hardware has 6 breakpoints, 4 watchpoints

Then, in another terminal window, run gdb as follows:

]==> cat gdb-embeded.init
target extended-remote :3333
monitor reset halt
monitor reset init
break main
]==> arm-none-eabi-gdb --command=gdb-embeded.init  tm4c-template.axf
GNU gdb (7.10-1+9) 7.10
Copyright (C) 2015 Free Software Foundation, Inc.

-- cut --

Breakpoint 1, main () at /home/ljanyst/Temp/board/cmake/main.c:113
113       init_sys_tick();
(gdb) n
114       init_gpio();
(gdb) n
116       unsigned long led = 0x02;
(gdb) n
119         unsigned long sw1 = !(GPIODATA_REG_PORTF & 0x01);
(gdb) n
120         unsigned long sw2 = !(GPIODATA_REG_PORTF & 0x10);
(gdb) p sw1
$1 = 0
(gdb) p /t *(unsigned long *)0x4005d3fc
$2 = 10001
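The last two gdb commands can be checked by hand. The register prints as binary 10001 (0x11), so both masked bits are set, and the negation in the listing turns a set bit into "not pressed". A small sketch of that decoding follows; the masks are copied from the listing, while switch_pressed and the active-low reading of the inputs are my assumptions:

```c
#include <assert.h>
#include <stdint.h>

/* Bit masks taken from the sw1 and sw2 expressions in the gdb listing. */
#define SW1_MASK 0x01u
#define SW2_MASK 0x10u

/* Assumed active-low inputs: an idle switch reads 1, a pressed one reads 0. */
int switch_pressed(uint32_t port_value, uint32_t mask)
{
  assert(mask != 0);
  return !(port_value & mask);
}
```

With a port value of 0x11, both calls return 0, which agrees with the `$1 = 0` that gdb printed for sw1.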

Have fun!

I have finally found some time to act on my long-standing goal to learn how to program microcontrollers. It's all rather pointless, though, if you cannot make them interact with the surrounding world in some interesting ways. This, in turn, involves quite a bit of analog electronics, which I haven't done all that much of. There are plenty of good materials all over the Internet that can help. I find this one particularly useful. It goes step-by-step through all the basics, as well as op-amps and transistors. The coolest part of it is that you can use what you learn immediately. The microcontroller part is rather rudimentary, so I made the robot play RTTTL ringtones. RTTTL is the format that Nokia used for their old phones, and there are plenty of tunes floating around the Internet. I used this code as a base and ported it to Energia.

Circuits on the Robot

Some of the worst soldering ever :)

I like the MSP-430 because, after programming it, you can remove it from the development kit and solder it onto your board.

This is how the robot works:

  1. It plays Scott Joplin's "The Entertainer" tune.
  2. It waits for a noise command to enable the light-following.
  3. It acknowledges the command.
  4. It follows the light.
  5. If it detects no light, it waits for another noise command to leave the following mode.