Per Lindgren authored
app
Examples and exercises for the Nucleo STM32F401re/STM32F411re devkits.
Dependencies
- Rust 1.40, or later. Run the following commands to update your Rust tool-chain and add the target for Arm Cortex M4 with hardware floating point support.
> rustup update
> rustup target add thumbv7em-none-eabihf
- For programming (flashing) and debugging:
  - openocd debug host (install using your package manager).
  - arm-none-eabi tool-chain (install using your package manager). In the following we refer to arm-none-eabi-gdb as just gdb for brevity.
  - stlink (optional) tools for erasing and programming ST microcontrollers (install using your package manager).
- itm tools for ITM trace output, install by:
> cargo install itm
- vscode editor/ide and cortex-debug plugin. Install vscode using your package manager and follow the instructions at cortex-debug (optional, for an integrated debugging experience).
- rust-analyzer, install following the instructions at rust-analyzer (optional, for Rust support in vscode).
Examples
Hello World! Building and Debugging an Application
- Connect your devkit using USB. To check that it is found you can run:
> lsusb
...
Bus 001 Device 004: ID 0483:374b STMicroelectronics ST-LINK/V2.1
...
(Bus/Device/ID may vary.)
- Run in a terminal (in the app project folder):
> openocd -f openocd.cfg
...
Info : Listening on port 6666 for tcl connections
Info : Listening on port 4444 for telnet connections
Info : clock speed 2000 kHz
Info : STLINK V2J20M4 (API v2) VID:PID 0483:374B
Info : Target voltage: 3.254773
Info : stm32f4x.cpu: hardware has 6 breakpoints, 4 watchpoints
Info : Listening on port 3333 for gdb connections
openocd should connect to your target using the stlink programmer (onboard your Nucleo devkit). See the Trouble Shooting section if you run into trouble.
- In another terminal (in the same app folder) run:
> cargo run --example hello
The cargo sub-command run looks up the runner in the .cargo/config file (runner = "arm-none-eabi-gdb -q -x openocd.gdb").
We can also do this manually.
> cargo build --example hello
> arm-none-eabi-gdb target/thumbv7em-none-eabihf/debug/examples/hello -x openocd.gdb
This starts gdb with file being the hello (elf) binary, and runs the openocd.gdb script, which loads (flashes) the binary to the target (our devkit). The script connects to the openocd server, enables semihosting and ITM tracing, sets breakpoints at main (as well as some exception handlers, more on those later), and finally flashes the binary and runs the first instruction (stepi). (You can change the startup behavior in the openocd.gdb script, e.g., to continue instead of stepi.)
- You can now continue debugging of the program:
...
Note: automatically using hardware breakpoints for read-only addresses.
halted: PC: 0x08000a72
DefaultPreInit ()
at /home/pln/.cargo/registry/src/github.com-1ecc6299db9ec823/cortex-m-rt-0.6.12/src/lib.rs:571
571 pub unsafe extern "C" fn DefaultPreInit() {}
(gdb) c
Continuing.
Breakpoint 1, main () at examples/hello.rs:12
12 #[entry]
The cortex-m-rt run-time initializes the system and your global variables (in this case there are none). After that it calls the #[entry] function. Here you hit a breakpoint.
- You can continue debugging:
(gdb) c
Continuing.
halted: PC: 0x0800043a
^C
Program received signal SIGINT, Interrupt.
hello::__cortex_m_rt_main () at examples/hello.rs:15
15 loop {
At this point, the openocd terminal should read something like:
Thread
xPSR: 0x01000000 pc: 0x08000a1a msp: 0x20008000, semihosting
Info : halted: PC: 0x08000a72
Info : halted: PC: 0x0800043a
Hello, world!
Your program is now stuck in an infinite loop (doing nothing).
- Press CTRL-c in the gdb terminal:
Program received signal SIGINT, Interrupt.
0x08000624 in main () at examples/hello.rs:14
14 loop {}
(gdb)
You have now compiled and debugged a minimal Rust hello example. gdb is a very useful tool, so look up some tutorials/docs (e.g., gdb-doc and the GDB Cheat Sheet).
ITM Tracing
The hello.rs example uses the semihosting interface to emit the trace information (appearing in the openocd terminal). The drawback is that semihosting is incredibly slow, as it involves a lot of machinery to process each character. (Essentially, it writes a character to a given position in memory and runs a dedicated break instruction; openocd detects the break, reads the character at the given position in memory and emits the character to the console.)
A better approach is to use the ARM ITM (Instrumentation Trace Macrocell), designed to implement tracing more efficiently. The onboard stlink programmer can put up to 4 characters into an ITM packet, and transmit that to the host (openocd). openocd can process the incoming data and send it to a file or FIFO queue. The ITM packet stream needs to be decoded (header + data). To this end we use the itmdump tool.
In a separate terminal, create a named fifo:
> mkfifo /tmp/itm.fifo
> itmdump -f /tmp/itm.fifo
Hello, again!
Now you can compile and run the itm.rs application using the same steps as the hello program. In the itmdump console you should now have the trace output.
> cargo run --example itm
Under the hood there is much less overhead. The serial transfer rate is set to 2 Mbit/s between the ITM (inside of the MCU) and the stlink programmer (onboard the Nucleo devkit), so in theory we can transmit some 200 kByte/s of data over ITM. However, we are limited by the USB interconnection and by how fast openocd can receive and forward packets.
The stlink programmer buffers packets but has limited buffer space. Hence in practice, you should keep tracing to short messages, else the buffer will overflow. See the Trouble Shooting section if you run into trouble.
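The 200 kByte/s estimate can be reproduced with a back-of-the-envelope calculation. The sketch below runs on the host and assumes 8N1 framing on the trace line (one start bit, eight data bits, one stop bit, so 10 line bits per payload byte). The framing is an assumption, not something stated above, and the function name is made up for illustration.

```rust
/// Payload bytes per second on a serial line, given the bit rate and the
/// number of line bits spent per payload byte (assumed to be 10 for 8N1).
fn bytes_per_second(bit_rate: u32, line_bits_per_byte: u32) -> u32 {
    bit_rate / line_bits_per_byte
}

fn main() {
    // 2 Mbit/s line rate, as set up between the ITM and the stlink programmer.
    let rate = bytes_per_second(2_000_000, 10);
    println!("{} kByte/s", rate / 1000);
}
```

This is an upper bound on the raw line only. ITM packet headers and the USB/openocd path reduce the usable rate further.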
panic Handling
The Rust compiler statically analyses your code, but some errors cannot be detected at compile time (e.g., array indexing out of bounds, division by zero etc.). For such cases the Rust compiler generates code that checks for the fault at run-time, instead of just crashing (or even worse, continuing with faulty/undefined values like a C program would). A fault in Rust will render a panic, with an associated error message (useful for debugging the application). We can choose how such panics should be treated, e.g., transmitting the error message using semihosting, ITM, some other channel (e.g. a serial port), or simply aborting the program.
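The run-time checks can be observed on the host without any hardware. The following is a minimal sketch, not taken from the panic example: the function names are hypothetical, and std::panic::catch_unwind is a host-only facility used here just to observe that a panic happened. On the target, the selected panic handler crate decides what happens instead.

```rust
use std::panic;

/// Indexing is bounds checked at run-time; an invalid index panics.
fn element(data: &[u32], index: usize) -> u32 {
    data[index]
}

/// Integer division checks the divisor at run-time; zero panics.
fn divide(a: u32, b: u32) -> u32 {
    a / b
}

fn main() {
    // Each closure panics; catch_unwind turns the panic into an Err value.
    let out_of_bounds = panic::catch_unwind(|| element(&[1, 2, 3], 7));
    let div_by_zero = panic::catch_unwind(|| divide(1, 0));
    println!("index 7 panicked: {}", out_of_bounds.is_err());
    println!("1 / 0 panicked: {}", div_by_zero.is_err());

    // The non-panicking alternatives return an Option instead.
    println!("{:?}", [1u32, 2, 3].get(7));
    println!("{:?}", 1u32.checked_div(0));
}
```

Where a fault is expected and recoverable, the Option returning variants avoid the panic path (and its formatting code) altogether.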
The panic example demonstrates some possible use cases.
The openocd.gdb script sets a breakpoint at rust_begin_unwind (a function in the rust core library, used to recover errors).
When running the example (see above how to compile and run), the gdb terminal will show:
...
Breakpoint 2, main () at examples/panic.rs:27
27 panic!("Oops")
(gdb) c
Continuing.
halted: PC: 0x08000404
Breakpoint 1, rust_begin_unwind (_info=0x20017fb4)
at /home/pln/.cargo/registry/src/github.com-1ecc6299db9ec823/panic-halt-0.2.0/src/lib.rs:33
33 atomic::compiler_fence(Ordering::SeqCst);
(gdb) p *_info
$1 = core::panic::PanicInfo {payload: core::any::&Any {pointer: 0x8000760 <.Lanon.21a036e607595cc96ffa1870690e4414.142> "\017\004\000", vtable: 0x8000760 <.Lanon.21a036e607595cc96ffa1870690e4414.142>}, message: core::option::Option<&core::fmt::Arguments>::Some(0x20017fd0), location: core::panic::Location {file: <error reading variable>, line: 27, col: 5}}
Here p *_info prints the argument to rust_begin_unwind. At the far end you will find line: 27, col: 5, which corresponds to the source code calling panic!("Oops"). (gdb is not (yet) Rust aware enough to figure out how the file field should be interpreted, but at least we get some useful information.)
Alternatively we can trace the panic message over semihosting (comment out extern crate panic_halt and uncomment extern crate panic_semihosting).
The openocd console should now show:
Info : halted: PC: 0x080011a0
panicked at 'Oops', examples/panic.rs:27:5
Under the hood, this approach involves formatting of the panic message, the implementation of which occupies a bit of flash memory (in our case we have 512kB, so plenty, but for the smallest of MCUs this may be a problem). Another drawback is that it requires a debugger to be connected and active.
Another alternative is to use ITM (uncomment extern crate panic_itm). This is faster, but be aware that the message may overflow the ITM buffer, so it may be unreliable. It also assumes that the ITM stream is actively monitored.
A third alternative would be to store the panic message in some non-volatile memory (flash, eeprom, etc.). This allows for true post-mortem debugging of a unit put in production. This approach is used e.g. in automotive applications where the workshop can read-out error codes of your vehicle.
Exception Handling and Core Peripheral Access
The ARM Cortex-M processors feature a set of core peripherals and exception handlers. These offer basic functionality independent of vendor (NXP, STM, ...). The SysTick peripheral is a 24-bit countdown timer that raises a SysTick exception when hitting 0 and reloads the set value. Seen as a real-time system, we can dispatch the SysTick task in a periodic fashion (without accumulated drift under some additional constraints).
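Because the counter is only 24 bits wide, not every period can be expressed with a single reload value. The host-side sketch below computes the reload value for a given number of clock cycles. The 16 MHz clock is an assumed value for illustration (check your own clock setup), and the names are hypothetical rather than part of any crate.

```rust
/// Largest value that fits in the 24-bit SysTick reload register.
const SYST_MAX_RELOAD: u32 = 0x00FF_FFFF;

/// Reload value for a period of `ticks` clock cycles, or None if it does
/// not fit. The counter counts reload, reload - 1, ..., 0, so one period
/// spans reload + 1 cycles.
fn reload_for(ticks: u32) -> Option<u32> {
    let reload = ticks.checked_sub(1)?;
    if reload <= SYST_MAX_RELOAD {
        Some(reload)
    } else {
        None
    }
}

fn main() {
    let clock_hz: u32 = 16_000_000; // assumed core clock, adjust to your setup
    println!("{:?}", reload_for(clock_hz)); // one second of cycles
    println!("{:?}", reload_for(2 * clock_hz)); // two seconds of cycles
}
```

Under the assumed clock a one second period fits, while two seconds exceeds the 24-bit range and would need counting several shorter periods in the handler.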
In the exception.rs example a . is emitted by the SysTick handler using semihosting. Running the example should give you a periodic update of the openocd console.
The exception_itm.rs and exception_itm_raw.rs examples use the ITM instead. The difference is the way they gain access to the ITM peripheral. In the first case we steal the whole set of core peripherals, while in the second case we use raw pointer access to the ITM. In both cases, the code is unsafe, as there is no guarantee that other tasks do not access the peripheral simultaneously (causing a conflict/race). Later we will see how the concurrency problem is solved in RTFM to offer safe access to peripherals.
Crash - Analyzing the Exception Frame
In case the execution of an instruction fails, a HardFault exception is raised by the hardware, and the HardFault handler is executed. We can define our own handler as in the example crash.rs. In main we attempt to read an illegal address, causing a HardFault, and we hit a breakpoint (the openocd.gdb script sets a breakpoint at the HardFault handler). From there you can print the exception frame, reflecting the state of the MCU when the error occurred. You can use gdb to give a back trace of the call-stack leading up to the error. See the example for detailed information.
Most crash conditions trigger a hard fault exception, whose handler is defined via
#[exception]
fn HardFault(ef: &cortex_m_rt::ExceptionFrame) -> ! {
...
cortex-m-rt generates a trampoline that calls into your user defined HardFault handler. We can use cargo expand to view the expanded code:
> cargo expand --example crash > crash_expand.rs
In the generated file we find:
#[doc(hidden)]
#[export_name = "HardFault"]
#[link_section = ".HardFault.user"]
pub unsafe extern "C" fn __cortex_m_rt_HardFault_trampoline(frame: &::cortex_m_rt::ExceptionFrame) {
__cortex_m_rt_HardFault(frame)
}
The HardFault handler has access to the exception frame, a snapshot of the CPU registers at the moment of the exception.
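As a rough picture of what that snapshot looks like in memory, the sketch below models the basic frame as eight 32-bit words stacked by the hardware on exception entry. The Frame type is a hypothetical stand-in written for the host, not the cortex_m_rt type. Its field names follow the ones gdb prints for cortex_m_rt::ExceptionFrame in this walkthrough, and the stack contents are made up.

```rust
/// Hypothetical stand-in for the basic exception frame: eight 32-bit
/// words stacked by the hardware, lowest address first.
#[repr(C)]
#[derive(Debug, Clone, Copy, PartialEq)]
struct Frame {
    r0: u32,
    r1: u32,
    r2: u32,
    r3: u32,
    r12: u32,
    lr: u32,
    pc: u32,
    xpsr: u32,
}

impl Frame {
    /// Interpret eight stacked words as a frame.
    fn from_words(w: [u32; 8]) -> Frame {
        Frame {
            r0: w[0], r1: w[1], r2: w[2], r3: w[3],
            r12: w[4], lr: w[5], pc: w[6], xpsr: w[7],
        }
    }
}

fn main() {
    // Made-up stack contents, for illustration only.
    let words = [1, 2, 3, 4, 12, 0x0800_0101, 0x0800_0200, 0x0100_0000];
    let frame = Frame::from_words(words);
    println!("{:#x?}", frame);
    println!("frame size: {} bytes", std::mem::size_of::<Frame>());
}
```

Only the caller-saved registers are part of the frame, which is why the remaining registers must be read from the live CPU state in the debugger.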
To better see what is happening we make a --release build. (It reduces the amount of redundant code.)
> cargo run --example crash --release
...
Breakpoint 2, HardFault (frame=0x20007fe0) at examples/crash.rs:28
28 #[exception]
(gdb) p/x *frame
$1 = cortex_m_rt::ExceptionFrame {r0: 0x2fffffff, r1: 0xf00000, r2: 0x0, r3: 0x0, r12: 0x0, lr: 0x800051f, pc: 0x8000524, xpsr: 0x61000000}
(gdb) disassemble frame.pc
Dump of assembler code for function crash::__cortex_m_rt_main:
0x08000520 <+0>: mvn.w r0, #3489660928 ; 0xd0000000
0x08000524 <+4>: ldr r0, [r0, #0]
0x08000526 <+6>: b.n 0x8000526 <crash::__cortex_m_rt_main+6>
End of assembler dump.
The program counter (frame.pc) contains the address of the instruction that caused the exception. In GDB one can disassemble the program around this address to observe the instruction that caused the exception. In our case it is ldr r0, [r0, #0] that caused the exception. This instruction tried to load (read) a 32-bit word from the address stored in the register r0. Looking again at the contents of ExceptionFrame we find that r0 contained the address 0x2FFF_FFFF when this instruction was executed.
Now look at the preceding instruction, mvn.w r0, #3489660928 ; 0xd0000000. This is a move-not (MVN) instruction, which writes the bitwise inverse of its operand, so the resulting value is actually 0x2fffffff. Why did the compiler not load 0x2FFF_FFFF directly? An arbitrary 32-bit constant does not fit in a 32-bit instruction, since the encoding also needs room for the opcode and the register. So under the hood the instruction stores 0xd0, shifts it into place, and applies a bitwise inversion. This is the level of optimization Rust + LLVM is capable of.
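The arithmetic is easy to check on the host. The sketch below mirrors what the instruction computes; the mvn function is a hypothetical helper named after the instruction, not an API of any crate.

```rust
/// What MVN computes: the bitwise NOT of its operand.
fn mvn(operand: u32) -> u32 {
    !operand
}

fn main() {
    let operand: u32 = 0xd000_0000; // immediate shown in the disassembly
    println!("operand (decimal): {}", operand);
    println!("operand (0xd0 shifted into the top byte): {:#010x}", 0xd0u32 << 24);
    println!("resulting r0: {:#010x}", mvn(operand));
}
```

The decimal number in the listing (3489660928) and 0xd0000000 are the same operand, and its inverse is the address found in r0 of the exception frame.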
We can further backtrace the calls leading up to the fault.
(gdb) bt
#0 HardFault (frame=0x20007fe0) at examples/crash.rs:79
#1 <signal handler called>
#2 core::ptr::read_volatile (src=0x2fffffff)
at /rustc/73528e339aae0f17a15ffa49a8ac608f50c6cf14/src/libcore/ptr/mod.rs:948
#3 crash::__cortex_m_rt_main () at examples/crash.rs:71
#4 0x08000404 in main () at examples/crash.rs:66
Here we see that in frame #2 we are doing the read causing havoc.
We can also use panic!("Exception frame {:?}", ef); to format and print the exception frame, e.g., over semihosting or ITM. In the example we use semihosting, so when continuing debugging you will eventually see the exception frame printed in the openocd console. In the openocd.gdb file we set breakpoints at the exception handlers:
# detect unhandled exceptions, hard faults and panics
break DefaultHandler
break HardFault
break rust_begin_unwind
So in case you want to go directly to a panic! printout of the exception frame, comment out the breakpoints.
Notice. panic!("Exception frame {:?}", ef); will bring in the formatting code from the core library (which is kind of large), so in case you are scarce on flash memory, you may want to use some other method.
Device Crates and System View Descriptions (SVDs)
Besides the ARM provided core peripherals, the STM32F401re/STM32F411re MCUs have numerous vendor specific peripherals (GPIOs, Timers, USARTs etc.). The vendor provides a System View Description (SVD) specifying the register block layouts (fields, enumerated values, etc.). Using the svd2rust tool we can derive a Peripheral Access Crate (PAC) providing an API for the device that allows us to access each register according to the vendor's specification. The device.rs example showcases how a PAC for the STM32F401re/STM32F411re MCUs can be added. (These MCUs have the same set of peripherals, only the maximum clock rating differs.)
> cargo run --example device --features stm32f4
The example outputs a . each second over semihosting and ITM.
The Cargo.toml file
Looking at the Cargo.toml file we find:
...
[dependencies.stm32f4]
version = "0.9.0"
features = ["stm32f401", "rt"]
optional = true
...
# Built options for different examples
[[example]]
name = "device"
required-features = ["stm32f4"]
...
We compile stm32f4 (a generic library for all STM32F4 MCUs) with features = ["stm32f401", "rt"], which selects the specific MCU and enables rt (so we get the interrupt vector etc.). By making the PAC an optional dependency, we do not need to compile it unless we need it (as you might have experienced already, compiling the PAC takes a bit of time initially). (An SVD file is typically > 50k lines, amounting to the same (or more) lines of Rust code.)
By compiling with --features stm32f4 we "opt-in" this dependency.
Hardware Abstraction Layer
For convenience, common functionality can be implemented for a specific MCU (or family of MCUs). The stm32f4xx-hal is a Work In Progress, implementing a Hardware Abstraction Layer (hal) for the stm32f4 family. It implements the embedded-hal serial trait (interface), see https://crates.io/search?q=embedded-hal, to read and write single bytes over a serial port. However, setting up communication is out of scope for the embedded-hal.
The serial.rs example showcases a simple echo application, repeating back incoming data over a serial port (byte by byte). You will also get trace output over the ITM.
Looking closer at the example, rcc is a singleton (constrain consumes the RCC and returns a singleton). The freeze consumes the singleton (rcc) and sets the MCU clock tree according to the (default) cfgr. (Later in the exercises you will change this.)
This pattern ensures that the clock configuration will remain unchanged (the freeze function cannot be called again, as the rcc is consumed; also you cannot get a new rcc, as the RCC was consumed by constrain).
Why is this important, you may ask? Well, this pattern allows the compiler to check and ensure that your code (or some library that you use) does not make changes to the system (in this case the clocking), which reduces the risk of errors and improves robustness.
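The consume-by-value idea can be sketched in plain Rust on the host. The types below are hypothetical stand-ins and deliberately simpler than the stm32f4xx-hal API, and the 16 MHz value is an assumed placeholder. The point is only that a method taking self by value moves the object, so the compiler rejects any later use.

```rust
/// Hypothetical stand-ins; these are not the stm32f4xx-hal types.
struct RccPeripheral; // the raw peripheral, exists once

struct Rcc {
    sysclk_hz: u32, // the constrained, still configurable view
}

#[derive(Debug, Clone, Copy, PartialEq)]
struct Clocks {
    sysclk_hz: u32, // the frozen, read-only result
}

impl RccPeripheral {
    /// Takes self by value: the raw peripheral is gone afterwards.
    fn constrain(self) -> Rcc {
        Rcc { sysclk_hz: 16_000_000 } // assumed default, for illustration
    }
}

impl Rcc {
    /// Takes self by value: the configuration cannot be changed afterwards.
    fn freeze(self) -> Clocks {
        Clocks { sysclk_hz: self.sysclk_hz }
    }
}

fn main() {
    let rcc = RccPeripheral.constrain();
    let clocks = rcc.freeze();
    println!("{:?}", clocks);
    // rcc.freeze(); // would not compile: `rcc` was moved by the first call
}
```

Clocks is Copy here so that it can be handed to several drivers as read-only information, while the objects that could change the configuration no longer exist.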
Similarly, we split the GPIOA into its parts (pins), and select the operating mode af7 for tx (the transmit pin pa2) and rx (the receive pin pa3). For details see RM0368, figure 17 (page 151), and table 9 in STM32F401xD STM32F401xE. The GPIO pins pa2 and pa3 are (by default) connected to the stlink programmer, see section 6.8 of the Nucleo64 user manual UM1724. When the stlink programmer is connected to a linux host, the device /dev/ttyACM0 appears as a virtual com port.
Now we can call Serial::usart2 to set up the serial communication (according to table 9 in the STM32F401xD STM32F401xE documentation it is USART2).
Following the singleton pattern it consumes the USART2 peripheral (to ensure that only one configuration can be active at any time). The second parameter is the pair of pins (tx, rx) that we set up earlier.
The third parameter is the USART configuration. By default it is set to 8 bit data and one stop bit. We set the baudrate to 115200. We also pass the clocks (holding information about the MCU clock setup).
At this point tx and rx are owned by serial. We can get access to them again by serial.split().
In the loop we match on the result of block!(rx.read()). block! repeatedly calls rx.read() until either a byte is received or an error is returned. In case rx.read() succeeded, we trace the received byte over the ITM, and echo it by tx.write(byte), ignoring the result (we just assume sending will always succeed). In case rx.read() returned with an error, we trace the error message.
As the underlying hardware implementation buffers only a single byte, the input buffer may overflow (resulting in rx.read() returning an error).
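The polling behavior described above can be sketched on the host without the real macro or hardware. Everything below is hypothetical: FakeRx replays a scripted sequence of poll results, ReadError separates "not ready yet" from a real error, and block_on_read is a hand-written loop standing in for block!.

```rust
use std::collections::VecDeque;

/// Hypothetical error type: "no data yet" versus a real receive error.
#[derive(Debug, PartialEq)]
enum ReadError {
    WouldBlock,
    Overrun,
}

/// Hypothetical receiver that replays a scripted sequence of poll results.
struct FakeRx {
    script: VecDeque<Result<u8, ReadError>>,
}

impl FakeRx {
    /// One non-blocking poll; reports WouldBlock once the script is empty.
    fn read(&mut self) -> Result<u8, ReadError> {
        self.script.pop_front().unwrap_or(Err(ReadError::WouldBlock))
    }
}

/// Busy-wait until read() yields a byte or an error other than WouldBlock.
fn block_on_read(rx: &mut FakeRx) -> Result<u8, ReadError> {
    loop {
        match rx.read() {
            Err(ReadError::WouldBlock) => continue,
            other => return other,
        }
    }
}

fn main() {
    let script = vec![Err(ReadError::WouldBlock), Ok(b'a'), Err(ReadError::Overrun)];
    let mut rx = FakeRx { script: VecDeque::from(script) };
    println!("{:?}", block_on_read(&mut rx)); // Ok(97)
    println!("{:?}", block_on_read(&mut rx)); // Err(Overrun)
}
```

The busy-wait is what makes this pattern fragile: while the loop is occupied elsewhere (for example tracing over ITM), a second byte can arrive and overwrite the first, which the sketch models as the Overrun error.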
You can now compile and run the example. Start moserial (or some other serial communication tool) and connect to /dev/ttyACM0 (with 115200 8N1). Now write a single character a in the Outgoing pane, followed by pressing Enter. You should receive back an a from the target. In the ITM trace output you should see:
Ok 97
Depending on whether Enter was encoded as CR+LF, CR, LF, TAB, ... you will get additional bytes sent (and received). Try sending multiple characters at once, e.g. abcd, and you will see that you get a buffer overflow.
This is an example of a bad programming pattern, typically leading to serious problems in (real-time) embedded programming (so it takes more than just Rust to get it right). Later in the exercises we will see how better patterns can be adopted.
Real Time For the Masses (RTFM)
RTFM allows for safe concurrency, sharing resources between different tasks running at different priorities. The resource management and scheduling follow the Stack Resource Policy, which gives us outstanding properties of race- and deadlock free scheduling, single blocking, stack sharing etc.
We start with a simple example. For the full documentation see the RTFM book.
RTFM ITM, with Interrupt
Key here is that we share the ITM peripheral between the init task (that runs first), the exti0 task (that runs preemptively with a default priority of 1), and the background task idle (that runs on priority 0). Since the highest priority task cannot be interrupted we can safely access the shared resource directly (in exti0). In idle however, we need to lock the resource (which passes itm, a reference to the ITM peripheral, to the closure).
By rtfm::pend we can simulate the triggering of an interrupt (more realistically, interrupts are triggered by the environment, e.g., when a peripheral has received some data).
> cargo run --example rtfm_itm --features rtfm
For more information see app.
Looking at the Cargo.toml
file we find:
...
[dependencies.stm32f4xx-hal]
version = "0.6.0"
features = ["stm32f401", "rt"]
optional = true
[dependencies.cortex-m-rtfm]
version = "0.5.1"
optional = true
[features]
rtfm = ["cortex-m-rtfm", "stm32f4xx-hal"]
...
[[example]]
name = "rtfm_itm"
required-features = ["rtfm"]
The rtfm feature opts in the dependencies on cortex-m-rtfm and stm32f4xx-hal (which in turn opts in the dependency on stm32f4 under the stm32f401 and rt features). Through the hal we can get access to the underlying device/PAC (peripherals, interrupts etc.).
RTFM ITM, using Spawn
In the previous example we triggered the exti0 task manually. We can let RTFM do that for us using spawn with an optional payload. Thus we have a simple way to do message passing.
> cargo run --example rtfm_itm_spawn --features rtfm
The spawn.unwrap() panics if the message could not be delivered (i.e., the queue is full). The size (capacity) of the queues is 1 by default, but can be set for each task individually, see spawn.
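The effect of a capacity of 1 can be illustrated with a host-side analogy. The sketch below uses a bounded std channel in place of the RTFM message queue. It is an analogy only, not how RTFM is implemented, and try_spawn is a hypothetical helper that hands the payload back when the queue is full.

```rust
use std::sync::mpsc::{sync_channel, SyncSender, TrySendError};

/// Hypothetical helper: enqueue a payload, handing it back on failure.
fn try_spawn(queue: &SyncSender<u32>, payload: u32) -> Result<(), u32> {
    match queue.try_send(payload) {
        Ok(()) => Ok(()),
        Err(TrySendError::Full(p)) | Err(TrySendError::Disconnected(p)) => Err(p),
    }
}

fn main() {
    // Bounded queue holding at most one pending message.
    let (queue, task) = sync_channel::<u32>(1);

    println!("{:?}", try_spawn(&queue, 1)); // Ok(())
    println!("{:?}", try_spawn(&queue, 2)); // Err(2), the queue is full
    println!("{:?}", task.recv()); // Ok(1), the "task" consumes the message
    println!("{:?}", try_spawn(&queue, 3)); // Ok(()), there is room again
}
```

Calling unwrap() on the second result would panic, which corresponds to the behavior described above when a task is spawned faster than it can run.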