Software tracing comes in two main flavors: software-defined trace and hardware-defined trace. Hardware-defined trace means that the processor outputs a trace of low-level software events, such as control-flow branches and exceptions/interrupts. The resulting trace is very detailed and can therefore be overwhelming; you don’t see the forest for all the trees!
Software-defined trace means that tracing code is added at strategic locations to record selected events. This way you don’t see all the low-level details and there is some performance cost, but you can control exactly which events are traced, and no special tracing hardware is required. Many common RTOS already include tracing that captures RTOS events automatically, such as task switches and kernel service calls. Below is a comparison of the two types of tracing.
Aspect | Hardware Trace | Software Trace |
Level of detail | Instructions | Events |
Performance impact | None | Minor |
Hardware requirements | Several | None |
When applicable | Lab only | Anywhere |
In many debugging and profiling cases it is important to trace exceptions, such as timer interrupts or interrupts from communication interfaces. Most ARM MCUs offer hardware support for this purpose, but it is not very flexible. There is no way of selecting which exceptions to trace, so you get all or nothing. Since some exceptions can be very frequent, this often generates a lot of data. So unless you have a high-end debug probe with top-notch tracing performance, you will quickly saturate the trace buffers and only get random fragments of the trace.
Software-defined trace allows for tracing just about anything, so you can choose to include just those exception handlers you are interested in. This is done by calling a certain trace function or macro, typically both at the entry and the exit of the exception handler, as illustrated in Figure 1.
Figure 1. Recording exception handlers
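To make the entry/exit idea concrete, here is a minimal sketch in C of what such hooks could look like. It is not the Tracealyzer recorder API: the names `trace_record`, `TRACE_ISR_BEGIN`, `TRACE_ISR_END`, `Timer_IRQHandler` and `fake_cycles` are hypothetical. The timestamp source is stubbed so that the sketch builds on a host; on a Cortex-M part with a DWT cycle counter one would typically read `DWT->CYCCNT` instead.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical event kinds; a real recorder defines its own encoding. */
typedef enum { TRACE_ISR_ENTER = 1, TRACE_ISR_EXIT = 2 } trace_kind_t;

typedef struct {
    uint32_t timestamp; /* cycle count when the event was recorded */
    uint8_t  kind;      /* TRACE_ISR_ENTER or TRACE_ISR_EXIT */
    uint8_t  isr_id;    /* application-chosen id of the handler */
} trace_event_t;

#define TRACE_CAPACITY 64u /* power of two, so the index wraps with a mask */
static_assert((TRACE_CAPACITY & (TRACE_CAPACITY - 1u)) == 0u,
              "TRACE_CAPACITY must be a power of two");

static trace_event_t trace_buf[TRACE_CAPACITY]; /* ring buffer of events */
static uint32_t trace_count;                    /* total events recorded */

/* Assumption: stand-in for a hardware cycle counter such as DWT->CYCCNT. */
static uint32_t fake_cycles;
static uint32_t trace_now(void) { return fake_cycles; }

/* Store one event, overwriting the oldest entry once the buffer is full.
 * A real port must also protect this against nested interrupts. */
void trace_record(trace_kind_t kind, uint8_t isr_id)
{
    trace_event_t *e = &trace_buf[trace_count & (TRACE_CAPACITY - 1u)];
    e->timestamp = trace_now();
    e->kind = (uint8_t)kind;
    e->isr_id = isr_id;
    trace_count++;
}

#define TRACE_ISR_BEGIN(id) trace_record(TRACE_ISR_ENTER, (id))
#define TRACE_ISR_END(id)   trace_record(TRACE_ISR_EXIT, (id))

#define ISR_ID_TIMER 1u /* hypothetical id for the timer interrupt */

/* Only handlers instrumented like this one show up in the trace. */
void Timer_IRQHandler(void)
{
    TRACE_ISR_BEGIN(ISR_ID_TIMER);
    fake_cycles += 120u; /* stands in for the real handler work */
    TRACE_ISR_END(ISR_ID_TIMER);
}
```

Because the hooks are ordinary function calls, leaving them out of a very frequent handler keeps that handler out of the trace, which is the selectivity that hardware exception trace lacks.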
Getting a correct exception trace is, however, quite challenging on ARM Cortex-M MCUs. They feature highly optimized exception processing in order to reduce interrupt latency. One clever optimization is called exception tail-chaining: if a new exception is pending when an exception handler exits, the processor switches directly to the new exception handler without returning to the previous context in between, as illustrated in Figure 2.
Figure 2. Exception tail-chaining on ARM Cortex-M MCUs
This is a great optimization, but since it is intended to be transparent to the software, it is not easy to detect correctly using software-defined trace. If exception tail-chaining is not accounted for, the resulting trace shows a short fragment of the previous context in between the tail-chained exceptions (Figure 3), which is quite wrong in cases where they actually executed back-to-back (as in Figure 2). To get this right, we need a way to tell these cases apart, but the processor does not provide any obvious information about this. So how do we handle this?
Figure 3. Incorrect display if not accounting for exception tail-chaining.
Figure 4. Percepio Tracealyzer
Tracealyzer detects exception tail-chaining by analyzing the number of clock cycles between adjacent exception events, that is, between the exit of one exception handler and the entry of the next. If this number is below a certain value, the exceptions have executed tail-chained, since there have not been enough clock cycles to fully restore the previous context in between. The solution is illustrated by Figure 5, below.
Figure 5. How Tracealyzer detects exception tail-chaining
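The timing check can be sketched in a few lines of C. This is only an illustration of the principle, not Tracealyzer's actual implementation: the function name `is_tail_chained` and its parameters are hypothetical, and the threshold is an assumption that would have to be calibrated for the specific core, clock setup and trace hook overhead.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Decide whether two adjacent exception handlers ran tail-chained.
 *
 * prev_exit_cycles:  timestamp of the exit event of the first handler
 * next_enter_cycles: timestamp of the entry event of the next handler
 * max_gap_cycles:    hypothetical threshold; a gap below this value is
 *                    assumed too short for the previous context to have
 *                    been restored and interrupted again
 */
bool is_tail_chained(uint32_t prev_exit_cycles,
                     uint32_t next_enter_cycles,
                     uint32_t max_gap_cycles)
{
    assert(max_gap_cycles > 0u); /* a zero threshold would never match */

    /* Unsigned subtraction stays correct if the 32-bit counter wrapped
     * once between the two events. */
    uint32_t gap = next_enter_cycles - prev_exit_cycles;
    return gap < max_gap_cycles;
}
```

When the function returns true, the viewer can draw the two handlers back-to-back, as in Figure 2, instead of inserting a fragment of the previous context, as in Figure 3.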
Tracealyzer allows you to trace and visualize the execution of exceptions (ISRs), RTOS tasks and other software events, providing more than 25 interconnected views that give an amazing visual insight into the runtime world of RTOS-based firmware, accelerating development, validation and debugging. Tracealyzer is available for several common RTOS, including FreeRTOS, SafeRTOS, Linux, VxWorks and Micrium µC/OS-III, and a version for ThreadX will be released during 2016. There is even a feature-limited free version. We have several new and exciting analysis features in development that allow for even better performance analysis, so stay tuned!