Niclas Lindblom
Sr. FAE at Percepio

As you might know, our Tracealyzer tools allow you to record and visualize the real-time behavior of your RTOS-based firmware. In previous blog posts, I’ve talked about the possibility of streaming the trace data from your board, and today we’ll consider one specific case – USB CDC.

A few months ago, when working with an inexpensive STM32 Nucleo board, I noticed a USB Device connector. Since our RTOS recorder already supported streaming over other interfaces and is pretty configurable, I decided to give it a shot. It didn’t seem very hard, especially since configuration tools like STM32CubeMX make it quite easy to generate the code needed for a USB connection.


The USB standard offers several “device classes”, like mouse/keyboard (HID), audio, mass storage, etc. For streaming RTOS trace, the Communication Device Class (CDC) is very suitable.

This device class has many uses, one of which is to mimic the behavior of classic RS-232 serial connections. Those are usually too slow for RTOS tracing, as their electrical characteristics often limit the maximum bandwidth to a few hundred kbit/s. USB CDC provides a similar interface, including a virtual COM port on the host, but is much faster.

In case you are already using the USB interface for other purposes, you can combine multiple logical connections (USB Devices) on a single physical USB controller by defining a “Composite Device”.

RTOS tracing

The Tracealyzer solution consists of two parts: a trace recorder library and the host application (Tracealyzer). The communication interface between them is configurable using the concept of “stream ports” – a header file that defines how to read and write the data, in this case using USB CDC.

When the user presses “Start recording” or “Stop recording” in the Tracealyzer application, a command code is sent to the target system. In this case, it is copied by the USB interrupt handler to a separate command buffer, which is then read by the Tracealyzer control task (TzCtrl).
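The hand-off between the interrupt handler and the control task can be pictured as a small single-producer, single-consumer ring buffer. The following is only a minimal sketch of that idea, not the code shipped in the stream port; the names cmd_buffer_put, cmd_buffer_get and CMD_BUF_SIZE are hypothetical.

```c
#include <stddef.h>
#include <stdint.h>

#define CMD_BUF_SIZE 64u /* hypothetical size; must be a power of two */

static volatile uint8_t  cmd_buf[CMD_BUF_SIZE];
static volatile uint32_t cmd_head; /* advanced only by the USB receive handler */
static volatile uint32_t cmd_tail; /* advanced only by the control task */

/* Called from the USB receive interrupt: copy incoming bytes into the buffer.
 * Returns the number of bytes stored; bytes that do not fit are dropped. */
size_t cmd_buffer_put(const uint8_t *data, size_t len)
{
    size_t stored = 0;
    while (stored < len && (cmd_head - cmd_tail) < CMD_BUF_SIZE) {
        cmd_buf[cmd_head % CMD_BUF_SIZE] = data[stored++];
        cmd_head++;
    }
    return stored;
}

/* Called from the control task: fetch up to max pending bytes.
 * Returns the number of bytes copied to out. */
size_t cmd_buffer_get(uint8_t *out, size_t max)
{
    size_t n = 0;
    while (n < max && cmd_tail != cmd_head) {
        out[n++] = cmd_buf[cmd_tail % CMD_BUF_SIZE];
        cmd_tail++;
    }
    return n;
}
```

Because each index is written by only one side, this pattern needs no lock on a single-core MCU, which is why it suits an interrupt-to-task hand-off.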

When recording is enabled and an event is recorded, the event data is written to a paged buffer. When a buffer page gets full, the recorder switches to the next empty page, and the trace control task writes the full buffer page to the streaming interface. In this case, this is done using an ST API call that writes to the USB CDC connection.
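To make the paged buffer idea concrete, here is a simplified, single-threaded sketch. It is not the recorder's actual implementation: paged_write, paged_transfer, demo_send and the page_t type are hypothetical names, and a real recorder must also protect the buffer against concurrent access from interrupts. The page count and size match the defaults mentioned later in this post.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define PAGE_COUNT 2u
#define PAGE_SIZE  2500u

typedef struct {
    uint8_t data[PAGE_SIZE];
    size_t  used;  /* bytes written so far */
    int     ready; /* 1 = page is full and waiting to be sent */
} page_t;

typedef int (*send_fn)(const uint8_t *data, size_t len);

static page_t pages[PAGE_COUNT];
static size_t write_page; /* index of the page currently being filled */

/* Store one event. Returns 0 on success, -1 if the event had to be dropped. */
int paged_write(const void *event, size_t len)
{
    if (len > PAGE_SIZE) return -1;
    page_t *p = &pages[write_page];
    if (!p->ready && p->used + len > PAGE_SIZE)
        p->ready = 1; /* page is full: hand it over to the sender */
    if (p->ready) {
        size_t next = (write_page + 1u) % PAGE_COUNT;
        if (pages[next].ready) return -1; /* no empty page available */
        write_page = next;
        p = &pages[next];
    }
    memcpy(&p->data[p->used], event, len);
    p->used += len;
    return 0;
}

/* Called periodically by the control task: send full pages, oldest first.
 * Returns the number of pages sent. */
size_t paged_transfer(send_fn send)
{
    size_t sent = 0;
    for (size_t i = 0; i < PAGE_COUNT; i++) {
        page_t *p = &pages[(write_page + 1u + i) % PAGE_COUNT];
        if (!p->ready) continue;
        if (send(p->data, p->used) != 0) break; /* retry on the next call */
        p->used = 0;
        p->ready = 0;
        sent++;
    }
    return sent;
}

/* Stand-in for the real transmit call; it only counts the bytes. */
static size_t demo_bytes_sent;
static int demo_send(const uint8_t *data, size_t len)
{
    (void)data;
    demo_bytes_sent += len;
    return 0;
}
```

In this sketch, events are dropped when every page is full, which is the situation you avoid by tuning the page size and the transmit period.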

To learn more about the trace recorder for FreeRTOS, see the User Manual.

How to set up USB streaming

Using STM32CubeMX, it was straightforward to set up the required USB CDC code. I configured it to include support for interrupt-driven operation, i.e., sending and receiving data.

Then you need to add a suitable “stream port” to the recorder, which maps the recorder onto the right API functions. This is just a header file with a few definitions, and the version for STM32 is already included with Tracealyzer for FreeRTOS. Just make sure your compiler finds it, and that your build includes the source code in the stream port directory (the command buffer and interrupt handlers).

If you are using another MCU, or another interface, you need to define these macros in trcStreamingPort.h:

  • TRC_STREAM_PORT_INIT() – How to initialize the interface, here defined as MX_USB_DEVICE_Init();
  • TRC_STREAM_PORT_READ_DATA() – How to read command data from host – here defined as trcCDCReceive(), a local function in the stream port that scans the command buffer (written to by the USB interrupt handler).
  • TRC_STREAM_PORT_PERIODIC_SEND_DATA() should typically be defined to call the recorder function prvPagedEventBufferTransfer(), with a function pointer as argument, identifying a function that sends the trace data from the paged buffer (in this case, the local function trcCDCTransmit defined in the stream port code).
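For another interface, the three definitions could look roughly like the fragment below. Treat it as a sketch under assumptions: the my_port_* functions are hypothetical driver functions you would write yourself, and the macro parameter lists are my assumption here and may differ between recorder versions, so compare with the USB CDC stream port shipped with your recorder before copying anything.

```c
/* trcStreamingPort.h: sketch for a hypothetical interface,
 * not the USB CDC stream port shipped with the recorder. */
#ifndef TRC_STREAMING_PORT_H
#define TRC_STREAMING_PORT_H

#include <stdint.h>

/* Hypothetical driver functions that you implement for your interface. */
void    my_port_init(void);
int32_t my_port_receive(void *data, uint32_t size, int32_t *bytes_read);
int32_t my_port_transmit(void *data, uint32_t size, int32_t *bytes_written);

/* How to initialize the interface. */
#define TRC_STREAM_PORT_INIT() my_port_init()

/* How to read command data from the host (parameter list is an assumption). */
#define TRC_STREAM_PORT_READ_DATA(ptr_data, size, ptr_bytes_read) \
    my_port_receive((ptr_data), (size), (ptr_bytes_read))

/* Let the recorder push full buffer pages through my_port_transmit
 * (parameter list is an assumption). */
#define TRC_STREAM_PORT_PERIODIC_SEND_DATA(ptr_bytes_sent) \
    prvPagedEventBufferTransfer(my_port_transmit, (ptr_bytes_sent))

#endif /* TRC_STREAMING_PORT_H */
```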

In case you want to study the details, see the /streamports/USB CDC folder in the trace recorder library (the FreeRTOS version).


My test system produced a lot of trace data, over 600 kbyte/s, as seen below in the Tracealyzer recorder window. However, RTOS traces typically generate only 20-200 kbyte/s, so the performance is more than sufficient.

The default setting is two buffer pages of 2500 bytes each, with the transmit function called once every 10 ms. This gave a maximum throughput of 250 kbyte/s in this case, which is enough for a typical RTOS trace.

The performance is affected by several factors in the recorder configuration: the buffer size (i.e., the number of buffer pages and the size of each buffer page), the period of the TzCtrl task (which transmits the data) and the scheduling priority of this task. To reach 600 kbyte/s, I used a buffer size of 4000 bytes and transmitted every 2 ms. Finding an optimal setup is a balance between throughput and overhead.
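The relation between page size, transmit period and throughput is simple arithmetic. The sketch below assumes that at most one buffer page is sent per period, which matches the default figures above (2500 bytes every 10 ms gives 250 kbyte/s); the function name is hypothetical.

```c
#include <stdint.h>

/* Upper bound on throughput in bytes per second, assuming at most one page
 * of page_bytes is transmitted every period_ms milliseconds. */
uint32_t max_throughput_bytes_per_s(uint32_t page_bytes, uint32_t period_ms)
{
    if (period_ms == 0u) {
        return 0u; /* avoid division by zero */
    }
    return (uint32_t)(((uint64_t)page_bytes * 1000u) / period_ms);
}
```

If the 4000 bytes refers to the page size, the same formula gives an upper bound of 2000 kbyte/s for a 2 ms period. That is well above the 600 kbyte/s my test system produced, so in that setup the limit was the amount of trace data generated rather than the buffer configuration.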

Do you want to try this setup? Get in touch with me, and I will help you get started.