Previous Tutorial | Tutorial 18 | Next Tutorial
STM32 Delay Microsecond & Millisecond Utility | DWT Delay & Timer Delay
STM32 Course Home Page 🏠 |
In this tutorial, I’ll show you a couple of methods to implement STM32 delay functions in both microseconds and milliseconds. We’ve seen the HAL_Delay() utility in the built-in HAL libraries by STMicroelectronics, but it can only give you delays in whole milliseconds. Getting microsecond-resolution delays as well is the goal of this tutorial. We’ll be using the DWT and the STM32 hardware timers for our implementations.
Required Components For LABs
All the example code, LABs, and projects in the course are done using the boards below.
- Nucleo32-L432KC (ARM Cortex-M4 @ 80MHz) or (eBay)
- Blue Pill STM32-F103 (ARM Cortex-M3 @ 72MHz) or (eBay)
- ST-Link v2 Debugger or (eBay)
QTY | Component Name | 🛒 Amazon.com | 🛒 eBay.com |
2 | BreadBoard | Amazon | eBay |
1 | LEDs Kit | Amazon Amazon | eBay |
1 | Resistors Kit | Amazon Amazon | eBay |
1 | Capacitors Kit | Amazon Amazon | eBay & eBay |
2 | Jumper Wires Pack | Amazon Amazon | eBay & eBay |
1 | 9v Battery or DC Power Supply | Amazon Amazon Amazon | eBay |
1 | Micro USB Cable | Amazon | eBay |
1 | Push Buttons | Amazon Amazon | eBay |
1 | USB-TTL Converter or FTDI Chip | Amazon Amazon | eBay eBay |
★ Check The Full Course Complete Kit List
Some Extremely Useful Test Equipment For Troubleshooting:
- My Digital Storage Oscilloscope (DSO): Siglent SDS1104 (on Amazon.com) (on eBay)
- FeelTech DDS Function Generator: KKMoon FY6900 (on Amazon.com) (on eBay)
- Logic Analyzer (on Amazon.com) (on eBay)
Affiliate Disclosure: When you click on links in this section and make a purchase, this can result in this site earning a commission. Affiliate programs and affiliations include, but are not limited to, the eBay Partner Network (EPN) and Amazon.com.
STM32 Delay Functions
In earlier tutorials, we’ve been using the HAL_Delay utility function to get millisecond time delays. As we’ve discussed, it’s built on the SysTick timer, which ticks at a rate of 1000Hz, so it can only give you delays in multiples of 1ms. Although scattering time delays throughout the code is bad practice, a delay can be mandatory in many cases and can be justified.
In OS-based applications, it’s forbidden to use time delays as it will mess up the timing behavior of the system. But a few microseconds of delay in a function that sends a trigger pulse to an ultrasonic sensor, LCD, or whatever can be justified and adds no risk to the system timing.
In the upcoming tutorials for interfacing various modules and actuators, we’ll pay attention to the blocking nature of code built with delays and maybe duplicate the work in order to have both versions for a driver that works in blocking and non-blocking mode. Blocking functions will use the delay utility which we’ll develop today. And the non-blocking functions will be handled by the SysTick timer as we’ll learn in the future.
For this tutorial, let’s only focus on how to generate time delays in us and ms with high accuracy. We’ll discuss a couple of ways to achieve this. The first is the DWT, and the second is the hardware timer modules.
STM32 DWT Delay
As we’ve discussed in an earlier tutorial, STM32 Debugging, the ARM Cortex®-M3/M4 core contains hardware extensions for advanced debugging features. The debug extensions allow the core to be stopped either on a given instruction fetch (breakpoint) or data access (watchpoint). The Arm® Cortex®-M3/M4 core provides integrated on-chip debug support. It is comprised of:
- SWJ-DP: Serial wire / JTAG debug port
- AHB-AP: AHB access port
- ITM: Instrumentation trace macrocell
- FPB: Flash patch breakpoint
- DWT: Data watchpoint trigger
- TPIU: Trace port interface unit (available on larger packages, where the corresponding pins are mapped)
- ETM: Embedded Trace Macrocell (available on larger packages, where the corresponding pins are mapped)
We are interested in the DWT, which also provides some means of profiling. For this, counters are accessible that give the number of:
- Clock cycle
- Folded instructions
- Load store unit (LSU) operations
- Sleep cycles
- CPI (cycles per instruction)
- Interrupt overhead
By tracking the clock cycles, we can generate an accurate time delay, provided that the operating frequency of the CPU is known or can be determined. To make the code more generic, we’ll read the core clock frequency from the RCC at run time instead of hard-coding it.
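As a rough sketch of the arithmetic involved, a delay in microseconds maps to a cycle count as shown below. The helper name `us_to_cycles` is hypothetical and not part of the tutorial’s util code, and the sketch assumes the core clock is a whole number of MHz.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical helper, for illustration only: converts a delay in
 * microseconds into CPU clock cycles for a given core clock. */
uint32_t us_to_cycles(uint32_t hclk_hz, uint32_t us)
{
    uint32_t cycles_per_us = hclk_hz / 1000000u;  /* 72 at 72 MHz, 80 at 80 MHz */

    assert(cycles_per_us > 0u);                   /* core clock must be at least 1 MHz */
    assert(us <= UINT32_MAX / cycles_per_us);     /* product must fit in 32 bits */
    return us * cycles_per_us;
}
```

For example, at 72MHz a 10us delay corresponds to 720 cycles of the cycle counter.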
DWT_Delay_Init() Function
```c
uint32_t DWT_Delay_Init(void)
{
    /* Disable TRC */
    CoreDebug->DEMCR &= ~CoreDebug_DEMCR_TRCENA_Msk; // ~0x01000000;
    /* Enable TRC */
    CoreDebug->DEMCR |=  CoreDebug_DEMCR_TRCENA_Msk; // 0x01000000;

    /* Disable clock cycle counter */
    DWT->CTRL &= ~DWT_CTRL_CYCCNTENA_Msk; // ~0x00000001;
    /* Enable clock cycle counter */
    DWT->CTRL |=  DWT_CTRL_CYCCNTENA_Msk; // 0x00000001;

    /* Reset the clock cycle counter value */
    DWT->CYCCNT = 0;

    /* 3 NO OPERATION instructions */
    __ASM volatile ("NOP");
    __ASM volatile ("NOP");
    __ASM volatile ("NOP");

    /* Check if clock cycle counter has started */
    if (DWT->CYCCNT)
    {
        return 0; /* clock cycle counter started */
    }
    else
    {
        return 1; /* clock cycle counter not started */
    }
}
```
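DWT_Delay_us() Function
```c
// This Function Provides Delay In Microseconds Using DWT
__STATIC_INLINE void DWT_Delay_us(volatile uint32_t au32_microseconds)
{
    uint32_t au32_initial_ticks = DWT->CYCCNT;
    uint32_t au32_ticks = (HAL_RCC_GetHCLKFreq() / 1000000); // CPU cycles per microsecond
    if (au32_microseconds == 0)
    {
        return; // avoids unsigned underflow in the comparison below
    }
    au32_microseconds *= au32_ticks; // requested delay in CPU cycles
    // Subtracting au32_ticks shortens the wait by 1us worth of cycles,
    // apparently to offset the function's own overhead
    while ((DWT->CYCCNT - au32_initial_ticks) < au32_microseconds - au32_ticks);
}
```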
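The wait loop above relies on unsigned arithmetic: subtracting the start value from the current counter value gives the elapsed cycles even if CYCCNT wrapped from 0xFFFFFFFF back to 0 in between. Here is a small host-side sketch of that property. The function name `cycles_elapsed` is made up for illustration and does not appear in the util code.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical helper: returns true once `wait_cycles` have passed since
 * `start`. The subtraction is modulo 2^32, so the result stays correct
 * across a single wrap of the 32-bit counter. */
bool cycles_elapsed(uint32_t start, uint32_t now, uint32_t wait_cycles)
{
    return (uint32_t)(now - start) >= wait_cycles;
}
```

This only holds as long as the requested wait is shorter than one full counter period of 2^32 cycles.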
DWT_Delay_ms() Function
```c
// This Function Provides Delay In Milliseconds Using DWT
__STATIC_INLINE void DWT_Delay_ms(volatile uint32_t au32_milliseconds)
{
    uint32_t au32_initial_ticks = DWT->CYCCNT;
    uint32_t au32_ticks = (HAL_RCC_GetHCLKFreq() / 1000); // CPU cycles per millisecond
    // Note: this product overflows 32 bits for very long delays
    au32_milliseconds *= au32_ticks; // requested delay in CPU cycles
    while ((DWT->CYCCNT - au32_initial_ticks) < au32_milliseconds);
}
```
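Because both CYCCNT and the cycle count computed above are 32-bit values, there is an upper bound on the delay you can request in one call. The following is a minimal sketch of that limit. The helper `dwt_max_delay_ms` is an assumption for illustration, not part of the downloadable utility.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical helper: longest delay in milliseconds whose cycle count
 * still fits in a 32-bit value for the given core clock. */
uint32_t dwt_max_delay_ms(uint32_t hclk_hz)
{
    uint32_t cycles_per_ms = hclk_hz / 1000u;

    assert(cycles_per_ms > 0u);          /* core clock must be at least 1 kHz */
    return UINT32_MAX / cycles_per_ms;   /* integer division rounds down */
}
```

At 72MHz this gives 59652 ms, so a single call should stay below roughly one minute. Longer waits can be split into several shorter calls.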
STM32 Hardware Timer Delay
An alternative to the DWT for achieving time delays in milliseconds, microseconds, and even nanoseconds (which requires SysCLK >= 100MHz) is one of the hardware timers in the microcontroller. I prefer to use a general-purpose or basic timer so you don’t lose valuable hardware resources on this task. For this tutorial, I’ll be using TIMER4, which is a general-purpose timer connected to the APB1 bus. Here is the microcontroller’s architecture diagram.
While configuring the clock tree, make sure to clock the required timer at the system clock frequency (Fsys), as the code assumes that the timer clock is the same as the CPU clock. You can change the configuration, the clock, the bus, and even the timer itself by adjusting a couple of lines in the code.
The timer prescaler is set up depending on the Fsys clock so that each timer tick accounts for 1 microsecond. This information is stored in a static global variable in the code file, and you don’t have to figure anything out manually. Just initialize the timer delay function and you’re good to go. If another timer is to be used for this task, you’ll only have to change the line below.
```c
#define TIMER TIM4
// You can change it to TIM2, TIM3, or whatever.
```
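The prescaler value described above follows from dividing the timer clock down to 1MHz. This is a hedged sketch of that calculation, assuming the timer clock is a whole number of MHz. The function name `prescaler_for_1us_tick` is hypothetical.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical helper: prescaler register value that makes one timer
 * tick last 1 microsecond. The hardware divides by (PSC + 1). */
uint16_t prescaler_for_1us_tick(uint32_t timer_clk_hz)
{
    uint32_t ticks_per_us = timer_clk_hz / 1000000u;

    assert(ticks_per_us >= 1u);        /* timer clock must be at least 1 MHz */
    assert(ticks_per_us <= 65536u);    /* PSC is a 16-bit register */
    return (uint16_t)(ticks_per_us - 1u);
}
```

With a 72MHz timer clock the prescaler comes out as 71, and with 80MHz it is 79.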
TimerDelay_Init() Function
```c
/* HTIMx (a TIM_HandleTypeDef) and gu32_ticks (a uint32_t) are
 * file-scope variables in the same source file. */
void TimerDelay_Init(void)
{
    /* Timer ticks per microsecond, assuming timer clock == CPU clock */
    gu32_ticks = (HAL_RCC_GetHCLKFreq() / 1000000);
    TIM_ClockConfigTypeDef sClockSourceConfig = {0};
    TIM_MasterConfigTypeDef sMasterConfig = {0};

    HTIMx.Instance = TIMER;
    HTIMx.Init.Prescaler = gu32_ticks-1;   /* one timer tick = 1 us */
    HTIMx.Init.CounterMode = TIM_COUNTERMODE_UP;
    HTIMx.Init.Period = 65535;
    HTIMx.Init.ClockDivision = TIM_CLOCKDIVISION_DIV1;
    HTIMx.Init.AutoReloadPreload = TIM_AUTORELOAD_PRELOAD_ENABLE;
    if (HAL_TIM_Base_Init(&HTIMx) != HAL_OK)
    {
        Error_Handler();
    }
    sClockSourceConfig.ClockSource = TIM_CLOCKSOURCE_INTERNAL;
    if (HAL_TIM_ConfigClockSource(&HTIMx, &sClockSourceConfig) != HAL_OK)
    {
        Error_Handler();
    }
    sMasterConfig.MasterOutputTrigger = TIM_TRGO_RESET;
    sMasterConfig.MasterSlaveMode = TIM_MASTERSLAVEMODE_DISABLE;
    if (HAL_TIMEx_MasterConfigSynchronization(&HTIMx, &sMasterConfig) != HAL_OK)
    {
        Error_Handler();
    }

    HAL_TIM_Base_Start(&HTIMx);
}
```
delay_us() Function
```c
void delay_us(uint16_t au16_us)
{
    HTIMx.Instance->CNT = 0;                 /* restart the 1 us tick counter */
    while (HTIMx.Instance->CNT < au16_us);   /* busy-wait until enough ticks have passed */
}
```
delay_ms() Function
```c
void delay_ms(uint16_t au16_ms)
{
    while (au16_ms > 0)
    {
        HTIMx.Instance->CNT = 0;
        au16_ms--;
        while (HTIMx.Instance->CNT < 1000);  /* 1000 ticks of 1 us = 1 ms */
    }
}
```
STM32 Delay Microsecond & Millisecond Utility
All the delay functions for both methods, DWT and Timer, are found in the util folder that you’ll download from the link below. It includes both the source code and header file for each of DWT_Delay and Timer_Delay.
In the LAB down below, I’ll show you how to add this to your project directory, include it in your application, and test it using an output GPIO pin and measure the timing with a digital storage oscilloscope (DSO).
STM32 Delay us & ms Example LAB
LAB Number | 15 |
LAB Title | STM32 Delay us and ms with DWT or Timer module |
- Set up a new project as usual with a GPIO output pin for testing
- Add the util to our project directory
- Include the delay utilities (timer or DWT) and test them
STM32 Delay Example LAB15
And now, let’s build this system step-by-step
Step1: Open CubeMX & Create New Project
Step2: Choose The Target MCU & Double-Click Its Name
Step3: Configure A0 pin to be a GPIO output pin
Step4: Select the timer you want to use for the delay. Let it be TIMER4; just enable its clock
No other configuration for the timer is needed. It’s all done in the code. However, don’t forget to remove the timer configuration function that CubeMX generates (typically named MX_TIM4_Init() for TIM4) from the main.c code, as you won’t use it.
Step5: Set The RCC External Clock Source
Step6: Go To The Clock Configuration
Step7: Set The System Clock To Be 72MHz
Step8: Name & Generate The Project Initialization Code For CubeIDE or The IDE You’re Using
Step9: Add The util directory & add it to the path as a source code directory
Copy the util folder that you’ve downloaded earlier
Open the project in CubeIDE and right-click the project name and click paste
The folder will be added to your project. However, it’s not yet considered a source code directory, so it won’t be compiled, and you’ll get a linker error if you call any of its functions.
Right-click the util folder and click properties. Then navigate to C/C++ paths and symbols and open the source locations tab.
In the source locations tab, click add folder and add our util folder.
That’s it, now it’s properly added to our project and can be included in the application code (in main.c).
Notice how the little icon on the util folder has changed? This indicates that the IDE now knows this is a source code directory. It wasn’t like that when we first added it.
Step10: The last step is to remove the timer config function from the main.c file and make sure the timer instance is named correctly in the TimerDelay.c file
```c
#define TIMER TIM4
```
Here is The Application Code For This LAB
Note that the generated timer configuration function is removed and never called. The timer initialization is now done in the TimerDelay_Init() routine.
```c
#include "main.h"
#include "../util/Timer_Delay.h"
#include "../util/DWT_Delay.h"

void SystemClock_Config(void);
static void MX_GPIO_Init(void);

int main(void)
{
    HAL_Init();
    SystemClock_Config();
    MX_GPIO_Init();
    /* Initialize The TimerDelay */
    TimerDelay_Init();

    while (1)
    {
        HAL_GPIO_TogglePin(GPIOA, GPIO_PIN_0);
        delay_ms(100);
    }
}
```
Or you can delay with the DWT, as in the following application
```c
#include "main.h"
#include "../util/Timer_Delay.h"
#include "../util/DWT_Delay.h"

void SystemClock_Config(void);
static void MX_GPIO_Init(void);

int main(void)
{
    HAL_Init();
    SystemClock_Config();
    MX_GPIO_Init();
    /* Initialize The DWT_Delay */
    DWT_Delay_Init();

    while (1)
    {
        HAL_GPIO_TogglePin(GPIOA, GPIO_PIN_0);
        DWT_Delay_ms(100);
    }
}
```
Download The Timer & DWT Delay LAB15 Project
The Result For LAB Testing
Here are some screenshots from my DSO showing the timing for various signals: 1uSec, 10uSec, 100uSec, and 1mSec delays, and how accurate each one actually is. The DWT delay has shown similar results to the timer functions; they are nearly identical. However, the timer delay is much more accurate at the low end, for the very short 1uSec delay. But all in all, it’s as good as it can be.
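When checking such measurements on a scope, it helps to know what frequency to expect. Assuming the pin is toggled once per delay, as in the application code above, and ignoring the time spent in the toggle call itself, one full period is two delays long. This is only a rough sketch, and the helper name `toggle_freq_hz` is hypothetical.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical helper: ideal square-wave frequency in Hz when a pin is
 * toggled once every `delay_us` microseconds (period = 2 * delay). */
uint32_t toggle_freq_hz(uint32_t delay_us)
{
    assert(delay_us > 0u);                 /* a zero delay has no defined period */
    assert(delay_us <= UINT32_MAX / 2u);   /* keep 2 * delay_us within 32 bits */
    return 1000000u / (2u * delay_us);     /* integer division truncates */
}
```

For a 100ms delay this gives 5Hz. A measured frequency slightly below the ideal value is to be expected, since the toggle and loop overhead add to each half-period.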
Thanks, brother. Greetings from my beloved Venezuela.
Hello Mr.Magdy
Your tutorials on STM32 are, as far as I know, the most diverse and thorough tutorials I could find. However, they could be a bit more noob friendly (as someone who has started learning microcontrollers from STM32, not Arduino or PIC).
Now the question is: how do we create several pulse trains with 1us resolution in the background without interfering with the while() loop?
let’s say we need 10 pulse channels
one of them is carrier so the others are with respect to the first one and we can vary the delay and pulse width of them on the fly
but they all operate in the background without consuming much processing power
so we can write other stuff in the while() loop
If I got your question right, then I think you can operate the PWM module in synchronized mode so that you have one master channel and the others are in sync with its timing. Independent edge mode will also enable you to add a time shift to any channel and vary its duty cycle on the fly.