Abstract Timers and their Implementation onto the ARM Cortex-M family of MCUs by Lindgren, Per et al.
  
 
 
 
 
 
 
 
 
Conference Paper 
CISTER-TR-151202 
 
 
Per Lindgren 
Emil Fresk 
Marcus Lindner 
Andreas Lindner 
David Pereira 
Luís Miguel Pinho 
 
© CISTER Research Center 
www.cister.isep.ipp.pt   
CISTER Research Center 
Polytechnic Institute of Porto (ISEP-IPP) 
Rua Dr. António Bernardino de Almeida, 431 
4200-072 Porto 
Portugal 
Tel.: +351.22.8340509, Fax: +351.22.8321159 
http://www.cister.isep.ipp.pt 
 
 
Abstract Timers and their Implementation
onto the ARM Cortex-M family of MCUs
Per Lindgren, Emil Fresk, Marcus Lindner and Andreas Lindner
Luleå University of Technology
Email: {per.lindgren, emil.fresk, marcus.lindner, andreas.lindner}@ltu.se
David Pereira and Luís Miguel Pinho
CISTER / INESC TEC, ISEP
Email: {dmrpe, lmp}@isep.ipp.pt
ABSTRACT
Real-Time For the Masses (RTFM) is a set of languages and tools being
developed to facilitate embedded software development and provide highly
efficient implementations geared toward static verification. The
RTFM-kernel is an architecture designed to provide highly efficient and
predictable Stack Resource Policy based scheduling, targeting bare metal
(single-core) platforms.
We contribute beyond prior work by introducing a platform independent
timer abstraction that relies on existing RTFM-kernel primitives. We
develop two alternative implementations for the ARM Cortex-M family of
MCUs: a generic implementation, using the ARM defined SysTick/DWT
hardware; and a target specific implementation, using the match
compare/free running timers. While sacrificing generality, the latter is
more flexible and may reduce overall overhead. Invariants for correctness
are presented, and methods for static and run-time verification are
discussed. Overhead is bounded and characterized. In both cases the
critical section from release time to dispatch is less than 2 µs on a
100 MHz MCU. Queue and timer mechanisms are directly implemented in the
RTFM-core language and can be included in system-wide scheduling analysis.
1. INTRODUCTION
In mainstream embedded programming, C remains the predominant language
for software development. To facilitate development, a vast number of
light-weight operating systems are available, e.g., FreeRTOS [1], ChibiOS
[2], and RIOT [3], and, for larger platforms, Linux/POSIX based systems
and Win32 derivatives. What they have in common is a thread based
concurrency model, where the programmer has to take full responsibility
for coordinating scheduling and resource management, as very little
support is given by the programming models and supporting tools [4].
In this paper, we explore a language based approach. The reactive
programming model of RTFM-core (-core in the following) provides tasks
with timing constraints and critical sections (treated as single-unit
resources). As such, -core provides a model suitable for specifying the
timely behavior of the embedded software, as well as a formal
underpinning amenable to both static and run-time verification. The
supporting rtfm-core compiler produces C code that, compiled together
with an RTFM run-time system, renders an executable. The RTFM-kernel is
an architecture targeting bare metal (single-core) platforms, designed to
provide highly efficient and predictable Stack Resource Policy (SRP)
based scheduling by exploiting the underlying interrupt hardware.
This work was partially supported by Portuguese National Funds through
FCT (Portuguese Foundation for Science and Technology) and by ERDF
(European Regional Development Fund) through COMPETE (Operational
Programme 'Thematic Factors of Competitiveness'), within project
FCOMP-01-0124-FEDER-037281 (CISTER); and by FCT and EU ARTEMIS JU, within
project ARTEMIS/0001/2013, JU grant nr. 621429 (EMC2).
EWiLi'15, October 8th, 2015, Amsterdam, The Netherlands.
Copyright retained by the authors.
However, in prior work no kernel support was given for asynchronous
tasks with timing offsets. In this paper we address this problem with the
goal of providing a transparent, abstract, and generic way of managing
timer queue(s) and underlying hardware timer(s). Transparent w.r.t. its
use, i.e., the programmer should not need to think in terms of hardware
timers when specifying the application at hand. Abstract in terms of the
RTFM-kernel (the obligation of the kernel is merely to manage
scheduling); thus we seek a solution where the kernel itself is free of
dependencies on both timer queue implementations and timer hardware
specifics. Furthermore, the solution should be generic enough to cover a
broad range of embedded platforms with little or no porting effort.
Additional requirements for robustness, performance and predictability
are efficient, bounded-time implementations that comply with the task and
resource model of SRP, along with invariants for correctness.
In this paper we contribute beyond prior work by intro-
ducing a platform independent timer abstraction that relies
on the existing kernel primitives. The proposed abstrac-
tion allows application and target specific implementations
of timer queues and timer handlers. The timer handlers are
treated as ordinary tasks in the system, while each queue is
managed under protection of a critical section (resource) in
the system.
Requirements to support abstract timers with respect to analysis and
code generation in the rtfm-core compiler are discussed along with their
performance implications. As a proof of concept, we develop and
characterize two alternative timer implementations for the ARM Cortex-M
family of MCUs: a generic (single queue/handler) implementation using the
ARM defined SysTick/DWT hardware, and a multi-queue/handler
implementation exploiting vendor specific match-compare/free running
timer hardware.
Our experimental results indicate that for both generic and vendor
specific timers the critical section from task release time to dispatch
is less than 2 µs on a 100 MHz MCU. We show that the vendor specific
timers can be exploited to reduce latency, total overhead and priority
inversion in the system. Furthermore, we discuss the premises for SRP
based analysis of programs scheduled by virtual timers under the
RTFM-kernel.
Finally we present ongoing and future undertakings and
sum up the presented contributions to conclude the work.
2. THE RTFM-CORE LANGUAGE
The RTFM-core language is based on the notions of tasks and resources in
correspondence to the Stack Resource Policy (SRP) model defined in [5].
For a detailed description of the original work on -core we refer the
reader to [6]. Here we give a brief overview.
2.1 RTFM-core programming model
In -core, tasks execute concurrently and run to completion. A task may
request asynchronous execution of other tasks and claim (named)
single-unit resource(s) for the duration of critical section(s) in a
nested manner. Functionality is expressed using ordinary C code. In
recent work [7] the -core language has been extended to provide messages
(task execution requests with timing offsets):
async after X before Y t(...)
where X defines the offset from the release time (baseline) of the
sender, Y gives the relative deadline, and t is the identifier of the
task to execute.
2.2 RTFM-kernel design
In short, each task is implemented directly as an interrupt handler bound
to the interrupt vector. Requesting a task for execution amounts to
pending the corresponding interrupt, while claiming a resource for a
critical section amounts to manipulating the interrupt hardware so as to
reflect the semantics of the system ceiling under SRP. The RTFM-kernel
encapsulates the operations required for SRP based scheduling in a
minimalistic API implemented as C macros. Those of interest for the
presentation are: RTFM_pend(i), which requests execution of the
corresponding task i; RTFM_lock(c), which reads and stores the old
ceiling value on the stack and sets the new ceiling; and finally,
RTFM_unlock(c), which restores the old ceiling value from the stack.
Currently the scheduling primitives have been implemented for the ARM
Cortex-M range of MCUs [8]. The system ceiling is enforced either through
interrupt masking (M0/M0+), or through (atomic) accesses to the BASEPRI
register (M3 and above).
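The effect of the locking primitives on dispatching can be illustrated with a small host-side model. This is a sketch under stated assumptions, not the kernel macros themselves: the names `sim_lock`, `sim_unlock`, `sim_can_dispatch` and the variable `system_ceiling` are hypothetical, a larger number means a higher priority (a convention of the model only), and the real kernel manipulates interrupt masking or BASEPRI rather than a plain variable.

```c
#include <stdint.h>

/* Hypothetical host-side model of the SRP system ceiling.
 * Convention in this model only: larger value = higher priority. */
static uint8_t system_ceiling = 0;   /* 0 = no resource held */

/* Enter a critical section: raise the system ceiling if needed and
 * return the old ceiling, which the caller keeps on its stack. */
static uint8_t sim_lock(uint8_t resource_ceiling) {
    uint8_t old = system_ceiling;
    if (resource_ceiling > system_ceiling) {
        system_ceiling = resource_ceiling;
    }
    return old;
}

/* Leave a critical section: restore the ceiling saved by sim_lock. */
static void sim_unlock(uint8_t old_ceiling) {
    system_ceiling = old_ceiling;
}

/* A pended task may only be dispatched if its priority is strictly
 * higher than the current system ceiling. */
static int sim_can_dispatch(uint8_t priority) {
    return priority > system_ceiling;
}
```

Because the old ceiling is returned to the caller and restored on unlock, nested critical sections unwind in stack order without any kernel-side bookkeeping.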
2.3 RTFM-core compiler
The rtfm-core compiler analyses the declarative (static) task, resource
and communication structure, and generates C code output referring to the
RTFM-kernel primitives. Code generation and kernel primitives can be
tailored to C compiler specifics (currently supporting GCC and CompCert).
3. TIMER ABSTRACTION
3.1 Definitions
We introduce the following definitions:
Definition. We denote a task as postponed if it stems from an
asynchronous message
async after X before Y
with a defined baseline offset ($X > 0$). We denote the set of postponed
tasks as $OT$.
Definition. We have a set of virtual timers $\{VT_1, \ldots, VT_n\}$.
Each virtual timer $VT_i$ is associated with a set of postponed tasks
$ot(VT_i) \subseteq OT$ and a timer queue $tq(VT_i)$ (sorted by release
time).
Definition. We introduce a mapping $M$ from virtual timers ($VT$s) to
physical timers ($PT$s) allocated on the target hardware.
A physical timer is shared if $M(VT_i) = M(VT_j)$ for some $i \neq j$.
We have two edge cases: when $M$ is a 1-1 (complete) mapping between
virtual and physical timers, and when we have a single (shared) physical
timer.
Definition. For a physical timer $PT_i$, we denote by $bw(PT_i)$ the bit
width, by $f(PT_i)$ the frequency of operation (in Hz), by
$ra(PT_i) = 2^{bw(PT_i)} / f(PT_i)$ the range of the timer (in seconds),
and by $pr(PT_i) = 1 / f(PT_i)$ the precision of the timer (in seconds).
E.g., a 32-bit timer operating at 1 MHz gives a range of
$2^{32} / (1 \cdot 10^{6}\,\mathrm{Hz}) \approx 4295\,\mathrm{s}$, with a
precision of $1 \cdot 10^{-6}\,\mathrm{s} = 1\,\mu\mathrm{s}$.
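The range and precision definitions translate directly into code. The following is a minimal sketch; the `PhysTimer` struct and the helper names are hypothetical and merely mirror the symbols bw, f, ra and pr defined above.

```c
/* Hypothetical descriptor of a physical timer PT. */
typedef struct {
    unsigned bw;   /* bit width, assumed to be below 64 */
    double   f;    /* frequency of operation in Hz */
} PhysTimer;

/* pr(PT) = 1 / f, in seconds. */
static double pt_precision(PhysTimer pt) {
    return 1.0 / pt.f;
}

/* ra(PT) = 2^bw / f, in seconds. */
static double pt_range(PhysTimer pt) {
    return (double)(1ULL << pt.bw) / pt.f;
}
```

For the 32-bit, 1 MHz example above, `pt_range` yields about 4295 s and `pt_precision` yields 1 µs. For a 24-bit timer clocked at 168 MHz the same formula gives a range of roughly 0.1 s, which is why range overflows matter for the SysTick based implementation.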
3.2 ARM Cortex-M defined timers
The Cortex-M range of MCUs share the ARM defined core
providing a 24-bit SysTick timer and a 32-bit debug timer
(defined in the DWT unit).
3.2.1 SysTick timer
The SysTick timer is provided in order to generate periodic interrupts.
When enabled, it counts downwards, and when transitioning from 1 to 0 it
sets a flag and (optionally) generates a SysTick interrupt. On zero, it
assumes the value of the RELOAD register, hence periodic behavior can be
achieved with a minimum of programming effort. The current counter value
(CURRENT) can be read, while a write to CURRENT clears the counter and
thereby forces a reload (CURRENT = RELOAD). The frequency of operation is
determined by setting the clock source (core/external). (Some
implementations provide the option to prescale the core clock, e.g., /8.)
The priority of the SysTick interrupt is programmable, and an interrupt
can be pended by setting the PENDSTSET bit in the ICSR (Interrupt Control
and State Register). The SysTick timer is stopped when the processor is
halted during debug.
3.2.2 Debug timer
The debug unit (DWT) provides a 32-bit, free running cycle count
register (DWT_CYCCNT). The DWT is instrumental in providing debugging
support, and hence not free for arbitrary use. However, we can safely
enable and read the current DWT_CYCCNT value and use it as a 32-bit
glitch-free time base. When the CPU is halted (e.g., during debugging)
the counter is stopped.
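Since a 32-bit cycle counter wraps around (after roughly 25.6 s at 168 MHz), points in time taken from it are best compared through their difference rather than directly. The sketch below illustrates the idea; `rt_time_t` and `time_reached` are hypothetical names, and the comparison is only valid under the assumption that the two points in time are less than 2^31 cycles apart.

```c
#include <stdint.h>

typedef uint32_t rt_time_t;   /* hypothetical alias for a 32-bit cycle count */

/* Returns non-zero if `now` is at or past `release`.
 * The unsigned subtraction wraps modulo 2^32; reinterpreting the result
 * as a signed value gives the distance, provided the two values are less
 * than 2^31 cycles apart (two's complement conversion assumed). */
static int time_reached(rt_time_t now, rt_time_t release) {
    return (int32_t)(now - release) >= 0;
}
```

With this formulation a release time just before the wrap-around is correctly seen as expired by a reading taken just after it, e.g. `time_reached(5, 0xFFFFFFF0u)` holds.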
3.3 Generic timer implementation
A flow chart is given in Figure 1. Whenever a new message enters first
in the queue (F_yes) the timer handler (task) is invoked. In the timer
handler, if the release time has already expired (E_yes), the queued task
is pended for execution; otherwise (E_no) the timer is programmed to
release the task at its expiry time. In case a task is pended, the queue
is iteratively dequeued until either the queue is empty (Q_no), or the
release time of the head has not expired (E_no). In the latter case, the
timer is set up to generate an interrupt for the next task to be
released.
The timer handler is sketched in Listing 1, while the SysTick specific
[set timer] implementation is outlined in Listing 2, along with a flow
chart for its operation in Figure 2.
[Figure 1: flow charts. Node labels: async, first? (F_yes/F_no), enqueue,
enable timer, pend timer, expired? (E_yes/E_no), pend task, dequeue?
(Q_yes/Q_no), disable, set timer, exit; transitions T1, T2 and T3 between
the states Idle and Wait; regions marked LOCKED(R(tq)).]
Figure 1: Timer queue (left) and timer handler (right) flow charts.
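ISR SysTick_Handler {
    lq = RTFM_LOCK(Q);
    // Wait invariant: the queue is non-empty, so tq_h is valid here
    while ((T_CURR() - tq_h->bl) >= 0) {
        // E_yes: release the expired task and remove it from the queue
        RTFM_pend(tq_h->id);
        tq_deq();
        if (tq_h == NULL) {
            // Q_no: the queue is now empty, back to Idle
            T_DISABLE();
            RTFM_UNLOCK(lq);
            return;
        }
        // Q_yes: briefly leave the critical section to bound blocking
        RTFM_UNLOCK(lq);
        lq = RTFM_LOCK(Q);
    }
    // E_no: program the timer for the next release
    T_SET(tq_h->bl);
    RTFM_UNLOCK(lq);
}
Listing 1: SysTickHandler.core.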
T_CURR() is a macro that reads DWT_CYCCNT (the debug cycle counter),
while T_ENABLE()/T_DISABLE() are macros that enable/disable the SysTick
interrupt. SYSTIC_MASK (Listing 2) is defined as the maximum reload value
for the 24-bit counter. For brevity, initialization code is omitted. It
is worth mentioning, however, that we read DWT_CYCCNT to obtain a defined
point in time (baseline) for the birth of the system. As a proof of
concept we have implemented a simple insertion sort queue (Listings 3
and 4).
3.3.1 Invariants for correctness
The invariants concern the logic of the interaction between the queue
and the timer handler. Figure 3 depicts the overall timer operation. The
following invariants should hold:
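#define SYSTIC_MASK ((1<<24)-1)    // max reload value (24 bits)
void T_SET(RT_Time t) {
    RT_Time diff = t - T_CURR();   // cycles until release
    if (diff > SYSTIC_MASK) {
        // M_yes: beyond the timer range, count a full period first
        SYSTICK_RELOAD = SYSTIC_MASK;
    } else {
        if (diff <= 0) {
            // E_yes: already expired, pend the handler and exit (Figure 2)
            PEND_SYSTIC();
            return;
        }
        // E_no: program the remaining distance
        SYSTICK_RELOAD = (diff & SYSTIC_MASK)-1;
    }
    SYSTICK_CURRENT = 0; // write to force reload
}
Listing 2: SysTickSet.core.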
[Figure 2: flow chart of [set timer] for SysTick. Node labels: set
systick, D > max? (M_yes: set max; M_no), D ≤ 0? (E_yes: pend st; E_no:
set D), exit.]
Figure 2: SysTick implementation.
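The decision logic of Figure 2 can be separated from the register accesses and exercised on a host. The following is a sketch only: `SetAction`, `SetPlan`, `plan_systick` and `SYSTICK_MAX` are hypothetical names, and the distance `diff` (in timer cycles) is assumed to fit a signed 32-bit value.

```c
#include <stdint.h>

#define SYSTICK_MAX ((1 << 24) - 1)   /* largest 24-bit reload value */

/* Hypothetical names for the three outcomes in Figure 2. */
typedef enum { SET_MAX, PEND_NOW, SET_DIFF } SetAction;

typedef struct {
    SetAction action;
    uint32_t  reload;   /* only meaningful unless action == PEND_NOW */
} SetPlan;

/* Decide what [set timer] should do for a release `diff` cycles ahead. */
static SetPlan plan_systick(int32_t diff) {
    SetPlan p = { PEND_NOW, 0 };       /* default: M_no, E_yes (expired) */
    if (diff > SYSTICK_MAX) {          /* M_yes: out of range */
        p.action = SET_MAX;
        p.reload = SYSTICK_MAX;
    } else if (diff > 0) {             /* M_no, E_no: within range */
        p.action = SET_DIFF;
        p.reload = (uint32_t)diff - 1;
    }
    return p;
}
```

Keeping the decision in a pure function of `diff` makes the three paths of the flow chart easy to test without target hardware; the hardware specific part then reduces to writing the reload value or pending the interrupt.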
typedef struct TQ {
    RT_Time bl;             // release time (baseline)
    RT_Tid id;              // task identifier
    struct TQ* next;
} TQ;

TQ tq[TQ_LEN];              // queue
volatile TQ* tq_h = NULL;   // head pointer
TQ* tq_f = tq;              // free pointer
TQ* tq_n;                   // new
TQ* tq_c;                   // current
Listing 3: tq.h header.
• Idle state:
– the timer queue is empty, and
– the timer interrupt is disabled.
• Wait state:
– the timer queue is non-empty, and
– the timer interrupt is enabled.
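The two invariants amount to a single condition: the queue is empty exactly when the timer interrupt is disabled. A host-side model of the Idle/Wait state machine makes this checkable. The sketch below is illustrative only; `TimerModel` and the `model_*` functions are hypothetical names and do not appear in the implementation.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical abstract model of the timer state. */
typedef struct {
    unsigned queued;       /* number of messages in the timer queue */
    bool     irq_enabled;  /* is the timer interrupt enabled? */
} TimerModel;

/* Idle: empty and disabled. Wait: non-empty and enabled. */
static bool invariant_holds(const TimerModel *m) {
    return (m->queued == 0) == !m->irq_enabled;
}

/* A message is enqueued: T1 from Idle, (implicit) T2 from Wait. */
static void model_enqueue(TimerModel *m) {
    m->queued++;
    m->irq_enabled = true;
    assert(invariant_holds(m));
}

/* The handler releases one expired task: T2 if messages remain,
 * otherwise T3 back to Idle. */
static void model_release(TimerModel *m) {
    assert(m->queued > 0);   /* Wait invariant on entry */
    m->queued--;
    if (m->queued == 0) {
        m->irq_enabled = false;
    }
    assert(invariant_holds(m));
}
```

Starting from the Idle state `{0, false}`, any sequence of `model_enqueue` and matching `model_release` calls keeps the assertion satisfied, which mirrors the informal argument given for the implementation.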
The invariants are upheld by the implementation, as argued by the
following (informal) reasoning.
Queue correctness.
Idle: Assume the timer is in Idle state (the queue is empty), and the
application emits an
async after X ... t (...).
Since the timer queue is empty we follow the F_yes branch, i.e., we
enqueue (X, t), enable the timer interrupt with T_ENABLE(), and pend the
timer interrupt with T_PEND(), which causes the transition T1 to be
taken. At this point, the queue is non-empty and the timer interrupt is
enabled.
Wait: Assume the timer is in Wait state (the queue is non-empty), and the
application emits an
async after X ... t (...).
In this case we take an (implicit) T2 transition and remain in Wait
state.
Timer handler correctness.
Idle: Assuming the Idle invariant, the timer interrupt is disabled.
(Thus, the timer-interrupt handler is not invoked even if a compare match
occurs and the interrupt is raised.)
void tq_enq(RT_Time t, RT_Tid id) {
    RT_lock_t lq;
    RT_LOCK(lq,R(tq));
    if (tq_f == NULL) TQ_panic();   // no free entry left
    // allocate and fill new entry
    tq_n = tq_f;
    tq_n->bl = t; tq_n->id = id;
    tq_f = tq_f->next;
    if (tq_h == NULL || tq_h->bl - t > 0) {
        // F_yes, put first in list
        tq_n->next = tq_h; tq_h = tq_n;
        RT_UNLOCK(lq);
        T_ENABLE(); T_PEND(); return;
    }
    // F_no, put in middle or last
    tq_c = tq_h;
    // difference based comparison, as for the head above
    while (tq_c->next != NULL && tq_c->next->bl - t < 0) {
        tq_c = tq_c->next;
    }
    tq_n->next = tq_c->next; tq_c->next = tq_n;
    RT_UNLOCK(lq);
}

// Called by the timer handler while holding R(tq).
TQ* tq_deq() {
    tq_c = tq_h;
    if (tq_h != NULL) {
        tq_h = tq_h->next;   // advance the head
        tq_c->next = tq_f;   // return the entry to the free list
        tq_f = tq_c;
    }
    return tq_c;             // removed entry, NULL if the queue was empty
}
Listing 4: tq.c implementation.
Wait: The timer-interrupt handler is invoked when an interrupt occurs and
the interrupt is enabled. The interrupt has been raised either due to a
T1 transition or by the timer hardware on a compare match. Assuming the
Wait invariant there is (at least) one element in the queue, thus we can
safely access tq_h for the <expired?> check. From there the following
cases apply:
E_no: We program the SysTick timer [set timer] and return from the
timer-interrupt handler [exit]. This corresponds to a transition T2 where
we remain in Wait state (waiting for a compare match). This occurs when
the timer is programmed for the first time or on an overflow (when the
range of the timer is insufficient to reach the release time of the
queued task). Notice that the latter may occur repeatedly until the
<expired?> condition is met.
E_yes: We release the expired task [pend task] and check if more messages
are queued <dequeue?>. From there the following cases apply:
Q_no: No further messages are queued and we disable the interrupt
[disable]. This corresponds to the transition T3 back to Idle state. At
this point, the queue is empty and the interrupt is disabled.
Q_yes: There is still at least one message in the queue, and we check the
<expired?> condition for the next queued task (a T2 transition).
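The case analysis above can be replayed on a host with a simplified model of the queue and handler. The sketch below is not the implementation of Listings 1 and 4: it uses a small sorted array instead of the linked list, records pended tasks in an array instead of raising interrupts, and all names (`Msg`, `sim_enqueue`, `sim_handler`, `released`, `timer_target`) are hypothetical.

```c
#include <stdint.h>
#include <stddef.h>

#define QLEN 8

typedef struct { uint32_t bl; int id; } Msg;   /* release time, task id */

static Msg      queue[QLEN];       /* sorted by release time, head at [0] */
static size_t   q_len = 0;
static int      released[QLEN];    /* ids pended by the handler, in order */
static size_t   n_released = 0;
static int      irq_enabled = 0;
static uint32_t timer_target = 0;  /* last value given to "set timer" */

/* Sorted insert; the caller guarantees q_len < QLEN. */
static void sim_enqueue(uint32_t bl, int id) {
    size_t i = q_len++;
    while (i > 0 && (int32_t)(queue[i - 1].bl - bl) > 0) {
        queue[i] = queue[i - 1];   /* shift later entries back */
        i--;
    }
    queue[i].bl = bl;
    queue[i].id = id;
    irq_enabled = 1;               /* Wait: non-empty and enabled */
}

/* Timer handler invoked at time `now`; requires a non-empty queue. */
static void sim_handler(uint32_t now) {
    while ((int32_t)(now - queue[0].bl) >= 0) {      /* E_yes */
        released[n_released++] = queue[0].id;        /* pend task */
        q_len--;
        for (size_t i = 0; i < q_len; i++) {         /* dequeue the head */
            queue[i] = queue[i + 1];
        }
        if (q_len == 0) {                            /* Q_no */
            irq_enabled = 0;                         /* disable (T3) */
            return;
        }
    }
    timer_target = queue[0].bl;                      /* E_no: set timer */
}
```

Enqueuing release times 300, 100 and 200 and then invoking the handler at times 50, 250 and 300 walks through E_no, then E_yes/Q_yes twice followed by E_no, and finally E_yes/Q_no, ending in the Idle state. The model keeps at most QLEN released ids, which is sufficient for such a short trace.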
3.3.2 Correctness under Concurrency
The sending task (accessing the timer queue by emitting an async after
X ...) and the timer handler run concurrently, and potentially
preemptively, with other tasks. Hence, we may be exposed to race
conditions. To this end we may either turn to re-entrant (lock-free)
queue implementations [9] or protect the queue as a resource in the
system. For this presentation, we turn to the locking mechanisms provided
by the RTFM-kernel. In Figure 1, the LOCKED(R(tq)) areas (marked
yellow/boxed) indicate the critical sections on the resource R(tq). For
the implementation this amounts to RT_LOCK(lq, R(tq)) on entering and
RT_UNLOCK(lq) on exiting. Since the queue operations are protected by the
resource R(tq), they are safe under concurrency by construction. While
the release of an expired task t [pend task] is executed while holding
the resource R(tq), the SRP protocol ensures that t is only dispatched if
it has a priority higher than the current ceiling. If t accesses the
queue (through an async after X ...), then
$\lceil R(tq) \rceil \geq p(t)$, which prevents dispatching t until R(tq)
is unlocked. (Moreover, under the assumption that the timer handler task
is given a priority equal to or greater than that of t, t will not be
dispatched until the timer handler task finishes.)
[Figure 3: state diagram with the states Idle and Wait and the
transitions T1, T2 and T3.]
Figure 3: Timer states, Idle is the initial state.
3.3.3 Characterization
The presented timer abstraction and its implementation give the
following key characteristics:
• Given a bounded-size queue, tq_enq is a bounded-time operation,
• tq_deq is a constant-time operation (accessing and advancing only the
head of the list),
• timer handling is safe w.r.t. the invariants, and
• it allows implementation (and analysis) as part of the -core
application (see footnote 1).
Timing characteristics have been determined by measuring the clock cycle
count (DWT_CYCCNT) on the current implementations (as presented in the
paper). The experiments have been conducted on an STM32 F4, running at
full speed (168 MHz). The measurements have been repeated and consistent
cycle counts have been observed. For the experiments we have used gcc
v4.8.3 (OL gives the optimization level), with the default settings for
the target architecture. All measurements include the overhead of the
instrumentation code, and are hence safe and pessimistic w.r.t. actual
performance.
The queue implementation has been characterized in Table 1. The Baseline
gives the cycle count (including the call/return overhead) for inserting
last in a queue holding 1 element. The LC gives the linear coefficient
(added cost for each element in the worst case). As expected for
insertion sort, we found the cost to grow linearly with the queue length.
Table 2 shows the latency from the set release time to dispatch in clock
cycles. This gives an upper bound on the dispatch overhead (dispatching
multiple queued tasks without leaving the handler always incurs lower
latency). The blocking (related to tq) incurred by the timer handler is
brought to a constant by escaping the critical section for each
iteration. The Best/Nominal values give the execution path when the
queued task is not at the end of the queue, while the Worst case includes
disabling the timer interrupt.
Footnote 1: In particular, tq_enq is part of the execution time of the
sending task, and the critical section (holding R(tq)) of the timer
handler is constant time (although the execution of the timer handler may
involve iterations). Notice here the "slight escape" from the critical
section when releasing multiple tasks.
Table 1: Complexity of the queueing algorithm (clock cycles).
OL   Baseline   LC
O0   178        26.5
O1   95         10
O2   78         9
O3   78         9
Og   96         9.5

Table 2: Latency from set release time to dispatch (clock cycles).
OL   Best Case   Nominal Case   Worst Case
O0   229         298            338
O1   122         153            188
O2   123         153            217
O3   123         153            217
Og   124         155            195
From this we can conclude that the generic implementation is capable of
low latency dispatch (< 2 µs, scaled to a 100 MHz MCU). We have given the
necessary WCET characterization for blocking, useful for SRP based timing
analysis (e.g., response time and overall schedulability).
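The figures in Tables 1 and 2 can be turned into estimates for other queue lengths and clock frequencies. The sketch below assumes a cost model that is our reading of the table description, namely cost = Baseline + LC · (n − 1) cycles for inserting last into a queue that already holds n ≥ 1 elements; the function names are hypothetical.

```c
/* Estimated worst-case enqueue cost in cycles for a queue that already
 * holds n >= 1 elements, under the ASSUMED linear model
 * Baseline + LC * (n - 1), where Baseline covers the 1-element case. */
static double enq_cycles(double baseline, double lc, unsigned n) {
    return baseline + lc * (double)(n - 1);
}

/* Convert a cycle count to microseconds at core frequency f_hz. */
static double cycles_to_us(double cycles, double f_hz) {
    return cycles / f_hz * 1e6;
}
```

For example, with the O2 figures from Table 1, `enq_cycles(78, 9, 10)` gives 159 cycles for a queue holding 10 elements, and `cycles_to_us(188, 100e6)` scales the O1 worst case from Table 2 to 1.88 µs on a 100 MHz MCU.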
3.4 Vendor specific timers
An ARM Cortex based MCU typically comprises an ARM defined core and a
set of vendor specific peripherals (typically including a set of
timers/counters). Each counter/timer has a defined set of features
(supporting the intended use). The requirements for implementing the
abstract timer architecture boil down to the following:
• n-bit width counter (+ for larger n) with
• interrupt capability (r), programmable priority (+),
• frequency (rate) relation to core-clock defined (r) or
programmable (+), and
• programmable reload (r), match compare register (+).
Features marked (r) are required (and sufficient), while the suitability
is improved (+) by a larger bit width, programmable priority,
programmable frequency and match compare functionality.
As representative use cases we have studied two popular ARM Cortex MCUs,
namely the NXP LPC1769 and the STM32 F4. In the case of the NXP LPC1769
(and similar) a Repetitive Interrupt Timer (RIT) is provided, along with
a set of 4 equivalent and fully programmable 32-bit timers (the latter
meeting all our requirements and suitability criteria). In the case of
the STM32 F407VET (and similar), we find a set of 12 16-bit timers and 2
32-bit timers, meeting the requirements and suitability criteria.
For the implementation, the specialization to a vendor specific timer is
isolated to [set timer]. Writing the match compare register is always a
32-bit access under the ARM memory model (for a 16-bit timer the
underlying hardware merely discards the 16 MSBs), hence the
characterization applies in all cases. Table 3 gives the overhead for the
isolated SetSysTick, while Table 4 depicts the overhead of setting a
vendor specific (STM32 F407VET 32-bit) timer.
Table 3: Characterization of the Timer Set function for the SysTick
Timer (clock cycles).
OL   Best Case   Worst Case
O0   54          67
O1   35          49
O2   33          41
O3   33          41
Og   33          41

Table 4: Characterization of the Timer Set function for a Vendor
Specific timer (clock cycles).
OL   Best Case   Worst Case
O0   37          45
O1   21          28
O2   21          22
O3   21          22
Og   21          22
3.5 Compiler support
In order to automatically generate code for the proposed virtual timers,
the -core to C compiler is required to undertake the following
(additional) steps in the analysis:
1) derive the set of postponed tasks $OT$,
2) associate each postponed task $t_i \in OT$ to a $VT_j$, such that
$p(VT_j) = p(t_i)$ (i.e., assign a virtual timer to each priority level
in the task set $OT$),
3) derive a mapping $M$ from $VT$ to $PT$,
4) derive the static queue length for each $tq_i$, where $PT_i \in PT$
($tq_i$ being a potentially shared queue for $PT_i$, $M(VT_i) = PT_i$),
5) associate each $tq_i$ to a resource $R(tq_i)$, with a ceiling value
assigned under SRP (derived from the priorities of the tasks accessing
the queue),
6) derive a time base $tb(PT_i)$ for each $PT_i$, and
7) generate C code definitions accordingly.
In the generated C code, each task has a defined baseline (set by reading
the hardware timer (T_CURR()) for an externally triggered task, or given
by the sending task). To the kernel we introduce a (queue and timer
implementation independent) macro RTFM_async_i(...) scaling the virtual
time base (in µs) to that of the target $PT_i$. Our prototype -core
compiler implementation assumes the case of a single (shared) physical
timer. (The evaluation of multiple timers has been conducted manually.)
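Steps 4 and 5 above are simple folds over the task set. The sketch below illustrates them for a single shared queue; it is not the rtfm-core compiler's code. The `TaskInfo` record, its fields and the two functions are hypothetical, a larger number is assumed to mean a higher priority, and the queue length assumes one slot per postponed task (at most one outstanding instance each).

```c
#include <stddef.h>

/* Hypothetical compiler-side record for one task. */
typedef struct {
    unsigned priority;     /* larger value = higher priority (assumed) */
    int      postponed;    /* non-zero: released via async after X, X > 0 */
    int      emits_async;  /* non-zero: the task itself enqueues messages */
} TaskInfo;

/* Step 4: one queue slot per postponed task. */
static size_t queue_length(const TaskInfo *ts, size_t n) {
    size_t len = 0;
    for (size_t i = 0; i < n; i++) {
        if (ts[i].postponed) {
            len++;
        }
    }
    return len;
}

/* Step 5: the SRP ceiling of R(tq) is the highest priority among the
 * tasks accessing the queue, i.e. the senders and the timer handler. */
static unsigned queue_ceiling(const TaskInfo *ts, size_t n,
                              unsigned handler_priority) {
    unsigned ceiling = handler_priority;
    for (size_t i = 0; i < n; i++) {
        if (ts[i].emits_async && ts[i].priority > ceiling) {
            ceiling = ts[i].priority;
        }
    }
    return ceiling;
}
```

Both values are known at compile time, so the queue can be allocated statically and the ceiling can be emitted as a constant in the generated C code.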
3.6 Timing performance
For scheduling analysis the timer handlers can be seen as ordinary
tasks, invoked once for the release of each message (plus the number of
range overflows, if any, e.g., in the case of SysTick based solutions).
Provided that the mapping $M$ is complete, there will be no priority
inversion introduced by the timer handlers (as they operate at the same
priority as the tasks they release). A timer handler $t_h$ for a shared
timer may preempt a task $t_j$ ($p(t_h) > p(t_j)$) even though
$p(t_r) \leq p(t_j)$, $t_r$ being the released task.
For vendor specific timers we typically have the option to set the
frequency $f(PT_n)$ (an increased frequency gives improved precision, but
may at the same time increase the background load for processing timer
overruns). The precision appears as a jitter parameter in the scheduling
analysis. (In case the timer operates at the core clock frequency of the
MCU (e.g., for our SysTick implementation), jitter is 0.)
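The number of additional handler invocations caused by range overflows can be estimated from the offset and the timer width. The sketch below is an approximation only: `range_overflows` is a hypothetical helper, the timer is assumed to run at the same frequency as the time base, the width is assumed to be below 64 bits, and the execution time of the handler itself is ignored.

```c
#include <stdint.h>

/* Estimated number of intermediate (overflow) interrupts before a
 * release that lies `diff` timer cycles ahead, for a bw-bit timer
 * whose range is 2^bw cycles. */
static uint64_t range_overflows(uint64_t diff, unsigned bw) {
    if (diff == 0) {
        return 0;
    }
    return (diff - 1) >> bw;
}
```

For an offset of one second at 168 MHz (168,000,000 cycles) the 24-bit SysTick yields `range_overflows(168000000, 24) == 10` extra invocations, while a 32-bit timer yields none. This is the background load that a wider or slower vendor specific timer can avoid.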
3.7 Run-time verification
The proof of correctness for the implementation is informal. To this
end, the T_ENABLE()/T_DISABLE() macros and the tq_enq/tq_deq
implementations have been extended to check the invariants. For run-time
verification of timing constraints, the code generation for tasks is
extended to check, on return of each task $t_i$, the condition
bl_t_i + dl_t_i > T_CURR()
where bl_t_i is the (dynamic) task release time (baseline) and dl_t_i the
specified (relative) deadline.
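The deadline condition can be evaluated so that it remains valid when the 32-bit time base wraps around. The following is a sketch, not the generated code: `deadline_met` is a hypothetical name, all values are in cycles of the time base, and the absolute deadline is assumed to lie less than 2^31 cycles from the current time.

```c
#include <stdint.h>

/* Returns non-zero if the check bl + dl > now holds, i.e. the task
 * returned strictly before its absolute deadline. The difference is
 * computed modulo 2^32 and interpreted as a signed distance. */
static int deadline_met(uint32_t bl, uint32_t dl, uint32_t now) {
    return (int32_t)(uint32_t)(bl + dl - now) > 0;
}
```

A task released at 1000 with a relative deadline of 500 passes the check when it returns at 1400 and fails it at 1500 or later, and the same holds when the absolute deadline has wrapped past zero.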
3.8 Assumptions
Following the general -core assumption on schedulability, any message
can have at most one outstanding instance. This allows the required
(safe) queue length to be derived directly as the number of tasks
associated with the queue. In consequence, baseline offsets (after X ...)
must be less than or equal to the sender's inter-arrival time.
4. RELATED AND FUTURE WORK
In the context of light-weight operating systems, neither ChibiOS [2],
RIOT [3] nor FreeRTOS [1] provides officially characterized queue/timer
implementations. TinyOS [10] (TEP 102/108) suggests a HAL virtualization
layer. However, timers in TinyOS are outside its model of computation and
treated as any other (arbitrary) event source. Contiki [11] provides the
Rtimer library for scheduling real-time tasks; however, unlike in our
approach, its timer tasks are unsafe. Hence, the work presented here can
be considered a baseline for future benchmarking.
Future work includes supporting baseline offsets larger than the
inter-arrival time of the sender. As mentioned in Section 3.5, the
support for abstract timers is currently limited to a single queue/timer
handler. The time base T_CURR is 32-bit, defined by the DWT. This limits
the absolute time offsets. Longer offsets can be obtained at application
level (manually keeping track of the number of activations until the
desired time has expired). Automatic allocation and assignment of
(potentially multiple) timer handlers according to the requirements of
the application can support arbitrary offsets, as well as reduce priority
inversion and overall overhead. Besides temporal properties, issues of
energy consumption may be considered for multi-domain optimization.
Moreover, the presented abstract timer architecture allows for multiple
alternative queue implementations. By analyzing the task set, the
compiler could choose the best fit (linear, heap, etc.) for each queue
according to its characteristics (Section 3.3.3) and overall requirements
(w.r.t. timing, memory, etc.).
5. CONCLUSIONS
In this paper we have introduced abstract timers for the purpose of
platform independent support for postponed tasks. The abstraction allows
timer tasks (handlers) and queues to be statically allocated and included
in system wide compile-time analysis under the task and resource model of
RTFM. We have proposed a generic timer implementation that relies solely
on the ARM defined Cortex-M core and existing RTFM-kernel primitives, and
is thus directly applicable to a wide range of commercially available
MCUs. Correctness has been argued from invariants for queue and timer
task interactions and queue consistency in a concurrent setting. Our
experiments validate the feasibility of the abstract timer architecture,
and the presented characterizations of queuing overhead and
generic/vendor specific timer implementations give concrete bounds,
useful as input to further response time and schedulability analysis.
6. REFERENCES
[1] FreeRTOS. (webpage) Last accessed 2015-09-18. [Online]. Available:
http://www.freertos.org
[2] ChibiOS/RT. (webpage) Last accessed 2015-09-18. [Online]. Available:
http://www.chibios.org
[3] RIOT. (webpage) Last accessed 2015-09-18. [Online]. Available:
http://riot-os.org
[4] E. A. Lee, "The problem with threads," Computer, vol. 39, no. 5,
pp. 33–42, May 2006.
[5] T. Baker, "A stack-based resource allocation policy for realtime
processes," in Real-Time Systems Symposium, 1990. Proceedings., 11th,
Dec. 1990, pp. 191–200.
[6] P. Lindgren, M. Lindner, et al., "RTFM-core: Language and
Implementation," in ESWEEK/CPSArch 2014, 2014.
[7] P. Lindgren, M. Lindner, A. Lindner, V. Vyatkin, D. Pereira, and
L. M. Pinho, "A real-time semantics for the IEC 61499 standard," in ETFA
2015, September 8-11, 2015, Luxembourg, 2015.
[8] J. Eriksson, F. Häggström, S. Aittamaa, A. Kruglyak, and
P. Lindgren, "Real-time for the masses, step 1: Programming API and
static priority SRP kernel primitives," in SIES. IEEE, 2013, pp. 110–113.
[9] A. Kogan and E. Petrank, "A methodology for creating fast wait-free
data structures," ser. PPoPP '12. New York, NY, USA: ACM, 2012,
pp. 141–150.
[10] P. Levis, S. Madden, et al., "TinyOS: An operating system for
sensor networks," in Ambient Intelligence. Springer Verlag, 2004.
[11] A. Dunkels, B. Grönvall, et al., "Contiki - a lightweight and
flexible operating system for tiny networked sensors," in Emnets-I,
Nov. 2004.
