






















Tradeoff analysis for Dependable Real-Time Embedded Systems during the Early
Design Phases




Citation (APA):
Gan, J., Pop, P., & Madsen, J. (2014). Tradeoff analysis for Dependable Real-Time Embedded Systems during
the Early Design Phases. Kgs. Lyngby: Technical University of Denmark (DTU).  (DTU Compute PHD-2014; No.
330).
Tradeoff Analysis for Dependable
Real-Time Embedded Systems




Technical University of Denmark
Department of Applied Mathematics and Computer Science
Building 303B, DK-2800 Kongens Lyngby, Denmark







Embedded systems are becoming increasingly complex and have tight competing con-
straints in terms of performance, cost, energy consumption, dependability, flexibility,
security, etc. The objective of this thesis is to propose design methods and tools for
supporting the tradeoff analysis of competing design objectives during the early design
phases, which are characterized by uncertainties. We consider safety-critical real-time
applications modeled as task graphs, to be implemented on distributed heterogeneous
architectures consisting of processing elements (PEs), interconnected by a shared com-
munication channel. Tasks are scheduled using fixed-priority preemptive scheduling,
and we use non-preemptive scheduling for messages.
As a first step, we address the problem of function-to-task decomposition. In this con-
text we have assumed that the application functionality is captured by a set of functional
blocks, with different safety requirements. We propose a Genetic Algorithm-based
metaheuristic to solve the function-to-task decomposition problem. Our algorithm also
decides the mapping of tasks to the PEs of a distributed architecture and the reliabil-
ity of each PE in the architecture, such that the safety and integrity constraints are
satisfied, the schedulability of the real-time application is guaranteed and the overall
development and product unit costs are minimized.
Next, we investigate tradeoffs between performance, energy and reliability. Address-
ing energy and reliability simultaneously is especially challenging, since lowering the
voltage to reduce the energy consumption has been shown to increase the transient
fault rate. We are interested in tolerating transient faults, and we use task replication for
recovery. We propose a Tabu Search-based approach, which decides the mapping of
tasks to processing elements, as well as the processor voltage and frequency levels for
executing each task, such that transient faults are tolerated, the real-time constraints of
the application are satisfied, and the energy consumed is minimized.
In this thesis, we target the early design phases, when decisions have a high impact
on the subsequent implementation choices. However, due to a lack of information, the
early design phases are characterized by uncertainties, e.g., in the worst-case execu-
tion times (WCETs), in the functionality requirements, or in the hardware component
costs. In this context, we select the hardware components for the architecture and
derive a mapping of tasks in the application, such that the resulting implementation
is both robust and flexible, and the architecture has a high chance of having its unit
cost within the cost budget. Robust means that the application has a high chance of
being schedulable, considering the WCET uncertainties, whereas a flexible mapping
has a high chance to successfully accommodate future functionality changes. We pro-
pose a Genetic Algorithm-based approach to solve this optimization problem. The
proposed tradeoff analysis methods have been evaluated using several synthetic and
real-life benchmarks.
Summary (translated from the Danish)
Embedded systems are becoming ever more complex and have tight, competing
constraints with respect to performance, price, energy consumption, reliability,
flexibility, security, etc. The purpose of this thesis is to propose methods and tools
that support a tradeoff analysis of competing design objectives in the early design
phases, which are characterized by uncertainty. We address safety-critical real-time
applications modeled as task graphs, to be implemented on distributed heterogeneous
architectures consisting of processing elements (PEs) interconnected by a shared
communication channel. Tasks are scheduled using fixed-priority preemptive
scheduling, and we use non-preemptive scheduling for messages.
As a first step, we address the problem of function-to-task decomposition. In this
context, we have assumed that the functionality of the application is described by a
set of functional blocks with different safety requirements. We propose a metaheuristic
based on a genetic algorithm to solve the function-to-task decomposition problem. Our
algorithm also decides the mapping of tasks to the PEs of a distributed architecture and
the reliability of the individual PEs in the architecture, such that the safety
requirements are satisfied, the schedulability of the real-time application is guaranteed,
and the overall development and product costs are minimized.
Next, we investigate tradeoffs between performance, energy and reliability. Handling
energy and reliability simultaneously is particularly challenging, because lowering the
voltage to reduce the energy consumption has been shown to increase the rate of
transient faults. We are interested in tolerating transient faults, and we use task
replication for fault handling. We propose the “Tabu Search” method, which decides the
mapping of tasks to PEs, as well as the processor voltage and frequency levels for
executing each task, such that transient faults are tolerated, the real-time constraints
of the application are satisfied, and the energy consumption is minimized.
In this thesis, we focus on the early design phases, where decisions have a large impact
on the subsequent implementation choices. However, the early design phases are
characterized by a high level of uncertainty due to missing information, e.g., in the
worst-case execution times (WCETs), in the functional requirements, or in the hardware
component costs. In this context, we select the hardware components for the
architecture and derive a mapping of tasks in the application, such that the final
implementation is both robust and flexible. The architecture also has a high chance of
having its unit cost within the cost budget. Robust means that the application has a
high chance of being schedulable, taking the WCET uncertainties into account, while a
flexible mapping has a high chance of successfully accommodating future functional
changes. We propose a genetic algorithm-based methodology to solve this optimization
problem. The proposed tradeoff analysis methods have been evaluated using several
synthetic and real-life benchmarks.
Preface
This thesis was prepared at the Department of Applied Mathematics and Computer
Science, the Technical University of Denmark in fulfillment of the requirements for
acquiring the Ph.D. degree in computer engineering.
In this thesis, we propose methods and tools to support automatic design space ex-
ploration for the design of embedded systems in the early design phases. The thesis
consists of an introductory chapter and three papers.





I would like to express my sincere respect and heartfelt thanks to Paul Pop and
Jan Madsen for their professional supervision. It is a great honor to be their
PhD student. During these years of my PhD, they were always there, providing me with
all the help I needed.
My greatest appreciation goes to Paul Pop, for two things. Firstly, I appreciate that he
selected me to be his PhD student, which gave me a chance to improve my skills and
knowledge and has proved to be a very rewarding experience. Secondly, I appreciate his
patience towards me during these years. I learned a lot from his feedback, and improved
not only my ability to get things done, but also the manner and enthusiasm of my work.
I would also like to express a big thank you to Jan Madsen and Flavius Gruian. They
gave me a lot of inspiring advice and patient reviews on our joint papers.
I have to mention that close to half of my PhD time was spent at the Royal Institute
of Technology (KTH), Sweden. I would especially like to thank Axel Jantsch and Ingo
Sander for providing me with an office and all the other help that enabled me to finish
my PhD. I greatly enjoyed the various conversations with them and my time at KTH.
During my Master's studies at Lund University and Carnegie Mellon University, I
worked under the guidance of Claus Führer and Fernando De la Torre. They
encouraged me and provided inspiration, from which the idea of pursuing a PhD grew
in me.
I would also like to thank all my friends and all my fellow students for their support
and help in different aspects through the years when I was working in Sweden, the USA
and Denmark.
In particular, I want to give my special thanks to my husband, Hao Wang, for his
understanding and support over the years. He always encourages me not to give up,
when I feel discouraged.
Finally, I take this opportunity to express my deep gratitude to my parents, Yang Gan









1.1 Design Metrics
1.1.1 Energy Consumption
1.1.2 Cost
1.1.3 Performance and Predictability
1.1.4 Dependability
1.1.5 Robustness and Flexibility
1.2 Design of Embedded Systems
1.2.1 Systems Engineering Stages
1.2.2 Early Decisions
1.2.3 System-Level Design
1.2.4 Design Challenges
1.2.5 Design Space Exploration
1.3 Thesis Objective and Contribution
2 Paper A: Criticality-Aware Function-to-Task Allocation for Distributed Real-Time Embedded Systems
2.1 Introduction
2.1.1 Related Work
2.2 System Model
2.2.1 Application Model
2.2.2 Platform Model
2.3 Problem Formulation
2.3.1 Motivational Example
2.4 Optimization Strategy
2.4.1 Objective Functions
2.4.2 Genetic Algorithm
2.5 Evaluation
2.6 Conclusion
3 Paper B: Reliability-Aware Dynamic Energy Management for Fault-Tolerant Distributed Embedded Systems
3.1 Introduction
3.2 System Model
3.3 Reliability Model
3.3.1 Energy/Reliability Trade-off Model
3.4 Problem Formulation
3.4.1 Motivational Example
3.5 Offline Synthesis
3.6 Online Scheduling
3.7 Experimental Results
3.7.1 Offline Synthesis Evaluation
3.7.2 Online Scheduling Evaluation
3.8 Conclusions
4 Paper C: Design for Robustness and Flexibility of Real-time Distributed Applications during the Early Design Phases
4.1 Introduction
4.2 System Model
4.2.1 Early Design Stages
4.2.2 Modeling WCET Uncertainties
4.2.3 Modeling Functionality Uncertainties
4.3 Problem Formulation
4.3.1 Robustness and Flexibility
4.4 Schedulability Analysis
4.5 Motivational Example
4.6 Mapping Optimization
4.6.1 NSGA-II for Multiobjectives Optimization
4.6.2 Determining the Mapping of Future Scenarios
4.7 Experimental Results
4.8 Architecture Selection under Uncertainties
4.8.1 Modeling Cost Uncertainty in Architecture Selection
4.8.2 Architecture Selection Problem
4.8.3 Architecture Selection: Motivational Example
4.8.4 GA-based Approach for Architecture Selection
4.9 Conclusion
A List of Abbreviations





We are living in a time of great change, with rapid development of science and
technology. High-tech products are greatly improving our quality of life and work
efficiency. Most of the products we use today are controlled by digital computer systems.
When talking about digital computers, we may immediately think of general purpose
computer systems, such as laptops, desktops or servers. However, the computers
inside most of the devices we use are a special type of computer system, called
embedded systems. Over 98% of the microprocessors produced are used in embedded
systems [EJ09].
Compared to general purpose computer systems, embedded systems are usually
used in a specialized application domain, performing a specific set of functions
repeatedly. Embedded systems are controlled and operated by a predetermined program,
and thus the expected interactions of the embedded system with the external
environment should have been considered and captured in the program.
It is difficult to give a precise definition of an embedded system. Their most important
characteristic is that they are built for performing a specific set of functions, and
thus are not general purpose computers. Another characteristic is that they are tightly
constrained, i.e., embedded systems have requirements such as high performance, low
energy consumption, low cost, small size and short time-to-market. Most embedded
systems should also be continually reactive and perform their functions in real-time,
with high predictability.
A mobile phone is an example of an embedded system, which is designed and used
for making telephone calls, sending messages, and some leisure-time entertainment
activities. With smartphones, the distinction between general purpose computers
and embedded systems becomes less obvious.
Other examples of embedded systems are the automotive electronic functions inside a
vehicle, such as Anti-lock Braking Systems (ABS), engine control, and airbags. A
modern high-end vehicle can have more than 100 microprocessors implementing various
functionalities [Cha09].
Embedded systems are everywhere. People use more than 30 embedded systems in their
daily life and work [EJ09], such as cars, cameras, televisions, printers, and traffic light
systems. Embedded systems can be large, for example, airplanes and ships, or very
small, such as hearing aids or smart sensors. The number of embedded devices grows
by more than 10% per year, and it is estimated that it may reach 40 billion by
2020 [Res12]. According to statistics [Res12], the market for embedded systems was
valued at 121 billion dollars in 2011, and it is expected to grow at a rate of 6.8%,
reaching 194.27 billion dollars in 2018.
The complexity of embedded systems is increasing very rapidly. For instance, more
than 100 million object code instructions (totaling close to 1 Gbyte of software) are
embedded in about 70 chips (electronic control units) in a current new car [HS06,
EJ09], in order to perform the required functions.
The size of embedded software is increasing by 10% to 20% per year, depending
on the application area [EJ09]. Embedded software is more challenging to develop
than traditional desktop software, e.g., Microsoft Word, because embedded systems
have very tight constraints on the requirements that they have to fulfill. For example,
the embedded software controlling a car should react to the driver's operations
quickly enough and provide a dependable service even in the worst-case scenarios.
Developing embedded software is expensive. According to [BCC+05], non-safety-critical
embedded software normally costs between 15 and 30 dollars per line of code to
develop, while for highly critical applications, such as the space shuttle, the cost per
line of code increases to 1,000 dollars.
1.1 Design Metrics
As mentioned, embedded systems have very tight constraints on the requirements that
they have to fulfill. Those requirements are divided into functional requirements and
non-functional requirements. The desired functionality depends on the particular ap-
plication that is implemented by the embedded system. Typical non-functional require-
ments can be expressed in terms of design metrics, such as, time-to-market, size, energy
consumption, cost, performance, predictability, robustness, flexibility, and dependabil-
ity attributes of reliability, safety, security, maintainability and availability. Time-to-
market is the time needed to build a working version of the system. Size means the
physical space required by the system. Other design metrics are considered in our the-
sis and introduced in this section. The formal definition of these design metrics is given
in the included papers: Paper A, Paper B and Paper C.
1.1.1 Energy Consumption
Energy efficiency is one of the most important metrics for embedded systems, since
many are battery-powered mobile devices. For example, mobile phones have stringent
low-power requirements, e.g., about two watts [Mar11], limited by the available
battery technology, so that the battery lasts for a couple of days. The development of
battery technology is far from satisfactory, and does not keep up with the rapidly
increasing computational requirements and complexity of embedded applications [Roa09].
Process technology is targeted towards increased density, to support higher operating
frequencies, resulting not only in higher power density but also in thermal problems.
Energy-efficient design aims to reduce power dissipation as much as possible while the
embedded system still meets its other requirements, such as size, weight, and
performance. Lower power dissipation keeps the temperature of embedded devices at
desirable levels, so that the processors are not damaged and the battery life is extended.
In our thesis, we are interested in reducing the energy consumption without nega-
tively impacting reliability and schedulability. These two design metrics of reliability
and schedulability and their tradeoffs will be introduced later. We measure the en-
ergy consumption of an embedded system in Processing Elements (PEs), which are a
major component of the system-level energy consumption. We use the power model
from [BBL09], as it is able to capture a set of realistic assumptions, such as discrete
operating modes of PEs, the energy and delay due to mode switches, and takes I/O
operations into account. The detailed definition of energy consumption for a specific
embedded system is presented in Section 3.2, Paper B.
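As an illustration of the assumptions listed above (discrete operating modes and an energy overhead for mode switches), the following minimal Python sketch computes the energy of one PE over a sequence of mode intervals. It is not the power model from [BBL09]: the names Mode and energy_joules, the power values and the switch overhead are hypothetical, and the delay due to mode switches and I/O operations are left out.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Mode:
    """A discrete operating mode of a PE (hypothetical example values)."""
    name: str
    power_w: float  # power drawn while running in this mode, in watts


def energy_joules(schedule, switch_energy_j):
    """Energy of a sequence of (mode, seconds) intervals on one PE.

    Each interval contributes power * time; every change of mode between
    consecutive intervals adds a fixed switch overhead (an assumption).
    """
    total = 0.0
    previous = None
    for mode, seconds in schedule:
        if previous is not None and mode != previous:
            total += switch_energy_j
        total += mode.power_w * seconds
        previous = mode
    return total


fast = Mode("fast", 2.0)
slow = Mode("slow", 0.5)
schedule = [(fast, 0.010), (slow, 0.040), (fast, 0.010)]
energy = energy_joules(schedule, switch_energy_j=0.001)
print(f"{energy:.3f} J")
```

With these example values, the three intervals consume 0.02 J each and the two mode switches add 0.001 J each, for a total of 0.062 J.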
1.1.2 Cost
Cost is a key decision factor in most industries. The cost of a system can
be measured in many ways, for example, as the sum of all component costs integrated
in the system, or capturing the design, development and/or manufacturing costs. We
define the Non-Recurring Engineering (NRE) cost as the one-time monetary cost of
designing the system, and we define the unit cost as the monetary cost of manufacturing
each copy of the system, excluding the NRE cost. The NRE cost is fixed once the design
of a system has been done, regardless of how many units are going to be manufactured.
Jakob Axelsson [Axe06] proposed a cost model to estimate the per-product cost, i.e.,
the cost of each copy of the system up to delivery, for a distributed real-time embedded
system. In this thesis, we have adapted the cost model from [Axe06], considering
the cost of an architecture solution in Section 4.8, Paper C, and considering the total
cost of a design alternative that includes the unit cost of hardware components and the
development and certification costs of software tasks in Section 2.4.1, Paper A.
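The relation between the NRE cost and the unit cost defined above can be made concrete with a small Python sketch. It uses the common textbook relation in which the total cost is the NRE cost plus the unit cost times the number of units; this is not the cost model of [Axe06], and the function names and monetary values are hypothetical.

```python
def total_cost(nre_cost, unit_cost, units):
    """One-time NRE cost plus the manufacturing cost of all units."""
    return nre_cost + unit_cost * units


def amortized_cost_per_unit(nre_cost, unit_cost, units):
    """Total cost spread evenly over the manufactured units."""
    return total_cost(nre_cost, unit_cost, units) / units


# Hypothetical figures: 100,000 dollars NRE cost, 25 dollars unit cost.
small_series = amortized_cost_per_unit(100_000, 25, 1_000)
large_series = amortized_cost_per_unit(100_000, 25, 100_000)
print(small_series, large_series)
```

For 1,000 units each copy effectively costs 125 dollars, whereas for 100,000 units it costs 26 dollars, which shows why the NRE cost dominates for small production volumes.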
1.1.3 Performance and Predictability
Performance is typically captured by the execution time of computations and
communications. Taking two mobile phones as an example, the iPhone 5 has better
performance than the iPhone 4, since the iPhone 5 can finish the same task much faster
with its dual core running at 1,300 MHz than the iPhone 4 with its single core at 1,000 MHz.
There are two main measures of performance, depending on what is of concern. One
is the latency or response time, which is the duration of a task’s execution. The other
is the throughput, that is, the number of tasks a system executes per unit time. For
example, let us assume that a camera takes 0.25 seconds to process an image. The
latency of that camera is 0.25 seconds, while its throughput is 4 images
per second [Vah06].
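The camera example can be reproduced in a few lines of Python. The sketch assumes that images are processed strictly one after another, so that the throughput is the reciprocal of the latency; with pipelining or parallel processing this relation no longer holds. The function name is hypothetical.

```python
def throughput_per_second(latency_s):
    """Throughput of a system that processes one item at a time."""
    return 1.0 / latency_s


camera_latency_s = 0.25
camera_throughput = throughput_per_second(camera_latency_s)
print(camera_throughput)  # 4.0 images per second
```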
These measures of performance are not useful for real-time systems. A real-time
system is an embedded system where the correctness of the result depends also on the
time instant at which it is produced [Kop11b]. In this context, other performance
measures, related to predictability, are more appropriate. Predictability is a key property
of any real-time system: its timing requirements must be met. Before we discuss
the timing requirements, we introduce some basic terms used in real-time systems.
There are many models of computation used to capture the functionality of an embed-
ded system [ELLSV97]. In this thesis we have used the task graph model of com-
putation, see the details in the attached papers. A task is defined as a sequence of
instructions, and it is ready to be executed at any time after its release time. The execu-
tion time of the task may be different between task invocations, and this variability of
task execution time is typically due to, for example, the variations in the input data of
tasks or the speculative features of modern processors.
In real-time systems, to be able to provide guarantees, engineers use the Worst-Case
Execution Time (WCET). The WCET is an analytical bound on the execution time
of a task (the execution time is guaranteed not to exceed this bound) and can be
determined using tools such as aiT from AbsInt [FH04]. Note that a WCET is a bound,
i.e., a task may never actually execute for such a long period of time, and we are
interested in WCET values that are as low as possible (i.e., less pessimistic). Similarly,
we can define the Best-Case Execution Time (BCET) of a task. The length of time
between the release time of a task and the time it finishes executing is called the
response time. A task is required to complete before its deadline.
Real-time embedded systems can have two types of timing requirements: hard and soft.
In hard real-time embedded systems, each task must be completed before its deadline.
Otherwise it may lead to catastrophic results. For example, the airbag in a car must be
inflated within 10-20 milliseconds (ms), after a collision is detected. Any delay might
be too late to save the life of the passenger.
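To illustrate how WCETs, response times and deadlines fit together, the following Python sketch applies the classical response-time analysis for fixed-priority preemptive scheduling on a single PE. It is the textbook technique, shown under simplifying assumptions (independent periodic tasks, deadlines not larger than periods, no blocking or communication), and not necessarily the analysis used in the included papers. The task set is hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Task:
    name: str
    wcet: int      # worst-case execution time
    period: int    # time between consecutive releases
    deadline: int  # relative deadline, assumed <= period


def response_time(task, higher_priority):
    """Worst-case response time, or None if the deadline is exceeded.

    Iterates R = C + sum(ceil(R / T_j) * C_j) over all higher-priority
    tasks j until a fixed point is reached.
    """
    r = task.wcet
    while r <= task.deadline:
        interference = sum(-(-r // hp.period) * hp.wcet for hp in higher_priority)
        next_r = task.wcet + interference
        if next_r == r:
            return r
        r = next_r
    return None


def schedulable(tasks_by_priority):
    """True if every task meets its deadline (highest priority first)."""
    return all(
        response_time(task, tasks_by_priority[:i]) is not None
        for i, task in enumerate(tasks_by_priority)
    )


tasks = [Task("t1", 1, 4, 4), Task("t2", 2, 6, 6), Task("t3", 3, 12, 12)]
response_times = [response_time(t, tasks[:i]) for i, t in enumerate(tasks)]
print(response_times, schedulable(tasks))
```

For this task set the worst-case response times are 1, 3 and 10 time units, so all deadlines are met. Increasing the WCET of the lowest-priority task from 3 to 6 makes it miss its deadline of 12.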
In contrast, soft real-time embedded systems accept occasional timing failures, which
will not result in disastrous consequences, but can lead to a certain performance degra-
dation. For example, it is likely you are not aware that the DVD player failed to decode
some frames of the video source, when you watch a movie.
In this thesis, we target hard real-time embedded systems. The metrics used
for predictability are defined in Section 2.4.1 of Paper A, Section 3.5 of Paper B and
Section 4.4 of Paper C.
1.1.4 Dependability
Some embedded systems, such as cars, airplanes, medical equipment, or nuclear
power plants, are safety-critical, since any deviation from the specified functionality
can have catastrophic consequences for people or the environment. Fault tolerance
means that a safety-critical device remains under control and provides alternative
functionality even though an internal fault has occurred and been detected. Such faults
might be permanent (e.g., damaged microcontrollers or communication links), transient
(e.g., caused by electromagnetic interference), or intermittent (they appear and
disappear repeatedly). Transient faults are the most common [Con03], and their number
is increasing due to the rising level of integration in semiconductors.
The failure rate of a system is the number of failures within a given period of time.
It depends on the technology and age of the system, the voltage or physical shocks
that the system suffers, and the ambient temperature in which the system operates. The
typical permanent fault rate has been reported [ZMM04] to be in the range of 10^-8 to
10^-6 faults per hour for a chip. However, the fault rate increases dramatically when the
system runs in harsh environments. For example, for an orbiting satellite it has been
reported that up to 35 errors are found in a 15-minute interval [CMR92], which equals
140 faults/hour.
Software faults (bugs) are permanent, i.e., they are due to specification, design or
implementation mistakes. Unlike hardware, software does not experience transient
faults, since, for example, it does not age. A software bug disappears only if the
software is updated with a new version in which the bug has been removed. In this
thesis, we consider tolerating transient hardware faults that manifest themselves at the
task level.
Dependability is the system property that integrates such attributes as reliability,
availability, safety, security and maintainability [ALR+01]. Reliability is defined
as the probability of successful execution over a period of time under a given
set of operating conditions. We model the reliability of fault-tolerant distributed
embedded systems in Section 3.3, Paper B.
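The fault rates quoted in this section can be related to reliability with a short Python sketch. It assumes a constant fault rate and the resulting exponential reliability function, which is a common simplification; the reliability model actually used in this thesis is the one in Section 3.3, Paper B, and may differ. The function names are hypothetical.

```python
import math


def faults_per_hour(fault_count, interval_minutes):
    """Convert a fault count observed over an interval into an hourly rate."""
    return fault_count * 60.0 / interval_minutes


def reliability(fault_rate_per_hour, hours):
    """Probability of no fault within `hours`, assuming a constant fault rate."""
    return math.exp(-fault_rate_per_hour * hours)


satellite_rate = faults_per_hour(35, 15)   # the satellite example: 140.0
chip_reliability = reliability(1e-6, 1000)  # 10^-6 faults/hour over 1,000 hours
print(satellite_rate, chip_reliability)
```

Under this assumption, a chip with 10^-6 faults per hour operates for 1,000 hours without a fault with a probability of about 0.999.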
Safety describes the property that a system will not endanger human life or the
environment. A hazard is a situation in which there is active or potential danger to
people or the environment. Risk is a combination of the probability of a hazardous
event and its consequence. If, after performing an initial hazard and risk analysis, a
system is considered safety-critical, it has to be certified [KK10]. Certification is a
“conformity of assessment” performed by a third party, e.g., an independent organization
or a national authority, namely a “certification authority”.
The current certification practice is “standards-based” [Rus07], and requires that the
product and the development processes fulfill the requirements and satisfy the objec-
tives of a certain certification standard, depending on the application area. For ex-
ample, [IEC10] is used in industrial applications, [ISO09] is for the automotive area,
whereas [RTC92] refers to software for airborne systems.
During the engineering of a safety-critical system, the hazards are identified and their
severity is analyzed, the risks are assessed and the appropriate risk control measures are
introduced to reduce the risk to an acceptable level. A Safety-Integrity Level (SIL) is
allocated to each safety function and captures the required level of risk reduction. SIL
allocation is typically a manual process, which is done after performing hazard and risk
analysis [Sto96], although a few researchers have proposed automatic approaches for
SIL allocation [PWR+10b]. SILs differ slightly among areas. For example, the avionics
area uses five “Design Assurance Levels” (DALs), from DAL E (least critical) to
DAL A (most critical), while ISO 26262 specifies four “Automotive Safety Integrity
Levels” (ASILs) for the automotive area, from ASIL A (least critical) to ASIL D (most
critical). However, the approach presented in this thesis is applicable to all
safety-critical areas, regardless of the standard. SILs are assigned to functional blocks,
from SIL 4 (most critical) to SIL 0 (non-critical).
Security concerns the property that a system is able to protect the confidentiality, in-
tegrity, and availability of data and guarantee their authenticated communication. Cer-
tification standards require that tasks of different SILs are separated. In addition, they
also impose constraints on the communication to ensure data integrity. These con-
straints are similar to the Bell-LaPadula [BL73] and Biba [Bib77] data integrity mod-
els from the security domain. In our thesis, we have modeled the safety and integrity
requirements for criticality-aware functionality allocation and information communi-
cation in Section 2.2, Paper A.
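As a rough illustration of a Biba-style communication constraint, the Python sketch below flags messages whose data flows from a task with a lower SIL to a task with a higher SIL. The rule, the task names and the SIL assignments are hypothetical and only show the flavor of such a constraint; the safety and integrity requirements actually modeled in this thesis are those of Section 2.2, Paper A.

```python
from enum import IntEnum


class SIL(IntEnum):
    SIL0 = 0  # non-critical
    SIL1 = 1
    SIL2 = 2
    SIL3 = 3
    SIL4 = 4  # most critical


def integrity_violations(task_sil, messages):
    """Return the (sender, receiver) pairs where a more critical receiver
    would depend on data produced by a less critical sender."""
    return [(s, r) for s, r in messages if task_sil[s] < task_sil[r]]


# Hypothetical tasks and messages.
task_sil = {
    "brake_control": SIL.SIL4,
    "diagnostics": SIL.SIL1,
    "display": SIL.SIL0,
}
messages = [("brake_control", "display"), ("diagnostics", "brake_control")]
violations = integrity_violations(task_sil, messages)
print(violations)
```

In this example only the message from the diagnostics task to the brake control task is flagged, since it lets less critical data influence a more critical task.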
1.1.5 Robustness and Flexibility
Design metrics are often difficult to quantify exactly due to, for example, uncertainties
such as the variability of task execution time discussed in Section 1.1.3; hence there can
be variations in their values. Also, the requirements of a system can change, especially
in the early design phases. In addition, the environment in which an embedded system
operates will undergo changes. In this context, the issues of robustness and flexibility
are of utmost importance.
Robustness is generally defined as the ability of a system to resist change without alter-
ing its implementation. In our thesis, robustness means that the application has a high
chance of being schedulable, considering uncertainties (variations) in WCETs. For a
formal definition, see Section 4.3.1, Paper C.
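The notion of "a high chance of being schedulable" can be illustrated with a small Monte Carlo sketch in Python. It is not the definition from Paper C: we assume, purely for illustration, that each WCET is drawn independently and uniformly from a given range, that all tasks run on a single PE under rate-monotonic priorities with deadlines equal to periods, and that schedulability is checked with the sufficient utilization bound of Liu and Layland. All names and values are hypothetical.

```python
import random


def utilization(wcets, periods):
    """Processor utilization of a periodic task set."""
    return sum(c / t for c, t in zip(wcets, periods))


def estimate_robustness(wcet_ranges, periods, samples=10_000, seed=0):
    """Fraction of sampled WCET vectors that pass the utilization test."""
    rng = random.Random(seed)  # fixed seed for reproducible estimates
    n = len(periods)
    bound = n * (2 ** (1 / n) - 1)  # sufficient bound for rate-monotonic
    passed = 0
    for _ in range(samples):
        wcets = [rng.uniform(low, high) for low, high in wcet_ranges]
        if utilization(wcets, periods) <= bound:
            passed += 1
    return passed / samples


periods = [10, 20, 40]
uncertain = estimate_robustness([(2, 4), (4, 6), (8, 12)], periods)
always_ok = estimate_robustness([(1, 1), (1, 1), (1, 1)], periods)
never_ok = estimate_robustness([(10, 10), (10, 10), (10, 10)], periods)
print(uncertain, always_ok, never_ok)
```

Because the utilization bound is only a sufficient test, the estimate is pessimistic: some sampled task sets that fail the test may still be schedulable.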
Flexibility is generally defined as the ability to adapt to change. Many things can
change during the engineering of an embedded system, e.g., the initial requirements,
the functionality. Also, new functions are always added in subsequent product versions.
Performing changes is often very costly and requires extensive validation. For example,
the time-to-market of a power-train unit in the automotive industry is 24 months. Five
of these months are used for validation, i.e., validation accounts for more than 20% of
the time-to-market [PEPP04].
In our thesis work, flexibility is defined as the likelihood of successfully implement-
ing the future functionality changes, which have been modeled as “future scenarios”
[BE06]. The detailed definition of flexibility is presented in Section 4.3.1, Paper C.
1.2 Design of Embedded Systems
The design of embedded systems is facing ever-increasing demands from the rapidly
growing complexity of embedded systems and the competing constraints. The constraints
that embedded systems should meet are measured by design metrics such as energy
consumption, cost, predictability, robustness, flexibility and reliability, which are
described in the previous section. The design methodology used in many organizations
follows some version of the “waterfall” model [Est07]. Such a methodology is
inadequate for complex systems, and many other methodologies have been proposed,
such as the V-model of system development [Est07].
Figure 1.1: Life Cycle of Systems Engineering
In the following subsections, we first introduce the different stages of the systems
engineering life cycle, and then present the system-level design methodology. We continue
with the main challenge in designing embedded systems, which is to determine an
implementation that simultaneously optimizes the competing design metrics.
Finally, methods and tools for automatic design space exploration are discussed.
1.2.1 Systems Engineering Stages
There are many life cycle models used in the industry [Est07], such as waterfall mod-
els [Boe88] and V-models [FHKS09], but in principle they all have the following main
stages: concept development, engineering development and post-development, see the
top part of Fig. 1.1 [KSSB11]. During concept development, system engineers perform
a needs analysis, do concept exploration and definition. Engineering development con-
sists of advanced development, engineering design, integration and evaluation. Then,
in post-development, the mass-production from the prototype starts, followed by the
operation and maintenance of the system. In this thesis, we focus on the early design
stages, see the following subsection.
1.2.2 Early Decisions
Experience from completed system design and development projects indicates that
the concept development stage accounts for about 20% of the total cost. However,
80% of the cumulative cost of the system is already committed in this stage [Bue11].
Decisions made in the early design stages not only have a high impact on the
subsequent implementation choices, but also substantially affect the total cost of the
system, since it is costly and time-consuming to replace a chosen design with an
alternative design in the engineering development stage. That is, early decisions offer
the greatest opportunity to influence the cost-effectiveness of the system.
For example, [Tas02] reports, in the context of testing the software of a system, that
70% of faults are introduced in the concept development stage, while 30% are found
in the engineering development stage. However, removing faults in the engineering
development stage is much more expensive (5-10 times) than fixing them in the early
design stages of concept development.
The bottom part of Fig. 1.1 [SR11] shows the projected committed cumulative cost, and
the impact opportunity of a design decision over time. The conclusion is that we should
spend more effort on the early decisions, in order to increase the chance of completing
the projects on time and successfully.
There is a lot of research on embedded systems design [Mar11,GAGS09], but very few
researchers have addressed the early design stages. The challenge is that early design
stages are characterized by many uncertainties.
Uncertainty in the context of systems engineering refers to the inability to determine
precisely the state or attributes of a system. It can be caused by incomplete knowledge
or by stochastic variability. A detailed discussion on the taxonomy of uncertainty is
available in [Hai11].
Uncertainties in the early design stages need to be quantified, such that risks, defined
here as the probability of not reaching design targets, of different design alternatives can
be estimated for making early design decisions. In Paper C, we have modeled the un-
certainties of task WCETs (Section 4.2.2), functionality requirements (Section 4.2.3),
and hardware component costs (Section 4.8.1) in the early design stages, and consid-
ered them in evaluating the design metrics of robustness, flexibility and cost.
Figure 1.2: System-level Design
1.2.3 System-Level Design
The aim of any design methodology is to minimize the time-to-market, and coordinate
the design tasks such that design metrics are simultaneously optimized. In the embedded
systems area, the design methodology is typically organized as in Fig. 1.2. We
use boxes with rounded corners for input and output information, and rectangles for
design tasks. Our thesis work includes the steps which are inside the blue box. The
low-level development, i.e., a full synthesis of the design solution, whose steps are un-
der the blue box of Fig. 1.2, is outside the scope of our thesis work. In the following,
we explain the system-level design process (a) - (f).
(a) Besides demanding requirements on functionality, embedded systems also have to
meet non-functional requirements, which are captured by the “design metrics” presented
in Section 1.1.
(b) The applications are typically modeled using formalisms from the particular domain.
For example, the functionality of a vehicle can be described in terms of control algo-
rithms using differential equations to model the behavior of the vehicle and its environ-
ment. Researchers have used many Models of Computation (MoC) [ELLSV97], such
as task graphs or synchronous data flow networks, for the modeling of embedded systems.
In our thesis work, the detailed application models are presented in Section 2.2.1
of Paper A, Section 3.2 of Paper B and Section 4.2 of Paper C.
(c) In our thesis work, the system platform is modeled at the system level, as a set of
heterogeneous processing elements interconnected by a shared communication channel.
We consider two cases for determining the system platform. In the first case, the
system platform is fixed and no modification is allowed; this case is considered in
Section 3.2, Paper B. In the second case, the system platform can be parametrized,
e.g., in terms of the number, type and performance of its components.
We consider parametrized architectures in Section 2.2.2, Paper
A and Section 4.8, Paper C.
(d) Once the application functionality and system platform have been modeled, several
system-level design tasks are performed. The design tasks are typically performed
automatically with the help of tools.
In our thesis work, we have considered the following system-level design tasks.
• Function-to-task allocation determines how functions are decomposed into tasks
for implementation. In Fig. 1.2(b), an application is composed of several
functionalities with safety requirements. In the early design stages, such
functionalities are captured using functional blocks of different SILs. At the
implementation level, the functional blocks from the design level have to be transformed into
software or hardware tasks, or a combination of both. The certification standards
allow several decomposition options. The function-to-task allocation problem is
addressed in Paper A.
• Architecture selection determines the system platform. In Fig. 1.2(c), when the
given system platform can be parametrized (in Paper A and Paper C), we may
decide the number of components interconnected in the system platform, and
choose the components from different types, performance and other features to
build the system platform.
• Mapping decides the assignment of tasks to PEs and of messages to buses. The
mapping problem has been tackled in all our papers included in the thesis.
• Voltage scaling refers to the assignment of processor operating modes (consisting
of voltage/frequency pairs) to each mapped task. We consider that each PE is
characterized by a set of operating modes. Voltage scaling is the focus of Paper
B, together with mapping.
• Scheduling decides the execution order of the mapped tasks on the PE or mes-
sages on the bus. In all our papers included in the thesis, tasks are scheduled
using fixed-priority preemptive scheduling, and messages are scheduled using
fixed-priority non-preemptive scheduling.
(e) The outcome of box (d) in Fig. 1.2 is a “model of system implementation”. A huge
number of alternative implementation solutions will be visited during the design space
exploration. Each such alternative implementation is evaluated in terms of the design
metrics, captured in box (f) of Fig. 1.2.
(f) The evaluation can be done analytically or using simulation. In our thesis work, we
used analytical models for evaluating a wide range of design metrics, and those models
are presented in detail in each included paper.
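As an illustration of what an analytical evaluation can look like, the sketch below implements the classic response-time analysis for fixed-priority preemptive scheduling on a single PE. It is a minimal sketch under simplifying assumptions (deadlines no larger than periods, no blocking, no release jitter, no fault tolerance), and it is not necessarily the analysis used in the included papers. The task tuples and function names are hypothetical.

```python
from typing import List, Optional, Tuple

# Each task is (wcet, period, deadline) in integer time units. The list is
# ordered from highest to lowest priority. Names are illustrative only.
Task = Tuple[int, int, int]


def response_time(tasks: List[Task], i: int) -> Optional[int]:
    """Worst-case response time of tasks[i], or None if its deadline is missed."""
    wcet, _, deadline = tasks[i]
    r = wcet
    while True:
        # Interference from all higher-priority tasks released in a window of
        # length r; -(-r // t) is an integer ceiling division.
        interference = sum(-(-r // t_j) * c_j for c_j, t_j, _ in tasks[:i])
        r_next = wcet + interference
        if r_next > deadline:
            return None          # the fixed-point iteration exceeded the deadline
        if r_next == r:
            return r             # fixed point reached
        r = r_next


def is_schedulable(tasks: List[Task]) -> bool:
    """True if every task meets its deadline under this simplified analysis."""
    return all(response_time(tasks, i) is not None for i in range(len(tasks)))


if __name__ == "__main__":
    example = [(1, 4, 4), (2, 6, 6), (3, 12, 12)]
    print([response_time(example, i) for i in range(len(example))])
    print(is_schedulable(example))
```

Because such a check takes only a few iterations per task, it can be evaluated for every design alternative visited during the exploration.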
We can perform the modeling, design and evaluation of embedded systems at several
abstraction levels [GK83]. From the lowest to the highest, these are the circuit level,
logic level, register-transfer level and system level. System-level design, by
modeling at a high level of abstraction, can investigate many alternative
implementations of the system and allows fast evaluation of each alternative, such that
the exploration of a huge design space is possible.
1.2.4 Design Challenges
It is challenging to design embedded systems with complex functionality and simulta-
neously optimize multiple design metrics, especially when the design metrics compete
with one another. Approaches used for improving a certain design metric can worsen
another one. Tradeoffs must be analyzed in order to determine the best solutions that
meet such an optimization challenge. In our thesis work, we have measured six de-
sign metrics, i.e., energy consumption, cost, schedulability, robustness, flexibility and
reliability. Their tradeoffs are listed as follows.
• Tradeoffs between schedulability and cost, within the safety requirements.
The system-wide safety requirement is maintained if each component has been
developed and operated under the given safety target. It is a common practice
to decompose the system-wide safety requirements and allocate per-component
safety requirements to hardware and software components, separately. However,
the function-to-task decomposition with safety requirements is not trivial.
For each function with a given criticality level, there are several decomposition
options, with varying impact on total cost (including the development cost of
software tasks and the unit cost of hardware components) and schedulability of
the implementation.
In Paper A, we have considered mixed-criticality applications to be implemented
on distributed architectures. We have proposed a GA-based approach to decide
the function-to-task allocation, the mapping of tasks to PEs and the reliability of
each PE, such that the safety requirements are preserved, the development and
unit costs are minimized, and the schedulability is maximized.
• Tradeoffs between energy consumption and schedulability, within the reliability
requirements.
The most common approach for energy minimization that allows energy and
performance trade-offs during run-time of the application is Dynamic Voltage
Scaling (DVS) [SAHE03]. DVS aims at reducing the dynamic power consump-
tion by scaling down operational frequency and circuit supply voltage, while
adapting the component (PE or communication link) performance to the actual
requirement of the system. A considerable amount of work has been done on
DVS, see [SAHE03] for a survey.
However, lowering the operating voltage and frequency not only decreases the
performance, but also increases the number of transient faults exponentially. The
main reason for such an increase is that, with lower voltages, even very low
energy particles are likely to create a critical charge that leads to a transient fault.
Moreover, the redundancy-based fault-tolerance techniques (such as replication)
and DVS-based low-power techniques compete for the available slack.
In Paper B, we addressed the mapping, voltage and frequency scaling for fault-
tolerant hard real-time applications mapped on distributed embedded systems.
We captured the effect of voltage and frequency scaling on system reliability,
and we showed that if the supply voltage and the operating frequency are low-
ered to reduce energy consumption, reliability is significantly reduced. We have
proposed an offline synthesis approach, based on a Tabu Search metaheuristic,
which decides the mapping and operating mode for each task such that the en-
ergy is reduced and the schedulability and reliability constraints are satisfied. We
have also proposed an online scheduling approach, which considers the mapping
determined by the offline synthesis approach and decides at runtime the operating
mode for each job such that the energy is further reduced, while guaranteeing
the timing and reliability constraints.
• Tradeoffs among robustness, flexibility, and cost.
In the early design stages of building a new system version, the choice of reusing
legacy architecture components, using or upgrading to new components has a
significant impact on the robustness, flexibility and architecture cost of the new
system version. If we choose to use new hardware components with improved
metrics, such as better performance or lower power dissipation, we
may need to redevelop and validate the system platform, which results in high
costs for this new architecture solution and high uncertainties in evaluating the
WCETs of tasks. If we migrate the legacy hardware components from
previous products, the cost of such an architecture solution should be much lower
than in the former case, and the task WCETs are more certain as well, but we
may not benefit from the improved metrics of the new architecture.
In Paper C, we have addressed the architecture selection and the mapping of
hard real-time applications on distributed heterogeneous architectures, during the
early design phases. We have proposed a GA-based optimization for architec-
ture selection and task mapping, targeting robustness, flexibility, and cost, which
takes into account the uncertainties in WCETs, functionality requirements, and
hardware component costs. The proposed model allows the system
designer to make the early design decisions after considering the tradeoffs among
robustness, flexibility, and cost.
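The energy/reliability tradeoff described above for DVS can be made concrete with a small numerical sketch. It assumes that dynamic energy per cycle is proportional to the square of the supply voltage and that the transient fault rate grows exponentially as the normalized frequency is lowered, which is one form of exponential model found in the DVS literature. All constants, operating modes and function names below are hypothetical and are not taken from Paper B.

```python
import math

# All constants are assumptions chosen for illustration only.
LAMBDA_0 = 1e-6   # fault rate (faults/s) at the maximum frequency
D = 3.0           # sensitivity of the fault rate to frequency scaling
F_MIN = 0.4       # lowest normalized frequency (the maximum is 1.0)
C_EFF = 1e-9      # effective switched capacitance (F)


def fault_rate(f_norm: float) -> float:
    """Assumed model: the fault rate grows exponentially as f_norm drops."""
    return LAMBDA_0 * 10 ** (D * (1.0 - f_norm) / (1.0 - F_MIN))


def dynamic_energy(cycles: float, voltage: float) -> float:
    """Dynamic energy of a task: C_eff * V^2 per executed cycle."""
    return C_EFF * voltage ** 2 * cycles


def evaluate_mode(cycles: float, f_max_hz: float, f_norm: float, voltage: float):
    """Return (execution time, energy, probability of at least one fault)."""
    exec_time = cycles / (f_norm * f_max_hz)
    energy = dynamic_energy(cycles, voltage)
    p_fault = 1.0 - math.exp(-fault_rate(f_norm) * exec_time)
    return exec_time, energy, p_fault


if __name__ == "__main__":
    # Hypothetical operating modes: (normalized frequency, supply voltage).
    for f_norm, volts in [(1.0, 1.8), (0.7, 1.4), (0.4, 1.0)]:
        t, e, p = evaluate_mode(1e6, 1e8, f_norm, volts)
        print(f"f={f_norm:.1f} V={volts:.1f} time={t:.4f}s energy={e:.2e}J p_fault={p:.2e}")
```

Under these assumptions, each step down in voltage and frequency saves energy but lengthens the execution time and raises the fault probability, so less slack remains for re-execution or replication.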
1.2.5 Design Space Exploration
Design space exploration (DSE) is the process of visiting and analyzing the design
alternatives to identify good quality solutions. The search in the design space is
represented by the three arrows in Fig. 1.2, which repeatedly visit new design
alternatives in (b), (c) and (d). The design alternatives are created by performing the
design tasks presented in Section 1.2.3.
DSE can be done manually, which is often the case in practice, but this is very ineffi-
cient for designing complex systems. In our thesis, we do automatic DSE supported
by tools, and the search in the automatic DSE tool is implemented using optimization
algorithms, in order to determine good quality solutions in a reasonable time.
We consider an m-dimensional space X of possible design alternatives, and an
n-dimensional space F of objective values, i.e., values of the design metrics.
F(x) = opt(f_1(x), ..., f_n(x)) (1.1)
where x ∈ X, and opt denotes minimizing or maximizing each objective, e.g., minimizing
f_1(x) and maximizing f_2(x), ..., f_n(x).
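A design solution x ∈ X is called Pareto-optimal with respect to X iff there is no design
y ∈ X such that u = F(x) is dominated by v = F(y). Here, v dominates u if v is no worse
than u in every objective and strictly better in at least one.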
In our thesis work, we consider simultaneously optimizing multiple competing design
metrics. In Paper B, we merge all design metrics into a single objective function by
using a weighted average, and determine one optimized solution, while in Paper A and
Paper C, we perform multiobjective optimization, which aims to determine a Pareto front
of solutions.
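The two ways of handling multiple objectives mentioned above can be sketched as follows. This is a minimal sketch, assuming that every objective is to be minimized (an objective to be maximized can be negated first). The candidate values and weights are invented for illustration, and a plain weighted sum stands in for the weighted average.

```python
from typing import List, Sequence, Tuple

Objectives = Tuple[float, ...]  # assumption: every entry is to be minimized


def dominates(v: Objectives, u: Objectives) -> bool:
    """v dominates u: no worse in every objective, strictly better in one."""
    return all(a <= b for a, b in zip(v, u)) and any(a < b for a, b in zip(v, u))


def pareto_front(points: List[Objectives]) -> List[Objectives]:
    """Keep only the points that no other point dominates."""
    return [u for u in points if not any(dominates(v, u) for v in points)]


def weighted_sum(u: Objectives, weights: Sequence[float]) -> float:
    """Merge all objectives into a single value to be minimized."""
    return sum(w * a for w, a in zip(weights, u))


if __name__ == "__main__":
    # Hypothetical (cost, degree of unschedulability) pairs for five designs.
    candidates = [(10.0, 0.9), (12.0, 0.5), (15.0, 0.4), (14.0, 0.7), (11.0, 0.95)]
    print(pareto_front(candidates))
    print(min(candidates, key=lambda u: weighted_sum(u, (0.1, 1.0))))
```

The weighted sum returns one solution whose position depends on the chosen weights, whereas the Pareto front leaves the final choice among non-dominated solutions to the designer.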
In general, multiobjective optimization problems can be solved with exact methods, such
as integer or mixed-integer linear programming formulations, or with heuristics, such as
Tabu Search (TS) or Genetic Algorithms (GA). In our thesis work, the optimization
problems we addressed (in Paper A, Paper B and Paper C) are NP-hard. In order to find
high quality solutions in a reasonable time, we proposed a TS-based heuristic in Paper B,
and used GA-based heuristics for multiobjective optimization in Paper A and Paper C.
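To give an idea of how such a heuristic explores the design space, the sketch below shows a generic Tabu Search skeleton for a toy mapping problem: assign tasks to PEs so that the load of the most loaded PE is minimized. It is only a sketch of the metaheuristic, not the algorithm proposed in Paper B. The cost function, the move definition, the tabu tenure and all names are assumptions.

```python
from typing import Dict, List, Tuple


def max_load(mapping: List[int], util: List[List[float]], n_pes: int) -> float:
    """Cost of a mapping: utilization of the most loaded PE."""
    load = [0.0] * n_pes
    for task, pe in enumerate(mapping):
        load[pe] += util[task][pe]   # util[task][pe]: load of task on that PE
    return max(load)


def tabu_search(util: List[List[float]], n_pes: int,
                iterations: int = 50, tenure: int = 3) -> Tuple[List[int], float]:
    n_tasks = len(util)
    current = [0] * n_tasks                      # start with all tasks on PE 0
    best, best_cost = list(current), max_load(current, util, n_pes)
    tabu: Dict[Tuple[int, int], int] = {}        # (task, pe) -> tabu until iteration
    for it in range(iterations):
        candidates = []
        for task in range(n_tasks):
            for pe in range(n_pes):
                if pe == current[task]:
                    continue
                neighbor = list(current)
                neighbor[task] = pe              # move: remap one task
                cost = max_load(neighbor, util, n_pes)
                is_tabu = tabu.get((task, pe), -1) >= it
                # Aspiration: a tabu move is allowed if it beats the best so far.
                if is_tabu and cost >= best_cost:
                    continue
                candidates.append((cost, task, pe))
        if not candidates:
            break
        cost, task, pe = min(candidates)         # best admissible move, even if worse
        tabu[(task, current[task])] = it + tenure  # forbid moving straight back
        current[task] = pe
        if cost < best_cost:
            best, best_cost = list(current), cost
    return best, best_cost


if __name__ == "__main__":
    utilization = [[0.5, 0.6], [0.4, 0.3], [0.3, 0.5], [0.2, 0.2]]
    print(tabu_search(utilization, n_pes=2))
```

Accepting the best admissible move even when it worsens the cost, while forbidding recent reversals, is what allows the search to leave local optima.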
1.3 Thesis Objective and Contribution
The objective of this thesis is to propose design methods and tools for the design of
embedded systems. We are interested in addressing competing design metrics such
as energy consumption, cost, schedulability, reliability, robustness and flexibility, and
to support the designer making early design decisions during the early design phases,
which are characterized by uncertainties. The proposed methods have been imple-
mented in optimization tools which use advanced optimization techniques to determine
good quality solutions in a reasonable time.
The following papers and contributions have been included in the thesis:
• Paper A: Gan, Junhe, Paul Pop, Domitian Tamas-Selicean and Jan Madsen.
“Criticality-Aware Functionality Allocation for Distributed Real-Time Em-
bedded Systems.” To be submitted to International Journal of Reliability, Qual-
ity and Safety Engineering (IJRQSE). A preliminary version of this paper has
been accepted to the Design, Automation and Test in Europe (DATE) confer-
ence workshop: Performance, Power and Predictability of Many-Core Embed-
ded Systems (3PMCES), 2014.
Gan, Junhe, Paul Pop and Jan Madsen. “Criticality-Aware Functionality Al-
location for Distributed Multicore Real-Time Systems.” In the Design, Au-
tomation and Test in Europe (DATE) conference workshop: Performance, Power
and Predictability of Many-Core Embedded Systems (3PMCES), 2014.
In this paper, we are interested in implementing mixed-criticality hard real-time
applications on a distributed heterogeneous hardware architecture, consisting of
a set of processing elements (PEs) interconnected by a shared bus. We assume
that the architecture provides the required separation mechanisms for mixed-
criticality applications. An application is modeled as a set of functional blocks,
with different Safety-Integrity levels (SILs), which dictate the development pro-
cesses and certification procedures that have to be followed. Before the applica-
tions are implemented, the functional blocks have to be decomposed into soft-
ware tasks. There are several decomposition options, which use redundancy
and diversity to achieve the desired SIL. We are interested in determining (1) the
function-to-task allocation, (2) the mapping of tasks to PEs, and (3) the reliability
of each PE, such that the total cost is minimized, the application is schedulable
and the safety and integrity constraints are satisfied. The proposed algorithm has
been evaluated using a synthetic benchmark and a real-life benchmark.
• Paper B: Gan, Junhe, Paul Pop, Flavius Gruian and Jan Madsen. “Reliability-
Aware Dynamic Energy Management for Fault-Tolerant Distributed Em-
bedded Systems.” To be submitted to ACM Transactions on Design Automa-
tion of Electronic Systems (TODAES). A preliminary version of this paper has
been published at the Asia and South Pacific Design Automation Conference
(ASP-DAC), 2011.
Junhe Gan, Flavius Gruian, Paul Pop and Jan Madsen. “Energy/Reliability
Trade-Offs in Fault-Tolerant Event-Triggered Distributed Embedded Systems.”
In Proceedings of the 16th Asia and South Pacific Design Automation
Conference, pp. 731−736. IEEE Press, 2011.
This paper presents an approach to the synthesis of low-power fault-tolerant hard
real-time applications mapped on distributed heterogeneous embedded systems.
We first propose a design-time (offline) synthesis approach which decides the
mapping of tasks to processing elements, as well as the voltage and frequency
levels for executing each task, such that transient faults are tolerated, the timing
constraints of the application are satisfied, and the energy consumed is mini-
mized. Tasks are scheduled using fixed-priority preemptive scheduling, while
replication is used for recovery from multiple transient faults. Addressing en-
ergy and reliability simultaneously is especially challenging, since lowering the
voltage to reduce the energy consumption has been shown to increase the tran-
sient fault rate. We present a Tabu Search-based approach which uses an en-
ergy/reliability trade-off model to find reliable and schedulable implementations
that minimize the energy consumption. We also propose a runtime (online)
synthesis algorithm, which changes dynamically the voltage and frequency levels
of running tasks to reduce further the energy consumption, while guaranteeing
the schedulability of the application and its fault-tolerance to transient faults.
To provide such guarantees, the offline synthesis has to assume the worst-case,
i.e., that tasks will execute up to their worst-case execution times, and the maximum
number of faults will occur. The online scheduling will know at runtime the actual
execution times of tasks and the fault occurrences. We evaluated the proposed
synthesis approaches using several synthetic and real-life benchmarks.
• Paper C: Gan, Junhe, Paul Pop and Jan Madsen. “Design for Robustness and
Flexibility of Real-time Distributed Applications during the Early Design
Phases.” To be submitted to Journal of Systems and Software (JSS). A prelimi-
nary version of this paper has been published at the Design, Automation and Test
in Europe (DATE) conference, 2012.
Gan, Junhe, Paul Pop, Flavius Gruian and Jan Madsen. “Robust and Flexible
Mapping for Real-Time Distributed Applications during the Early Design
Phases.” In Proceedings of the Conference on Design, Automation and Test in
Europe, pp. 935−940. EDA Consortium, 2012.
We are interested in mapping hard real-time applications on distributed hetero-
geneous architectures. An application is modeled as a set of tasks, and we con-
sider a fixed-priority preemptive scheduling policy. We target the early design
phases, when decisions have a high impact on the subsequent implementation
choices. However, due to a lack of information, the early design phases are char-
acterized by uncertainties, e.g., in the Worst-Case Execution Times (WCETs)
or in the functionality requirements. We model uncertainties in the WCETs us-
ing the “percentile method”. The uncertainties in the functionality requirements
are captured using “future scenarios”, which are task sets that model function-
ality likely to be added in the future. In this context, we derive a mapping of
tasks in the application, such that the resulting implementation is both robust and
flexible. Robust means that the application has a high chance of being schedu-
lable, considering the WCET uncertainties, whereas a flexible mapping has a
high chance to successfully accommodate the future scenarios. We propose a
Genetic Algorithm-based approach to solve this optimization problem. We also
show how this problem can be extended to consider the architecture selection:
deciding what hardware components to use in the architecture. In this context,
we consider the uncertainties related to hardware component costs. Extensive
experiments show the importance of taking into account the uncertainties during







In this paper, we are interested in implementing mixed-criticality hard real-time
applications on a distributed heterogeneous hardware architecture, consisting of
a set of processing elements (PEs) interconnected by a shared bus. We assume
that the architecture provides the required separation mechanisms for mixed-
criticality applications. An application is modeled as a set of functional blocks,
with different Safety-Integrity levels (SILs), which dictate the development pro-
cesses and certification procedures that have to be followed. Before the applica-
tions are implemented, the functional blocks have to be decomposed into software
tasks. There are several decomposition options, which use redundancy and diversity
to achieve the desired SIL. We are interested in determining (1) the function-to-task
allocation, (2) the mapping of tasks to PEs, and (3) the reliability of each
PE, such that the total cost is minimized, the application is schedulable and the
safety and integrity constraints are satisfied. The proposed algorithm has been
evaluated using a synthetic benchmark and a real-life benchmark.
Paper A: Criticality-Aware Function-to-Task Allocation for Distributed
Real-Time Embedded Systems
2.1 Introduction
Safety is a property of a system that will not endanger human life or the environment.
Safety-Integrity Levels (SILs) are assigned to safety-related functions to capture the
required level of risk reduction, and will dictate the development processes and cer-
tification procedures that have to be followed [IEC10], [ISO09], [RTC92]. There are
four SIL levels, ranging from SIL 4 (most critical) to SIL 1 (least critical). Certifica-
tion standards require that safety functions of different criticality levels are protected
(or, isolated), so they cannot influence each other. For example, without protection, a
lower-criticality task could corrupt the memory of a higher-criticality task.
The “Research Agenda for Mixed-Criticality Systems” [BBB+09] defines a mixed-
criticality system as “an integrated suite of hardware, operating system and middle-
ware services and application software that supports the execution of safety-critical,
mission-critical, and non-critical software within a single, secure computing platform”.
Many such applications, following physical, modularity or safety constraints, are im-
plemented using distributed architectures, composed of several different types of hard-
ware components (called nodes), interconnected in a network. Initially, each function
was implemented in a separate node, which has led to a large increase in the number of
nodes.
The current trends are towards “integrated architectures”, where several functions are
integrated onto the same node. In this context, designers are relying on partitioning
mechanisms at the platform level. For example, in the avionics area, the partitioned
architecture is called “Integrated Modular Avionics” (IMA) [Ari97], and the platform-
level separation mechanisms are provided by implementations of the ARINC 653 stan-
dard [Ari13]. ARINC 653 consists of hardware-mediated operating system-level spa-
tial and temporal partitioning mechanisms [Rus99]. Similar platform-level separation
mechanisms are available in other industries [Ern10, LSOH07, PTV+13].
In this paper we are interested in the implementation of mixed-criticality hard real-time
applications on integrated distributed architectures. The functionality of such applica-
tions is captured in the early design stages using functional blocks of different SILs.
Before the applications are implemented, the functional blocks have to be decomposed
into software tasks. For example, in the automotive area [CGL+07], the functionality
is captured at the “Vehicle” and “Analysis” levels using functional blocks. At the
“Design” level these functions are decomposed into tasks, which are implemented on
the target architecture at the “Implementation” level.
SIL allocation of functional blocks is typically a manual process, which is done af-
ter performing hazard and risk analysis, but researchers have proposed automatic ap-
proaches for SIL allocation [PWR+10a]. However, no automatic approaches have been
proposed for the function-to-task allocation.
The function-to-task allocation problem is not trivial, since, for each function with a
given SIL k, there are several decomposition options. For example, for a SIL 3 func-
tion, the automotive certification standard ISO 26262 [ISO11] allows several possible
decompositions: a single SIL 3 task; a SIL 2 task and a SIL 1 task; or, since the SIL 2
task from the previous decomposition can itself be decomposed into two SIL 1 tasks,
three SIL 1 tasks. Decomposing a SIL 3 function into two or even more tasks of lower SILs may reduce
the development and certification costs. Such decompositions rely on redundancy and
software diversity to achieve the desired SIL level using lower SIL tasks.
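The decomposition options in this example can be enumerated mechanically. The sketch below assumes the additive rule suggested by the SIL 3 example, namely that the SILs of the resulting tasks add up to the SIL of the function; the decomposition tables of the applicable standard remain the authority. The exponential cost function and its base are purely hypothetical.

```python
from typing import Iterator, List, Optional


def decompositions(sil: int, max_part: Optional[int] = None) -> Iterator[List[int]]:
    """Yield lists of task SILs that add up to `sil` (assumed additive rule)."""
    if max_part is None:
        max_part = sil
    if sil == 0:
        yield []
        return
    # Choose the highest-SIL task first, then decompose the remainder with
    # tasks of equal or lower SIL so that each option is listed once.
    for part in range(min(sil, max_part), 0, -1):
        for rest in decompositions(sil - part, part):
            yield [part] + rest


def development_cost(task_sils: List[int], base: float = 4.0) -> float:
    """Hypothetical cost model: the cost of a task grows exponentially with its SIL."""
    return sum(base ** k for k in task_sils)


if __name__ == "__main__":
    for option in decompositions(3):
        print(option, development_cost(option))
```

With this hypothetical cost model, the options with more low-SIL tasks are cheaper to develop, but they add tasks that must be mapped and scheduled.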
The decomposition will have an impact on the total cost and the schedulability of the
implementation. The development and certification costs of a software task grow expo-
nentially with each SIL [IBM10]. We assume that for each SIL k we have a correspond-
ing required reliability Rk of processing element (PE). That is, high-criticality tasks are
only allowed to be mapped on a correspondingly high reliability PE. Our optimization
approach decides the type of PEs to be used in the architecture, and high-reliability PEs
are more expensive, which impacts the total cost. In addition, the function-to-task
decomposition will affect schedulability, because it may introduce additional tasks to be
scheduled and because it may require message validation for communication integrity,
which introduces additional overheads.
We consider heterogeneous distributed platforms, consisting of several PEs intercon-
nected using a broadcast bus. We assume that the platform provides both spatial and
temporal partitioning, thus enforcing enough separation for the mixed-criticality appli-
cations. The separation among mixed-criticality tasks also affects the communication.
For example, integrity models require that a task can only receive an input from a task
of the same criticality level or higher than its own. In this paper we consider the
ACROSS integrity model [Was14], which allows lower-criticality tasks to send mes-
sages to higher-criticality tasks if they pass first through a validation middleware. We
assume that the platform provides such a validation middleware to be used for commu-
nication integrity.
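The communication rule described above can be expressed as a simple check. This is a sketch of the rule as stated in the text only. The message representation and the fixed per-message validation cost are assumptions made for illustration.

```python
from typing import List, Tuple

# A message is represented by (sender SIL, receiver SIL); values are hypothetical.
Message = Tuple[int, int]


def needs_validation(sender_sil: int, receiver_sil: int) -> bool:
    """A message flowing from lower to higher criticality must be validated."""
    return sender_sil < receiver_sil


def validation_overhead(messages: List[Message], cost_per_message: float) -> float:
    """Total extra time spent in the validation middleware (assumed fixed cost)."""
    return cost_per_message * sum(needs_validation(s, r) for s, r in messages)


if __name__ == "__main__":
    msgs = [(1, 3), (3, 1), (2, 2), (1, 2)]
    print([needs_validation(s, r) for s, r in msgs])
    print(validation_overhead(msgs, 0.5))
```

Messages from a task of the same or higher criticality pass unchanged, so only the decompositions that create low-to-high communication pay the validation overhead.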
Each partition can have its own scheduling policy. However, to simplify the discus-
sion, in this paper, we assume that all applications are scheduled using Fixed-Priority
Preemptive Scheduling (FPPS). Although we address hard real-time applications, (non-
critical) soft real-time applications can also be handled using a technique such as the
Constant Bandwidth Server [AB98], where the server is seen as a hard task providing
a desired level of service to soft tasks.
We assume that the communication protocol has mechanisms to enforce partitioning
at the bus level. For example, spatial partitioning is attained in SAFEbus [HD93]
by mapping the messages to unique locations in the inter-module memory, protected
by a memory-mapping hardware in the host, and temporal partitioning is achieved in
TTP [Kop11a] by enforcing a Time-Division Multiple Access scheme. TTEthernet [AS11]
offers spatial separation for mixed-criticality messages through the concept of virtual
links, and temporal separation, enforced through schedule tables for time-triggered
messages and bandwidth allocation for rate-constrained messages. Researchers have
shown how realistic bus protocols such as TTP [PEP04], FlexRay [PPE+08] and TTEth-
ernet [TSPS12b] can be taken into account during the design. However, in this paper
we consider a simple bus where messages are transmitted using a fixed-priority non-
preemptive policy.
Given a mixed-criticality application modeled as a set of safety functions to be implemented
on such a distributed architecture, we are interested in determining the function-to-task
decomposition, the type of PEs in the architecture and the mapping of tasks to
the PEs, such that the total cost is minimized, the application is schedulable and the
safety and integrity constraints are satisfied.
2.1.1 Related Work
There is a large amount of research on hard real-time systems [Kop11a, But97], in-
cluding task mapping to heterogeneous architectures [BSB+01]. Researchers have
also started to address the mapping problem in the context of mixed-criticality sys-
tems [TSP11a], where the problem is to decide the assignment of tasks to partitions and
the mapping to PEs such that the applications are schedulable. The work in [TSP11a]
can also decide to “elevate” a task, i.e., raise its SIL, if this is needed to improve the
schedulability, and uses a SIL-related development cost model to support the reduction
of the overall cost.
Deciding the SIL of a safety function is typically a manual process,
which is done after performing hazard and risk analysis. In this context, researchers
have started to propose automatic approaches to this problem, which is called “SIL
allocation” [PWR+10b]. A broader view is taken by [SBK], which proposes a
method for the propagation, transformation and refinement of safety requirements in
general.
Some of the work on SIL allocation addresses also SIL decomposition (which they
call “SIL algebra”), but in the context of safety functions implemented using hardware
architectures [PWA+13,APW+13,BDS11]. [PWA+13] have proposed a Genetic Algo-
rithm for SIL decomposition and [APW+13] have proposed a Tabu Search metaheuris-
tic. Both works aim at a SIL decomposition which reduces the development costs,
and are interested in deriving a fault-tolerant architecture. The safety of the resulting
architecture is evaluated using Fault-Tree Analysis.
Researchers have proposed [BDS11] a tool called DALculus for the automatic allo-
cation of SILs (which in the avionics area are called “Design Assurance Levels” or
DAL) such that the smallest DAL possible is allocated to a function in order to minimize
costs, while following the recommended practices in the avionics area. The rules
from [ARP10] and the architecture constraints are formulated as a constraint satisfac-
tion problem and solved with an existing solver.
However, when performing SIL allocation and decomposition at the level of safety
functions, special care has to be taken not to abuse the SIL concept, which is not a
measure of safety, but a way to dictate the development process rigor, and which cannot
be decomposed unless “sufficient independence” can be shown between the requested
functionalities [WC12].
Other related works are in the context of the automatic transformation of structural
models to runtime models. For example, [KWS03] have proposed an automatic trans-
formation of structural models (components and their interaction) to runtime models
(tasks that communicate via messages), taking into account real-time constraints. Their
method also does priority assignment and thread mapping. The transformation of com-
ponent models to tasks is also addressed in [FÅS04]. Thus, real-time components
which model the desired functionality at “design time” are transformed into tasks at
the “runtime” level. The authors propose a Genetic Algorithm-based approach to solve
this transformation problem. Their work is limited to single-processor systems.
[FDT05] have looked at transforming functional blocks at the level of the “functional
architecture” to Timed Petri Nets (TPNs) at the “runtime platform” level, deciding at
the same time the links between the functions and resources, which form the “operational
architecture”. None of these works addresses safety-critical systems or SIL
decomposition.
We have proposed a Genetic Algorithm-based optimization approach, which decides
the decomposition of functions to tasks, the mapping of tasks to PEs and the type of PEs
to be used in the architecture, such that the costs are minimized, the safety and integrity
constraints are satisfied and the schedulability of the applications is guaranteed.
The paper is organized as follows. The next section presents the system models considered.
We formulate the problem and illustrate it with a motivational example in Section 2.3.
The proposed Genetic Algorithm-based approach is presented in Section 2.4, and is
evaluated using synthetic and real-life benchmarks in Section 2.5. We draw
conclusions in the last section.
2.2 System Model
The set of all applications in the system is denoted with Γ. At the design level, we
model an application as functional blocks. Functional blocks have input and output
ports and communicate with each other with directional links, which connect the output
port of a functional block to the input port of another functional block. We assume that
there are no loops in this communication (if loops are present, they can be unrolled).
All the functional blocks in the system are modeled as a graph G(F, E), where each
node Fi ∈ F is a functional block, and an edge eij ∈ E from Fi to Fj denotes a data
dependency between an output port of Fi and an input port of Fj. We use fixed-priority
preemptive scheduling at the PE-level, hence we assume that we know the period ti and
deadline di of each functional block Fi. If dependent functional blocks have different
periods, they are combined into a merged graph capturing all activations for the hyper-
period (least common multiple of all periods).
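The hyperperiod computation mentioned above can be sketched as follows. This is a minimal illustration, assuming integer periods; the function names `hyperperiod` and `activations` are hypothetical helpers introduced here, and the two periods are the values 50 and 100 that appear in Table 2.2.

```python
from functools import reduce
from math import gcd


def hyperperiod(periods):
    """Least common multiple of a collection of positive integer periods."""
    return reduce(lambda a, b: a * b // gcd(a, b), periods, 1)


def activations(periods):
    """Number of activations of each block within one hyperperiod.

    `periods` maps a block name to its period.
    """
    h = hyperperiod(periods.values())
    return {name: h // t for name, t in periods.items()}


# Two dependent blocks with different periods (values as in Table 2.2).
example = {"tau1": 50, "tau3": 100}
print(hyperperiod(example.values()))
print(activations(example))
```

In the merged graph, a block with period 50 would therefore appear twice for every activation of a block with period 100.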
As mentioned, a safety-critical system should not endanger human life or the environ-
ment. A hazard is a situation in which there is actual or potential danger to people or
to the environment. Risk is a combination of the frequency or probability of a specified
hazardous event, and its consequence. If, after performing an initial hazard and risk
analysis, a system is deemed safety-related, it has to be certified [Sto96]. Certification
is a “conformity of assessment” performed by a third party, e.g., an independent
organization or a national authority, namely a “certification authority”.
The current certification practice is “standards-based” [Rus07], and requires that the
product and the development processes fulfill the requirements and satisfy the objec-
tives of a certain certification standard, depending on the application area. For ex-
ample, [IEC10] is used in industrial applications, [ISO09] is for the automotive area,
whereas [RTC92] refers to software for airborne systems.
During the engineering of a safety-critical system, the hazards are identified and their
severity is analyzed, the risks are assessed and the appropriate risk control measures are
introduced to reduce the risk to an acceptable level. A Safety-Integrity Level (SIL) is
allocated to each safety function and captures the required level of risk reduction. SIL
allocation is typically a manual process, which is done after performing hazard and risk
analysis [Sto96], although a few researchers have proposed automatic approaches for
SIL allocation [PWR+10b]. SILs differ slightly among areas. For example, the avionics
area uses five “Design Assurance Levels” (DAL), from DAL E (least critical) to
DAL A (most critical), while ISO 26262 specifies for the automotive area four “Automotive
Safety Integrity Levels” (ASIL), from ASIL A (least critical) to ASIL D (most
critical). However, the approach presented in this paper is applicable to all safety-critical
areas, regardless of the standard. SILs are assigned to functional blocks, from
SIL 4 (most critical) to SIL 0 (non-critical).
Table 2.1: ISO/DIS 26262 SIL decomposition schemes
SIL     Can be decomposed as
SIL 4   SIL 3 + SIL 1 or SIL 2 + SIL 2 or SIL 4
SIL 3   SIL 2 + SIL 1 or SIL 3
SIL 2   SIL 1 + SIL 1 or SIL 2
SIL 1   SIL 1
At the implementation level, applications are modeled as a set of interacting tasks.
A task is a set of instructions which execute on a PE. Hence, the functional blocks
from the design level have to be transformed into software or hardware tasks, or a
combination of both. Let us consider a safety function of SIL k, to be implemented
as software tasks. The certification standards allow several options. For example, the
safety function could be implemented as one task of SIL k or, using redundancy to
increase dependability, as several redundant tasks of a lower SIL, e.g., SIL k-1.
Decomposing a safety function of a higher SIL into several redundant tasks of lower
SILs can reduce the development and certification costs, and could be the right choice
in a particular context. For software redundancy, the standards recommend the use of
diversity, i.e., different implementations of the same functionality. This is because a
fault (bug) in a software task will lead to a correlated failure in all of the tasks sharing
the same implementation, unless software diversity is used. Often, one of the redundant
tasks will implement a simpler (and maybe less accurate) algorithm as an alternative
diverse implementation.
Certification standards refer to this process as “SIL decomposition” and provide recommendations
on the possible decompositions. For example, ISO/DIS 26262 (see footnote 1), Part
9, Section 5, provides the guide shown in Table 2.1 for SIL decomposition. Such a
decomposition guide amounts to a “SIL algebra” [PWA+13], i.e., the SIL of the safety
function is the sum of the SILs of the redundant tasks.
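As a rough illustration of this “SIL algebra”, the sketch below enumerates the two-task decompositions whose SILs add up to the SIL of the safety function, and checks a candidate decomposition against the same sum rule. The function names are hypothetical, and the sketch only captures the arithmetic of Table 2.1; the independence and diversity requirements imposed by the standards are not modeled.

```python
def two_way_decompositions(sil):
    """Pairs (a, b) with a >= b >= 1 and a + b == sil, as in Table 2.1."""
    return [(sil - b, b) for b in range(1, sil // 2 + 1)]


def is_valid_decomposition(sil, parts):
    """Sum rule only: every redundant task has SIL >= 1 and the SILs add up."""
    return all(p >= 1 for p in parts) and sum(parts) == sil


for sil in (4, 3, 2, 1):
    print(sil, two_way_decompositions(sil))
```

Applying the rule recursively also covers decompositions into more than two tasks, for example SIL 4 into SIL 2 + SIL 1 + SIL 1.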
In this paper we assume that the safety functions are implemented as software tasks
running on a distributed architecture. Let us consider a task τA which has to fulfill
a safety requirement of SIL 3. According to Table 2.1, we can decompose task τA
into two redundant tasks, e.g., τB with SIL 2 and τC of SIL 1. Task τB can be further
decomposed into two SIL 1 tasks.
Footnote 1: As mentioned, ISO 26262 uses the concept of Automotive SIL, or ASIL. To simplify the discussion, we
consider ASIL D to be SIL 4 and ASIL A to be SIL 1.
Figure 2.1: Example applications modeled as functional blocks
We assume that, for those tasks which are considered for decomposition, the designer
will specify a library LA of possible decompositions based on the standard considered,
similar to the library in Table 2.1. The library LA will specify for each function Fi ∈ F
the possible decompositions into tasks. Each decomposition of a function Fi is modeled
as a task graph Gi(Vi,Ei) in LA.
Each node τj ∈ Vi represents one task. We introduce the function R, which applies to
a functional block or a task, to capture the safety requirements in terms of SIL 0 to SIL
4. The SILs of the functional blocks from Fig. 2.1 and of the tasks in the decomposition
library LA, resulting from decomposition, are presented in Fig. 2.2.
The mapping of tasks to PEs is denoted by the function M : Γ → N, where N is the set
of PEs in the architecture. This mapping is not yet known and will be decided by our
approach. For each task τi we know the Worst-Case Execution Time (WCET) $C_i^{N_j}$ on
each processing element Nj where τi is considered for mapping. The WCETs of the
tasks are presented in Table 2.2. Note that in this small example, we assume that the PEs Nj
have the same performance (speed) but different SILs, so in Table 2.2 there is only
one WCET for each task.
An edge εij ∈ Ei from τi to τj indicates that the output of τi is the input of τj. A
task becomes ready after all its inputs have arrived, and it issues its outputs when it
terminates. Communication between tasks mapped to different PEs is performed by
message passing over the bus. We assume that the size smi of each message
mi is known.
We define the decomposition function D(Fi), D(Fi) : Vi → Di, where Di is a set of
decomposition options, specified in the decomposition library LA. There are three
decomposition options for D(F4) in Fig. 2.2: $D_4^1$ in one task τ4 (SIL 4), $D_4^2$ in two
tasks, namely τ8 (SIL 3) and τ9 (SIL 1), and $D_4^3$ in three tasks, τ10 of SIL 2, τ11 of
SIL 1, and τ9 of SIL 1.
Figure 2.2: Decomposition library LA for the applications in Fig.2.1
In the bottom task graph of Fig. 2.6, we show how F4, once decomposed as $D_4^2$, is
connected to the graph of application A2 from Fig. 2.1. We assume that a decomposed
task will be connected to the original application graph via two “connecting” tasks (see
the gray tasks in Fig. 2.2): one task which distributes the input to the redundant
decomposed tasks (e.g., τ14) and one task which collects the outputs (e.g., τ15).
The SILs of the connecting tasks are given by the designer based on the communication
constraints discussed in Section 2.2.2 and on the requirements from the standards.
The WCETs for our example tasks are shown in Table 2.2. Note that in the decomposition
option $D_4^2$, the sum of the WCETs of the decomposed tasks, i.e., τ8, τ9, τ14 and τ15,
has a value of 44, compared to the value of 20 for the single task τ4 in $D_4^1$.
Table 2.2: Tasks in LA
Tasks  Ci   ti = di  SIL  Wτi
τ1     25   50       2    6
τ2     25   50       3    16
τ3     40   100      2    8
τ4     20   100      4    28
τ5     30   100      1    3
τ6     15   50       2    5
τ7     15   50       1    2
τ8     20   100      3    15
τ9     20   100      1    3
τ10    20   100      2    6
τ11    20   100      1    3
τ12    2    50       2    1
τ13    2    50       2    1
τ14    2    100      3    2
τ15    2    100      3    2
τ16    2    100      2    1
τ17    2    100      2    1
τ18    2    100      2    1
τ19    2    100      2    1
The SIL assigned to a task will dictate the development processes and certification pro-
cedures that have to be followed. Standards provide checklists of objectives required
to be fulfilled for each SIL. Depending on the SIL, the standard may also require that
some objectives be satisfied with independence, to ensure an unbiased evaluation and
to avoid misinterpretation of the requirements [RTC92]. For example, for the verifica-
tion process, independence is achieved by using tools and personnel other than those
used throughout the development process.
SIL 0 functions are non-critical and do not impact the safety of the system, and are thus not
covered by the standards. In the case of SIL 1, the processes are similar to those covered
by quality management standards such as ISO 9001 [ISO08]. SIL 2 involves more
reviewing and testing. SIL 3 is significantly more difficult, and requires “semi-formal”
methods. SIL 4 often mandates formal methods, increasing further the development
costs.
The assessment of conformity to the checklist of objectives has to be performed by
independent assessors. For SIL 1 it is enough to have an independent person, whereas
for SIL 2 an independent department is required. In the case of SIL 3 and SIL 4, an
independent organization has to be used. Moreover, the number of objectives that have
to be satisfied with independence also grows with the SIL. For example, in the case of
DO-178B, the main difference between DAL A and DAL B is the number of objectives to
be satisfied with independence: 25 out of 66 objectives are required for DAL A to be
satisfied with independence, while for DAL B it is only 14 out of 66.
Software development cost estimation is a widely researched topic, and is beyond the
scope of this paper. The reader is directed to [JS07, BAC00] for reviews on this topic.
One of the most influential software cost models is the Constructive Cost Model (COCOMO)
[BMS00]. Researchers have shown how to take into account the development
costs during the design process of embedded systems [DMG97].
The development of safety-critical systems is a highly structured and systematic pro-
cess dictated by standards. These standards increase the development costs due to
additional processes for software development and testing, qualification activities in-
volved in compliance and increased process complexity, shown also by an IBM Ra-
tional study [IBM10]. Because of the systematic nature of the development processes
dictated by the standards, we assume that the designer will be able to estimate the de-
velopment effort required for a task. Hence, we define the development cost function
Wτi to capture the cost to develop and certify a task τi to its required SIL. Table 2.2
shows an example of the development costs for each of the tasks in Fig. 2.2. Similarly,
we define the development costs for the set of all the applications, W(Γ), as the sum of
the costs for each application. An example certification cost estimation in person-days
for an Air Traffic Control radio platform is presented in [Roc09].
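To make the tradeoff concrete, the following sketch compares two decomposition options of F4 using the WCET and Wτi values of Table 2.2. The dictionary layout and the helper `totals` are hypothetical. Counting the development costs of the two connecting tasks τ14 and τ15 toward the decomposed option is an assumption made here, by analogy with how their WCETs are counted in the text.

```python
# (WCET, development cost W) per task, copied from Table 2.2.
TASKS = {
    "tau4": (20, 28),
    "tau8": (20, 15),
    "tau9": (20, 3),
    "tau14": (2, 2),
    "tau15": (2, 2),
}

# Decomposition options of F4: one SIL 4 task, or SIL 3 + SIL 1 plus
# the two connecting tasks (assumed to be charged to the option).
OPTIONS = {
    "D1": ["tau4"],
    "D2": ["tau8", "tau9", "tau14", "tau15"],
}


def totals(option):
    """Return (sum of WCETs, sum of development costs) for an option."""
    tasks = OPTIONS[option]
    wcet = sum(TASKS[t][0] for t in tasks)
    cost = sum(TASKS[t][1] for t in tasks)
    return wcet, cost


for name in OPTIONS:
    print(name, totals(name))
```

Under these assumptions the decomposed option lowers the development cost from 28 to 22 while raising the total WCET from 20 to 44, which is why decomposition has to be decided together with mapping and schedulability.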
2.2.2 Platform Model
We consider hardware architectures consisting of a set N of processing elements, interconnected
by a broadcast communication channel, see Fig. 2.3. In this paper, our
focus is on the decomposition of safety functions, so we consider a simple shared bus
where the communication is performed according to a non-preemptive fixed-priority
scheduling policy. We have shown in [TSPS12a] how a realistic communication pro-
tocol such as TTEthernet can be taken into account. We consider that the tasks are
scheduled using Fixed-Priority Preemptive Scheduling (FPPS), but our work can be
extended to consider also other scheduling policies, such as Static Cyclic Scheduling
(SCS) [TSP11b].
Tasks of different SILs can share the same PE. To be able to host high-criticality tasks,
a PE has to have a high reliability. We consider that for each PE in the architecture
we have implementations with different reliabilities. The reliability of a PE can be in-
creased through “hardening” or by using a more rigorous design [IPP+09]. We denote
with LH the library of varying reliability PEs available, and with $w_j^k$ the unit cost of
PE Nj with a reliability corresponding to SIL k. Note that an architecture will always use
the lowest-cost PEs available that have the required reliability to host
the highest-criticality task mapped on that PE, and that the unit cost increases with
increased reliability.
Figure 2.3: Example Platform
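The PE selection rule can be sketched as below. The unit costs are the ones listed in Table 2.3; the function name `cheapest_pe` and the representation of a mapping as a list of task SILs per PE are assumptions made for illustration.

```python
# Unit cost of a PE version per maximum supported SIL (values from Table 2.3).
UNIT_COST = {1: 10, 2: 20, 3: 40, 4: 60}


def cheapest_pe(task_sils, unit_cost=UNIT_COST):
    """Return (SIL, cost) of the cheapest PE version able to host the tasks.

    `task_sils` must be non-empty; the PE has to support the highest SIL
    among the tasks mapped on it.
    """
    needed = max(task_sils)
    candidates = {k: c for k, c in unit_cost.items() if k >= needed}
    k = min(candidates, key=candidates.get)
    return k, candidates[k]


# Three PEs with tasks clustered by SIL: {4}, {3}, and {2, 1}.
mapping = [[4], [3], [2, 1, 2]]
print([cheapest_pe(sils) for sils in mapping])
print(sum(cheapest_pe(sils)[1] for sils in mapping))
```

Because the unit cost increases with the supported SIL, placing a single high-SIL task on a PE raises the cost of that PE for all tasks sharing it.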
When several tasks of different SILs share the same processing element, the certification
standards require that they are developed at the highest SIL among the SILs of the
tasks, which is very expensive, unless it can be shown that the
implementation of the tasks is “sufficiently independent”, i.e., there is both spatial and
temporal separation among the tasks. Hence, tasks of different SILs have to be protected
from each other. Otherwise, for example, a lower-criticality task could corrupt
the code or data area of a higher-criticality task [CAS01], or block the higher-criticality
task from accessing the CPU, thus leading to a failure.
In this paper, we consider that the separation between tasks of different SILs, and from
different applications, is achieved through a temporal- and space-partitioning scheme
similar to Integrated Modular Avionics (IMA) [Rus99]. Note that partitioning schemes
similar to IMA are available in several application areas [PTV+13], not only in the
avionics area. Space partitioning uses mechanisms such as a Memory Management
Unit (MMU) to ensure that, for example, applications running on different partitions
cannot corrupt the memory of other applications. Temporal partitioning ensures
the access of each application to the CPU, according to the scheduling policy used at
the PE-level. A detailed discussion about partitioning is available in [Rus99].
Fig. 2.3 shows the partitions Pj1, Pj2, ..., Pji, ... on the PE Nj. We consider that we have
one partition set for each application on each PE where the tasks are mapped for execution,
and we have one partition Pji for each SIL, such that only tasks from the same
application and the same SIL will share a partition. During scheduling, when performing
context switching between tasks of different SILs we will also perform partition
switching. Recent architectures [WESK10, AFOTH06] and partitioned operating systems
[Pik, POK] have a very small overhead associated with partition switching. We
account for the partition switching overhead in the WCET of tasks.
The PEs of the architecture are connected to the network through a Trusted Interface
Subsystem (TiSS). The communication mechanisms are implemented by a Trusted Re-
source Manager (TRM), see Fig. 2.3. The TiSSes and TRM offer all the services
required for dependable communications in mixed-criticality systems. Such an in-
frastructure has been proposed in several architectures [WESK10, Kop11b], and offers
spatial and temporal partitioning for messages. The reliability of the communication
infrastructure is high enough to accommodate the highest criticality level.
As mentioned, certification standards require that tasks of different SILs are separated.
In addition, they also impose constraints on the communication to ensure data in-
tegrity. These constraints are similar to the Bell-LaPadula [BL73] and Biba [Bib77]
data integrity models from the security domain. In this paper, we consider that the plat-
form supports the ACROSS Integrity Model [Was14], which is an adaptation of Totel’s
model intended to offer support for multiple levels of criticality [TBDP98]. Such mech-
anisms are available in many platforms, and are similar to multiple independent levels
of security/safety architectures [AFOTH06]. Thus, the integrity model we use has the
following rules:
1. One task is assigned to exactly one partition (in our case, a partition is considered
to be similar to the “component” concept used in [Was14]). Note that each partition is
assigned an integrity level (SIL).
2. Communication is allowed only between tasks of the same criticality level, or from
a higher-criticality task to a lower-criticality task.
These rules are very restrictive for communication and prevent higher-criticality tasks
from cooperating with lower-criticality tasks. However, in practice it is often necessary
to have such cooperation in order to implement mixed-criticality applications
cost-efficiently (otherwise all lower-criticality tasks would have to be developed and certified
at the highest integrity levels). In this paper we consider (as in [Was14] and [TBDP98])
that if a communication is required from a lower-criticality task to a higher-criticality
task, we use a “Validation Middleware” (VaM) to mediate this communication.
A VaM (see Fig. 2.3) receives lower-integrity inputs and produces high-integrity out-
puts. This is achieved through using application-dependent fault-tolerance mecha-
nisms, see [Was14], which proposes a VaM, for details. The VaM will introduce a
non-negligible overhead in the communication. We take into account this overhead in
our application model by adding it to the WCET of a high-criticality task receiving a
message from a lower-criticality task. Thus, we have the third rule:
3. Communication from a lower-criticality task to a higher-criticality task must pass
through a VaM.
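The communication rules above can be captured in a few lines. This is a minimal sketch: `needs_vam` and `effective_wcet` are hypothetical names, the overhead value is made up, and charging the overhead once per validated incoming message is an assumption (the text only states that the overhead is added to the WCET of the receiving task).

```python
def needs_vam(sender_sil, receiver_sil):
    """Rules 2 and 3: only upward communication has to pass through a VaM."""
    return sender_sil < receiver_sil


def effective_wcet(wcet, own_sil, sender_sils, vam_overhead):
    """WCET of a receiving task including validation overhead.

    Assumption: one `vam_overhead` is added for every incoming message
    that originates from a lower-criticality sender.
    """
    validated = sum(1 for s in sender_sils if needs_vam(s, own_sil))
    return wcet + validated * vam_overhead


# A SIL 3 task (WCET 25) receiving from one SIL 2 sender,
# with a hypothetical VaM overhead of 5 time units.
print(needs_vam(2, 3))
print(effective_wcet(25, 3, [2], vam_overhead=5))
```

Messages between tasks of equal SIL, or from a higher to a lower SIL, leave the WCET of the receiver unchanged.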
2.3 Problem Formulation
The problem we are addressing in this paper can be formulated as follows. As an in-
put to our problem we have (i) the set of applications Γ, which consist of functional
blocks and their SIL requirements, (ii) the library LA of function-to-task decomposi-
tions, which includes the WCET of tasks and their SIL-dependent development and
certification costs, (iii) an architecture consisting of a set N of PEs interconnected by
a shared bus, and (iv) the library LH of architecture implementations for the PEs, with
varying reliability and unit costs.
We are interested in determining an implementation S such that the total costs are minimized,
the schedulability of the applications is maximized, and the requirements on safety
(the SILs of the safety functions) and integrity (the three rules used to preserve the
integrity of communication) are satisfied.
Synthesizing an implementation S means deciding on (1) the function-to-task decomposition
D, (2) the mapping of tasks to PEs M, and (3) the types of PEs H to use in the
architecture.
2.3.1 Motivational Example
Let us consider the example applications from Fig. 2.1, where we have five functional
blocks, F1 to F5, with different SIL requirements, periods and deadlines. We use the
decomposition library LA from Fig. 2.2, with the task properties from Table 2.2. We
consider an architecture of three PEs, and the library LH of alternatives is presented
in Table 2.3. We are interested in determining an implementation S such that the total
costs are minimized, the schedulability is maximized, and the safety and integrity
requirements are satisfied.
In this example, for simplicity, we have not considered all possibilities of decomposition
for LA. For instance, F1 in Fig. 2.2 could be decomposed into two tasks with SIL 1,
based on the SIL decomposition schemes in Table 2.1. We assume that the engineer
has decided to perform decompositions only for F2 (SIL 3) and F4 (SIL 4).
Table 2.3: Alternatives for Nj in Library LH
PEs   Maximum SIL k   H(Nj) → $N_j^k$   $w(N_j^k)$
N1    1               $N_1^1$           10
N2    2               $N_2^2$           20
N3    3               $N_3^3$           40
N4    4               $N_4^4$           60
Figure 2.4: Straightforward Solution
One possible solution to this problem is presented in Fig. 2.4. We call this solution the
“Straightforward Solution”, or “SFS”. This is a solution where: (1) we do not decompose
the functional blocks into tasks with lower SILs, i.e., we use the decomposition
options $D_1^1$ for F1, $D_2^1$ for F2, $D_3^1$ for F3, $D_4^1$ for F4, and $D_5^1$ for F5; (2) we consider
a simple mapping where we cluster all tasks based on SILs. Note that three PEs are
considered in the current architecture, so tasks of SIL 4 are mapped on one PE with
maximum SIL 4, tasks of SIL 3 are mapped on another PE with maximum SIL 3, and
the remaining tasks of SIL 2 and SIL 1 are mapped on the third PE with maximum SIL 2.
Fig. 2.4 shows the resulting task graphs after decomposition, the mapping of tasks on
PEs, and which PE version from Table 2.3 we have used. For example, F4 is decomposed
into a single task τ4 and is mapped on $N_4^4$. As mentioned in Section 2.2.2,
according to Rule 3, when a lower-criticality task sends data to a higher-criticality task,
we have to use a VaM. In Fig. 2.4, messages m1 and m2, depicted by squares in the task
graphs and bold boxes on the bus, are sent from τ1 (SIL 2) to τ2 (SIL 3) and from τ3 (SIL 2) to
τ4 (SIL 4), respectively, and therefore have to use VaMs in the receiving components.
The SFS is also shown in Fig. 2.5, using an “x”, where we display a graph with the
resulting values of the two objective functions, total cost (y-axis) and schedulability
(x-axis). The exact definition of these two objective functions is given in Section 2.4.1.
For the total cost (which includes both the development and certification costs of tasks
and the unit costs of PEs), smaller values mean lower costs. For the schedulability,
values above zero mean “not schedulable”, whereas values below zero mean that the
solution is schedulable, and the smaller the value, the higher the schedulability. As
we can see from Fig. 2.5, SFS has a high cost and is not schedulable. This is not
surprising, considering that mapping decisions are known to have a strong impact on
solution quality, and SFS does not optimize the mapping.
Fig. 2.5 also shows solutions that optimize the mapping. “Criticality-Aware Mapping
Optimization” (CMO) uses the same simple approach for decomposition as SFS, but
optimizes the mapping, aiming at improving the objective functions. CMO solutions,
determined after exhaustive search, are Pareto-optimal solutions. As we can see from
Fig. 2.5, where each CMO solution is depicted with a blue “+”, by optimizing the mapping,
we are able to find schedulable solutions. However, the cost remains high.
Figure 2.5: SFS and Optimized Results
The focus of this paper is on the function-to-task decomposition. We have proposed an
optimization approach called “Criticality-Aware Functional Decomposition and Map-
ping Optimization” (CDMO), which decides the functional decomposition, the map-
ping of tasks to PEs and the types of PEs such that our design objectives are optimized.
CDMO solutions are found after exhaustive search as well. They are shown in Fig.2.5
using red “*”. As we can see from the figure, by carefully deciding on the function-to-
task decomposition, we are able to significantly reduce the costs, and further improve
the schedulability, compared with CMO which does not optimize the function-to-task
decomposition.
Due to space reasons, we are not able to depict all Pareto-optimal solutions as we have
done with SFS in Fig.2.4. However, we choose two extreme solutions found by CDMO:
the best solution in terms of schedulability is shown in Fig.2.6, and the best solution
in terms of costs is shown in Fig.2.7. In these figures, tasks with the same criticality
(SIL) are marked by using the same color. Messages sent from a lower-SIL task to a
higher-SIL task that have to go through VaMs have been highlighted by squares in the
task graphs and bold boxes on the bus.
The problem presented in this section is NP-hard (determining the task mapping alone has
been proven to be NP-hard [TBW92]). To solve this multiobjective optimization problem,
we have proposed an approach based on a Genetic Algorithm, presented in the next section.
Figure 2.6: Best Solution in terms of Schedulability
Figure 2.7: Best Solution in terms of Costs
2.4 Optimization Strategy
We propose a Genetic Algorithm (GA)-based approach, called CDMO (Criticality-
aware functional Decomposition and task Mapping Optimization), to solve the op-
timization problem presented in Section 2.3. There are several off-the-shelf multi-
objective GA implementations, such as NSGA-II [DAPM00] and search frameworks
for multiobjective optimization such as PISA [BLTZ03]. We decided to use the Non-
dominated Sorting Genetic Algorithm-II (NSGA-II) [DAPM00], due to its good per-
formance and its simple implementation.
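NSGA-II ranks candidate solutions by Pareto dominance. The sketch below shows only the dominance test and a naive filter for the non-dominated set, with both objectives minimized; it is not NSGA-II itself, which additionally sorts the population into successive fronts and uses a crowding distance. The objective pairs (degree of schedulability, total cost) are made-up values.

```python
def dominates(a, b):
    """True if `a` is no worse than `b` in every objective and strictly
    better in at least one (all objectives are minimized)."""
    return (all(x <= y for x, y in zip(a, b))
            and any(x < y for x, y in zip(a, b)))


def pareto_front(points):
    """Keep the points that are not dominated by any other point."""
    return [p for p in points if not any(dominates(q, p) for q in points)]


# Hypothetical (degree of schedulability, total cost) pairs.
candidates = [(-30, 150), (-10, 120), (5, 110), (-10, 140), (-40, 200)]
print(pareto_front(candidates))
```

In this made-up set only (-10, 140) is dropped, since (-10, 120) is equally schedulable and cheaper.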
2.4.1 Objective Functions
We are interested in minimizing the total cost and maximizing schedulability. To determine
the system schedulability we use response-time analysis to calculate the worst-case
response time ri of every task τi, which is compared to the deadline di. The basic
analysis presented in [But11] has been extended over the years. For example, the state-
of-the-art analysis in [PGH98] considers arbitrary arrival times and deadlines, offsets
and synchronous inter-task communication (where a receiving task has to wait for the
input of the sender task).
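For orientation, a basic response-time analysis for fixed-priority preemptive scheduling can be sketched as follows. It is assumed here that the basic analysis is the classic fixed-point recurrence r = C + Σ ⌈r / Tj⌉ · Cj over the higher-priority tasks on the same PE, without blocking, jitter or offsets. The function name and the task parameters in the example are illustrative and not taken from the paper.

```python
def response_time(task, higher_priority, limit=None):
    """Worst-case response time of `task` under preemptive fixed priorities.

    `task` and each entry of `higher_priority` are (wcet, period) pairs of
    positive integers. Iteration stops at the fixed point, or returns None
    once the estimate exceeds `limit` (default: the task's own period).
    """
    wcet, period = task
    limit = period if limit is None else limit
    r = wcet
    while True:
        # Interference: every higher-priority task released in [0, r).
        nxt = wcet + sum(-(-r // tj) * cj for cj, tj in higher_priority)
        if nxt == r:
            return r
        if nxt > limit:
            return None
        r = nxt


# Three tasks on one PE, listed from highest to lowest priority.
tasks = [(1, 4), (2, 6), (3, 12)]
for i, task in enumerate(tasks):
    print(task, response_time(task, tasks[:i]))
```

The integer ceiling `-(-r // tj)` avoids floating-point rounding in the recurrence.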
Our focus in this paper is on the tradeoff between schedulability and cost in the design of a
mixed-criticality hard real-time system. Hence, we decide to use the basic analysis
from [But11]. We do, however, take communication into account to make sure that the
mapping solutions do not create too much bus traffic. We consider a non-preemptive
fixed-priority bus and during the design space exploration we mark as infeasible solu-
tions which result in a bus utilization over 100%. The utilization is calculated from the
bus speed, the sizes of the messages exchanged over the bus and the periods of the sender
tasks. The overhead Co required by the validation middleware to increase the integrity
of a message is added to the receiving task’s WCET.
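The bus feasibility check can be sketched as follows. This is only an illustration under simplifying assumptions: each message is sent once per period of its sender task, protocol overheads are ignored, and the function names and units (bytes, milliseconds) are hypothetical.

```python
def bus_utilization(messages, bus_speed_bytes_per_ms):
    """Fraction of bus time used by periodic messages.

    messages: list of (size_bytes, sender_period_ms) tuples (hypothetical layout).
    """
    return sum(size / bus_speed_bytes_per_ms / period for size, period in messages)


def is_bus_feasible(messages, bus_speed_bytes_per_ms):
    """A mapping is marked infeasible when the bus utilization exceeds 100%."""
    return bus_utilization(messages, bus_speed_bytes_per_ms) <= 1.0


# Illustrative values: two messages of 10 and 30 bytes with periods of 50 and 100 ms.
example_messages = [(10, 50), (30, 100)]
example_utilization = bus_utilization(example_messages, bus_speed_bytes_per_ms=1.0)
```

With a bus speed of 1 byte/ms the two example messages occupy about half of the bus, so the mapping would not be rejected on these grounds.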
The schedulability of an implementation S is captured using the “degree of
schedulability” rS over all tasks:

\[
r_S =
\begin{cases}
\ell_1 = \sum_i \max(0,\, r_i - d_i) & \text{if } \ell_1 > 0 \\
\ell_2 = \sum_i (r_i - d_i) & \text{if } \ell_1 = 0
\end{cases}
\tag{2.1}
\]

If the application is not schedulable, there exists at least one ri greater than the deadline
di; therefore ℓ1 will be positive. In this case rS is equal to ℓ1. However,
if the application is schedulable, then no ri exceeds the corresponding deadline
di. In this case ℓ1 = 0 and we use ℓ2 as rS, as it is able to differentiate between two
design alternatives that both lead to a schedulable implementation.
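The two cases of the degree of schedulability can be written as a small function. This is a minimal sketch; the function name and the list-based interface are assumptions made for illustration.

```python
def degree_of_schedulability(response_times, deadlines):
    """Return l1 if any deadline is missed, otherwise the (non-positive) slack sum l2."""
    l1 = sum(max(0, r - d) for r, d in zip(response_times, deadlines))
    if l1 > 0:
        return l1
    return sum(r - d for r, d in zip(response_times, deadlines))


# Schedulable case: all response times are within their deadlines.
schedulable_value = degree_of_schedulability([1, 3, 10], [4, 6, 12])
# Unschedulable case: the first task misses its deadline by 1.
unschedulable_value = degree_of_schedulability([5, 3], [4, 6])
```

A smaller value is better in both cases: a positive value measures the total deadline overrun, while a negative value measures the total slack of a schedulable implementation.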
The second objective is the total cost of an implementation S, which includes the unit cost
of the PEs and the development cost of the tasks. The first term is the sum of the unit
costs $w_{N_j^k}$ of the PEs, considering the selection $N_j^k$ from the library LH,
and the second term is the sum of the development costs Wτi of the tasks τi included in S
according to the chosen decomposition function. Note that the software development
costs W are divided by the estimated number of Units that are going to be produced.
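Read together, the two terms described above suggest a cost of the following form. This is a sketch reconstructed from the textual description only: the name cost(S) and the placement of the division by Units over the whole development-cost sum are assumptions.

```latex
\[
\mathit{cost}(S) \;=\; \sum_{N_j^k \in S} w_{N_j^k}
\;+\; \frac{1}{\mathit{Units}} \sum_{\tau_i \in S} W_{\tau_i}
\]
```

Under this reading, the development cost is amortized over the production volume, so for large values of Units the unit cost of the PEs dominates.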
2.4.2 Genetic Algorithm
GA is a metaheuristic optimization approach which belongs to the class of Evolutionary
Algorithms, inspired by the process of natural evolution. The set of candidate
solutions is called a “population”, and each solution is (i) encoded using a string called
a “chromosome”. The population is (ii) initialized to n candidate solutions, where n
is the population size. The population is evolved by (iii) selecting a set of solutions
and performing (iv) recombination and (v) mutation to generate offspring. Finally, the
parent population is (vi) replaced with an offspring population with better “fitness”.
The fitness of a solution is evaluated using our two objectives. Steps (iii) to (vi) are
repeated until a termination condition is reached.
Steps (i) to (vi) are explained in the remainder of this section. There are several choices
for their implementation. Based on experiments, we chose the following
approaches, which can find good solutions in a reasonable time. The parameters were
also determined experimentally.
(i) Encoding: We encode a design alternative of function-to-task allocation and task
mapping as a chromosome. A chromosome is composed of genes. Fig. 2.8 shows
several chromosomes for the motivational example presented in Section 2.3.1. Each
chromosome has two parts.
The first part contains a gene for each functional block in the application. In the
motivational example, we have five genes for F1 to F5. The value of the gene, j, represents the
decomposition option $D_i^j$ selected from the library LA for the respective Fi. Taking the
4th gene of chromosome (a) in Fig. 2.8 as an example, we have used the decomposition
option 3 for F4, which corresponds to $D_4^3$.
The second part of the chromosome is related to mapping information. Thus, we have
one gene for each task that could be used in an implementation S, i.e., for all the tasks
included in the library LA. In our current example (see Table 2.2), we have a total of 19
tasks. The value of the gene for each task encodes the index of the PE on which the task
will be mapped. For example, in Fig. 2.8, gray genes are related to mapping. The
1st gray gene “1” in chromosome (a) means that task τ1 will be mapped on PE N1.
(ii) Initialization: The initial population is randomly generated and has a population
size n.
(iii) Selection: We use “tournament selection” to select parents for performing recom-
bination and mutation. In a tournament, four chromosomes are chosen at random, and
the fittest one wins. In total, 2(pc× n) parents are chosen for performing recombina-
tion, while n− (pc× n) parents are chosen for performing mutation, where pc is the
probability of recombination.
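A tournament of size four and the resulting parent counts can be sketched as below. Note the simplification: in NSGA-II the "fittest" solution is determined by non-domination rank and crowding distance, whereas this sketch assumes a hypothetical scalar `fitness` function where smaller is better. All names are illustrative.

```python
import random


def tournament_select(population, fitness, k=4, rng=random):
    """Pick k distinct chromosomes at random and return the fittest (smallest fitness)."""
    contestants = rng.sample(population, k)
    return min(contestants, key=fitness)


def parent_counts(n, pc):
    """Number of parents selected for recombination and for mutation."""
    recombined = round(pc * n)
    return 2 * recombined, n - recombined


# With n = 100 and pc = 0.25, 50 parents are recombined and 75 are mutated.
example_counts = parent_counts(100, 0.25)
```

Each call to `tournament_select` yields one parent, so it would be called once per required parent.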
Figure 2.8: Design Transformations and Chromosome Decoding
(iv) Recombination (also called crossover): We employ a standard single-point crossover.
For each pair of parents, we compare a randomly generated number with pc. If this
number ≤ pc, the two parents are cut at a random point and the sections after the cut point
are swapped to generate the offspring. Otherwise, the offspring are just copies of
their parents. A crossover example is presented in Fig. 2.8: chromosomes (a) and
(b) are recombined along the cut point depicted with a vertical line, obtaining
chromosomes (c) and (d).
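Single-point crossover on list-encoded chromosomes can be sketched as follows. The function name and the list representation are assumptions; the cut point is drawn so that both sections are non-empty.

```python
import random


def single_point_crossover(parent_a, parent_b, pc, rng=random):
    """Return two offspring; with probability pc the tails after a random cut are swapped."""
    if rng.random() <= pc:
        cut = rng.randrange(1, len(parent_a))  # cut strictly inside the chromosome
        child_a = parent_a[:cut] + parent_b[cut:]
        child_b = parent_b[:cut] + parent_a[cut:]
        return child_a, child_b
    # No recombination: the offspring are copies of their parents.
    return list(parent_a), list(parent_b)
```

Because both chromosomes have the same fixed length and layout, the swapped sections always contain genes of the same kind (decomposition options or PE indices) at each position.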
(v) Mutation is used to add diversity to a population obtained from recombination.
For each gene in the chromosome, we compare a randomly generated number with
pm (probability of mutation) and if this number ≤ pm, this position is mutated. For
the function-to-task allocation genes, we randomly select another available
decomposition option. For the genes encoding the task mapping, we randomly select
another PE index from the architecture library LH. We present two mutation examples
in Fig. 2.8. Chromosome (e) is the mutation result of (c) and chromosome (f) is the
mutation result of (d). The affected genes are highlighted using bold boxes.
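The per-gene mutation can be sketched as below, assuming a chromosome stored as a list whose first `n_functions` genes hold 1-based decomposition options and whose remaining genes hold 1-based PE indices. The parameter names are hypothetical.

```python
import random


def mutate(chromosome, n_functions, options_per_function, n_pes, pm, rng=random):
    """Mutate each gene independently with probability pm.

    options_per_function[f] is the number of decomposition options of function f.
    """
    child = list(chromosome)
    for pos, gene in enumerate(child):
        if rng.random() > pm:
            continue  # this position is left unchanged
        if pos < n_functions:
            upper = options_per_function[pos]  # decomposition gene
        else:
            upper = n_pes                      # mapping gene
        alternatives = [value for value in range(1, upper + 1) if value != gene]
        if alternatives:  # a function with a single option cannot change
            child[pos] = rng.choice(alternatives)
    return child
```

Excluding the current value from the candidates ensures that a selected position really changes, which matches the wording "another" option or PE.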
(vi) Replacement: Recombination and mutation generate n offspring out of the n parents
in the current population. Before we can evaluate the quality of an implementation
S encoded in a chromosome using the objective functions presented in Section 2.4.1,
we need to decode the chromosome. We use a fixed-size chromosome, which makes
crossover and mutation easier, by including a gene for every task in LA in the
chromosome. However, not all tasks from LA will be used in the implementation S.
The tasks used depend on the decomposition options chosen for the functional blocks.
Thus, during the decoding, we go through each gene corresponding to a task and check
if that task is used in S, based on the decomposition options recorded in the first part of
the chromosome, i.e., the functional block genes. Fig. 2.8(g) and (h) show the decoding
of chromosomes (e) and (f), respectively. The tasks that are not part of S are marked with “-”
and are not used in the calculation of the objective functions.
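The decoding step can be illustrated with a small sketch. The library layout (a list of dictionaries from option number to task indices) is a hypothetical data structure chosen for this example, and `None` plays the role of the “-” mark.

```python
def decode(chromosome, n_functions, library):
    """Return the PE index of every task in the library, or None if the task is unused.

    library[f][option] lists the 0-based indices of the tasks used when function f
    is implemented with the given decomposition option.
    """
    decomposition_genes = chromosome[:n_functions]
    mapping_genes = chromosome[n_functions:]
    used_tasks = set()
    for f, option in enumerate(decomposition_genes):
        used_tasks.update(library[f][option])
    return [pe if task in used_tasks else None for task, pe in enumerate(mapping_genes)]


# Two functions: F1 has two options (task 0, or tasks 1 and 2); F2 has one option (task 3).
example_library = [{1: [0], 2: [1, 2]}, {1: [3]}]
example_mapping = decode([2, 1, 1, 2, 3, 1], n_functions=2, library=example_library)
```

In the example, option 2 is selected for F1, so task 0 is not part of the implementation and its mapping gene is ignored.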
Replacement decides which n solutions are kept out of the 2n solutions available. The
key advantage of NSGA-II lies in how it performs selection and replacement, with the
goal of preserving diverse non-dominated solutions, in the hope of finding the Pareto-
optimal front. See [DAPM00] for the details on the selection and replacement proce-
dures used in NSGA-II.
Steps (iii) to (vi) are repeated until there is no improvement for a given number of
consecutive generations, e.g., 30. In the end, we obtain a Pareto-front of solutions
which, however, is not guaranteed to be the Pareto-optimal front, since NSGA-II is a
search metaheuristic which does not guarantee optimality.
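The notion of a non-dominated set can be made concrete with a short sketch in which both objectives (degree of schedulability and cost) are minimized. This is a plain quadratic-time filter written for illustration, not the fast non-dominated sorting procedure of NSGA-II.

```python
def dominates(a, b):
    """True if a is no worse than b in every objective and strictly better in at least one."""
    return all(x <= y for x, y in zip(a, b)) and any(x < y for x, y in zip(a, b))


def pareto_front(points):
    """Keep only the points that are not dominated by any other point."""
    return [p for p in points if not any(dominates(q, p) for q in points)]


# Illustrative (degree of schedulability, cost) pairs.
example_points = [(-8, 10), (-5, 7), (-5, 9), (1, 3), (-9, 12)]
example_front = pareto_front(example_points)
```

Here only (-5, 9) is removed, because (-5, 7) offers the same schedulability at a lower cost; the remaining points represent different trade-offs.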
2.5 Evaluation
To evaluate our proposed “Criticality-Aware Functional Decomposition and Mapping
Optimization” (CDMO) approach, we used a synthetic benchmark and a real-life
benchmark. The synthetic benchmark has two applications with a total of eight functional
blocks. We have assigned SILs from 1 to 3 to the functional blocks. The periods and
deadlines have been set to either 50 or 100 ms. The application model of this synthetic
benchmark is presented in Fig. 2.9(a).
We have used a decomposition library LA based on Table 2.1. A functional block of SIL
3 has two decomposition options: $D_i^1$ is one task of SIL 3, and $D_i^2$ is two tasks,
of SIL 2 and SIL 1. A functional block of SIL 2 also has two decomposition options: $D_i^1$
is one task of SIL 2, and $D_i^2$ is two tasks of SIL 1. A functional block of SIL 1 is
decomposed into one task of SIL 1. We do not consider further decomposition, i.e.,
the decomposed tasks of SIL 2 obtained from a functional block of SIL 3 will not be
decomposed into two tasks of SIL 1.
Figure 2.9: Application Model of the Synthetic Benchmark
The WCETs of the decomposed tasks have been chosen randomly between 4 and 40 ms.
The WCET of each “connecting” task generated by decomposition is assumed to be 2
ms, and the overhead Co required by the VaM is assumed to be 1 ms. The message
sizes are assumed to be between 10 and 30 bytes. We were interested in implementing these
applications on an architecture consisting of three PEs.
The real-life benchmark is an automotive case study adapted from [TCN00], and consists
of six applications: Engine Controller (SIL 3, seven functions), Automatic Gear
Box (SIL 4, four functions), Anti-locking Brake System (SIL 4, six functions), Suspension
Controller (SIL 3, five functions), Wheel Angle Sensor (SIL 2, two functions),
and Body Work related to the passengers (SIL 1, seven functions). See Fig. 2.10(a) for
these real-life automotive applications.
Figure 2.10: Real-life Automotive Applications
Figure 2.11: Synthetic Benchmark
We have used the periods and deadlines from [TCN00], and a decomposition library
based on the automotive certification standard ISO 26262 [ISO11], see Table 2.1. Each
function in Automatic Gear Box (SIL 4) and Anti-locking Brake System (SIL 4) has
three decomposition options: $D_i^1$ is one task of SIL 4, $D_i^2$ is two tasks, of
SIL 3 and SIL 1, and $D_i^3$ is two tasks of SIL 2. Each function in Engine
Controller (SIL 3) and Suspension Controller (SIL 3) has two decomposition options:
$D_i^1$ is one task of SIL 3 and $D_i^2$ is two tasks, of SIL 2 and SIL 1. Each function
in Wheel Angle Sensor (SIL 2) has two decomposition options: $D_i^1$ is one task
of SIL 2 and $D_i^2$ is two tasks of SIL 1. Functions in Body Work (SIL 1) are only
decomposed into one task of SIL 1. Further decompositions are not considered. The
automotive benchmark has to be implemented on an architecture consisting of six PEs.
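The decomposition options listed for the two benchmarks can be captured in a small lookup table. The dictionary name and the helper below are hypothetical; the helper counts only the tasks named in the options and leaves out the “connecting” tasks generated by decomposition.

```python
# SIL of a function -> decomposition options; each option lists the SILs of its tasks.
DECOMPOSITION_OPTIONS = {
    4: [[4], [3, 1], [2, 2]],
    3: [[3], [2, 1]],
    2: [[2], [1, 1]],
    1: [[1]],
}


def library_task_count(function_sils):
    """Number of mapping genes needed if every task of every option gets its own gene."""
    return sum(
        len(option)
        for sil in function_sils
        for option in DECOMPOSITION_OPTIONS[sil]
    )
```

For example, a single SIL 4 function contributes five candidate tasks (one, two and two tasks for its three options), which gives a feeling for how the fixed-size chromosome grows with the number of options.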
We have run our CDMO approach on these two benchmarks on their respective
architectures. The algorithms were implemented in Matlab 2013a and the experiments ran for 5
and 20 minutes on an Intel Core i7 CPU 920 (2.67 GHz) computer. We have tuned the
NSGA-II parameters such that the results are as close as possible to the optimal (i.e.,
no improvements were seen after a very long runtime). We set n = 100, pc = 0.25,
pm = 0.5 and the search terminates if no improvement is seen after 10 generations.
The Pareto-front of solutions obtained by CDMO is depicted with red “*” signs in Fig. 2.11
(synthetic benchmark) and Fig. 2.12 (real-life automotive benchmark). Together with
CDMO, we also present two other approaches: “Criticality-Aware Mapping
Optimization” (CMO) and “Straightforward Solution” (SFS). CMO does not optimize
the function-to-task decomposition (it assumes that each function is implemented as a
task), but it optimizes the mapping of tasks. SFS does not perform any optimization:
each function is implemented as a single task and tasks are clustered based on SILs,
hence no mapping optimization is performed. The SFS of the synthetic benchmark
is displayed in Fig. 2.9(b), while the SFS of the real-life automotive applications is
displayed in Fig. 2.10(b).
Figure 2.12: Real-life Automotive Applications
Fig. 2.11 and Fig. 2.12 have the degree of schedulability on the x-axis and the total cost
on the y-axis; see Section 2.4.1 for the definition of the objective functions. As we can see
from the figures, SFS, depicted with an “x”, is not able to obtain schedulable results.
The mapping optimization CMO is performed using the same GA approach as in
CDMO, but with a fixed function-to-task decomposition in which functions are not
decomposed into tasks with lower SILs, i.e., no design transformations are applied to the
first part of the chromosomes. The Pareto-front of solutions obtained by CMO is shown using
blue “+” signs. As we can see, CMO is able to obtain schedulable solutions, but they
have a high cost.
Only by using CDMO, which also optimizes the function-to-task decomposition,
are we able to obtain schedulable solutions which also have the lowest cost.
This shows that providing automatic tools to support the engineer in performing safety
function decomposition can lead to cheaper implementations. The engineer can choose
from the Pareto-front the solution that offers the desired trade-off between cost and
schedulability. An improved degree of schedulability means that the implementation has
more room for future upgrades.
2.6 Conclusion
In this paper we have considered mixed-criticality applications to be implemented on
distributed architectures, and we were interested in determining the function-to-task
allocation and the task mapping such that the development and unit costs are
minimized, and the schedulability is maximized.
We have taken into account safety and integrity requirements. The safety functions
are assigned a SIL and we have used SIL decompositions from the standards to ensure
that the safety requirements are preserved after decomposition. We also ensure that
we use high-reliability PEs for the high-SIL functions. To ensure the communication
integrity, we perform a validation in case a lower-criticality task sends information to a
higher-criticality task. The separation required by the certification standards for mixed-
criticality functions is ensured by partitioning, whereas for integrity preservation we
use a validation middleware.
We have proposed a GA-based approach, called CDMO, for our two-objective opti-
mization problem. The algorithm decides the function-to-task allocation, the mapping
of tasks to PEs and the reliability of each PE. The proposed CDMO approach has been
evaluated on a synthetic benchmark and on a large automotive real-life case study. The








Paper B: Reliability-Aware Dynamic Energy Management for
Fault-Tolerant Distributed Embedded Systems
This paper presents an approach to the synthesis of low-power fault-tolerant hard
real-time applications mapped on distributed heterogeneous embedded systems.
We first propose a design-time (offline) synthesis approach which decides the mapping
of tasks to processing elements, as well as the voltage and frequency levels
for executing each task, such that transient faults are tolerated, the timing constraints
of the application are satisfied, and the energy consumed is minimized.
Tasks are scheduled using fixed-priority preemptive scheduling, while replication
is used for recovery from multiple transient faults. Addressing energy and reliability
simultaneously is especially challenging, since lowering the voltage to reduce
the energy consumption has been shown to increase the transient fault rate. We
present a Tabu Search-based approach which uses an energy/reliability trade-off
model to find reliable and schedulable implementations which minimize the energy
consumption. We also propose a runtime (online) synthesis algorithm, which
dynamically changes the voltage and frequency levels of running tasks to further
reduce the energy consumption, while guaranteeing the schedulability of the
application and its tolerance to transient faults. To provide such guarantees,
the offline synthesis has to assume the worst case, i.e., that tasks will execute up
to their worst-case execution times and that the maximum number of faults will occur. The
online scheduling will know at runtime the actual execution times of tasks and the
fault occurrences. We evaluated the proposed synthesis approaches using several
synthetic and real-life benchmarks.
3.1 Introduction
Safety-critical applications have to function correctly, satisfy their timing constraints
and be energy-efficient even in the presence of faults. Such faults might be permanent
(e.g., damaged microcontrollers or communication links), transient (e.g., caused by
electromagnetic interference), or intermittent (appear and disappear repeatedly).
Transient faults are the most common [Con03], and their number is increasing due to
the rising level of integration in semiconductors.
Researchers have proposed several hardware architecture solutions, such as the Time-
Triggered architecture [KB03], that rely on hardware replication to tolerate a single
permanent fault in any of the components of a fault-tolerant unit. Such approaches can
be used for tolerating transient faults as well, but they incur great hardware costs. Alter-
natives to such purely hardware-based solutions are approaches such as re-execution,
replication, and checkpointing. Several researchers have shown how the schedulabil-
ity of an application can also be guaranteed with appropriate levels of fault-tolerance
[BM94, BDP96, ZC03].
With regard to energy minimization, the most common approach that allows trade-offs
between energy and performance during run-time of the application is Dynamic Volt-
age and Frequency Scaling (DVFS) [SAHE03]. DVFS aims at reducing the dynamic
power consumption by scaling down operational frequency and circuit supply voltage.
A considerable amount of work has been done on DVFS, see [SAHE03] for a survey.
The effectiveness of DVFS for the current processors is debatable [LSH10] because
of their reduced dynamic range of power consumption. However, the safety-critical
embedded systems we address in this paper typically use mature microprocessors, for
which the failure rates are well-documented, and where DVFS is relevant.
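The energy/performance trade-off behind DVFS can be illustrated with the textbook CMOS approximation, in which dynamic power is P = C_eff · V² · f and a task needs a fixed number of clock cycles. This is a generic sketch and not the power model used later in this paper; the function names and numeric values are hypothetical.

```python
def dynamic_energy(c_eff, voltage, cycles):
    """Dynamic energy in joules: E = P * t = (C_eff * V^2 * f) * (cycles / f)."""
    return c_eff * voltage ** 2 * cycles


def execution_time(cycles, frequency_hz):
    """Execution time in seconds for a given number of clock cycles."""
    return cycles / frequency_hz


# Illustrative task of one million cycles at two operating points.
energy_full = dynamic_energy(1e-9, 1.0, 1e6)
energy_scaled = dynamic_energy(1e-9, 0.5, 1e6)
time_full = execution_time(1e6, 1e6)
time_scaled = execution_time(1e6, 5e5)
```

Halving the supply voltage reduces the dynamic energy of the task to a quarter, while halving the frequency doubles its execution time, which is why DVFS consumes slack.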
Early research has analyzed the interplay of the energy/performance trade-off and
fault-tolerance techniques [EAHS+06, MME04, ZC04]. Redundancy-based fault-tolerance
techniques (such as re-execution and replication) and DVFS-based low-power techniques
compete for the available slack. The interplay of power management and fault
recovery has been addressed in [MME04], where checkpointing policies were evaluated
with respect to energy. In [EAHS+06], time redundancy was used in conjunction
with information redundancy, which does not compete with DVFS for slack, to tolerate
transient faults. In [ZC04], fault tolerance and dynamic power management were studied,
and rollback recovery with checkpointing was used to tolerate multiple transient
faults in distributed systems.
Addressing energy and reliability simultaneously is especially challenging because
lowering the voltage to reduce energy consumption has been shown to increase the
number of transient faults exponentially [ZMM04]. The main reason for such an in-
crease is that, with lower voltages, even very low energy particles are likely to create a
critical charge that leads to a transient fault.
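To give a feeling for such an exponential dependency, the sketch below uses one exponential form that is often used in this line of work, λ(f) = λ0 · 10^(d(1−f)/(1−f_min)), where f is the normalized frequency. The form and all constants are assumptions made for illustration and are not taken from this text.

```python
def fault_rate(f, lambda0=1e-6, d=2.0, f_min=0.5):
    """Transient fault rate at normalized frequency f (f_min <= f <= 1).

    lambda0: fault rate at the maximum frequency (hypothetical value).
    d:       sensitivity of the fault rate to scaling (hypothetical value).
    """
    return lambda0 * 10 ** (d * (1.0 - f) / (1.0 - f_min))


rate_at_max = fault_rate(1.0)
rate_at_min = fault_rate(0.5)
```

With these illustrative constants, running at the lowest frequency (and the corresponding voltage) raises the fault rate by two orders of magnitude compared to full speed.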
Several researchers have started to address this aspect. A single-task reliability-aware
checkpointing scheme was evaluated in [ZMM04]. Applications distributed on multi-
processors scheduled using non-preemptive static cyclic scheduling and modeled with
Directed Acyclic Graphs (DAG) have been addressed in [AGK12, PPIE07]. Pop et al.
[2007] have proposed a constraint logic programming based approach, which, consid-
ering a given mapping of tasks to processors, decides the voltage and frequency levels,
and the start time of tasks in the static schedule such that transient faults are tolerated,
the timing constraints of the application are satisfied and the energy is minimized. The
authors have used task re-execution in a shared recovery block to tolerate the transient
faults.
Aydin and Zhu [2009] have proposed a reliability-aware DVFS heuristic for indepen-
dent task sets on uni-processor systems, which schedules a separate recovery task for
each task executed at a reduced voltage and frequency level, in order to preserve the
desired reliability. This approach has been improved in [ZAZ13] to consider a
shared-recovery scheme, i.e., several tasks will use a shared recovery block, in the context of
dependent tasks modeled using DAGs. Researchers have also addressed reliability in
the context of temperature-aware design. Their focus has been on limiting the oper-
ating temperature in order to increase the life-time of the system [CRM+06]. Such a
technique could be used in conjunction with our approach.
Pop et al. [2007], as mentioned, consider energy/reliability trade-offs in the context
of distributed time-triggered systems, where tasks and messages are scheduled based
on a static-cyclic scheduling policy, and transient faults are tolerated using task re-
execution. In [PIEP09], researchers show how re-execution and active replication can
be combined in an optimized implementation that leads to a schedulable fault-tolerant
application without increasing the resources required.
In this paper, we consider heterogeneous distributed event-triggered systems, where
tasks are scheduled using fixed-priority preemptive scheduling, and messages are sched-
uled using fixed-priority non-preemptive scheduling. Transient faults are tolerated
through task replication. In this context, we propose an optimization approach that
decides at design time (offline) the mapping and the operating voltage and frequency
level of tasks, such that transient faults are tolerated, the timing constraints of the ap-
plication are satisfied, and the energy consumed is minimized.
We have developed a Tabu Search-based approach for the synthesis of fault-tolerant
event-driven systems that takes into account the influence of voltage and frequency
scaling on reliability. The offline synthesis assumes that tasks will execute up to their
worst-case execution times (WCETs) and that all the task replicas will have to execute
in order to tolerate transient fault occurrences. However, during runtime, tasks typically
execute for only a fraction of their WCETs [EY97]. In addition, the knowledge
about fault occurrences, i.e., which replicas do not have to execute because faults have
not occurred in the original task, could potentially be used for further energy savings.
Our interest is to obtain such energy savings online without negatively impacting
reliability. Therefore, in this paper, we also propose an online DVFS algorithm which,
based on the knowledge of actual task execution times and fault occurrences, decides
the operating voltage and frequency levels of tasks such that the energy is further
reduced (compared to the offline synthesis), the transient faults are tolerated and the
timing constraints (deadlines of tasks) are satisfied.
There is limited work on reliability-aware online DVFS. The approaches in [ZC06,
WMWZ12] consider uni-processor systems and have an online component that de-
cides the voltage and frequency levels to minimize energy, such that the task dead-
lines are satisfied. Fault-tolerance is achieved by using checkpointing with rollback
recovery. However, these approaches ignore the negative impact of lower voltages
on the reliability [ZMM04]. This aspect is taken into account in the online approach
proposed in [ZAZ13], which decides the frequency for tasks under schedulability and
fault-tolerance guarantees, but Zhao et al. [2013] also consider only uni-processor
systems.
The next two sections present the system model and the reliability model. Section 3.4
introduces the problem and discusses a motivational example. Our proposed offline
reliability-aware mapping, voltage and frequency scaling approach is presented in Sec-
tion 3.5, and the online reliability-aware voltage and frequency scaling algorithm is
introduced in Section 3.6. Section 3.7 presents the evaluation of the proposed synthesis
approaches. We draw our conclusions in the last section.
3.2 System Model
We consider hardware architectures consisting of a set N of heterogeneous processing
elements (PEs) interconnected by a shared communication channel. Tasks
are scheduled using fixed-priority preemptive scheduling. Researchers have shown
[TSPS12a] how realistic protocols such as TTEthernet can be taken into account during
the analysis and synthesis. However, for simplicity, in this paper we consider that the
PEs are interconnected by a shared bus, which uses a fixed-priority dynamic scheduling
scheme and assumes that the messages are non-preemptible. This is similar to how a
widespread bus protocol such as the controller area network [Bos91] works.
In this paper we are interested in reducing the energy consumption without negatively
impacting reliability and schedulability. We use DVFS [SAHE03] for performing
power management in PEs of the architecture. For new processors and memory ar-
chitectures, DVFS is limited in its potential for energy savings, in some situations,
paradoxically, leading to an increased power consumption [LSH10]. In addition, other
system components, such as memories, and I/Os have an increased energy consump-
tion. However, as we mentioned in the introduction, in this paper we are interested
in safety-critical embedded systems, which use mature platforms due to dependability
reasons, where DVFS is still effective [LSH10]. We use assumptions and an energy
model similar to those of previous research [AZ09, ZAZ13, AGK12], which focuses on the
energy management of PEs.
We use the power model from [BBL09], as it is able to capture a set of realistic
assumptions, such as discrete operating modes of PEs and the energy and delay due to
mode switches, and it takes I/O operations into account. Thus, a PE Ni is characterized







described by three parameters: $f_j^{N_i}$ is the operating clock frequency of Ni running in
mode j, measured in Hz; $v_j^{N_i}$ is the supply voltage, measured in volts; and
$p_j^{N_i}$ is the power spent, measured in watts. We also introduce the normalized frequency F and











In this paper we are interested in tolerating transient faults, which are the most common
faults in today’s embedded systems [Con03]. There are many transient error detection
techniques [CMS82] with varying error detection coverage and overhead. Similar to
the previous research [AZ09, ZAZ13, AGK12], we assume that faults are detected at
the completion of a task’s execution, and the overhead of fault detection is included in
the task’s WCET. Researchers have shown [PIEP09] how re-execution, which provides
time-redundancy, and active replication, which provides spatial-redundancy, can be
combined in an optimized implementation that reduces the fault-tolerance overheads.
In this paper we are not interested in the optimization of the redundancy mechanisms
needed for fault tolerance. Hence, for simplicity, we have decided to use active repli-
cation to tolerate transient faults. Thus critical tasks will have replicas that will execute
on the same or a different PE, delivering the desired functionality in case the original
task fails due to a transient fault. Our work can be extended to consider any other fault-
tolerance mechanisms such as re-execution or check-pointing with rollback recovery.
We model an application as a set Γ of periodic real-time tasks. The mapping of a task
τi to a PE Nj is captured by the mapping function M : Γ → N, i.e., M(τi) = Nj. This
mapping is not yet known and will be decided by our approach presented in Section 3.5.
For each task τi we know its WCET $C_i^{N_j}$ (in the maximum speed operating mode) with
respect to each Nj on which it is considered for mapping, its period Ti and its deadline Di. An
application Γ is repeated periodically with a period Tcycle. Since we use fixed-priority
preemptive scheduling for tasks, each task τi is assigned a unique priority.
Tasks are divided into two categories: critical and non-critical. To prevent application
failure, critical tasks have to tolerate transient faults. We assume that for each
critical task τi, the designer will specify a desired redundancy level ki, i.e., how many
replicas of τi have to be introduced, see Section 3.3 for details. The desired redundancy
level is captured by the function F : F(τi) = ki. If ki = 0, the task is non-critical.
A critical task and its replicas could be mapped on the same PE, or mapped on different
PEs, and can run in different operating modes.
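The task model with redundancy levels can be sketched as a small data structure. The class, field and function names are hypothetical, and the replica naming scheme is only for illustration; each returned instance would be mapped to a PE and assigned an operating mode independently.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Task:
    name: str
    period: int      # Ti in ms
    deadline: int    # Di in ms
    k: int = 0       # desired redundancy level ki; 0 means non-critical


def with_replicas(tasks):
    """Return the names of all task instances: each task followed by its k replicas."""
    instances = []
    for task in tasks:
        instances.append(task.name)
        for r in range(1, task.k + 1):
            instances.append(f"{task.name}/replica{r}")
    return instances


example_instances = with_replicas([Task("t1", 100, 100, k=2), Task("t2", 50, 50)])
```

In the example, the critical task t1 yields three instances (the original and two replicas), whereas the non-critical task t2 yields only itself.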
Each execution of a periodic task is called a job. The jth job of task τi is referred to as
Jij. Our offline synthesis will assign the same operating mode to all jobs Jij of a task τi.





the online scheduling may decide at runtime to run jobs of the same task in different
operating modes. In that case, the function L(Jij) will refer to the mode of the job Jij.
For simplicity, we do not consider the situation in which an intermediate mode is
obtained by applying two modes to different execution segments of the same task,
as in [BBL09]. However, our approach can be extended to consider this aspect.
Tasks communicate using messages. We know the size $s_{m_i}$ and the priority of each
message mi. Our model captures two types of communication: sampling and queuing. A
sampling communication has a buffer storage for a single message; arriving messages
overwrite the buffer, and reading does not remove the message, i.e., it can be read
repeatedly. A queuing communication uses a buffer that can store several messages, and
works as a FIFO queue. A reader task will block if the buffer is empty and a writer task
will block if the buffer is full. We assume that the buffer sizes have been determined
such that there is no overflow or underflow (see Manolache et al. [2006]).
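The difference between the two communication types can be sketched with two small classes. The class and method names are hypothetical; since the sketch is single-threaded, the cases in which a task would block are signalled by a return value instead of actually blocking.

```python
from collections import deque


class SamplingPort:
    """Single-message buffer: writes overwrite, reads do not consume."""

    def __init__(self):
        self._message = None

    def write(self, message):
        self._message = message

    def read(self):
        return self._message


class QueuingPort:
    """Bounded FIFO buffer: reads consume messages in arrival order."""

    def __init__(self, capacity):
        self._buffer = deque()
        self._capacity = capacity

    def try_write(self, message):
        if len(self._buffer) >= self._capacity:
            return False  # the writer task would block here
        self._buffer.append(message)
        return True

    def try_read(self):
        if not self._buffer:
            return None  # the reader task would block here
        return self._buffer.popleft()
```

A sampling port therefore always exposes the most recent value, while a queuing port preserves every message as long as the buffer capacity is sufficient.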
The bus is used also to communicate fault-occurrence information. Thus, if a task
does not experience a fault, and one of its replicas is on a different PE, we broadcast a
message informing the scheduler on the other PEs not to start the replicas (or terminate
them, if they are executing). Our schedulability analysis takes all these messages into
account.
3.3 Reliability Model
Safety is a property of a system that will not endanger human life or the environment.
A Hazard is a situation in which there is active or potential danger to people or the
environment. Risk is a combination of the probability of a hazardous event and its con-
sequence. If, after performing an initial hazard and risk analysis, a system is considered
safety-critical, it has to be certified [KK10]. Certification is a “conformity of
assessment” performed by a third party, according to area-specific certification standards. For
example, the IEC 61508 standard is used for industrial applications, ISO 26262 is for
the automotive area, and DO-178B is used for avionics software.
During the engineering of a safety-critical system, the risks are assessed and the
appropriate risk control measures mandated by the standards are introduced to reduce the risk
to an “acceptable level”. Thus, safety functions are assigned a Safety-Integrity Level
(SIL), which captures the required level of risk reduction. The SIL assigned to a soft-
ware task or hardware component will dictate the development processes and amount
of redundancy that have to be used.
In this paper we are interested in tolerating hardware transient faults which manifest
themselves at the task-level. Note that all software faults (bugs) are permanent, i.e.,
they are due to specification, design or implementation mistakes. Software does not
experience transient faults similar to hardware, since it is not aging, for example. A
software bug disappears only if the software is updated with a new version, where the
bug has been removed. If permanent faults have to be tolerated, we assume that the
designer has introduced the appropriate measures. For example, permanent faults in
hardware are tolerated through hardware redundancy: using several (identical) hard-
ware modules, such that a healthy module can take over if the original module is faulty.
Fault masking configurations (such as Triple Modular Redundancy) are used when no
erroneous output can be allowed even for a short period of time, and reconfiguration,
which switches in a spare module after the fault is detected, is used otherwise. Perma-
nent faults in software can only be tolerated using some form of functional diversity,
i.e., developing separate software modules using different specifications, teams, methods,
in the hope that the permanent faults are not correlated across the modules [KK10].
Transient faults are typically tolerated by re-executing the failed task. By its nature,
the transient fault has most likely disappeared, and the re-execution will happen
fault-free. If time constraints are not strict, the re-execution will be scheduled after the
transient fault has been detected. To reduce the recovery delay, a replica of a task can be
scheduled in parallel on a different PE. Such an approach is called active replication. In
this paper we allow replicas to be executed also on the same PE (amounting, practically,
to re-execution) if our synthesis decides that this is a good option. Finally, checkpoint-
ing with rollback recovery is used for tasks with very long computation times, where
Paper B: Reliability-Aware Dynamic Energy Management for
Fault-Tolerant Distributed Embedded Systems
saving the current status of a task into periodic checkpoints is better than waiting until
the end to recover [KK10].
The amount of redundancy used for fault tolerance is determined by the reliability of
the components and the criticality of the function (i.e., the SIL, which corresponds
to a certain reliability objective). The more reliable a component (hardware module
or software task), the less redundancy has to be used. We define the reliability as
the probability of successful execution. Since we are interested in tolerating transient
faults, we have to determine the probability that a task will not fail due to a transient
fault.
The previous related research [AZ09,ZAZ13,AGK12] has used the exponential failure
law to capture the reliability Ri of a task τi:
$$R_i = e^{-\lambda c_i} \tag{3.1}$$
where λ is the permanent fault rate of the PE running τi and ci is the execution time
of the task. However, the exponential failure law refers to permanent hardware faults
[KK10], and does not accurately capture the probability of correct operation in case of
transient faults. In this case, researchers have proposed to use a Weibull distribution





where $\lambda_j^{\ell}$ now captures the hardware transient fault rate of the PE $N_j$ running task τi in
the operating mode $\ell$, and $\mu_j^{\ell}$ is a shape parameter. A value of $\mu_j^{\ell} < 1$ indicates that the
failure rate decreases over time, for $\mu_j^{\ell} = 1$ the failure rate is constant, and a value of
$\mu_j^{\ell} > 1$ means that the failure rate increases with time. More complex models, which
can capture the dependence of transient errors on patterns of usage, have also been
proposed, but they are beyond the scope of this paper [CMS82].
Note that not all hardware transient faults will manifest themselves as software task
failures. This depends on the length of the transient faults, the fault type and its source
in terms of the architectural component affected [WRPG11]. The percentage of hard-
ware transient faults affecting the software can be determined by fault injection, and
our model can use this information, if available. However, since quite a large percent-
age of hardware transients will lead to a task failure [WRPG11], we will use Eq.3.2 as
it is a safe (pessimistic) estimation of task failures.
Researchers have so far assumed that the software itself is perfect and hence will only
fail because of transient hardware failures [AZ09,ZAZ13,AGK12]. However, software
bugs are very common, and for this reason the certification standards make the opposite
assumption, i.e., that software always fails [IEC98] and require the use of appropriate
mitigation. The reason the standards make this assumption is that software reliability
modeling is difficult and there are no industrial-strength well-established software reli-
ability models, and using software redundancy without using diversity does not work,
as it does for hardware. Under this assumption, we consider that the permanent soft-
ware faults have already been addressed as discussed, through software diversity, and
we do not explicitly take them into account in our model. Therefore, we capture the
reliability Ri of executing a task τi on a PE $N_j$ using Eq.3.2, where, for simplicity, we
consider that the fault rate is constant over time, i.e., $\mu_j^{\ell} = 1$.
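As a minimal sketch of this simplification, assuming the constant-rate case (shape parameter equal to 1), the task reliability reduces to an exponential in the product of the transient fault rate and the execution time. The function name and the sample values below are hypothetical and only illustrate the arithmetic.

```python
import math


def task_reliability(fault_rate, exec_time):
    """Probability that a task finishes without a transient fault.

    Assumes a constant fault rate (shape parameter 1), i.e.
    R = exp(-fault_rate * exec_time). Both arguments must use
    the same time unit.
    """
    return math.exp(-fault_rate * exec_time)


# Hypothetical values: fault rate 1e-7 per ms, execution time 7 ms.
r = task_reliability(1e-7, 7.0)
```

A longer execution time, for example after scaling down the frequency, lowers the value returned by `task_reliability` even if the fault rate stays the same.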
The reliability of a critical task, $R_i^{rep}$, is increased through introducing $k_i$ replicas. $R_i^{rep}$






Note that when $k_i = 0$ (i.e., the task is non-critical), $R_i^{rep} = R_i$. Considering independent




where $J_{ij} \in \Gamma$ are all the jobs of the application Γ released during Tcycle. Note that the
reliability is a function of time, i.e., the probability of continuous successful execution
over a given time period. For safety-critical embedded systems, this time is typically
defined as the “mission duration”. We denote the mission duration with $T_S$.
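The effect of replication on reliability can be sketched as follows. This is an illustration under stated assumptions, not the exact formulation used here: it assumes that a task and its replicas fail independently with the same reliability, so that the task fails only if all $k_i + 1$ copies fail, and that the system succeeds only if every job succeeds. The function names are hypothetical.

```python
import math


def replicated_reliability(r_task, k):
    """Reliability of a task with k replicas (k = 0 means no replication).

    Assumption: the original and its replicas fail independently and
    share the same reliability r_task.
    """
    return 1.0 - (1.0 - r_task) ** (k + 1)


def system_reliability(job_reliabilities):
    """Assumption: jobs fail independently, so all of them must succeed."""
    return math.prod(job_reliabilities)
```

With `k = 0` the first function returns the unreplicated task reliability, which agrees with the remark on non-critical tasks.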
3.3.1 Energy/Reliability Trade-off Model
Let us denote with $c_{ij}^{N_j}$ the execution time of a job $J_{ij}$ in the fastest operating mode.
Then, considering the well-known energy/performance tradeoff with DVFS, the execution
time $c_{ij}$ of a job $J_{ij}$ mapped on PE $N_j$ and executed in mode $\ell$ is $c_{ij} = c_{ij}^{N_j} / F_{\ell}^{N_j}$.
The energy consumption ES of finishing all jobs during Tcycle is calculated by:
$$E_S = \sum_{J_{ij} \in \Gamma'} p_{\ell}^{N_q} \, c_{ij} \tag{3.5}$$
where Γ′ is the set of all tasks in the application Γ to which we add all the replicas, $J_{ij}$
are all the jobs of the tasks in Γ′ released during Tcycle, $p_{\ell}^{N_q}$ is the power consumed
by $N_q$ in the operating mode $\ell$, and $c_{ij}$ is the execution time of job $J_{ij}$ considering the
mode $\ell$ on $N_q$.
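The energy sum of Eq.3.5 can be sketched with the data listed in Fig.3.1. The mapping used below is an assumption chosen for illustration (it is not stated in the text), the task names `t1r` and `t2r` stand for the replicas, and the number of jobs per cycle follows from the periods and Tcycle = 100 ms. Under this assumed mapping, with all tasks in the fastest mode, the sum comes to 1312 mJ, which matches the E0 value listed in Fig.3.1.

```python
# WCETs at the fastest mode (ms), per task and PE, as listed in Fig. 3.1.
WCET = {
    "t1": {"N1": 7, "N2": 14}, "t2": {"N1": 6, "N2": 12},
    "t3": {"N1": 5, "N2": 10}, "t4": {"N1": 8, "N2": 16},
    "t1r": {"N1": 7, "N2": 14}, "t2r": {"N1": 6, "N2": 12},
}
# Jobs released per Tcycle = 100 ms (period 50 ms gives 2 jobs, 100 ms gives 1).
JOBS_PER_CYCLE = {"t1": 2, "t2": 1, "t3": 2, "t4": 1, "t1r": 2, "t2r": 1}
# (frequency in MHz, power in W) for each operating mode, as in Fig. 3.1.
MODES = {
    "N1": {1: (333, 4), 2: (666, 12), 3: (1000, 25)},
    "N2": {1: (166, 2), 2: (333, 4.5), 3: (500, 11)},
}


def exec_time(task, pe, mode):
    """Execution time scaled by the normalized frequency: c = c_fastest / F."""
    f_max = max(freq for freq, _ in MODES[pe].values())
    freq, _ = MODES[pe][mode]
    return WCET[task][pe] * f_max / freq


def energy(mapping, modes):
    """Sum of power * execution time over all jobs in one cycle (W * ms = mJ)."""
    total = 0.0
    for task, pe in mapping.items():
        _, power = MODES[pe][modes[task]]
        total += JOBS_PER_CYCLE[task] * power * exec_time(task, pe, modes[task])
    return total


# Assumed mapping (hypothetical), every task in the fastest mode (mode 3).
mapping = {"t1": "N2", "t3": "N2", "t1r": "N2", "t4": "N2", "t2": "N1", "t2r": "N1"}
modes = {task: 3 for task in mapping}
e0 = energy(mapping, modes)
```

Changing an entry of `modes` to a lower mode lengthens the execution time through `exec_time` while reducing the power, which is the tradeoff the synthesis explores.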
Note that the assigned redundancy levels ki are only valid under the assumption that
the system reliability RS does not fall below a specified reliability goal Rg. If RS is
below Rg, the fault rate might be higher and thus more redundancy might be necessary
to tolerate the increased number of faults.
According to the energy/reliability trade-off model from [ZMM04], which is used in
all of the related work [AZ09, ZAZ13, AGK12], the fault rate depends on the oper-
ating mode, i.e., lowering the voltage to reduce energy consumption has been shown
to increase the number of transient faults exponentially [ZMM04]. More formally,
[ZMM04] shows that when a PE runs in the minimum voltage and frequency level, the
fault rate increases $10^d$ times compared to that of a PE running in the maximum voltage
and frequency level, where d (d > 0) is a PE-specific constant.
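Let us define $\lambda_j^0$ as the minimum fault rate, which corresponds to the maximum voltage
and frequency mode $\Lambda_{max}^{N_j}$. Thus, the maximum fault rate $\lambda_j^{max}$, which corresponds to
the minimum voltage and frequency mode, is:
$$\lambda_j^{max} = \lambda_j^0 \cdot 10^{d} \tag{3.6}$$
where d (> 0) is a PE-specific constant.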
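Based on [ZMM04], and using the normalized frequency F and normalized voltage V,
the fault rate of PE $N_j$ in operating mode $\ell$ is:
$$\lambda_j^{\ell} = \lambda_j^0 \cdot F^{\alpha} \cdot 10^{-\beta V} \tag{3.7}$$
where α and β are calculated by the given $\lambda_j^0$ and $\lambda_j^{max}$. Further, we have considered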
d = 2, i.e., the fault rate increases 100 times between the maximum and minimum
operating modes [ZMM04]. Note that this mode-dependent fault rate is used in the
task reliability calculation from Eq.3.2. Since the fault rate increases exponentially
when supply voltage and operating frequency decrease, switching the operating mode
has an impact on the system reliability.
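A small sketch of the mode-dependent fault rate, assuming Eq.3.7 has the form $\lambda^0 \cdot F^{\alpha} \cdot 10^{-\beta V}$ and using the parameter values listed in Fig.3.1. Normalizing frequency and voltage by the maximum values of the PE is an assumption of this sketch, and the function names are hypothetical.

```python
def fault_rate(lambda0, f_norm, v_norm, alpha, beta):
    """Mode-dependent transient fault rate, assuming the form
    lambda0 * F**alpha * 10**(-beta * V), with F and V normalized to (0, 1]."""
    return lambda0 * (f_norm ** alpha) * 10.0 ** (-beta * v_norm)


def max_fault_rate(lambda0, d):
    """Fault rate in the lowest mode: 10**d times the rate in the highest mode."""
    return lambda0 * 10.0 ** d


# Parameters as listed in Fig. 3.1 (alpha = -4, beta = -0.04, lambda0 = 1e-7).
ALPHA, BETA, LAMBDA0 = -4, -0.04, 1e-7
# N1 at mode 3 (1000 MHz, 1.6 V) and mode 1 (333 MHz, 1.2 V), normalized
# by the maximum frequency and voltage (an assumption of this sketch).
fast = fault_rate(LAMBDA0, 1000 / 1000, 1.6 / 1.6, ALPHA, BETA)
slow = fault_rate(LAMBDA0, 333 / 1000, 1.2 / 1.6, ALPHA, BETA)
```

In this sketch `slow` is larger than `fast`, reflecting the higher fault rate of the lower operating mode.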
3.4 Problem Formulation
The problem we are addressing in this paper can be formulated as follows. Given (1)
an application modeled as a set Γ of tasks, with associated redundancy levels captured
by function F , (2) an architecture consisting of a set N of heterogeneous PEs that
work under different operating modes Λ, and (3) a reliability goal Rg which bounds
the system reliability Rs, we are interested in synthesizing an implementation S , such
that the deadlines of all tasks are satisfied, the system reliability is within the imposed
reliability goal, i.e., Rs ≥ Rg, and the energy consumption is minimized.
Synthesizing an implementation S = (M , L) means deciding offline (1) the mapping
M of tasks and replicas to the PEs and (2) the operating mode L for executing each
task.
The offline synthesis has to assume that the tasks will execute up to their WCET and
that we need to execute all replicas. At runtime we will know the actual task execution
times and fault occurrences. Hence, at runtime we are interested in dynamically setting the
operating mode for executing each job in order to exploit the resulting slack, such that
the energy is further reduced while the timing and reliability constraints are guaranteed.
3.4.1 Motivational Example
Let us consider the example in Fig.3.1 where we have a task set of six tasks (four tasks
and two replicas) mapped on an architecture of two PEs. For simplicity, in this example
we ignore the communication. The WCETs, periods, deadlines and priorities of all
tasks are presented in the left table. τ1 and τ2 are critical tasks and they are replicated
once. We denote the replicated task of τi as τ′i and its jobs as J′i j. Recall that tasks are
periodic and they are executed according to fixed-priority preemptive scheduling.
The operating modes of the two PEs are given in the right table, which also contains
λ0, α, and β (we assume the same values for both PEs in this example). We set the
shape parameter $\mu_j^{\ell}$ to 1 (see Eq.3.2) for both PEs and consider the application cycle
Tcycle = 100 ms. We denote with E0 and R0 the energy and reliability, respectively,
of S0, see Fig. 3.1. In the motivational example we have set Rg to 0.999969, which
means that we accept a probability of failure that is ten times greater than that of S0, i.e.,
Rg = 1−10(1−R0).
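The relation between Rg and R0 can be checked with a few lines of arithmetic, using R0 = 0.9999969 from Fig.3.1. The helper name is hypothetical.

```python
def reliability_goal(r_ref, factor):
    """Goal that tolerates `factor` times the failure probability of the reference."""
    return 1.0 - factor * (1.0 - r_ref)


r0 = 0.9999969  # reliability of the reference implementation S0 (Fig. 3.1)
rg = reliability_goal(r0, 10)
print(f"{rg:.6f}")  # prints 0.999969
```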
Our problem has two components: an offline synthesis component and an online com-
ponent. We will use the example to discuss first the offline synthesis. Fig.3.2 shows
two alternatives to the offline synthesis problem. The figure shows a timeline for each
PE. As mentioned, the offline scheduling has to assume that the tasks will execute up
to their WCET (the length of the rectangles in the timeline) and that all replicas are
Γ′     C_i^N1   C_i^N2   T_i = D_i   Priority
τ1     7        14       50          1
τ2     6        12       100         4
τ3     5        10       50          2
τ4     8        16       100         5
τ′1    7        14       50          3
τ′2    6        12       100         6
Tcycle = 100 ms

       N1                           N2
ℓ      Freq.    Volt.   Power       Freq.    Volt.   Power
       [MHz]    [V]     [W]         [MHz]    [V]     [W]
1      333      1.2     4           166      1.1     2
2      666      1.4     12          333      1.25    4.5
3      1000     1.6     25          500      1.5     11

α = −4, β = −0.04, λ0 = 10^−7
E0 = 1312 mJ, R0 = 0.9999969, Rg = 0.999969
Figure 3.1: System Model Example
Figure 3.2: Offline Synthesis
executed. In addition, the timeline on each PE considers the scenario when all tasks are
released at t = 0. However, tasks might be released at different times. In Section 3.5,
we discuss the schedulability analysis and the assumptions considered.
The offline synthesis has to decide the mapping and the operating mode for each task.
In the figures, the mapping is indicated by placing the task on the timeline of the PE
where it is mapped and the height of the rectangle indicates the operating mode. For
the discussion, we use a reference implementation S0 which does not perform voltage
and frequency scaling, and thus runs all the tasks in the maximum speed operating
mode. S0 attempts to reduce the energy consumption by mapping the tasks on the
PEs which consume the least energy in the maximum operating mode, without missing
their deadlines.
Figure 3.3: Online Scheduling
Fig.3.2(a) shows the solution Sa which minimizes the energy consumption without
regard for reliability. Sa meets all tasks’ deadlines and reduces the energy by
42.58%, compared to E0. However, Sa does not meet the system reliability goal, i.e.,
RSa = 0.999362 < Rg = 0.999969. Since 1−RS captures the probability of failure, we





where R0 is the system reliability of S0. According to Eq. 3.8, the probability of failure
for Sa is 160 times greater than that of S0 .
By carefully deciding the mapping M and operating modes L , it is possible to reduce
the negative impact on reliability without a significant loss of energy savings. Another
implementation, Sb, shown in Fig.3.2(b) fulfills all timing and reliability requirements,
with only 0.45% loss in energy savings, compared to Sa.
Let us now discuss the online component of our synthesis approach. At runtime, tasks
will finish before their WCETs and in the case that no errors are occurring in the crit-
ical tasks, the corresponding replicas are terminated. This will increase the available
slack in the schedule. Hence, at runtime, additional energy savings can be obtained
by exploiting this additional slack. Our online scheduling uses the mapping decided by
the offline component, but will decide at runtime a different (if needed) operating mode
for executing each job such that the energy is further minimized, under reliability and
timing constraints.
We consider as the starting point the offline synthesis result from Fig.3.2(b) and we
assume that, during runtime, a single fault is going to happen in the second job of task
τ1, namely J12. Fig.3.3 shows two online scheduling alternatives. We assume that the
jobs will finish before their WCETs, and the actual execution times during runtime are
shown by the length of the rectangles in Fig.3.3, under the respective operating mode
(which is the height of the rectangle).
Fig.3.3(a) shows an online scheduling solution which uses the additional slack at run-
time to reduce further the energy, but without enforcing the reliability constraint Rg.
Job J11 finishes before its WCET and does not experience a fault. This means that we
do not have to execute the replica J′11. The online scheduling will use the slack resulting
from the earlier finish of job J11 and the removal of J′11 to run J21 in a lower operating mode
compared to Fig.3.2(b), in order to save energy.
In the runtime scenario depicted in Fig.3.3(a), J21 will finish without faults before its
replica J′21 is released on N2. Our online scheduling approach uses the shared bus to
broadcast the error information to all PEs (we take these extra messages into account
during schedulability analysis). Thus, J′21 is not activated on N2, creating extra slack.
However, because a fault has occurred in J12, its replica, J′12, is executed.
We use the slack created on N2 due to the earlier termination of J31 and due to not
executing J′21 to scale down the operating modes of J41 and J32 compared to the offline
solution in Fig.3.2(b). This leads to an energy reduction of 85.52% compared to
S0¹. However, this energy reduction comes at the cost of not meeting the reliability
goal. The system reliability of Fig.3.3(a) is only 0.999959, which is lower than Rg
(0.999969) and corresponds to a 13-fold increase in the probability of failure relative to S0.
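The 13-fold figure is consistent with the ratio of failure probabilities relative to the reference implementation S0. The check below is a sketch that assumes this ratio is the intended measure and takes R0 = 0.9999969 from Fig.3.1; the function name is hypothetical.

```python
def failure_ratio(r_solution, r_reference):
    """How many times more likely a failure is, relative to the reference."""
    return (1.0 - r_solution) / (1.0 - r_reference)


# Reliability of the solution in Fig. 3.3(a) against R0 from Fig. 3.1.
ratio = failure_ratio(0.999959, 0.9999969)
```

The resulting `ratio` is slightly above 13.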
In our online scheduling we are interested in exploiting the extra slack created at runtime
to further reduce the energy, but without violating the reliability goal. Such a solution
is shown in Fig.3.3(b). Compared to the solution in Fig.3.3(a), we decide at runtime
not to scale down further J41 and J32, in order to preserve the reliability goal. Thus, the
resulting system reliability is 0.999985, which is within the reliability goal. Compared
to Fig.3.3(a), in Fig.3.3(b) we are still able to reduce the energy consumption to a
similar level, but this time without negatively impacting the reliability.
These examples show that it is important, both offline and online, to carefully optimize
the mapping of tasks and the operating modes of jobs such that the reduction of energy
does not lead to impaired reliability. The next two sections present our proposed offline
and online scheduling algorithms, respectively.
¹ The reference energy E0, which considers running all tasks in the maximum speed operating mode and
up to their WCETs, is a very pessimistic metric.
3.5 Offline Synthesis
Determining an optimal task mapping is an NP-hard problem [TBW92]. To solve the
offline synthesis problem presented in the previous section, we propose a Tabu Search-
based approach, which decides offline the mapping M and operating mode L for exe-
cuting each task.
Tabu Search (TS) [Glo89] is an optimization metaheuristic which iteratively explores
the solutions in the vicinity (neighborhood) of the current solution, selecting the ones
which optimize the cost function. Our proposed cost function in Eq. 3.9 captures the
energy minimization under timing and reliability constraints:
$$cost(S) = E_S + W_R \cdot \max(0, R_g - R_S) + W_S \cdot r_S \tag{3.9}$$
where the first term is the energy consumption of running the implementation S, the
second term is the reliability constraint and the third term represents the timing constraint.
Instead of treating solutions which do not meet the timing and reliability constraints
as infeasible and ignoring them during the design space exploration, we
penalize such solutions in the cost function by giving large values to the weights
of the corresponding terms, namely WR and WS. This allows us to explore infeasible
regions of the search space and drive the search towards feasible regions.
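The penalized cost function of Eq.3.9 can be sketched directly. The argument names are hypothetical, `r_s` stands for the schedulability term (zero when all deadlines are met), and the default weights are the values quoted for the worked example in this section.

```python
def cost(energy, reliability, reliability_goal, r_s, w_r=1e7, w_s=200.0):
    """Penalized cost: energy plus weighted constraint violations (Eq. 3.9).

    r_s is the schedulability term (0 when all deadlines are met).
    """
    reliability_penalty = w_r * max(0.0, reliability_goal - reliability)
    timing_penalty = w_s * r_s
    return energy + reliability_penalty + timing_penalty


# A feasible solution costs exactly its energy.
feasible = cost(1126.51, 0.99998, 0.999969, 0.0)
# Missing the reliability goal adds a large penalty on top of the energy.
unreliable = cost(1000.0, 0.9996, 0.999969, 0.0)
```

Because infeasible solutions stay comparable through their cost, the search can pass through them on its way to a feasible region.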
The reliability constraint is enforced by calculating the difference between Rg and RS. If
the reliability constraint is satisfied (RS ≥ Rg) then the second term is zero. Otherwise,
it is a large positive value, depending on the penalty weight WR. The timing constraints




We use a response-time analysis [But11] to calculate the worst-case response time
(WCRT) ri of every task τi, which is compared to the deadline Di. The basic anal-
ysis presented in [But11] has been extended over the years. For example, the state-of-
the-art analysis in [PGH98] considers arbitrary arrival times and deadlines, offsets and
inter-task communication. The schedulability analysis takes into account the delays of
the “queuing” messages when determining the WCRT of receiving tasks, and checks
the schedulability of the bus considering all types of messages.
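For illustration, the basic fixed-point response-time recurrence for fixed-priority preemptive scheduling on a single PE can be sketched as below. This is the textbook form only: it ignores communication, offsets and release jitter, so it is not the extended analysis referred to above. The task set is hypothetical (it reuses N2 WCETs and periods from Fig.3.1), and deadlines are assumed equal to periods.

```python
def wcrt(tasks, index):
    """Worst-case response time of tasks[index].

    tasks: (wcet, period) pairs sorted from highest to lowest priority,
    with deadline equal to period. Returns a value larger than the
    period if the deadline is missed.
    """
    wcet, period = tasks[index]
    r = wcet
    while True:
        # Interference from higher-priority tasks: ceil(r / t) * c for each.
        interference = sum(-(-r // t) * c for c, t in tasks[:index])
        nxt = wcet + interference
        if nxt == r or nxt > period:
            return nxt
        r = nxt


# Hypothetical task set on one PE, highest priority first.
tasks = [(14, 50), (10, 50), (14, 50), (16, 100)]
times = [wcrt(tasks, i) for i in range(len(tasks))]
```

For this task set every response time stays within its period, so the set would be considered schedulable by this simplified test.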
Algorithm 1 MVFS
1: S0 ← InitialSolution(Γ′, N)
2: S ← Snow ← S0
3: while max_iterations is not exhausted do
4:     C ← M-moves(Snow, m)
5:     for each solution Snewi ∈ C do
6:         C ← C ∪ L-moves(Snewi, PL, n)
7:     end for
8:     C ← C ∪ L-moves(Snow, PL, n)
9:     for each solution Snewj ∈ C do
10:        Calculate ∆j = cost(Snow) − cost(Snewj)
11:    end for
12:    Snow ← SelectNewSolution(∆j, length_tl)
13:    S ← SaveBestSolution(Snow)
14: end while
15: return S
If the application is schedulable, each ri is smaller than the corresponding deadline Di.
Then rS = 0 which means that the schedulability constraint is satisfied and the third
term in Eq. 3.9 is zero. However, if the application is not schedulable, there exists at
least one ri greater than the deadline Di, therefore rS is positive. This means that the
third term in Eq. 3.9 is a large positive value depending on the penalty weight WS .
Algorithm 1 presents our reliability-aware Mapping, Voltage and Frequency Scaling
optimization (MVFS), based on the Tabu Search algorithm. MVFS takes the complete
task set Γ′, i.e., the set of original tasks Γ and the replicas, and the architecture N as
inputs, and returns the implementation S = (M, L) which minimizes the cost function.
The search starts from an initial solution S0 (line 1) which is a mapping such that the
utilization of the PEs is balanced and the communication is minimized. In S0, all tasks
are assigned the maximum speed operating mode.
Tabu Search explores the design space by using design transformations (called also
“moves"), applied to the current solution Snow in order to generate neighboring solu-
tions. The search iteratively moves from the current solution Snow to a new solution
Snew in the neighborhood of Snow. We first introduce the moves we have used in our
implementation and then we explain the search. We will use an example to discuss how
Tabu Search works. We perform two types of design moves: (1) mapping moves (M -
moves) and (2) voltage and frequency scaling moves (L-moves). The neighborhood
might be very large, and evaluating the cost function for all neighbors is computation-
ally very expensive, thus we consider a limited number of neighboring solutions called
a candidate set, denoted with C .
We introduce the following neighbors (lines 4-8) into the candidate set C . In line 4,
we add m neighbors obtained using M -moves into C . For each M -move (line 4),
we move a randomly chosen task from one PE to another PE, or swap two randomly
chosen tasks between two PEs. The tasks involved in an M -move will be assigned
the highest operating mode on the new PE. In lines 5-7, for each new solution Snewi
generated in line 4, we add n neighbors into the candidate set C obtained by modifying
them using L-moves. For each task in Γ′, if a randomly generated number is larger
than the given PL, the task will be affected by an L-move, i.e., the task is assigned
another operating mode. In line 8, we also add to C a number of n neighbors of
Snow obtained by performing only L-moves. The L-moves in line 8 are performed in the
same way as in line 6.
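The two kinds of moves can be sketched as follows. The data layout (dictionaries from task to PE and from task to mode), the function names and the equal chance of relocating versus swapping are assumptions of this sketch; the rule that moved tasks get the highest mode on the new PE and the comparison of a random number against PL follow the description above.

```python
import random


def m_move(mapping, modes, pes, max_mode, rng):
    """Relocate one random task or swap two tasks on different PEs.

    Requires at least two PEs. Moved tasks get the highest mode of their new PE.
    """
    new_map, new_modes = dict(mapping), dict(modes)
    tasks = list(mapping)
    a = rng.choice(tasks)
    partners = [t for t in tasks if mapping[t] != mapping[a]]
    if partners and rng.random() < 0.5:  # 50/50 choice is an assumption
        b = rng.choice(partners)
        new_map[a], new_map[b] = mapping[b], mapping[a]
        touched = (a, b)
    else:
        new_map[a] = rng.choice([p for p in pes if p != mapping[a]])
        touched = (a,)
    for t in touched:
        new_modes[t] = max_mode[new_map[t]]
    return new_map, new_modes


def l_move(mapping, modes, max_mode, p_l, rng):
    """Give a different mode to every task whose random draw exceeds p_l."""
    new_modes = dict(modes)
    for t in modes:
        if rng.random() > p_l:
            choices = [m for m in range(1, max_mode[mapping[t]] + 1) if m != modes[t]]
            new_modes[t] = rng.choice(choices)
    return new_modes


rng = random.Random(1)
mapping = {"t1": "N1", "t2": "N1", "t3": "N2"}
modes = {"t1": 2, "t2": 3, "t3": 1}
max_mode = {"N1": 3, "N2": 3}
new_map, new_modes = m_move(mapping, modes, ["N1", "N2"], max_mode, rng)
```

A larger `p_l` makes L-moves touch fewer tasks, so it controls how far a neighbor is from the current solution.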
In each iteration of the loop in lines 3-14, TS selects a neighbor from C as the current
solution Snow, from which the search will continue. To avoid being stuck in a local
optimum, and to prevent cycling, TS filters the neighborhood using a memory structure
called a tabu list. In our implementation, we mark as tabu the move that generated an
improved solution Snow. We record a number of length_tl such moves in the tabu list.
The size length_tl of the tabu list is also called the tabu tenure.
We evaluate all candidates in C (lines 9-11). For selecting a new solution, we first
attempt to select an improved solution with the largest improvement compared to the
current solution, as long as it is not on the tabu list or its tabu status can be aspirated. A
tabu status can be aspirated when the solution has a lower cost than that of the cur-
rent best-known solution. If no such improved solution exists, we randomly select a
non-improved solution to be the new solution as long as it has a non-tabu status, i.e.,
it is not on the tabu list. Randomly accepting non-improving moves introduces
diversification, which forces the search into previously unexplored areas of the design
space.
The selection of the new solution and the maintenance of the tabu list are done inside the
SelectNewSolution function (line 12 of Algorithm 1). Then, the SaveBestSolution function
(line 13) saves the new solution if it is the best solution so far. After max_iterations
without an improvement, the algorithm stops and the best-found S is reported. Our
algorithm can also be stopped if a given time limit has been exceeded.
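The selection rules described above can be sketched as a generic Tabu Search loop. This is a simplified illustration, not the MVFS implementation: it runs for a fixed number of iterations, and the toy problem (choosing a mode for each of four tasks under a single time budget) as well as all names and numbers are hypothetical.

```python
import random
from collections import deque


def tabu_search(initial, cost, neighbors, max_iterations=100, tenure=3, seed=0):
    """Generic Tabu Search; neighbors(solution, rng) returns (move, solution) pairs."""
    rng = random.Random(seed)
    current = best = initial
    tabu = deque(maxlen=tenure)  # oldest moves drop out automatically
    for _ in range(max_iterations):
        improving, others = [], []
        for move, sol in neighbors(current, rng):
            aspirated = cost(sol) < cost(best)
            if move in tabu and not aspirated:
                continue  # filtered out by the tabu list
            delta = cost(current) - cost(sol)
            (improving if delta > 0 else others).append((delta, move, sol))
        if improving:
            # Largest improvement wins; the move that produced it becomes tabu.
            _delta, move, current = max(improving, key=lambda c: c[0])
            tabu.append(move)
        elif others:
            # Diversification: accept a random non-improving, non-tabu neighbor.
            _delta, _move, current = rng.choice(others)
        if cost(current) < cost(best):
            best = current
    return best


# Toy problem (hypothetical): pick a mode 1..3 for each of four tasks.
ENERGY = {1: 1.0, 2: 3.0, 3: 6.0}  # energy per task in each mode
TIME = {1: 3.0, 2: 1.5, 3: 1.0}    # execution time per task in each mode
DEADLINE, PENALTY = 8.0, 100.0


def toy_cost(modes):
    total_time = sum(TIME[m] for m in modes)
    return sum(ENERGY[m] for m in modes) + PENALTY * max(0.0, total_time - DEADLINE)


def toy_neighbors(modes, rng, size=6):
    result = []
    for _ in range(size):
        i = rng.randrange(len(modes))
        m = rng.choice([x for x in (1, 2, 3) if x != modes[i]])
        result.append(((i, m), modes[:i] + (m,) + modes[i + 1:]))
    return result


start = (3, 3, 3, 3)
best = tabu_search(start, toy_cost, toy_neighbors)
```

Using a bounded `deque` for the tabu list keeps exactly the last `tenure` improving moves, which mirrors the role of length_tl.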
Let us illustrate how TS works using the example in Fig. 3.4, where we have the
application and the architecture from Fig.3.1. In Fig. 3.4, the tasks are represented using
rectangles labeled with the task name. The length of the rectangle indicates the WCET
of the respective task while the height reflects the operating mode. We assume that
during the search, TS has reached the current solution Snow depicted in Fig. 3.4(a).
This solution meets both the timing and reliability constraints. The cost function value
(Eq.3.9) is equal to the energy value, 1126.51, i.e., the last two terms in Eq.3.9 are
zero. However, Snow is not the best solution so far. The cost of the best-so-far solution,
938.37, shown in the table of Fig. 3.4(a), is lower than the cost of Snow. In Fig. 3.4(a),
we also show the tabu list, where length_tl = 3 for this small example.
In Snow, tasks τ1, τ2, and τ3 are mapped on N1, while tasks τ′1, τ′2 and τ4 are mapped
on N2. Tasks τ1, τ2 and τ4 are assigned to the maximum operating mode, L3, while
tasks τ′1, τ′2 and τ3 are assigned to L2. Fig. 3.4(b)–(e) present four neighbors of Snow
obtained by applying M-moves and L-moves to Snow. These neighbors, together with
other neighboring solutions, are placed in the candidate set C.
The solution Snewb has been obtained from Snow by remapping τ4 from N2 to N1, changing
its mode in N1 to L2, and changing the mode of τ′2 in N2 to L1. We denote with a
thick border a task that has been involved in an M-move, and with a shadow a task that
has been involved in an L-move. Under each of the neighbors in Fig. 3.4(b)–(e), we also
present the cost of that solution and its energy consumption. Snewb is schedulable and
reduces the energy compared to Snow. However, Snewb does not meet the reliability goal,
and hence the second term in Eq.3.9 is penalized (we have used WR = 10^7). Hence, the
cost of Snewb is 4103.76, which is much larger than the cost of Snow.
Figure 3.4: Moves in Tabu Search
The solution Snewc in Fig. 3.4(c) has been obtained from Snow by swapping the PEs of
τ2 and τ′1 and by changing the operating modes of τ2 and τ′1 to L2. In Fig. 3.4(e),
we change the mode of τ3 to L3 and τ′2 to L1, compared to Snow. The solution Snewd in
Fig. 3.4(d) is not schedulable, hence we penalized the third term in Eq.3.9 (with WS =
200), resulting in a cost of 9544.30.
We have to select a new solution Snew from which to continue the search. The solution
Snewc in Fig. 3.4(c) has the best cost function among the neighbors considered, and also
improves on Snow. However, the move “swap τ2 and τ′1” which was involved in creating
this solution is tabu, i.e., it is on the tabu list in Fig. 3.4(a). The tabu status of Snewc
cannot be aspirated since the cost of Snewc is not better than the best-so-far cost of 938.37.
Solutions Snewb and Snewd do not improve the cost function because they are penalized.
Solution Snewe from Fig. 3.4(e) improves on the cost function of Snow and it is not tabu.
Hence, Snewe is selected as the new solution Snow and the exploration continues from
this solution.
3.6 Online Scheduling
As mentioned, to guarantee the schedulability and reliability constraints, the offline
synthesis algorithm presented in the previous section has to assume that the tasks will
execute up to their WCET and all the replicas have to be executed for fault tolerance.
However, during runtime, tasks will execute for a small fraction of their WCET [EY97]
and in the case the critical tasks do not experience faults, we can avoid starting (or we
could terminate) the replicas. In this section, we propose an online scheduling approach
which uses the reclaimed slack at runtime to further reduce the energy consumption by
scaling down the operating voltage and frequency.
The area of real-time scheduling for DVFS multiprocessors has received a lot of at-
tention, and there are many online DVFS algorithms proposed in the literature (see
[CK07] for a survey). These algorithms have been proposed for both uniprocessor
and multiprocessor systems. Several scheduling policies have been targeted, such as
aperiodic scheduling, and periodic preemptive scheduling such as Rate Monotonic
and Earliest Deadline First, and periodic non-preemptive scheduling, such as Static
cyclic Scheduling. Energy efficiency can be achieved through strategies such as slack
reclamation [Gru01, AMMMA01, PS01, CYK06, WMWZ12], DVS with power-down
[ISG03,QNHM04], by using stochastic information on the expected workload [GK03,
LS04], or by taking into account the leakage current [CHK06].
In general, there are two kinds of slack reclamation strategies. One is collecting all
the possible slack, and planning for all the jobs that have not yet started to execute. We
denote it as “global” slack reclamation. The other is collecting only the currently available
slack, and planning for the current job that is going to execute. We call it “greedy”
slack reclamation.
The multiprocessor strategies decide at runtime not only the operating mode for each
task, but also the task mapping to PEs. In our online scheduling, we consider the task
mapping fixed, as determined at design time by our offline synthesis strategy, and we
are interested online in deciding only the operating mode for executing the jobs of tasks.
Our intention is not to propose a novel runtime algorithm. Instead, we are interested
in determining whether we can obtain further energy savings at runtime without negatively
impacting the imposed reliability requirement, by further lowering the operating voltage
and frequency. Hence, we have decided to use a greedy slack reclamation scheme as
the starting point of our online energy/reliability trade-off algorithm.
Our proposed Online Voltage and Frequency Scaling (OVFS) algorithm is presented
in Algorithm 2. As mentioned, we use a fixed-priority preemptive scheduling policy.
Thus the scheduler in each PE will select the job Ji j with the highest priority on that
PE to be executed. OVFS is called before a job Ji j starts to execute and returns the
operating mode ` for executing the job Ji j.
The slack management strategy used in OVFS is similar at its core to the greedy
slack distribution from [Gru01] and the D-TDVS presented in [WMWZ12]. Both of
these employ slack levels, such that a task of a certain priority generates slack at that
level (if it finishes earlier than its WCET) but consumes slack produced by higher priority
tasks (starting with the highest priority slack). Furthermore, idle times are also
consuming the slack in a similar way, thus unused slack degrades and finally disap-
pears. So far the approach is a standard DVS online method that keeps the response
time guarantees computed by the offline analysis. However, to take into account relia-
bility, two improvements are added. First, when a task with replicas finishes success-
fully, the replicas will be also finished (some even before starting), generating a certain
amount of slack at their respective level. Second, the operating speed for the newly
scheduled task is limited not only by its deadline, but also by the reliability goal, which
means that tasks may actually be required to execute slightly faster than in a classic
DVS approach.
Line 2 in Algorithm 2 computes the slack $H_i$ available to a task with priority i, which is
about to start executing. This slack may be partially or completely used up by choosing
a lower operating speed for this task (lines 6-9), such that the WCET at the lower
speed ($C_i^{\ell}$) does not consume more than the available slack on top of the WCET $C_i^{MVFS}$
precomputed offline by MVFS. Furthermore, the reliability with the new mode, $\phi_S^{\ell-1}$,
should maintain the reliability goal $\phi_g$.
Algorithm 2 OVFS
1: before J_i starts to execute
2: calculate the available slack H_i which can be used for J_ij
3: retrieve the pre-computed C_i^MVFS
4: start from ℓ, the mode computed by MVFS
5: calculate φ_S^(ℓ−1) (the GSFR when J_i is executed in operating mode (ℓ−1))
6: while (H_i + C_i^MVFS) ≥ C_i^(ℓ−1) and φ_S^(ℓ−1) ≤ φ_g do
7:     ℓ = ℓ − 1
8:     retrieve the pre-computed C_i^(ℓ−1)
9:     calculate φ_S^(ℓ−1)
10: end while
11: return the operating mode ℓ to execute J_i
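The loop of Algorithm 2 can be sketched in a few lines. The function and argument names are hypothetical, the GSFR evaluation is passed in as a callable, and the explicit lower bound on the mode index is an addition of this sketch so that the loop cannot step below the slowest mode. The WCET values are hypothetical round numbers.

```python
def ovfs_select_mode(mode_mvfs, slack, wcet_by_mode, gsfr_for_mode, gsfr_goal,
                     lowest_mode=1):
    """Greedy mode selection for the next job.

    mode_mvfs:     mode chosen offline for this task
    slack:         slack available to the job
    wcet_by_mode:  WCET of the task for each mode index
    gsfr_for_mode: callable returning the GSFR if the job runs in a given mode
    """
    budget = slack + wcet_by_mode[mode_mvfs]
    mode = mode_mvfs
    while (mode > lowest_mode
           and budget >= wcet_by_mode[mode - 1]
           and gsfr_for_mode(mode - 1) <= gsfr_goal):
        mode -= 1
    return mode


# Hypothetical WCETs (ms) of one task in modes 3 (fastest) to 1 (slowest).
wcets = {3: 7.0, 2: 10.5, 1: 21.0}
# 5 ms of slack allows mode 2 but not mode 1; the GSFR check always passes here.
chosen = ovfs_select_mode(3, 5.0, wcets, lambda mode: 0.0, 1.0)
```

If the GSFR check fails for the next lower mode, the function keeps the offline mode even when enough slack is available, which is the difference from a purely timing-driven scheme.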
When lowering the operating mode for the next job, our DVFS algorithm does not only
guarantee the schedulability, but also guarantees that the imposed reliability constraint
is met. Recall that according to Eq.3.7, lowering the voltage and frequency will increase
the transient fault rate, consequently lowering the reliability RS of the system. The
assumption is that if RS becomes smaller than Rg, the system may need more replicas
for the critical tasks due to the increased transient faults. Thus, unless RS ≥ Rg, the
system is not fault-tolerant.
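The mode-lowering loop of Algorithm 2 can be sketched in Python as follows. This is a minimal illustration under stated assumptions, not the thesis implementation (which was written in C++): the function name `select_mode`, the zero-based mode indexing (index 0 is the slowest mode), the `gsfr_in_mode` callback and the extra `mode > 0` guard are all hypothetical, and the numbers are invented for the example.

```python
from typing import Callable, Sequence


def select_mode(slack: float,
                wcet_by_mode: Sequence[float],
                mvfs_mode: int,
                gsfr_in_mode: Callable[[int], float],
                gsfr_goal: float) -> int:
    """Pick the operating mode for the next job (index 0 = slowest mode).

    wcet_by_mode[m] is the pre-computed WCET of the job in mode m.
    gsfr_in_mode(m) returns the GSFR if the job runs in mode m
    (a hypothetical hook standing in for the GSFR calculation).
    """
    budget = slack + wcet_by_mode[mvfs_mode]  # H_i + C_i^MVFS
    mode = mvfs_mode                          # start from the MVFS mode
    # Step down while a lower mode exists, its WCET fits the time budget,
    # and the resulting GSFR stays at or below the goal.
    while (mode > 0
           and wcet_by_mode[mode - 1] <= budget
           and gsfr_in_mode(mode - 1) <= gsfr_goal):
        mode -= 1
    return mode


# Illustrative numbers only (not taken from the thesis).
wcets = [30.0, 20.0, 15.0]
gsfr = {0: 6.0e-6, 1: 4.0e-6, 2: 3.0e-6}
chosen = select_mode(20.0, wcets, 2, lambda m: gsfr[m], 4.65e-6)
print(chosen)  # 1: timing would allow mode 0, but its GSFR exceeds the goal
```

In this example the slack is large enough to reach the slowest mode, yet the reliability check stops the descent one mode earlier, which is the behaviour that distinguishes the approach from a classic DVS scheme.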
During our proposed offline synthesis approach MVFS, the reliability goal is enforced
by checking the reliability RS of every S (the second term in the cost function from
Eq. 3.9) over the mission duration TS. However, at runtime, we cannot use the system
reliability RS as calculated by Eq. 3.4. Since reliability is a function of time, it can
lead to a “so far so good” situation where the operating mode is lowered aggressively
initially for all jobs because we are well above the goal Rg. A similar situation is
shown in [AGK11], where using the reliability as a metric would result in too much
replication for the tasks placed towards the end of a schedule and no replication for
the initial tasks. Therefore, for the purpose of our online scheduling approach, similar
to [AGK11], we use a “reliability per time unit” metric, the Global System Failure Rate
(GSFR),

$$\phi_S = -\frac{\log(R_S)}{U_S}$$

where log denotes the natural logarithm, $U_S$ is the sum of the execution times of all
the jobs executed so far, including the current job $J_{ij}$ to be scheduled, considering the
pre-calculated operating mode $\Lambda^{\ell}$, and $R_S$ is the reliability at the time when job $J_{ij}$ will
finish. Calculation examples of $R_S$ will be given later for the cases shown in Fig. 3.5.
Figure 3.5: OVFS example
To enforce the reliability goal Rg, we convert it to a GSFR goal $\phi_g$ and compare it to the
calculated GSFR $\phi_S$. We define $\phi_g$ as $-\log(R_g)/U$, where Rg is the reliability over the
system cycle Tcycle and U is the sum of the WCETs of all jobs within Tcycle. We consider
that the system satisfies the reliability requirement at runtime if the current $\phi_S \le \phi_g$. Note
that a smaller GSFR means higher reliability. In Algorithm 2, this reliability check is
the condition $\phi_S^{\ell-1} \le \phi_g$ of the while loop in line 6, using the value calculated in
lines 5 and 9. For instance, $\phi_g = 4.65 \times 10^{-6}$ for our motivational example of Fig. 3.1.
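As a small worked sketch of the GSFR check, the following Python functions compute $-\log(R)/U$ and compare a current value against the goal. The function names `gsfr` and `meets_goal` are hypothetical, and the numeric inputs are purely illustrative; the thesis does not list the values of U or $U_S$ for its example, so none are assumed here.

```python
import math


def gsfr(reliability: float, busy_time: float) -> float:
    """Global System Failure Rate: -ln(R) / U (smaller means more reliable)."""
    if not 0.0 < reliability <= 1.0:
        raise ValueError("reliability must be in (0, 1]")
    if busy_time <= 0.0:
        raise ValueError("busy_time must be positive")
    return -math.log(reliability) / busy_time


def meets_goal(r_s: float, u_s: float, r_goal: float, u_cycle: float) -> bool:
    """True if the current GSFR does not exceed the GSFR goal."""
    return gsfr(r_s, u_s) <= gsfr(r_goal, u_cycle)


# Illustrative values only: a reliability of exp(-0.001) over 100 time units
# corresponds to a failure rate of about 1e-5 per time unit.
print(gsfr(math.exp(-1e-3), 100.0))
```

Because the metric is normalised by the accumulated execution time, two systems with the same reliability but different amounts of completed work are rated differently, which is what avoids the "so far so good" effect.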
Let us illustrate how OVFS works in Fig. 3.5 by using the example from Fig. 3.3(b),
where we have the application and architecture from Fig. 3.1, and we start with the
offline solution produced by MVFS in Fig. 3.2(b). Let us assume that we have reached
the point where we have to decide the mode $\ell$ for executing $J_{32}$ at t = 50. The calculation
of $R_S$ for $J_{32}$ is expressed as

$$R_{32}^{\ell} = \prod_{J_{ij} \in t_{32}} R_{ij}^{\ell}$$

where $J_{ij} \in t_{32}$ means we consider all jobs that have finished or started by the time $J_{32}$
will finish. For the finished jobs, we use their actual execution times, while for the
started but not yet finished jobs, we use their WCETs. Thus,
$R_{32}^{1} = R_{11}^{1} \cdot R_{21}^{1} \cdot R_{12}^{1} \cdot R_{31}^{2} \cdot R_{41}^{2} \cdot R_{32}^{1} = 0.999469$
for the case where $J_{32}$ is executed in mode L1 (Fig. 3.5(a)), while
$R_{32}^{2} = R_{11}^{1} \cdot R_{21}^{1} \cdot R_{12}^{1} \cdot R_{31}^{2} \cdot R_{41}^{2} \cdot R_{32}^{2} = 0.999726$
for the case where $J_{32}$ is executed in mode L2 (Fig. 3.5(b)).
From the point of view of schedulability, the slack (hashed rectangles) is enough
to scale down $J_{32}$ to L1, as we can see in Fig. 3.5(a). However, in this case, $\phi_S^{a} > \phi_g$
($R_S^{a} = 0.999469 < R_g$), i.e., the reliability goal would be missed. Hence, we keep
running $J_{32}$ in the offline-calculated mode L2, as shown in Fig. 3.5(b), to preserve the
system reliability.
Table 3.1: Three types of PEs

        Fast PE                      Medium PE                    Slow PE
Freq.   Volt.   Power        Freq.   Volt.   Power        Freq.   Volt.   Power
[MHz]   [V]     [W]          [MHz]   [V]     [W]          [MHz]   [V]     [W]
500     1.2     9.2          300     1.4     4.3          133     1.1     1.36
600     1.25    12           350     1.5     5.6          166     1.2     1.9
700     1.3     15.1         400     1.6     7.1          200     1.3     2.58
800     1.35    18.6         450     1.7     8.95         233     1.4     3.4
1000    1.4     25           500     1.8     11.4         266     1.5     4.4
α = −6.5, β = −0.039         α = −8.9, β = −0.038         α = −6.5, β = −0.039
3.7 Experimental Results
We have implemented our proposed algorithms in C++ and run them on an Intel Core
i7 CPU 920 (2.67 GHz, 4 GB RAM, and Windows 7) computer. The offline synthe-
sis MVFS is evaluated in Section 3.7.1 and the online scheduling OVFS is evaluated
in Section 3.7.2. For evaluation, we used ten synthetic benchmarks and five real-life
case studies. In all benchmarks, a quarter of tasks were considered critical and for
each critical task we introduced one replica. We have used only the “sample” type of
communication for messages, see Section 3.2. We used three types of PEs, described in
Table 3.1. The minimum failure rate (when the system runs in the maximum speed mode)
of all PEs was set to $\lambda_0 = 10^{-7}$ and the shape parameter µh was set to 1.
For comparison purposes, we have derived a baseline offline solution S0. In S0, all
jobs execute in the maximum speed operating mode, and the mapping is optimized in
terms of energy consumption, under schedulability and reliability goals. This is the
same baseline solution used in the motivational example. For the experiments, S0 is
determined by performing very long runs with our proposed MVFS, where we use
only M-moves and we do not use L-moves (as mentioned, all tasks are set to the
highest operating mode). The resulting energy consumption and reliability in S0 are
denoted with E0 and R0, respectively. The reliability goal Rg is set such that we accept
a probability of failure which is ten times larger than that of S0, i.e., Rg = 1 − 10(1 − R0).
3.7.1 Offline Synthesis Evaluation
We tuned the Tabu Search parameters of MVFS such that no improvements were seen
for longer run times, thus leading to near optimal solutions. The terminating condition
for MVFS used in the experiments is a time limit. We have used time limits between 5
minutes and 5 hours, depending on the size of the benchmark.
In the first experiment we wanted to evaluate the quality of MVFS as the systems be-
come larger. Table 3.2 presents the experimental setup details and the results. We used
five synthetic benchmarks of 10 to 105 tasks (including replicated tasks) mapped on
architectures with 2 to 6 different types of PEs. The details of each test case are pre-
sented in columns 1–4. The results obtained with MVFS are presented in the last two
columns. The increase in the failure probability θ was calculated using Eq. 3.8 and
had to be smaller than or equal to 10 in order to meet the reliability goal Rg. Alongside
the MVFS results, we also present the results obtained with MVFS−. This is an
implementation which minimizes the energy, but without any concern for reliability (the
reliability constraint, the second term in the cost function of Eq. 3.9, is removed). As
we can see from Table 3.2, columns 5 and 6, minimizing the energy without considering
reliability leads to a dramatic increase in the probability of failure, which increases
more than 100 times in most cases. However, our MVFS approach is able to keep the
system reliability RS within the specified Rg, without a significant loss in energy
savings (last column) compared to the energy saved by MVFS− (column 6). The saved
energy percentages, denoted by E∆ in columns 6 and 8 of the table, are relative to the
energy E0 of the reference solution S0, i.e., E∆ = (E0 − ES)/E0.
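The quantities used in the tables can be illustrated with a few lines of Python. This is only a sketch: the reliability goal and the saved-energy formula are taken from the text, whereas the ratio form of the failure-probability increase θ is an assumption made here for illustration (Eq. 3.8 is not reproduced in this section), chosen because it is consistent with θ ≤ 10 being equivalent to RS ≥ Rg. All function names are hypothetical.

```python
def reliability_goal(r0: float, factor: float = 10.0) -> float:
    """R_g = 1 - factor * (1 - R_0): accept 'factor' times the baseline failure probability."""
    return 1.0 - factor * (1.0 - r0)


def failure_increase(r_s: float, r0: float) -> float:
    """Assumed form of theta: failure probability of S relative to the baseline S0."""
    return (1.0 - r_s) / (1.0 - r0)


def saved_energy_pct(e0: float, e_s: float) -> float:
    """E_delta = (E_0 - E_S) / E_0, expressed in percent."""
    return 100.0 * (e0 - e_s) / e0


# Illustrative values only (not taken from the thesis).
r0 = 0.9999
rg = reliability_goal(r0)
print(rg, failure_increase(rg, r0), saved_energy_pct(100.0, 75.0))
```

Under this assumed form, a solution sitting exactly at the reliability goal has θ = 10, which matches how the MVFS columns of the tables are read.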
In the second experiment we wanted to determine the ability of MVFS to find good
quality solutions as the utilization of the system increases. The experimental setup
and the results obtained are presented in Table 3.3. We used a synthetic benchmark
with 25 tasks (20 tasks and 5 replicated tasks) mapped on a heterogeneous architecture
with 3 different types of PEs. We varied the execution times resulting in five cases
corresponding to utilization (column 5), from 27.21% to 71.98%. More energy can be
saved in the less utilized systems since more slack could be used for lowering operating
voltage and frequency without missing the deadlines. As expected, using MVFS is
especially important where the energy saving potential is greater, because reliability is
significantly impaired without it.
For example, in test set 1 in Table 3.3, the probability of failure increases 198
times when MVFS− does not use the reliability constraint. Using MVFS, we keep the
reliability within Rg (i.e., θ ≤ 10) with a loss of only 29.57 − 25.00 = 4.57 percentage
points in energy savings. Trading off a small percentage of energy savings, our offline
MVFS approach is able to guarantee the reliability constraints.

Table 3.2: Offline synthesis: Synthetic benchmarks with different system sizes

Test   Numbers of                 MVFS−                MVFS
Set    PEs   Orig.   Repl.    θ         E∆         θ         E∆
             Tasks   Tasks    [times]   [%]        [times]   [%]
1      2     8       2        166       28.23      10        24.64
2      4     31      8        112       28.56      10        25.28
3      4     42      11       137       30.47      10        26.04
4      6     63      16       104       25.92      10        21.92
5      6     84      21       57        22.78      10        20.57

Table 3.3: Offline synthesis: Synthetic benchmarks with varying initial utilization

Test   Numbers of                 Initial    MVFS−                MVFS
Set    PEs   Orig.   Repl.    Util.      θ         E∆         θ         E∆
             Tasks   Tasks    [%]        [times]   [%]        [times]   [%]
1      3     20      5        27.21      198       29.57      10        25.00
2      3     20      5        41.00      121       26.55      8         23.02
3      3     20      5        52.73      101       25.26      9         21.05
4      3     20      5        61.56      72        22.94      10        20.09
5      3     20      5        71.98      7         12.76      7         12.76

Table 3.4: Offline synthesis: Real-life benchmarks

                       Numbers of                 MVFS−                MVFS
Benchmarks             PEs   Orig.   Repl.    θ         E∆         θ         E∆
                             Tasks   Tasks    [times]   [%]        [times]   [%]
networking-cords       2     13      3        141       28.01      10        20.49
auto-indust-cords      4     24      6        77        22.68      10        17.87
telecom-cords          4     30      8        129       28.16      9         19.57
above three together   6     67      17       64        15.26      10        13.86
Smart-phone            2     61      16       60        18.71      9         15.23
Finally, we evaluated MVFS on real-life case studies. Five benchmarks were selected
from the Embedded System Synthesis Benchmark Suite (E3S), version 0.9 [Dic08],
and Smart-Phone Benchmarks [SAHE04]2. The experimental setup details and the
results obtained are presented in Table 3.4. As can be seen, this evaluation confirms
the results obtained from the synthetic benchmarks. This means that by using MVFS
we are able to eliminate the negative impact of energy minimization on reliability with
minimal loss in energy savings.
3.7.2 Online Scheduling Evaluation
To evaluate our proposed OVFS, we have built a simulator which simulates the
execution of jobs, according to fixed-priority preemptive scheduling, and the fault
occurrences. We have used as input to our simulator the test sets of synthetic benchmarks
(Table 3.2) and real-life benchmarks (Table 3.4). We have considered a mission time
TS of 30 hours for each test set.
Footnote 2: We only used the applications of MP3 decoder, GSM decoder, JPEG encoder and JPEG decoder.
Table 3.5: Online scheduling: Synthetic benchmarks

Test   Numbers of                 OVFS−     OVFS
Set    PEs   Orig.   Repl.    E˜∆       E˜∆
             Tasks   Tasks    [%]       [%]
1      2     8       2        79.74     77.68
2      4     31      8        83.91     81.91
3      5     42      11       82.09     79.90
4      6     63      16       80.54     78.42
5      8     84      21       81.69     78.51
Tables 3.5 and 3.6 present the online simulation results of the synthetic and real-life
benchmarks, respectively. The details of each test case are presented in columns 1–4.
Similar to the evaluation of our offline approach MVFS, we compare two online
approaches, OVFS and OVFS−, in terms of their energy savings and the impact on
reliability. OVFS is our proposed online scheduling approach presented in Section
3.6. Unlike OVFS, OVFS− does not enforce the reliability goal φg when reducing the
operating voltage and frequency at runtime, i.e., it does not enforce φS ≤ φg.
For each simulation, we use as starting point the offline solution determined by MVFS
in the previous section. We report the average percentage of energy savings per
application cycle, denoted by E˜∆, for both OVFS− (column 5) and OVFS (column 6). E˜∆ is
obtained by comparing the average energy consumption per application cycle in the
online scheduling, $\tilde{E}_S^{online}$ (recall that we simulate a mission time TS = 30 hours), with
the energy E0 of S0, which has been evaluated over Tcycle in the offline synthesis.
As we can see from Table 3.5, by reclaiming slack at runtime, both online
implementations achieve significant energy savings compared to the offline reference
solution S0, i.e., roughly 78% to 84%. Compared to OVFS−, which does not impose φg as a
reliability constraint, OVFS reports slightly smaller energy savings (within 5%);
however, the reliability is preserved. With OVFS, the increase in failure probability
varies from one application cycle to the next, thus it is not reported in the tables.
The results show that, at runtime, it is possible to exploit the available slack for further
energy savings compared to the offline synthesis solution, without impairing
the reliability. This conclusion is also supported by the results obtained using the
real-life benchmarks, which are reported in Table 3.6.
Table 3.6: Online scheduling: Real-life benchmarks

                       Numbers of                 OVFS−     OVFS
Benchmarks             PEs   Orig.   Repl.    E˜∆       E˜∆
                             Tasks   Tasks    [%]       [%]
networking-cords       2     13      3        80.48     78.07
auto-indust-cords      4     24      6        77.54     75.45
telecom-cords          4     30      8        83.34     80.80
above three together   6     67      17       78.84     75.12
Smart-phone            2     61      16       76.40     72.72
3.8 Conclusions
In this paper we addressed the mapping, voltage and frequency scaling for fault-tolerant
hard real-time applications mapped on distributed embedded systems where tasks and
messages are scheduled using an event-driven scheduling policy. We captured the ef-
fect of voltage and frequency scaling on system reliability, and we showed that if the
supply voltage and the operating frequency are lowered to reduce energy consump-
tion, reliability is significantly reduced. That is why we have proposed both offline and
online approaches that can take reliability into account when performing voltage and
frequency scaling.
We have proposed an offline synthesis approach, MVFS, based on a Tabu Search
meta-heuristic, which decides the mapping and operating mode for each task such that the
energy is reduced and the schedulability and reliability constraints are satisfied. We
have also proposed an online scheduling approach, OVFS, which considers the map-
ping determined by MVFS and decides at runtime the operating mode for each job such
that the energy is further reduced. OVFS also guarantees the timing and reliability con-
straints.
As the experimental results show, our synthesis approaches are able to produce energy
efficient implementations which are both schedulable and fault-tolerant. By carefully
deciding the mapping, operating voltage and frequency of each task, we showed that it
is possible to eliminate the negative impact of voltage and frequency scaling on relia-
bility with minimal decrease in energy savings.
CHAPTER 4
Paper C: Design for Robustness and Flexibility of Real-time Distributed
Applications during the Early Design Phases

We are interested in mapping hard real-time applications on distributed
heterogeneous architectures. An application is modeled as a set of tasks, and we consider
a fixed-priority preemptive scheduling policy. We target the early design phases,
when decisions have a high impact on the subsequent implementation choices.
However, due to a lack of information, the early design phases are characterized
by uncertainties, e.g., in the worst-case execution times (wcets) or in the
functionality requirements. We model uncertainties in the wcets using the “percentile
method”. The uncertainties in the functionality requirements are captured using
“future scenarios”, which are task sets that model functionality likely to be added
in the future. In this context, we derive a mapping of tasks in the application,
such that the resulting implementation is both robust and flexible. Robust means
that the application has a high chance of being schedulable, considering the wcet
uncertainties, whereas a flexible mapping has a high chance to successfully
accommodate the future scenarios. We propose a Genetic Algorithm-based approach to
solve this optimization problem. We also show how this problem can be extended
to consider the architecture selection: deciding what hardware components to use
in the architecture. In this context, we consider the uncertainties related to hard-
ware component costs. Extensive experiments show the importance of taking into
account the uncertainties during the early design phases.
4.1 Introduction
There is a lot of research on embedded systems design [LP05], but very few researchers
have addressed the early design phases. The decisions taken during the early design
phases, e.g., architecture selection and task mapping, have a high impact on the sub-
sequent implementation choices [Axe06]. However, early phases are characterized by
many uncertainties, e.g., in terms of the hardware components available, functionality
that has to be implemented and attributes such as the worst-case execution time (wcet).
We address hard real-time applications, modeled as a set of tasks, where timing con-
straints are of utmost importance. We are interested in tackling uncertainties in the
functionality requirements and the wcets of tasks. We model the uncertainties in the
wcets using the “percentile method" [Axe05], which captures the wcet of a task by two
values, the 50th and the 90th percentile. These numbers are chosen by the designers
based on the best available information and their experience.
Today, most systems are engineered in an evolutionary fashion: introducing a new ver-
sion of an existing product, introducing new features—possibly as part of a planned
evolution of a product line, performing a design-iteration, etc. Functionality require-
ments often change during these iterations. For example, new tasks may be introduced
to update the existing functionality or add a new feature, etc. Several approaches to
generating realistic product scenarios and how these product scenarios should be pri-
oritized, are discussed in [BE06]. Thus, we capture the uncertainties in functionality
requirements using “future scenarios" [BE06], which are task sets that model function-
alities likely to be added in the future.
We initially assume that the hardware architecture is fixed, and we focus on decid-
ing the mapping of tasks to processing elements (PEs), such that the application is
schedulable, even considering the uncertainties. A straightforward solution to tackle
uncertainties is to over-design the system, e.g., to build a lot of spare capacity, but this
is often prohibitively expensive. As an alternative, researchers have proposed adaptive
systems [SPM09], which change at runtime their configuration (e.g., re-mapping tasks
using task migration) in response to changes in the requirements, execution times, envi-
ronment, etc. However, many hard real-time applications are safety critical, where
online task migration is not feasible, and an offline reconfiguration, e.g., task re-mapping,
may be very costly. Hence, we want to derive, early on, a mapping of tasks to PEs,
which is both robust and flexible. In our case, robust means that the application has a
high chance of being schedulable, considering the wcet uncertainties, whereas a flexi-
ble mapping has a high chance to successfully accommodate future scenarios.
For time-triggered systems using static-cyclic scheduling, researchers have proposed
approaches for the synthesis of flexible schedules, which can accommodate future
changes. In [PIEP09], the future applications are captured using a set of possible wcets
and their probability distributions, and the flexibility is measured as the likelihood of
successfully adding new functionality in the future. In [ZCP+05], the flexibility is de-
fined as both “extensibility”, i.e., the maximum increase in the wcet of a task that can be
handled without rescheduling, and “scalability”, i.e., the maximum wcet of a new task
that a schedule can accommodate without change. The work in [GZDNBH10] uses
the info-gap decision theory to synthesize robust FlexRay bus schedules, considering
uncertainties in design parameters, such as the size of a message.
In this paper we consider that the tasks are scheduled using fixed-priority preemp-
tive scheduling (fpps). In this context, robustness has been addressed using “sen-
sitivity analysis" [RHE06], where the attributes of an implementation, e.g., worst-
case response times, are evaluated against changes in system properties, e.g., wcets.
In [Axe05], the mapping is fixed, and the uncertainties in the wcets are captured using
the “percentile method". The author proposes a Monte Carlo simulation approach to
evaluate the likelihood that tasks will meet their deadlines. Other researchers [LGT10]
quantify the robustness in terms of a “revision cost”, and they aim to provide robust
designs, which minimize the revision cost.
Regarding flexibility, [PEP02] proposes a mapping technique to increase the chance of
successfully accommodating future applications. The future applications are captured
similar to [PIEP09], using probabilities for wcet values. The approach in [HTRE02]
defines flexibility in terms of the amount of functionality that the design is able to im-
plement, and proposes a flexibility vs. cost trade-off model to search for Pareto-optimal
solutions. In [BE06], researchers model the uncertainty in functionality requirements
using scenarios, i.e., prioritized task sets, and derive an architecture and a mapping of
tasks such that the flexibility is maximized. The flexibility is defined as the likelihood
of all tasks being schedulable when adding a scenario to the existing mapping.
In this paper we are initially interested in deriving a robust and flexible mapping
solution during the early design phases, for a given architecture and taking into account the
uncertainties. The problem is defined in Section 4.3, and the uncertainty models are
presented in Section 4.2. Section 4.5 presents a motivational example which shows the
importance of considering the uncertainties. In order to address both robustness and
flexibility simultaneously, we propose a Genetic Algorithm-based approach to search
for a Pareto-front of solutions (Section 4.6). The evaluation of the proposed approach
is presented in Section 4.7. In Section 4.8, we extend the problem to consider also the
architecture selection. We are interested in determining the hardware architecture and
mapping which maximize the robustness and flexibility as defined earlier. In addition,
the derived architecture should have a high chance of meeting an imposed cost budget. In
the last section, we draw the conclusions of our work.
4.2 System Model
In this paper, a system is composed of software applications, modeled as a set of tasks,
and a hardware architecture consisting of a set N of PEs, interconnected by a com-
munication channel. The mapping of a task τi to a PE N j is captured by a mapping
function: M (τi) = N j. This mapping is not yet known and will be decided by our pro-
posed approach. For each task τi, we assume that the period Ti and deadline Di are
known and given such that Di ≤ Ti.
Tasks communicate using messages. We know the size smi and the priority of each mes-
sage mi. Our model captures two types of communication: sampling and queuing. A
sampling communication has a buffer storage for a single message; arriving messages
overwrite the buffer, and reading does not remove the message, i.e., it can be read re-
peatedly. A queuing communication uses a buffer that can store several messages, and
works as a FIFO queue. A reader task will block if the buffer is empty and a writer task
will block if the buffer is full. We assume that the buffer sizes have been determined
such that there is no overflow or underflow [MEP06].
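The difference between the two communication types can be sketched with two small Python classes. This is an illustrative sketch only: the class and method names are hypothetical, and the queuing buffer uses non-blocking `try_write`/`try_read` calls that return a failure flag at the points where a real writer or reader task would block.

```python
from collections import deque
from typing import Any, Deque, Optional, Tuple


class SamplingBuffer:
    """Single-slot buffer: a write overwrites the slot, a read does not consume it."""

    def __init__(self) -> None:
        self._slot: Optional[Any] = None

    def write(self, msg: Any) -> None:
        self._slot = msg          # the previous message is lost

    def read(self) -> Optional[Any]:
        return self._slot         # the message stays and can be read again


class QueuingBuffer:
    """Bounded FIFO buffer; try_* return False where a real task would block."""

    def __init__(self, capacity: int) -> None:
        self._capacity = capacity
        self._items: Deque[Any] = deque()

    def try_write(self, msg: Any) -> bool:
        if len(self._items) >= self._capacity:
            return False          # buffer full: the writer would block
        self._items.append(msg)
        return True

    def try_read(self) -> Tuple[bool, Optional[Any]]:
        if not self._items:
            return False, None    # buffer empty: the reader would block
        return True, self._items.popleft()


s = SamplingBuffer()
s.write("m1")
s.write("m2")
print(s.read(), s.read())         # the latest message, readable repeatedly

q = QueuingBuffer(capacity=2)
print(q.try_write("m1"), q.try_write("m2"), q.try_write("m3"))
print(q.try_read())
```

The sampling buffer never blocks but may lose messages, whereas the queuing buffer preserves every message and its order at the price of blocking, which is why its size has to be dimensioned beforehand.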
Tasks are scheduled using fixed-priority preemptive scheduling. Researchers have
shown [TSPS12a] how realistic protocols such as TTEthernet can be taken into account
during the analysis and synthesis. However, for simplicity, in this paper we consider
that the PEs are interconnected by a shared bus, which uses a fixed-priority dynamic
scheduling scheme and assumes that the messages are non-preemptible. This is similar
to how a widespread bus protocol such as the Controller Area Network [Bos91] works.
4.2.1 Early Design Stages
There are many life cycle models used in the industry [Est07], such as waterfall mod-
els [Boe88] and V-models [FHKS09], but in principle they all have the following main
stages: concept development, engineering development and post-development, see the
top part of Fig. 4.1 [KSSB11]. During concept development, system engineers perform
a needs analysis, do concept exploration and definition. Engineering development
consists of advanced development, engineering design, integration and evaluation. Then,
in post-development, the mass-production from the prototype starts, followed by the
operation and maintenance of the system.
Figure 4.1: Life Cycles of Systems Engineering
Experience from completed system design and development projects has indicated that
the cost spent in the concept development stage accounts for about 20% of the total
cost. However, 80% of the cumulative cost of the system is already committed in this
stage [Bue11]. Decisions made in the early design stages not only have a high impact
on the subsequent implementation choices, but can also have a substantial negative impact
on the total cost of the system, since it is costly and time-consuming to change a
chosen design to an alternative one in the engineering development
stage, i.e., early decisions offer a high cost-effectiveness impact opportunity.
The bottom part of Fig. 4.1 [SR11] shows the projected committed cumulative cost, and
the impact opportunity of a design decision over time. The conclusion is that we should
spend more effort on the early decisions, in order to increase the chance of completing
the projects on time and successfully.
There is a lot of research on embedded systems design [Mar11,GAGS09], but very few
researchers have addressed the early design stages. The challenge is that early design
stages are characterized by many uncertainties.
Uncertainty in the context of systems engineering refers to the inability to determine
precisely the state or attributes of a system. It can be caused by incomplete knowledge
or by stochastic variability. A detailed discussion on the taxonomy of uncertainty is
available in [Hai11].
Uncertainties in the early design stages need to be quantified, such that risks, defined
as the probability of not reaching design targets, of different design alternatives can be
estimated for making early design decisions. The next two subsections present how we
model the uncertainties in the wcet and the functionality, respectively.
4.2.2 Modeling WCET Uncertainties
In this paper, we are not interested in modeling the variability of the execution time
ei of a task τi, but in modeling the uncertainties of the wcet ci. The variability of ei is
typically captured by a probability mass function [KDB+05] and is due to, for example,
the variations in the input data of tasks or the speculative features of modern processors.
In our case, the uncertainties in a wcet ci come from the lack of information during
the early design stages, when, for example, the algorithm used to implement a task τi
or the parameters of the architecture are not yet known. Similar to [Axe05], we use
the “percentile method" to model the wcet uncertainties. Thus, we capture the wcet
of a task by two values, the 50th and 90th percentiles. This means that in 50% of the
possible future implementations, a task will have a wcet smaller than or equal to the 50th
percentile value, while in the majority of the cases, i.e., 90%, the wcet will not exceed
the 90th percentile value.
Table 4.1 presents an example in which four tasks, τ1 to τ4, can be mapped on two
PEs, N1 and N2. For each task τi and its mapping on each PE N j, two wcet values are
given, i.e., the 50th and 90th percentile, respectively. The more information is available
about a task and the PEs where it can potentially be mapped, the smaller the difference
between the two percentiles. A wcet can also be a fixed value, e.g., for legacy tasks
mapped on legacy PEs. In our example, τ4 is an updated task, thus designers believe
that its 90th percentile is only 20% larger than its 50th percentile. For τ3, this difference
is 50%. However, tasks τ1 and τ2 are new tasks to be introduced, hence designers have
lower confidence and thus their 90th percentiles are twice the 50th percentiles.
Table 4.1: Uncertainties in wcets

            N1             N2
Tasks   50th   90th    50th   90th    Ti = Di
τ1      10     20      15     30      50
τ2      25     50      37.5   75      100
τ3      40     60      60     90      150
τ4      60     72      90     108     300
We use a Gumbel distribution, with the cumulative distribution function (cdf) defined
as [KN00]:

$$P(c_i \le x) = e^{-e^{-\frac{x-\mu}{\beta}}}$$

where P is the probability that the wcet $c_i$ will have a value smaller than or equal to x.
Using the two percentile values, corresponding to 0.5 and 0.9 probability, we can
determine the distribution parameters µ and β, i.e., determine the cdf of the wcet. For
example, for τ1 mapped on N1 in Table 4.1, we have µ = 8.05 and β = 5.31. Knowing the
cdf for a task τi, we can also determine its probability density function (pdf) σi.
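The parameters µ and β follow from inverting the Gumbel cdf at the two percentiles: a quantile at probability p equals µ − β·ln(−ln p), so the two percentile values give two linear equations in µ and β. The Python sketch below solves them and reproduces the values quoted for τ1 on N1; the function names are hypothetical, and a task with a fixed wcet (equal percentiles) would give β = 0 and needs separate handling.

```python
import math

# Standardised Gumbel quantiles: x_p = mu + beta * K_p with K_p = -ln(-ln p).
K50 = -math.log(-math.log(0.5))
K90 = -math.log(-math.log(0.9))


def fit_gumbel(p50: float, p90: float) -> tuple:
    """Return (mu, beta) such that the cdf is 0.5 at p50 and 0.9 at p90."""
    beta = (p90 - p50) / (K90 - K50)
    mu = p50 - beta * K50
    return mu, beta


def gumbel_cdf(x: float, mu: float, beta: float) -> float:
    """P(c <= x) = exp(-exp(-(x - mu) / beta))."""
    return math.exp(-math.exp(-(x - mu) / beta))


def gumbel_pdf(x: float, mu: float, beta: float) -> float:
    """Derivative of the cdf with respect to x."""
    z = (x - mu) / beta
    return math.exp(-(z + math.exp(-z))) / beta


mu, beta = fit_gumbel(10.0, 20.0)       # tau_1 on N1 from Table 4.1
print(round(mu, 2), round(beta, 2))     # 8.05 5.31
```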
4.2.3 Modeling Functionality Uncertainties
Additionally, product requirements often change during later design and development
phases. We use the approach from [BE06] to describe an application as one baseline
functionality, S0, and a set of future scenarios, S f = {S1, S2, . . .}. Each future scenario
is associated with a weight wi, which reflects the probability of this future scenario becoming
a reality.
Let us consider the example in Table 4.2, which includes one baseline functionality
S0 (tasks τ1 to τ4 from Table 4.1), and four future scenarios S1 to S4. We consider
an architecture of two PEs, N1 and N2 (the same as in Table 4.1). S1 replaces τ1
with τ5 due to a functionality update. S2 introduces τ6 for enhancing the application
performance. In S3, a new application is added, which is modeled by τ7 and τ8. S4 is
the combination of S1 and S2, which captures the case when both S1 and S2 happen at
the same time. Next to each scenario Si, we also specify its weight wi. These scenarios,
and their associated weights, are determined using the methods presented in [BE06].

Table 4.2: Modeling uncertainties in functionality requirements

            N1             N2
Tasks   50th   90th    50th   90th    Ti = Di
S0: Example in Table 4.1
S1 = (S0 \ {τ1}) ∪ {τ5} (w1 = 0.8)
τ5      15     30      22.5   45      50
S2 = S0 ∪ {τ6} (w2 = 0.4)
τ6      25     50      37.5   75      100
S3 = S0 ∪ {τ7} ∪ {τ8} (w3 = 0.6)
τ7      30     60      45     90      300
τ8      50     75      75     102.5   300
S4 = (S0 \ {τ1}) ∪ {τ5} ∪ {τ6} (w4 = 0.2)
τ5      15     30      22.5   45      50
τ6      25     50      37.5   75      100
4.3 Problem Formulation
As an input to our problem, we have the hardware architecture N , the baseline func-
tionality S0 and the set of future scenarios S f . For each task, we know the two wcet
percentile values, on every PE where it is considered for mapping.
We are interested in determining the mapping M0 of the baseline functionality S0 on
the given architecture N, such that the robustness and flexibility of M0 are maximized.
These two metrics will be formally defined in the next subsection. A mapping M0 is
robust if the tasks in S0 have a high chance of being schedulable. M0 is flexible if it
has a high chance to successfully accommodate the future scenarios from S f , such that
they are also likely to be schedulable. This is a two-objective optimization problem
(robustness and flexibility). Our optimization strategy, presented in Section 4.6, will
produce a Pareto-front of solutions.
We target safety-critical hard real-time applications, so we consider that M0 is fixed
when adding a future scenario. Our optimization strategy will produce the mappings
Mi of future scenarios Si, as a byproduct of evaluating the flexibility of M0. In the later
design stages, when a scenario Si has become a reality, we use our proposed mapping
optimization strategy to decide the mapping Mi of Si, while keeping the mapping of
tasks in S0, decided during the early design phases, fixed.
4.3.1 Robustness and Flexibility
The degree of schedulability rM of a mapping M is defined as:
\[
r_M = \begin{cases}
\ell_1 = \sum_i \max(0,\, r_i - D_i) & \text{if } \ell_1 > 0 \\
\ell_2 = \sum_i (r_i - D_i) & \text{if } \ell_1 = 0
\end{cases}
\tag{4.1}
\]
where ri is the worst-case response time (wcrt) of a task τi and Di is its deadline. If
a mapping is not schedulable, there exists at least one ri greater than the deadline Di,
therefore the term ℓ1 of the function will be positive. In this case rM is equal to ℓ1.
However, if a mapping is schedulable, then no ri exceeds its corresponding
deadline Di. In that case ℓ1 = 0 and we use ℓ2 as rM, to be able to differentiate
between two mapping alternatives, both leading to feasible schedules. rM ≤ 0 means
the mapping is schedulable. A more negative value of rM indicates the mapping is
“more schedulable”, i.e., the wcrts are smaller.
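The two cases of Eq. 4.1 can be sketched in a few lines of Python. This is only an illustration: the function name and the list-based inputs are our own assumptions, and the wcrts are taken as already computed.

```python
def degree_of_schedulability(wcrts, deadlines):
    """Degree of schedulability r_M (Eq. 4.1) for paired lists of r_i and D_i."""
    # l1: total amount by which deadlines are missed (0 if none is missed).
    overshoot = sum(max(0.0, r - d) for r, d in zip(wcrts, deadlines))
    if overshoot > 0:
        return overshoot
    # l2: total (non-positive) slack, used to rank schedulable mappings.
    return sum(r - d for r, d in zip(wcrts, deadlines))
```

For instance, wcrts of 10 and 40 against deadlines of 50 and 100 give rM = −100, whereas a single missed deadline (60 against 50) gives rM = 10 regardless of the slack of the other tasks.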
Note that rM is a stochastic variable, since it is calculated based on wcrts ri, and each
ri is determined by the related wcets ci, see Section 4.4.
The robustness of a mapping Mk (k = 0,1, . . .), for the tasks in a task set Sk, is defined
as the probability of all tasks in Sk being schedulable,
RMk = P(rMk ≤ 0) (4.2)
where rMk is the degree of schedulability from Eq. 4.1.
Let us denote by Mi the mapping of the tasks in a future scenario Si ∈ S f, on top of the
mapping M0 of the baseline functionality S0, such that the robustness RMi is maximized.
The flexibility of M0 is the weighted average of the robustness values of the future
scenarios:
\[
F_{M_0} = \frac{\sum_{i=1}^{|S^f|} w_i \, R_{M_i}}{\sum_{i=1}^{|S^f|} w_i}
\tag{4.3}
\]
where wi is the weight of scenario Si, and |S f | is the number of future scenarios. To
calculate FM0 , we first need to determine the mapping Mi of each Si, such that RMi is
maximized (see Section 4.6.2).
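As an illustration of how the scenario weights enter the flexibility metric, the sketch below assumes that FM0 is the weight-normalized average of the robustness values RMi. The normalization by the sum of the weights is our assumption (it keeps FM0 in the range 0 to 1 even though the example weights sum to 2.0), and the function name is hypothetical.

```python
def flexibility(robustness_values, weights):
    """Weighted average of scenario robustness values R_Mi (assumed form of F_M0)."""
    if len(robustness_values) != len(weights):
        raise ValueError("one weight per future scenario is required")
    total_weight = sum(weights)
    if total_weight <= 0:
        raise ValueError("weights must sum to a positive value")
    weighted = sum(w * r for w, r in zip(weights, robustness_values))
    return weighted / total_weight
```

With the weights of Table 4.2 (0.8, 0.4, 0.6, 0.2), a scenario with a high weight such as S1 influences the result four times as much as S4.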
4.4 Schedulability Analysis
We are interested in determining the probability RMk of a mapping Mk being schedulable.
This value is used in both metrics, robustness (RM0 ) and flexibility (RMi , i = 1,2, . . .).
In this paper, we assume that tasks are scheduled using a fixed-priority preemptive
scheduling policy, and we use a Response Time Analysis (RTA) [Fid98] to determine
the wcrt ri of a task τi, according to the recurrence equation:
\[
r_i = c_i + \sum_{\tau_j \in hp(\tau_i)} \left\lceil \frac{r_i}{T_j} \right\rceil c_j
\tag{4.4}
\]
where hp(τi) is the set of tasks that have a priority higher than the priority of τi.
Figure 4.2: Calculating schedulability distribution with MCS
This basic analysis has been extended over the years [Fid98] to take into account
blocking times, arbitrary deadlines and release times, jitter, offsets, etc. Our analysis for
uncertain wcets uses an RTA inside an iterative loop. For simplicity, in this paper, we
have decided to consider the case when Di ≤ Ti and ignore the messages. The RTA is
orthogonal to our analysis, and can be extended to consider a more general case.
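The basic RTA recurrence for the case Di ≤ Ti is usually solved by fixed-point iteration. The sketch below is a minimal version under our own assumptions: tasks are hypothetical (c, T, D) tuples, all tasks passed in share one PE, and the iteration stops early once the deadline is exceeded.

```python
import math

def response_time(task, higher_priority):
    """Fixed-point iteration of the basic RTA recurrence on a single PE.

    task and each entry of higher_priority are (c, T, D) tuples. If the
    deadline is exceeded, the last iterate (> D) is returned unconverged.
    """
    c_i, _, d_i = task
    r = c_i
    while r <= d_i:
        # Interference from every higher-priority task released within r.
        r_next = c_i + sum(math.ceil(r / t_j) * c_j
                           for c_j, t_j, _ in higher_priority)
        if r_next == r:  # fixed point reached
            break
        r = r_next
    return r
```

For three illustrative tasks (1, 4, 4), (2, 6, 6) and (3, 12, 12), listed in decreasing priority, the iteration yields wcrts of 1, 3 and 10.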
In [KDB+05], a stochastic schedulability analysis is used to handle the variability in
ei. Each job Ji,j of a task τi may have a different execution time, depending on the
probability distribution function ξi of ei. Thus, for calculating ri, the updated response time
equation (Eq. 4.4) [KDB+05] uses stochastic variables for ci and cj. In each iteration of
the recurrence equation for ri, ci and cj will get different values, based on their pdfs
ξi and ξj, respectively. However, such a solution is not applicable in our case, where
each job Ji,j of a task τi has the same wcet ci in each iteration.
The analysis in [Axe05] uses Monte Carlo Simulation (MCS) to determine the
probability distribution of ri. With MCS, a large number of iterations are run, and the
following steps are performed. First, for each task τi, a value of ci is generated based
on its wcet pdf, σi. Second, the generated values of ci are used to determine the wcrt ri
of each task τi with Eq. 4.4. Then, the degree of schedulability rM is calculated using
Eq. 4.1. Finally, the rM values are collected over all iterations, and thus the degree of
schedulability cdf is obtained. This is illustrated in Fig. 4.2.
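The MCS steps above can be sketched as follows. Several points are assumptions made only for illustration: the wcet pdf is taken to be log-normal and fitted to the 50th and 90th percentiles, all tasks run on one PE, and tasks are hypothetical (p50, p90, T, D) tuples listed in decreasing priority.

```python
import math
import random
from statistics import NormalDist

Z90 = NormalDist().inv_cdf(0.9)  # standard normal quantile of the 90th percentile

def sample_wcet(p50, p90, rng):
    # Assumed log-normal pdf whose median is p50 and 90th percentile is p90.
    sigma = math.log(p90 / p50) / Z90
    return rng.lognormvariate(math.log(p50), sigma)

def wcrt(c, hp, deadline):
    # Basic RTA fixed point; hp is a list of (c_j, T_j) pairs.
    r = c
    while r <= deadline:
        nxt = c + sum(math.ceil(r / t) * cj for cj, t in hp)
        if nxt == r:
            break
        r = nxt
    return r

def mcs_robustness(tasks, iterations=10000, seed=0):
    """Estimate P(r_M <= 0) for tasks given as (p50, p90, T, D) tuples."""
    rng = random.Random(seed)
    schedulable = 0
    for _ in range(iterations):
        cs = [sample_wcet(p50, p90, rng) for p50, p90, _, _ in tasks]
        rs = [wcrt(cs[i], [(cs[j], tasks[j][2]) for j in range(i)], tasks[i][3])
              for i in range(len(tasks))]
        # r_M <= 0 holds exactly when no deadline is missed (Eq. 4.1).
        if all(r <= t[3] for r, t in zip(rs, tasks)):
            schedulable += 1
    return schedulable / iterations
```

Collecting the rM value of every iteration instead of only counting the schedulable ones would give the full cdf shown in Fig. 4.2.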
MCS requires a large number of iterations (e.g., 100,000) to get an accurate result,
which is time-consuming, and thus we cannot use MCS during design space explo-
ration. In this paper, instead of MCS, we propose using the Kernel Smoothing Density
Estimate (KSDE) technique [BA97] to quickly approximate the degree of schedulabil-
ity cdf.
Similar to MCS, we start by performing a number of iterations to get the degree of
schedulability values. However, we need fewer samples for KSDE (e.g., 1,000 instead
of 100,000 in MCS), since a smoothing technique is applied to estimate the pdf based
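on the available samples. Given m random samples X1, . . . ,Xm whose underlying
density f is to be estimated, the kernel density estimate is:
\[
\hat{f}(x) = \frac{1}{m h} \sum_{i=1}^{m} K\!\left(\frac{x - X_i}{h}\right)
\]
where K((x − Xi)/h) is the kernel and h (> 0) is the bandwidth. The bandwidth h is a
smoothing parameter, which controls how wide the probability mass is spread around a sample.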
We evaluated several kernels and bandwidths, and compared the results with those




















where Φ(|Xi − Φ(Xi)|)/0.6745 is a robust estimate for the standard deviation of the
distribution, and Φ(Xi) denotes the median of Xi.
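To make the KSDE step concrete, the sketch below estimates the cdf of the degree of schedulability from a small sample. It assumes a Gaussian kernel and a normal-reference bandwidth h = σ̂ (4/(3m))^(1/5), with σ̂ the median-based estimate described above; the kernel, the bandwidth rule and the function names are our assumptions for illustration.

```python
from statistics import NormalDist, median

def mad_sigma(samples):
    """Robust standard deviation estimate: median(|X - median(X)|) / 0.6745."""
    med = median(samples)
    return median([abs(x - med) for x in samples]) / 0.6745

def kde_cdf(samples, x, h=None):
    """Gaussian-kernel estimate of P(X <= x) from the given samples."""
    m = len(samples)
    if h is None:
        # Assumed normal-reference bandwidth rule.
        h = mad_sigma(samples) * (4.0 / (3.0 * m)) ** 0.2
    if h <= 0:
        raise ValueError("bandwidth must be positive")
    std = NormalDist()
    # Integrating the kernel density gives an average of kernel cdfs.
    return sum(std.cdf((x - xi) / h) for xi in samples) / m
```

The robustness of a mapping is then approximated by `kde_cdf(r_values, 0.0)`, where `r_values` holds the sampled rM values.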
Considering the two mappings, M and M′, from Fig. 4.3, both MCS and KSDE resulted
in RM = 93% and RM′ = 67%, i.e., the probability of the tasks in S0 being schedulable
is 93% (using M) and 67% (using M′). The difference is that MCS took 25 seconds
(using 100,000 samples), whereas KSDE finished in 0.5 seconds (using 1,000 samples).
Figure 4.3: Two mapping alternatives
We have used both MCS and KSDE to determine the robustness of 20 task sets mapped
on a varying number of PEs. The maximum difference between the two techniques is 3%.
Thus, we use KSDE as the basis for calculating the two objective functions during the
optimization. Note that the analysis presented in this section is only used to guide the
search, not to provide schedulability guarantees. We assume that an RTA will be used
during the later design and development stages (when more accurate information
about wcets and the functionality may be available) to check the schedulability of an
implementation.
4.5 Motivational Example
In the following, we show the importance of modeling and taking into account the
uncertainties in the early design phases. For comparison purposes, let us introduce
a “straightforward mapping” (SFM) approach, which does not take into account the
uncertainties. Thus, with SFM, we consider that each wcet is characterized by a single
value (the expected value of the wcet pdf σi, denoted by E(σi)), and the future scenarios
are ignored. SFM determines the mapping which minimizes E(rM), calculated
using the expected wcet values in Eq. 4.1.
Let us assume that SFM has to decide between two mappings M and M′ from Fig. 4.3,
during design space exploration. Using the expected wcet values, rM = −260, while
rM′ = −317, which means that M′ would be preferred. However, we reach a different
conclusion if we take into account the uncertainties in wcets and compare the two
mappings in terms of robustness. Fig. 4.4 presents the degree of schedulability cdfs
for the two mappings. The probability of the mapping being schedulable, P(rM ≤ 0),
is determined by the intersection of the cdf with the vertical line at point “0”. As
we can see, M has a better chance of being schedulable (93%) than M′ (67%), so
choosing M instead of M′ is actually more “robust”, i.e., it has a higher chance of being
schedulable.
Let us consider the baseline functionality from Table 4.1, and the future scenarios from
Table 4.2. We are interested in determining a mapping which maximizes robustness and
flexibility. The Pareto-optimal solutions, found after an exhaustive search, are depicted
by (blue) ‘×’ in Fig. 4.5. The rightmost ‘×’ is the most robust mapping, with 95.4%
robustness and 88.4% flexibility. The leftmost ‘×’ is the most flexible mapping, with a
robustness of 92.7% and a flexibility of 90.1%.
We have also plotted in Fig. 4.5 the optimal mapping obtained by SFM, using a ‘+’
symbol. The robustness and flexibility of this mapping have been calculated using
Eq. 4.2 and Eq. 4.3, respectively, taking into account the uncertainty model from

























Figure 4.4: Results of M and M′

















Figure 4.5: SFM and Pareto-optimal
Table 4.1 and Table 4.2. As we can see, SFM produces a poor-quality solution, with
only 85.5% robustness and 70.6% flexibility.
4.6 Mapping Optimization
We propose a Genetic Algorithm (GA)-based approach, called Mapping for Robustness
and Flexibility (MRF), to solve the optimization problem presented in Section 4.3. In
Figure 4.6: Outline of the MRF optimization
Fig. 4.6, we show the outline of MRF, and highlight the main steps of the GA. Each step in
Fig. 4.6 is presented in Section 4.6.1.
4.6.1 NSGA-II for Multiobjective Optimization
There are several off-the-shelf multiobjective GA implementations, such as NSGA-
II [DAPM00] and search frameworks for multiobjective optimization such as PISA [BLTZ03].
In this paper, we focus on determining the importance of modeling the uncertainties in
the early design stages, and thus we decided to use the Non-dominated Sorting Ge-
netic Algorithm-II (NSGA-II) [DAPM00], due to its good performance and its simple
implementation.
Figure 4.7: Chromosome encoding and design transformations
GA is a metaheuristic optimization approach, which belongs to the class of
Evolutionary Algorithms, inspired by the process of natural evolution. The set of candidate
solutions is called a “population”, and each solution is (i) encoded using a string called
a “chromosome”. The population is (ii) initialized to n candidate solutions, where n
is the population size. The population is evolved by (iii) selecting a set of solutions
and performing (iv) recombination and (v) mutation to generate offspring. Finally, the
parent population is (vi) replaced with an offspring population with better “fitness”.
The fitness of a solution is evaluated using our multiobjective function. Steps (iii) to
(vi) are repeated until a termination condition is reached.
Steps (i) to (vi) are explained in the remainder of this section. There are several choices
for their implementation. Based on experiments, we chose the following
approaches, which can find good solutions in a reasonable time. The parameters were
also determined experimentally. One example of parameters is given in Section 4.7.
(i) Encoding: We use direct-value encoding, where each chromosome represents a
mapping alternative, and each allele (the value of a gene, which is a component of a
chromosome) represents a PE. For example, the mapping alternatives M and M′ shown
in Fig. 4.3 are encoded as the chromosomes shown in the first column of Fig. 4.7. The ith
position in the chromosome is the index j of the PE Nj on which the task τi is considered
for mapping. Thus, the mapping of τ1 and τ2 to N1, and of τ3 and τ4 to N2, is described by the
string 1|1|2|2.
(ii) Initialization: The initial population is randomly generated and has
size n.
(iii) Selection: We use “tournament selection” to select parents for performing
recombination and mutation. In a tournament, four chromosomes are chosen at random, and
the fittest one wins. In total, 2(pc × n) parents are chosen for performing recombination,
while n − (pc × n) parents are chosen for performing mutation, where pc is the
probability of recombination.
(iv) Recombination (also called crossover): We employ a standard single-point crossover.
For each pair of parents, we compare a randomly generated number with pc. If this
number ≤ pc, the two parents are cut at a random point and the sections after the cut point
are swapped to generate the offspring. Otherwise, the offspring are just copies of
their parents. For example, let us assume that for the two chromosomes in the first
column of Fig. 4.7 it is decided to cross over at their second position. In this case, the
offspring are shown in the middle column of Fig. 4.7.
(v) Mutation is used to add diversity to a population obtained from recombination.
For each position of a parent’s string, we compare a randomly generated number with
pm (probability of mutation) and if this number ≤ pm, this position is mutated, i.e.,
the task is randomly remapped to another PE. In the last column of Fig. 4.7, the gene
highlighted by a bold box has been mutated.
(vi) Replacement: Recombination and mutation generate n offspring out of the n
parents in the current population. Replacement decides which n solutions are kept out of
the 2n solutions available. The key advantage of NSGA-II lies in how it performs
selection and replacement, with the goal of preserving diverse non-dominated solutions,
in the hope of finding the Pareto-optimal front. See [DAPM00] for the details on the
selection and replacement procedures used in NSGA-II.
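Steps (i) to (v) can be sketched as follows. This is a simplified illustration, not the NSGA-II implementation itself: the `fitness` argument is a hypothetical scalar stand-in for the rank and crowding comparison NSGA-II uses, and all function names are ours.

```python
import random

def random_chromosome(num_tasks, num_pes, rng):
    # Gene i holds the 1-based index of the PE that task i is mapped to.
    return [rng.randint(1, num_pes) for _ in range(num_tasks)]

def tournament(population, fitness, rng, size=4):
    # Pick `size` chromosomes at random; the fittest one wins.
    contenders = rng.sample(population, size)
    return max(contenders, key=fitness)

def crossover(p1, p2, pc, rng):
    # Single-point crossover, applied with probability pc.
    if len(p1) > 1 and rng.random() <= pc:
        cut = rng.randint(1, len(p1) - 1)
        return p1[:cut] + p2[cut:], p2[:cut] + p1[cut:]
    return p1[:], p2[:]

def mutate(chromosome, num_pes, pm, rng):
    # Each gene is remapped to a different PE with probability pm.
    child = chromosome[:]
    for i, gene in enumerate(child):
        if num_pes > 1 and rng.random() <= pm:
            child[i] = rng.choice([pe for pe in range(1, num_pes + 1) if pe != gene])
    return child
```

For example, with two PEs and pm = 1, mutating the chromosome 1|1|2|2 remaps every task and yields 2|2|1|1.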
The three rounded boxes in the right column of Fig. 4.6 describe the steps used for
evaluating the fitness of each chromosome (design alternative). Decoding the
chromosome means extracting the mapping information for each task. Then, we perform the
schedulability analysis, whose results are used in the evaluation step to calculate the
objective functions (robustness and flexibility in our case).
Steps (iii) to (vi) are repeated in each generation (highlighted by a blue box in Fig. 4.6)
until there is no improvement for a given number of consecutive generations, e.g., 10
generations. Our algorithm can also be stopped if a given time limit has been exceeded,
e.g., 1 hour. In the end, we obtain a Pareto-front of solutions, which, however, is
not guaranteed to contain the Pareto-optimal solutions, since NSGA-II is a search
metaheuristic which does not guarantee optimality.
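The notion of a Pareto-front used here can be illustrated with a short dominance filter. The sketch assumes that every objective is to be maximized and that each solution is a tuple of objective values; it is a plain quadratic filter, not the fast non-dominated sorting procedure of NSGA-II.

```python
def dominates(a, b):
    """True if a is at least as good as b in all objectives and better in one."""
    return (all(x >= y for x, y in zip(a, b))
            and any(x > y for x, y in zip(a, b)))

def pareto_front(points):
    """Keep only the points that no other point dominates."""
    return [p for p in points if not any(dominates(q, p) for q in points)]
```

Applied to the (robustness, flexibility) pairs of the motivational example, (0.954, 0.884), (0.927, 0.901) and the SFM solution (0.855, 0.706), the filter keeps the first two and discards the SFM solution, which both of them dominate.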
4.6.2 Determining the Mapping of Future Scenarios
When measuring the flexibility FM0 of a mapping M0 (performed in the “Evaluation”
box of Fig. 4.6), we need to determine the mapping of each future scenario,
Si (i = 1,2, . . .), and calculate its robustness RMi using Eq. 4.2. To get an accurate
flexibility value for FM0 , Mi should be as close as possible to the optimal, i.e., it should
have a maximum robustness value RMi . However, determining such optimal mappings
is time-consuming, and the evaluation of flexibility is performed when visiting every
M0 alternative. Hence, we propose a Greedy algorithm to determine the mapping Mi of
each future scenario Si. Note that we only have to map those tasks from Si which are
not present in the baseline functionality S0, i.e., they are new tasks. All the other tasks
will keep the same mapping as in M0. Thus, the new tasks in Si are sorted according
to their utilization, calculated using the expected wcets, i.e., ui = E(σi)/Ti. Then each
task is mapped on the PE with the lowest utilization, and the PE utilizations are updated
before moving to the next task.
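The Greedy algorithm can be sketched as below. The text leaves two details open, so the sketch makes assumptions that are ours only: tasks are sorted in decreasing order of utilization, and, since wcets differ between PEs, the sorting key is the utilization averaged over the candidate PEs. The data layout and names are hypothetical.

```python
def greedy_map(new_tasks, pe_load):
    """Map new tasks one by one onto the currently least-utilized PE.

    new_tasks: list of (name, period, {pe: expected_wcet}) tuples.
    pe_load:   {pe: utilization already caused by the fixed mapping M0}.
    """
    load = dict(pe_load)  # do not modify the caller's dictionary

    def avg_utilization(task):
        _, period, wcet = task
        return sum(wcet.values()) / (len(wcet) * period)

    mapping = {}
    for name, period, wcet in sorted(new_tasks, key=avg_utilization, reverse=True):
        pe = min(wcet, key=lambda p: load[p])  # least-loaded candidate PE
        mapping[name] = pe
        load[pe] += wcet[pe] / period  # update before the next task
    return mapping, load
```

For scenario S3, using the 50th percentile values of Table 4.2 as stand-ins for the expected wcets and assumed baseline utilizations of 0.5 (N1) and 0.3 (N2), τ8 is placed first on N2, after which τ7 goes to N1.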
To determine the quality of the proposed Greedy algorithm, we have also implemented
a GA-based mapping. The two algorithms have been evaluated using the synthetic
benchmark “synthetic 1” (see Section 4.7). The difference is only 4%, but GA is 35
times slower than the Greedy algorithm. Therefore, we have decided to use the Greedy
algorithm instead of GA for determining the mapping of future scenarios.
4.7 Experimental Results
To evaluate our proposed approach, we used four real-life benchmarks (Table 4.4)
from the Embedded System Synthesis Benchmark Suite (E3S), version 0.9 [Dic02],
and eight synthetic benchmarks (Table 4.3) generated using Task Graphs For Free
(TGFF) [DRW98]. The details of the benchmarks are reported in columns 1–4 of
the tables. For the synthetic benchmarks, wcets were generated in the range 30–70
ms. These values are taken as the 50th percentile wcets. The 90th percentile
wcets are generated to be up to 50% larger than the corresponding 50th percentile values.
For each benchmark, four future scenarios, S1 to S4, are considered. To create S1, we
have randomly selected 10% of the tasks in S0 and increased their 50th percentile by
20% (and correspondingly adjusted their 90th percentile). For S2, we have randomly
Table 4.3: Synthetic Benchmarks
                     Number of         SFM            MRF: most robust   MRF: most flexible
Test Set      PEs    Tasks             
                     S0     S f        RM0    FM0     RM0    FM0         RM0    FM0
synthetic 1   3      22     30         34%    17%     84%    56%         69%    64%
synthetic 2   3      22     30         75%    30%     96%    52%         87%    82%
synthetic 3   6      42     58         26%    22%     70%    23%         61%    63%
synthetic 4   6      42     58         77%    44%     95%    45%         89%    81%
synthetic 5   8      62     86         38%    13%     70%    23%         60%    33%
synthetic 6   8      62     86         81%    23%     97%    64%         93%    73%
synthetic 7   10     84     116        22%    6%      88%    18%         69%    27%
synthetic 8   10     84     116        74%    9%      88%    11%         81%    28%
Table 4.4: Real-life Benchmarks
                           Number of         SFM            MRF: most robust   MRF: most flexible
Test Set            PEs    Tasks
                           S0     S f        RM0    FM0     RM0    FM0         RM0    FM0
consumer-cords      2      12     16         67%    51%     96%    66%         93%    68%
networking-cords    2      13     17         75%    69%     89%    69%         80%    75%
auto-indust-cords   4      24     32         42%    12%     59%    38%         56%    41%
telecom-cords       4      30     42         46%    44%     97%    57%         91%    73%
introduced new functionality, which is about 10% of the size of S0. S3 is similar to S2,
but larger, about 20% of the size of S0. Finally, S4 is a combination of S1 and S2. The weights
of the four scenarios, S1 to S4, are 0.8, 0.4, 0.6 and 0.2, respectively.
In the first set of experiments (Table 4.3), we were interested in determining the
importance of capturing the uncertainties during the early design phases. We have varied
the size of the system from 22 tasks (S0) and 3 PEs to 84 tasks (S0) and 10 PEs and
applied two optimization approaches: MRF, presented in the previous section, which
takes into account both uncertainties, and SFM, introduced in Section 4.5, which uses
the expected values of wcets and ignores the future scenarios.
The robustness and flexibility of SFM are reported in columns 5 and 6 of Table 4.3.
MRF produces a Pareto-front of solutions. We show the resulting Pareto-fronts of
the synthetic benchmarks in Fig. 4.8 – Fig. 4.15, using (blue) “x” symbols, and report
the extremes of the Pareto-front of each synthetic benchmark, i.e., the most robust solution
(columns 7 and 8) and the most flexible solution (columns 9 and 10), in Table 4.3.
Both SFM and MRF are implemented in Matlab 2013a and run on an Intel Core i7
CPU 920 (2.67 GHz) computer. We tuned the NSGA-II parameters such that no
improvements were seen for longer run times, thus leading to near-optimal solutions. The
terminating condition for NSGA-II used in the experiments is a time limit. We have
used time limits between 20 minutes and 3 hours, depending on the size of the
benchmark. Taking “synthetic 7” and “synthetic 8” as examples, we set n = 100, pc = 0.5,
pm = 0.25 and the search terminates after 3 hours.
As we can see from Table 4.3, SFM is not able to find robust and flexible solutions,
whereas our MRF approach is able to find good-quality solutions, for which both
robustness and flexibility are significantly increased compared to SFM.
In the second set of experiments, we evaluated the MRF approach using four real-life
benchmarks from E3S. The experimental setup details and the results obtained are
presented in Table 4.4. We show the resulting Pareto-fronts of those real-life benchmarks
in Fig. 4.16 – Fig. 4.19, using (blue) “x” symbols, and report only the extremes in the
















Figure 4.8: synthetic 1














Figure 4.9: synthetic 2















Figure 4.10: synthetic 3















Figure 4.11: synthetic 4














Figure 4.12: synthetic 5














Figure 4.13: synthetic 6






























Figure 4.15: synthetic 8
Pareto-front of all real-life benchmarks in columns 7–10 of Table 4.4. The evaluation
confirms the results obtained from the synthetic benchmarks.





























































4.8 Architecture Selection under Uncertainties
So far, we have assumed that the hardware architecture is given, and we have focused
on the mapping optimization. In this section we show how the mapping problem can
be extended to consider the architecture selection under uncertainties. In our case,
architecture selection means deciding the hardware components, their characteristics
and their interconnections.
Embedded system products have to be regularly upgraded, in order to be commercially
successful in the current highly competitive market, especially in consumer
electronics. The most common upgrades are fixing bugs, introducing new functionality, and
improving performance. Both the applications and the architecture may be upgraded
such that the new system is able to achieve better performance, compared to the old
version. However, since a huge amount of time and effort (e.g., many years) has been invested
in developing and maintaining the previous system, reusing the legacy applications
and architecture as much as possible is a wise way to save costs. Moreover, the
experience gained from the previous system can be used to reduce the design risk.
In the early design stages of building a new system version, the choice between reusing legacy
architecture components and using or upgrading to new components has a significant
impact on the robustness, flexibility and architecture cost. If we choose to use
new hardware components, with improved metrics such as better performance or lower
power dissipation, we may need to redevelop and validate the platform, which results
in a high cost for the new architecture solution and high uncertainties in evaluating the
wcets of tasks. If we migrate the legacy hardware components from the previous
products, the cost of such an architecture solution should be much lower than in the
former case, and the task wcets are more certain as well, but we may not benefit from
the improved metrics of the new architecture.
As mentioned in Section 4.2.2, uncertainties in evaluating the wcet of a task may result
from the fact that the complete parameters of the architecture are not yet decided in
the early design stages. We have modeled the uncertainties in wcets and functionality
requirements, and we have used the KSDE method as the basis for determining the
schedulability probability of a mapping alternative. In this section, we extend our work
to consider the architecture selection and model the uncertainty in architecture cost
during the early design phases.
4.8.1 Modeling Cost Uncertainty in Architecture Selection
We consider distributed heterogeneous platforms where a set of PEs are interconnected
by shared communication channels. The platform may contain PEs such as a
General-Purpose Processor (GPP), a Digital Signal Processor (DSP) or an Application-Specific
Instruction-set Processor (ASIP). Some of these might be legacy components. We assume that the
engineer provides a library of hardware components, denoted by LH , which includes a
set of PEs and a set of buses, with different performance and cost.
The cost of the components in the library LH is also affected by uncertainties. If a new
processor has not yet been developed, its cost is not yet known. If there are supply
problems with one of the legacy components that have to be used, the cost could also
fluctuate. We model the uncertainty in the cost of an architecture component using the
percentile method [Axe05], as we have done with the wcets. The uncertainty in
performance is captured by the wcet model. In this context, the wcet uncertainty depends
also on the choice of the hardware component that will run the task, which is not yet
known.
An example hardware library LH , with the cost uncertainty model, is given in Ta-
ble 4.5. We capture the cost of a hardware component by two values, the 50th and
90th percentiles. This means that in 50% of the possible future implementations, the
selected hardware component will have a cost smaller or equal to the 50th percentile
value, while in the majority of the cases, i.e., 90%, the cost will not exceed the 90th
percentile value.
Table 4.5: Uncertainties in unit cost of hardware components
PEs    50th [€]   90th [€]      Buses   50th [€]   90th [€]
N1     30         45            B1      4          6
N2     20         30            B2      3          4.5
N3     35         105           B3      6          18
N4     25         75            B4      5          15
We calculate the unit cost of an architecture solution, wN , by accumulating the unit
cost of each hardware component in the architecture. Note that wN is a distribution of
values due to the variability in the unit cost of each component.
4.8.2 Architecture Selection Problem
The architecture selection and mapping problem can be formulated as follows. As
an input to our problem we have (i) a library LH of hardware components (PEs and
buses) with varying cost and performance, (ii) the baseline functionality S0 and the set
of future scenarios S f, and (iii) a cost budget wb. For each hardware component, we
model its cost using two percentile values, and for each task we know the two wcet
percentile values for each hardware component where it is considered for mapping.
We are interested in determining an implementation I consisting of the architecture N
(the set of hardware components and their interconnections) and the mapping M0 of the
baseline functionality S0 on the architecture N , such that the robustness and flexibility
of I are maximized, and the architecture has a high chance to have its unit cost within
the budget wb. The robustness and flexibility are defined in Section 4.3.1: robustness
means that the tasks in S0 have a high chance of being schedulable, and flexibility
means that I has a high chance to successfully accommodate the future scenarios S f .
The two objectives, i.e., robustness and flexibility, are formally defined as in Eq. 4.2 and
Eq. 4.3, respectively. The third objective CN is the probability that the cost wN of the
architecture N meets the imposed budget wb. Thus, we define CN = P(wN ≤ wb) as
the third optimization objective.
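The cost objective CN = P(wN ≤ wb) can be estimated by sampling, as sketched below. The shape of the cost distribution is an assumption made only for this illustration (log-normal, fitted to the 50th and 90th percentiles), so the values produced are not expected to match the ones reported in this chapter; the function names are hypothetical.

```python
import math
import random
from statistics import NormalDist

Z90 = NormalDist().inv_cdf(0.9)  # standard normal quantile of the 90th percentile

def sample_cost(p50, p90, rng):
    # Assumed log-normal cost with median p50 and 90th percentile p90.
    sigma = math.log(p90 / p50) / Z90
    return rng.lognormvariate(math.log(p50), sigma)

def cost_probability(components, budget, iterations=20000, seed=0):
    """Estimate C_N = P(w_N <= w_b) for components given as (p50, p90) pairs."""
    rng = random.Random(seed)
    within = 0
    for _ in range(iterations):
        # The unit cost w_N accumulates the unit cost of every component.
        total = sum(sample_cost(p50, p90, rng) for p50, p90 in components)
        if total <= budget:
            within += 1
    return within / iterations
```

An architecture built from N3, N4 and B3 of Table 4.5 would be passed as `[(35, 105), (25, 75), (6, 18)]` together with the budget wb.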
4.8.3 Architecture Selection: Motivational Example
Let us illustrate the problem presented in the previous section using the example from
Table 4.5 and Table 4.6. Table 4.5 presents the library LH of hardware components.
The cost is captured in € with the 50th and 90th percentiles. We assume that N1 and N2
Table 4.6: Uncertainties in wcets
         N1             N2             N3             N4
Tasks    50th   90th    50th   90th    50th   90th    50th    90th     Ti = Di
τ1       10     20      15     30      5      15      7.5     22.5     50
τ2       25     50      37.5   75      12.5   37.5    18.75   56.25    100
τ3       40     60      60     90      20     40      30      60       150
τ4       60     72      90     108     30     45      45      67.5     300
Table 4.7: Uncertainties in functionality requirements
         N1             N2              N3             N4
Tasks    50th   90th    50th   90th     50th   90th    50th    90th     Ti = Di
S0: Example in Table 4.6
S1 = (S0 \ τ1) ∪ τ5 (w1 = 0.8)
τ5       15     45      22.5   67.5     7.5    22.5    11.25   33.75    50
S2 = S0 ∪ τ6 (w2 = 0.4)
τ6       25     50      37.5   75       12.5   37.5    18.75   56.25    100
S3 = S0 ∪ τ7 ∪ τ8 (w3 = 0.6)
τ7       30     60      45     90       15     45      22.5    67.5     300
τ8       50     75      75     102.5    25     50      37.5    75       300
S4 = (S0 \ τ1) ∪ τ5 ∪ τ6 (w4 = 0.2)
τ5       15     45      22.5   67.5     7.5    22.5    11.25   33.75    50
τ6       25     50      37.5   75       12.5   37.5    18.75   56.25    100
are legacy PEs, while N3 and N4 are new PEs for which we do not yet have complete
information. We assume the cost budget of the architecture solution to be
wb = 100 €.
Table 4.6 presents the baseline functionality S0, in which four tasks, τ1 to τ4, can be mapped
on four PEs, N1 to N4. For each task τi and its mapping on each PE Nj, two wcet
values are given, i.e., the 50th and 90th percentile, respectively. Note that the more
information is available, or the more experience we have about a task and the PEs where it
can potentially be mapped, the smaller the difference between the two percentiles. The
future scenarios S f are captured by Table 4.7.
For comparison purposes, we introduce a “Straightforward Implementation” (SFI)
approach, which does not take into account the uncertainties due to the architecture
selection (wcet and cost) and changes in functionality requirements (future scenarios). Thus,
with SFI, we consider that the wcet of each task and the unit cost of each component
(PE and bus) are characterized by a single value (the expected value of their probability
density function σi, denoted by E(σi)), and the future scenarios are ignored.
Figure 4.20: Architecture selection and mapping optimization
SFI determines the implementation which maximizes the schedulability of the tasks
in S0, i.e., minimizes E(rI ), where rI is defined as in Eq. 4.1, and meets the cost
requirements, i.e., the unit cost of the selected architecture, wN , is within the budget
wb. The implementation I found by SFI is presented in Fig. 4.20(a), and consists of
two PEs, N3 and N4, interconnected by a bus B3. The mapping of S0 is also shown in
the figure: τ2 and τ4 are on N3, and τ1 and τ3 are on N4. The values of the objective
functions for this solution are E(rI) = −486 and wN = 80.77 €, which is below wb.
Note that expected values are used during the calculations, so CN is not yet
available for SFI.
We evaluate the solution produced by SFI using our proposed uncertainty models (see
Section 4.2.2, Section 4.2.3, and Section 4.8.1) that consider the uncertainties in wcet,
functionality requirements, and cost. We have plotted the SFI solution in Fig. 4.21, using a red
‘+’ symbol. Fig. 4.21 shows the three objective functions of robustness, flexibility and
cost in a three-dimensional space. We can see that the robustness of SFI is 0.5, while the
flexibility is 0.86 and the cost objective is 0.64, i.e., SFI has only a 50% chance to be
schedulable, an 86% chance to schedule the future scenarios, and a 64% chance to meet the
architecture budget wb.
Let us consider the baseline functionality from Table 4.6, the future scenarios from
Table 4.7, and the hardware library from Table 4.5. We are interested in determining
an implementation I which maximizes the robustness, the flexibility, and the chance of
meeting the architecture budget wb. Let us call this approach “Architecture selection
and Mapping for Robustness, Flexibility and Cost” (AMRFC). In contrast with SFI,
AMRFC takes into account the uncertainty models when deriving the architecture and
mapping of the implementation I . The Pareto-optimal solutions found after an
exhaustive search using AMRFC are depicted by blue ‘∗’ symbols in Fig. 4.21.
We can see from Fig. 4.21 that, by using the uncertainty models to take decisions on
the architecture and mapping, it is possible to obtain much better solutions, which
have been simultaneously optimized for robustness, flexibility and cost. One of the
Pareto-optimal solutions, marked as I1 in Fig. 4.21, has a 100% chance to be
schedulable, an 84% chance to accommodate the future scenarios, and a 77% chance to meet
Figure 4.21: SFI and Pareto-optimal
the architecture budget wb. Compared to SFI, I1 improves the robustness with 50%
percent, and increases with 13% the chance to meet the cost requirement, losing only
2% percentage in flexibility. The detailed implementation I1 is depicted in Fig.4.21(b).
Another Pareto-optimal solution, marked as I2 in Fig.4.21, has the best robustness
(100%) and flexibility (98%), at a reduction of only 6% chances to meet the cost re-
quirement, compared to SFI. The detailed implementation I2 is depicted in Fig.4.21(c).
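The differences quoted above can be reproduced with a few lines of arithmetic. The sketch below uses the objective values stated in the text, expressed in percent. The cost value of I2 is not stated directly; we assume that the quoted 6% reduction is meant in percentage points relative to SFI.

```python
# Objective values quoted in the text, in percent:
# (robustness, flexibility, chance of meeting the budget wb).
SFI = (50, 86, 64)
I1 = (100, 84, 77)
# Assumption: "6% lower than SFI" is read as 6 percentage points, i.e., 64 - 6.
I2 = (100, 98, SFI[2] - 6)


def difference(a, b):
    """Per-objective difference a - b, in percentage points."""
    return tuple(x - y for x, y in zip(a, b))


print(difference(I1, SFI))  # (50, -2, 13)
print(difference(I2, SFI))  # (50, 12, -6)
```

Each of I1 and I2 trades a small loss in one objective for large gains in the others, which is why both can appear on the same Pareto front.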
4.8.4 GA-based Approach for Architecture Selection
We have extended the Genetic Algorithm (GA)-based approach MRF from Section 4.6.1
so that it also considers the architecture selection, in order to solve the optimization
problem formulated in Section 4.8.2. We call this extended GA-based approach
“Architecture selection and Mapping for Robustness, Flexibility and Cost” (AMRFC).
AMRFC has the same structure as MRF (see Fig. 4.6), but we have modified the encoding
and decoding of the chromosomes to include the architecture selection.
In Section 4.6, the architecture was known, and the chromosome encoded only the
mapping: each gene represented an index to a PE in the architecture. In this
section, the chromosome also has to include the architecture selection. Note that we
do not know the number of PEs in the architecture: we have to decide both the number
of PEs and their types. However, we assume a single shared bus connecting
all PEs. Thus, a chromosome has υ + ω + 1 genes, where the first
υ genes select the PEs, the following ω genes are used for the task mapping,
and the last gene is used for the bus selection.
To simplify the implementation of our GA, we assume that the engineer gives an upper
bound υ on the number of PEs that can be used in the architecture. Thus, the design
space exploration is semi-automatic: the engineer has to call our proposed
AMRFC approach with different values of υ until he or she is satisfied with the
solution. Each of the υ genes used for the architecture selection encodes an index to a
PE in the component library LH, and the last gene is an index to a bus in LH.
We show two chromosomes used for our motivational example in the “Encoding” box
of Fig. 4.22. In these two chromosomes, the first four genes indicate that at most
four PEs can be considered in the architecture. Each of these genes records the index
of a selected PE. The following genes, depicted in gray, represent the mapping of the
tasks in the baseline functionality S0: each of them encodes an index to one of the first
four genes, i.e., the genes encoding the PEs. For example, in the first chromosome of
Fig. 4.22, task τ1, in the first gray gene, is mapped on the PE encoded in the 1st gene
of the chromosome, which points to the PE N4 in the library LH from Table 4.5.
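The encoding described above can be sketched as follows. This is a minimal illustration, not the implementation used in our tool: the function name, the 1-based index convention, and the library sizes in the usage example are assumptions made for the sketch.

```python
import random
from typing import List


def random_chromosome(upsilon: int, omega: int, num_pe_types: int,
                      num_bus_types: int, rng: random.Random) -> List[int]:
    """Build one chromosome with upsilon + omega + 1 genes (1-based indices).

    upsilon: upper bound on the number of PEs, given by the engineer
    omega: number of tasks in the baseline functionality
    num_pe_types, num_bus_types: sizes of the PE and bus parts of the
        component library (hypothetical values in the example below)
    """
    # Architecture genes: each one is an index to a PE in the library.
    pe_genes = [rng.randint(1, num_pe_types) for _ in range(upsilon)]
    # Mapping genes: each one is an index to one of the architecture genes.
    map_genes = [rng.randint(1, upsilon) for _ in range(omega)]
    # Bus gene: an index to a bus in the library.
    bus_gene = [rng.randint(1, num_bus_types)]
    return pe_genes + map_genes + bus_gene


rng = random.Random(0)
chromosome = random_chromosome(upsilon=4, omega=4, num_pe_types=5,
                               num_bus_types=4, rng=rng)
print(chromosome)
```

Because a mapping gene points to an architecture gene rather than directly to a library entry, changing one architecture gene replaces the PE type for all tasks mapped on it.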
We perform the crossover and mutation similarly to Section 4.6.1. These are illustrated
in Fig. 4.22. However, the decoding is different compared to Section 4.6.1. Taking the
first chromosome in the “Mutation” result as an example, τ1 is mapped on
N4, τ2 on N4, τ3 on N4, and τ4 on N2. The 4th gene of the architecture genes
encodes N3, but none of the mapping genes is assigned the value ‘4’, so the 4th gene
of the architecture genes is not actually used. In this chromosome, the bus B4
is selected to interconnect the two PEs N4 and N2.
We mark a design alternative as infeasible if its bus utilization exceeds
100%. Infeasible solutions are discarded during the design space exploration and
are not presented in the final Pareto front. The utilization is calculated
from the bus speed, the sizes of the messages exchanged over the bus, and the periods
of the sender tasks.
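The feasibility check can be illustrated with the sketch below. The text only names the inputs of the calculation, so the formula used here, the sum over all messages of the transmission time divided by the period of the sender task, is an assumption, as are the function names and the units.

```python
from typing import List, Tuple

# One message: (size in bytes, period of the sender task in ms)
Message = Tuple[float, float]


def bus_utilization(messages: List[Message], bus_speed: float) -> float:
    """Fraction of bus time used; bus_speed is given in bytes per ms.

    Only messages exchanged between tasks mapped on different PEs are
    assumed to be passed in, since only those use the bus.
    """
    return sum((size / bus_speed) / period for size, period in messages)


def is_feasible(messages: List[Message], bus_speed: float) -> bool:
    """A design alternative is kept only if the bus is not overloaded."""
    return bus_utilization(messages, bus_speed) <= 1.0


# Hypothetical messages and bus speeds, for illustration only.
messages = [(8.0, 10.0), (16.0, 20.0)]
print(bus_utilization(messages, bus_speed=4.0), is_feasible(messages, 4.0))
print(bus_utilization(messages, bus_speed=1.0), is_feasible(messages, 1.0))
```

With the faster bus the two messages use 40% of the bus time, whereas the slower bus would be loaded to 160% and the alternative would be discarded.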
The decoded strings are shown in the “Decoding” box of Fig. 4.22. The ith place
in a string (1 ≤ i ≤ ω, τi ∈ S0) holds the index j of the PE N j on which task τi
is mapped. The last place holds the index i of the selected bus Bi in the
architecture. The implementations corresponding to these two decoded strings, I1 and
I2, are shown in Fig. 4.20(b) and (c), respectively.
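The decoding step can be sketched as follows. The example chromosome is consistent with the mutation example discussed in the text (τ1, τ2, and τ3 on N4, τ4 on N2, an unused 4th architecture gene encoding N3, and bus B4), but the position of N2 among the architecture genes and the value of the 3rd architecture gene are assumptions, and the function names are ours.

```python
from typing import List, Set


def decode(chromosome: List[int], upsilon: int) -> List[int]:
    """Turn a chromosome into a decoded string.

    The result holds, for each task, the library index of the PE it is
    mapped on, followed by the index of the selected bus.
    """
    pe_genes = chromosome[:upsilon]      # library indices of candidate PEs
    map_genes = chromosome[upsilon:-1]   # 1-based positions in pe_genes
    bus_gene = chromosome[-1]            # library index of the bus
    mapping = [pe_genes[g - 1] for g in map_genes]
    return mapping + [bus_gene]


def used_pes(chromosome: List[int], upsilon: int) -> Set[int]:
    """PEs that are actually part of the architecture after decoding."""
    return set(decode(chromosome, upsilon)[:-1])


# Architecture genes: N4, N2 (assumed position), N1 (assumed), N3 (unused).
# Mapping genes: tau1..tau3 point to gene 1, tau4 points to gene 2. Bus: B4.
chromosome = [4, 2, 1, 3, 1, 1, 1, 2, 4]
print(decode(chromosome, upsilon=4))            # [4, 4, 4, 2, 4]
print(sorted(used_pes(chromosome, upsilon=4)))  # [2, 4]
```

Architecture genes that no mapping gene refers to, such as the 3rd and 4th genes in this example, do not contribute any PE to the decoded architecture.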
Figure 4.22: Chromosome encoding and decoding
We use the same strategy for the mapping of the future scenarios as in Section 4.6.2.
For the motivational example presented in Section 4.8.3, the proposed GA approach
was run for 10 minutes to obtain the Pareto front, whose solutions are exactly the
same as the Pareto-optimal solutions determined by an exhaustive search that takes
2 hours.
4.9 Conclusion
In this paper, we have addressed the architecture selection and the mapping of hard
real-time applications on distributed heterogeneous architectures, during the early de-
sign phases. We have considered a fixed-priority preemptive scheduling policy, where
the system schedulability is determined using a response time analysis. We have mod-
eled the uncertainties in wcets, functionality requirements, and hardware component
costs. We have used the Kernel Smoothing Density Estimate as the basis for deter-
mining the schedulability probability of a mapping alternative, and the chance of an
architecture alternative meeting the cost budget. We have proposed a GA-based opti-
mization for architecture selection and task mapping, targeting robustness, flexibility
and cost, which takes into account the uncertainties in wcets, functionality requirements
and hardware component costs, respectively. The results obtained on the synthetic and
real-life benchmarks show the importance of modeling the uncertainties during the
early design phases, and taking them into account during design space exploration.
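As a reminder of the schedulability test that underlies the approach, the sketch below implements the standard response time recurrence for fixed-priority preemptive scheduling on a single PE, ri = Ci + Σ ⌈ri / Tj⌉ Cj, where the sum ranges over the tasks in hp(τi). It is a simplified illustration with hypothetical task parameters; it ignores messages, blocking, and jitter, and the analysis used in this paper may differ in these details.

```python
from typing import List, Optional, Tuple

# One task: (WCET C, period T), in the same integer time unit
Task = Tuple[int, int]


def response_time(task: Task, higher: List[Task],
                  deadline: int) -> Optional[int]:
    """Worst-case response time of `task`, or None if it exceeds `deadline`.

    `higher` holds the tasks with a higher priority on the same PE.
    """
    c_i, _ = task
    r = c_i
    while True:
        # Interference: each higher-priority task preempts ceil(r / T_j) times.
        r_next = c_i + sum(-(-r // t_j) * c_j for c_j, t_j in higher)
        if r_next > deadline:
            return None
        if r_next == r:
            return r
        r = r_next


# Hypothetical task set, listed from highest to lowest priority.
tasks = [(1, 4), (2, 6), (3, 13)]
results = [response_time(t, tasks[:i], deadline=t[1])
           for i, t in enumerate(tasks)]
print(results)  # [1, 3, 10]
```

Under uncertainty, the WCETs are not single values, so such a test is evaluated repeatedly over sampled execution times to estimate the probability of a mapping being schedulable.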
APPENDIX A
List of Abbreviations
Table A.1: List of Abbreviations
Abbreviations Description
ABS Anti-lock Braking Systems
AMRFC Architecture selection and Mapping for
Robustness, Flexibility and Cost (algorithm)
ASIL Automotive Safety Integrity Level
BCET Best-Case Execution Time
CMO Criticality-Aware Mapping Optimization
CDMO Criticality-Aware Functional Decomposition
and Mapping Optimization
DAG Directed Acyclic Graphs
DAL Design Assurance Level
DSE Design Space Exploration
DVFS Dynamic Voltage and Frequency Scaling
DVS Dynamic Voltage Scaling
E3S Embedded System Synthesis Benchmark Suite
FPS Fixed-Priority Preemptive Scheduling
GA Genetic Algorithm
GSFR Global System Failure Rate
IMA Integrated Modular Avionics
KSDE Kernel Smoothing Density Estimate
MoC Models of Computation
MRF Mapping for Robustness and Flexibility (algorithm)
ms milliseconds
MSC Monte Carlo Simulation
MVFS Mapping, Voltage and Frequency Scaling (algorithm)
NRE Non-Recurring Engineering (cost)
NSGA-II Non-dominated Sorting Genetic Algorithm-II
OVFS Online Voltage and Frequency Scaling (algorithm)
PE Processing Element





TGFF Task Graphs For Free
TiSS Trusted Interface Subsystem
TRM Trusted Resource Manager
TS Tabu Search
VaM Validation Middleware
WCET Worst-Case Execution Time
WCRT Worst-Case Response Time
APPENDIX B
List of Notations
Table B.1: List of Notations used in the thesis
Notation Description
Γ a set of applications
N a set of PEs in the architecture




smi the size of mi
N j a PE
Ti period
pc the probability of crossover
pm the probability of mutation
m population size for GA
$C_i^{N_j}$ WCET of τi mapped on N j
ri WCRT of τi
hp(τi) the set of tasks that have a priority higher than the priority of τi
S an implementation (a solution)
rS the degree of schedulability of S
Table B.2: List of Notations used in Paper A
Notation Description
di deadline of functional blocks or tasks
Fi ∈ F a functional block
ei j ∈ E data dependency between Fi and Fj
G(F ,E) a graph modeling an application
Di a set of decomposition options for Fi
D(Fi) : Vi→Di the decomposition function of Fi
Gi(Vi,Ei) a task graph modeling a decomposed function Fi
LA a library of function-to-task decompositions
LH a library of architecture implementations for PEs
H(N j)→ Nkj N j with SIL k is used in the architecture
k safety criticality level from 0 to 4
R safety requirements in terms of SIL 0 to SIL 4
Pji the ith partition on the PE N j
Wτi the cost to develop and certify τi to its required SIL
wkj the unit cost of N j with a reliability corresponding to a SIL k
wS total cost of S
Co the timing overhead required by the VaM
Table B.3: List of Notations used in Paper B
Notation Description
d PE-specific constant
Di deadline of τi
f Nij the operating clock frequency of Ni running in mode j
vNij the supply voltage of Ni running in mode j
pNij the measured power consumption of Ni running in mode j






j ) the operating mode of Ni running in mode j
F the normalized frequency
V the normalized voltage
ℓ the operating mode
ki the redundancy level
F : F (τi) = ki the function to specify the redundancy level of τi





the function to assign operating mode for τi
Ri reliability of τi
λ the permanent fault rate of the PE running τi
λℓj the transient fault rate of N j running in ℓ
λ0j the minimum fault rate
λmaxj the maximum fault rate
µℓj a shape parameter in Eq.3.2
α and β shape parameters in Eq.3.7
Tcycle the period that the application is repeated periodically
TS the mission duration
$c_{ij}^{N_j}$ the execution time of Ji j in the fastest operating mode
RS the reliability of the system for a period of time Tcycle
ES energy consumption of finishing all jobs during Tcycle
Rg the reliability goal
S0 the reference implementation
E0 the energy consumption of S0
R0 the reliability of S0
WR and WS penalty weights in cost function Eq.3.9
τ′i the replica of τi
Γ′ the complete task set, including the original tasks and the replicas
C the candidate set
φS the GSFR of a system
US the sum of the execution times for all jobs executed so far
θ the degeneration of the system reliability
E∆ the saved energy percentage
Table B.4: List of Notations used in Paper C
Notation Description
Di deadline of τi
µ and β Gumbel distribution parameters
S0 baseline functionality
M0 the mapping solution for the baseline functionality S0
S f a set of future scenarios
wi the probability of a future scenario becoming a reality
RMk the robustness of a mapping Mk
FM0 the flexibility of M0
ei execution time of τi
ξi probability distribution function of ei
ci stochastic variables of Ci
σi probability distribution function of ci
ui utilization of Si
LH a library of hardware components
wN the unit cost of an architecture solution
wb cost budget of an architecture solution
CN the probability that wN of N meets the imposed budget wb
€ Euro
Bibliography
[AB98] L. Abeni and G. Buttazzo. Integrating multimedia applications in hard
real-time systems. In Proceedings of Real-Time Systems Symposium,
pages 4 –13, 1998.
[AFOTH06] J. Alves-Foss, P. W. Oman, C. Taylor, and W. S. Harrison. The mils ar-
chitecture for high-assurance embedded systems. International journal
of embedded systems, 2(3):239–247, 2006.
[AGK11] I. Assayad, A. Girault, and H. Kalla. Tradeoff exploration between
reliability, power consumption, and execution time. Computer Safety,
Reliability, and Security, pages 437–451, 2011.
[AGK12] I. Assayad, A. Girault, and H. Kalla. Tradeoff exploration between
reliability, power consumption, and execution time for embedded sys-
tems. International Journal on Software Tools for Technology Transfer,
pages 1–17, 2012.
[ALR+01] A. Avizienis, J-C Laprie, B. Randell, et al. Fundamental concepts of
dependability. University of Newcastle upon Tyne, Computing Sci-
ence, 2001.
[AMMMA01] H. Aydin, R. Melhem, D. Mossé, and P. Mejia-Alvarez. Dynamic and
aggressive scheduling techniques for power-aware real-time systems.
In Real-Time Systems Symposium, 2001. Proceedings of 22nd IEEE,
pages 95–105. IEEE, 2001.
[APW+13] L. S. Azevedo, D. Parker, M. Walker, Y. Papadopoulos, and R. E.
Araújo. Automatic decomposition of safety integrity levels: Optimization
by tabu search. In Workshop on Critical Automotive applications:
Robustness and Safety, 2013.
[Ari97] ARINC 651-1: Design Guidance for Integrated Modular Avionics. AR-
INC (Aeronautical Radio, Inc), 1997.
[Ari13] ARINC 653P0: Avionics Application Software Standard Interface, Part
0, Overview of ARINC 653. ARINC (Aeronautical Radio, Inc), 2013.
[ARP10] SAE ARP4754A. Guidelines for development of civil aircraft and sys-
tems, 2010.
[AS 11] AS 6802. Time-Triggered Ethernet. SAE International, 2011.
[Axe05] Jakob Axelsson. A method for evaluating uncertainties in the early
development phases of embedded real-time systems. In Embedded and
Real-Time Computing Systems and Applications, 2005. Proceedings.
11th IEEE International Conference on, pages 72–75. IEEE, 2005.
[Axe06] Jakob Axelsson. Cost models with explicit uncertainties for electronic
architecture trade-off and risk analysis. Intl. Council on Systems Engi-
neering (INCOSE), 2006.
[AZ09] H. Aydin and D. Zhu. Reliability-aware energy management for peri-
odic real-time tasks. Computers, IEEE Transactions on, 58(10):1382–
1397, 2009.
[BA97] A. W. Bowman and A. Azzalini. Applied smoothing techniques for
data analysis: The kernel approach with S-Plus illustrations. Oxford
University Press, USA, 1997.
[BAC00] B. Boehm, C. Abts, and S. Chulani. Software development cost estima-
tion approaches–A survey. Annals of Software Engineering, 10(1):177–
205, 2000.
[BBB+09] J. Barhorst, T. Belote, P. Binns, J. Hoffman, J. Paunicka, P. Sarathy,
J. Scoredos, P. Stanfill, D. Stuart, and R. Urzi. A research agenda for
mixed-criticality systems. In Cyber-Physical Systems Week, 2009.
[BBL09] E. Bini, G. Buttazzo, and G. Lipari. Minimizing cpu energy in real-
time systems with discrete speed management. ACM Transactions on
Embedded Computing Systems (TECS), 8(4):31, 2009.
[BCC+05] R. Bloomfield, J. Cazin, D. Craigen, N. Juristo, E. Kesseler, and
J. Voas. Validation, verification and certification of embedded systems.
2005.
[BDP96] A. Burns, R. Davis, and S. Punnekkat. Feasibility analysis of fault-
tolerant real-time task sets. In Real-Time Systems, 1996., Proceedings
of the Eighth Euromicro Workshop on, pages 29–33. IEEE, 1996.
[BDS11] P. Bieber, R. Delmas, and C. Seguin. Dalculus–theory and tool for
development assurance level allocation. In Computer Safety, Reliability,
and Security, pages 43–56. Springer, 2011.
[BE06] I. Bate and P. Emberson. Incorporating scenarios and heuristics to im-
prove flexibility in real-time embedded systems. In Real-Time and Em-
bedded Technology and Applications Symposium, 2006. Proceedings of
the 12th IEEE, pages 221–230. IEEE, 2006.
[Bib77] Kenneth J Biba. Integrity considerations for secure computer systems.
Technical report, DTIC Document, 1977.
[BL73] D. E. Bell and L. J. LaPadula. Secure computer systems: Mathematical
foundations. Technical report, DTIC Document, 1973.
[BLTZ03] S. Bleuler, M. Laumanns, L. Thiele, and E. Zitzler. Pisa – a plat-
form and programming language independent interface for search al-
gorithms. pages 1–1, 2003.
[BM94] A. A. Bertossi and L. V. Mancini. Scheduling algorithms for fault-
tolerance in hard-real-time systems. Real-Time Systems, 7(3):229–245,
1994.
[BMS00] B. W. Boehm, R. Madachy, and B. Steece. Software Cost Estimation
with Cocomo II. Prentice Hall PTR, Upper Saddle River, NJ, USA, 1st
edition, 2000.
[Boe88] B. W. Boehm. A spiral model of software development and enhance-
ment. Computer, 21(5):61–72, 1988.
[Bos91] Robert Bosch. Can specification version 2.0. Robert Bosch Gmbh,
Stuttgart, 1991.
[BSB+01] T. D. Braun, H. J. Siegel, N. Beck, L. L. Bölöni, M. Maheswaran, A. I.
Reuther, J. P. Robertson, M. D. Theys, B. Yao, D. Hensgen, et al. A
comparison of eleven static heuristics for mapping a class of indepen-
dent tasks onto heterogeneous distributed computing systems. Journal
of Parallel and Distributed computing, 61(6):810–837, 2001.
[Bue11] D. M. Buede. The engineering design of systems: models and methods,
volume 55. John Wiley & Sons, 2011.
[But97] Giorgio Buttazzo. Hard Real-Time Computing Systems: Predictable
Scheduling Algorithms and Applications. Kluwer Academic Publish-
ers, Boston, 1997.
[But11] Giorgio C Buttazzo. Hard real-time computing systems: predictable
scheduling algorithms and applications, volume 24. Springer, 2011.
[CAS01] CAST. CAST-2: Guidelines for assessing software partition-
ing/protection schemes. Position Paper, Federal Aviation Administra-
tion, 2001.
[CGL+07] P. Cuenot, S. Gerard, H. Lonn, M-O Reiser, D. Servat, C-J Sjostedt,
R. T. Kolagari, M. Torngren, M. Weber, et al. Managing complexity of
automotive electronics using the east-adl. In 12th IEEE International
Conference on Engineering Complex Computer Systems, pages 353–
358, 2007.
[Cha09] R. N. Charette. This car runs on code. IEEE Spectrum, 46(3):3, 2009.
[CHK06] J. J. Chen, H. R. Hsu, and T. W. Kuo. Leakage-aware energy-efficient
scheduling of real-time tasks in multiprocessor systems. In Real-Time
and Embedded Technology and Applications Symposium, 2006. Pro-
ceedings of the 12th IEEE, pages 408–417. IEEE, 2006.
[CK07] J. J. Chen and C. F. Kuo. Energy-efficient scheduling for real-time
systems on dynamic voltage scaling (dvs) platforms. In Embedded and
Real-Time Computing Systems and Applications, 2007. Conference on,
pages 28–38. IEEE, 2007.
[CMR92] A. Campbell, P. McDonald, and K. Ray. Single event upset rates
in space. Nuclear Science, IEEE Transactions on, 39(6):1828–1835,
1992.
[CMS82] X. Castillo, S. R. McConnel, and D. P. Siewiorek. Derivation and cal-
ibration of a transient error reliability model. Computers, IEEE Trans-
actions on, 100(7):658–671, 1982.
[Con03] C. Constantinescu. Trends and challenges in vlsi circuit reliability. Mi-
cro, IEEE, 23(4):14–19, 2003.
[CRM+06] A. K. Coskun, T. S. Rosing, K. Mihic, G. De Micheli, and Y. Leblebici.
Analysis and optimization of mpsoc reliability. Journal of Low Power
Electronics, 2(1):56–69, 2006.
[CYK06] J. J. Chen, C. Y. Yang, and T. W. Kuo. Slack reclamation for real-time
task scheduling over dynamic voltage scaling multiprocessors. In Sen-
sor Networks, Ubiquitous, and Trustworthy Computing, 2006. Confer-
ence on, volume 1. IEEE, 2006.
[DAPM00] K. Deb, S. Agrawal, A. Pratap, and T. Meyarivan. A fast elitist non-
dominated sorting genetic algorithm for multi-objective optimization:
NSGA-II. In Parallel Problem Solving from Nature PPSN VI, pages
849–858. Springer, 2000.
[Dic02] R. Dick. Embedded system synthesis benchmarks suite, 2002.
[Dic08] Robert Dick. Embedded system synthesis benchmarks suites (e3s),
2008.
[DMG97] J. A. Debardelaben, V. K. Madisetti, and A. J. Gadient. Incorporating
cost modeling in embedded-system design. IEEE Design and Test of
Computers, 14:24–35, July 1997.
[DRW98] R. P. Dick, D. L. Rhodes, and W. Wolf. TGFF: task graphs for free. In
Proceedings of the 6th international workshop on Hardware/software
codesign, pages 97–101. IEEE Computer Society, 1998.
[EAHS+06] A. Ejlali, B. M. Al-Hashimi, M. T. Schmitz, P. Rosinger, and S. G.
Miremadi. Combined time and information redundancy for seu-
tolerance in energy-efficient real-time systems. Very Large Scale Inte-
gration (VLSI) Systems, IEEE Transactions on, 14(4):323–335, 2006.
[EJ09] C. Ebert and C. Jones. Embedded software: Facts, figures, and future.
Computer, 42(4):42–52, 2009.
[ELLSV97] S. Edwards, L. Lavagno, E.A. Lee, and A. Sangiovanni-Vincentelli.
Design of embedded systems: Formal models, validation, and synthe-
sis. Proceedings of the IEEE, 85(3):366–390, 1997.
[Ern10] Rolf Ernst. Certification of Trusted MPSoC Platforms. 10th Interna-
tional Forum on Embedded MPSoC and Multicore, 2010.
[Est07] Jeff A. Estefan. Survey of model-based systems engineering (mbse)
methodologies. Incose MBSE Focus Group, 25, 2007.
[EY97] R. Ernst and W. Ye. Embedded program timing analysis based on path
clustering and architecture classification. In Computer-Aided Design,
1997. Digest of Technical Papers., 1997 Conference on, pages 598–
604. IEEE, 1997.
[FÅS04] J. Fredriksson, M. Åkerholm, and K. Sandström. Calculating resource
trade-offs when mapping component services to real-time tasks. In
Fourth Conference on Software Engineering Research and Practice in
Sweden Linköping, Sweden, 2004.
[FDT05] S. Faucou, A-M Déplanche, and Y. Trinquet. An adl centric approach
for the formal design of real-time systems. In Architecture Description
Languages, pages 67–82. Springer, 2005.
[FH04] C. Ferdinand and R. Heckmann. ait: Worst-case execution time pre-
diction by static program analysis. In Building the Information Society,
pages 377–383. Springer, 2004.
[FHKS09] J. Friedrich, U. Hammerschall, M. Kuhrmann, and M. Sihling. Das
v-modell xt. In Das V-Modell® XT, pages 1–32. Springer, 2009.
[Fid98] C. J. Fidge. Real-time schedulability tests for preemptive multitasking.
Real-Time Systems, 14(1):61–93, 1998.
[GAGS09] D. D. Gajski, S. Abdi, A. Gerstlauer, and G. Schirner. Embedded Sys-
tem Design: Modeling, Synthesis and Verification. Springer Publishing
Company, Incorporated, 2009.
[GK83] D. D. Gajski and R. H. Kuhn. Guest editors’ introduction: New vlsi
tools. Computer, 16(12):11–14, 1983.
[GK03] F. Gruian and K. Kuchcinski. Uncertainty-based scheduling: Energy-
efficient ordering for tasks with variable execution time. In Proceed-
ings of the 2003 International Symposium on Low Power Electronics
and Design, pages 465–468, 2003.
[GK09] A. Girault and H. Kalla. A novel bicriteria scheduling heuristics pro-
viding a guaranteed global system failure rate. Dependable and Secure
Computing, IEEE Transactions on, 6(4):241–254, 2009.
[Glo89] Fred Glover. Tabu search: part i. ORSA Journal on computing,
1(3):190–206, 1989.
[Gru01] F. Gruian. Hard real-time scheduling for low-energy using stochas-
tic data and dvs processors. In Proceedings of the 2001 International
Symposium on Low Power Electronics and Design, pages 46–51, Au-
gust 6–7 2001.
[GZDNBH10] A. Ghosal, H. Zeng, M. Di Natale, and Y. Ben-Haim. Computing ro-
bustness of flexray schedules to uncertainties in design parameters. In
Proc. DATE, pages 550–555, 2010.
[Hai11] Yacov Y Haimes. Risk modeling, assessment, and management. John
Wiley & Sons, 2011.
[HD93] K. Hoyme and K. Driscoll. SAFEbus. IEEE Aerospace Electronic
Systems Magazine, 8:34–39, 1993.
[HS06] T.A. Henzinger and J. Sifakis. The embedded systems design chal-
lenge. In FM 2006: Formal Methods, pages 1–15. Springer, 2006.
[HTRE02] C. Haubelt, J. Teich, K. Richter, and R. Ernst. System design for flexi-
bility. In Proc. DATE, pages 854–861, 2002.
[IBM10] IBM. DO-178B compliance: turn an overhead expense into a compet-
itive advantage. White paper, IBM Rational, 2010.
[IEC98] IEC. 61508 functional safety of electrical/electronic/programmable
electronic safety-related systems. International electrotechnical com-
mission, 1998.
[IEC10] IEC 61508. IEC 61508: Functional safety of electri-
cal/electronic/programmable electronic safety-related systems.
International Electrotechnical Commission, 2010.
[IPP+09] V. Izosimov, I. Polian, P. Pop, P. Eles, and Z. Peng. Analysis and op-
timization of fault-tolerant embedded systems with hardened proces-
sors. In Design, Automation & Test in Europe Conference & Exhibi-
tion, 2009. DATE’09., pages 682–687. IEEE, 2009.
[ISG03] S. Irani, S. Shukla, and R. Gupta. Algorithms for power savings.
In Proceedings of the fourteenth annual ACM-SIAM symposium on
Discrete algorithms, pages 37–46. Society for Industrial and Applied
Mathematics, 2003.
[ISO08] ISO 9001. Quality management systems - Requirements. International
Organization for Standardization, 2008.
[ISO09] ISO/DIS 26262. ISO/DIS 26262 - Road vehicles — Functional safety.
International Organization for Standardization / Technical Committee
22 (ISO/TC 22), 2009.
[ISO11] CD ISO. 26262, road vehicles–functional safety. International Stan-
dard ISO/FDIS, 26262, 2011.
[JS07] M. Jorgensen and M. Shepperd. A systematic review of software de-
velopment cost estimation studies. IEEE Transactions on Software En-
gineering, 33(1):33–53, 2007.
[KB03] H. Kopetz and G. Bauer. The time-triggered architecture. Proceedings
of the IEEE, 91(1):112–126, 2003.
[KDB+05] K. Kim, J. L. Diaz, L. L. Bello, J. M. Lopez, C. G. Lee, and S. L.
Min. An exact stochastic analysis of priority-driven periodic real-time
systems and its approximations. Computers, IEEE Transactions on,
54(11):1460–1466, 2005.
[KK10] I. Koren and C. M. Krishna. Fault-tolerant systems. Morgan Kauf-
mann, 2010.
[KN00] S. Kotz and S. Nadarajah. Extreme value distributions: theory and
applications. World Scientific Publishing Company, 2000.
[Kop11a] H. Kopetz. Real-Time Systems: Design Principles for Distributed Em-
bedded Applications. Springer, 2011.
[Kop11b] Hermann Kopetz. Real-time systems: design principles for distributed
embedded applications. Springer Science+ Business Media, 2011.
[KSSB11] A. Kossiakoff, W. N. Sweet, S. Seymour, and S. M. Biemer. Systems
engineering principles and practice, volume 83. Wiley, 2011.
[KWS03] S. Kodase, S. Wang, and K.G. Shin. Transforming structural model
to runtime model of embedded software with real-time constraints. In
Proceedings of the conference on Design, Automation and Test in Eu-
rope: Designers’ Forum-Volume 2, page 20170. IEEE Computer Soci-
ety, 2003.
[LGT10] M. Lukasiewycz, M. Glaß, and J. Teich. Robust design of embedded
systems. In Proc. DATE, pages 1578–1583, 2010.
[LP05] L. Lavagno and C. Passerone. Design of embedded systems. Embedded
Systems Handbook, 2005.
[LS04] J. R. Lorch and A. J. Smith. Pace: A new approach to dynamic voltage
scaling. Computers, IEEE Transactions on, 53(7):856–869, 2004.
[LSH10] E. Le Sueur and G. Heiser. Dynamic voltage and frequency scaling:
The laws of diminishing returns. In Proceedings of the 2010 interna-
tional conference on Power aware computing and systems, pages 1–8.
USENIX Association, 2010.
[LSOH07] B. Leiner, M. Schlager, R. Obermaisser, and B. Huber. A Comparison
of Partitioning Operating Systems for Integrated Systems. Computer
Safety, Reliability, and Security, pages 342–355, 2007.
[Mar11] Peter Marwedel. Embedded system design. Springer, 2011.
[MEP06] S. Manolache, P. Eles, and Z. Peng. Buffer space optimisation with
communication synthesis and traffic shaping for nocs. In Proceedings
of the conference on Design, automation and test in Europe: Proceed-
ings, pages 718–723. European Design and Automation Association,
2006.
[MME04] R. Melhem, D. Mossé, and E. Elnozahy. The interplay of power man-
agement and fault recovery in real-time systems. Computers, IEEE
Transactions on, 53(2):217–231, 2004.
[PEP02] P. Pop, P. Eles, and Z. Peng. Flexibility driven scheduling and mapping
for distributed real-time systems. In RTCSA, 2002.
[PEP04] P Pop, P Eles, and Z Peng. Analysis and Synthesis of Communication-
Intensive Heterogeneous Real-Time Systems. Kluwer Academic Pub-
lishers, 2004.
[PEPP04] P. Pop, P. Eles, Z. Peng, and T. Pop. Scheduling and mapping in an in-
cremental design methodology for distributed real-time embedded sys-
tems. Very Large Scale Integration (VLSI) Systems, IEEE Transactions
on, 12(8):793–811, 2004.
[PGH98] J. C. Palencia and M. González Harbour. Schedulability analysis for
tasks with static and dynamic offsets. In Real-Time Systems Sympo-
sium, 1998. Proceedings., The 19th IEEE, pages 26–37. IEEE, 1998.
[PIEP09] P. Pop, V. Izosimov, P. Eles, and Z. Peng. Design optimization of time-
and cost-constrained fault-tolerant embedded systems with checkpoint-
ing and replication. Very Large Scale Integration (VLSI) Systems, IEEE




[PPE+08] T. Pop, P. Pop, P. Eles, Z. Peng, and A. Andrei. Timing analysis of the
FlexRay communication protocol. Real-Time Systems, 39(1-3):205–
235, 2008.
[PPIE07] P. Pop, K. H. Poulsen, V. Izosimov, and P. Eles. Scheduling and voltage
scaling for energy/reliability trade-offs in fault-tolerant time-triggered
embedded systems. In Proceedings of the 5th IEEE/ACM interna-
tional conference on Hardware/software codesign and system synthe-
sis, pages 233–238. ACM, 2007.
[PS01] P. Pillai and K. G. Shin. Real-time dynamic voltage scaling for low-
power embedded operating systems. In ACM SIGOPS Operating Sys-
tems Review, pages 89–102. ACM, 2001.
[PTV+13] P. Pop, L. Tsiopoulos, S. Voss, O. Slotosch, C. Ficek, U. Nyman,
and A. Ruiz. Methods and tools for reducing certification costs of
mixed-criticality applications on multi-core platforms: the RECOMP
approach. In Proceedings of the Workshop of Industry-Driven Ap-
proaches for Cost-effective Certification of Safety-Critical, Mixed-
Criticality Systems, 2013.
[PWA+13] D. Parker, M. Walker, L. Azevedo, Y. Papadopoulos, and R. Araújo.
Automatic decomposition and allocation of safety integrity levels using
a penalty-based genetic algorithm. In Moonis Ali, Tibor Bosse, Koen V.
Hindriks, Mark Hoogendoorn, Catholijn M. Jonker, and Jan Treur,
editors, Recent Trends in Applied Artificial Intelligence, volume 7906 of
Lecture Notes in Computer Science, pages 449–459. Springer Berlin
Heidelberg, 2013.
[PWR+10a] Y. Papadopoulos, M. Walker, M-O Reiser, M. Weber, D. Chen,
M. Törngren, D. Servat, A. Abele, F. Stappert, H. Lonn, et al. Au-
tomatic allocation of safety integrity levels. In Proceedings of the
1st workshop on critical automotive applications: robustness & safety,
pages 7–10. ACM, 2010.
[PWR+10b] Y. Papadopoulos, M. Walker, M.-O. Reiser, M. Weber, D. Chen,
M. Törngren, David Servat, A. Abele, F. Stappert, H. Lonn, L. Bernts-
son, Rolf Johansson, F. Tagliabo, S. Torchiaro, and Anders Sandberg.
Automatic allocation of safety integrity levels. In Proceedings of the 1st
Workshop on Critical Automotive applications: Robustness and Safety,
pages 7–10, 2010.
[QNHM04] G. Quan, L. Niu, X. S. Hu, and B. Mochocki. Fixed priority scheduling
for reducing overall energy on variable voltage processors. In Real-
Time Systems Symposium, 2004. Proceedings of 25th IEEE Interna-
tional, pages 309–318. IEEE, 2004.
[Res12] Transparency Market Research. Embedded system market - global in-
dustry analysis, size, share, growth, trends and forecast, 2012 - 2018.
2012.
[RHE06] R. Racu, A. Hamann, and R. Ernst. A formal approach to multi-
dimensional sensitivity analysis of embedded real-time systems. In
Real-Time Systems, 2006. 18th Euromicro Conference on, pages 10–
pp. IEEE, 2006.
[Roa09] ITRS Roadmap. International technology roadmap for semiconductors,
2009 edn. Executive Summary. Semiconductor Industry Association,
2009.
[Roc09] Rockwell-Collins. Certification cost estimates for future communica-
tion radio platforms. Technical report, Rockwell-Collins, 2009.
[RTC92] RTCA DO-178B. Software Considerations in Airborne Systems and
Equipment Certification. Radio Technical Commission for Aeronautics
(RTCA), 1992.
[Rus99] John Rushby. Partitioning for avionics architectures: Requirements,
mechanisms, and assurance. NASA Contractor Report CR-1999-
209347, NASA Langley Research Center, June 1999.
[Rus07] John Rushby. Just-in-time certification. In 12th IEEE Interna-
tional Conference on the Engineering of Complex Computer Systems
(ICECCS), pages 15–24, 2007.
[SAHE03] M. T. Schmitz, B. M. Al-Hashimi, and P. Eles. System-level design
techniques for energy-efficient embedded systems. Springer, 2003.
[SAHE04] M. T. Schmitz, B. M. Al-Hashimi, and P. Eles. System-level design
techniques for energy-efficient embedded systems. Springer, 2004.
[SBK] D. Sojer, C. Buckl, and A. Knoll. Propagation, transformation and
refinement of safety requirements.
[SPM09] P.K. Saraswat, P Pop, and J Madsen. Task Migration for Fault-
Tolerance in Mixed-Criticality Embedded Systems, volume 6, Number
3. 2009.
[SR11] A.P. Sage and W.B. Rouse. Handbook of systems engineering and
management. Wiley, 2011.
[Sto96] N.R. Storey. Safety Critical Computer Systems. Addison-Wesley Long-
man Publishing Co., Inc., Boston, MA, USA, 1996.
[Tas02] Gregory Tassey. The economic impacts of inadequate infrastructure for
software testing. National Institute of Standards and Technology, RTI
Project, 7007, 2002.
[TBDP98] E. Totel, J-P Blanquart, Y. Deswarte, and D. Powell. Supporting mul-
tiple levels of criticality. In Fault-Tolerant Computing, 1998. Digest
of Papers. Twenty-Eighth Annual International Symposium on, pages
70–79. IEEE, 1998.
[TBW92] K. W. Tindell, A. Burns, and A. J. Wellings. Allocating hard real-time
tasks: an NP-hard problem made easy. Real-Time Systems, 4(2):145–
165, 1992.
[TCN00] L. Thiele, S. Chakraborty, and M. Naedele. Real-time calculus for
scheduling hard real-time systems. In Circuits and Systems, 2000. Pro-
ceedings. ISCAS 2000 Geneva. The 2000 IEEE International Sympo-
sium on, volume 4, pages 101–104. IEEE, 2000.
[TSP11a] D. Tamas-Selicean and P. Pop. Design optimization of mixed-criticality
real-time applications on cost-constrained partitioned architectures. In
Real-Time Systems Symposium (RTSS), 2011 IEEE 32nd, pages 24–33.
IEEE, 2011.
[TSP11b] D. Tamas-Selicean and P. Pop. Design optimization of mixed-criticality
real-time applications on cost-constrained partitioned architectures. In
Real-Time Systems Symposium (RTSS), 2011 IEEE 32nd, pages 24–33.
IEEE, 2011.
[TSPS12a] D. Tamas-Selicean, P. Pop, and W. Steiner. Synthesis of
communication schedules for TTEthernet-based mixed-criticality systems. In
Proceedings of conference on Hardware/software codesign and system
synthesis, pages 473–482. ACM, 2012.
[TSPS12b] D. Tămaș-Selicean, P. Pop, and W. Steiner. Synthesis of
communication schedules for TTEthernet-based mixed-criticality systems. In
Proceedings of the International Conference on Hardware/software
Codesign and System Synthesis, pages 473–482, 2012.
[Vah06] Frank Vahid. Embedded system design: a unified hardware/software
introduction. John Wiley & Sons, 2006.
[Was14] Armin Wasicek. The ACROSS integrity model. In IAENG Transactions
on Engineering Technologies, pages 333–348. Springer, 2014.
[WC12] D. D. Ward and S. E. Crozier. The uses and abuses of ASIL
decomposition in ISO 26262. In System Safety, incorporating the Cyber Security
Conference 2012, 7th IET International Conference on, pages 1–6. IET,
2012.
[WESK10] A. Wasicek, C. El-Salloum, and H. Kopetz. A system-on-a-chip plat-
form for mixed-criticality applications. In Object/Component/Service-
Oriented Real-Time Distributed Computing (ISORC), 2010 13th IEEE
International Symposium on, pages 210–216. IEEE, 2010.
[WMWZ12] T. Wei, P. Mishra, K. Wu, and J. Zhou. Quasi-static fault-tolerant
scheduling schemes for energy-efficient hard real-time systems. Jour-
nal of Systems and Software, 2012.
[WRPG11] J. Wei, L. Rashid, K. Pattabiraman, and S. Gopalakrishnan. Comparing
the effects of intermittent and transient hardware faults on programs. In
Dependable Systems and Networks Workshops, Conference on, pages
53–58. IEEE, 2011.
[ZAZ13] B. Zhao, H. Aydin, and D. Zhu. Shared recovery for energy efficiency
and reliability enhancements in real-time applications with precedence
constraints. ACM Transactions on Design Automation of Electronic
Systems (TODAES), 18(2):23, 2013.
[ZC03] Y. Zhang and K. Chakrabarty. Energy-aware adaptive checkpointing
in embedded real-time systems. In Design, Automation and Test in
Europe Conference and Exhibition, 2003, pages 918–923. IEEE, 2003.
[ZC04] Y. Zhang and K. Chakrabarty. Dynamic adaptation for fault tolerance
and power management in embedded real-time systems. ACM Transac-
tions on Embedded Computing Systems (TECS), 3(2):336–360, 2004.
[ZC06] Y. Zhang and K. Chakrabarty. A unified approach for fault tolerance
and dynamic power management in fixed-priority real-time embedded
systems. Computer-Aided Design of Integrated Circuits and Systems,
IEEE Transactions on, 25(1):111–125, 2006.
[ZCP+05] W. Zheng, J. Chong, C. Pinello, S. Kanajan, and A. Sangiovanni-
Vincentelli. Extensible and scalable time triggered scheduling. In Ap-
plication of Concurrency to System Design, pages 132–141, 2005.
[ZMM04] D. Zhu, R. Melhem, and D. Mossé. The effects of energy manage-
ment on reliability in real-time embedded systems. In Computer Aided
Design, 2004. ICCAD-2004. IEEE/ACM International Conference on,
pages 35–40. IEEE, 2004.
