

# The use of simulation in the design of critical embedded systems

Nicolas Navet
University of Luxembourg,
founder RealTime-at-Work



NAFEMS Conference Paris, June 9th 2016



## Critical systems are often very complex



Inside an engine ECU: functions are the nodes (≈1500), edges are function calls, and the functions process around 35000 variables

Complete electrical and electronic architecture: tens of ECUs, many wired and some wireless networks, gateways, etc.

## **Outline**

✓ Simulation in the design of critical systems with a focus on timing-accurate simulation



## Verification along the dev. cycle



#### **Simulation**

- ✓ Functional simulation
- ✓ Software-in-the-loop, hardware-in-the-loop, etc.
- ✓ Timing-accurate simulation of ECU, bus, system-level



#### **Formal verification**

- ✓ Worst-Case Execution Time analysis
- ✓ Worst-Case Response time analysis: ECU, bus, system-level
- ✓ Probabilistic analysis (academia)

#### Testing

- ✓ Execution time measurements
- ✓ Integration tests
- ✓ Off-line trace analysis & monitoring tools
- ✓ ...



#### "Early stage"

Technological & design choices

#### "Project"

Configuration & optimization



#### "Real"

Refine and validate models & impact of non-conformance

## Critical systems are often real-time systems

✓ Correctness in the *value domain* → functional simulation



✓ Correctness in the *time domain* → timing-accurate simulation, everything else is abstracted away

## **Hundreds of timing constraints**



- ✓ Responsiveness
- ✓ Freshness of data
- ✓ Jitters
- ✓ Synchronicity
- ✓ ...



Timing-accurate simulation: the activities of the system are modelled by their activation patterns and execution time – functional behaviour is not captured
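As a minimal sketch of what "activation pattern plus execution time" means in practice, the Python snippet below simulates three hypothetical periodic tasks under fixed-priority preemptive scheduling on one CPU. The task set, the tick-based time model and the names `Task` and `simulate` are illustrative assumptions, not taken from any tool mentioned in these slides. Note that no functional behaviour appears anywhere in the model.

```python
from dataclasses import dataclass


@dataclass
class Task:
    name: str
    period: int    # activation pattern: strictly periodic, in ticks
    wcet: int      # execution time of each job, in ticks
    priority: int  # lower value = higher priority (assumption)


def simulate(tasks, horizon):
    """Tick-based fixed-priority preemptive simulation on one CPU.

    Returns the largest response time observed for each task.
    """
    pending = []  # each job is [priority, release_time, remaining, name]
    worst = {t.name: 0 for t in tasks}
    for now in range(horizon):
        # Release a new job of every task whose period divides the time.
        for t in tasks:
            if now % t.period == 0:
                pending.append([t.priority, now, t.wcet, t.name])
        if pending:
            # Run the highest-priority pending job for one tick.
            job = min(pending, key=lambda j: (j[0], j[1]))
            job[2] -= 1
            if job[2] == 0:
                pending.remove(job)
                response = now + 1 - job[1]
                worst[job[3]] = max(worst[job[3]], response)
    return worst


# Hypothetical task set, simulated over one hyperperiod (20 ticks).
TASKS = [
    Task("T1", period=5, wcet=1, priority=0),
    Task("T2", period=10, wcet=3, priority=1),
    Task("T3", period=20, wcet=5, priority=2),
]
print(simulate(TASKS, horizon=20))
```

A real timing-accurate simulator would also model offsets, jitter, clock drift, variable execution times and the communication buses; this sketch only shows the level of abstraction.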

## Zoom on response time constraints



Accurate model → verification

Approximate model → debugging, but usually unpredictably unsafe for verification

✓ Response times by simulation: ECU, networks, system-level

#### Requires knowledge of

- ✓ All activities: tasks, frames, signals
- ✓ Software code to derive execution times
- ✓ Complete embedded architecture with all scheduling & configuration parameters for buses and ECUs

Solution for early-stage verification: conservative assumptions and time budget per resource

## Interest in the tails of the distribution



## Working with quantiles in practice – see [5]



1. Identify frame deadline
2. Decide the tolerable risk → target quantile
3. Simulate "sufficiently" long
4. If the target quantile value is below the max. acceptable value, the performance objective is met
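The steps above can be sketched in a few lines of Python. This is an illustration under stated assumptions, not the procedure of [5]: the helper names `empirical_quantile` and `objective_met`, the synthetic latency model and the numeric values are all hypothetical, and the quantile is the plain empirical one taken from sorted samples.

```python
import math
import random


def empirical_quantile(samples, q):
    """Smallest sample v such that at least a fraction q of samples are <= v."""
    ordered = sorted(samples)
    # Rank of the order statistic; the small epsilon guards against
    # floating-point noise when q * n should be an exact integer.
    rank = max(1, math.ceil(q * len(ordered) - 1e-9))
    return ordered[rank - 1]


def objective_met(latencies_ms, risk, max_acceptable_ms):
    """Steps 2 and 4: a tolerable risk gives the target quantile 1 - risk."""
    return empirical_quantile(latencies_ms, 1.0 - risk) <= max_acceptable_ms


# Step 3 stand-in: synthetic latencies instead of a real simulation run.
rng = random.Random(42)
latencies = [1.0 + rng.expovariate(2.0) for _ in range(20_000)]

risk = 1e-3          # hypothetical: one frame in 1000 may exceed the limit
limit_ms = 10.0      # hypothetical max. acceptable latency
print("Q(1 - risk) =", empirical_quantile(latencies, 1.0 - risk), "ms")
print("objective met:", objective_met(latencies, risk, limit_ms))
```

The smaller the tolerable risk, the more samples are needed before the empirical quantile is meaningful, which is what "sufficiently long" refers to in step 3.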

## Performance metrics: illustration on a Daimler prototype network (ADAS, control functions) [1]



The 58 flows of data sorted by increasing communication latencies

#### Simulation of embedded architectures





## CPAL simulation language – see [4]

1. Model and program: functional and non-functional concerns

2. Simulate: possibly embedded within external tools such as RTaW-Pegase™ and Matlab/Simulink™

3. Execute: bare metal or hosted by an OS, prototypes or real systems

Freely available from www.designcps.com

## How do we know simulation models are correct?!



## What do we have at hand?

- ✓ Are the models described? Usually no
- ✓ Is source code available? No
  - Black-box tools
- ✓ Complexity of the models and implementations? High
  - Domain experts typically take many months to master a new technology!
- ✓ Do we have qualification? No
- ✓ Are there public benchmarks on which to validate the results? No
- ✓ Limited number of end-users and cost pressure? Yes
- ✓ Can we prove the correctness of the simulation results? No

Best practice: several techniques and several tools for cross-validation

## **Examples of cross-validation**

- ✓ Comparing different simulation models: e.g., in-house vs commercial, coarse-grained vs fine-grained
- ✓ Comparing simulation against analytic results: e.g., upper-bound and lower-bound analysis
- ✓ Validating a simulator using real communication/execution traces: e.g., comparing inter-arrival time distributions
- ✓ Re-simulating the worst-case situation found by mathematical analysis
- ✓ ...
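One of these checks, simulation against an analytic upper bound, can be sketched as follows. The analysis used here is the classic fixed-point response-time iteration for fixed-priority preemptive scheduling of periodic tasks with deadlines no larger than periods; it stands in for whatever analysis a real tool provides. The task set and the "observed" values are hypothetical, as are the names `response_time_bound` and `cross_check`.

```python
def response_time_bound(tasks, index):
    """Upper bound on the response time of tasks[index].

    tasks: list of (period, wcet) pairs sorted by decreasing priority.
    Returns None if the iteration exceeds the task period.
    """
    period, wcet = tasks[index]
    r = wcet
    while True:
        # Interference from all higher-priority tasks released in [0, r).
        interference = sum(-(-r // p) * c for p, c in tasks[:index])
        nxt = wcet + interference
        if nxt == r:
            return r
        if nxt > period:
            return None
        r = nxt


def cross_check(observed, bounds):
    """Indices of tasks whose simulated maximum exceeds the analytic bound.

    Any hit means that the simulation model, the analysis model, or the
    input data is wrong, which is the point of cross-validation.
    """
    return [i for i, (o, b) in enumerate(zip(observed, bounds))
            if b is not None and o > b]


TASKS = [(5, 1), (10, 3), (20, 5)]           # hypothetical (period, wcet)
BOUNDS = [response_time_bound(TASKS, i) for i in range(len(TASKS))]
OBSERVED = [1, 4, 9]                         # hypothetical simulated maxima
print("bounds:", BOUNDS, "violations:", cross_check(OBSERVED, BOUNDS))
```

The reverse check is just as useful: a simulated maximum far below the upper bound is not an error, but a large gap is a hint that either the bound is pessimistic or the simulation has not explored the unfavourable scenarios.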

Our experience: for complex systems, validating timing-accurate simulation models is much easier than validating mathematical models

Illustration: SOME/IP middleware [7,8]

SOME/IP SD: service discovery for automotive Ethernet

Objective: find the right tradeoff between subscription latency and SOME/IP SD overhead



## Simulation for .. safety-critical systems?!

Our view: if a system can be made robust to rare (quantified) faults such as deadline misses, then designing with simulation is more effective in terms of resource usage

#### Know what to expect from simulation – typically:

- ✓ Worst-case behaviors are out of reach, but they are extremely rare events (e.g., $Pr \ll 10^{-6}$, see [1])
- ✓ Able to provide guarantees for events up to $Pr < 10^{-6}$ in a few hours
- ✓ Coarse-grained lower-bounds analysis to cross-validate

#### Sound simulation methodology – see [1]

- ✓ Q1: is a single run enough ?
- ✓ Q2: can we run simulation in parallel and aggregate results?
- ✓ Q3: simulation length?
- ✓ Q4: correlations between "feared events"?
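To give a feel for Q3, the sketch below computes how many samples are needed before an event of probability p is seen at least once with a chosen confidence. It assumes independent samples, which is a strong simplification (Q4 is precisely about correlations), and it is a textbook calculation, not the methodology of [1]. The function name and the confidence level are illustrative choices.

```python
import math


def samples_needed(event_probability, confidence):
    """Smallest n such that 1 - (1 - p)^n >= confidence, samples independent."""
    # log1p keeps precision when the event probability is very small.
    n = math.log(1.0 - confidence) / math.log1p(-event_probability)
    return math.ceil(n)


# Hypothetical target: see a 1e-6 event at least once with 95% confidence.
n = samples_needed(1e-6, 0.95)
print("samples needed:", n)
```

Under the same independence assumption, samples collected from several runs executed in parallel can simply be pooled, which is one way to read Q2; whether that is legitimate for a given system is what the methodology has to establish.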




#### Ahead of us #1: timing-Augmented Model Driven Development

✓ Functional integration fails if control engineering assumptions are not met at run-time: sampling jitters, varying response times, etc.



Solution: injecting delays in the simulation, but how to do that at an early stage without knowledge of the complete configuration?
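A toy example of delay injection, in Python, shows why the delays matter for the functional result. The plant (a pure integrator), the proportional controller, the gain and the step size are all hypothetical choices made for illustration; the only point is that the same control law behaves differently once the controller sees the output a few samples late.

```python
def step_response(delay_steps, gain=20.0, dt=0.01, steps=300):
    """Simulate x[k+1] = x[k] + dt * gain * (1 - x[k - delay_steps]).

    The controller tracks a setpoint of 1.0 but only sees the plant output
    delay_steps samples late; before any sample is available it sees 0.0.
    """
    xs = [0.0]
    for k in range(steps):
        seen = xs[k - delay_steps] if k >= delay_steps else 0.0
        xs.append(xs[k] + dt * gain * (1.0 - seen))
    return xs


def overshoot(xs, setpoint=1.0):
    """Largest excursion above the setpoint, 0.0 if there is none."""
    return max(0.0, max(xs) - setpoint)


for d in (0, 5):
    print("delay of", d, "samples -> overshoot", overshoot(step_response(d)))
```

In this sketch the undelayed loop converges without overshoot, while a delay of five samples produces a large one. At an early design stage the value of `delay_steps` is exactly what is unknown, which is why the ongoing work below starts from the timing behaviour the designer can accept rather than from the final configuration.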

#### Ongoing work:

1. Designer defines a timing-acceptable solution in terms of significant events: order & quantified relationships between them
2. Derive the QoS needed from the runtime systems: CPU, comm. latencies
3. Resource reservation & QoS ensured at run-time

## Ahead of us #2: finding initial conditions leading to degraded performance → worst-case oriented simulation



Avionics network: the 3214 flows of data sorted by increasing communication latencies

## Ahead of us #2: simulation is unable to find pessimistic situations ... unlike lower bound analysis



## Key takeaways

- ✓ Complex mathematical models are a dead end for systems not conceived with analyzability as a requirement → they cannot catch up with the complexity, see [1]
- ✓ Simulation is effective for critical systems that can tolerate faults with a *controlled* risk → best resource usage
  - Need for proper methodology
  - Cross-validation is a must-have
  - Models and their assumptions should be questioned by end-users
- ✓ Today: high-performance timing-accurate simulation of complete heterogeneous embedded architectures
- ✓ Ahead of us: system-level simulation with functional behavior within a Model-Driven Engineering flow

### References

- [1] N. Navet, J. Seyler, J. Migge, "Timing verification of real-time automotive Ethernet networks: what can we expect from simulation?", Embedded Real-Time Software and Systems (ERTS 2016), Toulouse, France, January 27-29, 2016.
- [2] S. Altmeyer, N. Navet, "Towards a declarative modeling and execution framework for real-time systems", First IEEE Workshop on Declarative Programming for Real-Time and Cyber-Physical Systems, San-Antonio, USA, December 1, 2015.
- [3] H. Bauer, J.-L. Scharbarg, C. Fraboul, "Improving the Worst-Case Delay Analysis of an AFDX Network Using an Optimized Trajectory Approach", IEEE Transactions on Industrial Informatics, Vol. 6, No. 4, November 2010.
- [4] CPAL, the Cyber-Physical Action Language, freely available from http://www.designcps.com, 2015.
- [5] N. Navet, S. Louvart, J. Villanueva, S. Campoy-Martinez, J. Migge, "Timing verification of automotive communication architectures using quantile estimation", Embedded Real-Time Software and Systems (ERTS 2014), Toulouse, France, February 5-7, 2014.
- [6] N. Navet, L. Fejoz, L. Havet, S. Altmeyer, "Lean Model-Driven Development through Model-Interpretation: the CPAL design flow", Technical report from the University of Luxembourg, to be presented at ERTSS2016, October 2015.
- [7] J. Seyler, N. Navet, L. Fejoz, "Insights on the Configuration and Performances of SOME/IP Service Discovery", in SAE International Journal of Passenger Cars - Electronic and Electrical Systems, 8(1), 124-129, 2015.
- [8] J. Seyler, T. Streichert, M. Glaß, N. Navet, J. Teich, "Formal Analysis of the Startup Delay of SOME/IP Service Discovery", Design, Automation and Test in Europe (DATE2015), Grenoble, France, March 13-15, 2015.
- [9] F. Boniol and V. Wiels, "Landing gear system", case study presented at ABZ2014, 2014.
- [10] AUTOSAR, "Specification of Timing Extensions", Release 4.0 Rev 2, 2010.
- [11] M. Tatar, "Inside an Engine ECU a view you've not seen before", LinkedIn Pulse, 2016.