Network Calculus with Flow Prolongation -- A Feedforward FIFO Analysis enabled by ML
The derivation of upper bounds on data flows' worst-case traversal times is
an important task in many application areas. For accurate bounds, model
simplifications should be avoided even in large networks. Network Calculus (NC)
provides a modeling framework and different analyses for delay bounding. We
investigate the analysis of feedforward networks where all queues implement
First-In First-Out (FIFO) service. Correctly considering the effect of data
flows on each other under FIFO is already a challenging task. Yet, the
fastest available NC FIFO analysis suffers from limitations resulting in
unnecessarily loose bounds. A feature called Flow Prolongation (FP) has been
shown to improve delay bound accuracy significantly. Unfortunately, FP needs to
be executed within the NC FIFO analysis very often and each time it creates an
exponentially growing set of alternative networks with prolongations. FP
therefore does not scale and has been out of reach for the exhaustive analysis
of large networks. We introduce DeepFP, an approach to make FP scale by
predicting prolongations using machine learning. In our evaluation, we show
that DeepFP can improve results in FIFO networks considerably. Compared to the
standard NC FIFO analysis, DeepFP reduces delay bounds by 12.1% on average at
negligible additional computational cost
On the Robustness of Deep Learning-predicted Contention Models for Network Calculus
The network calculus (NC) analysis takes a simple model consisting of a
network of schedulers and data flows crossing them. A number of analysis
"building blocks" can then be applied to capture the model without imposing
pessimistic assumptions like self-contention on tandems of servers. Yet, adding
pessimism cannot always be avoided. To compute the best bound on a single
flow's end-to-end delay thus boils down to finding the least pessimistic
contention models for all tandems of schedulers in the network - and an
exhaustive search can easily become a very resource intensive task. The
literature proposes a promising solution to this dilemma: a heuristic making
use of machine learning (ML) predictions inside the NC analysis.
While results of this work were promising in terms of delay bound quality and
computational effort, there is little to no insight on when a prediction is
made or if the trained algorithm can achieve similarly striking results in
networks vastly differing from its training data. In this paper, we address
these pending questions. We evaluate the influence of the training data and its
features on accuracy, impact and scalability. Additionally, we contribute an
extension of the method by predicting the best contention model
alternatives in order to achieve increased robustness for its application
outside the training data. Our numerical evaluation shows that good accuracy
can still be achieved on large networks although we restrict the training to
networks that are two orders of magnitude smaller
Exact Worst-case Delay in FIFO-multiplexing Feed-forward Networks
In this paper, we compute the actual worst-case end-to-end delay for a flow in a feed-forward network of first-in–first-out (FIFO)-multiplexing service curve nodes, where flows are shaped by piecewise-affine concave arrival curves, and service curves are piecewise affine and convex. We show that the worst-case delay problem can be formulated as a mixed integer linear programming problem, whose size grows exponentially with the number of nodes involved. Furthermore, we present approximate solution schemes to find upper and lower bounds on the worst-case delay. Both require solving just one linear programming problem and yield bounds that are generally more accurate than those found in previous work, which are computed under more restrictive assumptions
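As a point of orientation for the arrival-curve and service-curve terms used in this abstract, the simplest single-node case can be sketched in a few lines. The sketch assumes a token-bucket arrival curve alpha(t) = burst + rate * t and a rate-latency service curve beta(t) = service_rate * max(0, t - latency); the function and parameter names are illustrative and are not taken from the paper, and this is the textbook single-node bound, not the paper's mixed integer formulation.

```python
import math


def delay_bound(burst, rate, service_rate, latency):
    """Horizontal deviation between a token-bucket arrival curve
    alpha(t) = burst + rate * t and a rate-latency service curve
    beta(t) = service_rate * max(0, t - latency).

    All names are illustrative assumptions, not the paper's notation.
    """
    if rate > service_rate:
        # Arrivals outpace the guaranteed service in the long run,
        # so no finite delay bound exists.
        return math.inf
    # The server may stay idle for `latency`, then needs burst / service_rate
    # to clear the initial burst.
    return latency + burst / service_rate


if __name__ == "__main__":
    print(delay_bound(burst=4.0, rate=1.0, service_rate=2.0, latency=3.0))
```

In a multi-node FIFO network the per-node bounds cannot simply be added without pessimism, which is the gap the exact and approximate schemes in the abstract address.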
Differentiable Programming & Network Calculus: Configuration Synthesis under Delay Constraints
With the advent of standards for deterministic network behavior, synthesizing
network designs under delay constraints becomes the natural next task to
tackle. Network Calculus (NC) has become a key method for validating industrial
networks, as it computes formally verified end-to-end delay bounds. However,
analyses from the NC framework have been designed to bound the delay of one
flow at a time. Attempts to use classical analyses to derive a network
configuration have shown that this approach is poorly suited to practical use
cases. Consider finding a delay-optimal routing configuration: one model had to
be created for each routing alternative, then each flow delay had to be
bounded, and then the bounds had to be compared to the given constraints. To
overcome this three-step process, we introduce Differential Network Calculus.
We extend NC to allow the differentiation of delay bounds w.r.t. a wide
range of network parameters - such as flow paths or priority. This opens up NC
to a class of efficient nonlinear optimization techniques that exploit the
gradient of the delay bound. Our numerical evaluation on the routing and
priority assignment problem shows that our novel method can synthesize flow
paths and priorities in a matter of seconds, outperforming existing methods by
several orders of magnitude
Network-Calculus Service Curves of the Interleaved Regulator
The interleaved regulator (implemented by IEEE TSN Asynchronous Traffic
Shaping) is used in time-sensitive networks for reshaping the flows with
per-flow contracts. When applied to an aggregate of flows that come from a FIFO
system, an interleaved regulator that reshapes the flows with their initial
contracts does not increase the worst-case delay of the aggregate. This
shaping-for-free property supports the computation of end-to-end latency bounds
and the validation of the network's timing requirements. A common method to
establish the properties of a network element is to obtain a network-calculus
service-curve model. The existence of such a model for the interleaved
regulator remains an open question. If a service-curve model were found for the
interleaved regulator, then the analysis of this mechanism would no longer be
limited to the situations where the shaping-for-free holds, which would widen
its use in time-sensitive networks. In this paper, we investigate if
network-calculus service curves can capture the behavior of the interleaved
regulator. We find that an interleaved regulator placed outside of the
shaping-for-free requirements (after a non-FIFO system) can yield unbounded
latencies. Consequently, we prove that no network-calculus service curve exists
to explain the interleaved regulator's behavior. It is still possible to find
non-trivial service curves for the interleaved regulator. However, their
long-term rate cannot be large enough to provide any guarantee (specifically,
we prove that for the regulators that process at least four flows with the same
contract, the long-term rate of any service curve is upper bounded by three
times the rate of the per-flow contract).
A time-predictable many-core processor design for critical real-time embedded systems
Critical Real-Time Embedded Systems (CRTES) are in charge of controlling fundamental parts of embedded systems, e.g. energy-harvesting solar panels in satellites, steering and braking in cars, or flight management systems in airplanes. To do so, CRTES require strong evidence of correct functional and timing behavior. The former guarantees that the system operates correctly in response to its inputs; the latter ensures that its operations are performed within a predefined time budget.
CRTES aim at increasing the number and complexity of functions. Examples include the incorporation of "smarter" Advanced Driver Assistance System (ADAS) functionality in modern cars or advanced collision avoidance systems in Unmanned Aerial Vehicles (UAVs). All these new features, implemented in software, lead to an exponential growth in both performance requirements and software development complexity. Furthermore, there is a strong need to integrate multiple functions into the same computing platform to reduce the number of processing units, mass and space requirements, etc. Overall, there is a clear need to increase the computing power of current CRTES in order to support new sophisticated and complex functionality, and integrate multiple systems into a single platform.
The use of multi- and many-core processor architectures is increasingly seen in the CRTES industry as the solution to cope with the performance demand and cost constraints of future CRTES. Many-cores supply higher performance by exploiting the parallelism of applications while providing a better performance per watt as cores are maintained simpler with respect to complex single-core processors. Moreover, the parallelization capabilities allow scheduling multiple functions into the same processor, maximizing the hardware utilization.
However, the use of multi- and many-cores in CRTES also brings a number of challenges related to providing evidence about the correct operation of the system, especially in the timing domain. Hence, despite the advantages of many-cores and the fact that they are nowadays a reality in the embedded domain (e.g. Kalray MPPA, Freescale NXP P4080, TI Keystone II), their use in CRTES still requires finding efficient ways of providing reliable evidence about the correct operation of the system.
This thesis investigates the use of many-core processors in CRTES as a means to satisfy performance demands of future complex applications while providing the necessary timing guarantees. To do so, this thesis advances the state of the art in exploiting the parallel capabilities of many-cores in CRTES, contributing in two different computing domains. From the hardware domain, this thesis proposes new many-core designs that enable deriving reliable and tight timing guarantees. From the software domain, we present efficient scheduling and timing analysis techniques to exploit the parallelization capabilities of many-core architectures and to derive tight and trustworthy Worst-Case Execution Time (WCET) estimates of CRTES.
Real-Time and Energy-Efficient Routing for Industrial Wireless Sensor-Actuator Networks
With the emergence of industrial standards such as WirelessHART, process industries are adopting Wireless Sensor-Actuator Networks (WSANs) that enable sensors and actuators to communicate through low-power wireless mesh networks. Industrial monitoring and control applications require real-time communication among sensors, controllers and actuators within end-to-end deadlines. Deadline misses may lead to production inefficiency, equipment destruction, or irreparable financial and environmental impacts. Moreover, due to the large geographic area and harsh conditions of many industrial plants, it is labor-intensive or dangerous to change batteries of field devices. It is therefore important to achieve long network lifetime with battery-powered devices.
This dissertation tackles these challenges and makes a series of contributions. (1) We present a new end-to-end delay analysis for feedback control loops whose transmissions are scheduled based on the Earliest Deadline First policy. (2) We propose a new real-time routing algorithm that increases the real-time capacity of WSANs by exploiting the insights of the delay analysis. (3) We develop an energy-efficient routing algorithm to improve the network lifetime while maintaining path diversity for reliable communication. (4) Finally, we design a distributed game-theoretic algorithm to allocate sensing applications with near-optimal quality of sensing
A Probabilistically Analyzable Cache to Estimate Timing Bounds
Modern computer architectures are targeted towards speeding up the average performance
of software running on them. Architectural features such as deep pipelines, branch prediction, out-of-order
execution, and multi-level memory hierarchies have an adverse impact on software
timing prediction. Particularly, it is hard or even impossible to make an accurate estimation
of the worst case execution-time (WCET) of a program or software running on a particular
hardware platform.
Critical real-time embedded systems (CRTESs), e.g. computing systems in aerospace,
require strict timing constraints to guarantee their proper operational behavior. WCET
analysis is the central idea of the real-time systems development because real-time systems
always need to meet their deadlines. In order to meet the deadline requirements, WCET of
the real-time systems tasks must be determined, and this is only possible if the hardware
architecture is time-predictable. Due to the unpredictable nature of the modern computing
hardware, it is not practical to use advanced computing systems in CRTESs. The real-time
systems do not need to meet high-performance requirements. Processors designed to
improve average-case performance may not fit the requirements of real-time systems
due to predictability issues.
Current timing analysis techniques are well established, but require detailed knowledge
of the internal operations and the state of the system for both hardware and software. Lack
of in-depth knowledge of the architectural operations becomes an obstacle to adopting the
deterministic timing analysis (DTA) techniques for WCET measurement. Probabilistic timing
analysis (PTA) is a technique that emerged for the timing analysis of the next-generation
real-time systems. The PTA techniques reduce the extent of knowledge of a software execution
platform that is needed to perform the accurate WCET estimations. In this thesis,
we propose the development of a new probabilistically analyzable cache and apply PTA
techniques for time-prediction. In this work, we implemented a randomized cache for MIPS-32
and Leon-3 processors. We designed and implemented random placement and replacement
policies, and applied probabilistic timing techniques to measure probabilistic WCET
(pWCET). We also measured the level of pessimism incurred by the probabilistic techniques
and compared it with the deterministic cache configuration. The WCET prediction provided
by the PTA techniques is closer to the real execution-time of the program. We compared the
estimates with the measurements done on the processor to help the designer to evaluate the
level of pessimism introduced by the cache architecture for each probabilistic timing analysis
technique. This work makes a first attempt towards the comparison of deterministic, static,
and measurement-based probabilistic timing analysis for time-prediction under varying cache
configurations. We identify strengths and limitations of each technique for time-prediction,
and provide guidelines for the design of the processor that minimize the pessimism associated
with WCET. Our experiments show that the cache fulfills all the requirements for PTA and
program prediction can be determined with arbitrary accuracy. Such probabilistic computer
architecture carries unmatched potential and great promise for next generation CRTESs
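The random placement and replacement policies mentioned in this abstract can be illustrated with a small software model. This is a minimal sketch under stated assumptions, not the thesis's hardware design: the class name RandomCache, the helper miss_count, the default geometry, and the use of Python's tuple hash as a stand-in for a hardware placement hash are all hypothetical.

```python
import random


class RandomCache:
    """Toy set-associative cache with random placement and replacement.

    Illustrative model only; names and geometry are assumptions.
    """

    def __init__(self, num_sets=4, ways=2, line_size=32, seed=0):
        self.num_sets = num_sets
        self.ways = ways
        self.line_size = line_size
        self.rng = random.Random(seed)
        # Random placement: a per-run value changes the address-to-set mapping.
        self.placement_seed = self.rng.getrandbits(32)
        self.sets = [[] for _ in range(num_sets)]
        self.hits = 0
        self.misses = 0

    def _set_index(self, line):
        # Stand-in for a hardware hash of (line address, random seed).
        return hash((line, self.placement_seed)) % self.num_sets

    def access(self, address):
        """Return True on a hit, False on a miss."""
        line = address // self.line_size
        ways = self.sets[self._set_index(line)]
        if line in ways:
            self.hits += 1
            return True
        self.misses += 1
        if len(ways) < self.ways:
            ways.append(line)  # free way available
        else:
            # Random replacement: evict a uniformly chosen way.
            ways[self.rng.randrange(self.ways)] = line
        return False


def miss_count(trace, seed):
    """Replay an address trace on a freshly seeded cache and count misses."""
    cache = RandomCache(seed=seed)
    for address in trace:
        cache.access(address)
    return cache.misses


if __name__ == "__main__":
    trace = [i * 32 for i in range(16)] * 4
    print([miss_count(trace, seed) for seed in range(5)])
```

Replaying the same trace under many seeds yields a distribution of miss counts rather than a single value, which is the kind of input that measurement-based probabilistic timing analysis works from.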