185 research outputs found

    Worst-case temporal analysis of real-time dynamic streaming applications

    Temporal analysis and scheduling of hard real-time radios running on a multi-processor

    On a multi-radio baseband system, multiple independent transceivers must share the resources of a multi-processor while each meets its own hard real-time requirements. Not all possible combinations of transceivers are known at compile time, so a solution must be found that either allows for independent timing analysis or relies on runtime timing analysis. This thesis proposes a design flow and software architecture that meet these challenges, while enabling features such as independent transceiver compilation and dynamic loading, and taking into account other challenges such as ease of programming, efficiency, and ease of validation. We take data flow as the basic model of computation, as it fits the application domain, and several static variants (such as Single-Rate, Multi-Rate and Cyclo-Static) have been shown to possess strong analytical properties. Traditional temporal analysis of data flow can provide minimum throughput guarantees for a self-timed implementation of data flow. Since transceivers may need to guarantee strictly periodic execution and meet latency requirements, we extend the analysis techniques to show that we can enforce strict periodicity for an actor in the graph; we also provide maximum latency analysis techniques for periodic, sporadic and bursty sources. We propose a scheduling strategy and an automatic scheduling flow that enable the simultaneous execution of multiple transceivers with hard real-time requirements, described as Single-Rate Data Flow (SRDF) graphs. Each transceiver has its own execution rate and starts and stops independently of other transceivers, at times unknown at compile time, on a multiprocessor. We show how to combine scheduling and mapping decisions with the input application data flow graph to generate a worst-case temporal analysis graph.
We propose algorithms to find a mapping per transceiver in the form of clusters of statically-ordered actors, and a budget for either a Time Division Multiplex (TDM) or Non-Preemptive Non-Blocking Round Robin (NPNBRR) scheduler per cluster per transceiver. The budget is computed such that if the platform can provide it, then the desired minimum throughput and maximum latency of the transceiver are guaranteed, while minimizing the required processing resources. We illustrate the use of these techniques to map a combination of WLAN and TD-SCDMA receivers onto a prototype Software-Defined Radio platform. The functionality of transceivers for standards with very dynamic behavior, such as WLAN, cannot be conveniently modeled as an SRDF graph, since SRDF is not capable of expressing variations of actor firing rules depending on the values of input data. Because of this, we propose a restricted, customized data flow model of computation, Mode-Controlled Data Flow (MCDF), that can capture the data-value dependent behavior of a transceiver, while allowing rigorous temporal analysis and tight resource budgeting. We develop a number of analysis techniques to characterize the temporal behavior of MCDF graphs, in terms of maximum latencies and throughput. We also provide an extension to MCDF of our scheduling strategy for SRDF. The capabilities of MCDF are then illustrated with a WLAN 802.11a receiver model. Having computed budgets for each transceiver, we propose a way to use these budgets for run-time resource mapping and admissibility analysis. During run-time, at transceiver start time, the budget for each cluster of statically-ordered actors is allocated by a resource manager to platform resources. The resource manager enforces strict admission control, to prevent transceivers from interfering with each other’s worst-case temporal behaviors.
We propose algorithms adapted from Vector Bin-Packing to enable the mapping at start time of transceivers to the multi-processor architecture, considering also the case where the processors are connected by a network-on-chip with resource reservation guarantees, in which case we also find routing and resource allocation on the network-on-chip. In our experiments, our resource allocation algorithms can keep 95% of the system resources occupied, while suffering from an allocation failure rate of less than 5%. An implementation of the framework was carried out on a prototype board. We present performance and memory utilization figures for this implementation, as they provide insights into the costs of adopting our approach. It turns out that the scheduling and synchronization overhead for an unoptimized implementation of the framework, with no hardware support for synchronization, is 16.3% of the cycle budget for a WLAN receiver on an EVP processor at 320 MHz. However, this overhead is less than 1% for mobile standards such as TD-SCDMA or LTE, which have lower rates and thus larger cycle budgets. Considering that clock speeds will increase and that the synchronization primitives can be optimized to exploit the addressing modes available in the EVP, these results are very promising.
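
The start-time mapping step described in this abstract can be illustrated with a generic first-fit-decreasing vector bin-packing heuristic. The sketch below is not the thesis's algorithm; the function names, the two resource dimensions (cycles and memory), and the all-or-nothing admission rule are assumptions chosen for illustration. Each cluster budget is a vector of resource demands, each processor has a vector of free capacity, and a transceiver is admitted only if every one of its clusters fits.

```python
from typing import List, Optional, Tuple

# One entry per resource dimension, e.g. (cycle budget, memory). Hypothetical units.
Vector = Tuple[int, ...]


def fits(demand, free) -> bool:
    """A demand fits when every resource dimension has enough free capacity."""
    return all(d <= f for d, f in zip(demand, free))


def first_fit_admit(clusters: List[Vector], free: List[Vector]) -> Optional[List[int]]:
    """Try to place all cluster budgets of one transceiver.

    Returns the processor index chosen for each cluster, or None when the
    transceiver must be rejected. `free` is updated in place only when the
    whole transceiver is admitted, so a rejection never disturbs the
    reservations of transceivers that are already running.
    """
    trial = [list(f) for f in free]  # work on a copy until admission succeeds
    placement = [-1] * len(clusters)
    # Largest total demand first (the "decreasing" part of the heuristic).
    order = sorted(range(len(clusters)), key=lambda i: sum(clusters[i]), reverse=True)
    for i in order:
        for p, capacity in enumerate(trial):
            if fits(clusters[i], capacity):
                for k, d in enumerate(clusters[i]):
                    capacity[k] -= d
                placement[i] = p
                break
        else:
            return None  # no processor can host this cluster: refuse admission
    free[:] = [tuple(c) for c in trial]
    return placement


if __name__ == "__main__":
    processors = [(100, 64), (100, 64)]
    print(first_fit_admit([(60, 10), (50, 40), (30, 20)], processors))
    print(processors)
```

Committing the reservation only after every cluster has been placed is one simple way to obtain the strict admission control the abstract mentions; a real resource manager would also have to reserve network-on-chip routes, which this sketch leaves out.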

    An Abstraction-Refinement Theory for the Analysis and Design of Concurrent Real-Time Systems

    Concurrent real-time systems with shared resources belong to the class of safety-critical systems for which both temporally and functionally conservative guarantees must be determined. However, the growing complexity of real-time systems makes it more and more challenging to apply standard techniques for their analysis. In particular, the presence of both cyclic data dependencies and cyclic resource dependencies makes many related analysis approaches inapplicable. The use of Static Priority Preemptive (SPP) scheduling further impedes the use of many "classical" analysis techniques. To address this growing complexity and still be able to give guarantees, we present an abstraction-refinement theory for real-time systems. We introduce a timed component model that is defined in such a generic way that both real-time system implementations and any kind of analysis model for such applications can be expressed therein. Thereafter, we devise three different abstraction-refinement theories for the timed component model: exclusion, inclusion and bounding. Exclusion can be used to remove unconsidered corner cases, inclusion allows for the substitution of uncertainty with non-determinism, and bounding permits replacing non-determinism with determinism. The latter enables the creation of efficiently analyzable models that can be used to give temporal or functional guarantees on non-deterministic and non-monotone implementations. We use such abstractions to construct analysis models from concurrent real-time systems with shared resources and SPP scheduling. On these models we apply various analysis techniques, with the goal of increasing analysis accuracy. Our first accuracy improvement is achieved by combining the rather coarse state-of-the-art period-and-jitter interference characterization with an explicit consideration of cyclic data dependencies.
The interference-limiting effect of such cycles can be exploited even further with "iterative buffer sizing". Next we replace period-and-jitter with execution intervals, resulting in even higher accuracy. In our last approach we increase both accuracy and applicability by supporting real-time systems with tasks consisting of multiple phases and operating at different rates. With a modification of this approach we further enable the analysis of applications with multiple shared resources. Finally, we also present the so-called HAPI simulator, which is capable of simulating any kind of concurrent real-time system with shared resources.

    Ordonnancement hybride des applications flots de données sur des systèmes embarqués multi-coeurs (Hybrid scheduling of dataflow applications on multi-core embedded systems)

    Embedded systems are increasingly present in industry and in everyday life. A large share of these systems run data-intensive applications: they use many digital filters, in which the operations on data are repetitive and involve limited control. Thanks to their inherent functional determinism, dataflow graphs are widely used to model such "data-driven" embedded systems. Static periodic scheduling of dataflow graphs has been studied extensively, especially for two particular models: SDF and CSDF. This thesis focuses on the periodic scheduling of CSDF graphs. The problem is to identify infinite periodic sequences of actor firings that yield complete executions with bounded buffers. The aim is to address this problem from several angles: throughput maximization, latency minimization, and buffer capacity minimization. Most existing work proposes solutions for throughput optimization and neglects latency optimization, and in some cases even proposes schedules that degrade latency in order to preserve periodicity properties. This thesis proposes a hybrid schedule, called Self-Timed Periodic (STP), that retains the properties of a periodic schedule while considerably improving its latency. One of the most important aspects of parallel computing is its close relation to the underlying hardware and programming models. In this PhD thesis, we take dataflow as the basic model of computation, as it fits the streaming application domain.
Cyclo-Static Dataflow (CSDF) is particularly interesting because this variant is one of the most expressive dataflow models while still being analyzable at design time. Describing the system at higher levels of abstraction is not sufficient; for example, dataflow has no direct means to optimize communication channels, which are generally based on shared buffers. Therefore, we need to link the dataflow MoCs used for performance analysis of the programs, the real-time task models used for timing analysis, and the low-level model used to derive communication times. This thesis proposes a design flow that meets these challenges, while enabling features such as temporal isolation and taking into account other challenges such as predictability and ease of validation. To this end, we propose a new scheduling policy called Self-Timed Periodic (STP), which is an execution model combining Self-Timed Scheduling (STS) with periodic scheduling. In STP scheduling, actors are no longer strictly periodic but are self-timed and assigned to periodic levels: the period of each actor under periodic scheduling is replaced by its worst-case execution time. STP thus retains some of the performance and flexibility of a self-timed schedule, in which execution times of actors need only be estimates, and at the same time makes use of the fact that with a periodic schedule we can derive a tight estimation of the required performance metrics.
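
A periodic schedule with bounded buffers, as discussed in this abstract, presupposes that the graph is consistent, meaning that its balance equations have a positive integer solution (the repetition vector). The sketch below computes that vector for a plain SDF graph; it is a textbook construction, not the STP scheduler of the thesis, and the function name and the channel tuple layout are assumptions. For CSDF the same equations are usually written with the rates summed over one full cycle of phases.

```python
from fractions import Fraction
from functools import reduce
from math import gcd


def repetition_vector(actors, channels):
    """Solve the SDF balance equations q[src] * produced == q[dst] * consumed.

    `channels` holds tuples (src, dst, produced per firing, consumed per firing).
    Returns the smallest positive integer firing counts, or raises ValueError
    if the graph is disconnected or the rates are inconsistent.
    """
    rate = {actors[0]: Fraction(1)}
    changed = True
    while changed:  # propagate relative firing rates along the channels
        changed = False
        for src, dst, prod, cons in channels:
            if src in rate and dst not in rate:
                rate[dst] = rate[src] * prod / cons
                changed = True
            elif dst in rate and src not in rate:
                rate[src] = rate[dst] * cons / prod
                changed = True
    if len(rate) != len(actors):
        raise ValueError("graph is not connected")
    for src, dst, prod, cons in channels:
        if rate[src] * prod != rate[dst] * cons:
            raise ValueError("inconsistent rates: no bounded periodic schedule exists")
    # Scale the rational rates to the smallest integer solution.
    scale = reduce(lambda a, b: a * b // gcd(a, b), (r.denominator for r in rate.values()), 1)
    counts = {a: int(rate[a] * scale) for a in actors}
    common = reduce(gcd, counts.values())
    return {a: c // common for a, c in counts.items()}


if __name__ == "__main__":
    # A produces 2 tokens per firing, B consumes 3; B produces 1, C consumes 2.
    print(repetition_vector(["A", "B", "C"], [("A", "B", 2, 3), ("B", "C", 1, 2)]))
```

Exact rational arithmetic is used so that the consistency check is not affected by floating-point rounding.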

    An accurate analysis for guaranteed performance of multiprocessor streaming applications

    For more than a decade, consumer electronic devices have been available for entertainment, educational, or telecommunication tasks based on multimedia streaming applications, i.e., applications that process streams of audio and video samples in digital form. Multimedia capabilities are expected to become more and more commonplace in portable devices. This leads to challenges with respect to cost efficiency and quality. This thesis contributes models and analysis techniques for improving the cost efficiency, and therefore also the quality, of multimedia devices. Portable consumer electronic devices should feature flexible functionality on the one hand and low power consumption on the other hand. Those two requirements are conflicting. Therefore, we focus on a class of hardware that represents a good trade-off between those two requirements, namely domain-specific multiprocessor systems-on-chip (MP-SoC). Our research work contributes to dynamic (i.e., run-time) optimization of MP-SoC system metrics. The central question in this area is how to ensure that real-time constraints are satisfied and the metric of interest, such as perceived multimedia quality or power consumption, is optimized. In these cases, we speak of quality-of-service (QoS) and power management, respectively. In this thesis, we pursue real-time constraint satisfaction that is guaranteed by the system by construction and proven mainly based on analytical reasoning. That approach is often taken in real-time systems to ensure reliable performance. Therefore the performance analysis has to be conservative, i.e., it has to use pessimistic assumptions on the unknown conditions that can negatively influence the system performance. We adopt this hypothesis as the foundation of this work. Therefore, the subject of this thesis is the analysis of guaranteed performance for multimedia applications running on multiprocessors.
It is very important to note that our conservative approach is essentially different from considering only the worst-case state of the system. Unlike the worst-case approach, our approach is dynamic, i.e., it makes use of run-time characteristics of the input data and the environment of the application. The main purpose of our performance analysis method is to guide the run-time optimization. Typically, a resource or quality manager predicts the execution time, i.e., the time it takes the system to process a certain number of input data samples. When the execution times get smaller, due to the dependency of the execution time on the input data, the manager can switch the control parameter for the metric of interest such that the metric improves but the system gets slower. For power optimization, that means switching to a low-power mode. If execution times grow, the manager can set parameters so that the system gets faster. For QoS management, for example, the application can be switched to a different quality mode with some degradation in perceived quality. The real-time constraints are then never violated and the metrics of interest are kept as good as possible. Unfortunately, maintaining system metrics such as power and quality at the optimal level conflicts with our main requirement, i.e., providing performance guarantees, because for this one has to give up some quality or power consumption. Therefore, the performance analysis approach developed in this thesis is not only conservative, but also accurate, so that the optimization of the metric of interest does not suffer too much from conservatism. This is not trivial to realize when two factors are combined: parallel execution on multiple processors and dynamic variation of the data-dependent execution delays. We achieve the goal of conservative and accurate performance estimation for an important class of multiprocessor platforms and multimedia applications.
Our performance analysis technique is realizable in practice in QoS or power management setups. We consider a generic MP-SoC platform that runs a dynamic set of applications, each application possibly using multiple processors. We assume that the applications are independent, although it is possible to relax this requirement in the future. To support real-time constraints, we require that the platform can provide guaranteed computation, communication and memory budgets for applications. Following important trends in system-on-chip communication, we support both global buses and networks-on-chip. We represent every application as a homogeneous synchronous dataflow (HSDF) graph, where the application tasks are modeled as graph nodes, called actors. We allow dynamic data-dependent actor execution delays, which makes HSDF graphs very useful for expressing modern streaming applications. Our reason to consider HSDF graphs is that they provide a good basic foundation for analytical performance estimation. In this setup, this thesis provides three major contributions: 1. Given an application mapped to an MP-SoC platform, given the performance guarantees for the individual computation units (the processors) and the communication unit (the network-on-chip), and given constant actor execution delays, we derive the throughput and the execution time of the system as a whole. 2. Given a mapped application and platform performance guarantees as in the previous item, we extend our approach for constant actor execution delays to dynamic data-dependent actor delays. 3. We propose a global implementation trajectory that starts from the application specification and goes through design-time and run-time phases. It uses an extension of the HSDF model of computation to reflect the design decisions made along the trajectory.
We present our model and trajectory not only to put the first two contributions into the right context, but also to present our vision on different parts of the trajectory, to make a complete and consistent story. Our first contribution uses the idea of so-called IPC (inter-processor communication) graphs known from the literature, whereby a single model of computation (i.e., HSDF graphs) is used to model not only the computation units, but also the communication unit (the global bus or the network-on-chip) and the FIFO (first-in-first-out) buffers that form a ‘glue’ between the computation and communication units. We were the first to propose HSDF graph structures for modeling bounded FIFO buffers and guaranteed-throughput network connections for the network-on-chip communication in MP-SoCs. As a result, our HSDF models enable the formalization of the on-chip FIFO buffer capacity minimization problem under a throughput constraint as a graph-theoretic problem. Using HSDF graphs to formalize that problem helps to find the performance bottlenecks in a given solution to this problem and to improve this solution. To demonstrate this, we use the JPEG decoder application case study. Also, we show that, assuming constant actor delays (worst-case for the given JPEG image), we can predict execution times of JPEG decoding on two processors with an accuracy of 21%. Our second contribution is based on an extension of the scenario approach. This approach is based on the observation that the dynamic behavior of an application is typically composed of a limited number of sub-behaviors, i.e., scenarios, that have similar resource requirements, i.e., similar actor execution delays in the context of this thesis. The previous work on scenarios treats only single-processor applications or multiprocessor applications that do not exploit all the flexibility of the HSDF model of computation.
We develop new scenario-based techniques in the context of HSDF graphs, to derive the timing overlap between different scenarios, which is very important to achieve good accuracy for general HSDF graphs executing on multiprocessors. We exploit this idea in an application case study, the MPEG-4 arbitrarily-shaped video decoder, and demonstrate execution time prediction with an average accuracy of 11%. To the best of our knowledge, for the given setup, no other existing performance analysis technique can provide comparable accuracy and at the same time performance guarantees.
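
For the constant-delay case named in the first contribution, a standard result for self-timed HSDF execution is that the steady-state throughput equals the reciprocal of the maximum cycle mean (MCM): the largest ratio, over all cycles, of total actor delay to the number of initial tokens on the cycle. The sketch below computes the MCM by enumerating simple cycles, which is only practical for small graphs; it illustrates that textbook result and is not the analysis developed in the thesis. The function name, the edge tuple layout, and the example graph are assumptions.

```python
from typing import Dict, List, Tuple

Edge = Tuple[str, str, int]  # (source actor, destination actor, initial tokens)


def maximum_cycle_mean(delay: Dict[str, float], edges: List[Edge]) -> float:
    """Return the maximum over simple cycles of (sum of delays) / (tokens).

    Returns 0.0 for an acyclic graph. Raises ValueError when a cycle carries
    no initial token, because such a graph deadlocks.
    """
    successors: Dict[str, List[Tuple[str, int]]] = {}
    for src, dst, tokens in edges:
        successors.setdefault(src, []).append((dst, tokens))
    rank = {actor: i for i, actor in enumerate(sorted(delay))}
    best = 0.0

    def explore(start, node, total_delay, total_tokens, visited):
        nonlocal best
        for nxt, tokens in successors.get(node, []):
            if nxt == start:  # the cycle closes here
                cycle_tokens = total_tokens + tokens
                if cycle_tokens == 0:
                    raise ValueError("token-free cycle: the graph deadlocks")
                best = max(best, total_delay / cycle_tokens)
            elif nxt not in visited and rank[nxt] > rank[start]:
                # Only visit higher-ranked actors so that every simple cycle
                # is enumerated exactly once, from its lowest-ranked actor.
                explore(start, nxt, total_delay + delay[nxt],
                        total_tokens + tokens, visited | {nxt})

    for actor in delay:
        explore(actor, actor, delay[actor], 0, {actor})
    return best


if __name__ == "__main__":
    delays = {"A": 2, "B": 3, "C": 1}
    graph = [("A", "B", 0), ("B", "C", 0), ("C", "A", 2), ("B", "A", 1)]
    mcm = maximum_cycle_mean(delays, graph)
    print("MCM:", mcm, "throughput:", 1 / mcm)
```

In the example graph, the cycle through A and B with a single token dominates the three-actor cycle with two tokens, so adding a token (buffer space) on the B-to-A edge is what would raise the throughput. This is the kind of bottleneck identification the abstract attributes to the graph-theoretic formulation. Production tools typically replace the exhaustive enumeration with a polynomial-time MCM algorithm.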

    Application Domain-Driven System Design for Pervasive Video Processing

    Pervasive video processing in future Ambient Intelligence environments sets new challenges in embedded system design. In particular, very high performance requirements have to be combined with the constraints of deeply embedded systems, frequently changing operating modes, and low-cost, high-volume production. By leveraging the key properties of the application domain, we devised a computation model, a hardware template, and a programming approach which provide a natural mapping from application requirements to a complete system solution. Our approach enables the direct exploitation of concurrency and regularity in achieving the combined challenge of adaptability, performance, and efficiency.

    Response modeling: model refinements for timing analysis of runtime scheduling in real-time streaming systems

    Wireless Real-Time Communication in Tunnel-like Environments using Wireless Mesh Networks: The WICKPro Protocol

    Industrial applications have been shifting towards wireless networks in recent years because they present several advantages compared with their wired counterparts: lower deployment cost, mobility support, installation in places where cables may be problematic, and easier reconfiguration. These industrial wireless networks usually must provide real-time communication to meet application requirements. Examples of wireless real-time communication for industrial applications can be found in factory automation and process control, where radio-frequency wireless communication technologies have been employed to support flexible real-time communication with simple deployment. Likewise, industry is also interested in real-time communication in underground environments, since there are several activities that are carried out in scenarios such as tunnels and mines, including mining, surveillance, intervention, and rescue operations.
Wireless Mesh Networks (WMNs) are promising enablers of wireless real-time communication because they provide a wireless backbone composed of dedicated routers that is used by mobile terminals. However, WMNs also present several challenges: wireless multi-hopping causes inter-flow and intra-flow interference, and wireless propagation suffers from shadowing and multi-path fading. The IEEE 802.11 standard has been widely used in WMNs due to its low cost and its operation in unlicensed frequency bands. The downside is that its Medium Access Control (MAC) protocol is non-deterministic, and that its communications suffer from the hidden and exposed terminal problems. This PhD thesis focuses on real-time communication in tunnel-like environments using WMNs. In particular, we develop a MAC and network protocol on top of the IEEE 802.11 standard to provide real-time capabilities, called WIreless Chain networK Protocol (WICKPro). Two WICKPro versions are designed to provide Firm Real-Time (FRT) or Soft Real-Time (SRT) traffic support: FRT-WICKPro and SRT-WICKPro. We also propose a hand-off algorithm dubbed Double-Threshold Hand-off (DoTHa) to manage mobility in SRT-WICKPro. WICKPro employs a token-passing scheme to solve the inter-flow and intra-flow interference as well as the hidden and exposed terminal problems, since this scheme does not allow two nodes to transmit at the same time. This is a reasonable solution for small-scale networks where spatial reuse is impossible or limited. The non-deterministic nature of IEEE 802.11 is addressed by combining the token-passing mechanism with a polling approach based on a global cyclic packet schedule. As usual in cyclic scheduling, the hyper-period is divided into minor cycles. FRT-WICKPro triggers the token synchronously and strictly adheres to the minor cycles, whereas SRT-WICKPro carries out asynchronous token-passing and lets minor cycles be overrun, thereby decoupling the theoretical and the actual minor cycles.
Finally, DoTHa deals with shadowing and multi-path fading. Shadowing is addressed by allowing hand-off to be triggered in both the connected and transitional regions of a link, while multi-path fading is neglected for hand-off purposes by smoothing the received signal power. We tested our proposals in laboratory and field experiments, as well as in simulation. As a case study, we carried out the tele-operation of a mobile robot within two confined environments: the corridors of a building and the Somport tunnel. The Somport tunnel is an old out-of-service railway tunnel that connects Spain and France through the Central Pyrenees. Although autonomous robots are becoming more and more important, the technology is not mature enough to manage highly dynamic environments such as reconfigurable manufacturing systems, or to make life-and-death decisions, e.g., after a disaster with radioactive contamination. Applications that can benefit from mobile robot tele-operation include real-time monitoring and the use of robotized machinery, for example, dumper trucks and tunneling machines, which could be remotely operated to avoid endangering human lives.
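
The division of a hyper-period into minor cycles, on which the global cyclic packet schedule of this abstract relies, can be sketched in a few lines. The code below follows one common convention from cyclic-executive scheduling: the hyper-period is the least common multiple of the flow periods and the minor cycle is their greatest common divisor. Whether WICKPro sizes its minor cycles this way is not stated in the abstract, so that choice, the function name, and the example periods are assumptions.

```python
from functools import reduce
from math import gcd


def cyclic_frame(periods_ms):
    """Lay out a cyclic schedule skeleton for periodic flows.

    Returns (hyper_period, minor_cycle, releases), where releases[k] lists the
    indices of the flows whose period boundary coincides with the start of
    minor cycle k.
    """
    hyper_period = reduce(lambda a, b: a * b // gcd(a, b), periods_ms)
    minor_cycle = reduce(gcd, periods_ms)
    releases = [
        [i for i, p in enumerate(periods_ms) if (k * minor_cycle) % p == 0]
        for k in range(hyper_period // minor_cycle)
    ]
    return hyper_period, minor_cycle, releases


if __name__ == "__main__":
    # Three hypothetical flows with periods of 20, 40 and 80 ms.
    print(cyclic_frame([20, 40, 80]))
```

In a token-passing design, each entry of the release table would correspond to the set of packets the token has to serve within that minor cycle. A synchronous variant would start every minor cycle on its nominal boundary, while an asynchronous one would let a late cycle push back the next, which is the distinction the abstract draws between the FRT and SRT versions.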

    Efficient Model Checking: The Power of Randomness
