870 research outputs found

    Modeling, Analysis, and Hard Real-time Scheduling of Adaptive Streaming Applications

    Get PDF
    In real-time systems, the application's behavior has to be predictable at compile-time to guarantee timing constraints. However, modern streaming applications which exhibit adaptive behavior due to mode switching at run-time, may degrade system predictability due to unknown behavior of the application during mode transitions. Therefore, proper temporal analysis during mode transitions is imperative to preserve system predictability. To this end, in this paper, we initially introduce Mode Aware Data Flow (MADF) which is our new predictable Model of Computation (MoC) to efficiently capture the behavior of adaptive streaming applications. Then, as an important part of the operational semantics of MADF, we propose the Maximum-Overlap Offset (MOO) which is our novel protocol for mode transitions. The main advantage of this transition protocol is that, in contrast to self-timed transition protocols, it avoids timing interference between modes upon mode transitions. As a result, any mode transition can be analyzed independently from the mode transitions that occurred in the past. Based on this transition protocol, we propose a hard real-time analysis as well to guarantee timing constraints by avoiding processor overloading during mode transitions. Therefore, using this protocol, we can derive a lower bound and an upper bound on the earliest starting time of the tasks in the new mode during mode transitions in such a way that hard real-time constraints are respected.Comment: Accepted for presentation at EMSOFT 2018 and for publication in IEEE Transactions on Computer-Aided Design of Integrated Circuits and Systems (TCAD) as part of the ESWEEK-TCAD special issu

    Improving time predictability of shared hardware resources in real-time multicore systems : emphasis on the space domain

    Get PDF
    Critical Real-Time Embedded Systems (CRTES) follow a verification and validation process on the timing and functional correctness. This process includes the timing analysis that provides Worst-Case Execution Time (WCET) estimates to provide evidence that the execution time of the system, or parts of it, remain within the deadlines. A key design principle for CRTES is the incremental qualification, whereby each software component can be subject to verification and validation independently of any other component, with obvious benefits for cost. At timing level, this requires time composability, such that the timing behavior of a function is not affected by other functions. CRTES are experiencing an unprecedented growth with rising performance demands that have motivated the use of multicore architectures. Multicores can provide the performance required and bring the potential of integrating several software functions onto the same hardware. However, multicore contention in the access to shared hardware resources creates a dependence of the execution time of a task with the rest of the tasks running simultaneously. This dependence threatens time predictability and jeopardizes time composability. In this thesis we analyze and propose hardware solutions to be applied on current multicore designs for CRTES to improve time predictability and time composability, focusing on the on-chip bus and the memory controller. At hardware level, we propose new bus and memory controller designs that control and mitigate contention between different cores and allow to have time composability by design, also in the context of mixed-criticality systems. At analysis level, we propose contention prediction models that factor the impact of contenders and don¿t need modifications to the hardware. We also propose a set of Performance Monitoring Counters (PMC) that provide evidence about the contention. We give an special emphasis on the Space domain focusing on the Cobham Gaisler NGMP multicore processor, which is currently assessed by the European Space Agency for its future missions.Los Sistemas Críticos Empotrados de Tiempo Real (CRTES) siguen un proceso de verificación y validación para su correctitud funcional y temporal. Este proceso incluye el análisis temporal que proporciona estimaciones de el peor caso del tiempo de ejecución (WCET) para dar evidencia de que el tiempo de ejecución del sistema, o partes de él, permanecen dentro de los límites temporales. Un principio de diseño clave para los CRTES es la cualificación incremental, por la que cada componente de software puede ser verificado y validado independientemente del resto de componentes, con beneficios obvios para el coste. A nivel temporal, esto requiere composabilidad temporal, por la que el comportamiento temporal de una función no se ve afectado por otras funciones. CRTES están experimentando un crecimiento sin precedentes con crecientes demandas de rendimiento que han motivado el uso the arquitecturas multi-núcleo (multicore). Los procesadores multi-núcleo pueden proporcionar el rendimiento requerido y tienen el potencial de integrar varias funcionalidades software en el mismo hardware. A pesar de ello, la interferencia entre los diferentes núcleos que aparece en los recursos compartidos de os procesadores multi núcleo crea una dependencia del tiempo de ejecución de una tarea con el resto de tareas ejecutándose simultáneamente en el procesador. Esta dependencia amenaza la predictabilidad temporal y compromete la composabilidad temporal. En esta tésis analizamos y proponemos soluciones hardware para ser aplicadas en los diseños multi núcleo actuales para CRTES que mejoran la predictabilidad y composabilidad temporal, centrándose en el bus y el controlador de memoria internos al chip. A nivel de hardware, proponemos nuevos diseños de buses y controladores de memoria que controlan y mitigan la interferencia entre los diferentes núcleos y permiten tener composabilidad temporal por diseño, también en el contexto de sistemas de criticalidad mixta. A nivel de análisis, proponemos modelos de predicción de la interferencia que factorizan el impacto de los núcleos y no necesitan modificaciones hardware. También proponemos un conjunto de Contadores de Control del Rendimiento (PMC) que proporcionoan evidencia de la interferencia. En esta tésis, damós especial importancia al dominio espacial, centrándonos en el procesador mutli núcleo Cobham Gaisler NGMP, que está siendo actualmente evaluado por la Agencia Espacial Europea para sus futuras misiones

    Performanzanalyse für Multi-Core Multi-Mode Systeme mit gemeinsam genutzten Ressourcen - Verfahren und Anwendung auf AUTOSAR -

    Get PDF
    In order to implement multi-core systems for single-mode and multi-mode real-time applications, as can be found in modern automobiles, their development process requires appropriate methods and tools for timing and performance verification. In this context, this thesis proposes first novel approaches for the analysis of worst-case blocking-times and response-times for single-mode real-time applications that share resources in partitioned multi-core systems. For this purpose a compositional performance analysis methodology is adopted and extended to take into account the contention of tasks on the processor cores and on the shared resources under different combinations of processor scheduling policies and shared resource arbitration strategies. Highly relevant is the compatibility of the proposed analysis methods with the specifications of the automotive AUTOSAR standard, which defines the combination of (1) preemptive, non-preemptive and cooperative core local scheduling with (2) lock-based arbitration of core local shared resources and spinlock-based arbitration of inter-core shared resources. Further, this thesis proposes novel timing analysis solutions for multi-mode distributed real-time systems. For such systems, the settling time of a mode change, called mode change transition latency, is identified as an important system parameter that has been neglected before. This thesis contributes a novel analysis algorithm which gives a maximum bound on each mode change transition latency of multi-mode distributed applications. Knowing the settling time of each mode change, the impact of multiple mode changes and of the possible overload situations can be handled in the early development phases of real-time systems. Finally, an approach for safely handling shared resources across mode changes is presented and a corresponding timing analysis method is contributed. The new analysis solution combines modeling and analysis elements of the multi-core and multi-mode related analysis solutions and focuses on the specification of the AUTOSAR standard. This enables system designers to handle the timing behavior of more complex systems in which the problems of mode management, multi-core scheduling and shared resource arbitration coexist. The applicability and usefulness of the contributed analysis solutions are highlighted by experimental evaluations, which are enabled by the implementation of the proposed analysis methods in a performance analysis tool framework.Um Multicore-Systeme für die Umsetzung zeitkritischer Single- und Multi-Mode Anwendungen in sicherheitskritischen Umgebungen einsetzen zu können, werden in dem Entwicklungsprozess geeignete Analysemethoden und Tools zur Bestimmung des Zeitverhaltens und der Performanz benötigt. Als erster Beitrag dieser Dissertation werden neue Analyseverfahren eingeführt, um die Worst-Case-Antwortzeiten und -Blockierungszeiten für statische Echtzeitanwendungen in Single-Mode eingebetteten Multicore-Systemen mit gemeinsam genutzten Ressourcen zu bestimmen. Die entwickelten Verfahren nutzen einen existierenden kompositionellen Performanzanalyseansatz und erweitern diesen, um verschiedene Kombinationen von partitionierenden Multiprozessor-Schedulingverfahren und –Synchronisationsmechanismen behandeln zu können. Besonders praxisrelevant ist die Möglichkeit, die Kombination von (1) preemptives, nicht-preemptives sowie kooperatives Prozessor-Scheduling und (2) Spinlock-basierten Synchronisationsmechanismen zu analysieren, die heute in AUTOSAR-konformen Automotive-Softwarearchitekturen standardisiert sind. Als zweiter Beitrag wird in dieser Dissertation ein neuer Ansatz für die Analyse der zeitlichen Auswirkungen von mehreren Szenarienübergängen in vernetzten Multi-Mode eingebetteten Systemen eingeführt. Als erste konstruktive Maßnahme ermöglicht das in dieser Arbeit präsentierte Verfahren die Berechnung der Einschwingzeit jedes Szenarioübergangs und leistet dadurch eine wichtige Hilfestellung beim Systementwurf. Auf diese Weise können die Auswirkungen der Szenarienübergänge, einschließlich der zeitlich begrenzten Überlastsituationen, kontrolliert und in den Systementwurf frühzeitig einbezogen werden. Als letzter Beitrag dieser Dissertation wird ein Ansatz für die Handhabung der Zugriffskonflikte auf gemeinsam genutzten Ressourcen in Multi-Mode eingebetteten Multicore-Systemen präsentiert und eine entsprechende Analysemethode eingeführt. Die neue Analyse kombiniert Modellierungs- und Analyse-Elemente der vorher in dieser Arbeit eingeführten Analyseansätze, und ermöglicht die Untersuchung des ungünstigsten Zeitverhaltens viel komplexer eingebetteten Multicore-Systemen. Dabei werden erneut Spezifikationen der AUTOSAR-Standards berücksichtigt. Nicht zuletzt werden alle Analysemethoden in eine Toolumgebung implementiert und für verschiedene Experimente, die deren praktische Anwendbarkeit hervorheben, angewendet

    Multi-resource management in embedded real-time systems

    Get PDF
    This thesis addresses the problem of online multi-resource management in embedded real-time systems. It focuses on three research questions. The first question concentrates on how to design an efficient hierarchical scheduling framework for supporting independent development and analysis of component based systems, to provide temporal isolation between components. The second question investigates how to change the mapping of resources to tasks and components during run-time efficiently and predictably, and how to analyze the latency of such a system mode change in systems comprised of several scalable components. The third question deals with the scheduling and analysis of a set of parallel-tasks with real-time constraints which require simultaneous access to several different resources. For providing temporal isolation we chose a reservation-based approach. We first focused on processor reservations, where timed events play an important role. Common examples are task deadlines, periodic release of tasks, budget replenishment and budget depletion. Efficient timer management is therefore essential. We investigated the overheads in traditional timer management techniques and presented a mechanism called Relative Timed Event Queues (RELTEQ), which provides an expressive set of primitives at a low processor and memory overhead. We then leveraged RELTEQ to create an efficient, modular and extensible design for enhancing a real-time operating system with periodic tasks, polling, idling periodic and deferrable servers, and a two-level fixed-priority Hierarchical Scheduling Framework (HSF). The HSF design provides temporal isolation and supports independent development of components by separating the global and local scheduling, and allowing each server to define a dedicated scheduler. Furthermore, the design addresses the system overheads inherent to an HSF and prevents undesirable interference between components. It limits the interference of inactive servers on the system level by means of wakeup events and a combination of inactive server queues with a stopwatch queue. Our implementation is modular and requires only a few modifications of the underlying operating system. We then investigated scalable components operating in a memory-constrained system. We first showed how to reduce the memory requirements in a streaming multimedia application, based on a particular priority assignment of the different components along the processing chain. Then we investigated adapting the resource provisions to tasks during runtime, referred to as mode changes. We presented a novel mode change protocol called Swift Mode Changes, which relies on Fixed Priority with Deferred preemption Scheduling to reduce the mode change latency bound compared to existing protocols based on Fixed Priority Preemptive Scheduling. We then presented a new partitioned parallel-task scheduling algorithm called Parallel-SRP (PSRP), which generalizes MSRP for multiprocessors, and the corresponding schedulability analysis for the problem of multi-resource scheduling of parallel tasks with real-time constraints. We showed that the algorithm is deadlock-free, derived a maximum bound on blocking, and used this bound as a basis for a schedulability test. We then demonstrated how PSRP can exploit the inherent parallelism of a platform comprised of multiple heterogeneous resources. Finally, we presented Grasp, which is a visualization toolset aiming to provide insight into the behavior of complex real-time systems. Its flexible plugin infrastructure allows for easy extension with custom visualization and analysis techniques for automatic trace verification. Its capabilities include the visualization of hierarchical multiprocessor systems, including partitioned and global multiprocessor scheduling with migrating tasks and jobs, communication between jobs via shared memory and message passing, and hierarchical scheduling in combination with multiprocessor scheduling. For tracing distributed systems with asynchronous local clocks Grasp also supports the synchronization of traces from different processors during the visualization and analysis

    Evaluation of Edge AI Co-Processing Methods for Space Applications

    Get PDF
    The recent years spread of SmallSats offers several new services and opens to the implementation of new technologies to improve the existent ones. However, the communication link to Earth in order to process data often is a bottleneck, due to the amount of collected data and the limited bandwidth. A way to face this challenge is edge computing, which supposedly discards useless data and fasten up the transmission, and therefore the research has moved towards the study of COTS architectures to be used in space, often organized in co-processing setups. This thesis considers AI as application use case and two devices in a controller-accelerator configuration. It proposes to investigate the performances of co-processing methods such as simple parallel, horizontal partitioning and vertical partitioning, for a set of different tasks and taking advantage of different pre-trained models. The actual experiments regard only simple parallel and horizontal partitioning mode, and they compare latency and accuracy results with single processing runs on both devices. Evaluating the results task-by-task, image classification has the best performance improvement taking advantage of horizontal partitioning, with a clear accuracy improvement, as well as semantic segmentation, which shows almost stable accuracy and potentially higher throughput with smaller models input sizes. On the other hand, object detection shows a drop in performances, especially accuracy, which could maybe be improved with more specifically developed models for the chosen hardware. The project clearly shows how co-processing methods are worth of being investigated and can improve system outcomes for some of the analyzed tasks, making future work about it interesting

    Parallel Real-Time Scheduling for Latency-Critical Applications

    Get PDF
    In order to provide safety guarantees or quality of service guarantees, many of today\u27s systems consist of latency-critical applications, e.g. applications with timing constraints. The problem of scheduling multiple latency-critical jobs on a multiprocessor or multicore machine has been extensively studied for sequential (non-parallizable) jobs and different system models and different objectives have been considered. However, the computational requirement of a single job is still limited by the capacity of a single core. To provide increasingly complex functionalities of applications and to complete their higher computational demands within the same or even more stringent timing constraints, we must exploit the internal parallelism of jobs, where individual jobs are parallel programs and can potentially utilize more than one core in parallel. However, there is little work considering scheduling multiple parallel jobs that are latency-critical. This dissertation focuses on developing new scheduling strategies, analysis tools, and practical platform design techniques to enable efficient and scalable parallel real-time scheduling for latency-critical applications on multicore systems. In particular, the research is focused on two types of systems: (1) static real-time systems for tasks with deadlines where the temporal properties of the tasks that need to execute is known a priori and the goal is to guarantee the temporal correctness of the tasks prior to their executions; and (2) online systems for latency-critical jobs where multiple jobs arrive over time and the goal to optimize for a performance objective of jobs during the execution. For static real-time systems for parallel tasks, several scheduling strategies, including global earliest deadline first, global rate monotonic and a novel federated scheduling, are proposed, analyzed and implemented. These scheduling strategies have the best known theoretical performance for parallel real-time tasks under any global strategy, any fixed priority scheduling and any scheduling strategy, respectively. In addition, federated scheduling is generalized to systems with multiple criticality levels and systems with stochastic tasks. Both numerical and empirical experiments show that federated scheduling and its variations have good schedulability performance and are efficient in practice. For online systems with multiple latency-critical jobs, different online scheduling strategies are proposed and analyzed for different objectives, including maximizing the number of jobs meeting a target latency, maximizing the profit of jobs, minimizing the maximum latency and minimizing the average latency. For example, a simple First-In-First-Out scheduler is proven to be scalable for minimizing the maximum latency. Based on this theoretical intuition, a more practical work-stealing scheduler is developed, analyzed and implemented. Empirical evaluations indicate that, on both real world and synthetic workloads, this work-stealing implementation performs almost as well as an optimal scheduler
    • …
    corecore