15 research outputs found

    DESIGN & DEVELOPMENT OF REAL-TIME MULTITASKING MICROKERNEL BASED ON ARM7TDMI FOR INDUSTRIAL AUTOMATION.

    Get PDF
    A real-time microkernel is the near-minimum amount of software that can provide the mechanisms needed to implement a real-time operating system. Real-time systems are those systems whose response is deterministic in time. In our research a 32-task Real Time Microkernel is designed using which multi tasking can be done on the targeted processor ARM7TDMI. Two sets of functions are developed in this research work. First one is Operating System functions and second is application functions. Operating System functions are mainly for carrying out task creation, multi-tasking, scheduling, context switching and Inter task communication. The process of scheduling and switching the CPU (Central Processing Unit) between several tasks is illustrated in this paper. The number of application functions can vary between 1 to 32. Each of these application functions is created as a task by the microkernel and scheduled by the pre-emptive priority scheduler. Multi tasking of these application tasks is demonstrated in this paper

    Performance Evaluation of Multicore Cache Locking using Multimedia Applications

    Get PDF
    Supporting real-time multimedia applications on multicore systems is a great challenge due to cache’s dynamic behavior. Studies show that cache locking may improve execution time predictability and power/performance ratio. However, entire locking at level-1 cache (CL1) may not be efficient if smaller amount of instructions/data compared to the cache size is locked. An alternative choice may be way (i.e., partial) locking. For some processors, way locking is possible only at level-2 cache (CL2). Even though both CL1 cache locking and CL2 cache locking improve predictability, it is difficult to justify the performance and power trade-off between these two cache locking mechanisms. In this work, we assess the impact of CL1 and CL2 cache locking on the performance, power consumption, and predictability of a multicore system using ISO standard H.264/AVC, MPEG4, and MPEG3 multimedia applications and FFT and DFT codes. Simulation results show that both the performance and predictability can be increased and the total power consumption can be decreased by using a cache locking mechanism added to a cache memory hierarchy. Results also show that for the applications used, CL1 cache locking outperforms CL2 cache locking

    SACR: Scheduling-Aware Cache Reconfiguration for Real-Time Embedded Systems

    Full text link
    Dynamic reconfiguration techniques are widely used for efficient system optimization. Dynamic cache reconfiguration is a promising approach for reducing energy consumption as well as for improving overall system performance. It is a major challenge to introduce cache reconfiguration into real-time embedded systems since dynamic analysis may adversely affect tasks with real-time constraints. This paper presents a novel approach for implementing cache reconfiguration in soft real-time systems by efficiently leveraging static analysis during execution to both minimize energy and maximize performance. To the best of our knowledge, this is the first attempt to integrate dynamic cache reconfiguration in real-time scheduling techniques. Our experimental results using a wide variety of applications have demonstrated that our approach can significantly (up to 74%) reduce the overall energy consumption of the cache hierarchy in soft real-time systems. 1

    Reducing the WCET and analysis time of systems with simple lockable instruction caches

    Get PDF
    One of the key challenges in real-time systems is the analysis of the memory hierarchy. Many Worst-Case Execution Time (WCET) analysis methods supporting an instruction cache are based on iterative or convergence algorithms, which are rather slow. Our goal in this paper is to reduce the WCET analysis time on systems with a simple lockable instruction cache, focusing on the Lock-MS method. First, we propose an algorithm to obtain a structure-based representation of the Control Flow Graph (CFG). It organizes the whole WCET problem as nested subproblems, which takes advantage of common branch-and-bound algorithms of Integer Linear Programming (ILP) solvers. Second, we add support for multiple locking points per task, each one with specific cache contents, instead of a given locked content for the whole task execution. Locking points are set heuristically before outer loops. Such simple heuristics adds no complexity, and reduces the WCET by taking profit of the temporal reuse found in loops. Since loops can be processed as isolated regions, the optimal contents to lock into cache for each region can be obtained, and the WCET analysis time is further reduced. With these two improvements, our WCET analysis is around 10 times faster than other approaches. Also, our results show that the WCET is reduced, and the hit ratio achieved for the lockable instruction cache is similar to that of a real execution with an LRU instruction cache. Finally, we analyze the WCET sensitivity to compiler optimization, showing for each benchmark the right choices and pointing out that O0 is always the worst option

    A Survey on Cache Management Mechanisms for Real-Time Embedded Systems

    Get PDF
    © ACM, 2015. This is the author's version of the work. It is posted here by permission of ACM for your personal use. Not for redistribution. The definitive version was published in ACM Computing Surveys, {48, 2, (November 2015)} http://doi.acm.org/10.1145/2830555Multicore processors are being extensively used by real-time systems, mainly because of their demand for increased computing power. However, multicore processors have shared resources that affect the predictability of real-time systems, which is the key to correctly estimate the worst-case execution time of tasks. One of the main factors for unpredictability in a multicore processor is the cache memory hierarchy. Recently, many research works have proposed different techniques to deal with caches in multicore processors in the context of real-time systems. Nevertheless, a review and categorization of these techniques is still an open topic and would be very useful for the real-time community. In this article, we present a survey of cache management techniques for real-time embedded systems, from the first studies of the field in 1990 up to the latest research published in 2014. We categorize the main research works and provide a detailed comparison in terms of similarities and differences. We also identify key challenges and discuss future research directions.King Saud University NSER

    Instruction and data cache modeling for timing analysis in real-time systems

    Get PDF
    Master'sMASTER OF ENGINEERIN

    Architectures multi-flots simultanés pour le temps-réel strict

    Get PDF
    Dans les systèmes critiques, les applications doivent satisfaire des contraintes temporelles strictes, chaque tâche devant s'exécuter en un temps maximum prédéfini ; le non-respect d'une seule échéance peut compromettre toute la stabilité du système et engendrer des effets désastreux. Un tel système est appelé système temps-réel strict. Pour pouvoir assigner une échéance à une tâche, il faut être capable de déterminer le temps maximum que mettra cette tâche à s'exécuter, ceci indépendamment des données en entrée de la tâche. Ce temps maximum recherché s'appelle le WCET (Worst Case Execution Time, temps d'exécution pire cas), il est souvent déterminé à l'issue d'un processus de calcul nécessitant une modélisation des structures de l'architecture du processeur. Les mécanismes architecturaux qui augmentent les performances d'un processeur (prédiction de branchement, cache) induisent souvent un fort taux d'indéterminisme qui rend la modélisation difficile. C'est pourquoi il est souvent préférable d'utiliser des architectures relativement simples pour un système temps-réel strict, ou de simplifier des architectures hautes performances récentes. Notre optique est plutôt d'essayer d'adapter, par de légères modifications, une de ces architectures performantes mais peu prédictibles pour un respect de contraintes temps-réel strict et un calcul de WCET facilité. L'architecture que nous choisissons est l'architecture Multi-Flots Simultanés (Simultaneous Multihtreading, SMT), ou plusieurs programmes peuvent s'exécuter simultanément en partageant les ressources d'un seul cœur d'exécution.In critical systems, applications must satisfy hard timing constraints, each task must execute in a maximum predefinite time. Any unrespected constraint may compromise the stability of the whole system and generate disastrous effects. Such a system is called hard real-time system. To be able to assign a constraint to a task, you must be able to determinate the maximum time this task will execute, independently from the input data of the task. This maximum time you search is called the WCET (Worst Case Execution Time), it is obtained by a calculation process where we need to modelise the structures of the processor architecture. The architecture mechanisms increasing performance (caches, branch prediction) are often a lot undeterministic and thus are difficult to modelise. That's why we usually prefer using relatively simple architectures for a hard real-time system, or simplifying recent high-performance architecture. In this work, we will rather adapt, using small modifications, one of those high-performance but little predictible architecture to respect hard timing constraints and make simpler WCET calculation. We choose the Simultaneous Multithreading architecture where several programs can run at the same time sharing the resources of one core only

    Simulación de nuevas arquitecturas de memorias caché de procesadores para sistemas empotrados

    Full text link
    Actualmente mas del 95% de los procesadores fabricados se montan en sistemas empotrados. Muchos de estos procesadores se montan en dispositivos móviles alimentados por baterías o sistemas de tiempo real donde un bajo consumo de energía puede ser extremadamente necesario. Gran parte del gasto energético de un procesador es consumido por las memorias on-chip, esto hace que cobre un especial interés la reducción energética de estas memorias sin que ello conlleve una reducción de las prestaciones en dichos procesadores. Actualmente, muchos procesadores empotrados incluyen en su arquitecturas memorias estáticas onchip llamadas scratch-pad memories (SPM), coexistiendo o remplazando a las memorias cache. Comparadas con la cache estas memorias no requieren de etiquetas y una compleja lógica de control lo que conlleva un incremento en la eficiencia tanto en el área de silicio gastada como en el consumo energético. En los últimos años muchos estudios han propuesto algunos algoritmos para meter cuidadosamente segmentos de memoria en la SPM para incrementar el rendimiento y/o reducir el consumo de memoria. Sin embargo muy poco han cambiado la arquitectura de la SPM para hacerla mas controlable, mas eficiente energéticamente y más rápida. En esta memoria presentamos tres posibles técnicas para mejorar el rendimiento y/o consumo energético en un procesador empotrado con una cache convencional. La primera de ella consiste en introducir y bloquear trozos de código en la propia memoria cache, lo que resulta bastante útil en sistemas de tiempo real ya que permite ajustar la cota del WCET repercutiendo en un mejor aprovechamiento del procesador, en la segunda sustituimos la memoria cache por una spm y por ultimo en la tercera de estas técnicas proponemos un nuevo paradigma de control de la SPM para actualizar sus contenido al vuelo, mediante diversos cambios hardware y software. Esta ultima solución esta basada en una pequeña unidad de control que carga código en la SPM mientras este es lanzado a ejecución. Nosotros extendemos la arquitectura del procesador con unas pocas nuevas instrucciones para controlar la SPM, y añadimos diferentes modos de ejecución. La arquitectura resultante reduce los retrasos por la actualización de código en la SPM y motiva a un uso muy dinámico de esta, es decir con actualizaciones de su contenido frecuentes durante la ejecución del programa. Esta técnica presentada es una técnica ortogonal que puede complementarse con diversas técnicas presentadas hasta la fecha para el eficiente uso de la SPM. Todas estas técnicas han sido implementadas en un simulador basado en el popular Simplescalar y han mostrado mejoras en los resultados, de media, de un 30,6% de mejora en el consumo energético y un 7,6% en el rendimiento de la ultima técnica implementada respecto una sistema convencional con cache.Catalá Barber, C. (2011). Simulación de nuevas arquitecturas de memorias caché de procesadores para sistemas empotrados. http://hdl.handle.net/10251/11714.Archivo delegad

    Architecture multi-coeurs et temps d'exécution au pire cas

    Get PDF
    Les tâches critiques en systèmes temps-réel sont soumises à des contraintes temporelles et de correction. La validation d'un tel système repose sur l'estimation du comportement temporel au pire cas de ses tâches. Le partage de ressources, inhérent aux architectures multi-cœurs, entrave le calcul de ces estimations. Le comportement temporel d'une tâche dépend de ses rivales du fait de l'arbitrage de l'accès aux ressources ou de modifications concurrentes de leur état. Cette étude vise à l'estimation de la contribution temporelle de la hiérarchie mémoire au pire temps d'exécution de tâches critiques. Les méthodes existantes, pour caches d'instructions, sont étendues afin de supporter caches de données privés et partagés, et permettre l'analyse de hiérarchies mémoires riches. Le court-circuitage de cache est ensuite utilisé pour réduire la pression sur les caches partagés. Nous proposons à cette fin différentes heuristiques basées sur la capture de la réutilisation de blocs de cache entre différents accès mémoire. Notre seconde proposition est la politique de partitionnement Preti qui permet l'allocation d'un espace sans conflits à une tâche. Preti favorise aussi les performances de tâches non critiques concurrentes aux temps-réel dans les systèmes de criticité hybride.Critical tasks in the context of real-time systems submit to both timing and correctness constraints. Whence, the validation of a real-time system rely on the estimation of its tasks Worst case execution times. Resource sharing, as it occurs on multicore architectures, hinders the computation of such estimates. The timing behaviour of a task is impacted by its concurrents, whether because of resource access arbitration or concurrent modifications of a resource state. This study focuses on estimating the contribution of the memory hierarchy to tasks worst case execution time. Existing analysis methods, defined for instruction caches, are extended to support private and shared data caches, hence allowing for the analysis of rich memory hierarchies. Cache bypass is then used to reduce the pressure laid by concurrent tasks on shared caches levels. We propose different bypass heuristics, based on the capture of cache blocks reuse between memory accesses. Our second proposal is the Preti partitioning scheme which allows for the allocation to tasks of a cache space, free from inter-task conflicts. Preti offers the added benefit of providing for average-case performance to non-critical tasks concurrent to real-time ones on hybrid criticality systems.RENNES1-Bibl. électronique (352382106) / SudocSudocFranceF
    corecore