Search CORE

527 research outputs found

STUDIES OF PERFORMANCE AND PREDICTABILITY OF CACHE-LOCKING

Author: Loach Matt
Publication venue: VCU Scholars Compass
Publication date: 09/05/2013
Field of study

Today’s real-time systems need to be faster and more powerful than ever before. Caches are an architectural feature that helps solve this problem. Caches however are unpredictable and do not improve the worst case execution time. This work studies the effects of cache-locking on performance and time predictability. Two locking methods were evaluated: a dynamic locking method and a static locking method. The performance of single and multi-core processors and multiple levels of caches were studied. The time predictability of the single core system was studied and the cost of the time predictability was determined for each locking method. Cache-locking in the Level 2 cache had the best performance and the static-locking method had the highest predictability

VCU Scholars Compass

WCET-aware prefetching of unlocked instruction caches: a technique for reconciling real-time guarantees and energy efficiency

Author: Wuerges Emílio
Publication venue
Publication date: 01/01/2015
Field of study

Tese (doutorado) - Universidade Federal de Santa Catarina, Centro Tecnológico, Programa de Pós-Graduação em Engenharia de Automação e Sistemas, Florianópolis, 2015.A computação embarcada requer crescente vazão sob baixa potência. Ela requer um aumento de eficiência energética quando se executam programas de crescente complexidade. Muitos sistemas embarcados são também sistemas de tempo real, cuja correção temporal precisa ser garantida através de análise de escalonabilidade, a qual costuma assumir que o WCET de uma tarefa é conhecido em tempo de projeto. Como resultado da crescente complexidade do software, uma quantidade significativa de energia é gasta ao se prover instruções através da hierarquia de memória. Como a cache de instruções consome cerca de 40% da energia gasta em um processador embarcado e afeta a energia consumida em memória principal, ela se torna um relevante alvo para otimização. Entretanto, como ela afeta substancialmente o WCET, o comportamento da cache precisa ser restrito via Â cache lockingÂ ou previsto via análise de WCET. Para obter eficiência energética sob restrições de tempo real, é preciso estender a consciência que o compilador tem da plataforma de hardware. Entretanto, compiladores para tempo real ignoram a energia, embora determinem rapidamente limites superiores para o WCET, enquanto compiladores para sistemas embarcados estimem com precisão a energia, mas gastem muito tempo em Â profilingÂ . Por isso, esta tese propõe um método unificado para estimar a energia gasta em memória, o qual é baseado em Interpretação Abstrata, exatamente o mesmo substrato matemático usado para a análise de WCET em caches. As estimativas mostram derivadas que são tão precisas quanto as obtidas via Â profilingÂ , mas são computadas 1000 vezes mais rápido, sendo apropriadas para induzir otimização de código através de melhoria iterativa. Como Â cache lockingÂ troca eficiência energética por previsibilidade, esta tese propõe uma nova otimização de código, baseada em pré-carga por software, a qual reduz a taxa de faltas de caches de instruções e, provadamente, não aumenta o WCET. A otimização proposta é comparada com o estado-da-arte em Â cache lockingÂ parcial para 37 programas do Â Malardalen WCET benchmarkÂ para 36 configurações de cache e duas tecnologias distintas (2664 casos de uso). Em média, para obter uma melhoria de 68% no WCET, Â cache lockingÂ parcial requer 8% mais energia. Por outro lado, a pré-carga por software diminui o consumo de energia em 11% enquanto melhora em 15% o WCET, reconciliando assim eficiência energética e garantias de tempo real.Abstract : Embedded computing requires increasing throughput at low power budgets. It asks for growing energy efficiency when executing programs of rising complexity. Many embedded systems are also real-time systems, whose temporal correctness is asserted through schedulability analysis, which often assumes that the WCET of each task is known at design-time. As a result of the growing software complexity, a significant amount of energy is spent in supplying instructions through the memory hierarchy. Since an instruction cache consumes around 40% of an embedded processorÂ s energy and affects the energy spent in main memory, it becomes a relevant optimization target. However, since it largely impacts the WCET, cache behavior must be either constrained via cache locking or predicted by WCET analysis. To achieve energy efficiency under real-time constraints, a compiler must have extended awareness of the hardware platform. However, real-time compilers ignore energy, although they quickly determine bounds for WCET, whereas embedded compilers accurately estimate energy but require time-consuming profiling. That is why this thesis proposes a unifying method to estimate memory energy consumption that is based on Abstract Interpretation, the very same mathematical framework employed for the WCET analysis of caches. The estimates exhibit derivatives that are as accurate as those obtained by profiling, but are computed 1000 times faster, being suitable for driving code optimization through iterative improvement. Since cache locking gives up energy efficiency for predictability, this thesis proposes a novel code optimization, based on software prefetching, which reduces miss rate of unlocked instruction caches and, provenly, does not increase the WCET. The proposed optimization is compared with a state-of-the-art partial cache locking technique for the 37 programs of the Malardalen WCET benchmarks under 36 cache configurations and two distinct target technologies (2664 use cases). On average, to achieve an improvement of 68% in the WCET, partial cache locking required 8% more energy. On the other hand, software prefetching decreased the energy consumption by 11% while leading to an improvement of 15% in the WCET, thereby reconciling energy efficiency and real-time guarantees

Repositório Institucional da UFSC

Improving performance of single-path code through a time-predictable memory hierarchy

Author: Cilku Bekim
Prokesch Daniel
Puffitsch Wolfgang
Puschner Peter
Schoeberl Martin
Publication venue: 'Institute of Electrical and Electronics Engineers (IEEE)'
Publication date: 30/06/2017
Field of study

Crossref

Online Research Database In Technology

Best practice for caching of single-path code

Author: Cilku Bekim
Prokesch Daniel
Puschner Peter
Schoeberl Martin
Publication venue
Publication date: 01/01/2017
Field of study

Single-path code has some unique properties that make it interesting to explore different caching and prefetching alternatives for the stream of instructions. In this paper, we explore different cache organizations and how they perform with single-path code

Online Research Database In Technology

A Survey on Cache Management Mechanisms for Real-Time Embedded Systems

Author: ALHAMMAD AHMED
FRÖHLICH ANTÔNIO
GRACIOLI GIOVANI
MANCUSO RENATO
Pellizzoni Rodolfo
Publication venue: 'Association for Computing Machinery (ACM)'
Publication date: 01/11/2015
Field of study

© ACM, 2015. This is the author's version of the work. It is posted here by permission of ACM for your personal use. Not for redistribution. The definitive version was published in ACM Computing Surveys, {48, 2, (November 2015)} http://doi.acm.org/10.1145/2830555Multicore processors are being extensively used by real-time systems, mainly because of their demand for increased computing power. However, multicore processors have shared resources that affect the predictability of real-time systems, which is the key to correctly estimate the worst-case execution time of tasks. One of the main factors for unpredictability in a multicore processor is the cache memory hierarchy. Recently, many research works have proposed different techniques to deal with caches in multicore processors in the context of real-time systems. Nevertheless, a review and categorization of these techniques is still an open topic and would be very useful for the real-time community. In this article, we present a survey of cache management techniques for real-time embedded systems, from the first studies of the field in 1990 up to the latest research published in 2014. We categorize the main research works and provide a detailed comparison in terms of similarities and differences. We also identify key challenges and discuss future research directions.King Saud University NSER

University of Waterloo's Institutional Repository

WCET-Aware Scratchpad Memory Management for Hard Real-Time Systems

Author
Publication venue
Publication date: 01/01/2017
Field of study

abstract: Cyber-physical systems and hard real-time systems have strict timing constraints that specify deadlines until which tasks must finish their execution. Missing a deadline can cause unexpected outcome or endanger human lives in safety-critical applications, such as automotive or aeronautical systems. It is, therefore, of utmost importance to obtain and optimize a safe upper bound of each task’s execution time or the worst-case execution time (WCET), to guarantee the absence of any missed deadline. Unfortunately, conventional microarchitectural components, such as caches and branch predictors, are only optimized for average-case performance and often make WCET analysis complicated and pessimistic. Caches especially have a large impact on the worst-case performance due to expensive off- chip memory accesses involved in cache miss handling. In this regard, software-controlled scratchpad memories (SPMs) have become a promising alternative to caches. An SPM is a raw SRAM, controlled only by executing data movement instructions explicitly at runtime, and such explicit control facilitates static analyses to obtain safe and tight upper bounds of WCETs. SPM management techniques, used in compilers targeting an SPM-based processor, determine how to use a given SPM space by deciding where to insert data movement instructions and what operations to perform at those program locations. This dissertation presents several management techniques for program code and stack data, which aim to optimize the WCETs of a given program. The proposed code management techniques include optimal allocation algorithms and a polynomial-time heuristic for allocating functions to the SPM space, with or without the use of abstraction of SPM regions, and a heuristic for splitting functions into smaller partitions. The proposed stack data management technique, on the other hand, finds an optimal set of program locations to evict and restore stack frames to avoid stack overflows, when the call stack resides in a size-limited SPM. In the evaluation, the WCETs of various benchmarks including real-world automotive applications are statically calculated for SPMs and caches in several different memory configurations.Dissertation/ThesisDoctoral Dissertation Computer Science 201

ASU Digital Repository

Memory Optimizations for Time-Predictable Embedded Software

Author: VIVY SUHENDRA
Publication venue
Publication date: 12/08/2009
Field of study

Ph.DDOCTOR OF PHILOSOPH

ScholarBank@NUS

Cache-Aware Instruction SPM Allocation for Hard Real-Time Systems

Author: Falk Heiko
Kittsteiner Christina
Luppold Arno
Publication venue
Publication date: 01/01/2016
Field of study

To improve the execution time of a program, parts of its instructions can be allocated to a fast Scratchpad Memory (SPM) at compile time. This is a well-known technique which can be used to minimize the program's worst-case Execution Time (WCET). However, modern embedded systems often use cached main memories. An SPM allocation will inevitably lead to changes in the program's memory layout in main memory, resulting in either improved or degraded worst-case caching behavior. We tackle this issue by proposing a cache-aware SPM allocation algorithm based on integer-linear programming which accounts for changes in the worst-case cache miss behavior

DBIS EPub

Traces as a Solution to Pessimism and Modeling Costs in WCET Analysis

Author: Audsley Neil
Whitham Jack
Publication venue: OASIcs - OpenAccess Series in Informatics. 8th International Workshop on Worst-Case Execution Time WCET Analysis (WCET\u2708)
Publication date: 01/01/2008
Field of study

WCET analysis models for superscalar out-of-order CPUs generally need to be pessimistic in order to account for a wide range of possible dynamic behavior. CPU hardware modifications could be used to constrain operations to known execution paths called traces, permitting exploitation of instruction level parallelism with guaranteed timing. Previous implementations of traces have used microcode to constrain operations, but other possibilities exist. A new implementation strategy (virtual traces) is introduced here. In this paper the benefits and costs of traces are discussed. Advantages of traces include a reduction in pessimism in WCET analysis, with the need to accurately model CPU internals removed. Disadvantages of traces include a reduction of peak throughput of the CPU, a need for deterministic memory and a potential increase in the complexity of WCET models

Dagstuhl Research Online Publication Server