1,069 research outputs found

    pTNoC: Probabilistically time-analyzable tree-based NoC for mixed-criticality systems

    Get PDF
    The use of networks-on-chip (NoC) in real-time safety-critical multicore systems challenges deriving tight worst-case execution time (WCET) estimates. This is due to the complexities in tightly upper-bounding the contention in the access to the NoC among running tasks. Probabilistic Timing Analysis (PTA) is a powerful approach to derive WCET estimates on relatively complex processors. However, so far it has only been tested on small multicores comprising an on-chip bus as communication means, which intrinsically does not scale to high core counts. In this paper we propose pTNoC, a new tree-based NoC design compatible with PTA requirements and delivering scalability towards medium/large core counts. pTNoC provides tight WCET estimates by means of asymmetric bandwidth guarantees for mixed-criticality systems with negligible impact on average performance. Finally, our implementation results show the reduced area and power costs of the pTNoC.The research leading to these results has received funding from the European Community’s Seventh Framework Programme [FP7/2007-2013] under the PROXIMA Project (www.proxima-project.eu), grant agreement no 611085. This work has also been partially supported by the Spanish Ministry of Science and Innovation under grant TIN2015-65316-P and the HiPEAC Network of Excellence. Mladen Slijepcevic is funded by the Obra Social Fundación la Caixa under grant Doctorado “la Caixa” - Severo Ochoa. Carles Hern´andez is jointly funded by the Spanish Ministry of Economy and Competitiveness (MINECO) and FEDER funds through grant TIN2014-60404-JIN. Jaume Abella has been partially supported by the MINECO under Ramon y Cajal postdoctoral fellowship number RYC-2013-14717.Peer ReviewedPostprint (author's final draft

    Programming MPSoC platforms: Road works ahead

    Get PDF
    This paper summarizes a special session on multicore/multi-processor system-on-chip (MPSoC) programming challenges. The current trend towards MPSoC platforms in most computing domains does not only mean a radical change in computer architecture. Even more important from a SW developer´s viewpoint, at the same time the classical sequential von Neumann programming model needs to be overcome. Efficient utilization of the MPSoC HW resources demands for radically new models and corresponding SW development tools, capable of exploiting the available parallelism and guaranteeing bug-free parallel SW. While several standards are established in the high-performance computing domain (e.g. OpenMP), it is clear that more innovations are required for successful\ud deployment of heterogeneous embedded MPSoC. On the other hand, at least for coming years, the freedom for disruptive programming technologies is limited by the huge amount of certified sequential code that demands for a more pragmatic, gradual tool and code replacement strategy

    Architecture-Aware Configuration and Scheduling of Matrix Multiplication on Asymmetric Multicore Processors

    Get PDF
    Asymmetric multicore processors (AMPs) have recently emerged as an appealing technology for severely energy-constrained environments, especially in mobile appliances where heterogeneity in applications is mainstream. In addition, given the growing interest for low-power high performance computing, this type of architectures is also being investigated as a means to improve the throughput-per-Watt of complex scientific applications. In this paper, we design and embed several architecture-aware optimizations into a multi-threaded general matrix multiplication (gemm), a key operation of the BLAS, in order to obtain a high performance implementation for ARM big.LITTLE AMPs. Our solution is based on the reference implementation of gemm in the BLIS library, and integrates a cache-aware configuration as well as asymmetric--static and dynamic scheduling strategies that carefully tune and distribute the operation's micro-kernels among the big and LITTLE cores of the target processor. The experimental results on a Samsung Exynos 5422, a system-on-chip with ARM Cortex-A15 and Cortex-A7 clusters that implements the big.LITTLE model, expose that our cache-aware versions of gemm with asymmetric scheduling attain important gains in performance with respect to its architecture-oblivious counterparts while exploiting all the resources of the AMP to deliver considerable energy efficiency

    Core Count vs Cache Size for Manycore Architectures in the Cloud

    Get PDF
    The number of cores which fit on a single chip is growing at an exponential rate while off-chip main memory bandwidth is growing at a linear rate at best. This core count to off-chip bandwidth disparity causes per-core memory bandwidth to decrease as process technology advances. Continuing per-core off-chip bandwidth reduction will cause multicore and manycore chip architects to rethink the optimal grain size of a core and the on-chip cache configuration in order to save main memory bandwidth. This work introduces an analytic model to study the tradeoffs of utilizing increased chip area for larger caches versus more cores. We focus this study on constructing manycore architectures well suited for the emerging application space of cloud computing where many independent applications are consolidated onto a single chip. This cloud computing application mix favors small, power-efficient cores. The model is exhaustively evaluated across a large range of cache and core-count configurations utilizing SPEC Int 2000 miss rates and CACTI timing and area models to determine the optimal cache configurations and the number of cores across four process nodes. The model maximizes aggregate computational throughput and is applied to SRAM and logic process DRAM caches. As an example, our study demonstrates that the optimal manycore configuration in the 32nm node for a 200 mm^2 die uses on the order of 158 cores, with each core containing a 64KB L1I cache, a 16KB L1D cache, and a 1MB L2 embedded-DRAM cache. This study finds that the optimal cache size will continue to grow as process technology advances, but the tradeoff between more cores and larger caches is a complex tradeoff in the face of limited off-chip bandwidth and the non-linearities of cache miss rates and memory controller queuing delay

    Fuzzy logic based energy and throughput aware design space exploration for MPSoCs

    Get PDF
    Multicore architectures were introduced to mitigate the issue of increase in power dissipation with clock frequency. Introduction of deeper pipelines, speculative threading etc. for single core systems were not able to bring much increase in performance as compared to their associated power overhead. However for multicore architectures performance scaling with number of cores has always been a challenge. The Amdahl's law shows that the theoretical maximum speedup of a multicore architecture is not even close to the multiple of number of cores. With less amount of code in parallel having more number of cores for an application might just contribute in greater power dissipation instead of bringing some performance advantage. Therefore there is a need of an adaptive multicore architecture that can be tailored for the application in use for higher energy efficiency. In this paper a fuzzy logic based design space exploration technique is presented that is targeted to optimize a multicore architecture according to the workload requirements in order to achieve optimum balance between throughput and energy of the system

    Development and certification of mixed-criticality embedded systems based on probabilistic timing analysis

    Get PDF
    An increasing variety of emerging systems relentlessly replaces or augments the functionality of mechanical subsystems with embedded electronics. For quantity, complexity, and use, the safety of such subsystems is an increasingly important matter. Accordingly, those systems are subject to safety certification to demonstrate system's safety by rigorous development processes and hardware/software constraints. The massive augment in embedded processors' complexity renders the arduous certification task significantly harder to achieve. The focus of this thesis is to address the certification challenges in multicore architectures: despite their potential to integrate several applications on a single platform, their inherent complexity imperils their timing predictability and certification. Recently, the Measurement-Based Probabilistic Timing Analysis (MBPTA) technique emerged as an alternative to deal with hardware/software complexity. The innovation that MBPTA brings about is, however, a major step from current certification procedures and standards. The particular contributions of this Thesis include: (i) the definition of certification arguments for mixed-criticality integration upon multicore processors. In particular we propose a set of safety mechanisms and procedures as required to comply with functional safety standards. For timing predictability, (ii) we present a quantitative approach to assess the likelihood of execution-time exceedance events with respect to the risk reduction requirements on safety standards. To this end, we build upon the MBPTA approach and we present the design of a safety-related source of randomization (SoR), that plays a key role in the platform-level randomization needed by MBPTA. And (iii) we evaluate current certification guidance with respect to emerging high performance design trends like caches. Overall, this Thesis pushes the certification limits in the use of multicore and MBPTA technology in Critical Real-Time Embedded Systems (CRTES) and paves the way towards their adoption in industry.Una creciente variedad de sistemas emergentes reemplazan o aumentan la funcionalidad de subsistemas mecánicos con componentes electrónicos embebidos. El aumento en la cantidad y complejidad de dichos subsistemas electrónicos así como su cometido, hacen de su seguridad una cuestión de creciente importancia. Tanto es así que la comercialización de estos sistemas críticos está sujeta a rigurosos procesos de certificación donde se garantiza la seguridad del sistema mediante estrictas restricciones en el proceso de desarrollo y diseño de su hardware y software. Esta tesis trata de abordar los nuevos retos y dificultades dadas por la introducción de procesadores multi-núcleo en dichos sistemas críticos: aunque su mayor rendimiento despierta el interés de la industria para integrar múltiples aplicaciones en una sola plataforma, suponen una mayor complejidad. Su arquitectura desafía su análisis temporal mediante los métodos tradicionales y, asimismo, su certificación es cada vez más compleja y costosa. Con el fin de lidiar con estas limitaciones, recientemente se ha desarrollado una novedosa técnica de análisis temporal probabilístico basado en medidas (MBPTA). La innovación de esta técnica, sin embargo, supone un gran cambio cultural respecto a los estándares y procedimientos tradicionales de certificación. En esta línea, las contribuciones de esta tesis están agrupadas en tres ejes principales: (i) definición de argumentos de seguridad para la certificación de aplicaciones de criticidad-mixta sobre plataformas multi-núcleo. Se definen, en particular, mecanismos de seguridad, técnicas de diagnóstico y reacción de faltas acorde con el estándar IEC 61508 sobre una arquitectura multi-núcleo de referencia. Respecto al análisis temporal, (ii) presentamos la cuantificación de la probabilidad de exceder un límite temporal y su relación con los requisitos de reducción de riesgos derivados de los estándares de seguridad funcional. Con este fin, nos basamos en la técnica MBPTA y presentamos el diseño de una fuente de números aleatorios segura; un componente clave para conseguir las propiedades aleatorias requeridas por MBPTA a nivel de plataforma. Por último, (iii) extrapolamos las guías actuales para la certificación de arquitecturas multi-núcleo a una solución comercial de 8 núcleos y las evaluamos con respecto a las tendencias emergentes de diseño de alto rendimiento (caches). Con estas contribuciones, esta tesis trata de abordar los retos que el uso de procesadores multi-núcleo y MBPTA implican en el proceso de certificación de sistemas críticos de tiempo real y facilita, de esta forma, su adopción por la industria.Postprint (published version

    A Novel Method for Online Detection of Faults Affecting Execution-Time in Multicore-Based Systems

    Get PDF
    This article proposes a bounded interference method, based on statistical evaluations, for online detection and tolerance of any fault capable of causing a deadline miss. The proposed method requires data that can be gathered during the profiling and worst-case execution time (WCET) analysis phase. This article describes the method, its application, and then it presents an avionic mixed-criticality use case for experimental evaluation, considering both dual-core and quad-core platforms. Results show that faults that can cause a timing violation are correctly identified while other faults that do not introduce a significant temporal interference can be tolerated to avoid high recovery overheads

    A survey of techniques for reducing interference in real-time applications on multicore platforms

    Get PDF
    This survey reviews the scientific literature on techniques for reducing interference in real-time multicore systems, focusing on the approaches proposed between 2015 and 2020. It also presents proposals that use interference reduction techniques without considering the predictability issue. The survey highlights interference sources and categorizes proposals from the perspective of the shared resource. It covers techniques for reducing contentions in main memory, cache memory, a memory bus, and the integration of interference effects into schedulability analysis. Every section contains an overview of each proposal and an assessment of its advantages and disadvantages.This work was supported in part by the Comunidad de Madrid Government "Nuevas Técnicas de Desarrollo de Software de Tiempo Real Embarcado Para Plataformas. MPSoC de Próxima Generación" under Grant IND2019/TIC-17261
    corecore