14 research outputs found

    Boosting Guaranteed Performance in Wormhole NoCs with Probabilistic Timing Analysis

    Get PDF
    Wormhole-based NoCs (wNoCs) are widely accepted in high-performance domains as the most appropriate solution to interconnect an increasing number of cores in the chip. However, wNoCs suitability in the context of critical real-time applications has not been demonstrated yet. In this paper, in the context of probabilistic timing analysis (PTA), we propose a PTA-compatible wNoC design that provides tight time-composable contention bounds. The proposed wNoC design builds on PTA ability to reason in probabilistic terms about hardware events impacting execution time (e.g. wNoC contention), discarding those sequences of events occurring with a negligible low probability. This allows our wNoC design to deliver improved guaranteed performance. ur results show that WCET estimates of applications running on top of probabilistic wNoCs are reduced by 40% and 75% on average for 4x4 and 6x6 wNoC setups respectively when compared against deterministic wNoCs.This work has also been partially supported by the Spanish Ministry of Science and Innovation under grant TIN2015-65316-P and the HiPEAC Network of Excellence. Mladen Slijepcevic is funded by the Obra Social Fundación la Caixa under grant Doctorado “la Caixa” - Severo Ochoa. Carles Hernández is jointly funded by the Spanish Ministry of Economy and Competitiveness (MINECO) and FEDER funds through grant TIN2014-60404-JIN. Jaume Abella has been partially supported by the MINECO under Ramon y Cajal postdoctoral fellowship number RYC-2013-14717.Peer ReviewedPostprint (author's final draft

    A time-predictable many-core processor design for critical real-time embedded systems

    Get PDF
    Critical Real-Time Embedded Systems (CRTES) are in charge of controlling fundamental parts of embedded system, e.g. energy harvesting solar panels in satellites, steering and breaking in cars, or flight management systems in airplanes. To do so, CRTES require strong evidence of correct functional and timing behavior. The former guarantees that the system operates correctly in response of its inputs; the latter ensures that its operations are performed within a predefined time budget. CRTES aim at increasing the number and complexity of functions. Examples include the incorporation of \smarter" Advanced Driver Assistance System (ADAS) functionality in modern cars or advanced collision avoidance systems in Unmanned Aerial Vehicles (UAVs). All these new features, implemented in software, lead to an exponential growth in both performance requirements and software development complexity. Furthermore, there is a strong need to integrate multiple functions into the same computing platform to reduce the number of processing units, mass and space requirements, etc. Overall, there is a clear need to increase the computing power of current CRTES in order to support new sophisticated and complex functionality, and integrate multiple systems into a single platform. The use of multi- and many-core processor architectures is increasingly seen in the CRTES industry as the solution to cope with the performance demand and cost constraints of future CRTES. Many-cores supply higher performance by exploiting the parallelism of applications while providing a better performance per watt as cores are maintained simpler with respect to complex single-core processors. Moreover, the parallelization capabilities allow scheduling multiple functions into the same processor, maximizing the hardware utilization. However, the use of multi- and many-cores in CRTES also brings a number of challenges related to provide evidence about the correct operation of the system, especially in the timing domain. Hence, despite the advantages of many-cores and the fact that they are nowadays a reality in the embedded domain (e.g. Kalray MPPA, Freescale NXP P4080, TI Keystone II), their use in CRTES still requires finding efficient ways of providing reliable evidence about the correct operation of the system. This thesis investigates the use of many-core processors in CRTES as a means to satisfy performance demands of future complex applications while providing the necessary timing guarantees. To do so, this thesis contributes to advance the state-of-the-art towards the exploitation of parallel capabilities of many-cores in CRTES contributing in two different computing domains. From the hardware domain, this thesis proposes new many-core designs that enable deriving reliable and tight timing guarantees. From the software domain, we present efficient scheduling and timing analysis techniques to exploit the parallelization capabilities of many-core architectures and to derive tight and trustworthy Worst-Case Execution Time (WCET) estimates of CRTES.Los sistemas críticos empotrados de tiempo real (en ingles Critical Real-Time Embedded Systems, CRTES) se encargan de controlar partes fundamentales de los sistemas integrados, e.g. obtención de la energía de los paneles solares en satélites, la dirección y frenado en automóviles, o el control de vuelo en aviones. Para hacerlo, CRTES requieren fuerte evidencias del correcto comportamiento funcional y temporal. El primero garantiza que el sistema funciona correctamente en respuesta de sus entradas; el último asegura que sus operaciones se realizan dentro de unos limites temporales establecidos previamente. El objetivo de los CRTES es aumentar el número y la complejidad de las funciones. Algunos ejemplos incluyen los sistemas inteligentes de asistencia a la conducción en automóviles modernos o los sistemas avanzados de prevención de colisiones en vehiculos aereos no tripulados. Todas estas nuevas características, implementadas en software,conducen a un crecimiento exponencial tanto en los requerimientos de rendimiento como en la complejidad de desarrollo de software. Además, existe una gran necesidad de integrar múltiples funciones en una sóla plataforma para así reducir el número de unidades de procesamiento, cumplir con requisitos de peso y espacio, etc. En general, hay una clara necesidad de aumentar la potencia de cómputo de los actuales CRTES para soportar nueva funcionalidades sofisticadas y complejas e integrar múltiples sistemas en una sola plataforma. El uso de arquitecturas multi- y many-core se ve cada vez más en la industria CRTES como la solución para hacer frente a la demanda de mayor rendimiento y las limitaciones de costes de los futuros CRTES. Las arquitecturas many-core proporcionan un mayor rendimiento explotando el paralelismo de aplicaciones al tiempo que proporciona un mejor rendimiento por vatio ya que los cores se mantienen más simples con respecto a complejos procesadores de un solo core. Además, las capacidades de paralelización permiten programar múltiples funciones en el mismo procesador, maximizando la utilización del hardware. Sin embargo, el uso de multi- y many-core en CRTES también acarrea ciertos desafíos relacionados con la aportación de evidencias sobre el correcto funcionamiento del sistema, especialmente en el ámbito temporal. Por eso, a pesar de las ventajas de los procesadores many-core y del hecho de que éstos son una realidad en los sitemas integrados (por ejemplo Kalray MPPA, Freescale NXP P4080, TI Keystone II), su uso en CRTES aún precisa de la búsqueda de métodos eficientes para proveer evidencias fiables sobre el correcto funcionamiento del sistema. Esta tesis ahonda en el uso de procesadores many-core en CRTES como un medio para satisfacer los requisitos de rendimiento de aplicaciones complejas mientras proveen las garantías de tiempo necesarias. Para ello, esta tesis contribuye en el avance del estado del arte hacia la explotación de many-cores en CRTES en dos ámbitos de la computación. En el ámbito del hardware, esta tesis propone nuevos diseños many-core que posibilitan garantías de tiempo fiables y precisas. En el ámbito del software, la tesis presenta técnicas eficientes para la planificación de tareas y el análisis de tiempo para aprovechar las capacidades de paralelización en arquitecturas many-core, y también para derivar estimaciones de peor tiempo de ejecución (Worst-Case Execution Time, WCET) fiables y precisas

    Design of complex integrated systems based on networks-on-chip: Trading off performance, power and reliability

    Get PDF
    The steady advancement of microelectronics is associated with an escalating number of challenges for design engineers due to both the tiny dimensions and the enormous complexity of integrated systems. Against this background, this work deals with Network-On-Chip (NOC) as the emerging design paradigm to cope with diverse issues of nanotechnology. The detailed investigations within the chapters focus on the communication-centric aspects of multi-core-systems, whereas performance, power consumption as well as reliability are considered likewise as the essential design criteria

    Doctor of Philosophy

    Get PDF
    dissertationCommunication surpasses computation as the power and performance bottleneck in forthcoming exascale processors. Scaling has made transistors cheap, but on-chip wires have grown more expensive, both in terms of latency as well as energy. Therefore, the need for low energy, high performance interconnects is highly pronounced, especially for long distance communication. In this work, we examine two aspects of the global signaling problem. The first part of the thesis focuses on a high bandwidth asynchronous signaling protocol for long distance communication. Asynchrony among intellectual property (IP) cores on a chip has become necessary in a System on Chip (SoC) environment. Traditional asynchronous handshaking protocol suffers from loss of throughput due to the added latency of sending the acknowledge signal back to the sender. We demonstrate a method that supports end-to-end communication across links with arbitrarily large latency, without limiting the bandwidth, so long as line variation can be reliably controlled. We also evaluate the energy and latency improvements as a result of the design choices made available by this protocol. The use of transmission lines as a physical interconnect medium shows promise for deep submicron technologies. In our evaluations, we notice a lower energy footprint, as well as vastly reduced wire latency for transmission line interconnects. We approach this problem from two sides. Using field solvers, we investigate the physical design choices to determine the optimal way to implement these lines for a given back-end-of-line (BEOL) stack. We also approach the problem from a system designer's viewpoint, looking at ways to optimize the lines for different performance targets. This work analyzes the advantages and pitfalls of implementing asynchronous channel protocols for communication over long distances. Finally, the innovations resulting from this work are applied to a network-on-chip design example and the resulting power-performance benefits are reported

    A comprehensive approach to MPSoC security: achieving network-on-chip security : a hierarchical, multi-agent approach

    Get PDF
    Multiprocessor Systems-on-Chip (MPSoCs) are pervading our lives, acquiring ever increasing relevance in a large number of applications, including even safety-critical ones. MPSoCs, are becoming increasingly complex and heterogeneous; the Networks on Chip (NoC paradigm has been introduced to support scalable on-chip communication, and (in some cases) even with reconfigurability support. The increased complexity as well as the networking approach in turn make security aspects more critical. In this work we propose and implement a hierarchical multi-agent approach providing solutions to secure NoC based MPSoCs at different levels of design. We develop a flexible, scalable and modular structure that integrates protection of different elements in the MPSoC (e.g. memory, processors) from different attack scenarios. Rather than focusing on protection strategies specifically devised for an individual attack or a particular core, this work aims at providing a comprehensive, system-level protection strategy: this constitutes its main methodological contribution. We prove feasibility of the concepts via prototype realization in FPGA technology

    Increasing the reliability and applicability of measurement-based probabilistic timing analysis

    Get PDF
    Tese (doutorado) - Universidade Federal de Santa Catarina, Centro Tecnológico, Programa de Pós-Graduação em Engenharia de Automação e Sistemas, Florianópolis, 2019Conforme a complexidade das arquiteturas computacionais aumenta para melhorar desempenho ou reduzir custos, o uso de processadores modernos em Sistemas de Tempo Real (STRs) é prejudicado cada vez mais pelo surgimento de efeitos temporais que dificultam a obtenção de limites confiáveis e precisos para os Worst-Case Execution Times (WCETs) de tarefas. A Análise Temporal Probabilística Baseada em Medições (ATPBM) visa determinar limites probabilísticos de WCET (i.e. pWCETs) aplicando a Teoria de Valores Extremos (TVE) sobre medições de tempos de execução, e é portanto promissora no tratamento da complexidade de hardware no projeto de STRs. Processadores temporalmente aleatorizados foram recentemente propostos para tornar o comportamento temporal de sistemas computacionais mais facilmente analisável através de ferramental probabilístico, e são projetados substituindo informações internas determinísticas ou especulativas por números (pseudo-)aleatórios. A pesquisa cujos resultados são apresentados nesta tese produziu contribuições em duas frentes distintas. Em primeiro lugar, foram propostos e aplicados métodos para avaliar a confiabilidade dos pWCETs produzidos pela ATPBM, baseados na coleta de grandes amostras de tempos de execução e na comparação (1) dos pWCETs com os maiores tempos de execução observados, e (2) das densidades de excedência dos pWCETs com seus valores esperados. Essas avaliações indicaram que modelos probabilísticos da TVE projetados para gerar margens mais precisas podem muitas vezes levar a subestimativas de pWCETs, e recomendou-se então que modelos sobrestimadores devem ser utilizados para obter-se pWCETs mais confiáveis. Em segundo lugar, avaliou-se a hipótese de que técnicas de escalonamento aleatorizado podem beneficiar a análise temporal de tarefas executadas em pipelines multithread através da ATPBM, por levarem os tempos de execução produzidos a atenderem às premissas básicas de aplicabilidade da técnica. Para tal, foram considerados tanto (A) um escalonador puramente aleatório, quanto (B) um escalonador aleatorizado capaz de limitar os efeitos temporais da interferência entre threads, sem comprometer sua analisabilidade pela ATPBM, através de um mecanismo de regulação de elegibilidade baseado em créditos.Abstract: As the complexity of computer architectures grows in order to improve performance and/or to reduce costs, the use of modern processors in the design of Real-Time Systems (RTSs) is increasingly hampered by the emergence of timing effects that challenge determining reliable and tight bounds for tasks' Worst-Case Execution Times (WCETs). The Measurement-Based Probabilistic Timing Analysis (MBPTA) technique aims determining probabilistic WCET bounds (i.e. pWCETs) by applying Extreme Value Theory (EVT) on tasks' execution time measurements, and is hence promising in handling hardware complexity issues within RTSs' design. Hardware-level time-randomized processors were recently proposed as a means to cause computing systems' timing behaviour to become more easily analysable through probabilistic tools, and are designed replacing deterministic or speculative internal information with (pseudo-)random numbers. The scientific research whose outcomes are presented in this thesis produced contributions on two distinct fronts. In first place, we proposed and applied methods for evaluating the reliability of pWCET estimates produced using MBPTA, based on collecting large execution time samples and then comparing (1) the pWCETs against the largest observed execution times, and (2) pWCETs' exceedance densities against their expected values. These evaluations led us to conclude that EVT probabilistic models intended to yield more precise bounds may often lead to pWCET underestimations, and we hence recommended that upper-bounding models should instead be used for deriving pWCETs with increased reliability. In second place, we evaluated the hypothesis that randomized scheduling techniques can benefit the timing analysis of tasks executed on multithread pipelines through MBPTA, by causing the yielded execution times to meet the technique's basic application requirements. For that, we considered both (A) a scheduler that employs a purely random policy, and (B) a randomized scheduler capable of limiting the timing effects of inter-thread interference, without compromising analysability, by using a credit-based eligibility regulation mechanism

    Architectural Support for High-Performance, Power-Efficient and Secure Multiprocessor Systems

    Get PDF
    High performance systems have been widely adopted in many fields and the demand for better performance is constantly increasing. And the need of powerful yet flexible systems is also increasing to meet varying application requirements from diverse domains. Also, power efficiency in high performance computing has been one of the major issues to be resolved. The power density of core components becomes significantly higher, and the fraction of power supply in total management cost is dominant. Providing dependability is also a main concern in large-scale systems since more hardware resources can be abused by attackers. Therefore, designing high-performance, power-efficient and secure systems is crucial to provide adequate performance as well as reliability to users. Adhering to using traditional design methodologies for large-scale computing systems has a limit to meet the demand under restricted resource budgets. Interconnecting a large number of uniprocessor chips to build parallel processing systems is not an efficient solution in terms of performance and power. Chip multiprocessor (CMP) integrates multiple processing cores and caches on a chip and is thought of as a good alternative to previous design trends. In this dissertation, we deal with various design issues of high performance multiprocessor systems based on CMP to achieve both performance and power efficiency while maintaining security. First, we propose a fast and secure off-chip interconnects through minimizing network overheads and providing an efficient security mechanism. Second, we propose architectural support for fast and efficient memory protection in CMP systems, making the best use of the characteristics in CMP environments and multi-threaded workloads. Third, we propose a new router design for network-on-chip (NoC) based on a new memory technique. We introduce hybrid input buffers that use both SRAM and STT-MRAM for better performance as well as power efficiency. Simulation results show that the proposed schemes improve the performance of off-chip networks through reducing the message size by 54% on average. Also, the schemes diminish the overheads of bounds checking operations, thus enhancing the overall performance by 11% on average. Adopting hybrid buffers in NoC routers contributes to increasing the network throughput up to 21%

    Architectural Support for High-Performance, Power-Efficient and Secure Multiprocessor Systems

    Get PDF
    High performance systems have been widely adopted in many fields and the demand for better performance is constantly increasing. And the need of powerful yet flexible systems is also increasing to meet varying application requirements from diverse domains. Also, power efficiency in high performance computing has been one of the major issues to be resolved. The power density of core components becomes significantly higher, and the fraction of power supply in total management cost is dominant. Providing dependability is also a main concern in large-scale systems since more hardware resources can be abused by attackers. Therefore, designing high-performance, power-efficient and secure systems is crucial to provide adequate performance as well as reliability to users. Adhering to using traditional design methodologies for large-scale computing systems has a limit to meet the demand under restricted resource budgets. Interconnecting a large number of uniprocessor chips to build parallel processing systems is not an efficient solution in terms of performance and power. Chip multiprocessor (CMP) integrates multiple processing cores and caches on a chip and is thought of as a good alternative to previous design trends. In this dissertation, we deal with various design issues of high performance multiprocessor systems based on CMP to achieve both performance and power efficiency while maintaining security. First, we propose a fast and secure off-chip interconnects through minimizing network overheads and providing an efficient security mechanism. Second, we propose architectural support for fast and efficient memory protection in CMP systems, making the best use of the characteristics in CMP environments and multi-threaded workloads. Third, we propose a new router design for network-on-chip (NoC) based on a new memory technique. We introduce hybrid input buffers that use both SRAM and STT-MRAM for better performance as well as power efficiency. Simulation results show that the proposed schemes improve the performance of off-chip networks through reducing the message size by 54% on average. Also, the schemes diminish the overheads of bounds checking operations, thus enhancing the overall performance by 11% on average. Adopting hybrid buffers in NoC routers contributes to increasing the network throughput up to 21%

    A Survey of Probabilistic Timing Analysis Techniques for Real-Time Systems

    Get PDF
    This survey covers probabilistic timing analysis techniques for real-time systems. It reviews and critiques the key results in the field from its origins in 2000 to the latest research published up to the end of August 2018. The survey provides a taxonomy of the different methods used, and a classification of existing research. A detailed review is provided covering the main subject areas: static probabilistic timing analysis, measurement-based probabilistic timing analysis, and hybrid methods. In addition, research on supporting mechanisms and techniques, case studies, and evaluations is also reviewed. The survey concludes by identifying open issues, key challenges and possible directions for future research

    Methoden und Beschreibungssprachen zur Modellierung und Verifikation vonSchaltungen und Systemen: MBMV 2015 - Tagungsband, Chemnitz, 03. - 04. März 2015

    Get PDF
    Der Workshop Methoden und Beschreibungssprachen zur Modellierung und Verifikation von Schaltungen und Systemen (MBMV 2015) findet nun schon zum 18. mal statt. Ausrichter sind in diesem Jahr die Professur Schaltkreis- und Systementwurf der Technischen Universität Chemnitz und das Steinbeis-Forschungszentrum Systementwurf und Test. Der Workshop hat es sich zum Ziel gesetzt, neueste Trends, Ergebnisse und aktuelle Probleme auf dem Gebiet der Methoden zur Modellierung und Verifikation sowie der Beschreibungssprachen digitaler, analoger und Mixed-Signal-Schaltungen zu diskutieren. Er soll somit ein Forum zum Ideenaustausch sein. Weiterhin bietet der Workshop eine Plattform für den Austausch zwischen Forschung und Industrie sowie zur Pflege bestehender und zur Knüpfung neuer Kontakte. Jungen Wissenschaftlern erlaubt er, ihre Ideen und Ansätze einem breiten Publikum aus Wissenschaft und Wirtschaft zu präsentieren und im Rahmen der Veranstaltung auch fundiert zu diskutieren. Sein langjähriges Bestehen hat ihn zu einer festen Größe in vielen Veranstaltungskalendern gemacht. Traditionell sind auch die Treffen der ITGFachgruppen an den Workshop angegliedert. In diesem Jahr nutzen zwei im Rahmen der InnoProfile-Transfer-Initiative durch das Bundesministerium für Bildung und Forschung geförderte Projekte den Workshop, um in zwei eigenen Tracks ihre Forschungsergebnisse einem breiten Publikum zu präsentieren. Vertreter der Projekte Generische Plattform für Systemzuverlässigkeit und Verifikation (GPZV) und GINKO - Generische Infrastruktur zur nahtlosen energetischen Kopplung von Elektrofahrzeugen stellen Teile ihrer gegenwärtigen Arbeiten vor. Dies bereichert denWorkshop durch zusätzliche Themenschwerpunkte und bietet eine wertvolle Ergänzung zu den Beiträgen der Autoren. [... aus dem Vorwort
    corecore