707 research outputs found

    Contention in multicore hardware shared resources: Understanding of the state of the art

    Get PDF
    The real-time systems community has over the years devoted considerable attention to the impact on execution timing that arises from contention on access to hardware shared resources. The relevance of this problem has been accentuated with the arrival of multicore processors. From the state of the art on the subject, there appears to be considerable diversity in the understanding of the problem and in the “approach” to solve it. This sparseness makes it difficult for any reader to form a coherent picture of the problem and solution space. This paper draws a tentative taxonomy in which each known approach to the problem can be categorised based on its specific goals and assumptions.Postprint (published version

    Software-enforced Interconnect Arbitration for COTS Multicores

    Get PDF
    The advent of multicore processors complicates timing analysis owing to the need to account for the interference between cores accessing shared resources, which is not always easy to characterize in a safe and tight way. Solutions have been proposed that take two distinct but complementary directions: on the one hand, complex analysis techniques have been developed to provide safe and tight bounds to contention; on the other hand, sophisticated arbitration policies (hardware or software) have been proposed to limit or control inter-core interference. In this paper we propose a software-based TDMA-like arbitration of accesses to a shared interconnect (e.g. a bus) that prevents inter-core interference. A more flexible arbitration scheme is also proposed to reserve more bandwidth to selected cores while still avoiding contention. A proof-of-concept implementation on an AURIX TC277TU processor shows that our approach can apply to COTS processors, thus not relying on dedicated hardware arbiters, while introducing little overhead

    The Road towards Predictable Automotive High-Performance Platforms

    Get PDF
    Due to the trends of centralizing the E/E architecture and new computing-intensive applications, high-performance hardware platforms are currently finding their way into automotive systems. However, the SoCs currently available on the market have significant weaknesses when it comes to providing predictable performance for time-critical applications. The main reason for this is that these platforms are optimized for averagecase performance. This shortcoming represents one major risk in the development of current and future automotive systems. In this paper we describe how high-performance and predictability could (and should) be reconciled in future HW/SW platforms. We believe that this goal can only be reached in a close collaboration between system suppliers, IP providers, semiconductor companies, and OS/hypervisor vendors. Furthermore, academic input will be needed to solve remaining challenges and to further improve initial solutions

    Enforcing Predictability of Many-cores with DCFNoC

    Get PDF
    © 2021 IEEE. Personal use of this material is permitted. Permissíon from IEEE must be obtained for all other uses, in any current or future media, including reprinting/republishing this material for advertisíng or promotional purposes, creating new collective works, for resale or redistribution to servers or lists, or reuse of any copyrighted component of this work in other works.[EN] The ever need for higher performance forces industry to include technology based on multi-processors system on chip (MPSoCs) in their safety-critical embedded systems. MPSoCs include a network-on-chip (NoC) to interconnect the cores between them and with memory and the rest of shared resources. Unfortunately, the inclusion of NoCs compromises guaranteeing time predictability as network-level conflicts may occur. To overcome this problem, in this paper we propose DCFNoC, a new time-predictable NoC design paradigm where conflicts within the network are eliminated by design. This new paradigm builds on top of the Channel Dependency Graph (CDG) in order to deterministically avoid network conflicts. The network guarantees predictability to applications and is able to naturally inject messages using a TDM period equal to the optimal theoretical bound without the need of using a computationally demanding offline process. DCFNoC is integrated in a tile-based many-core system and adapted to its memory hierarchy. Our results show that DCFNoC guarantees time predictability avoiding network interference among multiple running applications. DCFNoC always guarantees performance and also improves wormhole performance in a 4 × 4 setting by a factor of 3.7× when interference traffic is injected. For a 8 × 8 network differences are even larger. In addition, DCFNoC obtains a total area saving of 10.79% over a standard wormhole implementation.This work has been supported by MINECO under Grant BES-2016-076885, by MINECO and funds from the European ERDF under Grant TIN2015-66972-C05-1-R and Grant RTI2018-098156-B-C51, and by the EC H2020 RECIPE project under Grant 801137.Picornell-Sanjuan, T.; Flich Cardo, J.; Hernández Luz, C.; Duato Marín, JF. (2021). Enforcing Predictability of Many-cores with DCFNoC. IEEE Transactions on Computers. 70(2):270-283. https://doi.org/10.1109/TC.2020.2987797S27028370

    Conflict-Free Networks on Chip for Real Time Systems

    Full text link
    [ES] La constante necesidad de un mayor rendimiento para cumplir con la gran demanda de potencia de cómputo de las nuevas aplicaciones, (ej. sistemas de conducción autónoma), obliga a la industria a apostar por la tecnología basada en Sistemas en Chip con Procesadores Multinúcleo (MPSoCs) en sus sistemas embebidos de seguridad-crítica. Los sistemas MPSoCs generalmente incluyen una red en el chip (NoC) para interconectar los núcleos de procesamiento entre ellos, con la memoria y con el resto de recursos compartidos. Desafortunadamente, el uso de las NoCs dificulta alcanzar la predecibilidad en el tiempo, ya que pueden aparecer conflictos en muchos puntos y de forma distribuida a nivel de red. Para afrontar este problema, en esta tesis se propone un nuevo paradigma de diseño para NoCs de tiempo real donde los conflictos en la red son eliminados por diseño. Este nuevo paradigma parte del Grafo de Dependencia de Canales (CDG) para evitar los conflictos de red de forma determinista. Nuestra solución es capaz de inyectar mensajes de forma natural usando un periodo TDM igual al límite teórico óptimo sin la necesidad de usar un proceso offline exigente computacionalmente. La red se ha integrado en un sistema multinúcleo basado en tiles y adaptado a su jerarquía de memoria. Como segunda contribución principal, proponemos un nuevo planificador dinámico y distribuido capaz de alcanzar un rendimiento pico muy cercanos a las NoC basadas en un diseño wormhole sin comprometer sus garantías de tiempo real. El planificador se basa en nuestro diseño de red para explotar sus propiedades clave. Los resultados de nuestra NoC muestran que nuestro diseño garantiza la predecibilidad en el tiempo evitando interferencias en la red entre múltiples aplicaciones ejecutándose concurrentemente. La red siempre garantiza el rendimiento y también mejora el rendimiento respecto al de las redes wormhole en una red 4 x 4 en un factor de 3,7x cuando se inyecta trafico para generar interferencias. En una red 8 x 8 las diferencias son incluso mayores. Además, la red obtiene un ahorro de área total del 10,79% frente a una implementación básica de una red wormhole. El planificador propuesto alcanza una mejora de rendimiento de 6,9x y 14,4x frente la versión básica de la red DCFNoC para redes en forma de malla de 16 y 64 nodos, respectivamente. Cuando lo comparamos frente a un conmutador estándar wormhole se preserva un rendimiento de red del 95% al mismo tiempo que preserva la estricta predecibilidad en el tiempo. Este logro abre la puerta a nuevos diseños de NoCs de alto rendimiento con predecibilidad en el tiempo. Como contribución final, construimos una taxonomía de NoCs basadas en TDM con propiedades de tiempo real. Con esta taxonomía realizamos un análisis exhaustivo para estudiar y comparar desde tiempos de respuesta, a implementaciones con bajo coste, pasando por soluciones de compromiso para diseños de NoCs de tiempo real. Como resultado, obtenemos nuevos diseños de NoCs basadas en TDM.[CA] La constant necessitat d'un major rendiment per a complir amb la gran demanda de potència de còmput de les noves aplicacions, (ex. sistemes de conducció autònoma), obliga la indústria a apostar per la tecnologia basada en Sistemes en Xip amb Processadors Multinucli (MPSoCs) en els seus sistemes embeguts de seguretat-crítica. Els sistemes MPSoCs generalment inclouen una xarxa en el xip (NoC) per a interconnectar els nuclis de processament entre ells, amb la memòria i amb la resta de recursos compartits. Desafortunadament, l'ús de les NoCs dificulta aconseguir la predictibilitat en el temps, ja que poden aparéixer conflictes en molts punts i de forma distribuïda a nivell de xarxa. Per a afrontar aquest problema, en aquesta tesi es proposa un nou paradigma de disseny per a NoCs de temps real on els conflictes en la xarxa són eliminats per disseny. Aquest nou paradigma parteix del Graf de Dependència de Canals (CDG) per a evitar els conflictes de xarxa de manera determinista. La nostra solució és capaç d'injectar missatges de mra natural fent ús d'un període TDM igual al límit teòric òptim sense la necessitat de fer ús d'un procés offline exigent computacionalment. La xarxa s'ha integrat en un sistema multinucli basat en tiles i adaptat a la seua jerarquia de memòria. Com a segona contribució principal, proposem un nou planificador dinàmic i distribuït capaç d'aconseguir un rendiment pic molt pròxims a les NoC basades en un disseny wormhole sense comprometre les seues garanties de temps real. El planificador es basa en el nostre disseny de xarxa per a explotar les seues propietats clau. Els resultats de la nostra NoC mostren que el nostre disseny garanteix la predictibilitat en el temps evitant interferències en la xarxa entre múltiples aplicacions executant-se concurrentment. La xarxa sempre garanteix el rendiment i també millora el rendiment respecte al de les xarxes wormhole en una xarxa 4 x 4 en un factor de 3,7x quan s'injecta trafic per a generar interferències. En una xarxa 8 x 8 les diferències són fins i tot majors. A més, la xarxa obté un estalvi d'àrea total del 10,79% front una implementació bàsica d'una xarxa wormhole. El planificador proposat aconsegueix una millora de rendiment de 6,9x i 14,4x front la versió bàsica de la xarxa DCFNoC per a xarxes en forma de malla de 16 i 64 nodes, respectivament. Quan ho comparem amb un commutador estàndard wormhole es preserva un rendiment de xarxa del 95% al mateix temps que preserva la estricta predictibilitat en el temps. Aquest assoliment obri la porta a nous dissenys de NoCs d'alt rendiment amb predictibilitat en el temps. Com a contribució final, construïm una taxonomia de NoCs basades en TDM amb propietats de temps real. Amb aquesta taxonomia realitzem una anàlisi exhaustiu per a estudiar i comparar des de temps de resposta, a implementacions amb baix cost, passant per solucions de compromís per a dissenys de NoCs de temps real. Com a resultat, obtenim nous dissenys de NoCs basades en TDM.[EN] The ever need for higher performance to cope with the high computational power demands of new applications (e.g autonomous driving systems), forces industry to support technology based on multi-processors system on chip (MPSoCs) in their safety-critical embedded systems. MPSoCs usually include a network-on-chip (NoC) to interconnect the cores between them and, with memory and the rest of shared resources. Unfortunately, the inclusion of NoCs difficults achieving time predictability as network-level conflicts may occur in many points in a distributed manner. To overcome this problem, this thesis proposes a new time-predictable NoC design paradigm where conflicts within the network are eliminated by design. This new paradigm builds on top of the Channel Dependency Graph (CDG) in order to deterministically avoid network conflicts. Our solution is able to naturally inject messages using a TDM period equal to the optimal theoretical bound without the need of using a computationally demanding offline process. The network is integrated in a tile-based manycore system and adapted to its memory hierarchy. As a second main contribution, we propose a novel distributed dynamic scheduler that is able to achieve peak performance close to a wormhole-based NoC design without compromising its real-time guarantees. The scheduler builds on top of our NoC design to exploit its key properties. The results of our NoC show that our design guarantees time predictability avoiding network interference among multiple running applications. The network always guarantees performance and also improves wormhole performance in a 4 x 4 setting by a factor of 3.7x when interference traffic is injected. For a 8 x 8 network differences are even larger. In addition, the network obtains a total area saving of 10.79% over a standard wormhole implementation. The proposed scheduler achieves an overall throughput improvement of 6.9x and 14.4x over a baseline conflict-free NoC for 16 and 64-node meshes, respectively. When compared against a standard wormhole router 95% of its network throughput is preserved while strict timing predictability is kept. This achievement opens the door to new high performance time predictable NoC designs. As a final contribution, we build a taxonomy of TDM-based NoCs with real-time properties. With this taxonomy we perform a comprehensive analysis to study and compare from response time specific, to low resource implementation cost, through trade-off solutions for real-time NoCs designs. As a result, we derive new TDM-based NoC designs.Picornell Sanjuan, T. (2021). Conflict-Free Networks on Chip for Real Time Systems [Tesis doctoral]. Universitat Politècnica de València. https://doi.org/10.4995/Thesis/10251/177347TESI

    Erreichen von Performance in Netzwerken-On-Chip für Echtzeitsysteme

    Get PDF
    In many new applications, such as in automatic driving, high performance requirements have reached safety critical real-time systems. Consequently, Networks-on-Chip (NoCs) must efficiently host new sets of highly dynamic workloads e.g., high resolution sensor fusion and data processing, autonomous decision’s making combined with machine learning. The static platform management, as used in current safety critical systems, is no more sufficient to provide the needed level of service. A dynamic platform management could meet the challenge, but it usually suffers from a lack of predictability and the simplicity necessary for certification of safety and real-time properties. In this work, we propose a novel, global and dynamic arbitration for NoCs with real-time QoS requirements. The mechanism decouples the admission control from arbitration in routers thereby simplifying a dynamic adaptation and real-time analysis. Consequently, the proposed solution allows the deployment of a sophisticated contract-based QoS provisioning without introducing complicated and hard to maintain schemes, known from the frequently applied static arbiters. The presented work introduces an overlay network to synchronize transmissions using arbitration units called Resource Managers (RMs), which allows global and work-conserving scheduling. The description of resource allocation strategies is supplemented by protocol design and verification methodology bringing adaptive control to NoC communication in setups with different QoS requirements and traffic classes. For doing that, a formal worst-case timing analysis for the mechanism has been proposed which demonstrates that this solution not only exposes higher performance in simulation but, even more importantly, consistently reaches smaller formally guaranteed worst-case latencies than other strategies for realistic levels of system's utilization. The approach is not limited to a specific network architecture or topology as the mechanism does not require modifications of routers and therefore can be used together with the majority of existing manycore systems. Indeed, the evaluation followed using the generic performance optimized router designs, as well as two systems-on-chip focused on real-time deployments. The results confirmed that the proposed approach proves to exhibit significantly higher average performance in simulation and execution.In vielen neuen sicherheitskritische Anwendungen, wie z.B. dem automatisierten Fahren, werden große Anforderungen an die Leistung von Echtzeitsysteme gestellt. Daher müssen Networks-on-Chip (NoCs) neue, hochdynamische Workloads wie z.B. hochauflösende Sensorfusion und Datenverarbeitung oder autonome Entscheidungsfindung kombiniert mit maschineller Lernen, effizient auf einem System unterbringen. Die Steuerung der zugrunde liegenden NoC-Architektur, muss die Systemsicherheit vor Fehlern, resultierend aus dem dynamischen Verhalten des Systems schützen und gleichzeitig die geforderte Performance bereitstellen. In dieser Arbeit schlagen wir eine neuartige, globale und dynamische Steuerung für NoCs mit Echtzeit QoS Anforderungen vor. Das Schema entkoppelt die Zutrittskontrolle von der Arbitrierung in Routern. Hierdurch wird eine dynamische Anpassung ermöglicht und die Echtzeitanalyse vereinfacht. Der Einsatz einer ausgefeilten vertragsbasierten Ressourcen-Zuweisung wird so ermöglicht, ohne komplexe und schwer wartbare Mechanismen, welche bereits aus dem statischen Plattformmanagement bekannt sind einzuführen. Diese Arbeit stellt ein übergelagertes Netzwerk vor, welches Übertragungen mit Hilfe von Arbitrierungseinheiten, den so genannten Resource Managern (RMs), synchronisiert. Dieses überlagerte Netzwerk ermöglicht eine globale und lasterhaltende Steuerung. Die Beschreibung verschiedener Ressourcenzuweisungstrategien wird ergänzt durch ein Protokolldesign und Methoden zur Verifikation der adaptiven NoC Steuerung mit unterschiedlichen QoS Anforderungen und Verkehrsklassen. Hierfür wird eine formale Worst Case Timing Analyse präsentiert, welche das vorgestellte Verfahren abbildet. Die Resultate bestätitgen, dass die präsentierte Lösung nicht nur eine höhere Performance in der Simulation bietet, sondern auch formal kleinere Worst-Case Latenzen für realistische Systemauslastungen als andere Strategien garantiert. Der vorgestellte Ansatz ist nicht auf eine bestimmte Netzwerkarchitektur oder Topologie beschränkt, da der Mechanismus keine Änderungen an den unterliegenden Routern erfordert und kann daher zusammen mit bestehenden Manycore-Systemen eingesetzt werden. Die Evaluierung erfolgte auf Basis eines leistungsoptimierten Router-Designs sowie zwei auf Echtzeit-Anwendungen fokusierten Platformen. Die Ergebnisse bestätigten, dass der vorgeschlagene Ansatz im Durchschnitt eine deutlich höhere Leistung in der Simulation und Ausführung liefert

    Side-Channel Protected MPSoC through Secure Real-Time Networks-on-Chip

    Get PDF
    The integration of Multi-Processors System-on-Chip (MPSoCs) into the Internet -of -Things (IoT) context brings new opportunities, but also represent risks. Tight real-time constraints and security requirements should be considered simultaneously when designing MPSoCs. Network-on-Chip (NoCs) are specially critical when meeting these two conflicting characteristics. For instance the NoC design has a huge influence in the security of the system. A vital threat to system security are so-called side-channel attacks based on the NoC communication observations. To this end, we propose a NoC security mechanism suitable for hard real-time systems, in which schedulability is a vital design requirement. We present three contributions. First, we show the impact of the NoC routing in the security of the system. Second, we propose a packet route randomisation mechanism to increase NoC resilience against side-channel attacks. Third, using an evolutionary optimisation approach, we effectively apply route randomisation while controlling its impact on hard real-time performance guarantees. Extensive experimental evidence based on analytical and simulation models supports our findings

    An extensible framework for multicore response time analysis

    Get PDF
    In this paper, we introduce a multicore response time analysis (MRTA) framework, which decouples response time analysis from a reliance on context independent WCET values. Instead, the analysis formulates response times directly from the demands placed on different hardware resources. The MRTA framework is extensible to different multicore architectures, with a variety of arbitration policies for the common interconnects, and different types and arrangements of local memory. We instantiate the framework for single level local data and instruction memories (cache or scratchpads), for a variety of memory bus arbitration policies, including: Round-Robin, FIFO, Fixed-Priority, Processor-Priority, and TDMA, and account for DRAM refreshes. The MRTA framework provides a general approach to timing verification for multicore systems that is parametric in the hardware configuration and so can be used at the architectural design stage to compare the guaranteed levels of real-time performance that can be obtained with different hardware configurations. We use the framework in this way to evaluate the performance of multicore systems with a variety of different architectural components and policies. These results are then used to compose a predictable architecture, which is compared against a reference architecture designed for good average-case behaviour. This comparison shows that the predictable architecture has substantially better guaranteed real-time performance, with the precision of the analysis verified using cycle-accurate simulation

    A reconfigurable real-time SDRAM controller for mixed time-criticality systems

    Get PDF
    Verifying real-time requirements of applications is increasingly complex on modern Systems-on-Chips (SoCs). More applications are integrated into one system due to power, area and cost constraints. Resource sharing makes their timing behavior interdependent, and as a result the verification complexity increases exponentially with the number of applications. Predictable and composable virtual platforms solve this problem by enabling verification in isolation, but designing SoC resources suitable to host such platforms is challenging. This paper focuses on a reconfigurable SDRAM controller for predictable and composable virtual platforms. The main contributions are: 1) A run-time reconfigurable SDRAM controller architecture, which allows trade-offs between guaranteed bandwidth, response time and power. 2) A methodology for offering composable service to memory clients, by means of composable memory patterns. 3) A reconfigurable Time-Division Multiplexing (TDM) arbiter and an associated reconfiguration protocol. The TDM slot allocations can be changed at run time, while the predictable and composable performance guarantees offered to active memory clients are unaffected by the reconfiguration. The SDRAM controller has been implemented as a TLM-level SystemC model, and in synthesizable VHDL for use on an FPGA
    corecore