3 research outputs found

    Bridging the Gap between Resilient Networks-on-Chip and Real-Time Systems

    Conventional fault-tolerance approaches for Networks-on-Chip (NoCs) cannot be applied to high-dependability systems because their goals and constraints differ. These systems impose strict integrity, resilience and real-time requirements. To meet these requirements, all possible effects of random hardware errors must be taken into account, silent data corruption must be prevented, and the resulting system must be predictable in the presence of errors. In this paper, we present a wormhole-switched NoC with virtual channels, hardened against soft errors, for high-dependability systems. The NoC is developed based on the results of a Failure Mode and Effects Analysis. It efficiently handles errors in different network layers and operates with formal guarantees. Our experimental evaluation, including an industrial avionics use case, shows that the network achieves predictable behavior even in aggressive environments with very high error rates while presenting competitive overheads.

    ENOC : rede-em-chip expansível

    Advisor: Luiz Carlos Pessoa Albini. Co-advisor: Marco Antonio Zanata Alves. Doctoral thesis, Universidade Federal do Paraná, Setor de Ciências Exatas, Programa de Pós-Graduação em Informática. Defense: Curitiba, 10/02/2018. Includes references: p. 71-81.
Abstract: Multiprocessor Systems-on-Chip have emerged as an important trend in System-on-Chip (SoC) design. These systems consist of several processing elements that were originally interconnected by a shared bus. The bus limits the integration of more processing elements on a single chip: it does not allow parallel communication, and because its capacity is fixed, communication performance drops as the number of elements grows. The Network-on-Chip (NoC) is an alternative to the bus that allows parallel and scalable communication among the processing elements on a chip. Traditionally, a NoC consists of metal wires interconnecting routers, with each router connected to a processing element; communication is performed by forwarding packets according to a routing algorithm.
This communication may be extended from metal wired links to wireless links, mainly to mitigate the latency of the many hops needed for processing elements to communicate, especially the most distant ones, since a wireless link delivers the packet in a single hop. However, this technology has additional costs, so much of the research focuses on interconnecting only regions of the chip rather than all elements. Even as on-chip communication evolved, the capacity of a system-on-chip remained limited to the elements included at manufacturing time. This thesis presents ENoC, an Expansible Network-on-Chip capable of interconnecting distinct SoCs and reconfiguring itself to provide a single system view with parallel processing distributed by message passing. The architecture and communication of ENoC are presented together with a discussion of operating system use and memory organization. The evaluation is performed through simulation and performance analysis. The security of inter-chip communication is discussed, and cryptographic systems are evaluated for keeping the information confidential. From the experimental results, we conclude that ENoC is a suitable approach for expanding resources across chips, that each cryptosystem has its own advantages and disadvantages for protecting the wireless communication between ENoCs, and that the choice of cryptosystem is a design decision. Keywords: system-on-chip, network-on-chip, cryptography.

    Fehlertolerante Mehrkernprozessoren für gemischt-kritische Echtzeitsysteme

    Current and future computing systems must be appropriately designed to cope with random hardware faults in order to provide a dependable service and correct functionality. Dependability has many facets to be addressed when designing a system, and it is especially challenging in mixed-critical real-time systems, where safety standards play an important role and where responding in time can be as important as responding correctly or even responding at all. The thesis addresses the dependability of mixed-critical real-time systems, considering three important requirements: integrity, resilience and real-time. More specifically, it looks into the architectural and performance aspects of achieving dependability, concentrating on error detection and handling in hardware, namely in the Network-on-Chip (NoC), the backbone of modern MPSoCs, and on the performance of error handling and recovery in software. The thesis starts by looking at the impact of random hardware faults on the NoC and on the system, with special focus on soft errors. Then, it addresses the uncovered weaknesses in the NoC by proposing a resilient NoC for mixed-critical real-time systems that is able to provide a highly reliable service with transparent protection for the applications. A formal communication time analysis is provided, with common ARQ protocols modeled for NoCs and including a novel ARQ-based protocol optimized for DMAs. After addressing the efficient use of ARQ-based protocols in NoCs, the thesis proposes the Advanced Integrity Q-service (AIQ), a low-overhead mechanism to achieve integrity and real-time guarantees for NoC transactions on an End-to-End (E2E) basis. Inspired by transactions in distributed systems, the mechanism differs from the previous approach in that it does not provide error recovery in hardware but delegates the task to software, making use of existing functionality in cross-layer fault-tolerance solutions. 
Finally, the thesis addresses error handling in software as seen in cross-layer approaches. It addresses the performance of replicated software execution on many-core platforms. Replicated software execution protects the system against random hardware faults. It relies on hardware-supported error detection and on error handling in software. Replica-aware co-scheduling is proposed to achieve high performance with replicated execution, which is not possible with standard real-time schedulers.