Holistic Approach for Fault-Tolerant Network-on-Chip based Many-Core Systems
In this paper, we describe a holistic approach for fault-tolerant
Network-on-Chip (NoC) based many-core systems that incorporates a System Health
Monitoring Unit (SHMU), which collects all fault information from the
system, classifies it, and provides different solutions for different fault
classes. A Mapper/Scheduler Unit (MSU) is used for online generation of
different mapping and scheduling solutions based on the current fault
configuration of the system. For fault detection, we have leveraged
concurrent online checkers that capture faults with low detection latency
and report the fault information to the SHMU, where it can later be used for the
recovery process. The experiments are performed in an open-source
tool capable of mapping, scheduling and simulating the system.
Comment: 2nd International Workshop on Dynamic Resource Allocation and
Management in Embedded, High Performance and Cloud Computing DREAMCloud 2016
(arXiv:cs/1601.04675), DREAMCloud/2016/0