
    DeSyRe: on-Demand System Reliability

    The DeSyRe project builds on-demand adaptive and reliable Systems-on-Chips (SoCs). As fabrication technology scales down, chips are becoming less reliable, which increases the power and performance costs of fault tolerance. To make matters worse, power density is becoming a significant limiting factor in SoC design in general. In the face of such changes in the technological landscape, current solutions for fault tolerance are expected to introduce excessive overheads in future systems. Moreover, attempting to design and manufacture a totally defect- and fault-free system would heavily, even prohibitively, impact the design, manufacturing, and testing costs, as well as system performance and power consumption. In this context, DeSyRe delivers a new generation of systems that are reliable by design at well-balanced power, performance, and design costs. To reduce the overheads of fault tolerance, only a small fraction of the chip is built to be fault-free. This fault-free part is then employed to manage the remaining fault-prone resources of the SoC. The DeSyRe framework is applied to two medical systems with high safety requirements (assessed against the IEC 61508 functional safety standard) and tight power and performance constraints.
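
    A minimal sketch of the core idea, assuming a hypothetical manager and component interface; it illustrates the concept of a small fault-free part managing the fault-prone remainder of the chip, not the DeSyRe framework itself:

    # Illustrative sketch (not DeSyRe's actual framework): a small "fault-free"
    # manager tracks the health of fault-prone substrate resources and remaps
    # work away from components that report faults.

    class SubstrateComponent:
        def __init__(self, name):
            self.name = name
            self.faulty = False          # set when a fault is detected on this resource

    class FaultFreeManager:
        """Hypothetical manager running on the small fault-free part of the SoC."""
        def __init__(self, components):
            self.components = components
            self.task_map = {}           # task -> component

        def assign(self, task):
            # Place the task on any component not currently marked faulty.
            for comp in self.components:
                if not comp.faulty:
                    self.task_map[task] = comp
                    return comp
            raise RuntimeError("no healthy resources left")

        def report_fault(self, comp):
            # On a fault report, retire the component and migrate its tasks.
            comp.faulty = True
            for task, owner in list(self.task_map.items()):
                if owner is comp:
                    self.assign(task)

    # Example: two fault-prone components, one fails, its task is migrated.
    cores = [SubstrateComponent("acc0"), SubstrateComponent("acc1")]
    mgr = FaultFreeManager(cores)
    mgr.assign("filter_kernel")
    mgr.report_fault(cores[0])           # task is remapped to the healthy "acc1"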

    Reliability and Makespan Optimization of Hardware Task Graphs in Partially Reconfigurable Platforms

    This paper addresses the problem of reliability and makespan optimization of hardware task graphs in reconfigurable platforms by applying fault tolerance (FT) techniques to the running tasks based on exploration of the Pareto set of solutions. In the presented solution, in contrast to existing approaches in the literature, task graph scheduling, task parallelism, reconfiguration delay, and FT requirements are taken into account altogether. The paper first presents a model for hardware task graphs, task prefetch and scheduling, the reconfigurable computer, and a fault model for reliability. Then, a mathematical model of an integer nonlinear multi-objective optimization problem is presented for improving the FT of hardware task graphs scheduled in partially reconfigurable platforms. Experimental results show the positive impact of the FT techniques chosen by the proposed solution, referred to as the Pareto-based approach. In comparison to non-fault-tolerant designs or other state-of-the-art FT approaches, about 850% mean time to failure (MTTF) improvement is achieved without increasing makespan and, without degrading reliability, makespan is improved by 25%. In addition, experiments in fault-varying environments demonstrate that the presented approach outperforms existing state-of-the-art adaptive FT techniques in terms of both MTTF and makespan.
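
    A minimal sketch of the Pareto-set idea behind such a multi-objective trade-off, assuming each candidate assignment of FT techniques to the task graph is summarized by an (MTTF, makespan) pair; the candidates and their values below are hypothetical, not results from the paper:

    # Keep only non-dominated candidates: higher MTTF and lower makespan are better.
    def dominates(a, b):
        return (a["mttf"] >= b["mttf"] and a["makespan"] <= b["makespan"]
                and (a["mttf"] > b["mttf"] or a["makespan"] < b["makespan"]))

    def pareto_front(candidates):
        return [c for c in candidates
                if not any(dominates(other, c) for other in candidates if other is not c)]

    # Hypothetical candidates: no FT, TMR on every task, selective FT on critical tasks.
    candidates = [
        {"ft": "none",      "mttf": 1.0, "makespan": 100},
        {"ft": "tmr_all",   "mttf": 9.5, "makespan": 140},
        {"ft": "selective", "mttf": 8.5, "makespan": 100},
    ]
    for c in pareto_front(candidates):
        print(c["ft"], c["mttf"], c["makespan"])   # "none" is dominated by "selective"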

    Proceedings of the 5th International Workshop on Reconfigurable Communication-centric Systems on Chip 2010 - ReCoSoC'10 - May 17-19, 2010, Karlsruhe, Germany. (KIT Scientific Reports; 7551)

    ReCoSoC is intended to be an annual meeting for presenting and discussing expertise and state-of-the-art research on SoC-related topics through plenary invited papers and posters. The workshop aims to provide a prospective view of tomorrow's challenges in the multibillion-transistor era, taking into account the emerging techniques and architectures that explore the synergy between flexible on-chip communication and system reconfigurability.

    LEGaTO: first steps towards energy-efficient toolset for heterogeneous computing

    LEGaTO is a three-year EU H2020 project which started in December 2017. The LEGaTO project will leverage task-based programming models to provide a software ecosystem for Made-in-Europe heterogeneous hardware composed of CPUs, GPUs, FPGAs and dataflow engines. The aim is to attain an order of magnitude energy savings from the edge to the converged cloud/HPC.

    Real-Time Application Processing for FPGA-Based Resilient Embedded Systems in Harsh Environments

    Real-time embedded systems are nowadays employed in harsh environments, such as space and nuclear sites, to carry out critical operations. Alongside traditional software-based (CPU) execution, FPGAs are now emerging as a bright prospect for accomplishing such routines. However, these platforms are often plagued by faults caused by the high radiation in such environments. As a result, the real-time applications running on the platform can also be jeopardized. Thus, efficient execution of a set of hard real-time applications on reconfigurable systems, with anomaly detection and recovery mechanisms, is indispensable. This work tackles this problem with a “healing” approach for extreme environments. Initially, the applications are intelligently partitioned for hardware and software execution; then the hardware applications are scheduled with intermittent preemption points. Upon detecting any abnormality at such points, our approach orchestrates a healing mechanism to remediate the situation without disrupting the predetermined schedule. Experimental validation of the proposed method demonstrates its effectiveness.
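
    A minimal sketch of execution with intermittent preemption points and a healing step, under the assumption of hypothetical segment execution and anomaly checks; it illustrates the flow, not the paper's actual runtime:

    import random

    def run_segment(task, seg):
        # Stand-in for executing one segment of a hardware task on the FPGA.
        return random.random() > 0.2           # True -> no anomaly detected

    def heal_and_retry(task, seg):
        # Stand-in for the healing step, e.g. reconfigure a spare region and re-run.
        print(f"{task}: anomaly at preemption point {seg}, healing and retrying")

    def execute_with_preemption_points(task, num_segments):
        for seg in range(num_segments):
            if not run_segment(task, seg):     # anomaly observed at this point
                heal_and_retry(task, seg)
        print(f"{task}: completed within its scheduled slot")

    execute_with_preemption_points("edge_detect", 4)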

    Safety verification of a fault tolerant reconfigurable autonomous goal-based robotic control system

    Fault tolerance and safety verification of control systems are essential for the success of autonomous robotic systems. A control architecture called Mission Data System (MDS), developed at the Jet Propulsion Laboratory, takes a goal-based control approach. In this paper, a method for converting goal network control programs into linear hybrid systems is developed. The linear hybrid system can then be verified for safety in the presence of failures using existing symbolic model checkers. An example task is simulated in MDS and successfully verified using HyTech, a symbolic model checker for linear hybrid systems.
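
    A minimal sketch of the kind of linear hybrid automaton structure that symbolic model checkers such as HyTech analyze; the locations, rates, and guards are hypothetical and do not reproduce the paper's goal-network conversion:

    from dataclasses import dataclass, field

    @dataclass
    class Location:
        name: str
        rates: dict          # variable -> constant rate (linear dynamics)
        invariant: str       # linear constraint that must hold in this location

    @dataclass
    class Transition:
        src: str
        dst: str
        guard: str           # linear constraint enabling the jump

    @dataclass
    class HybridAutomaton:
        locations: list = field(default_factory=list)
        transitions: list = field(default_factory=list)

    # Hypothetical example: a nominal control goal and a safing goal entered on failure.
    ha = HybridAutomaton(
        locations=[
            Location("nominal", rates={"t": 1}, invariant="t <= 10"),
            Location("safing",  rates={"t": 1}, invariant="t <= 2"),
        ],
        transitions=[Transition("nominal", "safing", guard="fault == 1")],
    )
    print([loc.name for loc in ha.locations])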

    Reconfiguration for Fault Tolerance and Performance Analysis

    Architecture reconfiguration, the ability of a system to alter the active interconnection among modules, has a history of different purposes and strategies. Its purposes range from the relatively simple desire to formalize procedures that all processes have in common, to reconfiguration for improved fault tolerance, to reconfiguration for performance enhancement, either by simply maximizing system use or through sophisticated notions of wedding topology to the specific needs of a given process. Strategies range from straightforward redundancy by means of an identical backup system to intricate structures employing multistage interconnection networks. The present discussion surveys the more important contributions to developments in reconfigurable architecture. The strategy here is, in a sense, to approach the field from a historical perspective, with the goal of developing a more coherent theory of reconfiguration. First, the Turing and von Neumann machines are discussed from the perspective of system reconfiguration, and it is seen that this early, important theoretical work contains little that anticipates reconfiguration. Then some early developments in reconfiguration are analyzed, including the work of Estrin and associates on the fixed plus variable restructurable computer system, the attempt to theorize about configurable computers by Miller and Cocke, and the work of Reddi and Feustel on their restructurable computer system. The discussion then focuses on the most sustained systems for fault tolerance and performance enhancement that have been proposed. An attempt is made to define fault tolerance and to investigate some of the strategies used to achieve it. By investigating four different systems, the Tandem computer, the C.vmp system, the Extra Stage Cube, and the Gamma network, the move from dynamic redundancy to reconfiguration is observed. Then reconfiguration for performance enhancement is discussed. A survey of some proposals is presented, after which the discussion focuses on the most sustained systems that have been proposed: PASM, the DC architecture, the Star local network, and the NYU Ultracomputer. The discussion is organized around a comparison of control, scheduling, communication, and network topology. Finally, comparisons are drawn between fault tolerance and performance enhancement, in order to clarify the notion of reconfiguration and to reveal the common ground of fault tolerance and performance enhancement as well as the areas in which they diverge. The conclusion attempts to derive from this survey and analysis some observations on the nature of reconfiguration, as well as some remarks on areas requiring further research.