2,829 research outputs found

    DeSyRe: on-Demand System Reliability

    The DeSyRe project builds on-demand adaptive and reliable Systems-on-Chip (SoCs). As fabrication technology scales down, chips are becoming less reliable, thereby incurring increased power and performance costs for fault tolerance. To make matters worse, power density is becoming a significant limiting factor in SoC design in general. In the face of such changes in the technological landscape, current solutions for fault tolerance are expected to introduce excessive overheads in future systems. Moreover, attempting to design and manufacture a totally defect-free and fault-free system would heavily, even prohibitively, impact the design, manufacturing, and testing costs, as well as system performance and power consumption. In this context, DeSyRe delivers a new generation of systems that are reliable by design at well-balanced power, performance, and design costs. To reduce the overheads of fault tolerance, only a small fraction of the chip is built to be fault-free. This fault-free part is then employed to manage the remaining fault-prone resources of the SoC. The DeSyRe framework is applied to two medical systems with high safety requirements (measured using the IEC 61508 functional safety standard) and tight power and performance constraints.

    The Chameleon Architecture for Streaming DSP Applications

    We focus on architectures for streaming DSP applications such as wireless baseband processing and image processing. We aim at a single generic architecture that is capable of dealing with different DSP applications. This architecture has to be energy efficient and fault tolerant. We introduce a heterogeneous tiled architecture and present the details of a domain-specific reconfigurable tile processor called Montium. This reconfigurable processor has a small footprint (1.8 mm² in a 130 nm process), is power efficient, and exploits the locality of reference principle. Reconfiguring the device is very fast; for example, loading the coefficients for a 200-tap FIR filter is done within 80 clock cycles. The tiles on the tiled architecture are connected to a Network-on-Chip (NoC) via a network interface (NI). Two NoCs have been developed: a packet-switched and a circuit-switched version. Both provide two types of services: guaranteed throughput (GT) and best effort (BE). For both NoCs, estimates of power consumption are presented. The NI synchronizes data transfers, and configures, starts, and stops the tile processor. For dynamically mapping applications onto the tiled architecture, we introduce a run-time mapping tool.

    Deep Space Network information system architecture study

    The purpose of this article is to describe an architecture for the Deep Space Network (DSN) information system in the years 2000-2010 and to provide guidelines for its evolution during the 1990s. The study scope is defined to be from the front-end areas at the antennas to the end users (spacecraft teams, principal investigators, archival storage systems, and non-NASA partners). The architectural vision provides guidance for major DSN implementation efforts during the next decade. A strong motivation for the study is an expected dramatic improvement in information-systems technologies, such as the following: computer processing, automation technology (including knowledge-based systems), networking and data transport, software and hardware engineering, and human-interface technology. The proposed Ground Information System has the following major features: unified architecture from the front-end area to the end user; open-systems standards to achieve interoperability; DSN production of level 0 data; delivery of level 0 data from the Deep Space Communications Complex, if desired; dedicated telemetry processors for each receiver; security against unauthorized access and errors; and highly automated monitor and control

    Time constrained fault tolerance and management framework for k-connected distributed wireless sensor networks based on composite event detection

    Wireless sensor nodes are themselves exceptionally complex systems in which a variety of components interact in complex ways. In enterprise scenarios it is highly important to hide the details of the underlying sensor network from applications and to guarantee a minimum level of system reliability. One challenge in achieving this level of reliability is overcoming the failures that sensor networks frequently suffer because of their tight integration with the environment. Failures can generate false information, which may trigger incorrect business processes and result in additional costs. Sensor networks are inherently fault-prone because of the shared wireless communication medium; sensor nodes can therefore lose synchrony and their programs can reach arbitrary states. Since on-site maintenance is not feasible, sensor network applications should be self-healing in a local and communication-efficient way. To the best of my knowledge, based on an extensive literature survey of the related research areas, no general framework exists that addresses all the fault issues one may encounter in a WSN. As one of the main goals of enterprise applications is to reduce the costs of business processes, a complete and more general fault tolerance and management framework for a general WSN, irrespective of node types and deployment conditions, is proposed. It would help to mitigate the propagation of failures in a business environment, reduce installation and maintenance costs, and provide the deployment flexibility needed for unobtrusive installation.

    Case study: Bio-inspired self-adaptive strategy for spike-based PID controller

    A key requirement for modern large-scale neuromorphic systems is the ability to detect and diagnose faults and to explore self-correction strategies, and in particular to do so under area constraints that meet the scalability requirements of large neuromorphic systems. A bio-inspired online fault detection and self-correction mechanism for neuro-inspired PID controllers is presented in this paper. This strategy employs a fault detection unit for online testing of the PID controller, a fault detection manager to perform the detection procedure across multiple controllers, and a controller selection mechanism to select an available fault-free controller that provides a corrective step in restoring system functionality. The novelty of the proposed work is that the fault detection method, using synapse models with excitatory and inhibitory responses, is applied to a robotic spike-based PID controller. The results are presented for robotic motor controllers and show that the proposed bio-inspired self-detection and self-correction strategy can detect faults and re-allocate resources to restore the controller’s functionality. In particular, the case study demonstrates the compactness (~1.4% area overhead) of the fault detection mechanism for large-scale robotic controllers. Ministerio de Economía y Competitividad TEC2012-37868-C04-0

    Accelerated artificial neural networks on FPGA for fault detection in automotive systems

    Modern vehicles are complex distributed systems with critical real-time electronic controls that have progressively replaced their mechanical/hydraulic counterparts for performance and cost benefits. The harsh and varying vehicular environment can induce multiple errors in the computational/communication path, with temporary or permanent effects, thus demanding the use of fault-tolerant schemes. Constraints in location, weight, and cost prevent the use of physical redundancy for critical systems in many cases, such as within an internal combustion engine. Alternatively, algorithmic techniques like artificial neural networks (ANNs) can be used to detect errors and apply corrective measures in computation. Though the adaptability of ANNs presents advantages for fault-detection and fault-tolerance measures for critical sensors, an implementation on automotive-grade processors may not meet the required hard deadlines and accuracy simultaneously. In this work, we present an ANN-based fault-tolerance system based on hybrid FPGAs and evaluate it using a diesel engine case study. We show that the hybrid platform outperforms an optimised software implementation on an automotive-grade ARM Cortex M4 processor in terms of latency and power consumption, while also providing better consolidation.

    Fault-tolerant building-block computer study

    Ultra-reliable core computers are required for improving the reliability of complex military systems. Such computers can provide reliable fault diagnosis and failure circumvention and, in some cases, serve as an automated repairman for their host systems. A small set of building-block circuits is described which can be implemented as single very large scale integration devices and which can be used with off-the-shelf microprocessors and memories to build self-checking computer modules (SCCMs). Each SCCM is a microcomputer which is capable of detecting its own faults during normal operation and is designed to communicate with other identical modules over one or more Mil Standard 1553A buses. Several SCCMs can be connected into a network with backup spares to provide fault-tolerant operation, i.e. automated recovery from faults. Alternative fault-tolerant SCCM configurations are discussed along with the cost and reliability associated with their implementation.

    Space Station Freedom data management system growth and evolution report

    The Information Sciences Division at the NASA Ames Research Center has completed a 6-month study of portions of the Space Station Freedom Data Management System (DMS). This study looked at the present capabilities and future growth potential of the DMS, and the results are documented in this report. Issues have been raised and discussed with the appropriate Johnson Space Center (JSC) management and Work Package-2 contractor organizations. Areas requiring additional study have been identified, and suggestions for long-term upgrades have been proposed. This activity has allowed the Ames personnel to develop a rapport with the JSC civil service and contractor teams that permits an independent check-and-balance technique for the DMS.

    Reconfigurable High Performance Secured NoC Design Using Hierarchical Agent-based Monitoring System

    With the rapid increase in demand for high-performance computing, data communication is also growing significantly, which increases the importance of the network-on-chip. This paper proposes a reconfigurable, fault-tolerant on-chip architecture with a hierarchical agent-based monitoring system for enhancing the performance of network-based multiprocessor systems-on-chip against faulty links and nodes. The distributed agents provide health status and congestion information for the network. This status information is then used for packet routing in the network with the help of the XY routing algorithm. The functionality of each agent is extended so that it acts not only as an information provider but also decides whether a packet passes or is stopped on its way to the processing element, by setting a firewall in order to provide security. The proposed design provides better performance and area optimization than existing network design approaches by avoiding deadlock and livelock.
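
    For readers unfamiliar with the XY routing algorithm named in this abstract, the following is a minimal sketch of plain dimension-ordered XY routing on a 2D mesh. It is not the paper's design: the function names `xy_next_hop` and `xy_route` and the `(x, y)` coordinate convention are assumptions made for illustration, and the agent-based health, congestion, and firewall logic is deliberately left out.

```python
from typing import List, Tuple

# A router position on the mesh as (x, y); this convention is an assumption.
Coord = Tuple[int, int]


def xy_next_hop(current: Coord, dest: Coord) -> Coord:
    """Return the next router on the XY path from current towards dest."""
    x, y = current
    dest_x, dest_y = dest
    # Travel along the X dimension until the destination column is reached.
    if x != dest_x:
        return (x + (1 if dest_x > x else -1), y)
    # Only then travel along the Y dimension.
    if y != dest_y:
        return (x, y + (1 if dest_y > y else -1))
    # Already at the destination: the packet is delivered locally.
    return current


def xy_route(src: Coord, dest: Coord) -> List[Coord]:
    """Return every router visited from src to dest, both included."""
    path = [src]
    while path[-1] != dest:
        path.append(xy_next_hop(path[-1], dest))
    return path


if __name__ == "__main__":
    print(xy_route((0, 0), (2, 1)))
```

    Because a packet never turns from the Y dimension back into the X dimension, plain XY routing on a mesh is deadlock-free, which is one common reason it is chosen as a baseline. A fault-aware variant would have to consult link and node status before committing to a hop.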