Self-checking on-line testable static RAM

Author: Chau Savio N.
Rennels David A.
Publication date: 06/04/1993

This is a fault-tolerant random access memory for use in fault-tolerant computers. It comprises a plurality of memory chips, each comprising a plurality of on-line testable and correctable memory cells disposed in rows and columns for holding individually addressable binary bits, with provision for error detection incorporated into each memory cell for outputting an error signal whenever a transient error occurs therein. In one embodiment, each of the memory cells comprises a pair of static memory sub-cells for simultaneously receiving and holding a common binary data bit written to the memory cell, and the error detection provision comprises comparator logic for continuously sensing and comparing the contents of the memory sub-cells to one another and for outputting the error signal whenever the contents do not match. In another embodiment, each of the memory cells comprises a static memory sub-cell and a dynamic memory sub-cell for simultaneously receiving and holding a common binary data bit written to the memory cell, and the error detection provision comprises comparator logic for continuously sensing and comparing the contents of the static memory sub-cell to the dynamic memory sub-cell and for outputting the error signal whenever the contents do not match. Capability for correction of errors is also included.

NASA Technical Reports Server
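
The duplicated sub-cell scheme in the static RAM abstract above can be illustrated with a minimal behavioral sketch. It models only the first embodiment (two sub-cells plus a comparator) in software. The class name `DualCell` and the helper `flip_sub_cell` are hypothetical names introduced for illustration. The abstract does not describe how correction works, so the sketch leaves correction out.

```python
class DualCell:
    """Behavioral model of one memory cell built from two sub-cells.

    Assumption: both sub-cells are modeled as plain integers (0 or 1);
    the real design is a hardware circuit, not software.
    """

    def __init__(self):
        self.sub_a = 0
        self.sub_b = 0

    def write(self, bit):
        # A write stores the same bit in both sub-cells simultaneously.
        self.sub_a = bit
        self.sub_b = bit

    def read(self):
        # Return the stored bit as seen by the first sub-cell.
        return self.sub_a

    def error(self):
        # Comparator logic: signal an error whenever the contents differ.
        return self.sub_a != self.sub_b


def flip_sub_cell(cell):
    """Hypothetical helper that simulates a transient upset in one sub-cell."""
    cell.sub_b ^= 1


cell = DualCell()
cell.write(1)
print(cell.error())   # no mismatch right after a write
flip_sub_cell(cell)
print(cell.error())   # mismatch after the simulated upset
```

A comparator of this kind can only report that the two copies disagree; it cannot tell which copy is wrong, which is why the sketch stops at detection.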

Fault-tolerant communication channel structures

Author: Alkalai Leon
Chau Savio N.
Tai Ann T.
Publication date: 28/03/2006

Systems and techniques for implementing fault-tolerant communication channels and features in communication systems. Selected commercial-off-the-shelf devices can be integrated in such systems to reduce the cost.

NASA Technical Reports Server

NEXUS Scalable and Distributed Next-Generation Avionics Bus for Space Missions

Author: Bolotin Gary S.
Chau Savio N.
He Yutao
Shalom Eddy
Some Raphael R.

This paper discusses NEXUS, a common, next-generation avionics interconnect that is transparently compatible with wired, fiber-optic, and RF physical layers; provides a flexible, scalable, packet-switched topology; is fault-tolerant with sub-microsecond detection/recovery latency; has scalable bandwidth from 1 Kbps to 10 Gbps; has guaranteed real-time determinism with sub-microsecond latency/jitter; has built-in testability; features low power consumption (< 100 mW per Gbps); is lightweight, with a footprint of about 5,000 logic gates; and is implemented in a small Bus Interface Unit (BIU) with a reconfigurable back end providing an interface to legacy subsystems. NEXUS enhances a commercial interconnect standard, Serial RapidIO, to meet avionics interconnect requirements without breaking the standard. This unified interconnect technology can be used to meet the performance, power, size, and reliability requirements of all ranges of equipment, sensors, and actuators at chip-to-chip, board-to-board, or box-to-box boundaries. Early results from in-house modeling of Serial RapidIO using VisualSim indicate that the use of a switched, high-performance avionics network will provide a quantum leap in spacecraft onboard science and autonomy capability for science and exploration missions.

NASA Technical Reports Server

Lunar Surface Systems Supportability Technology Development Roadmap

The Lunar Surface Systems Supportability Technology Development Roadmap is a guide for developing the technologies needed to enable the supportable, sustainable, and affordable exploration of the Moon and other destinations beyond Earth. Supportability is defined in terms of space maintenance, repair, and related logistics. This report considers the supportability lessons learned from NASA and the Department of Defense. Lunar Outpost supportability needs are summarized, and a supportability technology strategy is established to make the transition from high logistics dependence to logistics independence. This strategy will enable flight crews to act effectively to respond to problems and exploit opportunities in an environment of extreme resource scarcity and isolation. The supportability roadmap defines the general technology selection criteria. Technologies are organized into three categories: diagnostics, test, and verification; maintenance and repair; and scavenge and recycle. Furthermore, "embedded technologies" and "process technologies" are used to designate distinct technology types with different development cycles. The roadmap examines the current technology readiness level and lays out a four-phase incremental development schedule with selection decision gates. The supportability technology roadmap is intended to develop technologies with the widest possible capability and utility while minimizing the impact on crew time and training and remaining within the time and cost constraints of the program.

NASA Technical Reports Server

Catastrophic Fault Recovery with Self-Reconfigurable Chips

Author: Chau Savio N.
Marzwell Neville I.
Zheng Will Hua
Publication date: 04/10/2006

Mission-critical systems typically employ multi-string redundancy to cope with possible hardware failure. Such systems are only as fault tolerant as their number of redundant strings allows. Once a particular critical component exhausts its redundant spares, the multi-string architecture cannot tolerate any further hardware failure. This paper aims to address such catastrophic faults through the use of 'Self-Reconfigurable Chips' as a last-resort effort to 'repair' a faulty critical component.

NASA Technical Reports Server

On Automating Failure Mode Analysis and Enforcing its Integrity

Author: Chau Savio N.
Tai Ann T.
Tso Kam S.
Publication date: 01/01/2005

This paper reports our experience with the development of a design-for-safety (DFS) workbench called Risk Assessment and Management Environment (RAME) for microelectronic avionics systems. Our objective is to transform DFS practice from an ad hoc, inefficient, error-prone approach to a stringent engineering process such that DFS can keep up with the rapidly growing complexity of avionics systems. In particular, RAME is built upon an information infrastructure that comprises a fault model, a knowledge base, and a failure reporting/tracking system. This infrastructure permits systematic learning from prior projects and enables the automation of failure modes, effects and criticality analysis (FMECA). Among other unique features, the most important advantage of RAME is its capability of directly accepting design source code in hardware description languages (HDLs) for automated failure mode analysis, which enables RAME to be compatible with, and to evolve with, most electronic-computer-aided-design systems. Through an initial experimental evaluation of the RAME prototype, we show that our approach to FMECA automation improves failure mode analysis turnaround time, completeness, and accuracy.

CiteSeerX

NASA Technical Reports Server

On-board preventive maintenance: A design-oriented analytic study for long-life applications

Author: Ann T. Tai
Leon Alkalai
Savio N. Chau

With respect to the long-life missions associated with NASA’s X2000 Advanced Deep-Space System Development Program, reliability implies a system’s continuous operation for many years in an unsurveyed radiation-intense environment. Further, the stringent constraints on the mass of a spacecraft and the power on board create unprecedented challenges for achieving ultra-high mission reliability. In this paper, we present an approach to on-board preventive maintenance which rejuvenates a system by letting system components rotate between on-duty and off-duty shifts, slowing down a system’s aging process and thus enhancing mission reliability. By exploiting nondedicated system redundancy, hardware and software rejuvenation are realized simultaneously without significant performance penalty. Our design-oriented analysis confirms a potential for significant gains in mission reliability from on-board preventive maintenance and gives us useful insights into the collective effect of age-dependent failure behavior, residual mission life, risk of unsuccessful maintenance, and maintenance frequency on mission reliability. Keywords: On-board preventive maintenance, hardware and software rejuvenation, phased-mission analysis, mission reliability gain

CiteSeerX
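
The duty rotation described in the preventive-maintenance abstract above can be pictured with a small scheduling sketch. The round-robin policy, the function name `duty_schedule`, and the component labels are assumptions made for illustration; the abstract does not specify how components are selected for each shift.

```python
from collections import deque


def duty_schedule(components, on_duty_count, shifts):
    """Return the on-duty components for each shift.

    Assumption: a simple round-robin rotation, where one component
    goes off duty and one comes on duty at every shift boundary.
    """
    ring = deque(components)
    schedule = []
    for _ in range(shifts):
        # The first `on_duty_count` entries of the ring are on duty.
        schedule.append(list(ring)[:on_duty_count])
        # Rotate left by one so the next shift uses a different set.
        ring.rotate(-1)
    return schedule


# Three components, two on duty at a time, over three shifts.
for shift, on_duty in enumerate(duty_schedule(["A", "B", "C"], 2, 3)):
    print(shift, on_duty)
```

With three components and two on duty per shift, each component rests for one shift in every three, which is the kind of off-duty interval the abstract credits with slowing the aging process.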

COTS-Based Fault Tolerance in Deep Space: Qualitative and Quantitative Analyses of a Bus Network Architecture

Author: Alkalai Leon
Chau Savio N.
Tai Ann T.

Using COTS products, standards, and intellectual properties (IPs) for all the system and component interfaces is a crucial step toward significant reduction of both system cost and development cost, as the COTS interfaces enable other COTS products and IPs to be readily accommodated by the target system architecture. With respect to long-term survivable systems for deep-space missions, the major challenge for us is, under stringent power and mass constraints, to achieve ultra-high reliability in a system comprising COTS products and standards that were not developed for mission-critical applications. The spirit of our solution is to exploit the pertinent standard features of a COTS product to circumvent its shortcomings, though these standard features may not have been originally designed for highly reliable systems. In this paper, we discuss our experiences and findings on the design of an IEEE 1394 compliant fault-tolerant COTS-based bus architecture. We first derive and qualitatively analyze a "stack-tree topology" that not only complies with IEEE 1394 but also enables the implementation of a fault-tolerant bus architecture without node redundancy. We then present a quantitative evaluation that demonstrates significant reliability improvement from the COTS-based fault tolerance.

NASA Technical Reports Server