2,337 research outputs found

    Parallel Architectures for Planetary Exploration Requirements (PAPER)

    Get PDF
    The Parallel Architectures for Planetary Exploration Requirements (PAPER) project is essentially research oriented towards technology insertion issues for NASA's unmanned planetary probes. It was initiated to complement and augment the long-term efforts for space exploration with particular reference to NASA/LaRC's (NASA Langley Research Center) research needs for planetary exploration missions of the mid and late 1990s. The requirements for space missions as given in the somewhat dated Advanced Information Processing Systems (AIPS) requirements document are contrasted with the new requirements from JPL/Caltech involving sensor data capture and scene analysis. It is shown that more stringent requirements have arisen as a result of technological advancements. Two possible architectures, the AIPS Proof of Concept (POC) configuration and the MAX Fault-tolerant dataflow multiprocessor, were evaluated. The main observation was that the AIPS design is biased towards fault tolerance and may not be an ideal architecture for planetary and deep space probes due to high cost and complexity. The MAX concepts appears to be a promising candidate, except that more detailed information is required. The feasibility for adding neural computation capability to this architecture needs to be studied. Key impact issues for architectural design of computing systems meant for planetary missions were also identified

    Redundancy management for efficient fault recovery in NASA's distributed computing system

    Get PDF
    The management of redundancy in computer systems was studied and guidelines were provided for the development of NASA's fault-tolerant distributed systems. Fault recovery and reconfiguration mechanisms were examined. A theoretical foundation was laid for redundancy management by efficient reconfiguration methods and algorithmic diversity. Algorithms were developed to optimize the resources for embedding of computational graphs of tasks in the system architecture and reconfiguration of these tasks after a failure has occurred. The computational structure represented by a path and the complete binary tree was considered and the mesh and hypercube architectures were targeted for their embeddings. The innovative concept of Hybrid Algorithm Technique was introduced. This new technique provides a mechanism for obtaining fault tolerance while exhibiting improved performance

    Autonomous spacecraft maintenance study group

    Get PDF
    A plan to incorporate autonomous spacecraft maintenance (ASM) capabilities into Air Force spacecraft by 1989 is outlined. It includes the successful operation of the spacecraft without ground operator intervention for extended periods of time. Mechanisms, along with a fault tolerant data processing system (including a nonvolatile backup memory) and an autonomous navigation capability, are needed to replace the routine servicing that is presently performed by the ground system. The state of the art fault handling capabilities of various spacecraft and computers are described, and a set conceptual design requirements needed to achieve ASM is established. Implementations for near term technology development needed for an ASM proof of concept demonstration by 1985, and a research agenda addressing long range academic research for an advanced ASM system for 1990s are established

    Modeling and measurement of fault-tolerant multiprocessors

    Get PDF
    The workload effects on computer performance are addressed first for a highly reliable unibus multiprocessor used in real-time control. As an approach to studing these effects, a modified Stochastic Petri Net (SPN) is used to describe the synchronous operation of the multiprocessor system. From this model the vital components affecting performance can be determined. However, because of the complexity in solving the modified SPN, a simpler model, i.e., a closed priority queuing network, is constructed that represents the same critical aspects. The use of this model for a specific application requires the partitioning of the workload into job classes. It is shown that the steady state solution of the queuing model directly produces useful results. The use of this model in evaluating an existing system, the Fault Tolerant Multiprocessor (FTMP) at the NASA AIRLAB, is outlined with some experimental results. Also addressed is the technique of measuring fault latency, an important microscopic system parameter. Most related works have assumed no or a negligible fault latency and then performed approximate analyses. To eliminate this deficiency, a new methodology for indirectly measuring fault latency is presented

    Design of a fault tolerant airborne digital computer. Volume 1: Architecture

    Get PDF
    This volume is concerned with the architecture of a fault tolerant digital computer for an advanced commercial aircraft. All of the computations of the aircraft, including those presently carried out by analogue techniques, are to be carried out in this digital computer. Among the important qualities of the computer are the following: (1) The capacity is to be matched to the aircraft environment. (2) The reliability is to be selectively matched to the criticality and deadline requirements of each of the computations. (3) The system is to be readily expandable. contractible, and (4) The design is to appropriate to post 1975 technology. Three candidate architectures are discussed and assessed in terms of the above qualities. Of the three candidates, a newly conceived architecture, Software Implemented Fault Tolerance (SIFT), provides the best match to the above qualities. In addition SIFT is particularly simple and believable. The other candidates, Bus Checker System (BUCS), also newly conceived in this project, and the Hopkins multiprocessor are potentially more efficient than SIFT in the use of redundancy, but otherwise are not as attractive

    Generalized hypercube structures and hyperswitch communication network

    Get PDF
    This paper discusses an ongoing study that uses a recent development in communication control technology to implement hybrid hypercube structures. These architectures are similar to binary hypercubes, but they also provide added connectivity between the processors. This added connectivity increases communication reliability while decreasing the latency of interprocessor message passing. Because these factors directly determine the speed that can be obtained by multiprocessor systems, these architectures are attractive for applications such as remote exploration and experimentation, where high performance and ultrareliability are required. This paper describes and enumerates these architectures and discusses how they can be implemented with a modified version of the hyperswitch communication network (HCN). The HCN is analyzed because it has three attractive features that enable these architectures to be effective: speed, fault tolerance, and the ability to pass multiple messages simultaneously through the same hyperswitch controller

    Framework for simulation of fault tolerant heterogeneous multiprocessor system-on-chip

    Full text link
    Due to the ever growing requirement in high performance data computation, current Uniprocessor systems fall short of hand to meet critical real-time performance demands in (i) high throughput (ii) faster processing time (iii) low power consumption (iv) design cost and time-to-market factors and more importantly (v) fault tolerant processing. Shifting the design trend to MPSOCs is a work-around to meet these challenges. However, developing efficient fault tolerant task scheduling and mapping techniques requires optimized algorithms that consider the various scenarios in Multiprocessor environments. Several works have been done in the past few years which proposed simulation based frameworks for scheduling and mapping strategies that considered homogenous systems and error avoidance techniques. However, most of these works inadequately describe today\u27s MPSOC trend because they were focused on the network domain and didn\u27t consider heterogeneous systems with fault tolerant capabilities; In order to address these issues, this work proposes (i) a performance driven scheduling algorithm (PD SA) based on simulated annealing technique (ii) an optimized Homogenous-Workload-Distribution (HWD) Multiprocessor task mapping algorithm which considers the dynamic workload on processors and (iii) a dynamic Fault Tolerant (FT) scheduling/mapping algorithm to employ robust application processing system. The implementation was accompanied by a heterogeneous Multiprocessor system simulation framework developed in systemC/C++. The proposed framework reads user data, set the architecture, execute input task graph and finally generate performance variables. This framework alleviates previous work issues with respect to (i) architectural flexibility in number-of-processors, processor types and topology (ii) optimized scheduling and mapping strategies and (iii) fault-tolerant processing capability focusing more on the computational domain; A set of random as well as application specific STG benchmark suites were run on the simulator to evaluate and verify the performance of the proposed algorithms. The simulations were carried out for (i) scheduling policy evaluation (ii) fault tolerant evaluation (iii) topology evaluation (iv) Number of processor evaluation (v) Mapping policy evaluation and (vi) Processor Type evaluation. The results showed that PD scheduling algorithm showed marginally better performance than EDF with respect to utilization, Execution-Time and Power factors. The dynamic Fault Tolerant implementation showed to be a viable and efficient strategy to meet real-time constraints without posing significant system performance degradation. Torus topology gave better performance than Tile with respect to task completion time and power factors. Executing highly heterogeneous Tasks showed higher power consumption and execution time. Finally, increasing the number of processors showed a decrease in average Utilization but improved task completion time and power consumption; Based on the simulation results, the system designer can compare tradeoffs between a various design choices with respect to the performance requirement specifications. In general, designing an optimized Multiprocessor scheduling and mapping strategy with added fault tolerant capability will enable to develop efficient Multiprocessor systems which meet future performance goal requirements. This is the substance of this work

    High-level services for networks-on-chip

    Get PDF
    Future technology trends envision that next-generation Multiprocessors Systems-on- Chip (MPSoCs) will be composed of a combination of a large number of processing and storage elements interconnected by complex communication architectures. Communication and interconnection between these basic blocks play a role of crucial importance when the number of these elements increases. Enabling reliable communication channels between cores becomes therefore a challenge for system designers. Networks-on-Chip (NoCs) appeared as a strategy for connecting and managing the communication between several design elements and IP blocks, as required in complex Systems-on-Chip (SoCs). The topic can be considered as a multidisciplinary synthesis of multiprocessing, parallel computing, networking, and on- chip communication domains. Networks-on-Chip, in addition to standard communication services, can be employed for providing support for the implementation of system-level services. This dissertation will demonstrate how high-level services can be added to an MPSoC platform by embedding appropriate hardware/software support in the network interfaces (NIs) of the NoC. In this dissertation, the implementation of innovative modules acting in parallel with protocol translation and data transmission in NIs is proposed and evaluated. The modules can support the execution of the high-level services in the NoC at a relatively low cost in terms of area and energy consumption. Three types of services will be addressed and discussed: security, monitoring, and fault tolerance. With respect to the security aspect, this dissertation will discuss the implementation of an innovative data protection mechanism for detecting and preventing illegal accesses to protected memory blocks and/or memory mapped peripherals. The second aspect will be addressed by proposing the implementation of a monitoring system based on programmable multipurpose monitoring probes aimed at detecting NoC internal events and run-time characteristics. As last topic, new architectural solutions for the design of fault tolerant network interfaces will be presented and discussed
    corecore