58 research outputs found

    Autonomous Recovery Of Reconfigurable Logic Devices Using Priority Escalation Of Slack

    Get PDF
    Field Programmable Gate Array (FPGA) devices offer a suitable platform for survivable hardware architectures in mission-critical systems. In this dissertation, active dynamic redundancy-based fault-handling techniques are proposed which exploit the dynamic partial reconfiguration capability of SRAM-based FPGAs. Self-adaptation is realized by employing reconfiguration in detection, diagnosis, and recovery phases. To extend these concepts to semiconductor aging and process variation in the deep submicron era, resilient adaptable processing systems are sought to maintain quality and throughput requirements despite the vulnerabilities of the underlying computational devices. A new approach to autonomous fault-handling which addresses these goals is developed using only a uniplex hardware arrangement. It operates by observing a health metric to achieve Fault Demotion using Recon- figurable Slack (FaDReS). Here an autonomous fault isolation scheme is employed which neither requires test vectors nor suspends the computational throughput, but instead observes the value of a health metric based on runtime input. The deterministic flow of the fault isolation scheme guarantees success in a bounded number of reconfigurations of the FPGA fabric. FaDReS is then extended to the Priority Using Resource Escalation (PURE) online redundancy scheme which considers fault-isolation latency and throughput trade-offs under a dynamic spare arrangement. While deep-submicron designs introduce new challenges, use of adaptive techniques are seen to provide several promising avenues for improving resilience. The scheme developed is demonstrated by hardware design of various signal processing circuits and their implementation on a Xilinx Virtex-4 FPGA device. These include a Discrete Cosine Transform (DCT) core, Motion Estimation (ME) engine, Finite Impulse Response (FIR) Filter, Support Vector Machine (SVM), and Advanced Encryption Standard (AES) blocks in addition to MCNC benchmark circuits. A iii significant reduction in power consumption is achieved ranging from 83% for low motion-activity scenes to 12.5% for high motion activity video scenes in a novel ME engine configuration. For a typical benchmark video sequence, PURE is shown to maintain a PSNR baseline near 32dB. The diagnosability, reconfiguration latency, and resource overhead of each approach is analyzed. Compared to previous alternatives, PURE maintains a PSNR within a difference of 4.02dB to 6.67dB from the fault-free baseline by escalating healthy resources to higher-priority signal processing functions. The results indicate the benefits of priority-aware resiliency over conventional redundancy approaches in terms of fault-recovery, power consumption, and resource-area requirements. Together, these provide a broad range of strategies to achieve autonomous recovery of reconfigurable logic devices under a variety of constraints, operating conditions, and optimization criteria

    Large space structure experiments for AAP. Volume 5 - Parabolic antenna Final report, 15 Sep. 1966 - 15 Sep. 1967

    Get PDF
    Apollo Applications Program parabolic antenna experiment to evaluate large space structures and extravehicular activitie

    Fault Tolerant Task Mapping in Many-Core Systems

    Get PDF
    The advent of many-core systems, a network on chip containing hundreds or thousands of homogeneous processors cores, present new challenges in managing the cores effectively in response to processing demands, hardware faults and the need for heat management. Continually diminishing feature size of devices increase the probability of fabrication de- fects and the variability of performance of individual transistors. In many-core systems this can result in the failure of individual processing cores, routing nodes or communication links, which require the use of fault tolerant mechanisms. Diminishing feature size also increases the power density of devices, giving rise to the concept of dark silicon where only a portion of the functionality available on a chip can be active at any one time. Core fault tolerance and management of dark silicon can both be achieved by allocating a percentage of cores to be idle at any one time. Idle cores can be used as dark silicon to evenly distribute heat generated by processing cores and can also be used as spare cores to implement fault tolerance. Both of these can be achieved by the dynamic allocation of processes to tasks in response to changes to the status of hardware resources and the demands placed on the system, which in turn requires real time task mapping. This research proposes the use of a continuous fault/recovery cycle to implement graceful degradation and amelioration to provide real-time fault tolerance. Objective measures for core fault tolerance, link fault tolerance, network power and excess traffic have been developed for use by a multi-objective evolutionary algorithm that uses knowledge of the processing demands and hardware status to identify optimal task mappings. The fault/recovery cycle is shown to be effective in maintaining a high level of performance of a many-core array when presented with a series of hardware faults

    A Proposal for a Three Detector Short-Baseline Neutrino Oscillation Program in the Fermilab Booster Neutrino Beam

    Get PDF
    A Short-Baseline Neutrino (SBN) physics program of three LAr-TPC detectors located along the Booster Neutrino Beam (BNB) at Fermilab is presented. This new SBN Program will deliver a rich and compelling physics opportunity, including the ability to resolve a class of experimental anomalies in neutrino physics and to perform the most sensitive search to date for sterile neutrinos at the eV mass-scale through both appearance and disappearance oscillation channels. Using data sets of 6.6e20 protons on target (P.O.T.) in the LAr1-ND and ICARUS T600 detectors plus 13.2e20 P.O.T. in the MicroBooNE detector, we estimate that a search for muon neutrino to electron neutrino appearance can be performed with ~5 sigma sensitivity for the LSND allowed (99% C.L.) parameter region. In this proposal for the SBN Program, we describe the physics analysis, the conceptual design of the LAr1-ND detector, the design and refurbishment of the T600 detector, the necessary infrastructure required to execute the program, and a possible reconfiguration of the BNB target and horn system to improve its performance for oscillation searches.Comment: 209 pages, 129 figure

    The Seventeenth Space Simulation Conference. Terrestrial Test for Space Success

    Get PDF
    The Institute of Environmental Sciences' Seventeenth Space Simulation Conference, 'Terrestrial Test for Space Success' provided participants with a forum to acquire and exchange information on the state of the art in space simulation, test technology, atomic oxygen, dynamics testing, contamination, and materials. The papers presented at this conference and the resulting discussions carried out the conference theme of 'terrestrial test for space success.

    Cumulative index to NASA Tech Briefs, 1986-1990, volumes 10-14

    Get PDF
    Tech Briefs are short announcements of new technology derived from the R&D activities of the National Aeronautics and Space Administration. These briefs emphasize information considered likely to be transferrable across industrial, regional, or disciplinary lines and are issued to encourage commercial application. This cumulative index of Tech Briefs contains abstracts and four indexes (subject, personal author, originating center, and Tech Brief number) and covers the period 1986 to 1990. The abstract section is organized by the following subject categories: electronic components and circuits, electronic systems, physical sciences, materials, computer programs, life sciences, mechanics, machinery, fabrication technology, and mathematics and information sciences

    Technology 2001: The Second National Technology Transfer Conference and Exposition, volume 1

    Get PDF
    Papers from the technical sessions of the Technology 2001 Conference and Exposition are presented. The technical sessions featured discussions of advanced manufacturing, artificial intelligence, biotechnology, computer graphics and simulation, communications, data and information management, electronics, electro-optics, environmental technology, life sciences, materials science, medical advances, robotics, software engineering, and test and measurement

    NASA Tech Briefs, Fall 1978

    Get PDF
    Topics covered: NASA TU Services: Technology Utilization services that can assist you in learning about and applying NASA technology; New Product Ideas: A summary of selected innovations of value to manufacturers for the development of new products; Electronic Components and Circuits; Electronic Systems; Physical Sciences; Materials; Life Sciences; Mechanics; Machinery; Fabrication Technology; Mathematics and Information Sciences

    Scaling and Resilience in Numerical Algorithms for Exascale Computing

    Get PDF
    The first Petascale supercomputer, the IBM Roadrunner, went online in 2008. Ten years later, the community is now looking ahead to a new generation of Exascale machines. During the decade that has passed, several hundred Petascale capable machines have been installed worldwide, yet despite the abundance of machines, applications that scale to their full size remain rare. Large clusters now routinely have 50.000+ cores, some have several million. This extreme level of parallelism, that has allowed a theoretical compute capacity in excess of a million billion operations per second, turns out to be difficult to use in many applications of practical interest. Processors often end up spending more time waiting for synchronization, communication, and other coordinating operations to complete, rather than actually computing. Component reliability is another challenge facing HPC developers. If even a single processor fail, among many thousands, the user is forced to restart traditional applications, wasting valuable compute time. These issues collectively manifest themselves as low parallel efficiency, resulting in waste of energy and computational resources. Future performance improvements are expected to continue to come in large part due to increased parallelism. One may therefore speculate that the difficulties currently faced, when scaling applications to Petascale machines, will progressively worsen, making it difficult for scientists to harness the full potential of Exascale computing. The thesis comprises two parts. Each part consists of several chapters discussing modifications of numerical algorithms to make them better suited for future Exascale machines. In the first part, the use of Parareal for Parallel-in-Time integration techniques for scalable numerical solution of partial differential equations is considered. We propose a new adaptive scheduler that optimize the parallel efficiency by minimizing the time-subdomain length without making communication of time-subdomains too costly. In conjunction with an appropriate preconditioner, we demonstrate that it is possible to obtain time-parallel speedup on the nonlinear shallow water equation, beyond what is possible using conventional spatial domain-decomposition techniques alone. The part is concluded with the proposal of a new method for constructing Parallel-in-Time integration schemes better suited for convection dominated problems. In the second part, new ways of mitigating the impact of hardware failures are developed and presented. The topic is introduced with the creation of a new fault-tolerant variant of Parareal. In the chapter that follows, a C++ Library for multi-level checkpointing is presented. The library uses lightweight in-memory checkpoints, protected trough the use of erasure codes, to mitigate the impact of failures by decreasing the overhead of checkpointing and minimizing the compute work lost. Erasure codes have the unfortunate property that if more data blocks are lost than parity codes created, the data is effectively considered unrecoverable. The final chapter contains a preliminary study on partial information recovery for incomplete checksums. Under the assumption that some meta knowledge exists on the structure of the data encoded, we show that the data lost may be recovered, at least partially. This result is of interest not only in HPC but also in data centers where erasure codes are widely used to protect data efficiently

    Topical Workshop on Electronics for Particle Physics

    Get PDF
    corecore