21 research outputs found

    Advanced flight computer. Special study

    Get PDF
    This report documents a special study to define a 32-bit radiation hardened, SEU tolerant flight computer architecture, and to investigate current or near-term technologies and development efforts that contribute to the Advanced Flight Computer (AFC) design and development. An AFC processing node architecture is defined. Each node may consist of a multi-chip processor as needed. The modular, building block approach uses VLSI technology and packaging methods that demonstrate a feasible AFC module in 1998 that meets that AFC goals. The defined architecture and approach demonstrate a clear low-risk, low-cost path to the 1998 production goal, with intermediate prototypes in 1996

    Improving redundant multithreading performance for soft-error detection in HPC applications

    Get PDF
    Tesis de Graduación (Maestría en Computación) Instituto Tecnológico de Costa Rica, Escuela de Computación, 2018As HPC systems move towards extreme scale, soft errors leading to silent data corruptions become a major concern. In this thesis, we propose a set of three optimizations to the classical Redundant Multithreading (RMT) approach to allow faster soft error detection. First, we leverage the use of Simultaneous Multithreading (SMT) to collocate sibling replicated threads on the same physical core to efficiently exchange data to expose errors. Some HPC applications cannot fully exploit SMT for performance improvement and instead, we propose to use these additional resources for fault tolerance. Second, we present variable aggregation to group several values together and use this merged value to speed up detection of soft errors. Third, we introduce selective checking to decrease the number of checked values to a minimum. The last two techniques reduce the overall performance overhead by relaxing the soft error detection scope. Our experimental evaluation, executed on recent multicore processors with representative HPC benchmarks, proves that the use of SMT for fault tolerance can enhance RMT performance. It also shows that, at constant computing power budget, with optimizations applied, the overhead of the technique can be significantly lower than the classical RMT replicated execution. Furthermore, these results show that RMT can be a viable solution for soft-error detection at extreme scale

    Space Station Freedom data management system growth and evolution report

    Get PDF
    The Information Sciences Division at the NASA Ames Research Center has completed a 6-month study of portions of the Space Station Freedom Data Management System (DMS). This study looked at the present capabilities and future growth potential of the DMS, and the results are documented in this report. Issues have been raised that were discussed with the appropriate Johnson Space Center (JSC) management and Work Package-2 contractor organizations. Areas requiring additional study have been identified and suggestions for long-term upgrades have been proposed. This activity has allowed the Ames personnel to develop a rapport with the JSC civil service and contractor teams that does permit an independent check and balance technique for the DMS

    Study and development of a software implemented fault injection plug-in for the Xception tool/powerPC 750

    Get PDF
    Estágio realizado na Critical Software e orientado pelo Eng.º Ricardo BarbosaTese de mestrado integrado. Engenharia Informática e Computação. Faculdade de Engenharia. Universidade do Porto. 200

    Classification of Resilience Techniques Against Functional Errors at Higher Abstraction Layers of Digital Systems

    Get PDF
    Nanoscale technology nodes bring reliability concerns back to the center stage of digital system design. A systematic classification of approaches that increase system resilience in the presence of functional hardware (HW)-induced errors is presented, dealing with higher system abstractions, such as the (micro) architecture, the mapping, and platform software (SW). The field is surveyed in a systematic way based on nonoverlapping categories, which add insight into the ongoing work by exposing similarities and differences. HW and SW solutions are discussed in a similar fashion so that interrelationships become apparent. The presented categories are illustrated by representative literature examples to illustrate their properties. Moreover, it is demonstrated how hybrid schemes can be decomposed into their primitive components

    Computer science I like proceedings of miniconference on 4.11.2011

    Get PDF

    FAT-DBT engine (framework for application-tailorcd, co-designcd dynamic binary translation enginc)

    Get PDF
    Tese de Doutoramento em Engenharia Eletrónica e de Computadores (PDEEC)Dynamic binary translation (DBT) has emerged as an execution engine that monitors, modifies and possibly optimizes running applications for specific purposes. DBT is deployed as an execution layer between the application binary and the operating system or host-machine, which creates opportunities for collecting runtime information. Initially, DBT supported binary-level compatibility, but based on the collected runtime information, it also became popular for code instrumentation, ISA-virtualization and dynamic-optimization purposes. Building a DBT system brings many challenges, as it involves complex components integration and requires deep architectural level knowledge. Moreover, DBT incurs in significant overheads, mainly due to code decoding and translation, as well as execution along with general functionalities emulation. While initially conceived bearing in mind high-end architectures for performance demanding applications, such challenges become even more evident when directing DBT to embedded systems. The latter makes an effective deployment very challenging due to its complexity, tight constraints on memory, and limited performance and power. Legacy support and binary compatibility is a topic of relevant interest in such systems, due to their broad dissemination among industrial environments and wide utilization in sensing and monitoring processes, from yearly times, with considerable maintenance and replacement costs. To address such issues, this thesis intents to contribute with a solution that leverages an optimized and accelerated dynamic binary translator targeting resourceconstrained embedded systems while supporting legacy systems. The developed work allows to: (1) evaluate the potential of DBT for legacy support purposes on the resource-constrained embedded systems; (2) achieve a configurable DBT architecture specialized for resource-constrained embedded systems; (3) address DBT translation, execution and emulation overheads through the combination of software and hardware; and (4) promote DBT utilization as a legacy support tool for the industry as a end-product.A tradução binária dinâmica (TBD) emergiu como um motor de execução que permite a modificação e possível optimização de código executável para um determinado propósito. A TBD é integrada nos sistemas como uma camada de execução entre o código binário executável e o sistema operativo ou a máquina hospedeira alvo, o que origina oportunidades de recolha de informação de execução. A criação de um sistema de TBD traz consigo diversos desafios, uma vez que envolve a integração de componentes complexos e conhecimentos aprofundados das arquitecturas de processadores envolvidas. Ademais, a utilização de TBD gera diversos custos computacionais indirectos, maioritariamente devido à descodificação e tradução de código, bem como emulação de funcionalidades em geral. Considerando que a TBD foi inicialmente pensada para sistemas de gama alta, os desafios mencionados tornam-se ainda mais evidentes quando a mesma é aplicada em sistemas embebidos. Nesta área os limitados recursos de memória e os exigentes requisitos de desempenho e consumo energético,tornam uma implementação eficiente de TBD muito difícil de obter. Compatibilidade binária e suporte a código de legado são tópicos de interesse em sistemas embebidos, justificado pela ampla disseminação dos mesmos no meio industrial para tarefas de sensorização e monitorização ao longo dos tempos, reforçado pelos custos de manutenção adjacentes à sua utilização. Para endereçar os desafios descritos, nesta tese propõe-se uma solução para potencializar a tradução binária dinâmica, optimizada e com aceleração, para suporte a código de legado em sistemas embebidos de baixa gama. O trabalho permitiu (1) avaliar o potencial da TBD quando aplicada ao suporte a código de legado em sistemas embebidos de baixa gama; (2) a obtenção de uma arquitectura de TBD configurável e especializada para este tipo de sistemas; (3) reduzir os custos computacionais associados à tradução, execução e emulação, através do uso combinado de software e hardware; (4) e promover a utilização na industria de TBD como uma ferramenta de suporte a código de legado.This thesis was supported by a PhD scholarship from Fundação para a Ciência e Tecnologia, SFRH/BD/81681/201

    Dependable Embedded Systems

    Get PDF
    This Open Access book introduces readers to many new techniques for enhancing and optimizing reliability in embedded systems, which have emerged particularly within the last five years. This book introduces the most prominent reliability concerns from today’s points of view and roughly recapitulates the progress in the community so far. Unlike other books that focus on a single abstraction level such circuit level or system level alone, the focus of this book is to deal with the different reliability challenges across different levels starting from the physical level all the way to the system level (cross-layer approaches). The book aims at demonstrating how new hardware/software co-design solution can be proposed to ef-fectively mitigate reliability degradation such as transistor aging, processor variation, temperature effects, soft errors, etc. Provides readers with latest insights into novel, cross-layer methods and models with respect to dependability of embedded systems; Describes cross-layer approaches that can leverage reliability through techniques that are pro-actively designed with respect to techniques at other layers; Explains run-time adaptation and concepts/means of self-organization, in order to achieve error resiliency in complex, future many core systems

    Hardware-Assisted Dependable Systems

    Get PDF
    Unpredictable hardware faults and software bugs lead to application crashes, incorrect computations, unavailability of internet services, data losses, malfunctioning components, and consequently financial losses or even death of people. In particular, faults in microprocessors (CPUs) and memory corruption bugs are among the major unresolved issues of today. CPU faults may result in benign crashes and, more problematically, in silent data corruptions that can lead to catastrophic consequences, silently propagating from component to component and finally shutting down the whole system. Similarly, memory corruption bugs (memory-safety vulnerabilities) may result in a benign application crash but may also be exploited by a malicious hacker to gain control over the system or leak confidential data. Both these classes of errors are notoriously hard to detect and tolerate. Usual mitigation strategy is to apply ad-hoc local patches: checksums to protect specific computations against hardware faults and bug fixes to protect programs against known vulnerabilities. This strategy is unsatisfactory since it is prone to errors, requires significant manual effort, and protects only against anticipated faults. On the other extreme, Byzantine Fault Tolerance solutions defend against all kinds of hardware and software errors, but are inadequately expensive in terms of resources and performance overhead. In this thesis, we examine and propose five techniques to protect against hardware CPU faults and software memory-corruption bugs. All these techniques are hardware-assisted: they use recent advancements in CPU designs and modern CPU extensions. Three of these techniques target hardware CPU faults and rely on specific CPU features: ∆-encoding efficiently utilizes instruction-level parallelism of modern CPUs, Elzar re-purposes Intel AVX extensions, and HAFT builds on Intel TSX instructions. The rest two target software bugs: SGXBounds detects vulnerabilities inside Intel SGX enclaves, and “MPX Explained” analyzes the recent Intel MPX extension to protect against buffer overflow bugs. Our techniques achieve three goals: transparency, practicality, and efficiency. All our systems are implemented as compiler passes which transparently harden unmodified applications against hardware faults and software bugs. They are practical since they rely on commodity CPUs and require no specialized hardware or operating system support. Finally, they are efficient because they use hardware assistance in the form of CPU extensions to lower performance overhead
    corecore