
    Optimal discrimination between transient and permanent faults

    An important practical problem in fault diagnosis is discriminating between permanent faults and transient faults. In many computer systems, the majority of errors are due to transient faults. Many heuristic methods have been used for discriminating between transient and permanent faults; however, we have found no previous work stating this decision problem in clear probabilistic terms. We present an optimal procedure for discriminating between transient and permanent faults, based on applying Bayesian inference to the observed events (correct and erroneous results). We describe how the assessed probability that a module is permanently faulty must vary with observed symptoms. We describe and demonstrate our proposed method on a simple application problem, building the appropriate equations and showing numerical examples. The method can be implemented as a run-time diagnosis algorithm at little computational cost; it can also be used to evaluate any heuristic diagnostic procedure by comparison.
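    A minimal sketch of the kind of Bayesian update the abstract describes, in Python. The likelihood parameters (probability of an erroneous result under each hypothesis) are illustrative assumptions, not values from the paper:

```python
def update_permanent_probability(prior, outcomes,
                                 p_err_if_permanent=0.9,
                                 p_err_if_transient=0.05):
    """Bayesian update of P(module is permanently faulty).

    outcomes: iterable of booleans, True = erroneous result observed.
    The two likelihood parameters are illustrative assumptions.
    """
    p = prior
    for erroneous in outcomes:
        like_perm = p_err_if_permanent if erroneous else 1.0 - p_err_if_permanent
        like_tran = p_err_if_transient if erroneous else 1.0 - p_err_if_transient
        evidence = p * like_perm + (1.0 - p) * like_tran
        p = p * like_perm / evidence  # posterior becomes the next prior
    return p

# One error followed by three correct results: the assessed probability
# first jumps, then decays as correct results accumulate.
print(update_permanent_probability(0.5, [True, False, False, False]))
```

    Each erroneous result sharply raises the assessed probability of a permanent fault, while each subsequent correct result lowers it, matching the qualitative behavior the abstract describes.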

    Benefits and Challenges of Model-based Software Engineering: Lessons Learned based on Qualitative and Quantitative Findings

    Even though Model-based Software Engineering (MBSwE) techniques and Autogenerated Code (AGC) have been increasingly used to produce complex software systems, there is only anecdotal knowledge about the state of the practice. Furthermore, there is a lack of empirical studies that explore the potential quality improvements due to the use of these techniques. This paper presents in-depth qualitative findings about development and Software Assurance (SWA) practices and a detailed quantitative analysis of software bug reports of a NASA mission that used MBSwE and AGC. The mission's flight software is a combination of handwritten code and AGC developed by two different approaches: one based on state chart models (AGC-M) and another on specification dictionaries (AGC-D). The empirical analysis of fault proneness is based on 380 closed bug reports created by software developers. Our main findings include: (1) MBSwE and AGC provide some benefits, but also impose challenges. (2) SWA done only at the model level is not sufficient; AGC should also be tested, the models and AGC should always be kept in sync, and AGC must not be changed manually. (3) Fixes made to address an individual bug report were spread both across multiple modules and across multiple files; on average, each bug report led to fixes in 1.4 modules and 3.4 files. (4) Most bug reports led to changes in more than one type of file. The majority of changes to auto-generated source code files were made in conjunction with changes to either files with state chart models or XML files derived from dictionaries. (5) For newly developed files, AGC-M and handwritten code were of similar quality, while AGC-D files were the least fault prone.

    Seismic Loading Effects within Orthogonally Connected Steel Lateral Force Resisting Systems

    Steel buildings located within seismically active regions require special design considerations to ensure public safety and prevent collapse during an extreme seismic event. Two commonly used steel systems are special moment frames (SMFs) and buckling-restrained braced frames (BRBFs). When two seismic systems share a common column in an orthogonal configuration (such as at a building corner), design specifications currently apply a 100+30 rule, wherein the shared column is designed for 100% of the fuse demand in one direction plus 30% of the fuse demand from the other direction. While this rule has been shown to be reasonable for elastic building response, a few studies performed on inelastic systems suggest that the 100+30 rule may not be reasonable for systems expected to experience significant inelastic response. This study investigated nonlinear effects resulting from simultaneous earthquake loading of orthogonally oriented seismic systems. Detailed nonlinear time-history analyses of three-dimensional frame configurations were conducted, addressing coupled and non-coupled orthogonal system effects on the resulting shared column demands. Various seismic system pairs (sharing a column) were considered, including both moment frames and braced frames. Results indicate that the current 100+30 rule is non-conservative for some frame-type combinations. Bidirectional seismic effects in coupled steel systems showed increased column axial demands over independent demand additions from uncoupled (unidirectional loading) analyses. Braced-frame-to-moment-frame configurations were more affected by bidirectional lateral forces than braced-frame-to-braced-frame orthogonal configurations. Additionally, uncoupled steel systems experienced higher inter-story drift demands than coupled frame configurations of the same geometry. A new approach to estimating shared column demands in orthogonal seismic systems is proposed herein.
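    For concreteness, a small numeric sketch of the 100+30 directional combination for a shared corner column; the demand values below are invented for illustration:

```python
# Hypothetical axial demands for a corner column shared by two orthogonal
# lateral force resisting systems; the numbers are invented for the example.
demand_x = 800.0  # demand (kN) from 100% of the X-direction fuse mechanism
demand_y = 600.0  # demand (kN) from 100% of the Y-direction fuse mechanism

combo_1 = 1.0 * demand_x + 0.3 * demand_y  # 100% X + 30% Y
combo_2 = 0.3 * demand_x + 1.0 * demand_y  # 30% X + 100% Y
design_demand = max(combo_1, combo_2)      # governing 100+30 combination
full_addition = demand_x + demand_y        # fully simultaneous demand addition
print(design_demand, full_addition)        # 980.0 vs. 1400.0 kN
```

    The gap between the two printed values is the margin the 100+30 rule gives up relative to full simultaneous addition; the study's finding is that for some inelastic frame combinations the true coupled demand lands closer to the larger value.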

    Compiler-Assisted Multiple Instruction Rollback Recovery Using a Read Buffer

    Multiple instruction rollback (MIR) is a technique for rapid recovery from transient processor failures and has been implemented in hardware in mainframe computers. Hardware-based MIR designs eliminate rollback data hazards by providing data redundancy implemented in hardware. Compiler-based MIR designs have also been developed which remove rollback data hazards directly with data-flow manipulations, thus eliminating the need for most data redundancy hardware. Compiler-assisted techniques to achieve multiple instruction rollback recovery are addressed here. It is observed that some data hazards resulting from instruction rollback can be resolved more efficiently by providing hardware redundancy, while others are resolved more efficiently with compiler transformations. A compiler-assisted multiple instruction rollback scheme is developed which combines hardware-implemented data redundancy with compiler-driven hazard removal transformations. Experimental performance evaluations indicate improved efficiency over previous hardware-based and compiler-based schemes. Various enhancements to the compiler transformations and to the data redundancy hardware developed for the compiler-assisted MIR scheme are described and evaluated. The final topic deals with the application of compiler-assisted MIR techniques to aid in exception repair and branch repair in a speculative execution architecture.
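    As a rough illustration of the read-buffer idea named in the title (a toy model only; the paper's actual hardware design is not reproduced here), the buffer logs the register values each instruction reads so that a rollback of the last N instructions can restore them:

```python
from collections import deque

class ReadBuffer:
    """Toy model of a read buffer supporting N-instruction rollback.

    Each retiring instruction logs the (register, value) pairs it read;
    rolling back the last n instructions restores those source values,
    resolving the hazard of a later write clobbering them. Illustrative
    sketch only, not the paper's design.
    """
    def __init__(self, depth):
        self.depth = depth  # rollback window size N
        self.log = deque()  # one {register: value} dict per instruction

    def record(self, reads):
        self.log.append(dict(reads))
        if len(self.log) > self.depth:
            self.log.popleft()  # older than the window: safe to discard

    def rollback(self, registers, n):
        for _ in range(min(n, len(self.log))):
            for reg, old in self.log.pop().items():
                registers[reg] = old

regs = {"r1": 7}
buf = ReadBuffer(depth=4)
buf.record({"r1": regs["r1"]})  # an instruction reads r1 = 7 ...
regs["r1"] = 99                 # ... and a later instruction overwrites r1
buf.rollback(regs, n=2)
print(regs["r1"])               # 7: the pre-rollback source value is restored
```

    The compiler-assisted idea in the abstract is to reserve this hardware path for hazards that are expensive to remove statically, and to let data-flow transformations (e.g., renaming) eliminate the rest.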

    Lunar Surface-to-Surface Power Transfer

    A human lunar outpost, under NASA study for construction in the 2020s, has potential requirements to transfer up to 50 kW of electric power across the lunar surface over distances of 0.1 to 10 km. This power would be used to operate surface payloads located remotely from the outpost and/or the outpost primary power grid. This paper describes concept designs for state-of-the-art power transfer subsystems, including AC or DC power via cables, beamed radio frequency power, and beamed laser power. Power transfer subsystem mass and performance are calculated and compared for each option. A simplified qualitative assessment of option operations, hazards, costs, and technology needs is also described. Based on these concept designs and performance analyses, a DC power cabling subsystem is recommended to minimize subsystem mass as well as mission and programmatic costs and risks. Avenues for additional power transfer subsystem studies are recommended.
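    To illustrate the kind of trade the cable option involves, here is a back-of-the-envelope conductor sizing from textbook DC circuit physics; the voltage, loss budget, and aluminum conductor are assumptions for the sketch, not the paper's subsystem model:

```python
# Back-of-the-envelope DC cable sizing (all parameter values are assumptions).
P = 50e3          # delivered power, W
V = 1000.0        # transmission voltage, V
L = 10e3          # one-way distance, m
loss_frac = 0.05  # allowed resistive loss fraction
rho = 2.8e-8      # aluminum resistivity, ohm*m
density = 2700.0  # aluminum density, kg/m^3

I = P / V                      # line current, A
R_budget = loss_frac * V / I   # total loop resistance for the loss budget, ohm
A = rho * (2 * L) / R_budget   # conductor cross-section, m^2 (out + return)
mass = density * A * (2 * L)   # total conductor mass, kg
print(f"{I:.0f} A, {A * 1e6:.0f} mm^2, {mass / 1e3:.1f} t of conductor")
```

    At a fixed loss fraction the conductor mass grows with the square of distance and falls with the square of voltage, which is why cable concepts for the longer transfer distances favor high transmission voltage.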

    FAST FLUX TEST FACILITY MONTHLY INFORMAL TECHNICAL PROGRESS REPORT: MARCH 1969

    This report was prepared by Battelle-Northwest under Contract No. AT(45-1)-1830 for the Atomic Energy Commission, Division of Reactor Development and Technology, to summarize technical progress made in the Fast Flux Test Facility Program during March 1969.

    Adaptive Load Sharing for Network Processors

    A novel scheme for processing packets in a router is presented, which provides load sharing among multiple network processors distributed within the router. It is complemented by a feedback control mechanism designed to prevent processor overload. Incoming traffic is scheduled to multiple processors based on a deterministic mapping. The mapping formula is derived from the robust hash routing (also known as highest random weight, HRW) scheme introduced in K.W. Ross, IEEE Network, 11(6), 1997, and D.G. Thaler et al., IEEE Trans. Networking, 6(1), 1998. No state information on individual flow mappings needs to be stored; instead, for each packet a mapping function is computed over an identifier vector, a predefined set of fields in the packet. An adaptive extension to the HRW scheme is provided in order to cope with biased traffic patterns. We prove that our adaptation possesses the minimal disruption property with respect to the mapping, and we exploit that property to minimize the probability of flow reordering. Simulation results indicate that the scheme achieves significant improvements in processor utilization. A higher number of router interfaces can thus be supported with the same amount of processing power.
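    A compact sketch of HRW-style mapping with per-processor weights; the hash choice and weight values here are assumptions, and the paper's adaptive weight-update policy is not reproduced:

```python
import hashlib

def hrw_map(identifier, processors, weights):
    """Map a packet's identifier vector to a processor by highest random
    weight (robust hash routing). Simplified illustration only.
    """
    def score(pid):
        digest = hashlib.sha256(repr((identifier, pid)).encode()).digest()
        x = int.from_bytes(digest[:8], "big") / 2**64  # pseudo-uniform in [0, 1)
        return weights[pid] * x
    return max(processors, key=score)

# Identifier vector: predefined packet-header fields. The same vector always
# maps to the same processor, so no per-flow state is stored; adjusting one
# weight only remaps flows into or out of that one processor, which is the
# minimal disruption property the abstract refers to.
procs = ["NP0", "NP1", "NP2"]
w = {"NP0": 1.0, "NP1": 1.0, "NP2": 0.8}  # adaptive weights (assumed values)
print(hrw_map(("10.0.0.1", "10.0.0.2", 6), procs, w))
```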

    Structures performance, benefit, cost study

    New technology concepts and structural analysis development needs which could lead to improved life-cycle cost for future high-bypass turbofans were studied. The NASA-GE energy efficient engine technology is used as a base to assess the concept benefits. Recommended programs are identified for attaining these generic structural and other beneficial technologies.

    Low-Memory Techniques for Routing and Fault-Tolerance on the Fat-Tree Topology

    PC clusters are currently considered an efficient alternative for building supercomputers in which thousands of compute nodes are connected through an interconnection network. The interconnection network must be designed carefully, since it has a strong influence on the overall performance of the system. Two of the main design parameters of an interconnection network are its topology and its routing. The topology defines how the network elements are interconnected, both among themselves and with the compute nodes, while the routing defines the paths that packets follow through the network. Performance has traditionally been the main metric used to evaluate interconnection networks, but nowadays two additional metrics must be considered: cost and fault tolerance. Interconnection networks must scale in cost as well as in performance; that is, they must sustain their throughput as the network grows without a disproportionate increase in cost. Moreover, as the number of nodes in cluster machines increases, the interconnection network must grow accordingly. This increase in the number of network elements raises the probability of faults, so fault tolerance is practically mandatory for current interconnection networks. This thesis focuses on the fat-tree topology, since it is one of the most commonly used topologies in clusters. The goal of the thesis is to exploit the particular characteristics of the fat-tree to provide fault tolerance and a routing algorithm able to balance the network load, offering a good trade-off between performance and cost. Gómez Requena, C. (2010). Low-Memory Techniques for Routing and Fault-Tolerance on the Fat-Tree Topology [Unpublished doctoral thesis]. Universitat Politècnica de València. https://doi.org/10.4995/Thesis/10251/8856
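    A sketch of the classic two-phase (ascending/descending) routing on a k-ary n-tree, the fat-tree variant commonly used in clusters; choosing the up-link at each stage from the destination's digits spreads traffic deterministically without per-switch forwarding tables. This is an illustration under those assumptions, not necessarily the thesis' exact algorithm:

```python
def route_fat_tree(src, dst, k, n):
    """Two-phase routing in a k-ary n-tree connecting k**n compute nodes.

    Ascend just far enough to reach a subtree containing both endpoints,
    then descend following the destination's base-k digits. Illustrative
    sketch only.
    """
    src_digits = [(src // k**i) % k for i in range(n)]  # base-k, LSB first
    dst_digits = [(dst // k**i) % k for i in range(n)]
    # Stages to ascend: one past the highest digit where src and dst differ.
    up_stages = max((i + 1 for i in range(n) if src_digits[i] != dst_digits[i]),
                    default=0)
    up_links = [dst_digits[i] for i in range(up_stages)]          # per-stage up-link
    down_ports = [dst_digits[i] for i in reversed(range(up_stages))]  # descent ports
    return up_links, down_ports

# Route from node 5 to node 26 in a 3-ary 3-tree (27 nodes):
print(route_fat_tree(src=5, dst=26, k=3, n=3))
```

    Because the path is a pure function of the source and destination addresses, a switch needs almost no routing memory, which is the low-memory property the title highlights.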

    Compiler-Assisted Multiple Instruction Rollback Recovery Using a Read Buffer

    Coordinated Science Laboratory was formerly known as Control Systems Laboratory
    National Aeronautics and Space Administration / NASA NAG 1-613
    Department of the Navy / N00014-91-J-128