Surviving sensor network software faults

Author: John Regehr
Maria Kaz
Omprakash Gnawali
Philip Levis
Yang Chen
Publication venue: 'Association for Computing Machinery (ACM)'
Publication date: 01/01/2009

We describe Neutron, a version of the TinyOS operating system that efficiently recovers from memory safety bugs. Where existing schemes reboot an entire node on an error, Neutron’s compiler and runtime extensions divide programs into recovery units and reboot only the faulting unit. The TinyOS kernel itself is a recovery unit: a kernel safety violation appears to applications as the processor being unavailable for 10–20 milliseconds. Neutron further minimizes safety violation cost by supporting “precious” state that persists across reboots. Application data, time synchronization state, and routing tables can all be declared as precious. Neutron’s reboot sequence conservatively checks that precious state is not the source of a fault before preserving it. Together, recovery units and precious state allow Neutron to reduce a safety violation’s cost to time synchronization by 94% and to a routing protocol by 99.5%. Neutron also protects applications from losing data. Neutron provides this recovery on the very limited resources of a tiny, low-power microcontroller.

CiteSeerX

Crossref

Surviving sensor network software faults

Author: Chen Yang
Regehr John
Publication venue: 'Association for Computing Machinery (ACM)'
Publication date: 01/01/2009

We describe Neutron, a version of the TinyOS operating system that efficiently recovers from memory safety bugs. Where existing schemes reboot an entire node on an error, Neutron's compiler and runtime extensions divide programs into recovery units and reboot only the faulting unit. The TinyOS kernel itself is a recovery unit: a kernel safety violation appears to applications as the processor being unavailable for 10-20 milliseconds. Neutron further minimizes safety violation cost by supporting "precious" state that persists across reboots. Application data, time synchronization state, and routing tables can all be declared as precious. Neutron's reboot sequence conservatively checks that precious state is not the source of a fault before preserving it. Together, recovery units and precious state allow Neutron to reduce a safety violation's cost to time synchronization by 94% and to a routing protocol by 99.5%. Neutron also protects applications from losing data. Neutron provides this recovery on the very limited resources of a tiny, low-power microcontroller.

The University of Utah: J. Willard Marriott Digital Library

Assessing the Risk due to Software Faults: Estimates of Failure Rate versus Evidence of Perfection.

Author: Bertolino A.
Strigini L.
Publication venue: 'Wiley'
Publication date: 10/09/2002

In the debate over the assessment of software reliability (or safety), as applied to critical software, two extreme positions can be discerned: the ‘statistical’ position, which requires that the claims of reliability be supported by statistical inference from realistic testing or operation, and the ‘perfectionist’ position, which requires convincing indications that the software is free from defects. These two positions naturally lead to requiring different kinds of supporting evidence, and actually to stating the dependability requirements in different ways, not allowing any direct comparison. There is often confusion about the relationship between statements about software failure rates and about software correctness, and about which evidence can support either kind of statement. This note clarifies the meaning of the two kinds of statement and how they relate to the probability of failure-free operation, and discusses their practical merits, especially for high required reliability or safety.

City Research Online

Results of expert judgments on the faults and risks with Autosub3 and an analysis of its campaign to Pine Island Bay, Antarctica, 2009

Author: Brito Mario P.
Griffiths Gwyn
Publication venue: Autonomous Undersea Systems Institute
Publication date: 01/01/2009

Probabilistic risk assessment is a methodology that can be systematically applied to estimate the risk associated with the design and operation of complex systems. The National Oceanography Centre, Southampton, UK has developed a risk management process tailored to the operation of autonomous underwater vehicles. Central to the application of the risk management process is a probabilistic risk assessment. The risk management process was applied to estimate the risk associated with an Autosub3 science campaign in the Pine Island Glacier, Antarctica, and to support decision making. The campaign was successful. In this paper we present the Autosub3 risk model and we show how this model was used to assess the campaign risk.

Southampton (e-Prints Soton)

NERC Open Research Archive

Intelligent fault management for the Space Station active thermal control system

Author: Faltisco Robert M.
Hill Tim

The Thermal Advanced Automation Project (TAAP) approach and architecture are described for automating the Space Station Freedom (SSF) Active Thermal Control System (ATCS). The baseline functionality and advanced automation techniques for Fault Detection, Isolation, and Recovery (FDIR) will be compared and contrasted. Advanced automation techniques such as rule-based systems and model-based reasoning should be utilized to efficiently control, monitor, and diagnose this extremely complex physical system. TAAP is developing advanced FDIR software for use on the SSF thermal control system. The goal of TAAP is to join Knowledge-Based System (KBS) technology, using a combination of rules and model-based reasoning, with conventional monitoring and control software in order to maximize autonomy of the ATCS. TAAP's predecessor was NASA's Thermal Expert System (TEXSYS) project, which was the first large real-time expert system to use both extensive rules and model-based reasoning to control and perform FDIR on a large, complex physical system. TEXSYS showed that a method is needed for safely and inexpensively testing all possible faults of the ATCS, particularly those potentially damaging to the hardware, in order to develop a fully capable FDIR system. TAAP therefore includes the development of a high-fidelity simulation of the thermal control system. The simulation provides realistic, dynamic ATCS behavior and fault insertion capability for software testing without hardware-related risks or expense. In addition, thermal engineers will gain greater confidence in the KBS FDIR software than was possible prior to this kind of simulation testing. The TAAP KBS will initially be a ground-based extension of the baseline ATCS monitoring and control software and could be migrated on-board as additional computation resources are made available.

NASA Technical Reports Server

Exhaustive Search-based Model for Hybrid Sensor Network

Author: Akbar Z.
Handoko L. T.
Suhartanto H.
Waskita A. A.
Publication venue: 'Institute of Electrical and Electronics Engineers (IEEE)'
Publication date: 30/09/2012

A new model for a hybrid sensor network cluster with multiple sub-clusters is proposed. The model is particularly relevant to early warning systems in large-scale monitoring systems in, for example, a nuclear power plant. It mainly addresses safety-critical systems that require real-time processes with high accuracy. The mathematical model is based on the extended conventional search algorithm with certain interactions among the nearest neighborhood of sensors. It is argued that the model could realize a highly accurate decision support system with fewer parameters. A case of a one-dimensional interaction function is discussed, and a simple algorithm for the model is also given. Comment: 6 pages, Proceeding of the International Conference on Intelligent & Advanced Systems 2012 pp. 557-56

arXiv.org e-Print Archive

Crossref

Advanced flight control system study

Author: Klafin J. F.
Mcgough J.
Moses K.

The architecture, requirements, and system elements of an ultrareliable, advanced flight control system are described. The basic criteria are functional reliability of 10^-10 per hour of flight and only 6 month scheduled maintenance. A distributed system architecture is described, including a multiplexed communication system, reliable bus controller, the use of skewed sensor arrays, and actuator interfaces. A test bed and flight evaluation program are proposed.

NASA Technical Reports Server

System data communication structures for active-control transport aircraft, volume 2

Author: Brock L. D.
Hanley L. D.
Hopkins A. L.
Jansson D. G.
Martin J. H.
Serben S.
Smith T. B.

The application of communication structures to advanced transport aircraft is addressed. First, a set of avionic functional requirements is established, and a baseline set of avionics equipment is defined that will meet the requirements. Three alternative configurations for this equipment are then identified that represent the evolution toward more dispersed systems. Candidate communication structures are proposed for each system configuration, and these are compared using trade-off analyses; these analyses emphasize reliability but also address complexity. Multiplex buses are recognized as the likely near-term choice, with mesh networks being desirable for advanced, highly dispersed systems.

NASA Technical Reports Server

A fault-tolerant multiprocessor architecture for aircraft, volume 1

Author: Ausrotas R. A.
Hanley L. D.
Hopkins A. L.
Lala J. H.
Martin J. H.
Smith T. B.
Taylor W.

A fault-tolerant multiprocessor architecture is reported. This architecture, together with a comprehensive information system architecture, has important potential for future aircraft applications. A preliminary definition and assessment of a suitable multiprocessor architecture for such applications are developed.

NASA Technical Reports Server