1,957 research outputs found
Surviving sensor network software faults
We describe Neutron, a version of the TinyOS operating system that efficiently recovers from memory safety bugs. Where existing schemes reboot an entire node on an error, Neutron’s compiler and runtime extensions divide programs into recovery units and reboot only the faulting unit. The TinyOS kernel itself is a recovery unit: a kernel safety violation appears to applications as the processor being unavailable for 10–20 milliseconds. Neutron further minimizes safety violation cost by supporting “precious ” state that persists across reboots. Application data, time synchronization state, and routing tables can all be declared as pre-cious. Neutron’s reboot sequence conservatively checks that pre-cious state is not the source of a fault before preserving it. Together, recovery units and precious state allow Neutron to reduce a safety violation’s cost to time synchronization by 94 % and to a routing protocol by 99.5%. Neutron also protects applications from losing data. Neutron provides this recovery on the very limited resources of a tiny, low-power microcontroller
Surviving sensor network software faults
ManuscriptWe describe Neutron, a version of the TinyOS operating system that efficiently recovers from memory safety bugs. Where existing schemes reboot an entire node on an error, Neutron's compiler and runtime extensions divide programs into recovery units and reboot only the faulting unit. The TinyOS kernel itself is a recovery unit: a kernel safety violation appears to applications as the processor being unavailable for 10-20 milliseconds. Neutron further minimizes safety violation cost by supporting "precious" state that persists across reboots. Application data, time synchronization state, and routing tables can all be declared as precious. Neutron's reboot sequence conservatively checks that precious state is not the source of a fault before preserving it. Together, recovery units and precious state allow Neutron to reduce a safety violation's cost to time synchronization by 94% and to a routing protocol by 99:5%. Neutron also protects applications from losing data. Neutron provides this recovery on the very limited resources of a tiny, low-power microcontroller
Recommended from our members
Assessing the Risk due to Software Faults: Estimates of Failure Rate versus Evidence of Perfection.
In the debate over the assessment of software reliability (or safety), as applied to critical software, two extreme positions can be discerned: the ‘statistical’ position, which requires that the claims of reliability be supported by statistical inference from realistic testing or operation, and the ‘perfectionist’ position, which requires convincing indications that the software is free from defects. These two positions naturally lead to requiring different kinds of supporting evidence, and actually to stating the dependability requirements in different ways, not allowing any direct comparison. There is often confusion about the relationship between statements about software failure rates and about software correctness, and about which evidence can support either kind of statement. This note clarifies the meaning of the two kinds of statement and how they relate to the probability of failure-free operation, and discusses their practical merits, especially for high required reliability or safety
Results of expert judgments on the faults and risks with Autosub3 and an analysis of its campaign to Pine Island Bay, Antarctica, 2009
Probabilistic risk assessment is a methodology that can be systematically applied to estimate the risk associated with the design and operation of complex systems. The National Oceanography Centre, Southampton, UK has developed a risk management process tailored to the operation of autonomous underwater vehicles. Central to the application of the risk management process is a probabilistic risk assessment. The risk management process was applied to estimate the risk associated with an Autosub3 science campaign in the Pine Island Glacier, Antarctica, and to support decision making. The campaign was successful. In this paper we present the Autosub3 risk model and we show how this model was used to assess the campaign risk
Intelligent fault management for the Space Station active thermal control system
The Thermal Advanced Automation Project (TAAP) approach and architecture is described for automating the Space Station Freedom (SSF) Active Thermal Control System (ATCS). The baseline functionally and advanced automation techniques for Fault Detection, Isolation, and Recovery (FDIR) will be compared and contrasted. Advanced automation techniques such as rule-based systems and model-based reasoning should be utilized to efficiently control, monitor, and diagnose this extremely complex physical system. TAAP is developing advanced FDIR software for use on the SSF thermal control system. The goal of TAAP is to join Knowledge-Based System (KBS) technology, using a combination of rules and model-based reasoning, with conventional monitoring and control software in order to maximize autonomy of the ATCS. TAAP's predecessor was NASA's Thermal Expert System (TEXSYS) project which was the first large real-time expert system to use both extensive rules and model-based reasoning to control and perform FDIR on a large, complex physical system. TEXSYS showed that a method is needed for safely and inexpensively testing all possible faults of the ATCS, particularly those potentially damaging to the hardware, in order to develop a fully capable FDIR system. TAAP therefore includes the development of a high-fidelity simulation of the thermal control system. The simulation provides realistic, dynamic ATCS behavior and fault insertion capability for software testing without hardware related risks or expense. In addition, thermal engineers will gain greater confidence in the KBS FDIR software than was possible prior to this kind of simulation testing. The TAAP KBS will initially be a ground-based extension of the baseline ATCS monitoring and control software and could be migrated on-board as additional computation resources are made available
Exhaustive Search-based Model for Hybrid Sensor Network
A new model for a cluster of hybrid sensors network with multi sub-clusters
is proposed. The model is in particular relevant to the early warning system in
a large scale monitoring system in, for example, a nuclear power plant. It
mainly addresses to a safety critical system which requires real-time processes
with high accuracy. The mathematical model is based on the extended
conventional search algorithm with certain interactions among the nearest
neighborhood of sensors. It is argued that the model could realize a highly
accurate decision support system with less number of parameters. A case of one
dimensional interaction function is discussed, and a simple algorithm for the
model is also given.Comment: 6 pages, Proceeding of the International Conference on Intelligent &
Advanced Systems 2012 pp. 557-56
Advanced flight control system study
The architecture, requirements, and system elements of an ultrareliable, advanced flight control system are described. The basic criteria are functional reliability of 10 to the minus 10 power/hour of flight and only 6 month scheduled maintenance. A distributed system architecture is described, including a multiplexed communication system, reliable bus controller, the use of skewed sensor arrays, and actuator interfaces. Test bed and flight evaluation program are proposed
System data communication structures for active-control transport aircraft, volume 2
The application of communication structures to advanced transport aircraft are addressed. First, a set of avionic functional requirements is established, and a baseline set of avionics equipment is defined that will meet the requirements. Three alternative configurations for this equipment are then identified that represent the evolution toward more dispersed systems. Candidate communication structures are proposed for each system configuration, and these are compared using trade off analyses; these analyses emphasize reliability but also address complexity. Multiplex buses are recognized as the likely near term choice with mesh networks being desirable for advanced, highly dispersed systems
A fault-tolerant multiprocessor architecture for aircraft, volume 1
A fault-tolerant multiprocessor architecture is reported. This architecture, together with a comprehensive information system architecture, has important potential for future aircraft applications. A preliminary definition and assessment of a suitable multiprocessor architecture for such applications is developed
- …