5 research outputs found

    Verification and Validation of System Health Management Models using Parametric Testing

    System Health Management (SHM) systems have found their way into many safety-critical aerospace and industrial applications. An SHM system processes readings from sensors throughout the system and uses a Health Management (HM) model to detect and identify potential faults (diagnosis) and to predict possible failures in the near future (prognosis). An SHM system that monitors a safety-critical component must be at least as reliable and safe as the component itself: false alarms or missed adverse events can result in catastrophic failures. The SHM system, including the HM model, which is a piece of software, must therefore undergo rigorous Verification and Validation (V&V). In this paper, we describe an advanced technique for the analysis and V&V of Health Management models. Although our technique is generally applicable, in this paper we investigate HM models in the form of Bayesian networks (BNs). BNs are a powerful modeling paradigm for expressing notions of cause and effect, probability, and reliability. A BN model typically contains many parameters (e.g., thresholds for discretization and conditional probability tables), which need to be set carefully for reliable and accurate HM reasoning. We are investigating the use of Parametric Testing (PT), which uses a combination of n-factor and Monte Carlo methods, to exercise our HM model with variations of perturbed parameters. Multivariate clustering of the analysis results is used to automatically find structure in the data set and to support visualization. Our approach can yield valuable insights regarding the sensitivity of parameters and helps to detect safety margins and boundaries. As a case study, we use HM models from the NASA Advanced Diagnostics and Prognostics Testbed (ADAPT), which is a realistic hardware setup for a distributed power system as found in spacecraft or aircraft.
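The Monte Carlo half of the parametric testing idea can be illustrated on a toy model. The sketch below is not the authors' tool: it assumes a hypothetical two-node network (Fault -> Alarm) with made-up probabilities, perturbs each parameter by a random relative amount, and records how the posterior P(fault | alarm) moves. The n-factor combinatorial sampling and the clustering step mentioned in the abstract are not covered.

```python
import random


def posterior_fault(prior, p_alarm_given_fault, p_alarm_given_ok):
    """P(fault | alarm) for a two-node network Fault -> Alarm, via Bayes' rule."""
    joint_fault = prior * p_alarm_given_fault
    joint_ok = (1.0 - prior) * p_alarm_given_ok
    return joint_fault / (joint_fault + joint_ok)


def clamp(p, lo=1e-6, hi=1.0 - 1e-6):
    """Keep a perturbed value inside the open interval (0, 1)."""
    return max(lo, min(hi, p))


def monte_carlo_sensitivity(nominal, rel_spread=0.2, trials=1000, seed=0):
    """Perturb every parameter uniformly within +/- rel_spread and re-run inference."""
    rng = random.Random(seed)
    results = []
    for _ in range(trials):
        perturbed = {
            name: clamp(value * (1.0 + rng.uniform(-rel_spread, rel_spread)))
            for name, value in nominal.items()
        }
        results.append((perturbed, posterior_fault(**perturbed)))
    return results


# Hypothetical nominal parameters, chosen only for illustration.
nominal = {"prior": 0.01, "p_alarm_given_fault": 0.95, "p_alarm_given_ok": 0.02}
baseline = posterior_fault(**nominal)
runs = monte_carlo_sensitivity(nominal)
posteriors = [p for _, p in runs]
print(f"baseline={baseline:.3f} min={min(posteriors):.3f} max={max(posteriors):.3f}")
```

The spread between the minimum and maximum posterior gives a rough picture of how sensitive the diagnosis is to the chosen parameters; a real study would examine which perturbed parameters drive the extremes.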

    Software and System Health Management for Autonomous Robotics Missions

    Advanced autonomous robotics space missions rely heavily on the flawless interaction of complex hardware, multiple sensors, and a mission-critical software system. This software system consists of an operating system, device drivers, controllers, and executives; recently, highly complex AI-based autonomy software has also been introduced. Prior to launch, this software has to undergo rigorous verification and validation (V&V). Nevertheless, dormant software bugs, failing sensors, unexpected hardware-software interactions, and unanticipated environmental conditions (likely on a space exploration mission) can cause major software faults that can endanger the entire mission. Our Integrated Software Health Management (ISWHM) system continuously monitors the hardware sensors and the software in real time. The ISWHM uses Bayesian networks, compiled to arithmetic circuits, to model software and hardware interactions. Advanced reasoning algorithms using arithmetic circuits not only enable the ISWHM to handle the large, hierarchical models that are necessary in the realm of complex autonomous systems, but also enable efficient execution on small embedded processors. The latter capability is of extreme importance for small (mobile) autonomous units with limited computational power and low telemetry bandwidth. In this paper, we discuss the requirements of ISWHM. As our initial demonstration platform, we use a primitive Lego rover. A Lego Mindstorms microcontroller is used to implement a highly simplified autonomous rover driving system, running on the OSEK real-time operating system. We demonstrate that our ISWHM, running on this small embedded microcontroller, can perform fault detection as well as on-board reasoning for advanced diagnosis and root-cause detection in real time.
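To make the phrase "Bayesian networks, compiled to arithmetic circuits" concrete, here is a minimal hand-built sketch. It assumes a hypothetical two-node network (F -> S) with invented probabilities and encodes its network polynomial as a tree of sum and product nodes over parameters and evidence indicators. Real systems generate such circuits with a compiler from much larger networks; the node encoding and names below are illustrative assumptions only.

```python
from math import prod

# Circuit nodes (an assumed encoding for this sketch):
#   ("param", value)   a network parameter (CPT entry)
#   ("ind", name)      an evidence indicator, 1 if compatible with evidence else 0
#   ("add", children)  sum node
#   ("mul", children)  product node


def evaluate(node, indicators):
    """Evaluate the circuit bottom-up for a given setting of the indicators."""
    kind = node[0]
    if kind == "param":
        return node[1]
    if kind == "ind":
        return indicators[node[1]]
    values = [evaluate(child, indicators) for child in node[1]]
    return sum(values) if kind == "add" else prod(values)


def sensor_branch(p_high):
    """Sum over the sensor states for one fixed fault state."""
    return ("add", [
        ("mul", [("ind", "S=high"), ("param", p_high)]),
        ("mul", [("ind", "S=low"), ("param", 1.0 - p_high)]),
    ])


# Hypothetical numbers: P(F=yes)=0.1, P(S=high|F=yes)=0.9, P(S=high|F=no)=0.2.
circuit = ("add", [
    ("mul", [("ind", "F=yes"), ("param", 0.1), sensor_branch(0.9)]),
    ("mul", [("ind", "F=no"), ("param", 0.9), sensor_branch(0.2)]),
])

# Evidence S=high: zero out the indicator of the contradicted state.
evidence = {"F=yes": 1, "F=no": 1, "S=high": 1, "S=low": 0}
p_evidence = evaluate(circuit, evidence)

# Joint P(F=yes, S=high): additionally zero out F=no.
p_joint = evaluate(circuit, {**evidence, "F=no": 0})
posterior = p_joint / p_evidence
print(p_evidence, posterior)
```

Evaluation uses only additions and multiplications in a fixed order, which is one reason this representation suits small embedded processors.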

    Towards Software Health Management with Bayesian Networks

    More and more systems (e.g., aircraft, machinery, cars) rely heavily on software that performs safety-critical operations. Assuring software safety through traditional V&V has become a tremendous, if not impossible, task, given the growing size and complexity of the software. We propose that iSWHM (Integrated SoftWare Health Management) can increase the safety and reliability of high-assurance software systems. iSWHM uses advanced techniques from the area of system health management to continuously monitor the behavior of the software during operation, quickly detect anomalies, and perform automatic and reliable root-cause analysis, while not replacing traditional V&V. Information provided by the iSWHM system can be used for automatic mitigation mechanisms (e.g., recovery, dynamic reconfiguration) or presented to a human operator. iSWHM’s prognostic capabilities will further improve reliability and availability, as it provides information about soon-to-occur failures or looming performance bottlenecks. In this paper, we discuss challenges and future potential and describe how Bayesian networks (BNs) could be used for iSWHM modeling.

    Towards Real-time, On-board, Hardware-supported Sensor and Software Health Management for Unmanned Aerial Systems

    Unmanned aerial systems (UASs) can only be deployed if they can effectively complete their missions and respond to failures and uncertain environmental conditions while maintaining safety with respect to other aircraft as well as humans and property on the ground. In this paper, we design a real-time, on-board system health management (SHM) capability to continuously monitor sensors, software, and hardware components for detection and diagnosis of failures and violations of safety or performance rules during the flight of a UAS. Our approach to SHM is three-pronged, providing: (1) real-time monitoring of sensor and/or software signals; (2) signal analysis, preprocessing, and advanced on-the-fly temporal and Bayesian probabilistic fault diagnosis; (3) an unobtrusive, lightweight, read-only, low-power realization using Field Programmable Gate Arrays (FPGAs) that avoids overburdening limited computing resources or costly re-certification of flight software due to instrumentation. Our implementation provides a novel approach that combines modular building blocks, integrating responsive run-time monitoring of temporal logic system safety requirements with model-based diagnosis and Bayesian network-based probabilistic analysis. We demonstrate this approach using actual data from the NASA Swift UAS, an experimental all-electric aircraft.
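Run-time monitoring of a temporal logic safety requirement can be sketched in software, although the abstract describes an FPGA realization. The example below is an assumption-laden illustration, not the authors' design: it checks a bounded-response property of the form G(trigger -> F[0, deadline] response) over a sampled trace, where `trigger` and `response` are hypothetical Boolean signals (for instance, "climb commanded" and "altitude increasing").

```python
def monitor_bounded_response(trace, deadline):
    """Check G(trigger -> F[0, deadline] response) over a finite sampled trace.

    trace is an iterable of (trigger, response) Boolean pairs, one per sample.
    Returns the index of the earliest trigger whose deadline expired without a
    response, or None if no violation was observed on this trace.
    """
    pending = None  # index of the oldest trigger still waiting for a response
    for i, (trigger, response) in enumerate(trace):
        if response:
            pending = None  # a response discharges every outstanding trigger
        elif trigger and pending is None:
            pending = i
        if pending is not None and i - pending >= deadline:
            return pending
    return None


# Hypothetical traces: the response must arrive within 2 samples of the trigger.
ok_trace = [(True, False), (False, False), (False, True), (False, False)]
bad_trace = [(False, False), (True, False), (False, False), (False, False), (False, True)]

print(monitor_bounded_response(ok_trace, deadline=2))   # no violation observed
print(monitor_bounded_response(bad_trace, deadline=2))  # index of the violated trigger
```

The monitor keeps a single integer of state and inspects each sample once, so it only reads the signals and never writes to them. A trigger still pending when the trace ends yields no verdict.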

    Software Health Management with Bayesian Networks

    Software Health Management (SWHM) is an emerging field that addresses the critical need to detect, diagnose, predict, and mitigate adverse events due to software faults and failures. These faults can arise for numerous reasons, including coding errors, unanticipated faults or failures in hardware, or problematic interactions with the external environment. This paper demonstrates a novel approach to software health management based on a rigorous Bayesian formulation that monitors the behavior of the software and operating system, performs probabilistic diagnosis, and provides information about the most likely root causes of a failure or software problem. Translation of the Bayesian network model into an efficient data structure, an arithmetic circuit, makes it possible to perform SWHM on resource-restricted embedded computing platforms as found in aircraft, unmanned aircraft, or satellites. SWHM is especially important for safety-critical systems such as aircraft control systems. In this paper, we demonstrate our Bayesian SWHM system on three realistic scenarios from an aircraft control system: (1) aircraft file-system-based faults, (2) signal handling faults, and (3) navigation faults due to IMU (inertial measurement unit) failure or compromised GPS (Global Positioning System) integrity. We show that the method successfully detects and diagnoses faults in these scenarios. We also discuss the importance of verification and validation of SWHM systems.