1,423 research outputs found

    Anytime system level verification via parallel random exhaustive hardware in the loop simulation

    Get PDF
    System level verification of cyber-physical systems has the goal of verifying that the whole (i.e., software + hardware) system meets the given specifications. Model checkers for hybrid systems cannot handle system level verification of actual systems. Thus, Hardware In the Loop Simulation (HILS) is currently the main workhorse for system level verification. By using model checking driven exhaustive HILS, System Level Formal Verification (SLFV) can be effectively carried out for actual systems. We present a parallel random exhaustive HILS based model checker for hybrid systems that, by simulating all operational scenarios exactly once in a uniform random order, is able to provide, at any time during the verification process, an upper bound to the probability that the System Under Verification exhibits an error in a yet-to-be-simulated scenario (Omission Probability). We show effectiveness of the proposed approach by presenting experimental results on SLFV of the Inverted Pendulum on a Cart and the Fuel Control System examples in the Simulink distribution. To the best of our knowledge, no previously published model checker can exhaustively verify hybrid systems of such a size and provide at any time an upper bound to the Omission Probability

    Safety assessment of automated vehicle functions by simulation-based fault injection

    Get PDF
    As automated driving vehicles become more sophisticated and pervasive, it is increasingly important to assure its safety even in the presence of faults. This paper presents a simulation-based fault injection approach (Sabotage) aimed at assessing the safety of automated vehicle functions. In particular, we focus on a case study to forecast fault effects during the model-based design of a lateral control function. The goal is to determine the acceptable fault detection interval for permanent faults based on the maximum lateral error and steering saturation. In this work, we performed fault injection simulations to derive the most appropriate safety goals, safety requirements, and fault handling strategies at an early concept phase of an ISO 26262-compliant safety assessment process.The authors have partially received funding from the ECSEL JU AMASS project under H2020 grant agreement No 692474 and from MINETUR (Spain)

    Engineering Resilient Space Systems

    Get PDF
    Several distinct trends will influence space exploration missions in the next decade. Destinations are becoming more remote and mysterious, science questions more sophisticated, and, as mission experience accumulates, the most accessible targets are visited, advancing the knowledge frontier to more difficult, harsh, and inaccessible environments. This leads to new challenges including: hazardous conditions that limit mission lifetime, such as high radiation levels surrounding interesting destinations like Europa or toxic atmospheres of planetary bodies like Venus; unconstrained environments with navigation hazards, such as free-floating active small bodies; multielement missions required to answer more sophisticated questions, such as Mars Sample Return (MSR); and long-range missions, such as Kuiper belt exploration, that must survive equipment failures over the span of decades. These missions will need to be successful without a priori knowledge of the most efficient data collection techniques for optimum science return. Science objectives will have to be revised ‘on the fly’, with new data collection and navigation decisions on short timescales. Yet, even as science objectives are becoming more ambitious, several critical resources remain unchanged. Since physics imposes insurmountable light-time delays, anticipated improvements to the Deep Space Network (DSN) will only marginally improve the bandwidth and communications cadence to remote spacecraft. Fiscal resources are increasingly limited, resulting in fewer flagship missions, smaller spacecraft, and less subsystem redundancy. As missions visit more distant and formidable locations, the job of the operations team becomes more challenging, seemingly inconsistent with the trend of shrinking mission budgets for operations support. How can we continue to explore challenging new locations without increasing risk or system complexity? These challenges are present, to some degree, for the entire Decadal Survey mission portfolio, as documented in Vision and Voyages for Planetary Science in the Decade 2013–2022 (National Research Council, 2011), but are especially acute for the following mission examples, identified in our recently completed KISS Engineering Resilient Space Systems (ERSS) study: 1. A Venus lander, designed to sample the atmosphere and surface of Venus, would have to perform science operations as components and subsystems degrade and fail; 2. A Trojan asteroid tour spacecraft would spend significant time cruising to its ultimate destination (essentially hibernating to save on operations costs), then upon arrival, would have to act as its own surveyor, finding new objects and targets of opportunity as it approaches each asteroid, requiring response on short notice; and 3. A MSR campaign would not only be required to perform fast reconnaissance over long distances on the surface of Mars, interact with an unknown physical surface, and handle degradations and faults, but would also contain multiple components (launch vehicle, cruise stage, entry and landing vehicle, surface rover, ascent vehicle, orbiting cache, and Earth return vehicle) that dramatically increase the need for resilience to failure across the complex system. The concept of resilience and its relevance and application in various domains was a focus during the study, with several definitions of resilience proposed and discussed. While there was substantial variation in the specifics, there was a common conceptual core that emerged—adaptation in the presence of changing circumstances. These changes were couched in various ways—anomalies, disruptions, discoveries—but they all ultimately had to do with changes in underlying assumptions. Invalid assumptions, whether due to unexpected changes in the environment, or an inadequate understanding of interactions within the system, may cause unexpected or unintended system behavior. A system is resilient if it continues to perform the intended functions in the presence of invalid assumptions. Our study focused on areas of resilience that we felt needed additional exploration and integration, namely system and software architectures and capabilities, and autonomy technologies. (While also an important consideration, resilience in hardware is being addressed in multiple other venues, including 2 other KISS studies.) The study consisted of two workshops, separated by a seven-month focused study period. The first workshop (Workshop #1) explored the ‘problem space’ as an organizing theme, and the second workshop (Workshop #2) explored the ‘solution space’. In each workshop, focused discussions and exercises were interspersed with presentations from participants and invited speakers. The study period between the two workshops was organized as part of the synthesis activity during the first workshop. The study participants, after spending the initial days of the first workshop discussing the nature of resilience and its impact on future science missions, decided to split into three focus groups, each with a particular thrust, to explore specific ideas further and develop material needed for the second workshop. The three focus groups and areas of exploration were: 1. Reference missions: address/refine the resilience needs by exploring a set of reference missions 2. Capability survey: collect, document, and assess current efforts to develop capabilities and technology that could be used to address the documented needs, both inside and outside NASA 3. Architecture: analyze the impact of architecture on system resilience, and provide principles and guidance for architecting greater resilience in our future systems The key product of the second workshop was a set of capability roadmaps pertaining to the three reference missions selected for their representative coverage of the types of space missions envisioned for the future. From these three roadmaps, we have extracted several common capability patterns that would be appropriate targets for near-term technical development: one focused on graceful degradation of system functionality, a second focused on data understanding for science and engineering applications, and a third focused on hazard avoidance and environmental uncertainty. Continuing work is extending these roadmaps to identify candidate enablers of the capabilities from the following three categories: architecture solutions, technology solutions, and process solutions. The KISS study allowed a collection of diverse and engaged engineers, researchers, and scientists to think deeply about the theory, approaches, and technical issues involved in developing and applying resilience capabilities. The conclusions summarize the varied and disparate discussions that occurred during the study, and include new insights about the nature of the challenge and potential solutions: 1. There is a clear and definitive need for more resilient space systems. During our study period, the key scientists/engineers we engaged to understand potential future missions confirmed the scientific and risk reduction value of greater resilience in the systems used to perform these missions. 2. Resilience can be quantified in measurable terms—project cost, mission risk, and quality of science return. In order to consider resilience properly in the set of engineering trades performed during the design, integration, and operation of space systems, the benefits and costs of resilience need to be quantified. We believe, based on the work done during the study, that appropriate metrics to measure resilience must relate to risk, cost, and science quality/opportunity. Additional work is required to explicitly tie design decisions to these first-order concerns. 3. There are many existing basic technologies that can be applied to engineering resilient space systems. Through the discussions during the study, we found many varied approaches and research that address the various facets of resilience, some within NASA, and many more beyond. Examples from civil architecture, Department of Defense (DoD) / Defense Advanced Research Projects Agency (DARPA) initiatives, ‘smart’ power grid control, cyber-physical systems, software architecture, and application of formal verification methods for software were identified and discussed. The variety and scope of related efforts is encouraging and presents many opportunities for collaboration and development, and we expect many collaborative proposals and joint research as a result of the study. 4. Use of principled architectural approaches is key to managing complexity and integrating disparate technologies. The main challenge inherent in considering highly resilient space systems is that the increase in capability can result in an increase in complexity with all of the 3 risks and costs associated with more complex systems. What is needed is a better way of conceiving space systems that enables incorporation of capabilities without increasing complexity. We believe principled architecting approaches provide the needed means to convey a unified understanding of the system to primary stakeholders, thereby controlling complexity in the conception and development of resilient systems, and enabling the integration of disparate approaches and technologies. A representative architectural example is included in Appendix F. 5. Developing trusted resilience capabilities will require a diverse yet strategically directed research program. Despite the interest in, and benefits of, deploying resilience space systems, to date, there has been a notable lack of meaningful demonstrated progress in systems capable of working in hazardous uncertain situations. The roadmaps completed during the study, and documented in this report, provide the basis for a real funded plan that considers the required fundamental work and evolution of needed capabilities. Exploring space is a challenging and difficult endeavor. Future space missions will require more resilience in order to perform the desired science in new environments under constraints of development and operations cost, acceptable risk, and communications delays. Development of space systems with resilient capabilities has the potential to expand the limits of possibility, revolutionizing space science by enabling as yet unforeseen missions and breakthrough science observations. Our KISS study provided an essential venue for the consideration of these challenges and goals. Additional work and future steps are needed to realize the potential of resilient systems—this study provided the necessary catalyst to begin this process

    Complex Patterns of Failure:Fault Tolerance via Complex Event Processing for IoT Systems

    Get PDF
    Fault-tolerance (FT) support is a key challenge for ensuring dependable Internet of Things (IoT) systems. Many existing FT-support mechanisms for IoT are static, tightly coupled, and inflexible, and so they struggle to provide effective support for dynamic IoT environments. This paper proposes Complex Patterns of Failure (CPoF), an approach to providing FT support for IoT systems using Complex Event Processing (CEP) that promotes modularity and reusability in FT-support design. System defects are defined using our Vulnerabilities, Faults, and Failures (VFF) framework, and error-detection strategies are defined as nondeterministic finite automata (NFA) implemented via CEP systems. We evaluated CPoF on an automated agriculture system and demonstrated its effectiveness against three types of error-detection checks: reasonableness, timing, and reversal. Using CPoF, we identified unreasonable environmental conditions and performance degradation via sensor data analysis

    Model-based dependability analysis : state-of-the-art, challenges and future outlook

    Get PDF
    Abstract: Over the past two decades, the study of model-based dependability analysis has gathered significant research interest. Different approaches have been developed to automate and address various limitations of classical dependability techniques to contend with the increasing complexity and challenges of modern safety-critical system. Two leading paradigms have emerged, one which constructs predictive system failure models from component failure models compositionally using the topology of the system. The other utilizes design models - typically state automata - to explore system behaviour through fault injection. This paper reviews a number of prominent techniques under these two paradigms, and provides an insight into their working mechanism, applicability, strengths and challenges, as well as recent developments within these fields. We also discuss the emerging trends on integrated approaches and advanced analysis capabilities. Lastly, we outline the future outlook for model-based dependability analysis

    State Machine Replication Is More Expensive Than Consensus

    Get PDF
    Consensus and State Machine Replication (SMR) are generally considered to be equivalent problems. In certain system models, indeed, the two problems are computationally equivalent: any solution to the former problem leads to a solution to the latter, and vice versa. In this paper, we study the relation between consensus and SMR from a complexity perspective. We find that, surprisingly, completing an SMR command can be more expensive than solving a consensus instance. Specifically, given a synchronous system model where every instance of consensus always terminates in constant time, completing an SMR command does not necessarily terminate in constant time. This result naturally extends to partially synchronous models. Besides theoretical interest, our result also corresponds to practical phenomena we identify empirically. We experiment with two well-known SMR implementations (Multi-Paxos and Raft) and show that, indeed, SMR is more expensive than consensus in practice. One important implication of our result is that - even under synchrony conditions - no SMR algorithm can ensure bounded response times

    NASA's Human Rating Requirements - A Historical Interpretive Perspective

    Get PDF
    Section 3.0 of NASA's Human Rating Requirements for Space Systems, NPR 8705.2, represents technical engineering requirements that the Agenc y requires of Human Space Systems. In many cases the requirements are not unlike requirements for any space system, crewed or uncrewed, th ey deal with successfully accomplishing the mission objectives. Howev er, they go one step further and have requirements that go beyond suc cessful completion of the mission and dictate functions or actions ne cessary to assure the survival of the crew. In that regard they are u nique from other space system requirements. Even with their uniquenes s the technical requirements of the NPR 8705.2 have been relatively u nchanged in overall intent over the revisions. They all have provided for system redundancy, crew habitable environment, crew situational awareness, crew operation, system control, emergency egress and abort systems. In a few cases the intent of the requirement was changed in tentionally, either to restrict certain types of systems or their fun ctions, or to encompass lessons learned from previous programs. For t he most part the requirements are non controversial and represent the current best practices for human space systems, however, a few requi rements are always debated and have evolved over revisions of the NPR due to studies conducted with various programs like the Orbital Spac e Plane and the Constellation Programs. Those requirements will be di scussed using results of trade studies conducted during past programs highlighting how these particular requirements have evolved through the revisions of the NPR. Comments will also be provided for requirem ents that although not debated, have provided challenges in interpret ation

    Making intelligent systems team players: Overview for designers

    Get PDF
    This report is a guide and companion to the NASA Technical Memorandum 104738, 'Making Intelligent Systems Team Players,' Volumes 1 and 2. The first two volumes of this Technical Memorandum provide comprehensive guidance to designers of intelligent systems for real-time fault management of space systems, with the objective of achieving more effective human interaction. This report provides an analysis of the material discussed in the Technical Memorandum. It clarifies what it means for an intelligent system to be a team player, and how such systems are designed. It identifies significant intelligent system design problems and their impacts on reliability and usability. Where common design practice is not effective in solving these problems, we make recommendations for these situations. In this report, we summarize the main points in the Technical Memorandum and identify where to look for further information
    • …
    corecore