    Fault Injection Analytics: A Novel Approach to Discover Failure Modes in Cloud-Computing Systems

    Cloud computing systems fail in complex and unexpected ways, owing to unforeseen combinations of events and interactions between hardware and software components. Fault injection is an effective means of bringing out these failures in a controlled environment. However, fault injection experiments produce massive amounts of data, and analyzing these data manually is inefficient and error-prone: the analyst can miss severe failure modes that are still unknown. This paper introduces a new paradigm, fault injection analytics, which applies unsupervised machine learning to execution traces of the injected system to ease the discovery and interpretation of failure modes. We evaluated the proposed approach in the context of fault injection experiments on the OpenStack cloud computing platform, where we show that the approach can accurately identify failure modes at a low computational cost. (IEEE Transactions on Dependable and Secure Computing; 16 pages.)
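
    The abstract does not spell out the machine-learning pipeline here; purely as a hypothetical sketch of the general idea, the snippet below vectorizes execution traces into event-frequency features and clusters them so each cluster can be inspected as a candidate failure mode. The trace strings, feature choice, and clustering parameters are all illustrative assumptions, not the authors' published method.

```python
# Cluster vectorized execution traces to surface candidate failure modes
# (a hypothetical sketch, not the paper's published pipeline).
from sklearn.cluster import DBSCAN
from sklearn.feature_extraction.text import CountVectorizer

traces = [
    "api_error timeout volume_detach",  # made-up per-experiment event logs
    "api_error timeout volume_detach",
    "instance_crash restart",
]

# Turn each trace into a vector of event frequencies.
X = CountVectorizer().fit_transform(traces)

# Density-based clustering groups recurring failures; the label -1 marks
# outliers, often the rare, previously unknown failure modes worth inspecting.
labels = DBSCAN(eps=0.5, min_samples=2).fit_predict(X.toarray())
print(labels)  # e.g. [ 0  0 -1]
```

    A density-based method is convenient for this purpose because points it labels as noise single out rare traces, exactly the cases a manual analyst is most likely to miss.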

    UML-Based Modeling of Robustness Testing

    The aim of robustness testing is to characterize the behavior of a system in the presence of erroneous or stressful input conditions. It is a well-established approach in the dependability community, which has a long tradition of testing based on fault injection. However, a recurring problem is the insufficient documentation of experiments, which may prevent their replication. Our work investigates whether UML-based documentation could be used. It proposes an extension of the UML Testing Profile that accounts for the specificities of robustness testing experiments. The extension also reuses some elements of the QoSFT profile targeting measurements. Its ability to model realistic experiments is demonstrated on a case study from dependability research.

    Distributed simulation of power systems using real time digital simulator

    The simulation of power system behavior, especially transient behavior, supports the analysis and planning of power systems. However, power systems are usually highly complex and geographically distributed, so system partitioning can be used to share resources in simulation. In this work, distributed simulations of power system models have been developed using an electromagnetic transient simulator, the Real Time Digital Simulator (RTDS). The goal is to demonstrate and assess the feasibility of both non-real-time and real-time simulations using the RTDS in a geographically distributed scenario. A test bed has been developed for data transfer between a power system simulated in RTDS at Mississippi State University and another simulated in RTDS at Texas A&M University, and the different protocols available for interfacing and communication in the RTDS have been studied and applied. Finally, a locally distributed wide-area control test bed was developed and simulated.
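
    RTDS ships with its own interface hardware and protocols, which the thesis studies; none of that is reproduced here. Purely as a hedged illustration of the kind of periodic boundary-data exchange a geographically distributed co-simulation needs, the sketch below trades two floating-point boundary values over UDP. The peer address, port, packet layout, and time step are made-up assumptions.

```python
# Periodic boundary-value exchange between two co-simulation sites over UDP
# (a generic illustration only; not the RTDS/GTNET protocol).
import socket
import struct
import time

PEER = ("192.0.2.10", 5005)  # hypothetical remote site (TEST-NET address)
TIME_STEP = 0.05             # 50 ms exchange interval, an assumed value

sock = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
sock.bind(("0.0.0.0", 5005))
sock.settimeout(TIME_STEP)

def exchange(voltage: float, angle: float):
    """Send local boundary-bus values; return the peer's pair if one arrives."""
    sock.sendto(struct.pack("!dd", voltage, angle), PEER)
    try:
        payload, _ = sock.recvfrom(16)
        return struct.unpack("!dd", payload)
    except socket.timeout:
        return None  # UDP is lossy; a real co-simulation must handle missed steps

for step in range(3):
    peer_values = exchange(1.02, -12.5)  # made-up local measurements per step
    time.sleep(TIME_STEP)
```

    The choice between a lossy datagram exchange like this and a reliable stream is one of the protocol trade-offs such a test bed has to evaluate, since retransmission delays can break real-time deadlines.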

    Process monitoring and visualization solutions for hot-melt extrusion : a review

    Objectives: Hot-melt extrusion (HME) is applied as a continuous pharmaceutical manufacturing process for the production of a variety of dosage forms and formulations. To ensure the continuity of this process, the quality of the extrudates must be assessed continuously during manufacturing. The objective of this review is to provide an overview and evaluation of the available process analytical techniques that can be applied in hot-melt extrusion. Key Findings: Pharmaceutical extruders are equipped with traditional (univariate) process monitoring tools, observing barrel and die temperatures, throughput, screw speed, torque, drive amperage, melt pressure and melt temperature. The relevance of several spectroscopic process analytical techniques for monitoring and control of pharmaceutical HME has been explored recently. Nevertheless, many other sensors that visualize HME and measure diverse critical product and process parameters, with potential use in pharmaceutical extrusion, are available and have been thoroughly studied in polymer extrusion. The implementation of process analytical tools in HME serves two purposes: (1) improving process understanding by monitoring and visualizing the material behaviour, and (2) monitoring and analysing critical product and process parameters for process control, making it possible to maintain a desired process state and guarantee the quality of the end product. Summary: This review is the first to provide an evaluation of the process analytical tools applied for pharmaceutical HME monitoring and control, and discusses techniques used in polymer extrusion that have potential for monitoring and control of pharmaceutical HME.
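
    As a small illustration of the traditional univariate monitoring the review describes, the sketch below applies Shewhart-style 3-sigma control limits to one critical process parameter (melt pressure). The baseline readings and limits are invented numbers, not data from the review.

```python
# Shewhart-style univariate alarm on a single process parameter (made-up data).
from statistics import mean, stdev

baseline = [52.1, 51.8, 52.4, 52.0, 51.9, 52.2]  # in-control melt pressure, bar
mu, sigma = mean(baseline), stdev(baseline)
ucl, lcl = mu + 3 * sigma, mu - 3 * sigma        # 3-sigma control limits

def in_control(reading: float) -> bool:
    """Return False when a reading breaches the control limits."""
    return lcl <= reading <= ucl

for pressure in [52.3, 52.0, 55.7]:              # hypothetical in-line readings
    if not in_control(pressure):
        print(f"ALARM: melt pressure {pressure} bar outside "
              f"[{lcl:.2f}, {ucl:.2f}] bar")
```

    Each univariate tool the review lists (temperature, torque, throughput, and so on) can be wrapped in a check of this shape; the spectroscopic techniques it surveys extend the same idea to multivariate signals.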

    Human Exploration of Mars: Preliminary Lists of Crew Tasks

    This is a preliminary report of ongoing research that has identified 1,125 tasks that are likely to be performed during initial human expeditions to Mars. The purpose of the report is to facilitate immediate access to the task inventory by researchers whose efforts might benefit from concrete examples of the work that will likely be performed by the first human explorers of Mars and the tasks crew members must be prepared to perform in response to emergencies. The research that led to the task lists is being conducted under Cooperative Agreement NNX15AW34G / NNX16AQ86G for the Human Factors and Behavioral Performance Element, Human Performance Program, NASA's Johnson Space Center. The study is ongoing and will conclude with a final report that documents all research activities and presents the results of the task and ability analyses, along with the implications of the results for crew size and composition, personnel selection and training, and the design of equipment and procedures. The research addresses the Risk of Inadequate Mission, Process, and Task Design and the Risk of Performance Errors Due to Training Deficiencies by identifying the work that will be performed during an expedition to Mars and the abilities, skills, and knowledge that will be required of crew members. The study began by developing the comprehensive inventory of 1,125 tasks that are likely to be performed during the 12 phases of initial human expeditions to Mars, from launch to landing 30 months later. This full-mission task inventory was generated through a comprehensive review of documentation and concepts of operations, with the understanding that plans and tasks might change in response to continuing technological development. Note: This interim report includes no discussion of analyses and has been prepared solely to facilitate dissemination of the task lists to others whose research might benefit from detailed information about the work and other activities that are likely to be performed during the human exploration of Mars.

    Exploring applicability of blockchain to enhance Single Sign-On (SSO) systems

    Usage of Single Sign-On (SSO) systems has been rising rapidly. One of the major benefits of an SSO system is having a central authentication service that other applications can use. However, SSO services are also prone to failure: if an SSO service becomes unavailable, every application that depends on it becomes simultaneously inaccessible to users. The goal of this research is to explore a technique to mitigate the availability issue of SSO by customizing its functionality and distributing its data over the network using blockchain technology. The blockchain data structure possesses inherent properties that can improve an SSO service's availability and, hence, its overall functionality and reliability.
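
    As a minimal sketch of the blockchain property the proposal leans on, the snippet below hash-chains authentication records so any replica can verify them independently. The record fields and helper function are hypothetical and stand in for whatever the customized SSO service would actually store.

```python
# Hash-chained authentication records: a sketch of the blockchain property
# the abstract relies on, not the authors' implementation.
import hashlib
import json
import time

def make_block(prev_hash: str, record: dict) -> dict:
    """Build an append-only block whose hash covers the previous block's hash."""
    body = {"prev": prev_hash, "time": time.time(), "record": record}
    digest = hashlib.sha256(json.dumps(body, sort_keys=True).encode()).hexdigest()
    return {**body, "hash": digest}

genesis = make_block("0" * 64, {"event": "genesis"})
login = make_block(genesis["hash"], {"user": "alice", "token": "demo-token"})

# Any replica can recompute the digests to detect tampering; because every
# node holds the chain, authentication can survive the loss of one SSO server.
assert login["prev"] == genesis["hash"]
```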

    Experimental Validation of Architectural Solutions

    This is an interim report on the experimental validation of architectural solutions performed in WP5 of the CRUTIAL project. The two main contributions are the description of an attack injection tool for testing the architectural solutions, and the description of a monitor and data collector that gathers and analyses information about the behavior of the software after it has been attacked.
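
    The report itself only names the two components; as a loose, hypothetical sketch of how an attack injector and a monitor/data collector pair up, the snippet below feeds malformed inputs to a stand-in target and records each outcome. Nothing here reflects the actual CRUTIAL WP5 tooling.

```python
# Schematic attack injector plus monitor/data collector (hypothetical target).

def target(request: bytes) -> bytes:
    """Stand-in for the architectural component under test."""
    if len(request) > 64:
        raise ValueError("buffer limit exceeded")
    return b"ok"

log = []  # the "data collector": one record per injected attack

for attack in (b"A" * 8, b"A" * 128):  # benign payload vs. oversized payload
    try:
        outcome = target(attack)
        log.append({"attack_len": len(attack), "outcome": outcome})
    except Exception as exc:           # the "monitor" records the failure
        log.append({"attack_len": len(attack), "error": repr(exc)})

print(log)  # basis for analyzing post-attack behavior
```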

    Intelligent monitoring and fault diagnosis for ATLAS TDAQ: a complex event processing solution

    Effective monitoring and analysis tools are fundamental in modern IT infrastructures to gain insight into overall system behavior and to deal promptly and effectively with failures. In recent years, Complex Event Processing (CEP) technologies have emerged as effective solutions for information processing in the most disparate fields, from wireless sensor networks to financial analysis. This thesis proposes an innovative approach to monitor and operate complex and distributed computing systems, in particular the ATLAS Trigger and Data Acquisition (TDAQ) system currently in use at the European Organization for Nuclear Research (CERN). The result of this research, the AAL project, is currently used to provide ATLAS data acquisition operators with automated error detection and intelligent system analysis. The thesis begins by describing the TDAQ system and the controlling architecture, with a focus on the monitoring infrastructure and the expert system used for error detection and automated recovery. It then discusses the limitations of the current approach and how it can be improved to maximize the ATLAS TDAQ operational efficiency. Event processing methodologies are then laid out, with a focus on CEP techniques for stream processing and pattern recognition. The open-source Esper engine, the CEP solution adopted by the project, is subsequently analyzed and discussed. Next, the AAL project is introduced as the automated and intelligent monitoring solution developed as the result of this research. AAL requirements and governing factors are listed, with a focus on how stream processing functionalities can enhance the TDAQ monitoring experience. The AAL processing model is then introduced and the architectural choices are justified. Finally, real applications on TDAQ error detection are presented. The main conclusion from this work is that CEP techniques can be successfully applied to detect error conditions and system misbehavior. Moreover, the AAL project demonstrates a real application of CEP concepts for intelligent monitoring in the demanding TDAQ scenario. The adoption of AAL by several TDAQ communities shows that automation and intelligent system analysis were not properly addressed in the previous infrastructure. The results of this thesis will benefit researchers evaluating intelligent monitoring techniques on large-scale distributed computing systems.
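
    AAL is built on the Esper engine, whose rules are written in EPL on the JVM; that API is not reproduced here. As a language-neutral illustration of the CEP pattern the thesis applies, the sketch below implements a sliding-window rule that fires when a burst of error events arrives from one component. The window length, threshold, event shape, and component name are illustrative assumptions.

```python
# Sliding-window CEP-style rule (illustrative; AAL itself uses Esper/EPL).
from collections import defaultdict, deque

WINDOW_S = 10.0   # window length in seconds (assumed)
THRESHOLD = 3     # error count that triggers an alert (assumed)
windows = defaultdict(deque)  # per-component timestamps of recent errors

def on_event(component: str, timestamp: float, severity: str) -> None:
    """Consume one monitoring event; fire an alert on an error burst."""
    if severity != "ERROR":
        return
    w = windows[component]
    w.append(timestamp)
    while w and timestamp - w[0] > WINDOW_S:  # evict events outside the window
        w.popleft()
    if len(w) >= THRESHOLD:
        print(f"CEP alert: {len(w)} errors from {component} within {WINDOW_S}s")

# Hypothetical event stream from a TDAQ-like application:
for t in (1.0, 2.5, 4.0):
    on_event("ReadoutApp-07", t, "ERROR")
```

    A declarative engine like Esper expresses the same window-and-threshold logic as a query over the event stream, which is what makes such rules easy for operators to add without touching the monitoring code.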