Fault Injection Analytics: A Novel Approach to Discover Failure Modes in Cloud-Computing Systems
Cloud computing systems fail in complex and unexpected ways due to unexpected
combinations of events and interactions between hardware and software
components. Fault injection is an effective means to bring out these failures
in a controlled environment. However, fault injection experiments produce
massive amounts of data, and manually analyzing these data is inefficient and
error-prone, as the analyst can miss severe failure modes that are yet unknown.
This paper introduces a new paradigm (fault injection analytics) that applies
unsupervised machine learning on execution traces of the injected system, to
ease the discovery and interpretation of failure modes. We evaluated the
proposed approach in the context of fault injection experiments on the
OpenStack cloud computing platform, where we show that the approach can
accurately identify failure modes with a low computational cost. (IEEE Transactions on Dependable and Secure Computing; 16 pages.)
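The abstract's core idea, grouping execution traces of injected experiments with unsupervised learning so that recurring failure modes surface as clusters, can be illustrated with a minimal sketch. This is not the paper's method: the bag-of-events features, the greedy single-link grouping, and the L1 distance threshold are all illustrative assumptions standing in for whatever trace representation and clustering algorithm the authors actually use.

```python
# Illustrative sketch: cluster fault-injection execution traces so that
# similar failure behaviors group together. Feature extraction and the
# clustering rule are assumptions, not the paper's actual technique.
from collections import Counter

def trace_features(trace, vocab):
    """Bag-of-events count vector for one execution trace."""
    counts = Counter(trace)
    return [counts.get(ev, 0) for ev in vocab]

def cluster_traces(traces, threshold=2):
    """Greedy grouping: a trace joins the first cluster whose representative
    is within `threshold` L1 distance, else it starts a new cluster."""
    vocab = sorted({ev for t in traces for ev in t})
    vecs = [trace_features(t, vocab) for t in traces]
    clusters = []  # each cluster is a list of trace indices
    for i, v in enumerate(vecs):
        for c in clusters:
            rep = vecs[c[0]]  # first member acts as cluster representative
            if sum(abs(a - b) for a, b in zip(v, rep)) <= threshold:
                c.append(i)
                break
        else:
            clusters.append([i])
    return clusters
```

An analyst would then inspect one representative trace per cluster instead of the full experiment log, which is the labor saving the paradigm targets.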
UML-Based Modeling of Robustness Testing
The aim of robustness testing is to characterize the behavior of a system in the presence of erroneous or stressful input conditions. It is a well-established approach in the dependability community, which has a long tradition of testing based on fault injection. However, a recurring problem is the insufficient documentation of experiments, which may prevent their replication. Our work investigates whether UML-based documentation could be used. It proposes an extension of the UML Testing Profile that accounts for the specificities of robustness testing experiments. The extension also reuses some elements of the QoSFT profile targeting measurements. Its ability to model realistic experiments is demonstrated on a case study from dependability research.
Distributed simulation of power systems using real time digital simulator
The simulation of power system behavior, especially transient behavior, helps in the analysis and planning of various power systems. However, power systems are usually highly complex and geographically distributed. Therefore, system partitioning can be used to allow for sharing resources in simulation. In this work, distributed simulations of power system models have been developed using an electromagnetic transient simulator, namely the Real Time Digital Simulator (RTDS). The goal is to demonstrate and assess the feasibility of both non-real-time and real-time simulations using the RTDS in a geographically distributed scenario. Different protocols and options used in the communication between power systems have been studied and analyzed. In this work, a test bed has been developed for data transfer between a power system simulated in RTDS at Mississippi State University and the power system simulated in RTDS at Texas A&M University. Different protocols, available for the interface and communication in the RTDS, have been studied and applied in this work. Finally, a locally distributed wide-area control test bed was developed and simulated.
Process monitoring and visualization solutions for hot-melt extrusion : a review
Objectives: Hot-melt extrusion (HME) is applied as a continuous pharmaceutical manufacturing process for the production of a variety of dosage forms and formulations. To ensure the continuity of this process, the quality of the extrudates must be assessed continuously during manufacturing. The objective of this review is to provide an overview and evaluation of the available process analytical techniques which can be applied in hot-melt extrusion.
Key Findings: Pharmaceutical extruders are equipped with traditional (univariate) process monitoring tools, observing barrel and die temperatures, throughput, screw speed, torque, drive amperage, melt pressure and melt temperature. The relevance of several spectroscopic process analytical techniques for monitoring and control of pharmaceutical HME has been explored recently. Nevertheless, many other sensors visualizing HME and measuring diverse critical product and process parameters with potential use in pharmaceutical extrusion are available, and were thoroughly studied in polymer extrusion. The implementation of process analytical tools in HME serves two purposes: (1) improving process understanding by monitoring and visualizing the material behaviour and (2) monitoring and analysing critical product and process parameters for process control, allowing to maintain a desired process state and guaranteeing the quality of the end product.
Summary: This review is the first to provide an evaluation of the process analytical tools applied for pharmaceutical HME monitoring and control, and discusses techniques used in polymer extrusion that have potential for monitoring and control of pharmaceutical HME.
Exploring applicability of blockchain to enhance Single Sign-On (SSO) systems
Usage of Single Sign-On (SSO) systems has been rising exponentially. One of the major benefits of an SSO system is having a central authentication service that other applications can use. However, SSO services are also prone to failure. If an SSO service becomes unavailable due to failure, every application that uses it becomes simultaneously inaccessible to users. The goal of this research is to explore a technique to mitigate the availability issue of SSO by customizing its functionality and distributing its data over the network using blockchain technology. The blockchain data structure possesses inherent properties that can be useful for improving an SSO service's availability and hence its overall functionality and reliability.
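The "inherent properties" the abstract refers to boil down to hash chaining: each record commits to the hash of its predecessor, so tampering with any replicated SSO record is detectable by any node. The sketch below shows only that tamper-evidence property; the record schema and function names are hypothetical and not taken from the paper.

```python
# Illustrative sketch of the tamper-evidence property of a blockchain-style
# data structure. The record fields and API are assumptions for illustration.
import hashlib
import json

def block_hash(block):
    """Deterministic SHA-256 over a canonical JSON serialization."""
    payload = json.dumps(block, sort_keys=True).encode()
    return hashlib.sha256(payload).hexdigest()

def add_block(chain, data):
    """Append a record that commits to the hash of the previous record."""
    prev = block_hash(chain[-1]) if chain else "0" * 64
    chain.append({"index": len(chain), "prev_hash": prev, "data": data})
    return chain

def verify(chain):
    """A replica can recompute every link; any edit breaks the chain."""
    for i in range(1, len(chain)):
        if chain[i]["prev_hash"] != block_hash(chain[i - 1]):
            return False
    return True
```

Because every replica can run `verify` independently, the authentication data can be distributed across nodes without a single point of failure, which is the availability argument the abstract makes.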
Human Exploration of Mars: Preliminary Lists of Crew Tasks
This is a preliminary report of ongoing research that has identified 1,125 tasks that are likely to be performed during initial human expeditions to Mars. The purpose of the report is to facilitate immediate access to the task inventory by researchers whose efforts might benefit from concrete examples of the work that will likely be performed by the first human explorers of Mars and the tasks that crew members must be prepared to perform in response to emergencies. The research that led to the task lists is being conducted under Cooperative Agreement NNX15AW34G / NNX16AQ86G for the Human Factors and Behavioral Performance Element, Human Performance Program, NASA's Johnson Space Center. The study is ongoing and will conclude with a final report that documents all research activities and presents the results of task and ability analyses and the implications of study results for crew size and composition, personnel selection and training, and design of equipment and procedures. The research addresses the Risk of Inadequate Mission, Process, and Task Design and the Risk of Performance Errors Due to Training Deficiencies by identifying the work that will be performed during an expedition to Mars and the abilities, skills, and knowledge that will be required of crew members. The study began by developing the comprehensive inventory of 1,125 tasks that are likely to be performed during the 12 phases of initial human expeditions to Mars, from launch to landing 30 months later. This full-mission task inventory was generated by a comprehensive review of documentation and concepts of operations, with the understanding that plans and tasks might change in response to continuing technological development.
Note: This interim report includes no discussion of analyses and has been prepared solely to facilitate dissemination of the task lists to others whose research might benefit from detailed information about the work and other activities that are likely to be performed during the human exploration of Mars
Experimental Validation of Architectural Solutions
This is an interim report on the experimental validation of architectural solutions performed in WP5 of project CRUTIAL. The two main contributions are the description of an attack injection tool for testing the architectural solutions and the description of a monitor and data collector that collects and analyses information about the behavior of the software after it has been attacked.
Intelligent monitoring and fault diagnosis for ATLAS TDAQ: a complex event processing solution
Effective monitoring and analysis tools are fundamental in modern IT
infrastructures to get insights on the overall system behavior and to deal
promptly and effectively with failures. In recent years, Complex Event
Processing (CEP) technologies have emerged as effective solutions for
information processing from the most disparate fields: from wireless sensor
networks to financial analysis. This thesis proposes an innovative approach to
monitor and operate complex and distributed computing systems, in particular
referring to the ATLAS Trigger and Data Acquisition (TDAQ) system currently
in use at the European Organization for Nuclear Research (CERN). The
result of this research, the AAL project, is currently used to provide ATLAS
data acquisition operators with automated error detection and intelligent
system analysis.
The thesis begins by describing the TDAQ system and the controlling
architecture, with a focus on the monitoring infrastructure and the expert
system used for error detection and automated recovery. It then discusses
the limitations of the current approach and how it can be improved to
maximize the ATLAS TDAQ operational efficiency.
Event processing methodologies are then laid out, with a focus on CEP
techniques for stream processing and pattern recognition. The open-source
Esper engine, the CEP solution adopted by the project, is subsequently
analyzed and discussed.
Next, the AAL project is introduced as the automated and intelligent
monitoring solution developed as the result of this research. AAL
requirements and governing factors are listed, with a focus on how stream
processing functionalities can enhance the TDAQ monitoring experience. The
AAL processing model is then introduced and the architectural choices are
justified. Finally, real applications on TDAQ error detection are presented.
The main conclusion from this work is that CEP techniques can be
successfully applied to detect error conditions and system misbehavior.
Moreover, the AAL project demonstrates a real application of CEP concepts
for intelligent monitoring in the demanding TDAQ scenario. The adoption of
AAL by several TDAQ communities shows that automation and intelligent
system analysis were not properly addressed in the previous infrastructure.
The results of this thesis will benefit researchers evaluating intelligent
monitoring techniques on large-scale distributed computing systems.
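The CEP pattern-matching idea the thesis applies, detecting a condition such as "N error events within a sliding time window" over a live event stream, can be sketched in a few lines. Esper itself is a Java engine with its own EPL query language; the generator below is only a language-agnostic illustration of the windowed-pattern concept, and the event shape and thresholds are assumptions, not taken from the AAL project.

```python
# Minimal sketch of a CEP-style windowed pattern: emit an alert whenever
# `threshold` error events occur within `window` seconds of each other.
# Event format (timestamp, kind) and defaults are illustrative assumptions.
from collections import deque

def detect_bursts(events, window=10.0, threshold=3):
    """events: iterable of (timestamp, kind) pairs in time order.
    Yields the timestamp at which an error burst is detected."""
    recent = deque()  # timestamps of errors inside the current window
    for ts, kind in events:
        if kind != "error":
            continue
        recent.append(ts)
        # Slide the window: drop errors older than `window` seconds.
        while recent and ts - recent[0] > window:
            recent.popleft()
        if len(recent) >= threshold:
            yield ts
```

A real CEP engine generalizes this to many concurrent patterns, joins across streams, and declarative queries, but the sliding-window state machine above is the core mechanism behind stream-based error detection.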