17 research outputs found

    Explaining Aviation Safety Incidents Using Deep Temporal Multiple Instance Learning

    Full text link
    Although aviation accidents are rare, safety incidents occur more frequently and require a careful analysis to detect and mitigate risks in a timely manner. Analyzing safety incidents using operational data and producing event-based explanations is invaluable to airline companies as well as to governing organizations such as the Federal Aviation Administration (FAA) in the United States. However, this task is challenging because of the complexity involved in mining multi-dimensional heterogeneous time series data, the lack of time-step-wise annotation of events in a flight, and the lack of scalable tools to perform analysis over a large number of events. In this work, we propose a precursor mining algorithm that identifies events in the multidimensional time series that are correlated with the safety incident. Precursors are valuable to systems health and safety monitoring and in explaining and forecasting safety incidents. Current methods suffer from poor scalability to high dimensional time series data and are inefficient in capturing temporal behavior. We propose an approach by combining multiple-instance learning (MIL) and deep recurrent neural networks (DRNN) to take advantage of MIL's ability to learn using weakly supervised data and DRNN's ability to model temporal behavior. We describe the algorithm, the data, the intuition behind taking a MIL approach, and a comparative analysis of the proposed algorithm with baseline models. We also discuss the application to a real-world aviation safety problem using data from a commercial airline company and discuss the model's abilities and shortcomings, with some final remarks about possible deployment directions

    Detecting Abnormal Machine Characteristics in Cloud Infrastructures

    Get PDF
    In the cloud computing environment resources are accessed as services rather than as a product. Monitoring this system for performance is crucial because of typical pay-peruse packages bought by the users for their jobs. With the huge number of machines currently in the cloud system, it is often extremely difficult for system administrators to keep track of all machines using distributed monitoring programs such as Ganglia1 which lacks system health assessment and summarization capabilities. To overcome this problem, we propose a technique for automated anomaly detection using machine performance data in the cloud. Our algorithm is entirely distributed and runs locally on each computing machine on the cloud in order to rank the machines in order of their anomalous behavior for given jobs. There is no need to centralize any of the performance data for the analysis and at the end of the analysis, our algorithm generates error reports, thereby allowing the system administrators to take corrective actions. Experiments performed on real data sets collected for different jobs validate the fact that our algorithm has a low overhead for tracking anomalous machines in a cloud infrastructure

    Literature review of machine learning techniques to analyse flight data

    Get PDF
    This paper analyses the increasing trend of using modern machine learning technologies to analyze flight data efficiently. Flight data offers an important insight into the operations of an aircraft. This paper reviews the research undertaken so far on the use of Machine Learning techniques for the analyses of flight data by evaluating various anomaly detection algorithms and the significance of feature selection in Flight Data Monitoring. These algorithms are compared to determine the best class of algorithms for highlighting significant flight anomalies. Furthermore, these algorithms are analyzed for various flight data parameters to determine which class of algorithms is sensitive to continuous parameters and which is sensitive to discrete parameters of flight data. The paper also addresses the ability of each anomaly detection algorithm to be easily adaptable to different datasets and different phases of flight, including take-off and landing.peer-reviewe

    A Dense Network Model for Outlier Prediction Using Learning Approaches

    Get PDF
    There are various sub-categories in outlier prediction and the investigators show less attention to related domains like outliers in audio recognition, video recognition, music recognition, etc. However, this research is specific to medical data analysis. It specifically concentrates on predicting the outliers from the medical database. Here, feature mapping and representation are achieved by adopting stacked LSTM-based CNN. The extracted features are fed as an input to the Linear Support Vector Machine () is used for classification purposes. Based on the analysis, it is known that there is a strong correlation between the features related to an individual's emotions. It can be analyzed in both a static and dynamic manner. Adopting both learning approaches is done to boost the drawbacks of one another. The statistical analysis is done with MATLAB 2016a environment where metrics like ROC, MCC, AUC, correlation co-efficiency, and prediction accuracy are evaluated and compared to existing approaches like standard CNN, standard SVM, logistic regression, multi-layer perceptrons, and so on. The anticipated learning model shows superior outcomes, and more concentration is provided to select an emotion recognition dataset connected with all the sub-domains

    Continuous Outlier Mining of Streaming Data in Flink

    Get PDF
    In this work, we focus on distance-based outliers in a metric space, where the status of an entity as to whether it is an outlier is based on the number of other entities in its neighborhood. In recent years, several solutions have tackled the problem of distance-based outliers in data streams, where outliers must be mined continuously as new elements become available. An interesting research problem is to combine the streaming environment with massively parallel systems to provide scalable streambased algorithms. However, none of the previously proposed techniques refer to a massively parallel setting. Our proposal fills this gap and investigates the challenges in transferring state-of-the-art techniques to Apache Flink, a modern platform for intensive streaming analytics. We thoroughly present the technical challenges encountered and the alternatives that may be applied. We show speed-ups of up to 117 (resp. 2076) times over a naive parallel (resp. non-parallel) solution in Flink, by using just an ordinary four-core machine and a real-world dataset. When moving to a three-machine cluster, due to less contention, we manage to achieve both better scalability in terms of the window slide size and the data dimensionality, and even higher speed-ups, e.g., by a factor of 510. Overall, our results demonstrate that oulier mining can be achieved in an efficient and scalable manner. The resulting techniques have been made publicly available as open-source software

    An enhanced sequential exception technique for semantic-based text anomaly detection

    Get PDF
    The detection of semantic-based text anomaly is an interesting research area which has gained considerable attention from the data mining community. Text anomaly detection identifies deviating information from general information contained in documents. Text data are characterized by having problems related to ambiguity, high dimensionality, sparsity and text representation. If these challenges are not properly resolved, identifying semantic-based text anomaly will be less accurate. This study proposes an Enhanced Sequential Exception Technique (ESET) to detect semantic-based text anomaly by achieving five objectives: (1) to modify Sequential Exception Technique (SET) in processing unstructured text; (2) to optimize Cosine Similarity for identifying similar and dissimilar text data; (3) to hybridize modified SET with Latent Semantic Analysis (LSA); (4) to integrate Lesk and Selectional Preference algorithms for disambiguating senses and identifying text canonical form; and (5) to represent semantic-based text anomaly using First Order Logic (FOL) and Concept Network Graph (CNG). ESET performs text anomaly detection by employing optimized Cosine Similarity, hybridizing LSA with modified SET, and integrating it with Word Sense Disambiguation algorithms specifically Lesk and Selectional Preference. Then, FOL and CNG are proposed to represent the detected semantic-based text anomaly. To demonstrate the feasibility of the technique, four selected datasets namely NIPS data, ENRON, Daily Koss blog, and 20Newsgroups were experimented on. The experimental evaluation revealed that ESET has significantly improved the accuracy of detecting semantic-based text anomaly from documents. When compared with existing measures, the experimental results outperformed benchmarked methods with an improved F1-score from all datasets respectively; NIPS data 0.75, ENRON 0.82, Daily Koss blog 0.93 and 20Newsgroups 0.97. The results generated from ESET has proven to be significant and supported a growing notion of semantic-based text anomaly which is increasingly evident in existing literatures. Practically, this study contributes to topic modelling and concept coherence for the purpose of visualizing information, knowledge sharing and optimized decision making

    Assessment of the State-of-the-Art of System-Wide Safety and Assurance Technologies

    Get PDF
    Since its initiation, the System-wide Safety Assurance Technologies (SSAT) Project has been focused on developing multidisciplinary tools and techniques that are verified and validated to ensure prevention of loss of property and life in NextGen and enable proactive risk management through predictive methods. To this end, four technical challenges have been listed to help realize the goals of SSAT, namely (i) assurance of flight critical systems, (ii) discovery of precursors to safety incidents, (iii) assuring safe human-systems integration, and (iv) prognostic algorithm design for safety assurance. The objective of this report is to provide an extensive survey of SSAT-related research accomplishments by researchers within and outside NASA to get an understanding of what the state-of-the-art is for technologies enabling each of the four technical challenges. We hope that this report will serve as a good resource for anyone interested in gaining an understanding of the SSAT technical challenges, and also be useful in the future for project planning and resource allocation for related research
    corecore