3 research outputs found

    On the Nature and Types of Anomalies: A Review

    Anomalies are occurrences in a dataset that are in some way unusual and do not fit the general patterns. The concept of the anomaly is generally ill-defined and perceived as vague and domain-dependent. Moreover, despite some 250 years of publications on the topic, no comprehensive and concrete overviews of the different types of anomalies have hitherto been published. By means of an extensive literature review this study therefore offers the first theoretically principled and domain-independent typology of data anomalies, and presents a full overview of anomaly types and subtypes. To concretely define the concept of the anomaly and its different manifestations, the typology employs five dimensions: data type, cardinality of relationship, anomaly level, data structure and data distribution. These fundamental and data-centric dimensions naturally yield 3 broad groups, 9 basic types and 61 subtypes of anomalies. The typology facilitates the evaluation of the functional capabilities of anomaly detection algorithms, contributes to explainable data science, and provides insights into relevant topics such as local versus global anomalies.
    Comment: 38 pages (30 pages content), 10 figures, 3 tables. Preprint; review comments will be appreciated. Improvements in version 2: explicit mention of fifth anomaly dimension; added section on explainable anomaly detection; added section on variations on the anomaly concept; various minor additions and improvements.

    Anomaly detection in mixed telemetry data using a sparse representation and dictionary learning

    Spacecraft health monitoring and failure prevention are major issues in space operations. In recent years, machine learning techniques have received increasing interest in many fields and have been applied to housekeeping telemetry data via semi-supervised learning. The idea is to use past telemetry describing normal spacecraft behaviour to learn a reference model against which the most recent data can be compared in order to detect potential anomalies. This paper introduces a new machine learning method for anomaly detection in telemetry time series based on a sparse representation and dictionary learning. The main advantage of the proposed method is the possibility to handle multivariate telemetry time series described by mixed continuous and discrete parameters, taking into account the potential correlations between these parameters. The proposed method is evaluated on a representative anomaly dataset obtained from real satellite telemetry with an available ground truth and compared to state-of-the-art algorithms.
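
    The general idea in this abstract (learn a reference model from normal telemetry, then score new data by how poorly it fits that model) can be illustrated with a minimal sketch. This is not the paper's method. It assumes a single continuous parameter, uses the top singular vectors of the training windows as a stand-in for a learned dictionary, and uses a plain orthogonal matching pursuit for the sparse code. The synthetic series, the window width, the number of atoms, and the threshold factor are all illustrative assumptions.

```python
import numpy as np


def sliding_windows(series, width):
    """Stack overlapping windows of a 1-D series as rows."""
    return np.stack([series[i:i + width] for i in range(len(series) - width + 1)])


def learn_dictionary(windows, n_atoms):
    """Stand-in for dictionary learning (assumption): use the top
    right-singular vectors of the training windows as unit-norm atoms."""
    _, _, vt = np.linalg.svd(windows, full_matrices=False)
    return vt[:n_atoms]  # shape (n_atoms, width)


def omp_reconstruct(dictionary, signal, n_nonzero):
    """Greedy orthogonal matching pursuit: approximate `signal` with at
    most `n_nonzero` atoms and return the reconstruction."""
    residual = signal.copy()
    support = []
    approx = np.zeros_like(signal)
    for _ in range(n_nonzero):
        idx = int(np.argmax(np.abs(dictionary @ residual)))
        if idx in support:
            break
        support.append(idx)
        atoms = dictionary[support].T  # shape (width, len(support))
        coef, *_ = np.linalg.lstsq(atoms, signal, rcond=None)
        approx = atoms @ coef
        residual = signal - approx
    return approx


def anomaly_scores(dictionary, windows, n_nonzero=3):
    """Score each window by the norm of its sparse reconstruction error."""
    return np.array(
        [np.linalg.norm(w - omp_reconstruct(dictionary, w, n_nonzero)) for w in windows]
    )


# Synthetic "telemetry" (assumption): a noisy sine wave stands in for normal behaviour.
rng = np.random.default_rng(0)
t = np.arange(400)
width = 25
normal = np.sin(2 * np.pi * t / 50) + 0.01 * rng.standard_normal(t.size)
recent = np.sin(2 * np.pi * t / 50) + 0.01 * rng.standard_normal(t.size)
recent[200] += 3.0  # injected spike playing the role of an anomaly

dictionary = learn_dictionary(sliding_windows(normal, width), n_atoms=4)
train_scores = anomaly_scores(dictionary, sliding_windows(normal, width))
test_scores = anomaly_scores(dictionary, sliding_windows(recent, width))

# Illustrative threshold (assumption): three times the worst score on normal data.
threshold = 3.0 * train_scores.max()
flagged = np.flatnonzero(test_scores > threshold)
print("flagged window starts:", flagged)
```

    Windows that overlap the injected spike are poorly explained by atoms fitted to normal data, so their reconstruction error stands out. Handling mixed continuous and discrete parameters and their correlations, which the abstract names as the method's main advantage, is outside the scope of this sketch.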