On the Nature and Types of Anomalies: A Review
Anomalies are occurrences in a dataset that are in some way unusual and do
not fit the general patterns. The concept of the anomaly is generally
ill-defined and perceived as vague and domain-dependent. Moreover, despite some
250 years of publications on the topic, no comprehensive and concrete overviews
of the different types of anomalies have hitherto been published. By means of
an extensive literature review this study therefore offers the first
theoretically principled and domain-independent typology of data anomalies, and
presents a full overview of anomaly types and subtypes. To concretely define
the concept of the anomaly and its different manifestations, the typology
employs five dimensions: data type, cardinality of relationship, anomaly level,
data structure and data distribution. These fundamental and data-centric
dimensions naturally yield 3 broad groups, 9 basic types and 61 subtypes of
anomalies. The typology facilitates the evaluation of the functional
capabilities of anomaly detection algorithms, contributes to explainable data
science, and provides insights into relevant topics such as local versus global
anomalies.
Comment: 38 pages (30 pages content), 10 figures, 3 tables. Preprint; review
comments will be appreciated. Improvements in version 2: Explicit mention of
fifth anomaly dimension; Added section on explainable anomaly detection;
Added section on variations on the anomaly concept; Various minor additions
and improvements
Anomaly detection in mixed telemetry data using a sparse representation and dictionary learning
Spacecraft health monitoring and failure prevention are major issues in space operations. In recent years, machine learning techniques have received increasing interest in many fields and have been applied to housekeeping telemetry data via semi-supervised learning. The idea is to use past telemetry describing normal spacecraft behaviour to learn a reference model, against which the most recent data can be compared in order to detect potential anomalies. This paper introduces a new machine learning method for anomaly detection in telemetry time series based on a sparse representation and dictionary learning. The main advantage of the proposed method is that it can handle multivariate telemetry time series described by mixed continuous and discrete parameters, taking into account the potential correlations between these parameters. The proposed method is evaluated on a representative anomaly dataset obtained from real satellite telemetry with available ground truth, and is compared to state-of-the-art algorithms.
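The general idea of scoring recent telemetry against a reference model built from nominal data can be illustrated with a minimal sketch. This is not the method of the paper: it assumes a single continuous channel (no discrete parameters), uses unit-norm windows of past nominal data directly as dictionary atoms instead of a learned dictionary, and uses plain greedy matching pursuit for the sparse code. The synthetic sine signal, the injected spike, the function names and the threshold are all hypothetical choices made for illustration.

```python
import math


def normalize(vector):
    """Scale a vector to unit Euclidean norm (zero vectors are returned as-is)."""
    norm = math.sqrt(sum(x * x for x in vector))
    return [x / norm for x in vector] if norm > 0 else list(vector)


def sliding_windows(series, width, stride):
    """Cut a 1-D series into fixed-width windows taken every `stride` samples."""
    return [series[i:i + width] for i in range(0, len(series) - width + 1, stride)]


def residual_norm(signal, atoms, n_atoms=2):
    """Greedy matching pursuit: norm of what remains after n_atoms greedy steps."""
    residual = list(signal)
    for _ in range(n_atoms):
        # Correlate the current residual with every atom and keep the best match.
        coeffs = [sum(r * a for r, a in zip(residual, atom)) for atom in atoms]
        best = max(range(len(atoms)), key=lambda i: abs(coeffs[i]))
        residual = [r - coeffs[best] * a for r, a in zip(residual, atoms[best])]
    return math.sqrt(sum(r * r for r in residual))


# Hypothetical nominal telemetry: one continuous channel, a clean periodic signal.
PERIOD = 16
nominal = [math.sin(2 * math.pi * t / PERIOD) for t in range(256)]

# Reference model (assumption): unit-norm windows of past nominal data as atoms.
atoms = [normalize(w) for w in sliding_windows(nominal, PERIOD, stride=4)]

# Two recent windows: one nominal (phase-shifted), one with an injected spike.
recent_nominal = [math.sin(2 * math.pi * (t + 3) / PERIOD) for t in range(PERIOD)]
recent_anomalous = list(recent_nominal)
recent_anomalous[8] += 3.0

score_nominal = residual_norm(recent_nominal, atoms)
score_anomalous = residual_norm(recent_anomalous, atoms)

THRESHOLD = 1.0  # assumed value; in practice it would be tuned on held-out nominal data
print(score_nominal > THRESHOLD, score_anomalous > THRESHOLD)  # prints: False True
```

A window that a few nominal atoms reconstruct well gets a residual near zero, while the spike cannot be expressed by the atoms and leaves a large residual. Handling mixed continuous and discrete parameters and their correlations, which the abstract names as the main contribution, is outside the scope of this sketch.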