2 research outputs found
Exploring Data and Knowledge combined Anomaly Explanation of Multivariate Industrial Data
The demand for high-performance anomaly detection techniques for IoT data
has become urgent, especially in industry. Anomaly identification and
explanation in time series data is an essential task in IoT data mining. Because
existing anomaly detection techniques focus on the identification of
anomalies, the explanation of anomalies is not well solved. We address the
anomaly explanation problem for multivariate IoT data and propose a 3-step
self-contained method in this paper. We formalize and utilize the domain
knowledge in our method, and identify the anomalies by the violation of
constraints. We propose set-cover-based anomaly explanation algorithms to
discover the anomaly events reflected by violation features, and further
develop knowledge update algorithms to improve the original knowledge set.
Experimental results on real datasets from large-scale IoT systems verify that
our method computes high-quality explanations of anomalies. Our work
provides a guide to explainable anomaly detection in both IoT fault
diagnosis and temporal data cleaning.
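
The two central ideas of the abstract above, detecting anomalies as constraint violations and explaining them with a set-cover step, can be illustrated with a minimal sketch. This is not the authors' algorithm: the constraint names, the event catalogue, and the greedy heuristic are all assumptions made for illustration, and the paper's own formalization of domain knowledge may differ.

```python
from typing import Callable, Dict, List, Set

Record = Dict[str, float]


def find_violations(record: Record,
                    constraints: Dict[str, Callable[[Record], bool]]) -> Set[str]:
    """Return the names of the constraints that the record fails."""
    return {name for name, holds in constraints.items() if not holds(record)}


def greedy_explain(violations: Set[str], events: Dict[str, Set[str]]) -> List[str]:
    """Greedy set cover: repeatedly pick the event that explains the most
    still-unexplained violations. Ties are broken alphabetically."""
    uncovered = set(violations)
    chosen: List[str] = []
    while uncovered and events:
        best = max(sorted(events), key=lambda e: len(events[e] & uncovered))
        gain = events[best] & uncovered
        if not gain:
            break  # the remaining violations match no known event
        chosen.append(best)
        uncovered -= gain
    return chosen


# Hypothetical domain knowledge: constraints over one multivariate reading.
constraints = {
    "temp_range": lambda r: 0 <= r["temp"] <= 90,
    "pressure_range": lambda r: 1 <= r["pressure"] <= 10,
    "flow_positive": lambda r: r["flow"] > 0,
}

# Hypothetical event catalogue: each event lists the violations it can cause.
events = {
    "cooling_failure": {"temp_range", "pressure_range"},
    "sensor_drift": {"temp_range"},
    "pump_stall": {"flow_positive"},
}

record = {"temp": 120, "pressure": 12, "flow": 3}
violations = find_violations(record, constraints)
explanation = greedy_explain(violations, events)
print(sorted(violations), explanation)
```

In this sketch a single event, `cooling_failure`, covers both violated constraints, so the greedy step prefers it over the narrower `sensor_drift`. Greedy selection only approximates a minimum cover; an exact solver could replace it for a small event catalogue.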
Time Series Data Cleaning with Regular and Irregular Time Intervals
Errors are prevalent in time series data, especially in the industrial field.
Data with errors cannot be stored in the database, which results in the loss
of data assets. Handling dirty data in time series is non-trivial when
the time intervals are irregular. At present, to deal with time series
containing errors, besides keeping the original erroneous data, discarding
it, or checking it manually, we can also use the
cleaning algorithms widely used in databases to clean the time
series data automatically. This survey provides a classification of time series data cleaning
techniques and comprehensively reviews the state-of-the-art methods of each
type. In particular, we focus on irregular time intervals.
In addition, we summarize data cleaning tools, systems, and evaluation criteria from
research and industry. Finally, we highlight possible directions for time series
data cleaning.