Alaska Road Weather Project | Technical Performance Assessment Report | Fairbanks Field Demonstration 2013-2014
INE/AUTC 15.0
A framework for automated anomaly detection in high frequency water-quality data from in situ sensors
River water-quality monitoring is increasingly conducted using automated in
situ sensors, enabling timelier identification of unexpected values. However,
anomalies caused by technical issues confound these data, while the volume and
velocity of data prevent manual detection. We present a framework for automated
anomaly detection in high-frequency water-quality data from in situ sensors,
using turbidity, conductivity and river level data. After identifying end-user
needs and defining anomalies, we ranked their importance and selected suitable
detection methods. High priority anomalies included sudden isolated spikes and
level shifts, most of which were classified correctly by regression-based
methods such as autoregressive integrated moving average models. However, using
other water-quality variables as covariates reduced performance due to complex
relationships among variables. Classification of drift and periods of
anomalously low or high variability improved when we replaced anomalous
measurements with forecasts, but this inflated false positive rates.
Feature-based methods also performed well on high priority anomalies, but were
less proficient at detecting lower priority anomalies, resulting in high
false negative rates. Unlike regression-based methods, all feature-based
methods produced low false positive rates and did not require training or
optimization. Rule-based methods successfully detected impossible values and
missing observations. Thus, we recommend using a combination of methods to
improve anomaly detection performance, whilst minimizing false detection rates.
Furthermore, our framework emphasizes the importance of communication between
end-users and analysts for optimal outcomes with respect to both detection
performance and end-user needs. Our framework is applicable to other types of
high frequency time-series data and anomaly detection applications.
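
As a rough illustration of the recommendation to combine methods, the sketch below pairs a rule-based check (missing observations and physically impossible values) with a simple neighbour-difference check for sudden isolated spikes. It is not the authors' implementation: the function names, the plausible range for turbidity, and the spike threshold are all hypothetical, and the spike check is a simplified stand-in for the regression-based and feature-based methods described above.

```python
import math


def rule_based_flags(values, lower, upper):
    """Label each observation 'missing', 'impossible', or None (no rule fired).

    `lower` and `upper` are assumed physical limits for the variable.
    """
    flags = []
    for v in values:
        if v is None or (isinstance(v, float) and math.isnan(v)):
            flags.append("missing")
        elif v < lower or v > upper:
            flags.append("impossible")
        else:
            flags.append(None)
    return flags


def spike_flags(values, threshold):
    """Flag isolated spikes: points that depart from BOTH neighbours,
    in the same direction, by more than `threshold`."""
    flags = [False] * len(values)
    for i in range(1, len(values) - 1):
        prev, cur, nxt = values[i - 1], values[i], values[i + 1]
        if None in (prev, cur, nxt):
            continue  # cannot compare against a missing neighbour
        d_prev = cur - prev
        d_next = cur - nxt
        if d_prev * d_next > 0 and min(abs(d_prev), abs(d_next)) > threshold:
            flags[i] = True
    return flags


# Hypothetical turbidity readings with one spike, one gap, one negative value.
turbidity = [5.1, 5.3, 48.0, 5.2, None, -1.0, 5.4]
rules = rule_based_flags(turbidity, lower=0.0, upper=1000.0)
spikes = spike_flags(turbidity, threshold=20.0)

# Combine the two detectors into a single label per observation.
combined = [r if r is not None else ("spike" if s else None)
            for r, s in zip(rules, spikes)]
```

Running the rule-based check first means that values it has already labelled are not re-labelled by the spike check. In practice the limits and threshold would be agreed with end-users for each sensor and variable.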