8 research outputs found
A framework for automated anomaly detection in high frequency water-quality data from in situ sensors
River water-quality monitoring is increasingly conducted using automated in
situ sensors, enabling timelier identification of unexpected values. However,
anomalies caused by technical issues confound these data, while the volume and
velocity of data prevent manual detection. We present a framework for automated
anomaly detection in high-frequency water-quality data from in situ sensors,
using turbidity, conductivity and river level data. After identifying end-user
needs and defining anomalies, we ranked their importance and selected suitable
detection methods. High priority anomalies included sudden isolated spikes and
level shifts, most of which were classified correctly by regression-based
methods such as autoregressive integrated moving average models. However, using
other water-quality variables as covariates reduced performance due to complex
relationships among variables. Classification of drift and periods of
anomalously low or high variability improved when we replaced anomalous
measurements with forecasts, but this inflated false positive rates.
Feature-based methods also performed well on high priority anomalies, but were
less proficient at detecting lower priority anomalies, resulting in high
false negative rates. Unlike regression-based methods, all feature-based
methods produced low false positive rates and did not require training or
optimization. Rule-based methods successfully detected impossible values and
missing observations. Thus, we recommend using a combination of methods to
improve anomaly detection performance, whilst minimizing false detection rates.
Furthermore, our framework emphasizes the importance of communication between
end-users and analysts for optimal outcomes with respect to both detection
performance and end-user needs. Our framework is applicable to other types of
high frequency time-series data and anomaly detection applications.
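The regression-based detection of sudden isolated spikes described above can be sketched as follows. This is a minimal illustration, not the authors' implementation: in place of a full ARIMA model it uses a naive one-step persistence forecast, flags a measurement when its residual deviates from recent residuals by more than `k` rolling standard deviations, and applies the mitigation step by replacing a flagged value with its forecast before forecasting onward. All names and thresholds are illustrative.

```python
# Sketch of regression-based spike detection with forecast mitigation.
# A real implementation would fit an ARIMA model; here a naive one-step
# persistence forecast stands in for illustration.
from statistics import mean, stdev

def detect_spikes(series, window=10, k=4.0):
    """Return indices of sudden isolated spikes in `series`."""
    clean = list(series)   # series with flagged values replaced by forecasts
    anomalies = []
    residuals = []
    for t in range(1, len(series)):
        forecast = clean[t - 1]            # naive one-step-ahead forecast
        resid = series[t] - forecast
        recent = residuals[-window:]
        if len(recent) >= 3:
            sigma = stdev(recent)
            if sigma > 0 and abs(resid - mean(recent)) > k * sigma:
                anomalies.append(t)
                clean[t] = forecast        # mitigation: replace the anomaly
                residuals.append(mean(recent))  # keep sigma uninflated
                continue
        residuals.append(resid)
    return anomalies

# A smooth turbidity-like series with one injected spike at index 7.
data = [10.0, 10.2, 10.1, 10.3, 10.2, 10.4, 10.3, 50.0, 10.2, 10.3, 10.1]
print(detect_spikes(data))  # → [7]
```

Because the flagged value is replaced by its forecast, the point immediately after the spike is compared against a sensible baseline rather than the spike itself, which is the mitigation behaviour the abstract describes.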
Forecasting: theory and practice
Forecasting has always been at the forefront of decision making and planning.
The uncertainty that surrounds the future is both exciting and challenging,
with individuals and organisations seeking to minimise risks and maximise
utilities. The lack of a free-lunch theorem implies the need for a diverse set
of forecasting methods to tackle an array of applications. This unique article
provides a non-systematic review of the theory and the practice of forecasting.
We offer a wide range of theoretical, state-of-the-art models, methods,
principles, and approaches to prepare, produce, organise, and evaluate
forecasts. We then demonstrate how such theoretical concepts are applied in a
variety of real-life contexts, including operations, economics, finance,
energy, environment, and social good. We do not claim that this review is an
exhaustive list of methods and applications. The list was compiled based on the
expertise and interests of the authors. However, we wish that our encyclopedic
presentation will offer a point of reference for the rich work that has been
undertaken over the last decades, with some key insights for the future of
forecasting theory and practice.
A Feature-Based Procedure for Detecting Technical Outliers in Water-Quality Data From In Situ Sensors
Outliers due to technical errors in water-quality data from in situ sensors can reduce data quality and have a direct impact on inference drawn from subsequent data analysis. However, outlier detection through manual monitoring is infeasible given the volume and velocity of data the sensors produce. Here we introduce an automated procedure, named oddwater, that provides early detection of outliers in water-quality data from in situ sensors caused by technical issues. The oddwater procedure first identifies the data features that differentiate outlying instances from typical behaviors. Statistical transformations are then applied to make the outlying instances stand out in a transformed data space. Unsupervised outlier scoring techniques are applied to the transformed data space, and an approach based on extreme value theory is used to calculate a threshold for each potential outlier. Using two data sets obtained from in situ sensors in rivers flowing into the Great Barrier Reef lagoon, Australia, we show that oddwater successfully identifies outliers involving abrupt changes in turbidity, conductivity, and river level, including sudden spikes, sudden isolated drops, and level shifts, while maintaining very low false detection rates. We have implemented the oddwater procedure in the open source R package oddwater.
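The three-step pipeline described in this abstract (statistical transformation, unsupervised outlier scoring, thresholding) can be sketched as below. This is not the oddwater package's actual implementation: the feature is a one-step difference of log-transformed values, the score is a k-nearest-neighbour distance, and a simple median-multiple cutoff stands in for the extreme-value-theory threshold used in the paper. All function names and parameters are illustrative.

```python
# Sketch of a feature-based outlier procedure in the spirit of oddwater:
# transform, score with an unsupervised method, then threshold the scores.
import math
from statistics import median

def features(series):
    """One-step differences of log-transformed values: abrupt changes stand out."""
    logs = [math.log(v) for v in series]
    return [logs[i] - logs[i - 1] for i in range(1, len(logs))]

def knn_scores(feats, k=3):
    """Outlier score = distance to the k-th nearest neighbour in feature space."""
    scores = []
    for i, x in enumerate(feats):
        dists = sorted(abs(x - y) for j, y in enumerate(feats) if j != i)
        scores.append(dists[k - 1])
    return scores

def flag_outliers(series, k=3, factor=10.0):
    feats = features(series)
    scores = knn_scores(feats, k)
    cutoff = factor * median(scores)  # crude stand-in for the EVT threshold
    # feature i describes the change from series[i] to series[i + 1]
    return [i + 1 for i, s in enumerate(scores) if s > cutoff]

# A turbidity-like series with one sudden isolated spike at index 5.
turbidity = [5.1, 5.0, 5.2, 5.1, 5.3, 40.0, 5.2, 5.1, 5.0, 5.2, 5.1, 5.3]
print(flag_outliers(turbidity))  # → [5, 6]
```

A sudden isolated spike produces two extreme changes (up then down), so both surrounding transitions are flagged; a level shift would instead produce a single extreme change, which is one way the transformed feature space separates the two anomaly types.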
A framework for automated anomaly detection in high frequency water-quality data from in situ sensors
Monitoring the water quality of rivers is increasingly conducted using automated in situ sensors, enabling timelier identification of unexpected values or trends. However, the data are confounded by anomalies caused by technical issues, for which the volume and velocity of data preclude manual detection. We present a framework for automated anomaly detection in high-frequency water-quality data from in situ sensors, using turbidity, conductivity and river level data collected from rivers flowing into the Great Barrier Reef. After identifying end-user needs and defining anomalies, we ranked anomaly importance and selected suitable detection methods. High priority anomalies included sudden isolated spikes and level shifts, most of which were classified correctly by regression-based methods such as autoregressive integrated moving average models. However, incorporation of multiple water-quality variables as covariates reduced performance due to complex relationships among variables. Classifications of drift and periods of anomalously low or high variability were more often correct when we applied mitigation, which replaces anomalous measurements with forecasts for further forecasting, but this inflated false positive rates. Feature-based methods also performed well on high priority anomalies and were similarly less proficient at detecting lower priority anomalies, resulting in high false negative rates. Unlike regression-based methods, however, all feature-based methods produced low false positive rates and have the benefit of not requiring training or optimization. Rule-based methods successfully detected a subset of lower priority anomalies, specifically impossible values and missing observations. We therefore suggest that a combination of methods will provide optimal performance in terms of correct anomaly detection, whilst minimizing false detection rates. 
Furthermore, our framework emphasizes the importance of communication between end-users and anomaly detection developers for optimal outcomes with respect to both detection performance and end-user application. To this end, our framework has high transferability to other types of high frequency time-series data and anomaly detection applications.
Forecasting: theory and practice
Forecasting has always been at the forefront of decision making and planning.
The uncertainty that surrounds the future is both exciting and challenging,
with individuals and organisations seeking to minimise risks and maximise
utilities. The large number of forecasting applications calls for a diverse set
of forecasting methods to tackle real-life challenges. This article provides a
non-systematic review of the theory and the practice of forecasting. We provide
an overview of a wide range of theoretical, state-of-the-art models, methods,
principles, and approaches to prepare, produce, organise, and evaluate
forecasts. We then demonstrate how such theoretical concepts are applied in a
variety of real-life contexts.
We do not claim that this review is an exhaustive list of methods and
applications. However, we wish that our encyclopedic presentation will offer a
point of reference for the rich work that has been undertaken over the last
decades, with some key insights for the future of forecasting theory and
practice. Given its encyclopedic nature, the intended mode of reading is
non-linear. We offer cross-references to allow the readers to navigate through
the various topics. We complement the theoretical concepts and applications
covered by large lists of free or open-source software implementations and
publicly-available databases.