A framework for automated anomaly detection in high frequency water-quality data from in situ sensors

Authors: Alsibai Omar, Hyndman Rob J., Kandanaarachchi Sevvandi, King Olivia C., Leigh Catherine, McGree James M., Mengersen Kerrie, Neelamraju Catherine, Peterson Erin E., Strauss Jennifer, Talagala Priyanga Dilini, Turner Ryan S.
Publication date: 01/01/2019

River water-quality monitoring is increasingly conducted using automated in situ sensors, enabling timelier identification of unexpected values. However, anomalies caused by technical issues confound these data, while the volume and velocity of data prevent manual detection. We present a framework for automated anomaly detection in high-frequency water-quality data from in situ sensors, using turbidity, conductivity and river level data. After identifying end-user needs and defining anomalies, we ranked their importance and selected suitable detection methods. High-priority anomalies included sudden isolated spikes and level shifts, most of which were classified correctly by regression-based methods such as autoregressive integrated moving average models. However, using other water-quality variables as covariates reduced performance due to complex relationships among variables. Classification of drift and periods of anomalously low or high variability improved when we replaced anomalous measurements with forecasts, but this inflated false positive rates. Feature-based methods also performed well on high-priority anomalies, but were less proficient at detecting lower-priority anomalies, resulting in high false negative rates. Unlike regression-based methods, all feature-based methods produced low false positive rates, but did not require training or optimization. Rule-based methods successfully detected impossible values and missing observations. Thus, we recommend using a combination of methods to improve anomaly detection performance, whilst minimizing false detection rates. Furthermore, our framework emphasizes the importance of communication between end-users and analysts for optimal outcomes with respect to both detection performance and end-user needs. Our framework is applicable to other types of high-frequency time-series data and anomaly detection applications.
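The rule-based checks and the isolated-spike anomaly described in this abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation: the function names, the physical limits and the spike threshold are hypothetical, and the neighbour-difference spike test is a simple stand-in for the regression-based and feature-based methods the paper evaluates.

```python
import math


def rule_based_flags(values, lower, upper):
    """Label each reading 'missing', 'impossible' or 'ok'.

    `lower` and `upper` are assumed physical limits for the sensor
    (hypothetical parameters, e.g. turbidity cannot be negative).
    """
    flags = []
    for v in values:
        if v is None or (isinstance(v, float) and math.isnan(v)):
            flags.append("missing")
        elif v < lower or v > upper:
            flags.append("impossible")
        else:
            flags.append("ok")
    return flags


def spike_flags(values, threshold):
    """Flag isolated spikes in a series without missing values.

    A point is a spike when it differs from BOTH neighbours by more
    than `threshold` in the same direction, so a level shift (which
    differs from only one neighbour) is not flagged.
    """
    flags = [False] * len(values)
    for i in range(1, len(values) - 1):
        d_prev = values[i] - values[i - 1]
        d_next = values[i] - values[i + 1]
        if abs(d_prev) > threshold and abs(d_next) > threshold and d_prev * d_next > 0:
            flags[i] = True
    return flags


if __name__ == "__main__":
    readings = [1.2, None, -5.0, 1.3, float("nan"), 900.0]
    print(rule_based_flags(readings, lower=0.0, upper=500.0))
    print(spike_flags([1.0, 1.1, 9.0, 1.2, 1.1], threshold=3.0))
```

Running the rule-based pass first and handing only the clean readings to the spike test mirrors the abstract's recommendation to combine methods, although the order and thresholds here are illustrative choices.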

arXiv.org e-Print Archive

Queensland University of Technology ePrints Archive

RMIT Research Repository

Entity Aware Modelling: A Survey

Authors: Ghosh Rahul, He Erhu, Jia Xiaowei, Khandelwal Ankush, Kumar Vipin, Renganathan Arvind, Sharma Somya, Yang Haoyu
Publication date: 16/02/2023

Personalized prediction of responses for individual entities caused by external drivers is vital across many disciplines. Recent machine learning (ML) advances have led to new state-of-the-art response prediction models. Models built at a population level often lead to sub-optimal performance in many personalized prediction settings due to heterogeneity in data across entities (tasks). In personalized prediction, the goal is to incorporate inherent characteristics of different entities to improve prediction performance. In this survey, we focus on the recent developments in the ML community for such entity-aware modeling approaches. ML algorithms often modulate the network using these entity characteristics when they are readily available. However, these entity characteristics are not readily available in many real-world scenarios, and different ML methods have been proposed to infer these characteristics from the data. In this survey, we have organized the current literature on entity-aware modeling based on the availability of these characteristics as well as the amount of training data. We highlight how recent innovations in other disciplines, such as uncertainty quantification, fairness, and knowledge-guided machine learning, can improve entity-aware modeling.

Comment: Submitted to IJCAI, Survey Trac

arXiv.org e-Print Archive

A Survey on Causal Discovery Methods for Temporal and Non-Temporal Data

Authors: Gani Md Osman, Hasan Uzma, Hossain Emam
Publication date: 27/03/2023

Causal Discovery (CD) is the process of identifying the cause-effect relationships among variables from data. Over the years, several methods have been developed, primarily based on the statistical properties of data, to uncover the underlying causal mechanism. In this study, we introduce the common terminology of causal discovery and provide a comprehensive discussion of the approaches designed to identify the causal edges in different settings. We further discuss some of the benchmark datasets available for evaluating the performance of causal discovery algorithms, the tools available for performing causal discovery, and the common metrics used to evaluate these methods. Finally, we conclude by presenting the common challenges involved in CD and discussing its applications in multiple areas of interest.

arXiv.org e-Print Archive

Discovering phase and causal dependencies on manufacturing processes

Author: Giovanni Menegozzo
Publication date: 01/01/2022

Discovering phase and causal dependencies on manufacturing processes. Keywords: machine learning, causality, Industry 4.

Catalogo dei prodotti della ricerca

A Time Series Analysis: Exploring the Link between Human Activity and Blood Glucose Fluctuation

Author: Sadowski Eric A.
Publication venue: Scholars Commons @ Laurier
Publication date: 01/01/2010

In this thesis, time series models are developed to explore the correlates of blood glucose (BG) fluctuation in diabetic patients. In particular, it is investigated whether certain human activities and lifestyle events (e.g. food and medication consumption, physical activity, travel and social interaction) influence BG, and if so, how. A unique dataset is utilized, consisting of 40 diabetic patients who participated in a 3-day study involving continuous monitoring of BG at five-minute intervals, combined with measures of sugar, carbohydrate, calorie and insulin intake; physical activity; distance from home; time spent traveling via public transit and private automobile; and time spent with other people, dining and shopping. Using a dynamic regression model fitted with autoregressive integrated moving average (ARIMA) components, the influence of independent predictive variables on BG levels is quantified, while the impact of unknown factors is captured by an error term. Models were developed for individual patients, with the overall findings demonstrating the potential of continuous monitoring for diabetic (DM) patients who are trying to control their BG. The significant predictors of BG identified by the models include food consumption, exogenous insulin administration and physical activity.
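For orientation, a dynamic regression with ARIMA errors is conventionally written as below. This is the generic textbook form, not the thesis's exact specification, which the abstract does not give. The symbols are assumptions introduced here: $y_t$ is the BG reading at time $t$, $x_{k,t}$ are the $K$ activity and intake predictors, $B$ is the backshift operator, and $\varepsilon_t$ is white noise.

```latex
\begin{align}
  y_t &= \beta_0 + \sum_{k=1}^{K} \beta_k \, x_{k,t} + \eta_t, \\
  \phi(B)\,(1 - B)^{d}\,\eta_t &= \theta(B)\,\varepsilon_t,
\end{align}
% \phi(B) = 1 - \phi_1 B - \dots - \phi_p B^p   (autoregressive polynomial)
% \theta(B) = 1 + \theta_1 B + \dots + \theta_q B^q   (moving-average polynomial)
```

In this form the coefficients $\beta_k$ quantify the influence of each predictor on BG, while the ARIMA$(p, d, q)$ error $\eta_t$ absorbs the autocorrelated effect of unobserved factors.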

Wilfrid Laurier University

Neural network-based parametric system identification: a review

Authors: Dong Aoxiang, Starr Andrew, Zhao Yifan
Publication venue: Taylor and Francis
Publication date: 02/08/2023

Parametric system identification, the process of uncovering the inherent dynamics of a system from a model built on observed input and output data, has been intensively studied in the past few decades. Recent years have seen a surge in the use of neural networks (NNs) in system identification, owing to their high approximation capability, lower reliance on prior knowledge, and the growth of computational power. However, there is a lack of reviews of neural network modelling in the paradigm of parametric system identification, particularly in the time domain. This article discusses the connection in principle between conventional parametric models and three types of NNs: Feedforward Neural Networks, Recurrent Neural Networks and Encoder-Decoder networks. It then reviews the advantages and limitations of related research in addressing two major challenges of parametric system identification: model interpretability and modelling with nonstationary realisations. Finally, new challenges and future trends in neural network-based parametric system identification are presented.

Cranfield CERES

Towards Dynamic Structure Changes Detection in Financial Series via Causal Analysis

Authors: Brun Armelle, Owusu Patrick, Wang Shengrui
Publication venue: Institute of Electrical and Electronics Engineers (IEEE)
Publication date: 07/12/2021

This is a preliminary paper describing the concepts and principles of a sequential approach to causal detection in a financial system represented by large-scale data. In particular, we focus on both regime-switching and causal discovery models. This addresses the problem of heterogeneous conditions when analysing nonlinear characteristics of financial markets, handling the dynamics of multiple regimes in a series and new data to obtain valid answers to causal queries of interest. The availability of large-scale time series data presents new opportunities in knowledge discovery, because the insight that can be gained from a causal perspective in a nonlinear system would be tremendous for asset allocation. However, large-scale series are prone to biases, including sampling selection. For decades, the study of nonlinear time series has been confined to statistical analysis, largely restricted to parametric models. Here we present an approach for handling a nonlinear system, infused with a causal solution, in a temporal mining task.

INRIA a CCSD electronic archive server