49 research outputs found

    Filling the G_ap_s: Multivariate Time Series Imputation by Graph Neural Networks

    Dealing with missing values and incomplete time series is a labor-intensive, tedious, inevitable task when handling data coming from real-world applications. Effective spatio-temporal representations would allow imputation methods to reconstruct missing temporal data by exploiting information coming from sensors at different locations. However, standard methods fall short in capturing the nonlinear time and space dependencies existing within networks of interconnected sensors and do not take full advantage of the available - and often strong - relational information. Notably, most state-of-the-art imputation methods based on deep learning do not explicitly model relational aspects and, in any case, do not exploit processing frameworks able to adequately represent structured spatio-temporal data. Conversely, graph neural networks have recently surged in popularity as both expressive and scalable tools for processing sequential data with relational inductive biases. In this work, we present the first assessment of graph neural networks in the context of multivariate time series imputation. In particular, we introduce a novel graph neural network architecture, named GRIN, which aims at reconstructing missing data in the different channels of a multivariate time series by learning spatio-temporal representations through message passing. Empirical results show that our model outperforms state-of-the-art methods in the imputation task on relevant real-world benchmarks with mean absolute error improvements often higher than 20%. Comment: Accepted at ICLR 202
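
The idea of reconstructing a missing reading from sensors at other locations can be illustrated with a very simple baseline. The sketch below is not GRIN and involves no learning: it fills each gap with a weighted mean of the neighbouring sensors' readings at the same time step, then scores the result with mean absolute error on held-out entries. The function names, the adjacency format and the toy data are all assumptions made for illustration.

```python
def neighbor_mean_impute(values, adjacency):
    """Fill missing readings (None) from graph neighbours at the same time step.

    values:    list over time steps of lists over sensor nodes.
    adjacency: dict mapping node index -> {neighbour index: edge weight}
               (hypothetical format chosen for this sketch).
    """
    imputed = []
    for row in values:
        new_row = list(row)
        for node, value in enumerate(row):
            if value is not None:
                continue
            weighted_sum = 0.0
            weight_total = 0.0
            for neighbor, weight in adjacency.get(node, {}).items():
                if row[neighbor] is not None:
                    weighted_sum += weight * row[neighbor]
                    weight_total += weight
            # Leave the gap in place when no neighbour was observed.
            if weight_total > 0:
                new_row[node] = weighted_sum / weight_total
        imputed.append(new_row)
    return imputed


def masked_mae(truth, imputed, held_out):
    """Mean absolute error over the (time, node) pairs that were hidden."""
    errors = [abs(truth[t][n] - imputed[t][n]) for t, n in held_out]
    return sum(errors) / len(errors)


# Toy example: three sensors on a line graph, middle sensor hidden.
truth = [[1.0, 2.0, 3.0], [4.0, 5.0, 6.0]]
observed = [[1.0, None, 3.0], [4.0, None, 6.0]]
line_graph = {0: {1: 1.0}, 1: {0: 1.0, 2: 1.0}, 2: {1: 1.0}}
filled = neighbor_mean_impute(observed, line_graph)
print(masked_mae(truth, filled, held_out=[(0, 1), (1, 1)]))
```

Scoring only the entries that were deliberately hidden is what makes imputation error measurable: the hidden values are known, so the reconstruction can be compared against them.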

    Networked Time Series Prediction with Incomplete Data

    A networked time series (NETS) is a family of time series on a given graph, one for each node. It has a wide range of applications, from intelligent transportation and environment monitoring to smart grid management. An important task in such applications is to predict the future values of a NETS based on its historical values and the underlying graph. Most existing methods require complete data for training. However, in real-world scenarios, it is not uncommon to have missing data due to sensor malfunction, incomplete sensing coverage, etc. In this paper, we study the problem of NETS prediction with incomplete data. We propose NETS-ImpGAN, a novel deep learning framework that can be trained on incomplete data with missing values in both history and future. Furthermore, we propose Graph Temporal Attention Networks, which incorporate the attention mechanism to capture both inter-time series and temporal correlations. We conduct extensive experiments on four real-world datasets under different missing patterns and missing rates. The experimental results show that NETS-ImpGAN outperforms existing methods, reducing the MAE by up to 25%.

    Indoor environment data time-series reconstruction using autoencoder neural networks

    As the number of installed meters in buildings increases, there is a growing number of data time-series that could be used to develop data-driven models to support and optimize building operation. However, building data sets are often characterized by errors and missing values, which recent research considers among the main limiting factors on the performance of the proposed models. Motivated by the need to address the problem of missing data in building operation, this work presents a data-driven approach to fill these gaps. In this study, three different autoencoder neural networks are trained to reconstruct missing short-term indoor environment data time-series in a data set collected in an office building in Aachen, Germany. The data set comes from a four-year monitoring campaign, between 2014 and 2017, covering 84 different rooms. The models are applicable to different time-series obtained from room automation, such as indoor air temperature, relative humidity and CO2 data streams. The results show that the proposed methods outperform classic numerical approaches, reconstructing the corresponding variables with average RMSEs of 0.42 °C, 1.30 % and 78.41 ppm, respectively. Comment: Accepted in Building and Environmen
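
For context on the "classic numerical approaches" and the RMSE figures mentioned above, here is a minimal sketch of one such baseline: linear interpolation across a gap, scored by RMSE on the hidden positions. The abstract does not say which numerical methods were compared, so linear interpolation is an assumption, as are the function names and the toy temperature values.

```python
import math


def linear_interpolate(series):
    """Fill None gaps linearly between the nearest observed neighbours.

    Gaps at either end of the series copy the nearest observed value.
    """
    observed = [i for i, v in enumerate(series) if v is not None]
    if not observed:
        raise ValueError("series has no observed values")
    filled = list(series)
    for i, value in enumerate(series):
        if value is not None:
            continue
        left = max((j for j in observed if j < i), default=None)
        right = min((j for j in observed if j > i), default=None)
        if left is None:
            filled[i] = series[right]
        elif right is None:
            filled[i] = series[left]
        else:
            fraction = (i - left) / (right - left)
            filled[i] = series[left] + fraction * (series[right] - series[left])
    return filled


def rmse(truth, estimate, indices):
    """Root-mean-square error over the listed positions only."""
    squared = [(truth[i] - estimate[i]) ** 2 for i in indices]
    return math.sqrt(sum(squared) / len(squared))


# Hypothetical indoor air temperatures in degrees Celsius, two readings hidden.
truth = [20.0, 20.4, 21.1, 21.5, 22.0]
gappy = [20.0, None, None, 21.5, 22.0]
filled = linear_interpolate(gappy)
print(rmse(truth, filled, indices=[1, 2]))
```

Because RMSE carries the unit of the measured variable, errors for temperature, humidity and CO2 are reported separately rather than combined into one number.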

    Data Consistency for Data-Driven Smart Energy Assessment

    In the smart grid era, the amount of data available for different applications has increased considerably. However, data may not perfectly represent the phenomenon or process under analysis, so their usability requires a preliminary validation carried out by experts of the specific domain. The process of data gathering and transmission over the communication channels has to be verified to ensure that data are provided in a useful format, and that no external effect has corrupted the data received. Consistency of the data coming from different sources (in terms of timings and data resolution) has to be ensured and managed appropriately. Suitable procedures are needed for transforming data into knowledge in an effective way. This contribution addresses the previous aspects by highlighting a number of potential issues and the solutions in place in different power and energy systems, including the generation, grid and user sides. Recent references, as well as selected historical references, are listed to support the illustration of the conceptual aspects.

    Missing Data Imputation with OLS-based Autoencoder for Intelligent Manufacturing

    Motivated by the global economy that is greatly shaped by the landscape changes in energy and manufacturing where more and more devices and systems are interconnected, intelligent manufacturing in which data mining is of great importance is studied. In this article, an energy monitoring platform for small- and medium-sized enterprises developed by the point energy team (www.pointenergy.org) is first introduced, which monitors and records the energy consumption of manufacturing processes at various levels of granularity. In processing the collected data, the incompleteness in the data due to various factors needs to be addressed first; otherwise, it may lead to an inaccurate portrayal of the system and poor generalization of the resultant model trained on the data. Hence, a novel orthogonal-least-square-based autoencoder is proposed to generate new samples for the imputation of missing values. This approach learns the representative code from the original samples by constructing an improved encoder network in which the hidden neurons are orthogonal to each other. The new samples are then generated through the decoder network. The proposed approach selects the hidden neurons one by one based on the OLS estimation until an adequate network is built. The classical techniques and other generative models are compared to verify the effectiveness of the proposed algorithm. For these methods, the optimal parameters are estimated based on the performance metric of the cross-validation mean square error. In the experiment, two real industrial datasets from a baking process and a polymer extrusion process are adopted and the percentage of missing values varies from 0.02 to 0.25. The experimental results confirm that the proposed method offers stable performance in the presence of different missing ratios, and it significantly outperforms alternative approaches when the missing ratio is greater than 0.05.

    A noise robust automatic radiolocation animal tracking system

    Agriculture is becoming increasingly reliant upon accurate data from sensor arrays, with localization an emerging application in the livestock industry. Ground-based time difference of arrival (TDoA) radio location methods have the advantage of being lightweight and exhibit higher energy efficiency than methods reliant upon Global Navigation Satellite Systems (GNSS). Such methods can employ small primary battery cells, rather than rechargeable cells, and still deliver a multi-year deployment. In this paper, we present a novel deep learning algorithm adapted from a one-dimensional U-Net implementing a convolutional neural network (CNN) model, originally developed for the task of semantic segmentation. The presented model (ResUnet-1d) both converts TDoA sequences directly to positions and reduces positional errors introduced by sources such as multipathing. We have evaluated the model using simulated animal movements in the form of TDoA position sequences in combination with real-world distributions of TDoA error. These animal tracks were simulated at various step intervals to mimic potential TDoA transmission intervals. We compare ResUnet-1d to a Kalman filter to evaluate the performance of our algorithm against a more traditional noise reduction approach. On average, for simulated tracks having added noise with a standard deviation of 50 m, the described approach was able to reduce localization error by between 66.3% and 73.6%. The Kalman filter only achieved a reduction of between 8.0% and 22.5%. For a scenario with larger added noise having a standard deviation of 100 m, the described approach was able to reduce average localization error by between 76.2% and 81.9%. The Kalman filter only achieved a reduction of between 31.0% and 39.1%. Results indicate that this novel 1D CNN U-Net-like encoder/decoder for TDoA location error correction outperforms the Kalman filter. It is able to reduce average localization errors to between 16 and 34 m across all simulated experimental treatments while the uncorrected average TDoA error ranged from 55 to 188 m.
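
As a rough picture of the Kalman filter baseline referred to above, the sketch below runs a scalar Kalman filter with a random-walk state model over one coordinate of a noisy track. The abstract does not describe the filter's state model or tuning, so the model, the parameter names and the toy measurements here are assumptions; a 2-D track would need either one such filter per coordinate or a joint state.

```python
def kalman_1d(measurements, process_var, meas_var, init_mean=0.0, init_var=1e9):
    """Scalar Kalman filter with a random-walk state model.

    process_var: variance of the assumed step-to-step movement.
    meas_var:    variance of the measurement noise.
    Returns the filtered estimate after each measurement.
    """
    mean, var = init_mean, init_var
    estimates = []
    for z in measurements:
        # Predict: the state may have drifted, so uncertainty grows.
        var = var + process_var
        # Update: blend prediction and measurement by their uncertainties.
        gain = var / (var + meas_var)
        mean = mean + gain * (z - mean)
        var = (1.0 - gain) * var
        estimates.append(mean)
    return estimates


def mean_abs_error(truth, estimates):
    """Average absolute difference between two equal-length sequences."""
    return sum(abs(t - e) for t, e in zip(truth, estimates)) / len(truth)


# Toy example: a stationary animal at x = 10 m with noisy position fixes.
true_x = [10.0] * 5
noisy_x = [10.0, 12.0, 8.0, 11.0, 9.0]
filtered_x = kalman_1d(noisy_x, process_var=0.0, meas_var=1.0)
print(mean_abs_error(true_x, noisy_x), mean_abs_error(true_x, filtered_x))
```

With `process_var` set to zero and a very large initial variance the filter behaves like a running mean, which suits a stationary target. A moving animal needs a non-zero `process_var`, and that trade-off between smoothing and responsiveness limits how much noise such a filter can remove.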

    Deep learning for the early detection of harmful algal blooms and improving water quality monitoring

    Climate change will affect how water sources are managed and monitored. The frequency of algal blooms will increase with climate change as it presents favourable conditions for the reproduction of phytoplankton. During monitoring, possible sensory failures in monitoring systems result in partially filled data which may affect critical systems. Therefore, imputation becomes necessary to decrease error and increase data quality. This work investigates two issues in water quality data analysis: improving data quality and anomaly detection. It consists of three main topics: data imputation, early algal bloom detection using in-situ data and early algal bloom detection using multiple modalities. The data imputation problem is addressed by experimenting with various methods with a water quality dataset that includes four locations around the North Sea and the Irish Sea with different characteristics and high miss rates, testing model generalisability. A novel neural network architecture with self-attention is proposed in which imputation is done in a single pass, reducing execution time. The self-attention components increase the interpretability of the imputation process at each stage of the network, providing knowledge to domain experts. After data curation, algal activity is predicted using transformer networks, between 1 and 7 days ahead, and the importance of the input with regard to the output of the prediction model is explained using SHAP, aiming to explain model behaviour to domain experts, which is overlooked in previous approaches. The prediction model improves bloom detection performance by 5% on average and the explanation summarizes the complex structure of the model to input-output relationships. Performance improvements on the initial unimodal bloom detection model are made by incorporating multiple modalities into the detection process which were only used for validation purposes previously. The problem of missing data is also tackled by using coordinated representations, replacing low quality in-situ data with satellite data and vice versa, instead of imputation which may result in biased results.

    Energy Data Analytics for Smart Meter Data

    The principal advantage of smart electricity meters is their ability to transfer digitized electricity consumption data to remote processing systems. The data collected by these devices make the realization of many novel use cases possible, providing benefits to electricity providers and customers alike. This book includes 14 research articles that explore and exploit the information content of smart meter data, and provides insights into the realization of new digital solutions and services that support the transition towards a sustainable energy system. This volume has been edited by Andreas Reinhardt, head of the Energy Informatics research group at Technische Universität Clausthal, Germany, and Lucas Pereira, research fellow at Técnico Lisboa, Portugal.