Filling the G_ap_s: Multivariate Time Series Imputation by Graph Neural Networks
Dealing with missing values and incomplete time series is a labor-intensive,
tedious, inevitable task when handling data coming from real-world
applications. Effective spatio-temporal representations would allow imputation
methods to reconstruct missing temporal data by exploiting information coming
from sensors at different locations. However, standard methods fall short in
capturing the nonlinear time and space dependencies existing within networks of
interconnected sensors and do not take full advantage of the available - and
often strong - relational information. Notably, most state-of-the-art
imputation methods based on deep learning do not explicitly model relational
aspects and, in any case, do not exploit processing frameworks able to
adequately represent structured spatio-temporal data. Conversely, graph neural
networks have recently surged in popularity as both expressive and scalable
tools for processing sequential data with relational inductive biases. In this
work, we present the first assessment of graph neural networks in the context
of multivariate time series imputation. In particular, we introduce a novel
graph neural network architecture, named GRIN, which aims at reconstructing
missing data in the different channels of a multivariate time series by
learning spatio-temporal representations through message passing. Empirical
results show that our model outperforms state-of-the-art methods in the
imputation task on relevant real-world benchmarks with mean absolute error
improvements often higher than 20%.
Comment: Accepted at ICLR 202
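As a rough illustration of the idea of reconstructing a sensor's missing reading from related sensors, the sketch below fills each gap with the adjacency-weighted mean of the neighbors that were observed at the same time step. This is a hand-written, non-learned, single-step stand-in and not the GRIN architecture; the function name `neighbor_impute`, the array layout, and the zero fallback are all assumptions made for the example.

```python
import numpy as np

def neighbor_impute(x, mask, adj):
    """Fill missing entries with the adjacency-weighted mean of observed neighbors.

    x:    (T, N) array of readings; values at missing positions are ignored
    mask: (T, N) boolean array, True where a reading was observed
    adj:  (N, N) non-negative weights, adj[i, j] = influence of sensor j on sensor i
    (All names and shapes here are illustrative assumptions.)
    """
    x = np.where(mask, x, 0.0)      # zero out missing entries so they contribute nothing
    m = mask.astype(float)
    num = x @ adj.T                 # weighted sum of observed neighbor values
    den = m @ adj.T                 # total weight of the observed neighbors
    est = np.divide(num, den, out=np.zeros_like(num), where=den > 0)
    # Observed values are kept; gaps with no observed neighbor fall back to 0.
    return np.where(mask, x, est)

# Three sensors on a chain 0 - 1 - 2; sensor 1 is missing at the only time step.
adj = np.array([[0.0, 1.0, 0.0],
                [1.0, 0.0, 1.0],
                [0.0, 1.0, 0.0]])
x = np.array([[1.0, np.nan, 3.0]])
mask = ~np.isnan(x)
filled = neighbor_impute(x, mask, adj)
```

A learned model would replace the fixed averaging with trainable message and update functions and would also propagate information along the time axis; the sketch only shows the spatial aggregation step.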
Missing Value Imputation for Multi-attribute Sensor Data Streams via Message Propagation (Extended Version)
Sensor data streams occur widely in various real-time applications in the
context of the Internet of Things (IoT). However, sensor data streams feature
missing values due to factors such as sensor failures, communication errors, or
depleted batteries. Missing values can compromise the quality of real-time
analytics tasks and downstream applications. Existing imputation methods either
make strong assumptions about streams or have low efficiency. In this study, we
aim to accurately and efficiently impute missing values in data streams that
satisfy only general characteristics in order to benefit real-time applications
more widely. First, we propose a message propagation imputation network (MPIN)
that is able to recover the missing values of data instances in a time window.
We give a theoretical analysis of why MPIN is effective. Second, we present a
continuous imputation framework that consists of data update and model update
mechanisms to enable MPIN to perform continuous imputation both effectively and
efficiently. Extensive experiments on multiple real datasets show that MPIN can
outperform the existing data imputers by wide margins and that the continuous
imputation framework is efficient and accurate.
Comment: Accepted at VLDB 202
STING: Self-attention based Time-series Imputation Networks using GAN
Time series data are ubiquitous in real-world applications. However, one of
the most common problems is that time series data can have missing values
owing to the inherent nature of the data collection process. Imputing missing
values from multivariate (correlated) time series data is therefore imperative
for improving prediction performance and making accurate data-driven decisions.
Conventional imputation approaches simply delete missing values or fill them
with the mean or zero. Although recent works based on deep neural networks have
shown remarkable results, they are still limited in capturing the complex
generation process of multivariate time series. In this paper, we propose a
novel imputation method for multivariate time series data, called STING
(Self-attention based Time-series Imputation Networks using GAN). We take
advantage of generative adversarial networks and bidirectional recurrent neural
networks to learn latent representations of the time series. In addition, we
introduce a novel attention mechanism to capture the weighted correlations of
the whole sequence and avoid potential bias brought by unrelated ones.
Experimental results on three real-world datasets demonstrate that STING
outperforms the existing state-of-the-art methods in terms of imputation
accuracy as well as downstream tasks with the imputed values therein.
Comment: 10 pages. This paper is an accepted version by ICDM'21. The published
version is https://ieeexplore.ieee.org/abstract/document/967918
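For reference, the conventional baselines mentioned in the abstract above (filling with zero or with the mean) can be sketched in a few lines. This is a minimal example assuming missing values are encoded as NaN in a (time, channel) array; the helper names `fill_zero` and `fill_mean` are hypothetical and not taken from the paper.

```python
import numpy as np

def fill_zero(x):
    """Replace every NaN with 0."""
    return np.where(np.isnan(x), 0.0, x)

def fill_mean(x):
    """Replace every NaN with the mean of the observed values in its column.

    Assumes each column (channel) has at least one observed value;
    an all-NaN column would stay NaN.
    """
    col_mean = np.nanmean(x, axis=0)          # per-channel mean, ignoring NaN
    return np.where(np.isnan(x), col_mean, x)

# Two time steps, two channels; channel 1 is missing at the first step.
series = np.array([[1.0, np.nan],
                   [3.0, 4.0]])
zero_filled = fill_zero(series)
mean_filled = fill_mean(series)
```

Both baselines ignore temporal order and cross-channel correlation, which is the gap that learned imputation methods aim to close.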
Learning to Reconstruct Missing Data from Spatiotemporal Graphs with Sparse Observations
Modeling multivariate time series as temporal signals over a (possibly
dynamic) graph is an effective representational framework that allows for
developing models for time series analysis. In fact, discrete sequences of
graphs can be processed by autoregressive graph neural networks to recursively
learn representations at each discrete point in time and space. Spatiotemporal
graphs are often highly sparse, with time series characterized by multiple,
concurrent, and long sequences of missing data, e.g., due to the unreliable
underlying sensor network. In this context, autoregressive models can be
brittle and exhibit unstable learning dynamics. The objective of this paper is,
then, to tackle the problem of learning effective models to reconstruct, i.e.,
impute, missing data points by conditioning the reconstruction only on the
available observations. In particular, we propose a novel class of
attention-based architectures that, given a set of highly sparse discrete
observations, learn a representation for points in time and space by exploiting
a spatiotemporal propagation architecture aligned with the imputation task.
Representations are trained end-to-end to reconstruct observations w.r.t. the
corresponding sensor and its neighboring nodes. Compared to the state of the
art, our model handles sparse data without propagating prediction errors or
requiring a bidirectional model to encode forward and backward time
dependencies. Empirical results on representative benchmarks show the
effectiveness of the proposed method.
Comment: Accepted at NeurIPS 202
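The idea of conditioning a reconstruction only on the available observations can be illustrated with a masked attention step over a single series: unobserved positions are hidden from the softmax, so every estimate is a weighted average of observed values alone. The sketch below is not the architecture proposed in the paper; the distance-based scores, the `scale` parameter, and the function name `masked_attention_impute` are assumptions chosen to keep the example small and free of learned parameters.

```python
import numpy as np

def masked_attention_impute(values, mask, timestamps, scale=1.0):
    """Estimate missing steps as an attention-weighted average of observed steps.

    values:     (T,) readings; entries at missing positions are ignored
    mask:       (T,) boolean, True where a reading was observed
    timestamps: (T,) time of each step
    scale:      hypothetical temperature; larger values spread the weights out
    """
    values = np.asarray(values, dtype=float)
    mask = np.asarray(mask, dtype=bool)
    timestamps = np.asarray(timestamps, dtype=float)
    if not mask.any():
        raise ValueError("at least one observation is required")

    values = np.where(mask, values, 0.0)
    # Scores: closer time steps get higher scores (an illustrative choice).
    diff = timestamps[:, None] - timestamps[None, :]
    scores = -(diff ** 2) / scale
    # Hide unobserved keys so they receive zero attention weight.
    scores = np.where(mask[None, :], scores, -np.inf)
    scores = scores - scores.max(axis=1, keepdims=True)   # numerical stability
    weights = np.exp(scores)
    weights = weights / weights.sum(axis=1, keepdims=True)
    estimates = weights @ values
    # Observed values are kept as they are; only gaps take the estimate.
    return np.where(mask, values, estimates)

readings = np.array([0.0, np.nan, 2.0])
observed = ~np.isnan(readings)
reconstructed = masked_attention_impute(readings, observed, [0.0, 1.0, 2.0])
```

Because the estimate for a gap never depends on another estimate, errors are not propagated from one imputed step to the next, which is the property the abstract contrasts with autoregressive models.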