Temporal Tensor Transformation Network for Multivariate Time Series Prediction
Multivariate time series prediction has applications in a wide variety of
domains and is considered to be a very challenging task, especially when the
variables have correlations and exhibit complex temporal patterns, such as
seasonality and trend. Many existing methods suffer from strong statistical
assumptions, numerical issues with high dimensionality, manual feature
engineering efforts, and poor scalability. In this work, we present a novel
deep learning architecture, known as the Temporal Tensor Transformation
Network, which transforms the original multivariate time series into a
higher-order tensor through the proposed Temporal-Slicing Stack
Transformation. This yields a new
representation of the original multivariate time series, which enables the
convolution kernel to extract complex and non-linear features as well as
variable interaction signals from a relatively large temporal region.
Experimental results show that Temporal Tensor Transformation Network
outperforms several state-of-the-art methods on window-based predictions across
various tasks. The proposed architecture also demonstrates robust prediction
performance through an extensive sensitivity analysis.
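The slicing step described above can be sketched as follows; this is an illustrative reconstruction (the function name and stride parameter are our own, not the paper's notation): overlapping temporal windows of a variables × time matrix are stacked into a third-order tensor, over which a convolution kernel can then span a large temporal region.

```python
import numpy as np

def temporal_slicing_stack(series, window, stride=1):
    # Stack overlapping temporal slices of a (variables x time) matrix
    # into a (variables x window x num_slices) tensor. Function name
    # and stride parameter are illustrative, not the paper's notation.
    n_vars, n_steps = series.shape
    starts = range(0, n_steps - window + 1, stride)
    slices = [series[:, s:s + window] for s in starts]
    return np.stack(slices, axis=-1)

X = np.random.randn(4, 32)                # 4 variables, 32 time steps
T = temporal_slicing_stack(X, window=8, stride=4)
print(T.shape)                            # (4, 8, 7)
```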
Temporal Attention augmented Bilinear Network for Financial Time-Series Data Analysis
Financial time-series forecasting has long been a challenging problem because
of the inherently noisy and stochastic nature of the market. In
High-Frequency Trading (HFT), forecasting for trading purposes is an even more
challenging task, since an automated inference system is required to be both
accurate and fast. In this paper, we propose a neural network layer
architecture that incorporates the idea of bilinear projection as well as an
attention mechanism that enables the layer to detect and focus on crucial
temporal information. The resulting network is highly interpretable, given its
ability to highlight the importance and contribution of each temporal instance,
thus allowing further analysis of the time instances of interest. Our
experiments in a large-scale Limit Order Book (LOB) dataset show that a
two-hidden-layer network utilizing our proposed layer outperforms by a large
margin all existing state-of-the-art results coming from much deeper
architectures while requiring far fewer computations. Comment: 12 pages, 4 figures, 3 tables.
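A minimal sketch of combining a bilinear projection with temporal attention (the paper's exact layer definition differs; all weight names here are illustrative assumptions):

```python
import numpy as np

def softmax(z, axis=-1):
    e = np.exp(z - z.max(axis=axis, keepdims=True))
    return e / e.sum(axis=axis, keepdims=True)

def temporal_attention_bilinear(X, W1, W, W2, lam=0.5):
    # Illustrative sketch, not the paper's exact layer: W1 mixes
    # features, W scores temporal importance, W2 mixes time.
    Xb = W1 @ X                           # (D2, T) feature projection
    A = softmax(Xb @ W, axis=1)           # attention weights over time
    Xa = lam * Xb + (1 - lam) * Xb * A    # blend attended activations
    return Xa @ W2                        # (D2, T2) temporal projection

rng = np.random.default_rng(0)
X = rng.standard_normal((5, 10))          # 5 features, 10 time steps
Y = temporal_attention_bilinear(X, rng.standard_normal((3, 5)),
                                rng.standard_normal((10, 10)),
                                rng.standard_normal((10, 4)))
print(Y.shape)                            # (3, 4)
```

The attention map A is what makes the layer interpretable: its entries indicate which temporal instances the layer focuses on.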
Machine Learning for Spatiotemporal Sequence Forecasting: A Survey
Spatiotemporal systems are common in the real world. Forecasting the
multi-step future of these spatiotemporal systems based on past observations,
or Spatiotemporal Sequence Forecasting (STSF), is a significant and
challenging problem. Although many real-world problems can be viewed as STSF
and many research works have proposed machine learning based methods for them,
no existing work has summarized and compared these methods from a unified
perspective. This survey aims to provide a systematic review of machine
learning for STSF. In this survey, we define the STSF problem and classify it
into three subcategories: Trajectory Forecasting of Moving Point Cloud
(TF-MPC), STSF on Regular Grid (STSF-RG) and STSF on Irregular Grid (STSF-IG).
We then introduce the two major challenges of STSF: 1) how to learn a model for
multi-step forecasting and 2) how to adequately model the spatial and temporal
structures. After that, we review the existing works for solving these
challenges, including the general learning strategies for multi-step
forecasting, the classical machine learning based methods for STSF, and the
deep learning based methods for STSF. We also compare these methods and point
out some potential research directions.
Comparison of Deep Neural Networks and Deep Hierarchical Models for Spatio-Temporal Data
Spatio-temporal data are ubiquitous in the agricultural, ecological, and
environmental sciences, and their study is important for understanding and
predicting a wide variety of processes. One of the difficulties with modeling
spatial processes that change in time is the complexity of the dependence
structures that must describe how such a process varies, and the presence of
high-dimensional complex data sets and large prediction domains. It is
particularly challenging to specify parameterizations for nonlinear dynamic
spatio-temporal models (DSTMs) that are simultaneously useful scientifically
and efficient computationally. Statisticians have developed deep hierarchical
models that can accommodate process complexity as well as the uncertainties in
the predictions and inference. However, these models can be expensive and are
typically application specific. On the other hand, the machine learning
community has developed alternative "deep learning" approaches for nonlinear
spatio-temporal modeling. These models are flexible yet are typically not
implemented in a probabilistic framework. The two paradigms have many things in
common and suggest hybrid approaches that can benefit from elements of each
framework. This overview paper presents a brief introduction to the deep
hierarchical DSTM (DH-DSTM) framework, and deep models in machine learning,
culminating with the deep neural DSTM (DN-DSTM). Recent approaches that combine
elements from DH-DSTMs and echo state network DN-DSTMs are presented as
illustrations. Comment: 26 pages, including 6 figures and references.
Transform-Based Multilinear Dynamical System for Tensor Time Series Analysis
We propose a novel multilinear dynamical system (MLDS) in a transform domain,
named L-MLDS, to model tensor time series. With transformations applied to the
tensor data, the latent multidimensional correlations among the frontal slices
are built, resulting in computational independence in the transform domain.
This allows the exact separability of the multi-dimensional problem into
multiple smaller LDS problems. To estimate the system parameters, we utilize
the expectation-maximization (EM) algorithm to determine the parameters of
each LDS. Further, L-MLDSs significantly reduce the number of model parameters
and allow parallel processing. Our general L-MLDS model is implemented with
different transforms: the discrete Fourier transform, the discrete cosine
transform, and the discrete wavelet transform. Due to the nonlinearity of
these transformations, L-MLDS is able to capture nonlinear correlations within
the data, unlike the MLDS \cite{rogers2013multilinear}, which assumes
multi-way linear correlations. On four real datasets, the proposed L-MLDS is
shown to achieve much higher prediction accuracy than the state-of-the-art
MLDS and LDS with an equal number of parameters under different noise models,
substantially reducing the relative errors. Simultaneously, L-MLDS achieves an
exponential improvement in training time over MLDS.
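The transform-domain separation can be illustrated with a DFT along the third mode (a sketch of the decoupling step only, under our own naming; EM fitting of each small LDS is omitted):

```python
import numpy as np

def to_transform_domain(tensor):
    # DFT along the third mode: frontal slices of the transformed
    # tensor decouple, so each can be fit by its own small LDS.
    return np.fft.fft(tensor, axis=2)

def from_transform_domain(tensor_hat):
    # Inverse DFT recovers the original (real-valued) tensor.
    return np.real(np.fft.ifft(tensor_hat, axis=2))

T = np.random.randn(6, 5, 8)              # one tensor-valued observation
T_hat = to_transform_domain(T)
subproblems = [T_hat[:, :, k] for k in range(T_hat.shape[2])]
print(len(subproblems))                   # 8 independent LDS subproblems
print(np.allclose(T, from_transform_domain(T_hat)))  # True
```

Because the subproblems are independent in the transform domain, they can be estimated in parallel, which is the source of the reported training-time improvement.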
Driving with Data: Modeling and Forecasting Vehicle Fleet Maintenance in Detroit
The City of Detroit maintains an active fleet of over 2500 vehicles, spending
an annual average of over $5 million on new vehicle purchases and over $7.7
million on maintaining this fleet. Understanding the existence of patterns and
trends in this data could be useful to a variety of stakeholders, particularly
as Detroit emerges from Chapter 9 bankruptcy, but the patterns in such data are
often complex and multivariate and the city lacks dedicated resources for
detailed analysis of this data. This work, a data collaboration between the
Michigan Data Science Team (http://midas.umich.edu/mdst) and the City of
Detroit's Operations and Infrastructure Group, seeks to address this unmet need
by analyzing data from the City of Detroit's entire vehicle fleet from
2010-2017. We utilize tensor decomposition techniques to discover and visualize
unique temporal patterns in vehicle maintenance; apply differential sequence
mining to demonstrate the existence of common and statistically unique
maintenance sequences by vehicle make and model; and, after showing these
time-dependencies in the dataset, demonstrate an application of a predictive
Long Short Term Memory (LSTM) neural network model to predict maintenance
sequences. Our analysis shows both the complexities of municipal vehicle fleet
data and useful techniques for mining and modeling such data. Comment: Presented at the Data For Good Exchange 2017.
Low-Rank Autoregressive Tensor Completion for Multivariate Time Series Forecasting
Time series prediction has been a long-standing research topic and an
essential application in many domains. Modern time series collected from sensor
networks (e.g., energy consumption and traffic flow) are often large-scale and
incomplete with considerable corruption and missing values, making it difficult
to perform accurate predictions. In this paper, we propose a low-rank
autoregressive tensor completion (LATC) framework to model multivariate time
series data. The key of LATC is to transform the original multivariate time
series matrix (e.g., sensor × time point) into a third-order tensor structure
(e.g., sensor × time of day × day) by introducing an additional temporal
dimension, which allows us to model the inherent rhythms and seasonality of
time series as global patterns. With the tensor structure,
we can transform the time series prediction and missing data imputation
problems into a universal low-rank tensor completion problem. Besides
minimizing tensor rank, we also integrate a novel autoregressive norm on the
original matrix representation into the objective function. The two components
serve different roles. The low-rank structure allows us to effectively capture
the global consistency and trends across all the three dimensions (i.e.,
similarity among sensors, similarity of different days, and the current time
vs. the same time on historical days). The autoregressive norm can better
model the local temporal trends. Our numerical experiments on three
real-world data sets
demonstrate the superiority of the integration of global and local trends in
LATC in both missing data imputation and rolling prediction tasks.
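The matrix-to-tensor folding underlying LATC can be sketched as follows (assuming the horizon covers a whole number of days; the function name is ours, not the paper's):

```python
import numpy as np

def matrix_to_tensor(M, day_len):
    # Fold a (sensor x time) matrix into a (sensor x time-of-day x day)
    # tensor; assumes the horizon covers a whole number of days.
    n_sensors, n_steps = M.shape
    assert n_steps % day_len == 0
    return M.reshape(n_sensors, n_steps // day_len, day_len).transpose(0, 2, 1)

M = np.arange(12).reshape(2, 6)           # 2 sensors, 2 days x 3 steps
T = matrix_to_tensor(M, day_len=3)
print(T.shape)                            # (2, 3, 2)
print(T[0, :, 1])                         # sensor 0, day 1 -> [3 4 5]
```

Low-rank completion on the folded tensor then exploits similarity across the day axis as well as across sensors.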
Tensor Regression Meets Gaussian Processes
Low-rank tensor regression, a new model class that learns high-order
correlation from data, has recently received considerable attention. At the
same time, Gaussian processes (GP) are well-studied machine learning models for
structure learning. In this paper, we demonstrate interesting connections
between the two, especially for multi-way data analysis. We show that low-rank
tensor regression is essentially learning a multi-linear kernel in Gaussian
processes, and the low-rank assumption translates to the constrained Bayesian
inference problem. We prove the oracle inequality and derive the average case
learning curve for the equivalent GP model. Our finding implies that low-rank
tensor regression, though empirically successful, is highly dependent on the
eigenvalues of covariance functions as well as variable correlations. Comment: 17 pages.
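The multi-linear kernel correspondence can be illustrated with a Kronecker-structured covariance over a two-mode grid (a sketch under our own naming, not the paper's construction):

```python
import numpy as np

def rbf_gram(x, ls=1.0):
    # Squared-exponential Gram matrix over 1-D inputs.
    d = x[:, None] - x[None, :]
    return np.exp(-0.5 * (d / ls) ** 2)

# For inputs lying on a grid of two modes, a multi-linear kernel makes
# the GP covariance factorize as a Kronecker product of per-mode Grams.
K1 = rbf_gram(np.linspace(0, 1, 3))       # mode-1 covariance, 3 points
K2 = rbf_gram(np.linspace(0, 1, 4))       # mode-2 covariance, 4 points
K = np.kron(K1, K2)                       # covariance over the 3 x 4 grid
print(K.shape)                            # (12, 12)
```

The eigenvalues of K are products of the per-mode eigenvalues, which is why the learning behavior depends so strongly on the covariance spectra.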
Continuous-Time Relationship Prediction in Dynamic Heterogeneous Information Networks
Online social networks, World Wide Web, media and technological networks, and
other types of so-called information networks are ubiquitous nowadays. These
information networks are inherently heterogeneous and dynamic. They are
heterogeneous as they consist of multi-typed objects and relations, and they
are dynamic as they are constantly evolving over time. One of the challenging
issues in such heterogeneous and dynamic environments is to forecast those
relationships in the network that will appear in the future. In this paper, we
try to solve the problem of continuous-time relationship prediction in dynamic
and heterogeneous information networks. This implies predicting the time it
takes for a relationship to appear in the future, given its features that have
been extracted by considering both heterogeneity and temporal dynamics of the
underlying network. To this end, we first introduce a feature extraction
framework that combines the power of meta-path-based modeling and recurrent
neural networks to effectively extract features suitable for relationship
prediction regarding heterogeneity and dynamicity of the networks. Next, we
propose a supervised non-parametric approach, called Non-Parametric Generalized
Linear Model (NP-GLM), which infers the hidden underlying probability
distribution of the relationship building time given its features. We then
present a learning algorithm to train NP-GLM and an inference method to answer
time-related queries. Extensive experiments conducted on synthetic data and
three real-world datasets, namely Delicious, MovieLens, and DBLP, demonstrate
the effectiveness of NP-GLM in solving the continuous-time relationship
prediction problem vis-à-vis competitive baselines. Comment: To appear in ACM
Transactions on Knowledge Discovery from Data.
Semi-supervised learning for structured regression on partially observed attributed graphs
Conditional probabilistic graphical models provide a powerful framework for
structured regression in spatio-temporal datasets with complex correlation
patterns. However, in real-life applications a large fraction of observations
is often missing, which can severely limit the representational power of these
models. In this paper we propose a Marginalized Gaussian Conditional Random
Fields (m-GCRF) structured regression model for dealing with missing labels in
partially observed temporal attributed graphs. This method is aimed at learning
with both labeled and unlabeled parts and effectively predicting future values
in a graph. The method is even capable of learning from nodes for which the
response variable is never observed in history, which poses problems for many
state-of-the-art models that can handle missing data. The proposed model is
characterized under various missingness mechanisms on 500 synthetic graphs. The
benefits of the new method are also demonstrated on a challenging application
for predicting precipitation based on partial observations of climate variables
in a temporal graph that spans the entire continental US. We also show that the
method can be useful for optimizing the costs of data collection in climate
applications via active reduction of the number of weather stations to
consider. In experiments on these real-world and synthetic datasets we show
that the proposed model is consistently more accurate than alternative
semi-supervised structured models, as well as models that either use imputation
to deal with missing values or simply ignore them altogether. Comment:
Proceedings of the 2015 SIAM International Conference on Data Mining
(SDM 2015), Vancouver, Canada, April 30 - May 02, 2015.