
    Temporal Tensor Transformation Network for Multivariate Time Series Prediction

    Multivariate time series prediction has applications in a wide variety of domains and is considered a very challenging task, especially when the variables are correlated and exhibit complex temporal patterns such as seasonality and trend. Many existing methods suffer from strong statistical assumptions, numerical issues with high dimensionality, manual feature-engineering effort, and poor scalability. In this work, we present a novel deep learning architecture, the Temporal Tensor Transformation Network, which transforms the original multivariate time series into a higher-order tensor through the proposed Temporal-Slicing Stack Transformation. This yields a new representation of the original series that enables convolution kernels to extract complex, non-linear features as well as variable-interaction signals from a relatively large temporal region. Experimental results show that the Temporal Tensor Transformation Network outperforms several state-of-the-art methods on window-based prediction across various tasks. The proposed architecture also demonstrates robust prediction performance in an extensive sensitivity analysis.
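
    The abstract does not spell out the transformation's exact indexing, but a natural reading is that overlapping temporal slices of the series are stacked along a new axis, turning a variables-by-time matrix into a third-order tensor. A minimal sketch of that reading, with the function name and the stride/slice parameters assumed:

        import numpy as np

        def temporal_slicing_stack(X, slice_len, stride=1):
            # Stack overlapping temporal slices of a multivariate series
            # into a third-order tensor (one plausible reading of the
            # Temporal-Slicing Stack Transformation; details assumed).
            n_vars, T = X.shape
            starts = range(0, T - slice_len + 1, stride)
            return np.stack([X[:, s:s + slice_len] for s in starts], axis=0)

        # 4 variables, 100 steps -> (91, 4, 10): a representation on which
        # convolution kernels can see cross-variable, cross-slice patterns.
        tensor = temporal_slicing_stack(np.random.randn(4, 100), slice_len=10)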

    Temporal Attention augmented Bilinear Network for Financial Time-Series Data Analysis

    Financial time-series forecasting has long been a challenging problem because of the inherently noisy and stochastic nature of the market. In High-Frequency Trading (HFT), forecasting for trading purposes is an even more challenging task, since an automated inference system must be both accurate and fast. In this paper, we propose a neural network layer architecture that incorporates the idea of bilinear projection as well as an attention mechanism that enables the layer to detect and focus on crucial temporal information. The resulting network is highly interpretable, given its ability to highlight the importance and contribution of each temporal instance, thus allowing further analysis of the time instances of interest. Our experiments on a large-scale Limit Order Book (LOB) dataset show that a two-hidden-layer network utilizing our proposed layer outperforms, by a large margin, all existing state-of-the-art results from much deeper architectures, while requiring far fewer computations.
    Comment: 12 pages, 4 figures, 3 tables
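
    As a simplified sketch of the two ingredients named here, bilinear projection plus attention over temporal instances, consider the following layer. This is an illustrative reduction, not the paper's exact formulation; all names and shapes are assumptions:

        import numpy as np

        def attention_bilinear(X, W1, W2, w):
            # X: (D, T) input window; W1: (D', D); W2: (T, T'); w: (D',).
            Z = W1 @ X                              # project the feature mode
            scores = w @ Z                          # one score per time step
            alpha = np.exp(scores - scores.max())
            alpha /= alpha.sum()                    # softmax over time
            # alpha makes the layer interpretable: it says which temporal
            # instances the output actually depends on.
            return (Z * alpha) @ W2                 # weight, then project time

        rng = np.random.default_rng(0)
        D, T, Dp, Tp = 40, 10, 16, 3
        out = attention_bilinear(rng.standard_normal((D, T)),
                                 rng.standard_normal((Dp, D)),
                                 rng.standard_normal((T, Tp)),
                                 rng.standard_normal(Dp))
        print(out.shape)  # (16, 3)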

    Machine Learning for Spatiotemporal Sequence Forecasting: A Survey

    Spatiotemporal systems are common in the real world. Forecasting the multi-step future of these systems based on past observations, i.e., Spatiotemporal Sequence Forecasting (STSF), is a significant and challenging problem. Although many real-world problems can be viewed as STSF and many research works have proposed machine learning based methods for them, no existing work has summarized and compared these methods from a unified perspective. This survey aims to provide a systematic review of machine learning for STSF. We define the STSF problem and classify it into three subcategories: Trajectory Forecasting of Moving Point Cloud (TF-MPC), STSF on Regular Grid (STSF-RG), and STSF on Irregular Grid (STSF-IG). We then introduce the two major challenges of STSF: 1) how to learn a model for multi-step forecasting and 2) how to adequately model the spatial and temporal structures. After that, we review existing works addressing these challenges, including general learning strategies for multi-step forecasting, classical machine learning based methods for STSF, and deep learning based methods for STSF. We also compare these methods and point out some potential research directions.
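
    Of the general multi-step learning strategies such surveys contrast, the two most common are the iterated (recursive) strategy and the direct strategy. A minimal sketch of the iterated one, with the one-step model left as an assumed black-box callable:

        import numpy as np

        def iterated_forecast(one_step_model, history, horizon):
            # Recursive multi-step strategy: feed each one-step prediction
            # back in as input. One model, but errors can compound.
            window, preds = list(history), []
            for _ in range(horizon):
                y = one_step_model(np.asarray(window))
                preds.append(y)
                window = window[1:] + [y]   # slide the window forward
            return np.asarray(preds)

        # The direct strategy instead trains a separate model per horizon h,
        # avoiding error feedback at the cost of training `horizon` models.
        mean_model = lambda w: w.mean()     # toy stand-in for a trained model
        print(iterated_forecast(mean_model, [1.0, 2.0, 3.0], horizon=4))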

    Comparison of Deep Neural Networks and Deep Hierarchical Models for Spatio-Temporal Data

    Spatio-temporal data are ubiquitous in the agricultural, ecological, and environmental sciences, and their study is important for understanding and predicting a wide variety of processes. One of the difficulties with modeling spatial processes that change in time is the complexity of the dependence structures that must describe how such a process varies, along with the presence of high-dimensional complex data sets and large prediction domains. It is particularly challenging to specify parameterizations for nonlinear dynamic spatio-temporal models (DSTMs) that are simultaneously useful scientifically and efficient computationally. Statisticians have developed deep hierarchical models that can accommodate process complexity as well as the uncertainties in the predictions and inference. However, these models can be expensive and are typically application specific. On the other hand, the machine learning community has developed alternative "deep learning" approaches for nonlinear spatio-temporal modeling. These models are flexible yet are typically not implemented in a probabilistic framework. The two paradigms have much in common and suggest hybrid approaches that can benefit from elements of each framework. This overview paper presents a brief introduction to the deep hierarchical DSTM (DH-DSTM) framework and to deep models in machine learning, culminating in the deep neural DSTM (DN-DSTM). Recent approaches that combine elements from DH-DSTMs and echo state network DN-DSTMs are presented as illustrations.
    Comment: 26 pages, including 6 figures and references
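
    Since echo state networks are the highlighted DN-DSTM ingredient, here is a minimal sketch of one: a fixed random reservoir with only a ridge-regression readout trained. All sizes and scalings are assumptions, not values from the paper:

        import numpy as np

        def esn_fit(U, Y, n_res=200, rho=0.9, ridge=1e-6, seed=0):
            # U: (T, d) inputs; Y: (T, k) targets.
            rng = np.random.default_rng(seed)
            W_in = rng.uniform(-0.5, 0.5, (n_res, U.shape[1]))
            W = rng.standard_normal((n_res, n_res))
            W *= rho / np.max(np.abs(np.linalg.eigvals(W)))  # spectral radius
            h, H = np.zeros(n_res), []
            for u in U:                                      # run the reservoir
                h = np.tanh(W_in @ u + W @ h)
                H.append(h)
            H = np.asarray(H)
            # ridge-regression readout: the only trained component
            W_out = np.linalg.solve(H.T @ H + ridge * np.eye(n_res), H.T @ Y)
            return H @ W_out  # in-sample one-step predictions

    The appeal in the hybrid DSTM setting is that the reservoir supplies cheap nonlinear dynamics while the linear readout remains easy to embed in a probabilistic (hierarchical) model.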

    Transform-Based Multilinear Dynamical System for Tensor Time Series Analysis

    We propose a novel multilinear dynamical system (MLDS) in a transform domain, named $\mathcal{L}$-MLDS, to model tensor time series. With transformations applied to the tensor data, the latent multidimensional correlations among the frontal slices are captured, resulting in computational independence in the transform domain. This allows the exact separation of the multi-dimensional problem into multiple smaller LDS problems. To estimate the system parameters, we utilize the expectation-maximization (EM) algorithm to determine the parameters of each LDS. Further, $\mathcal{L}$-MLDS significantly reduces the number of model parameters and allows parallel processing. Our general $\mathcal{L}$-MLDS model is implemented with different transforms: the discrete Fourier transform, the discrete cosine transform, and the discrete wavelet transform. Due to the nonlinearity of these transformations, $\mathcal{L}$-MLDS is able to capture nonlinear correlations within the data, unlike the MLDS \cite{rogers2013multilinear}, which assumes multi-way linear correlations. On four real datasets, the proposed $\mathcal{L}$-MLDS is shown to achieve much higher prediction accuracy than the state-of-the-art MLDS and LDS with an equal number of parameters under different noise models. In particular, the relative errors are reduced by $50\% \sim 99\%$. Simultaneously, $\mathcal{L}$-MLDS achieves an exponential improvement in training time over MLDS.
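
    The separability described here is easy to sketch: transform along one mode, then fit an independent small dynamical model per frontal slice. Below, a DFT plus a least-squares AR(1) fit stands in for the paper's EM-fitted LDS; names and shapes are assumptions:

        import numpy as np

        def fit_ar1(S):
            # Least-squares AR(1) transition for one frontal slice; a cheap
            # stand-in for the EM-fitted LDS used in the paper.
            return S[:, 1:] @ np.linalg.pinv(S[:, :-1])

        def l_mlds_sketch(X):
            # X: (n1, n2, T) matrix-valued time series. After a DFT along
            # mode 0, frontal slices decouple, so each gets its own model
            # (and the per-slice fits could run in parallel).
            Xf = np.fft.fft(X, axis=0)
            return [fit_ar1(Xf[i]) for i in range(X.shape[0])]

        models = l_mlds_sketch(np.random.randn(8, 5, 50))
        print(len(models), models[0].shape)  # 8 independent (5, 5) transitions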

    Driving with Data: Modeling and Forecasting Vehicle Fleet Maintenance in Detroit

    The City of Detroit maintains an active fleet of over 2,500 vehicles, spending an annual average of over $5 million on new vehicle purchases and over $7.7 million on maintaining this fleet. Understanding the patterns and trends in this data could be useful to a variety of stakeholders, particularly as Detroit emerges from Chapter 9 bankruptcy, but the patterns in such data are often complex and multivariate, and the city lacks dedicated resources for detailed analysis. This work, a data collaboration between the Michigan Data Science Team (http://midas.umich.edu/mdst) and the City of Detroit's Operations and Infrastructure Group, seeks to address this unmet need by analyzing data from the City of Detroit's entire vehicle fleet from 2010-2017. We utilize tensor decomposition techniques to discover and visualize unique temporal patterns in vehicle maintenance; apply differential sequence mining to demonstrate the existence of common and statistically unique maintenance sequences by vehicle make and model; and, after showing these time dependencies in the dataset, demonstrate an application of a predictive Long Short-Term Memory (LSTM) neural network model to predict maintenance sequences. Our analysis shows both the complexities of municipal vehicle fleet data and useful techniques for mining and modeling such data.
    Comment: Presented at the Data For Good Exchange 2017
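
    The abstract does not name a specific decomposition, so as an assumed, generic stand-in, here is a plain CP decomposition via alternating least squares, which would factor, say, a (vehicle x maintenance-type x month) count tensor into interpretable temporal patterns:

        import numpy as np

        def khatri_rao(A, B):
            # Column-wise Kronecker product.
            return np.einsum('ir,jr->ijr', A, B).reshape(-1, A.shape[1])

        def cp_als(X, rank, n_iter=50, seed=0):
            # CP-ALS: cycle through the three factor matrices, solving a
            # linear least-squares problem for each in turn.
            I, J, K = X.shape
            rng = np.random.default_rng(seed)
            A, B, C = (rng.standard_normal((n, rank)) for n in (I, J, K))
            for _ in range(n_iter):
                A = X.reshape(I, -1) @ khatri_rao(B, C) \
                    @ np.linalg.pinv((B.T @ B) * (C.T @ C))
                B = X.transpose(1, 0, 2).reshape(J, -1) @ khatri_rao(A, C) \
                    @ np.linalg.pinv((A.T @ A) * (C.T @ C))
                C = X.transpose(2, 0, 1).reshape(K, -1) @ khatri_rao(A, B) \
                    @ np.linalg.pinv((A.T @ A) * (B.T @ B))
            return A, B, C  # e.g., vehicle, maintenance-type, month factors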

    Low-Rank Autoregressive Tensor Completion for Multivariate Time Series Forecasting

    Time series prediction has been a long-standing research topic and an essential application in many domains. Modern time series collected from sensor networks (e.g., energy consumption and traffic flow) are often large-scale and incomplete, with considerable corruption and missing values, making accurate prediction difficult. In this paper, we propose a low-rank autoregressive tensor completion (LATC) framework to model multivariate time series data. The key idea of LATC is to transform the original multivariate time series matrix (e.g., sensor $\times$ time point) into a third-order tensor (e.g., sensor $\times$ time of day $\times$ day) by introducing an additional temporal dimension, which allows us to model the inherent rhythms and seasonality of time series as global patterns. With the tensor structure, we can cast both the time series prediction and missing data imputation problems as a universal low-rank tensor completion problem. Besides minimizing tensor rank, we also integrate a novel autoregressive norm on the original matrix representation into the objective function. The two components serve different roles: the low-rank structure captures global consistency and trends across all three dimensions (i.e., similarity among sensors, similarity of different days, and the current time vs. the same time on historical days), while the autoregressive norm better models local temporal trends. Our numerical experiments on three real-world data sets demonstrate the superiority of integrating global and local trends in LATC for both missing data imputation and rolling prediction tasks.
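
    The matrix-to-tensor step is the easiest part to make concrete. A minimal sketch of the described reshaping, with the function name and the five-minute-interval example being assumptions:

        import numpy as np

        def matrix_to_tensor(M, steps_per_day):
            # Fold a (sensor x time) matrix into a
            # (sensor x time-of-day x day) third-order tensor.
            n_sensors, T = M.shape
            n_days = T // steps_per_day
            M = M[:, :n_days * steps_per_day]        # drop a partial last day
            return M.reshape(n_sensors, n_days, steps_per_day).transpose(0, 2, 1)

        # 30 sensors, 7 days of 288 five-minute intervals -> (30, 288, 7);
        # daily rhythms now line up along the third mode, so low tensor rank
        # can express "today looks like other days".
        print(matrix_to_tensor(np.random.randn(30, 7 * 288), 288).shape)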

    Tensor Regression Meets Gaussian Processes

    Low-rank tensor regression, a new model class that learns high-order correlations from data, has recently received considerable attention. At the same time, Gaussian processes (GP) are well-studied machine learning models for structure learning. In this paper, we demonstrate interesting connections between the two, especially for multi-way data analysis. We show that low-rank tensor regression is essentially learning a multi-linear kernel in a Gaussian process, and the low-rank assumption translates into a constrained Bayesian inference problem. We prove an oracle inequality and derive the average-case learning curve for the equivalent GP model. Our finding implies that low-rank tensor regression, though empirically successful, is highly dependent on the eigenvalues of the covariance functions as well as the variable correlations.
    Comment: 17 pages
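
    To make the claimed correspondence concrete, one standard way to write it (our notation, not necessarily the paper's) pairs a rank-$R$ CP weight tensor with a Kronecker-structured, i.e., multi-linear, GP kernel:

        y = \langle \mathcal{W}, \mathcal{X} \rangle + \varepsilon,
        \qquad
        \mathcal{W} = \sum_{r=1}^{R} \mathbf{a}_r \circ \mathbf{b}_r \circ \mathbf{c}_r,

        f \sim \mathcal{GP}(0, k),
        \qquad
        k(\mathcal{X}, \mathcal{X}') = \operatorname{vec}(\mathcal{X})^{\top}
        \left( \Sigma_1 \otimes \Sigma_2 \otimes \Sigma_3 \right)
        \operatorname{vec}(\mathcal{X}')

    Under a Gaussian prior on the CP factors, the mode-wise covariances $\Sigma_1, \Sigma_2, \Sigma_3$ appear, which is plausibly where the stated sensitivity to covariance eigenvalues enters.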

    Continuous-Time Relationship Prediction in Dynamic Heterogeneous Information Networks

    Online social networks, the World Wide Web, media and technological networks, and other types of so-called information networks are ubiquitous nowadays. These information networks are inherently heterogeneous and dynamic: heterogeneous because they consist of multi-typed objects and relations, and dynamic because they are constantly evolving over time. One of the challenging issues in such heterogeneous and dynamic environments is to forecast the relationships in the network that will appear in the future. In this paper, we address the problem of continuous-time relationship prediction in dynamic and heterogeneous information networks. This means predicting the time it takes for a relationship to appear in the future, given features extracted by considering both the heterogeneity and the temporal dynamics of the underlying network. To this end, we first introduce a feature extraction framework that combines the power of meta-path-based modeling and recurrent neural networks to effectively extract features suitable for relationship prediction regarding the heterogeneity and dynamicity of the networks. Next, we propose a supervised non-parametric approach, called the Non-Parametric Generalized Linear Model (NP-GLM), which infers the hidden underlying probability distribution of the relationship building time given its features. We then present a learning algorithm to train NP-GLM and an inference method to answer time-related queries. Extensive experiments conducted on synthetic data and three real-world datasets, namely Delicious, MovieLens, and DBLP, demonstrate the effectiveness of NP-GLM in solving the continuous-time relationship prediction problem vis-a-vis competitive baselines.
    Comment: To appear in ACM Transactions on Knowledge Discovery from Data
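
    NP-GLM itself is non-parametric, but the "distribution of relationship-building time given features" idea is easiest to see in a parametric stand-in: an exponential waiting-time GLM with right-censoring. This is an illustrative analogue, not the paper's method; all names and the censoring threshold are assumptions:

        import numpy as np
        from scipy.optimize import minimize

        def neg_log_lik(w, X, t, observed):
            # Exponential waiting-time GLM: rate_i = exp(x_i . w).
            # Observed link formations contribute the log-density
            # log(rate) - rate*t; censored pairs contribute only the
            # log-survival term -rate*t.
            rate = np.exp(X @ w)
            return -(observed * np.log(rate) - rate * t).sum()

        rng = np.random.default_rng(1)
        X = rng.standard_normal((500, 4))
        t = rng.exponential(1.0, 500)
        observed = (t < 2.0).astype(float)        # right-censor at t = 2
        w_hat = minimize(neg_log_lik, np.zeros(4),
                         args=(X, np.minimum(t, 2.0), observed)).x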

    Semi-supervised learning for structured regression on partially observed attributed graphs

    Conditional probabilistic graphical models provide a powerful framework for structured regression in spatio-temporal datasets with complex correlation patterns. However, in real-life applications a large fraction of observations is often missing, which can severely limit the representational power of these models. In this paper we propose a Marginalized Gaussian Conditional Random Fields (m-GCRF) structured regression model for dealing with missing labels in partially observed temporal attributed graphs. The method is designed to learn from both the labeled and unlabeled parts of a graph and to effectively predict future values; it can even learn from nodes for which the response variable is never observed in history, which poses problems for many state-of-the-art models that handle missing data. The proposed model is characterized under various missingness mechanisms on 500 synthetic graphs. The benefits of the new method are also demonstrated on a challenging application: predicting precipitation from partial observations of climate variables in a temporal graph that spans the entire continental US. We also show that the method can be useful for optimizing the costs of data collection in climate applications via active reduction of the number of weather stations to consider. In experiments on these real-world and synthetic datasets we show that the proposed model is consistently more accurate than alternative semi-supervised structured models, as well as models that either use imputation to deal with missing values or simply ignore them altogether.
    Comment: Proceedings of the 2015 SIAM International Conference on Data Mining (SDM 2015), Vancouver, Canada, April 30 - May 2, 2015
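
    For context, the fully observed GCRF that m-GCRF extends has a closed-form posterior mean. A minimal sketch under the standard GCRF energy (our notation; the paper's m-GCRF additionally marginalizes over the missing labels):

        import numpy as np

        def gcrf_posterior_mean(R, S, alpha, beta):
            # Fully observed GCRF:
            #   P(y|x) ~ exp(-alpha*||y - R||^2
            #                - beta * sum_ij S_ij * (y_i - y_j)^2)
            # where R holds per-node unstructured predictions and S is a
            # node-similarity matrix. Setting the gradient to zero gives
            #   (alpha*I + 2*beta*L) y = alpha * R, L = Laplacian of S.
            L = np.diag(S.sum(axis=1)) - S
            n = len(R)
            return np.linalg.solve(alpha * np.eye(n) + 2 * beta * L, alpha * R)

        S = np.array([[0., 1., 0.], [1., 0., 1.], [0., 1., 0.]])
        print(gcrf_posterior_mean(np.array([1., 2., 4.]), S, alpha=1.0, beta=0.5))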