Theory and Algorithms for Forecasting Time Series
We present data-dependent learning bounds for the general scenario of
non-stationary non-mixing stochastic processes. Our learning guarantees are
expressed in terms of a data-dependent measure of sequential complexity and a
discrepancy measure that can be estimated from data under some mild
assumptions. We also provide a novel analysis of a stable time series
forecasting algorithm using the new notion of discrepancy that we introduce.
We use our learning bounds to devise new algorithms for non-stationary time
series forecasting, for which we report preliminary experimental results.
Comment: An extended abstract appeared in (Kuznetsov and Mohri, 2015).
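The discrepancy idea above can be illustrated with a toy computation. The sketch below, which is not the paper's estimator, compares the average squared loss of constant predictors on two segments of a series; the finite candidate grid and the squared loss are illustrative assumptions:

```python
import random

def empirical_discrepancy(series, split, candidates):
    """Toy estimate of a discrepancy between two segments of a series:
    sup over a finite grid of constant predictors of the gap between
    their average squared losses. An illustrative simplification of the
    data-dependent discrepancy; the grid and loss are assumptions."""
    past, recent = series[:split], series[split:]
    def avg_loss(segment, c):
        return sum((y - c) ** 2 for y in segment) / len(segment)
    return max(abs(avg_loss(past, c) - avg_loss(recent, c))
               for c in candidates)

# A stationary series yields a small discrepancy; a mean shift a large one.
random.seed(0)
flat = [random.gauss(0, 1) for _ in range(200)]
shifted = [random.gauss(0, 1) for _ in range(100)] + \
          [random.gauss(3, 1) for _ in range(100)]
grid = [c / 2.0 for c in range(-10, 11)]
d_flat = empirical_discrepancy(flat, 100, grid)
d_shift = empirical_discrepancy(shifted, 100, grid)
```

A small value suggests the recent segment behaves like the past, which is the regime in which such bounds are informative.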
Foundations of Sequence-to-Sequence Modeling for Time Series
The availability of large amounts of time series data, paired with the
performance of deep-learning algorithms on a broad class of problems, has
recently led to significant interest in the use of sequence-to-sequence models
for time series forecasting. We provide the first theoretical analysis of this
time series forecasting framework. We include a comparison of
sequence-to-sequence modeling with classical time series models, so our
theory can serve as a quantitative guide for practitioners choosing between
different modeling methodologies.
Comment: To appear at AISTATS 201
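The framing the abstract refers to can be made concrete: a sequence-to-sequence forecaster is trained on (input window, output window) pairs sliced from a series. A minimal sketch, with illustrative window lengths:

```python
def make_seq2seq_pairs(series, in_len, out_len):
    """Frame a univariate series as supervised (input window -> output
    window) pairs, the data layout a sequence-to-sequence forecaster is
    trained on; window lengths are illustrative assumptions."""
    return [(series[t:t + in_len], series[t + in_len:t + in_len + out_len])
            for t in range(len(series) - in_len - out_len + 1)]

pairs = make_seq2seq_pairs(list(range(10)), in_len=4, out_len=2)
# pairs[0] -> ([0, 1, 2, 3], [4, 5])
```

Classical autoregressive models correspond roughly to out_len = 1 with a fixed linear map, which is one way to see the comparison the abstract draws.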
Rademacher complexity of stationary sequences
We show how to control the generalization error of time series models wherein
past values of the outcome are used to predict future values. The results are
based on a generalization of standard i.i.d. concentration inequalities to
dependent data without the mixing assumptions common in the time series
setting. Our proofs and results are simpler than previous analyses of
dependent data or stochastic adversaries, which rely on sequential Rademacher
complexities rather than the expected Rademacher complexity used for i.i.d.
processes. We also derive empirical Rademacher results without mixing
assumptions, yielding fully computable upper bounds.
Comment: 15 pages, 1 figure
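The "fully calculable" quantity above can be approximated directly for a finite hypothesis class. A Monte Carlo sketch (the finite class and draw count are illustrative assumptions, not the paper's setting):

```python
import random

def empirical_rademacher(hypothesis_values, n_draws=2000, seed=0):
    """Monte Carlo estimate of the empirical Rademacher complexity
    E_sigma[ sup_h (1/n) sum_i sigma_i h(x_i) ] of a finite hypothesis
    class; each row of `hypothesis_values` is (h(x_1), ..., h(x_n))."""
    rng = random.Random(seed)
    n = len(hypothesis_values[0])
    total = 0.0
    for _ in range(n_draws):
        sigma = [rng.choice((-1, 1)) for _ in range(n)]
        total += max(sum(s * v for s, v in zip(sigma, row)) / n
                     for row in hypothesis_values)
    return total / n_draws

# Two bounded hypotheses evaluated on n = 50 sample points: the estimate
# is positive and shrinks like O(1/sqrt(n)) as n grows.
h1 = [1.0] * 50
h2 = [(-1.0) ** i for i in range(50)]
r = empirical_rademacher([h1, h2])
```

For rich classes the supremum is taken over infinitely many hypotheses, which is where the paper's concentration arguments do the real work.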
Discrepancy-Based Algorithms for Non-Stationary Rested Bandits
We study the multi-armed bandit problem where the rewards are realizations of
general non-stationary stochastic processes, a setting that generalizes many
existing lines of work and analyses. In particular, we present a theoretical
analysis and derive regret guarantees for rested bandits in which the reward
distribution of each arm changes only when we pull that arm. Remarkably, our
regret bounds are logarithmic in the number of rounds under several natural
conditions. We introduce a new algorithm based on classical UCB ideas combined
with the notion of weighted discrepancy, a useful tool for measuring the
non-stationarity of a stochastic process. We show that the notion of
discrepancy can be used to design very general algorithms and a unified
framework for the analysis of multi-armed rested bandit problems with
non-stationary rewards. In particular, we show that we can recover the regret
guarantees of many specific instances of bandit problems with non-stationary
rewards that have been studied in the literature. We also provide experiments
demonstrating that our algorithms can achieve a significant improvement in
practice over standard benchmarks.
Comment: Unfinished work
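To illustrate the rested setting, here is a crude stand-in for the paper's weighted-discrepancy approach: a UCB variant that scores each arm from only its most recent own pulls, so stale statistics from before an arm's distribution drifted are discounted. The samplers, window size, and algorithm itself are illustrative assumptions, not the paper's method:

```python
import math
import random

def sliding_window_ucb(arm_samplers, horizon, window=50, seed=0):
    """UCB that scores each arm using only its last `window` own pulls,
    a crude proxy for reweighting by non-stationarity in rested bandits."""
    rng = random.Random(seed)
    history = [[] for _ in arm_samplers]
    total = 0.0
    for t in range(1, horizon + 1):
        def ucb(i):
            pulls = history[i][-window:]
            if not pulls:
                return float("inf")  # force initial exploration
            return sum(pulls) / len(pulls) + math.sqrt(2 * math.log(t) / len(pulls))
        arm = max(range(len(arm_samplers)), key=ucb)
        reward = arm_samplers[arm](len(history[arm]), rng)
        history[arm].append(reward)
        total += reward
    return total

# Rested arm 0: its mean decays with its own pull count (hypothetical rewards).
arms = [
    lambda n, rng: rng.random() * 0.5 + max(0.5 - 0.01 * n, 0.0),  # decaying
    lambda n, rng: rng.random() * 0.5 + 0.3,                        # stationary
]
total = sliding_window_ucb(arms, horizon=500)
```

The key feature of the rested model is visible here: arm 0's distribution changes only as a function of how often it has been pulled, not of wall-clock time.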
MACRO: A Meta-Algorithm for Conditional Risk Minimization
We study conditional risk minimization (CRM), i.e. the problem of learning a
hypothesis of minimal risk for prediction at the next step of sequentially
arriving dependent data. Despite it being a fundamental problem, successful
learning in the CRM sense has so far only been demonstrated using theoretical
algorithms that cannot be used for real problems as they would require storing
all incoming data. In this work, we introduce MACRO, a meta-algorithm for CRM
that does not suffer from this shortcoming, but nevertheless offers learning
guarantees. Instead of storing all data it maintains and iteratively updates a
set of learning subroutines. With suitable approximations, MACRO can be
applied to real data, yielding improved prediction performance compared to
traditional non-conditional learning.
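The maintain-subroutines-instead-of-data idea can be sketched in miniature. In this toy version, which is illustrative and not the paper's construction, the subroutines are running means started at different times, and predictions come from the subroutine with the lowest average loss so far:

```python
def macro_sketch(stream, spawn_every=20):
    """Toy meta-algorithm in the spirit of MACRO: rather than storing
    the whole stream, maintain a small set of subroutines (here running
    means started at different times) and predict with the one whose
    average loss so far is lowest. Subroutine type and spawn schedule
    are illustrative assumptions."""
    subs = []    # each subroutine: [num_points, running_mean, cum_sq_loss]
    preds = []
    for t, y in enumerate(stream):
        if t % spawn_every == 0:
            subs.append([0, 0.0, 0.0])          # spawn a fresh subroutine
        best = min(subs, key=lambda s: s[2] / s[0] if s[0] else float("inf"))
        preds.append(best[1])
        for s in subs:                           # update every subroutine
            if s[0]:
                s[2] += (y - s[1]) ** 2
            s[0] += 1
            s[1] += (y - s[1]) / s[0]
    return preds

# After a mean shift, a subroutine spawned post-shift takes over.
preds = macro_sketch([0.0] * 40 + [10.0] * 60)
```

Memory grows with the number of subroutines rather than the number of observations, which is the shortcoming the abstract says MACRO avoids.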
Universal Algorithm for Online Trading Based on the Method of Calibration
We present a universal algorithm for online trading in the stock market that
performs asymptotically at least as well as any stationary trading strategy
that computes the investment at each step using a fixed function of the side
information that belongs to a given RKHS (Reproducing Kernel Hilbert Space).
Using a universal kernel, we extend this result for any continuous stationary
strategy. In this learning process, a trader rationally chooses his gambles
using predictions made by a randomized well-calibrated algorithm. Our strategy
is based on Dawid's notion of calibration with more general checking rules and
on some modification of Kakade and Foster's randomized rounding algorithm for
computing the well-calibrated forecasts. We combine the method of randomized
calibration with Vovk's method of defensive forecasting in RKHS. Unlike the
statistical theory, no stochastic assumptions are made about the stock prices.
Our empirical results on historical markets provide strong evidence that this
type of technical trading can "beat the market" if transaction costs are
ignored.
Comment: 32 pages. arXiv admin note: substantial text overlap with
arXiv:1105.427
Nonparametric Online Learning Using Lipschitz Regularized Deep Neural Networks
Deep neural networks are considered state-of-the-art models in many
offline machine learning tasks. However, their performance and generalization
abilities in online learning tasks are much less understood. We therefore
focus on online learning and tackle the challenging setting in which the
underlying process is stationary and ergodic, removing the i.i.d. assumption
and allowing observations to depend on each other arbitrarily. We
prove the generalization abilities of Lipschitz regularized deep neural
networks and show that, using such networks, convergence to the best
possible prediction strategy is guaranteed.
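A standard way to control a feed-forward network's Lipschitz constant, and a plausible ingredient of the regularization above, is the product of the layers' spectral norms; with 1-Lipschitz activations this product upper-bounds the network's Lipschitz constant. A pure-Python sketch (the weights are hypothetical and this is not the paper's exact regularizer):

```python
def spectral_norm(mat, iters=100):
    """Largest singular value of a matrix via power iteration on A^T A
    (assumes a nonzero matrix)."""
    n_cols = len(mat[0])
    v = [1.0] * n_cols
    for _ in range(iters):
        u = [sum(row[j] * v[j] for j in range(n_cols)) for row in mat]
        w = [sum(mat[i][j] * u[i] for i in range(len(mat))) for j in range(n_cols)]
        norm = sum(x * x for x in w) ** 0.5
        v = [x / norm for x in w]
    u = [sum(row[j] * v[j] for j in range(n_cols)) for row in mat]
    return sum(x * x for x in u) ** 0.5

def lipschitz_bound(weight_matrices):
    """Product of layer spectral norms: an upper bound on the Lipschitz
    constant of a feed-forward net with 1-Lipschitz activations."""
    prod = 1.0
    for w in weight_matrices:
        prod *= spectral_norm(w)
    return prod

# Two hypothetical layers: a 2x2 weight matrix followed by a 1x2 one.
bound = lipschitz_bound([[[2.0, 0.0], [0.0, 1.0]], [[0.5, 0.5]]])
```

Penalizing or constraining this bound during training is one common way to enforce the Lipschitz property the abstract relies on.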
Nonparametric risk bounds for time-series forecasting
We derive generalization error bounds for traditional time-series forecasting
models. Our results hold for many standard forecasting tools including
autoregressive models, moving average models, and, more generally, linear
state-space models. These non-asymptotic bounds need only weak assumptions on
the data-generating process, yet allow forecasters to select among competing
models and to guarantee, with high probability, that their chosen model will
perform well. We motivate our techniques with, and apply them to, standard
economic and financial forecasting tools: a GARCH model for predicting equity
volatility and a dynamic stochastic general equilibrium (DSGE) model, the
standard tool in macroeconomic forecasting. We demonstrate in particular how
our techniques can aid forecasters and policy makers in choosing models that
behave well under uncertainty and misspecification.
Comment: 34 pages, 3 figures
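For readers unfamiliar with the first of these tools: a GARCH(1,1) forecaster is a one-line variance recursion. A minimal sketch with illustrative, unfitted parameters:

```python
def garch11_variance_path(returns, omega=0.1, alpha=0.1, beta=0.8):
    """One-step-ahead conditional variance recursion of a GARCH(1,1)
    model, sigma2[t+1] = omega + alpha * r[t]**2 + beta * sigma2[t],
    started at the unconditional variance omega / (1 - alpha - beta).
    Parameter values are illustrative, not fitted."""
    sigma2 = [omega / (1.0 - alpha - beta)]
    for r in returns:
        sigma2.append(omega + alpha * r * r + beta * sigma2[-1])
    return sigma2

# A large return shocks the variance forecast upward.
path = garch11_variance_path([0.0, 2.0, 0.0])
```

Model selection in this setting means comparing such recursions (or a DSGE model) by predictive risk, which is what the paper's bounds certify with high probability.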
Kernel Change-point Detection with Auxiliary Deep Generative Models
Detecting the emergence of abrupt property changes in time series is a
challenging problem. The kernel two-sample test has been studied for this
task, as it makes fewer assumptions about the distributions than traditional
parametric approaches. However, selecting kernels is non-trivial in practice.
Although kernel selection for the two-sample test has been studied, the
scarcity of samples in change-point detection hinders the success of existing
kernel selection algorithms. In this paper, we propose KL-CPD, a novel kernel
learning
framework for time series CPD that optimizes a lower bound of test power via an
auxiliary generative model. With a deep kernel parameterization, KL-CPD endows
the kernel two-sample test with a data-driven kernel to detect different types of
change-points in real-world applications. In our comparative evaluation on
benchmark datasets and in simulation studies, the proposed approach
significantly outperformed state-of-the-art methods.
Comment: To appear in ICLR 201
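The baseline that KL-CPD builds on is easy to sketch: score each time step by the MMD between the windows before and after it. This fixed-bandwidth version is the starting point only; KL-CPD's contribution, learning the kernel via a deep parameterization and a generative model, is not reproduced here:

```python
import math

def rbf_mmd2(xs, ys, gamma=1.0):
    """Biased squared maximum mean discrepancy with a fixed-bandwidth
    RBF kernel (the statistic behind the kernel two-sample test)."""
    def k(a, b):
        return math.exp(-gamma * (a - b) ** 2)
    def avg(us, vs):
        return sum(k(u, v) for u in us for v in vs) / (len(us) * len(vs))
    return avg(xs, xs) + avg(ys, ys) - 2.0 * avg(xs, ys)

def cpd_scores(series, window=20, gamma=1.0):
    """Score each time t by the MMD between the windows just before and
    just after it; peaks mark candidate change points. Window size and
    gamma are illustrative assumptions."""
    return [rbf_mmd2(series[t - window:t], series[t:t + window], gamma)
            for t in range(window, len(series) - window + 1)]

# The score peaks exactly at the mean shift (t = 40, score index 20).
scores = cpd_scores([0.0] * 40 + [5.0] * 40)
```

The sample scarcity the abstract mentions is visible here: each score rests on only `window` points per side, which is why naive kernel selection struggles.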
Diffusion Convolutional Recurrent Neural Network: Data-Driven Traffic Forecasting
Spatiotemporal forecasting has applications in the neuroscience, climate, and
transportation domains. Traffic forecasting is a canonical example of such a
learning task. The task is challenging due to (1) complex spatial dependency on
road networks, (2) non-linear temporal dynamics with changing road conditions
and (3) inherent difficulty of long-term forecasting. To address these
challenges, we propose to model the traffic flow as a diffusion process on a
directed graph and introduce Diffusion Convolutional Recurrent Neural Network
(DCRNN), a deep learning framework for traffic forecasting that incorporates
both spatial and temporal dependency in the traffic flow. Specifically, DCRNN
captures the spatial dependency using bidirectional random walks on the graph,
and the temporal dependency using the encoder-decoder architecture with
scheduled sampling. We evaluate the framework on two real-world, large-scale
road-network traffic datasets and observe consistent improvements of 12-15%
over state-of-the-art baselines.
Comment: Published as a conference paper at ICLR 201
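The spatial half of the model above can be sketched in a few lines: a diffusion convolution is a polynomial in the random-walk transition matrix of the road graph. The sketch omits the bidirectional walks and the recurrent encoder-decoder, and the filter weights are illustrative:

```python
def diffusion_conv(adj, signal, thetas):
    """One diffusion-convolution filter: sum_k theta_k * (D^-1 A)^k x,
    a polynomial in the random-walk transition matrix of a directed
    graph. Illustrative sketch of DCRNN's spatial operator only."""
    n = len(adj)
    trans = [[adj[i][j] / max(sum(adj[i]), 1e-12) for j in range(n)]
             for i in range(n)]                  # row-normalized random walk
    out = [thetas[0] * x for x in signal]        # k = 0 term
    state = list(signal)
    for theta in thetas[1:]:                     # k = 1, 2, ...
        state = [sum(trans[i][j] * state[j] for j in range(n))
                 for i in range(n)]
        out = [o + theta * s for o, s in zip(out, state)]
    return out

# Two nodes connected both ways; the filter mixes the unit signal evenly.
out = diffusion_conv([[0.0, 1.0], [1.0, 0.0]], [1.0, 0.0], [0.5, 0.5])
```

In DCRNN this operator replaces the matrix multiplications inside a GRU cell, so each recurrent step mixes information along the road network.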