Invariant Causal Prediction for Sequential Data
We investigate the problem of inferring the causal predictors of a response
$Y$ from a set of $d$ explanatory variables $(X^1, \dots, X^d)$. Classical
ordinary least squares regression includes all predictors that reduce the
variance of $Y$. Using only the causal predictors instead leads to models that
have the advantage of remaining invariant under interventions; loosely speaking,
they lead to invariance across different "environments" or "heterogeneity
patterns". More precisely, the conditional distribution of $Y$ given its causal
predictors remains invariant for all observations. Recent work exploits such a
stability to infer causal relations from data with different but known
environments. We show that even without having knowledge of the environments or
heterogeneity pattern, inferring causal relations is possible for time-ordered
(or any other type of sequentially ordered) data. In particular, this allows
detecting instantaneous causal relations in multivariate linear time series
which is usually not the case for Granger causality. Besides novel methodology,
we provide statistical confidence bounds and asymptotic detection results for
inferring causal predictors, and present an application to monetary policy in
macroeconomics.
Comment: 55 pages
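The invariance property described in this abstract can be illustrated with a small simulation. The sketch below is not the paper's method: it assumes a known split of the sequence into an early and a late block, a hypothetical linear model in which `x1` causes `y` and `y` causes `x2`, and a shift in the scale of `x1` between blocks. It only shows that the regression slope on the causal predictor stays stable across blocks while the slope on the non-causal predictor drifts.

```python
import numpy as np

rng = np.random.default_rng(0)
n = 20_000  # observations per time block (hypothetical choice)

def simulate_block(x1_scale):
    """Simulate one time block of the assumed model x1 -> y -> x2."""
    x1 = rng.normal(0.0, x1_scale, n)        # cause of y; its scale shifts over time
    y = 2.0 * x1 + rng.normal(0.0, 1.0, n)   # mechanism for y is the same in every block
    x2 = y + rng.normal(0.0, 1.0, n)         # effect of y, not a cause
    return x1, x2, y

def slope(x, y):
    """Least-squares slope of y on x (with intercept)."""
    return np.polyfit(x, y, 1)[0]

early = simulate_block(1.0)   # first part of the sequence
late = simulate_block(3.0)    # later part, after the distribution of x1 has shifted

# Compare the fitted slope across the two blocks for each candidate predictor.
gap_causal = abs(slope(early[0], early[2]) - slope(late[0], late[2]))
gap_effect = abs(slope(early[1], early[2]) - slope(late[1], late[2]))

print(f"slope change using x1 (causal):     {gap_causal:.3f}")
print(f"slope change using x2 (non-causal): {gap_effect:.3f}")
```

Under these assumptions the slope on `x1` barely moves, whereas the slope on `x2` changes noticeably because it depends on the variance of `y`, which differs between blocks.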
Discovering unbounded episodes in sequential data
One basic goal in the analysis of time-series data is
to find frequent interesting episodes, i.e., collections
of events occurring frequently together in the input sequence.
Most well-known work decides the interestingness of an episode using a
fixed user-specified window width or interval that bounds the
subsequent sequential association rules.
In this paper, we present a more intuitive definition that,
in turn, allows interesting episodes to grow during mining without any
user-specified help. A convenient algorithm to
efficiently discover the proposed unbounded episodes is also implemented.
Experimental results confirm that our approach is useful
and advantageous.
Postprint (published version)
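For context, the fixed-window baseline that this abstract contrasts itself with can be sketched as follows. This is an illustrative assumption, not the paper's algorithm: the function name `count_serial_episode` is hypothetical, and windows are measured in number of consecutive events rather than in timestamps. The sketch shows how the frequency of an episode depends on the user-chosen window width.

```python
def count_serial_episode(events, episode, window):
    """Count sliding windows of `window` consecutive events that contain
    `episode` as an ordered (not necessarily contiguous) subsequence."""
    def contains(win):
        it = iter(win)
        # `sym in it` advances the iterator, so order is respected.
        return all(sym in it for sym in episode)

    return sum(
        contains(events[i:i + window])
        for i in range(len(events) - window + 1)
    )

events = list("ABCABDACB")   # toy event sequence
episode = ("A", "B")         # serial episode: A followed later by B

narrow = count_serial_episode(events, episode, window=2)
wide = count_serial_episode(events, episode, window=3)
print(narrow, wide)
```

The same episode is counted in a different number of windows depending on the chosen width, which is the dependence on a user-specified parameter that the abstract aims to remove.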
Assessing the Distribution Consistency of Sequential Data
Given n observations, we study the consistency of a batch of k new
observations, in terms of their distribution function. We propose a
non-parametric, non-likelihood test based on Edgeworth expansion of the
distribution function. The key point is to approximate the distribution of the
n+k observations by the distribution of n-k among the n observations. Edgeworth
expansion gives the correcting term and the rate of convergence. We also study
the discrete distribution case, for which Cramér's condition of smoothness is
not satisfied. The rates of convergence for the various cases are compared.
Comment: 20 pages, 0 figures
Stochastic Collapsed Variational Inference for Sequential Data
Stochastic variational inference for collapsed models has recently been
successfully applied to large-scale topic modelling. In this paper, we propose
a stochastic collapsed variational inference algorithm in the sequential data
setting. Our algorithm is applicable to both finite hidden Markov models and
hierarchical Dirichlet process hidden Markov models, and to any datasets
generated by emission distributions in the exponential family. Our experimental
results on two discrete datasets show that our inference is both more efficient
and more accurate than its uncollapsed version, stochastic variational
inference.
Comment: NIPS Workshop on Advances in Approximate Bayesian Inference, 201