21,784 research outputs found

    Invariant Causal Prediction for Sequential Data

    We investigate the problem of inferring the causal predictors of a response $Y$ from a set of $d$ explanatory variables $(X^1,\dots,X^d)$. Classical ordinary least squares regression includes all predictors that reduce the variance of $Y$. Using only the causal predictors instead leads to models that have the advantage of remaining invariant under interventions; loosely speaking, they lead to invariance across different "environments" or "heterogeneity patterns". More precisely, the conditional distribution of $Y$ given its causal predictors remains invariant for all observations. Recent work exploits such stability to infer causal relations from data with different but known environments. We show that even without knowledge of the environments or heterogeneity pattern, inferring causal relations is possible for time-ordered (or any other type of sequentially ordered) data. In particular, this allows detecting instantaneous causal relations in multivariate linear time series, which is usually not the case for Granger causality. Besides novel methodology, we provide statistical confidence bounds and asymptotic detection results for inferring causal predictors, and present an application to monetary policy in macroeconomics. Comment: 55 pages

    Discovering unbounded episodes in sequential data

    One basic goal in the analysis of time-series data is to find frequent interesting episodes, i.e., collections of events occurring frequently together in the input sequence. Most widely known work decides the interestingness of an episode from a fixed user-specified window width or interval that bounds the subsequent sequential association rules. In this paper, we present a more intuitive definition that allows, in turn, interesting episodes to grow during the mining without any user-specified help. A convenient algorithm to efficiently discover the proposed unbounded episodes is also implemented. Experimental results confirm that our approach is useful and advantageous. Postprint (published version)

    Assessing the Distribution Consistency of Sequential Data

    Given n observations, we study the consistency of a batch of k new observations, in terms of their distribution function. We propose a non-parametric, non-likelihood test based on the Edgeworth expansion of the distribution function. The key point is to approximate the distribution of the n+k observations by the distribution of n-k among the n observations. The Edgeworth expansion gives the correcting term and the rate of convergence. We also study the discrete distribution case, for which Cramér's condition of smoothness is not satisfied. The rates of convergence for the various cases are compared. Comment: 20 pages, 0 figures

    Stochastic Collapsed Variational Inference for Sequential Data

    Stochastic variational inference for collapsed models has recently been successfully applied to large-scale topic modelling. In this paper, we propose a stochastic collapsed variational inference algorithm in the sequential data setting. Our algorithm is applicable to both finite hidden Markov models and hierarchical Dirichlet process hidden Markov models, and to any datasets generated by emission distributions in the exponential family. Our experimental results on two discrete datasets show that our inference is both more efficient and more accurate than its uncollapsed version, stochastic variational inference. Comment: NIPS Workshop on Advances in Approximate Bayesian Inference, 201