1,871 research outputs found

    Conditional independence testing based on a nearest-neighbor estimator of conditional mutual information

    Get PDF
    Conditional independence testing is a fundamental problem underlying causal discovery and a particularly challenging task in the presence of nonlinear and high-dimensional dependencies. Here a fully non-parametric test for continuous data based on conditional mutual information combined with a local permutation scheme is presented. Through a nearest neighbor approach, the test efficiently adapts also to non-smooth distributions due to strongly nonlinear dependencies. Numerical experiments demonstrate that the test reliably simulates the null distribution even for small sample sizes and with high-dimensional conditioning sets. The test is better calibrated than kernel-based tests utilizing an analytical approximation of the null distribution, especially for non-smooth densities, and reaches the same or higher power levels. Combining the local permutation scheme with the kernel tests leads to better calibration, but suffers in power. For smaller sample sizes and lower dimensions, the test is faster than random fourier feature-based kernel tests if the permutation scheme is (embarrassingly) parallelized, but the runtime increases more sharply with sample size and dimensionality. Thus, more theoretical research to analytically approximate the null distribution and speed up the estimation for larger sample sizes is desirable.Comment: 17 pages, 12 figures, 1 tabl

    Optimal model-free prediction from multivariate time series

    Get PDF
    Forecasting a time series from multivariate predictors constitutes a challenging problem, especially using model-free approaches. Most techniques, such as nearest-neighbor prediction, quickly suffer from the curse of dimensionality and overfitting for more than a few predictors which has limited their application mostly to the univariate case. Therefore, selection strategies are needed that harness the available information as efficiently as possible. Since often the right combination of predictors matters, ideally all subsets of possible predictors should be tested for their predictive power, but the exponentially growing number of combinations makes such an approach computationally prohibitive. Here a prediction scheme that overcomes this strong limitation is introduced utilizing a causal pre-selection step which drastically reduces the number of possible predictors to the most predictive set of causal drivers making a globally optimal search scheme tractable. The information-theoretic optimality is derived and practical selection criteria are discussed. As demonstrated for multivariate nonlinear stochastic delay processes, the optimal scheme can even be less computationally expensive than commonly used sub-optimal schemes like forward selection. The method suggests a general framework to apply the optimal model-free approach to select variables and subsequently fit a model to further improve a prediction or learn statistical dependencies. The performance of this framework is illustrated on a climatological index of El Ni\~no Southern Oscillation.Comment: 14 pages, 9 figure

    Causal conditioning and instantaneous coupling in causality graphs

    Full text link
    The paper investigates the link between Granger causality graphs recently formalized by Eichler and directed information theory developed by Massey and Kramer. We particularly insist on the implication of two notions of causality that may occur in physical systems. It is well accepted that dynamical causality is assessed by the conditional transfer entropy, a measure appearing naturally as a part of directed information. Surprisingly the notion of instantaneous causality is often overlooked, even if it was clearly understood in early works. In the bivariate case, instantaneous coupling is measured adequately by the instantaneous information exchange, a measure that supplements the transfer entropy in the decomposition of directed information. In this paper, the focus is put on the multivariate case and conditional graph modeling issues. In this framework, we show that the decomposition of directed information into the sum of transfer entropy and information exchange does not hold anymore. Nevertheless, the discussion allows to put forward the two measures as pillars for the inference of causality graphs. We illustrate this on two synthetic examples which allow us to discuss not only the theoretical concepts, but also the practical estimation issues.Comment: submitte

    Optimal model-free prediction from multivariate time series

    Get PDF
    © 2015 American Physical Society.Forecasting a time series from multivariate predictors constitutes a challenging problem, especially using model-free approaches. Most techniques, such as nearest-neighbor prediction, quickly suffer from the curse of dimensionality and overfitting for more than a few predictors which has limited their application mostly to the univariate case. Therefore, selection strategies are needed that harness the available information as efficiently as possible. Since often the right combination of predictors matters, ideally all subsets of possible predictors should be tested for their predictive power, but the exponentially growing number of combinations makes such an approach computationally prohibitive. Here a prediction scheme that overcomes this strong limitation is introduced utilizing a causal preselection step which drastically reduces the number of possible predictors to the most predictive set of causal drivers making a globally optimal search scheme tractable. The information-theoretic optimality is derived and practical selection criteria are discussed. As demonstrated for multivariate nonlinear stochastic delay processes, the optimal scheme can even be less computationally expensive than commonly used suboptimal schemes like forward selection. The method suggests a general framework to apply the optimal model-free approach to select variables and subsequently fit a model to further improve a prediction or learn statistical dependencies. The performance of this framework is illustrated on a climatological index of El Niño Southern Oscillation
    corecore