Analysis of Conditional Randomisation and Permutation schemes with application to conditional independence testing
We study properties of two resampling scenarios, the Conditional Randomisation and the Conditional Permutation scheme, which are relevant for conditional independence testing of discrete random variables X and Y given a random variable Z. Namely, we investigate the asymptotic behaviour of estimates of a vector of probabilities in these settings, and establish their asymptotic normality and an ordering between the asymptotic covariance matrices. The results are used to derive asymptotic distributions of the empirical Conditional Mutual Information in these set-ups. Somewhat unexpectedly, the distributions coincide for the two scenarios, despite differences in the asymptotic distributions of the estimates of probabilities. We also prove the validity of permutation p-values for the Conditional Permutation scheme. The above results justify consideration of conditional independence tests based on re-sampled p-values and on the asymptotic chi-square distribution with an adjusted number of degrees of freedom. We show in numerical experiments that when the ratio of the sample size to the number of possible values of the triple (X, Y, Z) exceeds 0.5, the test based on the asymptotic distribution, with the adjustment made on a limited number of permutations, is a viable alternative to the exact test for both the Conditional Permutation and Conditional Randomisation scenarios. Moreover, there is no significant difference between the performance of the exact tests under the Conditional Permutation and Conditional Randomisation schemes, the latter requiring knowledge of the conditional distribution of X given Z, and the same conclusion holds for both adaptive tests.

Comment: 28 pages
Analysis of Information-Based Nonparametric Variable Selection Criteria
We consider a nonparametric Generative Tree Model and discuss the problem of selecting active predictors for the response in such a scenario. We investigate two popular information-based selection criteria: Conditional Infomax Feature Extraction (CIFE) and Joint Mutual Information (JMI), which are both derived as approximations of the Conditional Mutual Information (CMI) criterion. We show that both CIFE and JMI may exhibit behaviour different from that of CMI, resulting in different orders in which predictors are chosen in the variable selection process. Explicit formulae for CMI and its two approximations in the generative tree model are obtained. As a by-product, we establish expressions for the entropy of a multivariate Gaussian mixture and its mutual information with the mixing distribution.
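For orientation, the two criteria compared with CMI can be sketched as greedy scores built from plug-in mutual information estimates. This is a minimal illustration under the standard textbook forms of CIFE and JMI, not the paper's derivation; all function names are hypothetical:

```python
import numpy as np

def mi(a, b):
    """Plug-in mutual information I(A; B) in nats for discrete samples."""
    total = 0.0
    for av in np.unique(a):
        pa = np.mean(a == av)
        for bv in np.unique(b):
            pab = np.mean((a == av) & (b == bv))
            if pab > 0:
                total += pab * np.log(pab / (pa * np.mean(b == bv)))
    return total

def cond_mi(a, b, c):
    """Plug-in conditional mutual information I(A; B | C)."""
    return sum(np.mean(c == cv) * mi(a[c == cv], b[c == cv])
               for cv in np.unique(c))

def _joint(a, b):
    # encode the pair (A, B) as a single discrete variable (nonnegative ints)
    return a * (int(b.max()) + 1) + b

def cife_score(xk, y, selected):
    """CIFE: I(Xk; Y) - sum_j [ I(Xk; Xj) - I(Xk; Xj | Y) ]."""
    return mi(xk, y) - sum(mi(xk, xj) - cond_mi(xk, xj, y) for xj in selected)

def jmi_score(xk, y, selected):
    """JMI: sum_j I((Xk, Xj); Y), with I(Xk; Y) for an empty selected set."""
    if not selected:
        return mi(xk, y)
    return sum(mi(_joint(xk, xj), y) for xj in selected)
```

A greedy selector would add, at each step, the candidate maximising the chosen score; the abstract's point is that the resulting selection order can differ between CIFE, JMI, and CMI in the generative tree model.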