Canonical correlation analysis (CCA) has become a key tool for population
neuroimaging, allowing investigation of associations between many imaging and
non-imaging measurements. As other variables are often a source of variability
not of direct interest, previous work has used CCA on residuals from a model
that removes these effects, then proceeded directly to permutation inference.
We show that such a simple permutation test leads to inflated error rates. The
reason is that residualisation introduces dependencies among the observations
that violate the exchangeability assumption. Even in the absence of nuisance
variables, however, a simple permutation test for CCA also leads to excess
error rates for all canonical correlations other than the first. The reason is
that a simple permutation scheme does not ignore the variability already
explained by previous canonical variables. Here we propose solutions for both
problems: in the case of nuisance variables, we show that transforming the
residuals to a lower dimensional basis where exchangeability holds results in a
valid permutation test; for more general cases, with or without nuisance
variables, we propose estimating the canonical correlations in a stepwise
manner, removing at each iteration the variance already explained, while
dealing with different number of variables in both sides. We also discuss how
to address the multiplicity of tests, proposing an admissible test that is not
conservative, and provide a complete algorithm for permutation inference for
CCA.Comment: 49 pages, 2 figures, 10 tables, 3 algorithms, 119 reference