2,146 research outputs found
Increasing power for voxel-wise genome-wide association studies : the random field theory, least square kernel machines and fast permutation procedures
Imaging traits are thought to have more direct links to genetic variation than diagnostic measures based on cognitive or clinical assessments and provide a powerful substrate to examine the influence of genetics on human brains. Although imaging genetics has attracted growing attention and interest, most brain-wide genome-wide association studies focus on voxel-wise single-locus approaches, without taking advantage of the spatial information in images or combining the effect of multiple genetic variants. In this paper we present a fast implementation of voxel- and cluster-wise inferences based on the random field theory to fully use the spatial information in images. The approach is combined with a multi-locus model based on least square kernel machines to associate the joint effect of several single nucleotide polymorphisms (SNP) with imaging traits. A fast permutation procedure is also proposed which significantly reduces the number of permutations needed relative to the standard empirical method and provides accurate small p-value estimates based on parametric tail approximation. We explored the relation between 448,294 single nucleotide polymorphisms and 18,043 genes in 31,662 voxels of the entire brain across 740 elderly subjects from the Alzheimer's Disease Neuroimaging Initiative (ADNI). Structural MRI scans were analyzed using tensor-based morphometry (TBM) to compute 3D maps of regional brain volume differences compared to an average template image based on healthy elderly subjects. We find method to be more sensitive compared with voxel-wise single-locus approaches. A number of genes were identified as having significant associations with volumetric changes. The most associated gene was GRIN2B, which encodes the N-methyl-d-aspartate (NMDA) glutamate receptor NR2B subunit and affects both the parietal and temporal lobes in human brains. Its role in Alzheimer's disease has been widely acknowledged and studied, suggesting the validity of the approach. The various advantages over existing approaches indicate a great potential offered by this novel framework to detect genetic influences on human brains
Multiscale Dictionary Learning for Estimating Conditional Distributions
Nonparametric estimation of the conditional distribution of a response given
high-dimensional features is a challenging problem. It is important to allow
not only the mean but also the variance and shape of the response density to
change flexibly with features, which are massive-dimensional. We propose a
multiscale dictionary learning model, which expresses the conditional response
density as a convex combination of dictionary densities, with the densities
used and their weights dependent on the path through a tree decomposition of
the feature space. A fast graph partitioning algorithm is applied to obtain the
tree decomposition, with Bayesian methods then used to adaptively prune and
average over different sub-trees in a soft probabilistic manner. The algorithm
scales efficiently to approximately one million features. State of the art
predictive performance is demonstrated for toy examples and two neuroscience
applications including up to a million features
Interpretable statistics for complex modelling: quantile and topological learning
As the complexity of our data increased exponentially in the last decades, so has our
need for interpretable features. This thesis revolves around two paradigms to approach
this quest for insights.
In the first part we focus on parametric models, where the problem of interpretability
can be seen as a āparametrization selectionā. We introduce a quantile-centric
parametrization and we show the advantages of our proposal in the context of regression,
where it allows to bridge the gap between classical generalized linear (mixed)
models and increasingly popular quantile methods.
The second part of the thesis, concerned with topological learning, tackles the
problem from a non-parametric perspective. As topology can be thought of as a way
of characterizing data in terms of their connectivity structure, it allows to represent
complex and possibly high dimensional through few features, such as the number of
connected components, loops and voids. We illustrate how the emerging branch of
statistics devoted to recovering topological structures in the data, Topological Data
Analysis, can be exploited both for exploratory and inferential purposes with a special
emphasis on kernels that preserve the topological information in the data.
Finally, we show with an application how these two approaches can borrow strength
from one another in the identification and description of brain activity through fMRI
data from the ABIDE project
New Exact and Numerical Solutions of the (Convection-)Diffusion Kernels on SE(3)
We consider hypo-elliptic diffusion and convection-diffusion on , the quotient of the Lie group of rigid body motions SE(3) in
which group elements are equivalent if they are equal up to a rotation around
the reference axis. We show that we can derive expressions for the convolution
kernels in terms of eigenfunctions of the PDE, by extending the approach for
the SE(2) case. This goes via application of the Fourier transform of the PDE
in the spatial variables, yielding a second order differential operator. We
show that the eigenfunctions of this operator can be expressed as (generalized)
spheroidal wave functions. The same exact formulas are derived via the Fourier
transform on SE(3). We solve both the evolution itself, as well as the
time-integrated process that corresponds to the resolvent operator.
Furthermore, we have extended a standard numerical procedure from SE(2) to
SE(3) for the computation of the solution kernels that is directly related to
the exact solutions. Finally, we provide a novel analytic approximation of the
kernels that we briefly compare to the exact kernels.Comment: Revised and restructure
Regularized brain reading with shrinkage and smoothing
Functional neuroimaging measures how the brain responds to complex stimuli.
However, sample sizes are modest, noise is substantial, and stimuli are high
dimensional. Hence, direct estimates are inherently imprecise and call for
regularization. We compare a suite of approaches which regularize via
shrinkage: ridge regression, the elastic net (a generalization of ridge
regression and the lasso), and a hierarchical Bayesian model based on small
area estimation (SAE). We contrast regularization with spatial smoothing and
combinations of smoothing and shrinkage. All methods are tested on functional
magnetic resonance imaging (fMRI) data from multiple subjects participating in
two different experiments related to reading, for both predicting neural
response to stimuli and decoding stimuli from responses. Interestingly, when
the regularization parameters are chosen by cross-validation independently for
every voxel, low/high regularization is chosen in voxels where the
classification accuracy is high/low, indicating that the regularization
intensity is a good tool for identification of relevant voxels for the
cognitive task. Surprisingly, all the regularization methods work about equally
well, suggesting that beating basic smoothing and shrinkage will take not only
clever methods, but also careful modeling.Comment: Published at http://dx.doi.org/10.1214/15-AOAS837 in the Annals of
Applied Statistics (http://www.imstat.org/aoas/) by the Institute of
Mathematical Statistics (http://www.imstat.org
- ā¦