131 research outputs found
Common and Individual Structure of Brain Networks
This article focuses on the problem of studying shared- and
individual-specific structure in replicated networks or graph-valued data. In
particular, the observed data consist of $n$ graphs, $G_i$, $i = 1, \ldots, n$, with
each graph consisting of a collection of edges between $V$ nodes. In brain
connectomics, the graph for an individual corresponds to a set of
interconnections among brain regions. Such data can be organized as a $V \times V$ binary adjacency matrix $A_i$ for each $i$, with ones indicating an edge
between a pair of nodes and zeros indicating no edge. When nodes have a shared
meaning across replicates $i = 1, \ldots, n$, it becomes of substantial interest to
study similarities and differences in the adjacency matrices. To address this
problem, we propose a method to estimate a common structure and low-dimensional
individual-specific deviations from replicated networks. The proposed Multiple
GRAph Factorization (M-GRAF) model relies on a logistic regression mapping
combined with a hierarchical eigenvalue decomposition. We develop an efficient
algorithm for estimation and study basic properties of our approach. Simulation
studies show excellent operating characteristics and we apply the method to
human brain connectomics data. Comment: 29 pages, 12 figures
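As a rough illustration of the structure described in this abstract, the sketch below maps a shared log-odds matrix plus a low-rank, subject-specific deviation to edge probabilities through a logistic link, then samples one binary adjacency matrix. The names `Z`, `Q`, `lam`, and `edge_probabilities` are assumptions made for this example and are not taken from the paper; no estimation is performed.

```python
import numpy as np

def edge_probabilities(Z, Q, lam):
    """Edge probabilities for one subject under a logistic link.

    Z   : (V, V) symmetric matrix of shared log-odds (hypothetical name).
    Q   : (V, K) matrix with orthonormal columns (hypothetical name).
    lam : (K,) eigenvalues of the subject-specific deviation (hypothetical name).
    """
    D = Q @ np.diag(lam) @ Q.T           # low-rank individual deviation
    P = 1.0 / (1.0 + np.exp(-(Z + D)))   # logistic mapping to probabilities
    np.fill_diagonal(P, 0.0)             # no self-loops
    return P

rng = np.random.default_rng(0)
V, K = 6, 2
Z = rng.normal(size=(V, V))
Z = (Z + Z.T) / 2                        # symmetrize the shared component
Q, _ = np.linalg.qr(rng.normal(size=(V, K)))
lam = np.array([2.0, -1.0])
P = edge_probabilities(Z, Q, lam)

# Sample the upper triangle and mirror it to get a symmetric 0/1 matrix.
upper = np.triu((rng.random((V, V)) < P).astype(int), k=1)
A = upper + upper.T
```

Keeping the deviation as `Q @ diag(lam) @ Q.T` makes the individual part low-rank by construction, which is the feature the abstract emphasizes.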
Phase-Amplitude Separation and Modeling of Spherical Trajectories
This paper studies the problem of separating phase-amplitude components in
sample paths of a spherical process (longitudinal data on a unit two-sphere).
Such separation is essential for efficient modeling and statistical analysis of
spherical longitudinal data in a manner that is invariant to any phase
variability. The key idea is to represent each path or trajectory with a pair
of variables, a starting point and a Transported Square-Root Velocity Curve
(TSRVC). A TSRVC is a curve in the tangent (vector) space at the starting point
and has some important invariance properties under the L2 norm. The space of
all such curves forms a vector bundle and the L2 norm, along with the standard
Riemannian metric on S2, provides a natural metric on this vector bundle. This
invariant representation allows for separating phase and amplitude components
in given data, using a template-based idea. Furthermore, the metric property is
useful in deriving computational procedures for clustering, mean computation,
principal component analysis (PCA), and modeling. This comprehensive framework
is demonstrated using two datasets: a set of bird-migration trajectories and a
set of hurricane paths in the Atlantic Ocean.
Bayesian Clustering of Shapes of Curves
Unsupervised clustering of curves according to their shapes is an important
problem with broad scientific applications. The existing model-based clustering
techniques either rely on simple probability models (e.g., Gaussian) that are
not generally valid for shape analysis or assume that the number of clusters is known. We
develop an efficient Bayesian method to cluster curve data using an elastic
shape metric that is based on joint registration and comparison of shapes of
curves. The elastic-inner product matrix obtained from the data is modeled
using a Wishart distribution whose parameters are assigned carefully chosen
prior distributions to allow for automatic inference on the number of clusters.
The posterior is sampled through an efficient Markov chain Monte Carlo procedure
based on the Chinese restaurant process to infer (1) the posterior distribution
on the number of clusters, and (2) the clustering configuration of shapes. This
method is demonstrated on a variety of synthetic data and real data examples on
protein structure analysis, cell shape analysis in microscopy images, and
clustering of shapes from the MPEG7 database.
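The Chinese restaurant process mentioned above can be illustrated on its own. The sketch below draws a random partition from the process and does not reproduce the paper's sampler, which would combine such a prior with the Wishart model for the elastic inner-product matrix. The function name `sample_crp` and the concentration value `alpha` are assumptions made for this example.

```python
import random

def sample_crp(n, alpha, seed=None):
    """Draw cluster labels for n items from a Chinese restaurant process."""
    rng = random.Random(seed)
    labels = []
    counts = []  # counts[k] = number of items currently in cluster k
    for _ in range(n):
        # An existing cluster k has weight counts[k]; a new cluster has weight alpha.
        weights = counts + [alpha]
        k = rng.choices(range(len(weights)), weights=weights)[0]
        if k == len(counts):
            counts.append(1)   # open a new cluster
        else:
            counts[k] += 1     # join an existing cluster
        labels.append(k)
    return labels

labels = sample_crp(20, alpha=1.0, seed=0)
num_clusters = len(set(labels))
```

Because the number of occupied clusters is itself random under this process, a model built on it does not need the number of clusters fixed in advance.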
Robust Comparison of Kernel Densities on Spherical Domains
While spherical data arises in many contexts, including in directional
statistics, the current tools for density estimation and population comparison
on spheres are quite limited. Popular approaches for comparing populations (on
Euclidean domains) mostly involve a two-step procedure: (1) estimate probability
density functions (pdfs) from their respective samples, most commonly using the
kernel density estimator, and (2) compare pdfs using a metric such as the L2
norm. However, both the estimated pdfs and their differences depend heavily on
the chosen kernels, bandwidths, and sample sizes. Here we develop a framework
for comparing spherical populations that is robust to these choices.
Essentially, we characterize pdfs on spherical domains by quantifying their
smoothness. Our framework uses a spectral representation, with densities
represented by their coefficients with respect to the eigenfunctions of the
Laplacian operator on a sphere. The change in smoothness, akin to using
different kernel bandwidths, is controlled by exponential decays in coefficient
values. Then we derive a proper distance for comparing pdf coefficients while
equalizing smoothness levels, negating influences of sample size and bandwidth.
This enables a fair and meaningful comparison of populations, despite vastly
different sample sizes, and leads to robust and improved performance. We
demonstrate this framework using examples of variables on S1 and S2, and
evaluate its performance using a number of simulations and real data
experiments.
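For the circle S1, the spectral idea in this abstract can be sketched with ordinary Fourier coefficients, since the Laplacian eigenfunctions there are cos(k·θ) and sin(k·θ) with eigenvalues −k². The code below is a simplified illustration: it damps both sets of coefficients with the same assumed heat parameter `t` and takes an L2 distance. The paper's actual distance and its rule for equalizing smoothness are not given in the abstract, and all function names here are hypothetical.

```python
import math

def fourier_coeffs(angles, K):
    """Empirical coefficients (a_k, b_k), k = 1..K, of a density on S1."""
    n = len(angles)
    return [(sum(math.cos(k * th) for th in angles) / n,
             sum(math.sin(k * th) for th in angles) / n)
            for k in range(1, K + 1)]

def smooth(coeffs, t):
    """Damp the k-th coefficient pair by exp(-k^2 t) (heat-kernel decay on S1)."""
    return [(a * math.exp(-k * k * t), b * math.exp(-k * k * t))
            for k, (a, b) in enumerate(coeffs, start=1)]

def l2_distance(c1, c2):
    """L2 distance between the two truncated Fourier series."""
    s = sum((a1 - a2) ** 2 + (b1 - b2) ** 2
            for (a1, b1), (a2, b2) in zip(c1, c2))
    return math.sqrt(s / math.pi)

# Two deterministic toy samples with very different sizes.
x = [2 * math.pi * i / 50 for i in range(50)]   # evenly spread around the circle
y = [0.01 * i for i in range(-100, 100)]        # concentrated near angle 0

cx, cy = fourier_coeffs(x, K=10), fourier_coeffs(y, K=10)
d_raw = l2_distance(cx, cy)
d_smooth = l2_distance(smooth(cx, 0.1), smooth(cy, 0.1))
```

Working in coefficient space means the smoothing level is an explicit parameter rather than a side effect of kernel and bandwidth choices.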
Discovering Common Change-Point Patterns in Functional Connectivity Across Subjects
This paper studies change-points in human brain functional connectivity (FC)
and seeks patterns that are common across multiple subjects under an identical
external stimulus. FC relates to the similarity of fMRI responses across
different brain regions when the brain is simply resting or performing a task.
While the dynamic nature of FC is well accepted, this paper develops a formal
statistical test for finding change-points in time series associated
with FC. It represents short-term connectivity by a symmetric positive-definite
matrix, and uses a Riemannian metric on this space to develop a graphical
method for detecting change-points in a time series of such matrices. It also
provides a graphical representation of estimated FC for stationary subintervals
in between the detected change-points. Furthermore, it uses a temporal
alignment of the test statistic, viewed as a real-valued function over time, to
remove inter-subject variability and to discover common change-point patterns
across subjects. This method is illustrated using data from the Human Connectome
Project (HCP) database for multiple subjects and tasks.
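To make the matrix-valued time series concrete, the sketch below computes covariance matrices on non-overlapping windows and measures the distance between adjacent windows. The abstract does not say which Riemannian metric or test statistic the paper uses, so the affine-invariant metric here is an assumption, and picking the largest jump is only an illustration, not a formal test. The names `spd_distance` and `windowed_covariances` are hypothetical.

```python
import numpy as np

def spd_distance(A, B):
    """Affine-invariant Riemannian distance between SPD matrices A and B."""
    L = np.linalg.cholesky(A)            # A = L L^T
    X = np.linalg.solve(L, B)            # L^{-1} B
    M = np.linalg.solve(L, X.T)          # L^{-1} B L^{-T}, symmetric
    eigvals = np.linalg.eigvalsh((M + M.T) / 2)
    return float(np.sqrt(np.sum(np.log(eigvals) ** 2)))

def windowed_covariances(X, width):
    """Covariance of each non-overlapping window of a (T, p) time series."""
    T = X.shape[0]
    return [np.cov(X[s:s + width], rowvar=False)
            for s in range(0, T - width + 1, width)]

# Synthetic series whose covariance changes halfway through (toy data).
rng = np.random.default_rng(1)
p = 3
mix = 3.0 * np.array([[1.0, 0.9, 0.0],
                      [0.0, 1.0, 0.9],
                      [0.0, 0.0, 1.0]])
first = rng.normal(size=(200, p))
second = rng.normal(size=(200, p)) @ mix
series = np.vstack([first, second])

covs = windowed_covariances(series, width=50)
jumps = [spd_distance(covs[i], covs[i + 1]) for i in range(len(covs) - 1)]
candidate = int(np.argmax(jumps))        # index of the largest adjacent-window jump
```

The affine-invariant distance is unchanged when both matrices are transformed by the same invertible linear map, which is one reason it is a common choice for covariance data.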
Low-Rank Representation over the Manifold of Curves
In machine learning it is common to interpret each data point as a vector in
Euclidean space. However, the data may actually be functional, i.e., each data
point is a function of some variable such as time and the function is
discretely sampled. The naive treatment of functional data as traditional
multivariate data can lead to poor performance since the algorithms
ignore the correlation in the curvature of each function. In this paper we
propose a method to analyse subspace structure of the functional data by using
the state-of-the-art Low-Rank Representation (LRR). Experimental evaluation on
synthetic and real data reveals that this method massively outperforms
conventional LRR in tasks concerning functional data.
Bayesian Hierarchical Modeling on Covariance Valued Data
Analysis of structural and functional connectivity (FC) of human brains is of
pivotal importance for diagnosis of cognitive ability. The Human Connectome
Project (HCP) provides an excellent source of neural data across different
regions of interest (ROIs) of the living human brain. Individual-specific data
were available from an existing analysis (Dai et al., 2017) in the form of
time-varying covariance matrices representing the brain activity as the subjects
perform a specific task. As a preliminary objective of studying the
heterogeneity of brain connectomics across the population, we develop a
probabilistic model for a sample of covariance matrices using a scaled Wishart
distribution. We stress here that our data units are available in the form of
covariance matrices, and we use the Wishart distribution to create our
likelihood function rather than its more common usage as a prior on covariance
matrices. Based on empirical explorations suggesting that the data matrices have
low effective rank, we further model the center of the Wishart distribution
using an orthogonal factor model-type decomposition. We encourage shrinkage
towards a low-rank structure through a novel shrinkage prior and discuss
strategies to sample from the posterior distribution using a combination of
Gibbs and slice sampling. We extend our modeling framework to a dynamic setting
to detect change points. The efficacy of the approach is explored in various
simulation settings and exemplified on several case studies including our
motivating HCP data. Comment: Some key references were missing in the old version; they are
corrected in this version.
Learning Signal Subgraphs from Longitudinal Brain Networks with Symmetric Bilinear Logistic Regression
Modern neuroimaging technologies, combined with state-of-the-art data
processing pipelines, have made it possible to collect longitudinal
observations of an individual's brain connectome at different ages. It is of
substantial scientific interest to study how brain connectivity varies over
time in relation to human cognitive traits. In brain connectomics, the
structural brain network for an individual corresponds to a set of
interconnections among brain regions. We propose a symmetric bilinear logistic
regression to learn a set of small subgraphs relevant to a binary outcome from
longitudinal brain networks and to estimate the time effects of the
subgraphs. We constrain the extracted signal subgraphs to have clique structure,
which has an appealing interpretation because cliques can be related to neurological
circuits. The time effect of each signal subgraph reflects how its predictive
effect on the outcome varies over time, which may improve our understanding of
interactions between the aging of brain structure and neurological disorders.
Applying this method to longitudinal brain connectomics and cognitive
capacity data reveals relevant interconnections among a
small set of brain regions in the frontal and temporal lobes, with better predictive
performance than competitors. Comment: 34 pages, 15 figures
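The symmetric bilinear form in this abstract can be illustrated with a toy network. In the sketch below, the log-odds of the outcome are an intercept plus a weighted sum of quadratic forms `beta^T W beta`. When `beta` is an indicator of a node subset, that quadratic form counts (twice) the edges inside the subset, which is what links the model to clique-shaped subgraphs. Time effects, penalties, and fitting are omitted, and the names `bilinear_logit_prob`, `alpha`, `lams`, and `betas` are assumptions made for this example.

```python
import numpy as np

def bilinear_logit_prob(W, alpha, lams, betas):
    """P(y = 1 | W) with log-odds alpha + sum_h lams[h] * betas[h]^T W betas[h]."""
    score = alpha + sum(lam * (b @ W @ b) for lam, b in zip(lams, betas))
    return 1.0 / (1.0 + np.exp(-score))

# Toy network on 5 nodes: a triangle on nodes 0, 1, 2 plus the edge 3-4.
W = np.zeros((5, 5))
for i, j in [(0, 1), (0, 2), (1, 2), (3, 4)]:
    W[i, j] = W[j, i] = 1.0

beta = np.array([1.0, 1.0, 1.0, 0.0, 0.0])   # supported on nodes {0, 1, 2}
quad = beta @ W @ beta                        # twice the edge count inside the support
prob = bilinear_logit_prob(W, alpha=-3.0, lams=[0.5], betas=[beta])
```

Using the same vector on both sides of `W` keeps the fitted component symmetric, so each component corresponds to one node subset rather than a pair of unrelated subsets.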
Tensor network factorizations: Relationships between brain structural connectomes and traits
Advanced brain imaging techniques make it possible to measure individuals'
structural connectomes in large cohort studies non-invasively. The structural
connectome is initially shaped by genetics and subsequently refined by the
environment. It is extremely interesting to study relationships between
structural connectomes and environmental factors or human traits, such as
substance use and cognition. Due to limitations in structural connectome
recovery, previous studies largely focus on functional connectomes. Questions
remain about how well structural connectomes can explain variance in different
human traits. Using a state-of-the-art structural connectome processing
pipeline and a novel dimensionality reduction technique applied to data from
the Human Connectome Project (HCP), we show strong relationships between
structural connectomes and various human traits. Our dimensionality reduction
approach uses a tensor characterization of the connectome and relies on a
generalization of principal components analysis. We analyze over 1100 scans for
1076 subjects from the HCP and the Sherbrooke test-retest data set, as well as
human traits that measure domains including cognition, substance use,
motor, sensory and emotion. We find that structural connectomes are associated
with many traits. Specifically, fluid intelligence, language comprehension, and
motor skills are associated with increased cortical-cortical brain structural
connectivity, while the use of alcohol, tobacco, and marijuana is associated
with decreased cortical-cortical connectivity.
Video-Based Action Recognition Using Rate-Invariant Analysis of Covariance Trajectories
Statistical classification of actions in videos is mostly performed by
extracting relevant features, particularly covariance features, from image
frames and studying time series associated with temporal evolutions of these
features. A natural mathematical representation of activity videos is in the form
of parameterized trajectories on the covariance manifold, i.e., the set of
symmetric, positive-definite matrices (SPDMs). The variable execution rates of
actions imply variable parameterizations of the resulting trajectories and
complicate their classification. Since action classes are invariant to
execution rates, one requires rate-invariant metrics for comparing
trajectories. A recent paper represented trajectories using their transported
square-root vector fields (TSRVFs), defined by parallel translating
scaled-velocity vectors of trajectories to a reference tangent space on the
manifold. To avoid arbitrariness of selecting the reference and to reduce
distortion introduced during this mapping, we develop a purely intrinsic
approach where SPDM trajectories are represented by redefining their TSRVFs at
the starting points of the trajectories, and analyzed as elements of a vector
bundle on the manifold. Using a natural Riemannian metric on vector bundles of
SPDMs, we compute geodesic paths and geodesic distances between trajectories in
the quotient space of this vector bundle, with respect to the
re-parameterization group. This makes the resulting comparison of trajectories
invariant to their re-parameterization. We demonstrate this framework on two
applications involving video classification: visual speech recognition (or
lip-reading) and hand-gesture recognition. In both cases we achieve results
either comparable to or better than the current literature.