A simple and objective method for reproducible resting state network (RSN) detection in fMRI
Spatial Independent Component Analysis (ICA) decomposes the time by space
functional MRI (fMRI) matrix into a set of 1-D basis time courses and their
associated 3-D spatial maps that are optimized for mutual independence. When
applied to resting state fMRI (rsfMRI), ICA produces several spatial
independent components (ICs) that seem to have biological relevance - the
so-called resting state networks (RSNs). In terms of the identifiability of the
mixing matrix, the ICA problem is well posed when the true data-generating process
follows a linear mixture-of-ICs model. However, the contrast function used
for promoting mutual independence in ICA is dependent on the finite amount of
observed data and is potentially non-convex with multiple local minima. Hence,
each run of ICA could produce potentially different IC estimates even for the
same data. One technique to deal with this run-to-run variability of ICA was
proposed by Yang et al. (2008) in their algorithm RAICAR which allows for the
selection of only those ICs that have a high run-to-run reproducibility. We
propose an enhancement to the original RAICAR algorithm that enables us to
assign reproducibility p-values to each IC and allows for an objective
assessment of both within-subject and across-subject reproducibility. We call
the resulting algorithm RAICAR-N (N stands for null hypothesis test), and we
have applied it to publicly available human rsfMRI data (http://www.nitrc.org).
Our reproducibility analyses indicated that many of the published RSNs in
rsfMRI literature are highly reproducible. However, we found several other RSNs
that are highly reproducible but not frequently listed in the literature.
Comment: 54 pages, 13 figures
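The run-to-run variability described in this abstract can be illustrated with a small sketch. It is not RAICAR or RAICAR-N: it only scores each component of one ICA run by its best absolute correlation with the components of a second run, which accounts for the fact that ICA recovers components only up to order and sign. The function names and the toy "spatial maps" are assumptions made for illustration.

```python
import math

def pearson(u, v):
    """Pearson correlation between two equal-length sequences."""
    n = len(u)
    mean_u, mean_v = sum(u) / n, sum(v) / n
    cov = sum((a - mean_u) * (b - mean_v) for a, b in zip(u, v))
    norm_u = math.sqrt(sum((a - mean_u) ** 2 for a in u))
    norm_v = math.sqrt(sum((b - mean_v) ** 2 for b in v))
    return cov / (norm_u * norm_v)

def best_match_scores(run_a, run_b):
    """For each component in run_a, the largest |correlation| with any component in run_b.

    The absolute value handles the sign ambiguity of ICA; taking the maximum
    over run_b handles the arbitrary ordering of components.
    """
    return [max(abs(pearson(a, b)) for b in run_b) for a in run_a]

# Toy spatial maps (hypothetical values): run_b holds the same two components
# as run_a, but permuted, with one sign flipped, and with small perturbations.
run_a = [[1.0, 2.0, 3.0, 4.0, 5.0], [2.0, -1.0, 0.5, -2.0, 1.0]]
run_b = [[-2.1, 0.9, -0.4, 2.2, -1.1], [1.1, 1.9, 3.2, 3.9, 5.1]]

scores = best_match_scores(run_a, run_b)
print(scores)  # values near 1 indicate components that reappear across runs
```

A component whose score stays close to 1 across many pairs of runs is a candidate for being reproducible; assigning p-values to such scores is the part that the abstract attributes to RAICAR-N and is not attempted here.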
Model-based deep autoencoders for clustering single-cell RNA sequencing data with side information
Clustering analysis has been conducted extensively in single-cell RNA sequencing (scRNA-seq) studies. scRNA-seq can profile the activity of tens of thousands of genes within a single cell, and thousands or tens of thousands of cells can be captured simultaneously in a typical scRNA-seq experiment. Biologists would like to cluster these cells to explore and elucidate cell types or subtypes. Numerous methods have been designed for clustering scRNA-seq data. Yet single-cell technologies have developed so fast in the past few years that existing methods have not kept up with these changes and fail to realize their full potential. For instance, besides profiling the transcriptional expression levels of genes, recent single-cell technologies can capture other auxiliary information at the single-cell level, such as protein expression (multi-omics scRNA-seq) and cells' spatial location (spatial-resolved scRNA-seq). Most existing clustering methods for scRNA-seq are unsupervised and fail to exploit the available side information to optimize clustering performance.
This dissertation focuses on developing novel computational methods for clustering scRNA-seq data. The basic models are built on a deep autoencoder (AE) framework, which is coupled with a ZINB (zero-inflated negative binomial) loss to characterize the zero-inflated and over-dispersed scRNA-seq count data. To integrate multi-omics scRNA-seq data, a multimodal autoencoder (MAE) is employed. It applies one encoder to the multimodal inputs and two decoders to reconstruct each omics modality. This model is named scMDC (Single-Cell Multi-omics Deep Clustering). In addition, cells in spatial proximity are expected to tend to be of the same cell type. To exploit the cellular spatial information available for spatial-resolved scRNA-seq (sp-scRNA-seq) data, a novel model, DSSC (Deep Spatial-constrained Single-cell Clustering), is developed. DSSC integrates the spatial information of cells into the clustering process in two steps: 1) the spatial information is encoded using a graph neural network model; 2) cell-to-cell constraints are built from the spatial expression patterns of the marker genes and added to the model to guide the clustering process. DSSC is the first model that can use information from both the spatial coordinates and the marker genes to guide cell/spot clustering. For both scMDC and DSSC, a clustering loss is optimized on the bottleneck layer of the autoencoder along with the learning of the feature representation. Extensive experiments on both simulated and real datasets demonstrate that scMDC and DSSC boost clustering performance significantly while costing no extra time or space during training. These models hold great promise as valuable tools for harnessing the full potential of state-of-the-art single-cell data.
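The ZINB loss mentioned above can be sketched as a negative log-likelihood for a single count. This is a minimal illustration using the standard mean/dispersion parameterization of the negative binomial; the function names, the scalar interface, and the example parameter values are assumptions made here, not the scMDC or DSSC implementation.

```python
import math

def nb_log_pmf(x, mu, theta):
    """Log-probability of count x under a negative binomial with mean mu > 0
    and dispersion theta > 0 (variance = mu + mu**2 / theta)."""
    return (math.lgamma(x + theta) - math.lgamma(theta) - math.lgamma(x + 1)
            + theta * math.log(theta / (theta + mu))
            + x * math.log(mu / (theta + mu)))

def zinb_nll(x, mu, theta, pi):
    """Negative log-likelihood of count x under a zero-inflated negative binomial.

    pi is the probability of a structural zero (dropout); with probability
    1 - pi the count is drawn from the negative binomial.
    """
    nb_prob = math.exp(nb_log_pmf(x, mu, theta))
    if x == 0:
        prob = pi + (1.0 - pi) * nb_prob  # zeros come from either source
    else:
        prob = (1.0 - pi) * nb_prob       # positive counts come only from the NB part
    return -math.log(prob)

# Hypothetical parameter values, chosen only for illustration.
mu, theta, pi = 2.0, 1.5, 0.3
total_probability = sum(math.exp(-zinb_nll(x, mu, theta, pi)) for x in range(500))
print(total_probability)  # should be very close to 1
```

In an autoencoder setting the decoder would typically output mu, theta, and pi for every gene and cell, and the loss would be this quantity summed over all entries of the count matrix.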
Cosmic Constraint to DGP Brane Model: Geometrical and Dynamical Perspectives
In this paper, the Dvali-Gabadadze-Porrati (DGP) brane model is confronted with
current cosmic observational data sets from geometrical and dynamical
perspectives. On the geometrical side, the recently released Union2 compilation of type
Ia supernovae (SN Ia), the baryon acoustic oscillation (BAO) from Sloan Digital
Sky Survey and the Two Degree Galaxy Redshift Survey (transverse and radial to
line-of-sight data points), the cosmic microwave background (CMB) measurement
given by the seven-year Wilkinson Microwave Anisotropy Probe observations
(shift parameters $R$ and $l_a$, and the redshift at the last scattering surface
$z_\ast$), ages of high-redshift galaxies, i.e. the lookback time (LT), and the
high redshift Gamma Ray Bursts (GRBs) are used. On the dynamical side, data
points about the growth function (GF) of matter linear perturbations are used.
Using the same combination of data sets, we also constrain the flat ΛCDM
model as a comparison. The results show that current geometrical and dynamical
observational data sets strongly favor the flat ΛCDM model, and the departure
from it is above () for spatially flat DGP model
with (without) SN systematic errors. The consistency of the growth function data
points is checked in terms of the relative departure of the redshift-distance relation.
Comment: 14 pages, 5 figures, 2 tables, accepted for publication in Physical
Review
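The geometrical comparison between the two models can be sketched through their expansion rates. The block below uses the standard textbook forms for spatially flat models with radiation neglected: E(z) = sqrt(Ω_m(1+z)^3 + 1 - Ω_m) for ΛCDM, and E(z) = sqrt(Ω_m(1+z)^3 + Ω_rc) + sqrt(Ω_rc) with Ω_rc = (1 - Ω_m)^2 / 4 for DGP. These formulas, the function names, and the value Ω_m = 0.3 are assumptions introduced for illustration; they are not taken from the abstract and are not fitted values.

```python
import math

def e_lcdm(z, omega_m):
    """Dimensionless Hubble rate H(z)/H0 for flat LambdaCDM (radiation neglected)."""
    return math.sqrt(omega_m * (1.0 + z) ** 3 + (1.0 - omega_m))

def e_dgp(z, omega_m):
    """Dimensionless Hubble rate H(z)/H0 for the flat self-accelerating DGP model."""
    omega_rc = (1.0 - omega_m) ** 2 / 4.0  # flatness condition fixes omega_rc
    return math.sqrt(omega_m * (1.0 + z) ** 3 + omega_rc) + math.sqrt(omega_rc)

def comoving_distance(e_func, z, omega_m, steps=1000):
    """Integral of dz'/E(z') from 0 to z in units of c/H0, by Simpson's rule.

    steps must be even.
    """
    h = z / steps
    total = 1.0 / e_func(0.0, omega_m) + 1.0 / e_func(z, omega_m)
    for k in range(1, steps):
        weight = 4.0 if k % 2 else 2.0
        total += weight / e_func(k * h, omega_m)
    return total * h / 3.0

omega_m = 0.3  # illustrative value only
d_lcdm = comoving_distance(e_lcdm, 1.0, omega_m)
d_dgp = comoving_distance(e_dgp, 1.0, omega_m)
print(d_lcdm, d_dgp)
```

For the same Ω_m the DGP expansion rate exceeds the ΛCDM one at every z > 0, so distances to a given redshift are shorter. Distance probes such as SN Ia and BAO are sensitive to this kind of difference.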
Complex Independent Component Analysis of Frequency-Domain Electroencephalographic Data
Independent component analysis (ICA) has proven useful for modeling brain and
electroencephalographic (EEG) data. Here, we present a new, generalized method
to better capture the dynamics of brain signals than previous ICA algorithms.
We regard EEG sources as eliciting spatio-temporal activity patterns,
corresponding to, e.g., trajectories of activation propagating across cortex.
This leads to a model of convolutive signal superposition, in contrast with the
commonly used instantaneous mixing model. In the frequency domain, convolutive
mixing is equivalent to multiplicative mixing of complex signal sources within
distinct spectral bands. We decompose the recorded spectral-domain signals into
independent components by a complex infomax ICA algorithm. First results from a
visual attention EEG experiment exhibit (1) sources of spatio-temporal dynamics
in the data, (2) links to subject behavior, (3) sources with a limited spectral
extent, and (4) a higher degree of independence compared to sources derived by
standard ICA.
Comment: 21 pages, 11 figures. Added final journal reference, fixed minor typo
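The statement that convolutive mixing becomes multiplicative mixing in the frequency domain can be checked numerically. The sketch below mixes two toy sources into two sensors with short filters and compares the DFT of the mixed signals with the per-frequency product of complex mixing coefficients and source spectra. The identity is exact for circular convolution, which is what the code uses; for windowed spectra of real recordings it holds only approximately. All signal values, filter taps, and names are assumptions made for illustration and do not reproduce the complex infomax algorithm.

```python
import cmath

def dft(x):
    """Plain discrete Fourier transform of a sequence."""
    n = len(x)
    return [sum(x[t] * cmath.exp(-2j * cmath.pi * f * t / n) for t in range(n))
            for f in range(n)]

def circular_convolve(taps, signal):
    """Circular convolution of a short filter with a signal of length n."""
    n = len(signal)
    return [sum(taps[tau] * signal[(t - tau) % n] for tau in range(len(taps)))
            for t in range(n)]

n = 8
sources = [[1.0, 0.0, -1.0, 2.0, 0.5, -0.5, 0.0, 1.5],
           [0.3, -1.2, 0.8, 0.0, 1.0, -0.7, 0.2, 0.4]]
# filters[i][j] is the filter carrying source j to sensor i (toy values).
filters = [[[1.0, 0.5, 0.0], [0.2, -0.3, 0.1]],
           [[0.0, 0.7, 0.2], [1.0, 0.0, -0.4]]]

# Time domain: each sensor is a sum of filtered (convolved) sources.
convolved = [[circular_convolve(filters[i][j], sources[j]) for j in range(2)]
             for i in range(2)]
sensors = [[convolved[i][0][t] + convolved[i][1][t] for t in range(n)]
           for i in range(2)]

# Frequency domain: at each frequency, sensors = complex matrix times sources.
source_spectra = [dft(s) for s in sources]
mixing_spectra = [[dft(taps + [0.0] * (n - len(taps))) for taps in row]
                  for row in filters]
predicted = [[sum(mixing_spectra[i][j][f] * source_spectra[j][f] for j in range(2))
              for f in range(n)] for i in range(2)]
observed = [dft(x) for x in sensors]

max_error = max(abs(predicted[i][f] - observed[i][f])
                for i in range(2) for f in range(n))
print(max_error)  # numerically zero
```

Because each frequency bin has its own complex mixing matrix, an ICA algorithm operating on complex-valued data can be applied within each spectral band, which is the setting the abstract describes.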