12 research outputs found

    Estimating multivariate similarity between neuroimaging datasets with sparse canonical correlation analysis: an application to perfusion imaging

    An increasing number of neuroimaging studies are based on either combining more than one data modality (inter-modal) or combining more than one measurement from the same modality (intra-modal). To date, most intra-modal studies using multivariate statistics have focused on differences between datasets, for instance relying on classifiers to differentiate between effects in the data. However, to fully characterize these effects, multivariate methods able to measure similarities between datasets are needed. One classical technique for estimating the relationship between two datasets is canonical correlation analysis (CCA). However, in the context of high-dimensional data the application of CCA is extremely challenging. A recent extension of CCA, sparse CCA (SCCA), overcomes this limitation by regularizing the model parameters while yielding a sparse solution. In this work, we modify SCCA with the aim of facilitating its application to high-dimensional neuroimaging data and finding meaningful multivariate image-to-image correspondences in intra-modal studies. In particular, we show how the optimal subset of variables can be estimated independently and we look at the information encoded in more than one set of SCCA transformations. We illustrate our framework using Arterial Spin Labeling data to investigate multivariate similarities between the effects of two antipsychotic drugs on cerebral blood flow.
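
The idea of a sparse CCA can be illustrated with a minimal sketch. This is not the modified SCCA described in the abstract above; it is a generic rank-1 scheme (alternating soft-thresholding on the cross-covariance matrix, in the spirit of penalized matrix decomposition), and the function name `sparse_cca_rank1` and the parameters `frac_u`, `frac_v`, and `n_iter` are hypothetical choices made for illustration.

```python
import numpy as np


def soft_threshold(a, thresh):
    """Shrink each entry of `a` towards zero by `thresh` (L1 proximal step)."""
    return np.sign(a) * np.maximum(np.abs(a) - thresh, 0.0)


def normalize(v):
    """Scale `v` to unit L2 norm; an all-zero vector is returned unchanged."""
    norm = np.linalg.norm(v)
    return v / norm if norm > 0 else v


def sparse_cca_rank1(X, Y, frac_u=0.3, frac_v=0.3, n_iter=100):
    """Illustrative rank-1 sparse CCA (assumed scheme, not the paper's method).

    X: (n, p) array, Y: (n, q) array, rows are paired observations.
    frac_u, frac_v: thresholds in [0, 1) as a fraction of the largest
    absolute entry, so at least one weight always survives.
    Returns unit-norm sparse weight vectors u (length p) and v (length q).
    """
    Xc = X - X.mean(axis=0)
    Yc = Y - Y.mean(axis=0)
    C = Xc.T @ Yc  # cross-covariance matrix, up to a 1/(n-1) factor

    # Start from the leading right singular vector of C.
    _, _, Vt = np.linalg.svd(C, full_matrices=False)
    v = Vt[0]

    for _ in range(n_iter):
        a = C @ v
        u = normalize(soft_threshold(a, frac_u * np.abs(a).max()))
        b = C.T @ u
        v = normalize(soft_threshold(b, frac_v * np.abs(b).max()))
    return u, v
```

The non-zero entries of `u` and `v` mark the variables selected in each dataset, and the correlation between `X @ u` and `Y @ v` gives an estimate of the canonical correlation for that pair of sparse weight vectors.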

    The application of omics techniques to understand the role of the gut microbiota in inflammatory bowel disease

    The aetiopathogenesis of inflammatory bowel diseases (IBD) involves the complex interaction between a patient’s genetic predisposition, environment, gut microbiota and immune system. Currently, however, it is not known if the distinctive perturbations of the gut microbiota that appear to accompany both Crohn’s disease and ulcerative colitis are the cause of, or the result of, the intestinal inflammation that characterizes IBD. With the utilization of novel systems biology technologies, we can now begin to understand not only details about compositional changes in the gut microbiota in IBD, but increasingly also the alterations in microbiota function that accompany these. Technologies such as metagenomics, metataxomics, metatranscriptomics, metaproteomics and metabonomics are therefore allowing us a deeper understanding of the role of the microbiota in IBD. Furthermore, the integration of these systems biology technologies through advancing computational and statistical techniques is beginning to reveal the microbiome interactions that contribute to both health and diseased states in IBD. This review aims to explore how such systems biology technologies are advancing our understanding of the gut microbiota, and their potential role in delineating the aetiology, development and clinical care of IBD.

    Pathway-Based Multi-Omics Data Integration for Breast Cancer Diagnosis and Prognosis.

    Ph.D. Thesis. University of Hawaiʻi at Mānoa 2017

    Integrated analysis of miRNA/mRNA expression and gene methylation using sparse canonical correlation analysis.

    MicroRNAs (miRNAs) are a large class of small endogenous non-coding RNA molecules (18-25 nucleotides in length) that regulate the expression of genes post-transcriptionally. While a variety of algorithms exist for determining the targets of miRNAs, they are generally based on sequence information and frequently produce lists consisting of thousands of genes. Canonical correlation analysis (CCA) is a multivariate statistical method that can be used to find linear relationships between two data sets, and here we apply CCA to find the linear combination of differentially expressed miRNAs and their corresponding target genes having maximal negative correlation. Due to the high dimensionality, sparse CCA is used to constrain the problem and obtain a solution. A novel gene set enrichment analysis statistic is proposed based on the sparse CCA results for estimating the significance of predefined gene sets. The methods are illustrated with both a simulation study and real miRNA-mRNA expression data. DNA methylation, the addition of a methyl group to DNA by a group of enzymes collectively known as DNA methyltransferases, is an epigenetic modification critical to normal genome regulation and development. In order to understand the role of DNA methylation in gene differentiation, we analyze genome-scale DNA methylation patterns and gene expression data using sparse CCA to find linear combinations of the two data sets which have maximal negative correlation. In a similar spirit to the miRNA-mRNA study, we create a GSEA statistic with weight vectors from the sparse CCA method and assess the significance of predefined gene sets. The method is exemplified with real gene expression / DNA methylation data regarding the development of the embryonic murine palate.

    A Likelihood Based Framework for Data Integration with Application to eQTL Mapping

    We develop a new way of thinking about and integrating gene expression data (continuous) and genomic information data (binary) by jointly compressing the two data sets and embedding their signals in low dimensional feature spaces with an information sharing mechanism, which connects the continuous data to the binary data, under the penalized log-likelihood framework. In particular, the continuous data are modeled by a Gaussian likelihood and the binary data are modeled by a Bernoulli likelihood which is formed by transforming the feature space of the genomic information with a logit link. The smoothly clipped absolute deviation (SCAD) penalty is added to the basis vectors of the low dimensional feature spaces for both data sets. This choice is based on the assumption that only a small set of genetic variants are associated with a small fraction of gene expression, and on the fact that those basis vectors can be interpreted as weights assigned to the genetic variants and gene expression, similar to the way the loading vectors of principal component analysis (PCA) or canonical correlation analysis (CCA) are interpreted. Algorithmically, a Majorization-Minimization (MM) algorithm with local linear approximation (LLA) of the SCAD penalty is developed to effectively and efficiently solve the optimization problem involved, which produces closed-form updating rules. The effectiveness of our method is demonstrated by simulations in various setups with comparisons to some popular competing methods and an application to eQTL mapping with real data.
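
As a rough illustration of the SCAD penalty and its local linear approximation, the sketch below uses the standard SCAD definition with the customary default `a = 3.7`. The function names are hypothetical, and `lla_step` solves only a toy scalar problem (a squared-error term plus the linearized penalty), not the joint Gaussian and Bernoulli likelihood model described in the abstract above.

```python
import numpy as np


def scad_penalty(t, lam, a=3.7):
    """SCAD penalty evaluated elementwise at |t| (requires a > 2, lam > 0)."""
    t = np.abs(np.asarray(t, dtype=float))
    linear = lam * t                                   # |t| <= lam
    quadratic = (2 * a * lam * t - t**2 - lam**2) / (2 * (a - 1))  # lam < |t| <= a*lam
    constant = lam**2 * (a + 1) / 2                    # |t| > a*lam
    return np.where(t <= lam, linear,
                    np.where(t <= a * lam, quadratic, constant))


def scad_derivative(t, lam, a=3.7):
    """Derivative of the SCAD penalty with respect to |t|."""
    t = np.abs(np.asarray(t, dtype=float))
    return np.where(t <= lam, lam, np.maximum(a * lam - t, 0.0) / (a - 1))


def lla_step(z, beta_old, lam, a=3.7):
    """One LLA update for the toy problem min_b 0.5 * (z - b)**2 + SCAD(|b|).

    The penalty is replaced by its tangent at |beta_old|, which turns the
    update into a weighted soft-thresholding with a closed form.
    """
    z = np.asarray(z, dtype=float)
    weights = scad_derivative(beta_old, lam, a)
    return np.sign(z) * np.maximum(np.abs(z) - weights, 0.0)
```

Because the derivative is zero beyond `a * lam`, large coefficients are left unshrunk by the LLA step, while small coefficients receive the full L1-style threshold `lam`. This is the property that distinguishes SCAD from a plain lasso penalty.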

    NSAID Enteropathy: Novel Aspects of Pathophysiology, Diagnosis, and Treatment

    Although non-steroidal anti-inflammatory drugs (NSAIDs) are among the most frequently used classes of medications in the world, they are well-known to induce an enteropathy that is associated with high morbidity and mortality in upwards of 70% of users. The diagnosis of NSAID enteropathy is difficult. Furthermore, the underlying mechanisms by which NSAIDs induce enteropathy remain ill-defined, although microbiota-host interactions appear to play an important role. Importantly, in addition to difficulty in diagnosing this disease, there are also no effective treatment strategies. Therefore, the purpose of this research was to determine if the microbiota-derived metabolite indole could attenuate the severity of NSAID enteropathy. A second goal was to determine if the transcriptome of exfoliated intestinal epithelial cells (IECs) found in the stool could be reflective of NSAID enteropathy, thereby allowing a non-invasive approach to studying how the mucosal transcriptome is altered by NSAIDs and potentially discriminating between healthy and diseased animals. We utilized a mouse model of NSAID enteropathy, whereby mice were assigned to 1 of 4 groups: 1) NSAID; 2) indole; 3) NSAID + indole; and 4) untreated controls. Disease severity was determined by a number of assays including fecal calprotectin, microscopic pathology, neutrophil infiltration, and RNA-seq of the ileal mucosa. Diversity and composition of the fecal microbiota were determined by 16S rRNA sequencing. Non-invasive examination of the mucosal transcriptome was performed by isolation and sequencing of polyA+ RNA from the stool, followed by novel computational approaches to assess the inter-relatedness of exfoliated and tissue-level transcriptomes. Results from these assays revealed that indole did in fact attenuate disease severity and this improvement appeared to be related to composition of the microbiota. In addition, approximately 96% of all genes that were mapped from the exfoliated cell RNA were also present in the tissue-level RNA, and the pathways represented by these genes and their directional changes were similar in both the small intestinal mucosa and exfoliated IEC transcriptome. These findings demonstrate that the exfoliated cell transcriptome correlates with the tissue-level transcriptome and can be used to gain longitudinal information related to NSAID-induced alterations of the mucosal transcriptome and to discriminate between diseased and healthy animals.