2,411 research outputs found

    A Novel Joint Gene Set Analysis Framework Improves Identification of Enriched Pathways in Cross Disease Transcriptomic Analysis

    Motivation: Gene set enrichment analysis is a widely accepted expression analysis tool that aims to detect coordinated expression change within pre-defined gene sets rather than in individual genes. The benefits of gene set analysis over individual differentially expressed (DE) gene analysis include more reproducible and interpretable results and the ability to detect small but consistent changes across a gene set that DE gene analysis would miss. There have been many successful applications of gene set analysis in human diseases. However, when the sample size of a disease study is small and no other public data sets for the same disease are available, the analysis lacks the power to detect pathways important to the disease. Results: We have developed a novel joint gene set analysis statistical framework that aims to improve the power to identify enriched gene sets by integrating multiple similar disease data sets. Through comprehensive simulation studies, we demonstrated that our proposed framework obtained much better AUC scores than single data set analysis and another meta-analysis method in identifying enriched pathways. When applied to two real data sets, the proposed framework retained the enriched gene sets identified by single data set analysis and exclusively obtained up to 200% more disease-related gene sets, demonstrating the improved identification power gained through information shared between similar diseases. We expect that the proposed framework will enable researchers to make better use of public data sets when the sample size of their study is limited.
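
    The abstract does not specify how the joint framework combines data sets, so the following is only a generic baseline sketch, not the authors' method. It assumes a simple over-representation test (hypergeometric tail) per data set and Fisher's method to combine independent p-values across data sets. The function names and the example counts are hypothetical.

```python
from math import comb, exp, log


def hypergeom_enrichment_p(n_universe, n_set, n_de, n_overlap):
    """Upper-tail hypergeometric p-value.

    Probability of seeing at least n_overlap gene-set members among n_de
    DE genes drawn from n_universe genes, n_set of which are in the set.
    """
    total = comb(n_universe, n_de)
    upper = min(n_set, n_de)
    tail = sum(
        comb(n_set, k) * comb(n_universe - n_set, n_de - k)
        for k in range(n_overlap, upper + 1)
    )
    return tail / total


def fisher_combined_p(p_values):
    """Combine independent p-values with Fisher's method.

    Under the null, -2 * sum(log p) follows a chi-square distribution with
    2k degrees of freedom; for even degrees of freedom the survival
    function has the closed form used below.
    """
    k = len(p_values)
    # Clamp to avoid log(0); the clamp value is an arbitrary assumption.
    half_stat = -sum(log(max(p, 1e-300)) for p in p_values)
    term, acc = 1.0, 0.0
    for i in range(k):
        acc += term
        term *= half_stat / (i + 1)
    return exp(-half_stat) * acc


# Hypothetical counts for the same pathway in two similar-disease data sets.
p_study_a = hypergeom_enrichment_p(n_universe=2000, n_set=50, n_de=100, n_overlap=6)
p_study_b = hypergeom_enrichment_p(n_universe=2000, n_set=50, n_de=120, n_overlap=7)
p_joint = fisher_combined_p([p_study_a, p_study_b])
```

    Fisher's method assumes the per-study p-values are independent, which is plausible for separate cohorts but is still an assumption of this sketch.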

    Topological Analysis of Metabolic Networks Integrating Co-Segregating Transcriptomes and Metabolomes in Type 2 Diabetic Rat Congenic Series

    Background: The genetic regulation of metabolic phenotypes (i.e., metabotypes) in type 2 diabetes mellitus is caused by complex organ-specific cellular mechanisms contributing to impaired insulin secretion and insulin resistance. Methods: We used systematic metabotyping by 1H NMR spectroscopy and genome-wide gene expression in white adipose tissue to map molecular phenotypes to genomic blocks associated with obesity and insulin secretion in a series of rat congenic strains derived from spontaneously diabetic Goto-Kakizaki (GK) and normoglycemic Brown-Norway (BN) rats. We implemented a network biology strategy to visualise shortest paths between metabolites and genes significantly associated with each genomic block. Results: Despite strong genomic similarities (95-99%) among congenics, each strain exhibited specific patterns of gene expression and metabotypes, reflecting the metabolic consequences of a series of linked genetic polymorphisms in the congenic intervals. We subsequently used the congenic panel to map quantitative trait loci underlying specific metabotypes (mQTL) and genome-wide expression traits (eQTL). Variation in key metabolites such as glucose, succinate, lactate or 3-hydroxybutyrate, and in second messenger precursors such as inositol, was associated with several independent genomic intervals, indicating functional redundancy in these regions. To navigate the complexity of these association networks, we mapped candidate genes and metabolites onto metabolic pathways and implemented a shortest path strategy to highlight potential mechanistic links between metabolites and transcripts at colocalized mQTLs and eQTLs. Minimizing shortest path length drove prioritization of biological validations by gene silencing. Conclusions: These results underline the importance of network-based integration of multilevel systems genetics datasets to improve understanding of the genetic architecture of metabotype and transcriptomic regulation and to characterize novel functional roles for genes determining tissue-specific metabolism.
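
    The shortest path strategy mentioned in the abstract can be illustrated with a minimal sketch, assuming an unweighted network stored as an adjacency dictionary and a plain breadth-first search. The abstract does not describe the network construction or the algorithm actually used; the node names below (`R1`, `R2`, `gene_A`, `gene_B`) are hypothetical placeholders.

```python
from collections import deque


def shortest_path(graph, source, target):
    """Return one shortest path (as a list of nodes) in an unweighted graph.

    Returns None when target cannot be reached from source.
    """
    if source == target:
        return [source]
    parents = {source: None}
    queue = deque([source])
    while queue:
        node = queue.popleft()
        for neighbour in graph.get(node, ()):
            if neighbour in parents:
                continue  # already visited
            parents[neighbour] = node
            if neighbour == target:
                # Walk back through the parent links to rebuild the path.
                path = [neighbour]
                while parents[path[-1]] is not None:
                    path.append(parents[path[-1]])
                return path[::-1]
            queue.append(neighbour)
    return None


# Toy metabolite-reaction-gene network (entirely hypothetical).
toy_network = {
    "glucose": ["R1"],
    "R1": ["glucose", "pyruvate", "gene_A"],
    "pyruvate": ["R1", "R2"],
    "R2": ["pyruvate", "gene_B"],
    "gene_A": ["R1"],
    "gene_B": ["R2"],
}

path_to_a = shortest_path(toy_network, "glucose", "gene_A")
path_to_b = shortest_path(toy_network, "glucose", "gene_B")
```

    In this toy network `gene_A` is two steps from glucose and `gene_B` is four, so a prioritization based on minimal path length would rank `gene_A` first.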

    Molecular epidemiology study on genetically regulated gene expression in the colonic mucosa and its role in disease susceptibility

    Gene expression is a key cellular process that is also linked to genetic susceptibility to complex diseases and traits. Most genes undergo alternative splicing (AS). Genetic variants that regulate gene expression and AS are called quantitative trait loci (e/sQTLs). Statistical techniques make it possible to predict gene expression in a specific tissue in silico from genetic data. This approach is used in transcriptome-wide association studies (TWAS). This thesis has three main objectives and presents three articles: 1) to generate gene expression profiles of the colonic mucosa of healthy individuals, together with their differences along the colon and their associated e/sQTLs; 2) to develop a web application for exploring gene expression data in the colon; 3) to carry out a TWAS to propose susceptibility genes for inflammatory bowel disease (IBD). In summary, 1) catalogues of e/sQTLs were generated from new colon gene expression data from 445 individuals, and more than 4,000 genes were found to vary in expression level along the colon; 2) the "Colon Transcriptome Explorer" was developed and is publicly available at https://barcuvaseq.org/cotrex/; 3) more than two hundred genetic susceptibility genes for IBD were proposed. In conclusion, our studies provide new data and evidence on the genes involved in susceptibility mechanisms for colon-related diseases and will guide other researchers in proposing new hypotheses in this field.

    Small RNA signatures of the anterior cruciate ligament from patients with knee joint osteoarthritis

    The anterior cruciate ligaments are susceptible to degeneration, resulting in pain, reduced mobility and development of the degenerative joint disease osteoarthritis. There is currently a paucity of knowledge on how anterior cruciate ligament degeneration and disease can lead to osteoarthritis. Small non-coding RNAs (sncRNAs), such as microRNAs and small nucleolar RNAs, are important regulators of gene expression. We aimed to identify sncRNA profiles of human anterior cruciate ligaments to provide novel insights into their roles in osteoarthritis. RNA was extracted from the anterior cruciate ligaments of non-osteoarthritic knee joints (control) and end-stage osteoarthritis knee joints and used for small RNA sequencing, and significantly differentially expressed sncRNAs were defined. Bioinformatic analysis was undertaken on the differentially expressed miRNAs and their putative target mRNAs to investigate the pathways and biological processes affected. Our analysis identified 184 sncRNAs that were differentially expressed between control ACLs and ACLs derived from osteoarthritic joints with a false discovery adjusted p value<0.05: 68 small nucleolar RNAs, 26 small nuclear RNAs and 90 microRNAs. We identified both novel and previously identified (miR-206, –101, –365 and –29b and –29c) osteoarthritis-related microRNAs and other sncRNAs (including SNORD74, SNORD114, SNORD72) differentially expressed in ligaments derived from osteoarthritic joints. Significant cellular functions deduced from the differentially expressed miRNAs included differentiation of muscle (P<0.001), inflammation (P<1.42E-10), proliferation of chondrocytes (P<0.03), fibrosis (P<0.001) and cell viability (P<0.03). Putative mRNAs were associated with the canonical pathways ‘Hepatic Fibrosis Signalling’ (P<3.7E-32) and ‘Osteoarthritis’ (P<2.2E-23). Biological processes included apoptosis (P<1.7E-85), fibrosis (P<1.2E-79), inflammation (P<3.4E-88), necrosis (P<7.2E-88) and angiogenesis (P<5.7E-101). SncRNAs are important regulators of anterior cruciate disease during osteoarthritis and may be used as therapeutic targets to prevent and manage anterior cruciate ligament disease and the resultant osteoarthritis.
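
    The abstract reports a "false discovery adjusted p value" without naming the procedure. A common choice is the Benjamini-Hochberg adjustment, sketched below under that assumption; the input p-values are made up for illustration and are not taken from the study.

```python
def benjamini_hochberg(p_values):
    """Return Benjamini-Hochberg adjusted p-values in the original order."""
    m = len(p_values)
    # Indices of the p-values sorted from smallest to largest.
    order = sorted(range(m), key=lambda i: p_values[i])
    adjusted = [0.0] * m
    running_min = 1.0
    # Walk from the largest p-value down, enforcing monotonicity.
    for rank in range(m, 0, -1):
        idx = order[rank - 1]
        running_min = min(running_min, p_values[idx] * m / rank)
        adjusted[idx] = running_min
    return adjusted


# Made-up raw p-values for four hypothetical sncRNAs.
raw = [0.01, 0.04, 0.03, 0.005]
adj = benjamini_hochberg(raw)
significant = [i for i, q in enumerate(adj) if q < 0.05]
```

    Each adjusted value is the smallest false discovery rate at which that feature would still be called significant, which is why a threshold such as 0.05 is applied to the adjusted rather than the raw p-values.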

    Computational Methods for the Analysis of Genomic Data and Biological Processes

    In recent decades, new technologies have made remarkable progress in helping to understand biological systems. Rapid advances in genomic profiling techniques such as microarrays or high-performance sequencing have brought new opportunities and challenges in the fields of computational biology and bioinformatics. Such genetic sequencing techniques allow large amounts of data to be produced, whose analysis and cross-integration could provide a complete view of organisms. As a result, it is necessary to develop new techniques and algorithms that carry out an analysis of these data with reliability and efficiency. This Special Issue collected the latest advances in the field of computational methods for the analysis of gene expression data, and, in particular, the modeling of biological processes. Here we present eleven works selected to be published in this Special Issue due to their interest, quality, and originality

    Network-based analysis of gene expression data

    The methods of molecular biology for the quantitative measurement of gene expression have undergone rapid development in the past two decades. High-throughput assays based on microarray and RNA-seq technology now enable whole-genome studies in which several thousand genes can be measured at a time. However, this has also imposed serious challenges on data storage and analysis, which are the subject of the young but rapidly developing field of computational biology. Explaining observations made on such a large scale requires suitable and accordingly scaled models of gene regulation. Detailed models, as available for single genes, need to be extended and assembled into larger networks of regulatory interactions between genes and gene products. Incorporating such networks into methods for data analysis is crucial for identifying the molecular mechanisms that drive the observed expression. As methods for this purpose emerge in parallel to each other and without a known standard of truth, results need to be critically checked in a competitive setup and in the context of the rich literature corpus available. This work is centered on and contributes to the following subjects, each of which represents an important and distinct research topic in the field of computational biology: (i) construction of realistic gene regulatory network models; (ii) detection of subnetworks that are significantly altered in the data under investigation; and (iii) systematic biological interpretation of detected subnetworks. For the construction of regulatory networks, I review existing methods with a focus on curation and inference approaches. I first describe how literature curation can be used to construct a regulatory network for a specific process, using the well-studied diauxic shift in yeast as an example. In particular, I address the question of how a detailed understanding, as available for the regulation of single genes, can be scaled up to the level of larger systems. I subsequently inspect methods for large-scale network inference, showing that they are significantly skewed towards master regulators. A recalibration strategy is introduced and applied, yielding an improved genome-wide regulatory network for yeast. To detect significantly altered subnetworks, I introduce GGEA as a method for network-based enrichment analysis. The key idea is to score regulatory interactions within functional gene sets for consistency with the observed expression. Compared to other recently published methods, GGEA yields results that consistently and coherently align expression changes with known regulation types and that are thus easier to explain. I also suggest and discuss several significant enhancements to the original method that improve its applicability, outcome and runtime. For the systematic detection and interpretation of subnetworks, I have developed the EnrichmentBrowser software package. It implements several state-of-the-art methods besides GGEA and allows users to combine and explore results across methods. As part of the Bioconductor repository, the package provides unified access to the different methods and thus greatly simplifies their use for biologists. Extensions to this framework that support the automation of biological interpretation routines are also presented. In conclusion, this work contributes substantially to the research field of network-based analysis of gene expression data with respect to regulatory network construction, subnetwork detection, and their biological interpretation. This also includes recent developments as well as areas of ongoing research, which are discussed in the context of current and future questions arising from the new generation of genomic data.
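
    The idea of scoring regulatory interactions for consistency with observed expression can be illustrated with a deliberately simplified, sign-only sketch. This is not GGEA's actual scoring, which the abstract does not detail. It assumes each edge is labelled as activation (+1) or inhibition (-1) and checks whether the direction of change in the target matches what the regulator's change would predict. All gene names and fold changes are hypothetical.

```python
def sign(x):
    """Return -1, 0 or +1 according to the sign of x."""
    return (x > 0) - (x < 0)


def edge_consistency(edges, log_fc, gene_set):
    """Fraction of regulatory edges in gene_set that agree with the data.

    edges    : iterable of (regulator, target, effect), effect in {+1, -1}
    log_fc   : dict mapping gene -> log fold change
    gene_set : set of genes defining the functional gene set
    Returns None if no edge can be evaluated.
    """
    outcomes = []
    for regulator, target, effect in edges:
        if regulator not in gene_set or target not in gene_set:
            continue  # only score interactions inside the gene set
        if regulator not in log_fc or target not in log_fc:
            continue  # skip unmeasured genes
        expected = sign(log_fc[regulator]) * effect
        observed = sign(log_fc[target])
        outcomes.append(1 if expected != 0 and expected == observed else 0)
    return sum(outcomes) / len(outcomes) if outcomes else None


# Hypothetical regulatory edges and expression changes.
toy_edges = [("TF1", "G1", +1), ("TF1", "G2", -1), ("TF2", "G3", +1)]
toy_log_fc = {"TF1": 1.5, "G1": 2.0, "G2": 0.8, "TF2": -1.0, "G3": -0.5}
toy_set = {"TF1", "TF2", "G1", "G2", "G3"}

score = edge_consistency(toy_edges, toy_log_fc, toy_set)
```

    Here two of the three edges agree with the data: the inhibited target `G2` goes up even though its regulator `TF1` is also up, so that edge counts as inconsistent.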