8 research outputs found

    Interpreting microarray experiments via co-expressed gene groups analysis

    Microarray technology produces vast amounts of data by simultaneously measuring the expression levels of thousands of genes under hundreds of biological conditions. One of the principal challenges in bioinformatics today is interpreting these large data sets using different sources of information. We propose a novel data analysis method named CGGA (Co-expressed Gene Groups Analysis) that automatically finds groups of genes that are functionally enriched, i.e. have the same functional annotations, and are co-expressed. CGGA automatically integrates the information from microarrays, i.e. gene expression profiles, with the functional annotations of the genes obtained from genome-wide information sources such as Gene Ontology (GO). By applying CGGA to well-known microarray experiments, we have identified the principal functionally enriched and co-expressed gene groups, and we have shown that this approach enhances and accelerates the interpretation of DNA microarray experiments.

    Co-expressed gene groups analysis: an automatic tool for the interpretation of microarray experiments (extended version)

    Microarray technology measures the expression levels of thousands of genes under different biological conditions, generating large volumes of data to analyze. Today, interpreting these large data sets in light of different sources of information is one of the main challenges in bioinformatics. We have developed a new method called AGGC (Analyse des Groupes de Gènes Co-exprimés, i.e. Co-expressed Gene Groups Analysis) that automatically builds groups of genes that are both functionally enriched, i.e. share the same functional annotations, and co-expressed. AGGC integrates the information obtained from microarrays, i.e. gene expression profiles, with the functional annotations of the genes obtained from genomic information sources such as Gene Ontology. Experiments conducted with this method have highlighted the main functionally enriched and co-expressed gene groups in microarray experiments. Program and supplementary information: http://keia.i3s.unice.fr/?Implementations:CGGA

    Co-expressed Gene Groups Analysis (CGGA): An Automatic Tool for the Interpretation of Microarray Experiments

    Microarray technology produces vast amounts of data by simultaneously measuring the expression levels of thousands of genes under hundreds of biological conditions. One of the principal challenges in bioinformatics today is interpreting this large amount of data using different sources of information. We have developed a novel data analysis method named CGGA (Co-expressed Gene Groups Analysis) that automatically finds groups of genes that are functionally enriched, i.e. have the same functional annotations, and are co-expressed. CGGA automatically integrates the information from microarrays, i.e. gene expression profiles, with the functional annotations of the genes obtained from genome-wide information sources such as Gene Ontology. By applying CGGA to well-known microarray experiments, we have identified the principal functionally enriched and co-expressed gene groups, and we have shown that this approach enhances and accelerates the interpretation of DNA microarray experiments. The CGGA program is available at http://www.i3s.unice.fr/~rmartine/CGG

    MultiSeq: unifying sequence and structure data for evolutionary analysis

    BACKGROUND: Since the publication of the first draft of the human genome in 2000, bioinformatic data have been accumulating at an overwhelming pace. Currently, more than 3 million sequences and 35 thousand structures of proteins and nucleic acids are available in public databases. Finding correlations in and between these data to answer critical research questions is extremely challenging. This problem needs to be approached from several directions: information science to organize and search the data; information visualization to assist in recognizing correlations; mathematics to formulate statistical inferences; and biology to analyze chemical and physical properties in terms of sequence and structure changes. RESULTS: Here we present MultiSeq, a unified bioinformatics analysis environment that allows one to organize, display, align and analyze both sequence and structure data for proteins and nucleic acids. While special emphasis is placed on analyzing the data within the framework of evolutionary biology, the environment is also flexible enough to accommodate other usage patterns. The evolutionary approach is supported by the use of predefined metadata, adherence to standard ontological mappings, and the ability for the user to adjust these classifications using an electronic notebook. MultiSeq contains a new algorithm to generate complete evolutionary profiles that represent the topology of the molecular phylogenetic tree of a homologous group of distantly related proteins. The method, based on the multidimensional QR factorization of multiple sequence and structure alignments, removes redundancy from the alignments and orders the protein sequences by increasing linear dependence, resulting in the identification of a minimal basis set of sequences that spans the evolutionary space of the homologous group of proteins. 
    CONCLUSION: MultiSeq is a major extension of the Multiple Alignment tool that is provided as part of VMD, a structural visualization program for analyzing molecular dynamics simulations. Both are freely distributed by the NIH Resource for Macromolecular Modeling and Bioinformatics, and MultiSeq is included with VMD starting with version 1.8.5. The MultiSeq website has details on how to download and use the software.
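
    The ordering idea in the RESULTS paragraph above (rank sequences by increasing linear dependence and keep a minimal basis set) can be illustrated with a small sketch. This is not MultiSeq's multidimensional QR algorithm: it is a plain column-pivoted Gram-Schmidt procedure on numeric vectors, and it assumes each aligned sequence has already been encoded as a vector of numbers. The function names `pivoted_order` and `basis_set`, the tolerance, and the toy input are all hypothetical.

```python
import math


def _norm(vec):
    """Euclidean norm of a list of floats."""
    return math.sqrt(sum(x * x for x in vec))


def pivoted_order(vectors):
    """Greedy column-pivoted Gram-Schmidt (a simplified stand-in for pivoted QR).

    At each step, pick the remaining vector with the largest residual norm,
    i.e. the one least explained by the vectors already chosen, then remove
    its direction from all remaining vectors.
    Returns (order, norms): indices in pick order and their residual norms.
    """
    residuals = [[float(x) for x in v] for v in vectors]
    remaining = list(range(len(vectors)))
    order, norms = [], []
    while remaining:
        best = max(remaining, key=lambda i: _norm(residuals[i]))
        norm = _norm(residuals[best])
        order.append(best)
        norms.append(norm)
        remaining.remove(best)
        if norm == 0.0:
            continue  # nothing left to project out
        q = [x / norm for x in residuals[best]]
        for i in remaining:
            dot = sum(a * b for a, b in zip(residuals[i], q))
            residuals[i] = [a - dot * b for a, b in zip(residuals[i], q)]
    return order, norms


def basis_set(vectors, tol=1e-6):
    """Indices whose residual norm exceeds tol: a non-redundant subset."""
    order, norms = pivoted_order(vectors)
    return [i for i, n in zip(order, norms) if n > tol]


if __name__ == "__main__":
    # Hypothetical encoded sequences: vector 1 is a multiple of vector 0,
    # and vector 3 is the sum of vectors 0 and 2.
    toy = [[1, 0, 0], [2, 0, 0], [0, 1, 0], [1, 1, 0]]
    print(basis_set(toy))
```

    In this toy input two of the four vectors are linear combinations of the others, so only two indices survive as the basis. How real alignments are turned into numeric arrays, and how the threshold is chosen, are left open here.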

    Calling International Rescue: knowledge lost in literature and data landslide!

    We live in interesting times. Portents of impending catastrophe pervade the literature, calling us to action in the face of unmanageable volumes of scientific data. Yet it is not so much data generation per se as the systematic burial of the knowledge embodied in those data that poses the problem: there is so much information available that we simply no longer know what we know, and finding what we want is hard – too hard. The knowledge we seek is often fragmentary and disconnected, spread thinly across thousands of databases and millions of articles in thousands of journals. The intellectual energy required to search this array of data archives, and the time and money this wastes, has led several researchers to challenge the methods by which we traditionally commit newly acquired facts and knowledge to the scientific record. We present some of these initiatives here – a whirlwind tour of recent projects to transform scholarly publishing paradigms, culminating in Utopia and the Semantic Biochemical Journal experiment. With their promises to provide new ways of interacting with the literature, and new and more powerful tools to access and extract the knowledge sequestered within it, we ask what advances they make and what obstacles to progress still exist. We explore these questions, and, as you read on, we invite you to engage in an experiment with us, a real-time test of a new technology to rescue data from the dormant pages of published documents. We ask you, please, to read the instructions carefully. The time has come: you may turn over your papers.

    Genetic and phenotypic heterogeneity in autosomal recessive retinal disease

    Molecular genetics has transformed our understanding of disease and is gradually changing the way medicine is practiced. Genetic mapping provides a powerful approach to discover genes and biological processes underlying human disorders. Recent advances in DNA microarray and sequencing technology have significantly increased the power of genetic mapping studies and have ushered in a new era for biomedicine. In this thesis, linkage analysis (including homozygosity mapping), exome sequencing and candidate gene sequencing have been utilised to genetically dissect autosomal recessive retinal disease. Subsequently, clinical findings from patients found to be similar in terms of molecular pathology have been pooled. DNA and basic phenotypic data from over 500 unrelated individuals were available for the project. Disease-causing variants in three genes that have not been previously associated with human recessive disorders are reported: (a) biallelic mutations in TRPM1 abrogate ON bipolar cell function and cause complete congenital stationary night blindness; (b) biallelic mutations in KCNJ13, a gene encoding an inwardly rectifying potassium channel subunit, cause Leber congenital amaurosis; (c) biallelic mutations in PLA2G5, a gene encoding group V phospholipase A2, cause benign fleck retina. The consequences of mutations in these and other disease-related genes (RDH5, GRM6, KCNV2, OAT and SAG) on retinal structure (spectral domain optical coherence tomography, fundus autofluorescence imaging) and visual function (electrophysiology, perimetry testing) have been studied; features that may have mechanistic relevance have been identified. Additionally, DNA sequence variation of a highly polymorphic gene (C2ORF71), recently associated with photoreceptor degeneration, has been studied and quantified in patient and control samples. Basic bioinformatics tools to analyse genomic data have been developed using the bash, Perl, Python and R programming languages. Overall, results presented in this thesis contribute to an understanding of Mendelian retinal disease that is not only observational but also mechanistic.