76 research outputs found

    MorphDB : prioritizing genes for specialized metabolism pathways and gene ontology categories in plants

    Recent times have seen an enormous growth of "omics" data, of which high-throughput gene expression data are arguably the most important from a functional perspective. Despite huge improvements in computational techniques for the functional classification of gene sequences, common similarity-based methods often fall short of providing full and reliable functional information. Recently, the combination of comparative genomics with approaches in functional genomics has received considerable interest for gene function analysis, leveraging both gene expression based guilt-by-association methods and annotation efforts in closely related model organisms. Besides the identification of missing genes in pathways, these methods also typically enable the discovery of biological regulators (i.e., transcription factors or signaling genes). A previously built guilt-by-association method is MORPH, which was proven to be an efficient algorithm that performs particularly well in identifying and prioritizing missing genes in plant metabolic pathways. Here, we present MorphDB, a resource where MORPH-based candidate genes for large-scale functional annotations (Gene Ontology, MapMan bins) are integrated across multiple plant species. Besides a gene-centric query utility, we present a comparative network approach that enables researchers to efficiently browse MORPH predictions across functional gene sets and species, facilitating gene discovery and candidate gene prioritization. MorphDB is available at http://bioinformatics.psb.ugent.be/webtools/morphdb/morphDB/index/. We also provide a toolkit, named "MORPH bulk" (https://github.com/arzwa/morph-bulk), for running MORPH in bulk mode on novel data sets, enabling researchers to apply MORPH to their own species of interest.

    Magnetostratigraphy and Rock Magnetism Study of Hole U1524A from IODP Expedition 374

    The Tenth Symposium on Polar Science/Ordinary sessions: [OG] Polar Geosciences, Wed. 4 Dec. / 3F Seminar room, National Institute of Polar Research

    PLAZA 4.0 : an integrative resource for functional, evolutionary and comparative plant genomics

    PLAZA (https://bioinformatics.psb.ugent.be/plaza) is a plant-oriented online resource for comparative, evolutionary and functional genomics. The PLAZA platform consists of multiple independent instances focusing on different plant clades, while also providing access to a consistent set of reference species. Each PLAZA instance contains structural and functional gene annotations, gene family data, phylogenetic trees and detailed gene colinearity information. A user-friendly web interface makes the necessary tools and visualizations accessible, specific for each data type. Here we present PLAZA 4.0, the latest iteration of the PLAZA framework. This version consists of two new instances (Dicots 4.0 and Monocots 4.0) providing a large increase in newly available species, and offers access to updated and newly implemented tools and visualizations, helping users with the ever-increasing demands for complex and in-depth analyses. The total number of species across both instances nearly doubles from 37 species in PLAZA 3.0 to 71 species in PLAZA 4.0, with a much broader coverage of crop species (e.g. wheat, palm oil) and species of evolutionary interest (e.g. spruce, Marchantia). The new PLAZA instances can also be accessed by a programming interface through a RESTful web service, thus allowing bioinformaticians to optimally leverage the power of the PLAZA platform.

    Geochemical evidence of Milankovitch cycles in Atlantic Ocean ferromanganese crusts

    Hydrogenetic ferromanganese crusts are considered a faithful record of the isotopic composition of seawater influenced by weathering processes of continental masses. Given their ubiquitous presence in all oceans of the planet at depths of 400–7000 meters, they form one of the most well-distributed and accessible records of water-mass mixing and climate. However, their slow accumulation rate and poor age constraints have to date limited their use to explore 100 ka paleoclimatic phenomena. Here it is shown how the Pb isotope signature and major element content of a Fe-Mn crust from the north-east Atlantic responded to changes in the intensity and geographic extent of monsoonal rainfall over West Africa, as controlled by climatic precession during the Paleocene. The high-spatial-resolution (4 μm) laser-ablation multi-collector inductively coupled plasma mass spectrometer (LA-MC-ICP-MS) Pb isotope data represent an improvement of nearly two orders of magnitude in spatial and temporal resolution compared to micro-drill subsamples. The record demonstrates cyclicity of the 206Pb/204Pb and 208, 207Pb/206Pb ratios at the scale of single Fe-Mn oxide laminae, in conjunction with variations in the Fe/Mn ratio and in Al, Si and Ti content. Time-frequency analysis and astronomical tuning of the Pb isotope data demonstrate the imprint of climatic precession (~20 ka) modulated by eccentricity (~100 and 405 ka), yielding growth rates of 1.5–3.5 mm/Ma consistent with previous chemostratigraphic age models. In this context, boreal summer at the perihelion causes stronger insolation over West Africa, resulting in more intense and geographically extended monsoonal rainfalls compared to aphelion boreal summer conditions. This, in turn, influences the balance between the weathering endmembers feeding the north-east Atlantic basin. These results provide a new approach for calibrating Fe-Mn crust records to astronomical solutions, and allow their isotopic and chemical archive to be exploited with an improved temporal resolution of 1000–5000 years.
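The figures quoted in this abstract can be cross-checked with a back-of-envelope conversion. The sketch below assumes, as a simplification not stated in the abstract, that the time span covered by one analysis is simply the 4 μm spot size divided by the growth rate; the function names are hypothetical.

```python
def years_per_spot(spot_um, growth_mm_per_ma):
    """Years of crust growth covered by one spot (assumed: spot size / growth rate)."""
    # 1 mm/Ma equals 1 um/ka, so um divided by mm/Ma gives ka; convert ka to years.
    return spot_um / growth_mm_per_ma * 1000.0


def um_per_cycle(period_ka, growth_mm_per_ma):
    """Thickness in um deposited during one astronomical cycle of the given period."""
    return period_ka * growth_mm_per_ma


for rate in (1.5, 3.5):  # growth rates in mm/Ma, taken from the abstract
    print(rate, round(years_per_spot(4.0, rate)), um_per_cycle(20.0, rate))
```

Under this assumption a 4 μm spot spans roughly 1100 to 2700 years, which falls inside the 1000–5000 year range quoted in the abstract, and one ~20 ka precession cycle corresponds to about 30 to 70 μm of crust.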

    CoExpNetViz: comparative co-expression networks construction and visualization tool

    Motivation: Comparative transcriptomics is a common approach in functional gene discovery efforts. It allows for finding conserved co-expression patterns between orthologous genes in closely related plant species, suggesting that these genes potentially share similar function and regulation. Several efficient co-expression-based tools have been commonly used in plant research, but most of these pipelines are limited to data from model systems, which greatly limits their utility. Moreover, none of the existing pipelines allows plant researchers to make use of their own unpublished gene expression data for performing a comparative co-expression analysis and generating multi-species co-expression networks. Results: We introduce CoExpNetViz, a computational tool that uses a set of query or "bait" genes as an input (chosen by the user) and a minimum of one pre-processed gene expression dataset. The CoExpNetViz algorithm proceeds in three main steps: (i) for every bait gene submitted, co-expression values are calculated using mutual information and Pearson correlation coefficients, (ii) non-bait (or target) genes are grouped based on cross-species orthology, and (iii) output files are generated and results can be visualized as network graphs in Cytoscape. Availability: The CoExpNetViz tool is freely available both as a PHP web server (link: http://bioinformatics.psb.ugent.be/webtools/coexpr/) (implemented in C++) and as a Cytoscape plugin (implemented in Java). Both versions of the CoExpNetViz tool support Linux and Windows platforms.
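Step (i) of the procedure described in this abstract can be illustrated with a minimal sketch. It covers only the Pearson correlation part (mutual information is omitted), it is not the CoExpNetViz implementation, and the function names, toy expression values and the 0.8 cutoff are all assumptions made for illustration.

```python
from math import sqrt


def pearson(x, y):
    """Pearson correlation coefficient of two equally long expression profiles."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    norm_x = sqrt(sum((a - mean_x) ** 2 for a in x))
    norm_y = sqrt(sum((b - mean_y) ** 2 for b in y))
    return cov / (norm_x * norm_y)


def bait_coexpression(profiles, baits, threshold=0.8):
    """Return (bait, target, r) for non-bait genes with |r| >= threshold.

    The threshold is a hypothetical cutoff chosen for this example.
    """
    edges = []
    for bait in baits:
        for target, profile in profiles.items():
            if target in baits:
                continue  # only non-bait (target) genes are reported
            r = pearson(profiles[bait], profile)
            if abs(r) >= threshold:
                edges.append((bait, target, r))
    return edges


# Toy expression matrix: one profile (five samples) per gene.
profiles = {
    "baitA": [1.0, 2.0, 3.0, 4.0, 5.0],
    "gene1": [2.1, 3.9, 6.2, 8.0, 9.9],  # rises with baitA
    "gene2": [5.0, 4.1, 2.9, 2.0, 1.1],  # falls as baitA rises
    "gene3": [3.0, 1.0, 4.0, 1.0, 5.0],  # only weakly related
}
edges = bait_coexpression(profiles, ["baitA"])
```

Here `edges` keeps gene1 (strong positive correlation) and gene2 (strong negative correlation) and drops gene3. Grouping the retained targets by cross-species orthology, step (ii), would need orthology data that this sketch does not model.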

    CoExpNetViz : comparative co-expression networks construction and visualization tool

    MOTIVATION : Comparative transcriptomics is a common approach in functional gene discovery efforts. It allows for finding conserved co-expression patterns between orthologous genes in closely related plant species, suggesting that these genes potentially share similar function and regulation. Several efficient co-expression-based tools have been commonly used in plant research, but most of these pipelines are limited to data from model systems, which greatly limits their utility. Moreover, none of the existing pipelines allows plant researchers to make use of their own unpublished gene expression data for performing a comparative co-expression analysis and generating multi-species co-expression networks. RESULTS : We introduce CoExpNetViz, a computational tool that uses a set of query or “bait” genes as an input (chosen by the user) and a minimum of one pre-processed gene expression dataset. The CoExpNetViz algorithm proceeds in three main steps: (i) for every bait gene submitted, co-expression values are calculated using mutual information and Pearson correlation coefficients, (ii) non-bait (or target) genes are grouped based on cross-species orthology, and (iii) output files are generated and results can be visualized as network graphs in Cytoscape. AVAILABILITY : The CoExpNetViz tool is freely available both as a PHP web server (link: http://bioinformatics.psb.ugent.be/webtools/coexpr/) (implemented in C++) and as a Cytoscape plugin (implemented in Java). Both versions of the CoExpNetViz tool support Linux and Windows platforms. Supplementary File 1. CoExpNetViz user and development manuals. The work in the AA lab was supported by the European Research Council grant SAMIT (no. 204575). We thank the Tom and Sondra Rykof Family Foundation for supporting the AA lab activity. AA is the incumbent of the Peter J. Cohn Professorial Chair. KV and YP acknowledge the Multidisciplinary Research Partnership “Bioinformatics: from nucleotides to networks” Project (no 01MR0310W) of Ghent University. YVdP also acknowledges support from the European Union Seventh Framework Programme (FP7/2007-2013) under European Research Council Advanced Grant Agreement 322739 “DOUBLE-UP.” http://www.frontiersin.org

    Cloning of Dimethylglycine Dehydrogenase and a New Human Inborn Error of Metabolism, Dimethylglycine Dehydrogenase Deficiency

    Dimethylglycine dehydrogenase (DMGDH) (E.C. number 1.5.99.2) is a mitochondrial matrix enzyme involved in the metabolism of choline, converting dimethylglycine to sarcosine. Sarcosine is then transformed to glycine by sarcosine dehydrogenase (E.C. number 1.5.99.1). Both enzymes use flavin adenine dinucleotide and folate in their reaction mechanisms. We have identified a 38-year-old man who has a lifelong condition of fishlike body odor and chronic muscle fatigue, accompanied by elevated levels of the muscle form of creatine kinase in serum. Biochemical analysis of the patient’s serum and urine, using 1H-nuclear magnetic resonance (NMR) spectroscopy, revealed that his levels of dimethylglycine were much higher than control values. The cDNA and the genomic DNA for human DMGDH (hDMGDH) were then cloned, and a homozygous A→G substitution (326 A→G) was identified in both the cDNA and genomic DNA of the patient. This mutation changes a His to an Arg (H109R). Expression analysis of the mutant cDNA indicates that this mutation inactivates the enzyme. We therefore confirm that the patient described here represents the first reported case of a new inborn error of metabolism, DMGDH deficiency.

    Validating module network learning algorithms using simulated data

    In recent years, several authors have used probabilistic graphical models to learn expression modules and their regulatory programs from gene expression data. Here, we demonstrate the use of the synthetic data generator SynTReN for the purpose of testing and comparing module network learning algorithms. We introduce a software package for learning module networks, called LeMoNe, which incorporates a novel strategy for learning regulatory programs. Novelties include the use of a bottom-up Bayesian hierarchical clustering to construct the regulatory programs, and the use of a conditional entropy measure to assign regulators to the regulation program nodes. Using SynTReN data, we test the performance of LeMoNe in a completely controlled situation and assess the effect of the methodological changes we made with respect to an existing software package, namely Genomica. Additionally, we assess the effect of various parameters, such as the size of the data set and the amount of noise, on the inference performance. Overall, application of Genomica and LeMoNe to simulated data sets gave comparable results. However, LeMoNe offers some advantages, one of them being that the learning process is considerably faster for larger data sets. Additionally, we show that the location of the regulators in the LeMoNe regulation programs and their conditional entropy may be used to prioritize regulators for functional validation, and that the combination of the bottom-up clustering strategy with the conditional entropy-based assignment of regulators improves the handling of missing or hidden regulators. Comment: 13 pages, 6 figures + 2 pages, 2 figures supplementary information

    Parallelizing Training of Deep Generative Models on Massive Scientific Datasets

    Training deep neural networks on large scientific data is a challenging task that requires enormous compute power, especially if no pre-trained models exist to initialize the process. We present a novel tournament method to train traditional as well as generative adversarial networks built on LBANN, a scalable deep learning framework optimized for HPC systems. LBANN combines multiple levels of parallelism and exploits some of the world's largest supercomputers. We demonstrate our framework by creating a complex predictive model based on multi-variate data from high-energy-density physics containing hundreds of millions of images and hundreds of millions of scalar values derived from tens of millions of simulations of inertial confinement fusion. Our approach combines an HPC workflow and extends LBANN with optimized data ingestion and the new tournament-style training algorithm to produce a scalable neural network architecture using a CORAL-class supercomputer. Experimental results show that 64 trainers (1024 GPUs) achieve a speedup of 70.2 over a single trainer (16 GPUs) baseline, and an effective 109% parallel efficiency.
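The efficiency figure quoted in this abstract follows from the usual definition of parallel efficiency: speedup divided by the factor by which resources were scaled. The short sketch below applies that standard definition to the numbers in the abstract; the function name is hypothetical and the authors' exact measurement procedure is not described here.

```python
def parallel_efficiency(speedup, workers, baseline_workers=1):
    """Speedup divided by the resource scaling factor (standard definition)."""
    return speedup / (workers / baseline_workers)


# 64 trainers versus a single-trainer baseline, with the reported 70.2x speedup.
efficiency = parallel_efficiency(70.2, 64)

# The same ratio expressed in GPUs: 1024 GPUs versus a 16-GPU baseline.
efficiency_gpus = parallel_efficiency(70.2, 1024, baseline_workers=16)

print(f"{efficiency:.1%}")
```

The ratio is 70.2 / 64, or about 1.097, matching the roughly 109% reported. A value above 100% means the speedup is superlinear relative to the single-trainer baseline.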