    A guide through the computational analysis of isotope-labeled mass spectrometry-based quantitative proteomics data: an application study

    Albaum S, Hahne H, Otto A, et al. A guide through the computational analysis of isotope-labeled mass spectrometry-based quantitative proteomics data: an application study. Proteome Science. 2011;9(1): 30. Background: Mass spectrometry-based proteomics has reached a stage where it is possible to comprehensively analyze the whole proteome of a cell in one experiment. Here, the employment of stable isotopes has become a standard technique to yield relative abundance values of proteins. In recent times, more and more experiments are conducted that depict not only a static image of the up- or down-regulated proteins at a distinct time point but instead compare developmental stages of an organism or varying experimental conditions. Results: Although the scientific questions behind these experiments are of course manifold, there are, nevertheless, two questions that commonly arise: 1) which proteins are differentially regulated regarding the selected experimental conditions, and 2) are there groups of proteins that show similar abundance ratios, indicating that they have a similar turnover? We give advice on how these two questions can be answered and comprehensively compare a variety of commonly applied computational methods and their outcomes. Conclusions: This work provides guidance through the jungle of computational methods to analyze mass spectrometry-based isotope-labeled datasets and recommends an effective and easy-to-use evaluation strategy. We demonstrate our approach with three recently published datasets on Bacillus subtilis [1,2] and Corynebacterium glutamicum [3]. Special focus is placed on the application and validation of cluster analysis methods. All applied methods were implemented within the rich internet application QuPE [4]. Results can be found at http://qupe.cebitec.uni-bielefeld.de.

    RNA-protein correlation of liver toxicity markers in HepaRG cells

    The liver is a main target organ for the toxicity of many different compounds. While in vivo testing is still routinely used for assessing the hepatotoxic potential of test chemicals, the use of in vitro models offers advantages with regard to throughput, consumption of resources, and animal welfare aspects. Using the human hepatoma cell line HepaRG, we performed a comparative evaluation of a panel of hepatotoxicity marker mRNAs and proteins after exposure of the cells to 30 different pesticidal active compounds comprising herbicides, fungicides, insecticides, and others. The panel of hepatotoxicity markers included nuclear receptor target genes, key players of fatty acid and bile acid metabolism-related pathways, as well as recently identified biomarkers of drug-induced liver injury. Moreover, marker genes and proteins were identified, for example, S100P, ANXA10, CYP1A1, and CYP7A1. These markers respond with high sensitivity to stimulation with chemically diverse test compounds, even at non-cytotoxic concentrations. The potency of the test compounds, determined as an overall parameter of their ability to deregulate marker expression in vitro, was very similar between the mRNA and protein levels. Thus, this study not only characterizes the response of human liver cells to 30 different pesticides but also demonstrates that hepatotoxicity testing in human HepaRG cells yields closely comparable results at the mRNA and protein levels. Furthermore, robust hepatotoxicity marker genes and proteins were identified in HepaRG cells.

    Visualizing post genomics data-sets on customized pathway maps by ProMeTra – aeration-dependent gene expression and metabolism of Corynebacterium glutamicum as an example

    Neuweger H, Persicke M, Albaum S, et al. Visualizing post genomics data-sets on customized pathway maps by ProMeTra – aeration-dependent gene expression and metabolism of Corynebacterium glutamicum as an example. BMC Systems Biology. 2009;3(1): 82. Background: The rapid progress of post-genomic analyses, such as transcriptomics, proteomics, and metabolomics, has resulted in the generation of large amounts of quantitative data covering and connecting the complete cascade from genotype to phenotype for individual organisms. Various benefits can be achieved when these "Omics" data are integrated, such as the identification of unknown gene functions or the elucidation of regulatory networks of whole organisms. In order to obtain deeper insights into the generated datasets, it is of utmost importance to present the data to the researcher in an intuitive, integrated, and knowledge-based environment. Therefore, various visualization paradigms have been established in recent years. The visualization of "Omics" data using metabolic pathway maps is intuitive and has been applied in various software tools. It has become obvious that the application of web-based and user-driven software tools has great potential and benefits from the use of open and standardized formats for the description of pathways. Results: In order to combine datasets from heterogeneous "Omics" sources, we present the web-based ProMeTra system that visualizes and combines datasets from transcriptomics, proteomics, and metabolomics on user-defined metabolic pathway maps. To this end, structured exchange of data with our "Omics" applications Emma 2, QuPE and MeltDB is employed. Enriched SVG images or animations are generated and can be obtained via the user-friendly web interface. To demonstrate the functionality of ProMeTra, we use quantitative data obtained during a fermentation experiment of the L-lysine producing strain Corynebacterium glutamicum DM1730. During fermentation, oxygen supply was switched off in order to perturb the system and observe its reaction. At six different time points, transcript abundances, intracellular metabolite pools, as well as extracellular glucose, lactate, and L-lysine levels were determined. Conclusion: The interpretation and visualization of the results of this complex experiment was facilitated by the ProMeTra software. Both transcriptome and metabolome data were visualized on a metabolic pathway map. Visual inspection of the combined data confirmed existing knowledge but also delivered novel correlations that are of potential biotechnological importance.

    CoryneCenter – An online resource for the integrated analysis of corynebacterial genome and transcriptome data

    Neuweger H, Baumbach J, Albaum S, et al. CoryneCenter: an online resource for the integrated analysis of corynebacterial genome and transcriptome data. BMC Systems Biology. 2007;1(1): 55. Background: The introduction of high-throughput genome sequencing and post-genome analysis technologies, e.g. DNA microarray approaches, has created the potential to unravel and scrutinize complex gene-regulatory networks on a large scale. The discovery of transcriptional regulatory interactions has become a major topic in modern functional genomics. Results: To facilitate the analysis of gene-regulatory networks, we have developed CoryneCenter, a web-based resource for the systematic integration and analysis of genome, transcriptome, and gene regulatory information for prokaryotes, especially corynebacteria. For this purpose, we extended and combined the following systems into a common platform: (1) GenDB, an open source genome annotation system, (2) EMMA, a MAGE compliant application for high-throughput transcriptome data storage and analysis, and (3) CoryneRegNet, an ontology-based data warehouse designed to facilitate the reconstruction and analysis of gene regulatory interactions. We demonstrate the potential of CoryneCenter by means of an application example. Using microarray hybridization data, we compare the gene expression of Corynebacterium glutamicum under acetate and glucose feeding conditions: known regulatory networks are confirmed, and CoryneCenter additionally points out further regulatory interactions. Conclusion: CoryneCenter provides more than the sum of its parts. Its novel analysis and visualization features significantly simplify the process of obtaining new biological insights into complex regulatory systems. Although the platform currently focusses on corynebacteria, the integrated tools are by no means restricted to these species, and the presented approach offers a general strategy for the analysis and verification of gene regulatory networks. CoryneCenter provides freely accessible projects with the underlying genome annotation, gene expression, and gene regulation data. The system is publicly available at http://www.CoryneCenter.d

    BACCardI - a tool for the validation of genomic assemblies, assisting genome finishing and intergenome comparison

    Bartels D, Kespohl S, Albaum S, et al. BACCardI - a tool for the validation of genomic assemblies, assisting genome finishing and intergenome comparison. Bioinformatics. 2005;21(7):853-859. Summary: We provide the graphical tool BACCardI for the construction of virtual clone maps from standard assembler output files or BLAST based sequence comparisons. This new tool has been applied to numerous genome projects to solve various problems including (a) validation of whole genome shotgun assemblies, (b) support for contig ordering in the finishing phase of a genome project, and (c) intergenome comparison between related strains when only one of the strains has been sequenced and a large insert library is available for the other. The BACCardI software can seamlessly interact with various sequence assembly packages. Motivation: Genomic assemblies generated from sequence information need to be validated by independent methods such as physical maps. The time-consuming task of building physical maps can be circumvented by virtual clone maps derived from read pair information of large insert libraries.

    ALLocator: An Interactive Web Platform for the Analysis of Metabolomic LC-ESI-MS Datasets, Enabling Semi-Automated, User-Revised Compound Annotation and Mass Isotopomer Ratio Analysis

    Kessler N, Walter F, Persicke M, et al. ALLocator: An Interactive Web Platform for the Analysis of Metabolomic LC-ESI-MS Datasets, Enabling Semi-Automated, User-Revised Compound Annotation and Mass Isotopomer Ratio Analysis. PLoS ONE. 2014;9(11): e113909. Adduct formation, fragmentation events and matrix effects impose special challenges to the identification and quantitation of metabolites in LC-ESI-MS datasets. An important step in compound identification is the deconvolution of mass signals. During this processing step, peaks representing adducts, fragments, and isotopologues of the same analyte are allocated to a distinct group, in order to separate peaks from coeluting compounds. From these peak groups, neutral masses and pseudo spectra are derived and used for metabolite identification via mass decomposition and database matching. Quantitation of metabolites is hampered by matrix effects and nonlinear responses in LC-ESI-MS measurements. A common approach to correct for these effects is the addition of a U-13C-labeled internal standard and the calculation of mass isotopomer ratios for each metabolite. Here we present a new web-platform for the analysis of LC-ESI-MS experiments. ALLocator covers the workflow from raw data processing to metabolite identification and mass isotopomer ratio analysis. The integrated processing pipeline for spectra deconvolution “ALLocatorSD” generates pseudo spectra and automatically identifies peaks emerging from the U-13C-labeled internal standard. Information from the latter improves mass decomposition and annotation of neutral losses. ALLocator provides an interactive and dynamic interface to explore and enhance the results in depth. Pseudo spectra of identified metabolites can be stored in user- and method-specific reference lists that can be applied on succeeding datasets. The potential of the software is exemplified in an experiment, in which abundance fold-changes of metabolites of the l-arginine biosynthesis in C. glutamicum type strain ATCC 13032 and l-arginine producing strain ATCC 21831 are compared. Furthermore, the capability for detection and annotation of uncommon large neutral losses is shown by the identification of (γ-)glutamyl dipeptides in the same strains. ALLocator is available online at: https://allocator.cebitec.uni-bielefeld.de. A login is required, but freely available.

    Learning to classify organic and conventional wheat - a machine-learning driven approach using the MeltDB 2.0 metabolomics analysis platform

    Kessler N, Bonte A, Albaum S, et al. Learning to classify organic and conventional wheat - a machine-learning driven approach using the MeltDB 2.0 metabolomics analysis platform. Frontiers in Bioinformatics and Computational Biology. 2015;3: 35. We present results of our machine learning approach to the problem of classifying GC-MS data originating from wheat grains of different farming systems. The aim is to investigate the potential of learning algorithms to classify GC-MS data as originating from either conventionally grown or organically grown samples, taking different cultivars into account. Our work is motivated by today's increased demand for organic food in post-industrialized societies and the necessity to prove organic food authenticity. Our data set comprises up to eleven wheat cultivars that have been cultivated in both farming systems, organic and conventional, over three years. More than 300 GC-MS measurements were recorded and subsequently processed and analyzed in the MeltDB 2.0 metabolomics analysis platform, which is briefly outlined in this paper. We further describe how unsupervised (t-SNE, PCA) and supervised (RF, SVM) methods can be applied for sample visualization and classification. Our results clearly show that the year has the most and the wheat cultivar the second-most influence on the metabolic composition of a sample. We can also show that, for a given year and cultivar, organic and conventional cultivation can be distinguished by machine-learning algorithms.

    Development of a Rhizoctonia solani AG1-IB Specific Gene Model Enables Comparative Genome Analyses between Phytopathogenic R-solani AG1-IA, AG1-IB, AG3 and AG8 Isolates

    Wibberg D, Rupp O, Blom J, et al. Development of a Rhizoctonia solani AG1-IB Specific Gene Model Enables Comparative Genome Analyses between Phytopathogenic R-solani AG1-IA, AG1-IB, AG3 and AG8 Isolates. PLoS ONE. 2015;10(12): e0144769. Rhizoctonia solani, a soil-borne plant pathogenic basidiomycetous fungus, affects various economically important agricultural and horticultural crops. The draft genome sequence for the R. solani AG1-IB isolate 7/3/14 as well as a corresponding transcriptome dataset (Expressed Sequence Tags-ESTs) were established previously. Development of a specific R. solani AG1-IB gene model based on GMAP transcript mapping within the eukaryotic gene prediction platform AUGUSTUS allowed detection of new genes and provided insights into the gene structure of this fungus. In total, 12,616 genes were recognized in the genome of the AG1-IB isolate. Analysis of predicted genes by means of different bioinformatics tools revealed new genes whose products are potentially involved in degradation of plant cell wall components, melanin formation and synthesis of secondary metabolites. Comparative genome analyses between members of different R. solani anastomosis groups, namely AG1-IA, AG3 and AG8 and the newly annotated R. solani AG1-IB genome were performed within the comparative genomics platform EDGAR. It appeared that only 21 to 28% of all genes encoded in the draft genomes of the different strains were identified as core genes. Based on Average Nucleotide Identity (ANI) and Average Amino-acid Identity (AAI) analyses, considerable sequence differences between isolates representing different anastomosis groups were identified. However, R. solani isolates form a distinct cluster in relation to other fungi of the phylum Basidiomycota. The isolate representing AG1-IB encodes significantly more genes featuring predictable functions in secondary metabolite production compared to other completely sequenced R. solani strains. The newly established R. solani AG1-IB 7/3/14 gene layout now provides a reliable basis for post-genomics studies.

    Complete Genome Sequence of the Barley Pathogen Xanthomonas translucens pv. translucens DSM 18974 T (ATCC 19319 T)

    Jaenicke S, Bunk B, Wibberg D, et al. Complete Genome Sequence of the Barley Pathogen Xanthomonas translucens pv. translucens DSM 18974 T (ATCC 19319 T). Genome Announcements. 2016;4(6): e01334-16. We report here the complete 4.7-Mb genome sequence of Xanthomonas translucens pv. translucens DSM 18974T, which causes black chaff disease on barley (Hordeum vulgare). Genome data of this X. translucens type strain will improve our understanding of this bacterial species.