10 research outputs found

    Visualizing regulatory interactions in metabolic networks

    Background: Direct visualization of data sets in the context of biochemical network drawings is one of the most appealing approaches in the field of data evaluation within systems biology. One important type of information that is very helpful in interpreting and understanding metabolic networks has been overlooked so far. Here we focus on the representation of this type of information, given by the strength of regulatory interactions between metabolite pools and reaction steps. Results: The visualization of such interactions in a given metabolic network is based on a novel concept defining the regulatory strength (RS) of effectors regulating certain reaction steps. It is applicable to any mechanistic reaction kinetic formula. The RS values are measures for the strength of an up- or down-regulation of a reaction step compared with the completely non-inhibited or non-activated state, respectively. One numerical RS value is associated with each effector edge contained in the network. The RS is approximately interpretable on a percentage scale where 100% means the maximal possible inhibition or activation, respectively, and 0% means the absence of a regulatory interaction. If many effectors influence a certain reaction step, the respective percentages indicate the proportion in which the different effectors contribute to the total regulation of the reaction step. The benefits of the proposed method are demonstrated with a complex example system of a dynamic E. coli network. Conclusion: The presented visualization approach is suitable for an intuitive interpretation of simulation data of metabolic networks under dynamic as well as steady-state conditions. Huge amounts of simulation data can be analyzed in a quick and comprehensive way. An extended time-resolved graphical network presentation provides a wealth of information about regulatory interactions within the biological system under investigation.

    A Mighty Small Heart: The Cardiac Proteome of Adult Drosophila melanogaster

    Drosophila melanogaster is emerging as a powerful model system for the study of cardiac disease. Establishing peptide and protein maps of the Drosophila heart is central to the implementation of protein network studies that will allow us to assess the hallmarks of Drosophila heart pathogenesis and gauge the degree of conservation with human disease mechanisms on a systems level. Using a gel-LC-MS/MS approach, we identified 1228 protein clusters from 145 dissected adult fly hearts. Contractile, cytostructural and mitochondrial proteins were most abundant, consistent with electron micrographs of the Drosophila cardiac tube. Functional/ontological enrichment analysis further showed that proteins involved in glycolysis, Ca2+-binding, redox, and G-protein signaling, among other processes, are also over-represented. Comparison with a mouse heart proteome revealed conservation at the level of molecular function, biological processes and cellular components. The subsisting peptidome encompassed 5169 distinct heart-associated peptides, of which 1293 (25%) had not been identified in a recent Drosophila peptide compendium. PeptideClassifier analysis was further used to map peptides to specific gene models. 1872 peptides provide valuable information about protein isoform groups, whereas a further 3112 uniquely identify specific protein isoforms and may be used as a heart-associated peptide resource for quantitative proteomic approaches based on multiple-reaction monitoring. In summary, identification of excitation-contraction protein landmarks, orthologues of proteins associated with cardiovascular defects, and conservation of protein ontologies provides testimony to the heart-like character of the Drosophila cardiac tube and to the utility of proteomics as a complement to the power of genetics in this growing model of human heart disease

    Information Visualization Techniques for Metabolic Engineering

    The main purpose of metabolic engineering is the modification of biological systems towards specific goals using genetic manipulations. For this purpose, models are built that describe the stationary and dynamic behaviour of biochemical reaction networks inside a biological cell. Based on these models, simulations are carried out with the intention to understand the cell's behaviour. The modeling process leads to the generation of large amounts of data, both during the modeling itself and after the simulation of the created models. Manual interpretation is almost impossible; consequently, appropriate techniques for supporting the analysis and visualization of these data are needed. The purpose of this thesis is to investigate visualization and data mining techniques to support the metabolic modeling process. The work presented in this thesis is divided into several tracks:
    - Visualization of metabolic networks and the associated simulation data. Novel visualization techniques will be presented, which allow the visual exploration of metabolic network dynamics, beyond static snapshots of the simulated data plots. Node-link representations of the metabolic network are animated using the time series of metabolite concentrations and reaction rates. In this way, bottlenecks and active parts of metabolic networks can be distinguished. Additionally, 3D visualization techniques for metabolic networks are explored for cross-free drawing of the networks in 3D visualization space. Steerable drawing of metabolic networks is also investigated. In contrast to other approaches for drawing metabolic networks, user-guided drawing of the networks allows the creation of high-quality drawings by including user feedback in the drawing process.
    - Comparison of XML/SBML files. SBML (Systems Biology Markup Language) has become ubiquitous in metabolic modeling, serving the storage and exchange of models in XML format. Generally, the modeling process is an iterative task where the next-generation model is a further development of the current model, resulting in a family of models stored in SBML format. The SBML format, however, includes a great deal of information, from the structure of the biochemical network to parameters of the model or measured data. Consequently, the CustX-Diff algorithm for a customizable comparison of XML files will be introduced. By customizing the comparison process through the specification of XPath expressions, an adaptable change detection process is enabled. Thus, the comparison process can be focused on specific parts of an XML/SBML document, e.g. on the structure of a metabolic network.
    - Visual exploration of time-varying sensitivity matrices. Sensitivity analysis is a special method used in simulation to analyze the sensitivity of a model with respect to its parameters. The results of sensitivity analysis of a metabolic network are large time-varying matrices, which need to be properly visualized. However, the visualization of time-varying high-dimensional data is a challenging problem. For this purpose, an extensible framework is proposed, consisting of existing and novel visualization methods, which allow the visual exploration of time-varying sensitivity matrices. Tabular visualization techniques, such as the reorderable matrix, are developed further, and algorithms for their reordering are discussed. Existing and novel techniques for exploring proximity data, both in matrix form and projected using multi-dimensional scaling (MDS), are also discussed. Information visualization paradigms such as focus+context based distortion and overview+details are proposed to enhance such techniques.
    - Cluster ensembles for analyzing time-varying sensitivity matrices. A novel relationship-based cluster ensemble, which relies on the accumulation of the evolving pairwise similarities of objects (i.e. parameters), will be proposed as a robust and efficient method for clustering time-varying high-dimensional data. The time-dependent similarities, obtained from the fuzzy partitions created during the fuzzy clustering process, are aggregated, and the final clustering result is derived from this aggregation
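
    The idea of focusing an XML comparison through XPath expressions, mentioned in the thesis abstract above, can be illustrated with a small sketch. This is not the CustX-Diff algorithm itself, whose details the abstract does not give; it is a minimal illustration under stated assumptions: the toy documents are simplified and namespace-free (real SBML files declare an XML namespace), the helper names `select_signatures` and `focused_diff` are hypothetical, and only the limited XPath subset supported by Python's standard `xml.etree.ElementTree` is used.

```python
import xml.etree.ElementTree as ET


def select_signatures(xml_text, path):
    """Collect a hashable (tag, attributes) signature for every element
    matched by `path` (ElementTree's limited XPath subset)."""
    root = ET.fromstring(xml_text)
    return {(el.tag, tuple(sorted(el.attrib.items()))) for el in root.findall(path)}


def focused_diff(old_xml, new_xml, path):
    """Compare only the parts of two documents selected by `path`."""
    old = select_signatures(old_xml, path)
    new = select_signatures(new_xml, path)
    return {"removed": old - new, "added": new - old}


# Two hypothetical, simplified model versions (not valid SBML).
MODEL_V1 = """<model>
  <listOfReactions><reaction id="hk"/></listOfReactions>
  <listOfParameters><parameter id="vmax" value="1.0"/></listOfParameters>
</model>"""

MODEL_V2 = """<model>
  <listOfReactions><reaction id="hk"/><reaction id="pgi"/></listOfReactions>
  <listOfParameters><parameter id="vmax" value="2.0"/></listOfParameters>
</model>"""

# Focus on network structure only: parameter changes are ignored.
structure_changes = focused_diff(MODEL_V1, MODEL_V2, ".//reaction")

# Focus on parameters only: the changed value shows up as remove + add.
parameter_changes = focused_diff(MODEL_V1, MODEL_V2, ".//parameter")
```

    Changing the path expression changes what counts as a difference: with `.//reaction` only the new reaction is reported, while with `.//parameter` only the modified parameter value is reported.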

    dem Fachbereich Mathematik und Informatik der

    The accomplishment of this dissertation would not have been possible without the contribution of many individuals, to whom I want to express my appreciation and gratitude. First of all, I want to gratefully acknowledge my supervisor, Prof. Dr. Bernd Freisleben, for his persistent support and encouragement. His academic guidance and thorough insights helped me to find my own way in the world of research. I wish to express my gratitude to Prof. Dr. Wolfgang Wiechert for the fruitful technical discussions we had and for the insights he provided to me in the field of metabolic engineering and systems biology. My acknowledgement also goes to Prof. Dr. Bernhard Seeger and Prof. Dr. Eyke Hüllermeier for serving in my dissertation committee. The research work presented in this thesis was financially supported by the Deutsche Forschungsgemeinschaft (SPP 1063, Teilprojekt FR 791/8-1) and by Deutscher Akademischer Austauschdienst (DAAD, Stability Pact fo

    Deterministic protein inference for shotgun proteomics data provides new insights into Arabidopsis pollen development and function

    Pollen, the male gametophyte of flowering plants, represents an ideal biological system to study developmental processes, such as cell polarity, tip growth, and morphogenesis. Upon hydration, the metabolically quiescent pollen rapidly switches to an active state, exhibiting extremely fast growth. This rapid switch requires relevant proteins to be stored in the mature pollen, where they have to retain functionality in a desiccated environment. Using a shotgun proteomics approach, we unambiguously identified ∼3500 proteins in Arabidopsis pollen, including 537 proteins that were not identified in genetic or transcriptomic studies. To generate this comprehensive reference data set, which extends the previously reported pollen proteome by a factor of 13, we developed a novel deterministic peptide classification scheme for protein inference. This generally applicable approach considers the gene model–protein sequence–protein accession relationships. It allowed us to classify and eliminate ambiguities inherently associated with any shotgun proteomics data set, to report a conservative list of protein identifications, and to seamlessly integrate data from previous transcriptomics studies. Manual validation of proteins unambiguously identified by a single, information-rich peptide enabled us to significantly reduce the false discovery rate, while keeping valuable identifications of shorter and lower abundant proteins. Bioinformatic analyses revealed a higher stability of pollen proteins compared to those of other tissues and implied a protein family of previously unknown function in vesicle trafficking. Interestingly, the pollen proteome is most similar to that of seeds, indicating physiological similarities between these developmentally distinct tissues

    Improved prediction of peptide detectability for targeted proteomics using a rank-based algorithm and organism-specific data.

    The in silico prediction of the best-observable "proteotypic" peptides in mass spectrometry-based workflows is a challenging problem. Being able to accurately predict such peptides would enable the informed selection of proteotypic peptides for targeted quantification of previously observed and non-observed proteins for any organism, with a significant impact for clinical proteomics and systems biology studies. Current prediction algorithms rely on physicochemical parameters in combination with positive and negative training sets to identify those peptide properties that most profoundly affect their general detectability. Here we present PeptideRank, an approach that uses a learning-to-rank algorithm for peptide detectability prediction from shotgun proteomics data, and that eliminates the need to select a negative dataset for the training step. A large number of different peptide properties are used to train ranking models in order to predict a ranking of the best-observable peptides within a protein. Empirical evaluation with rank accuracy metrics showed that PeptideRank complements existing prediction algorithms. Our results indicate that the best performance is achieved when it is trained on organism-specific shotgun proteomics data, and that PeptideRank is most accurate for short to medium-sized and abundant proteins, without any loss in prediction accuracy for the important class of membrane proteins. BIOLOGICAL SIGNIFICANCE: Targeted proteomics approaches have been gaining a lot of momentum and hold immense potential for systems biology studies and clinical proteomics. However, since only very few complete proteomes have been reported to date, for a considerable fraction of a proteome there is no experimental proteomics evidence that could guide the selection of the best-suited proteotypic peptides (PTPs), i.e. peptides that are specific to a given proteoform and that are repeatedly observed in a mass spectrometer. We describe a novel, rank-based approach for the prediction of the best-suited PTPs for targeted proteomics applications. By building on methods developed in the field of information retrieval (e.g. web search engines like Google's PageRank), we circumvent the delicate step of selecting positive and negative training sets and at the same time also more closely reflect the experimentalist's need for selecting e.g. the 5 most promising peptides for targeting a protein of interest. This approach allows PTPs to be predicted for not yet observed proteins or for organisms without prior experimental proteomics data, such as many non-model organisms