41 research outputs found

    Computational methods for small molecules

    Metabolism is the system of chemical reactions sustaining life in the cells of living organisms. It is responsible for cellular processes that break down nutrients for energy and produce building blocks for necessary molecules. The study of metabolism is vital to many disciplines in medicine and pharmacy. Chemical reactions operate on small molecules called metabolites, which form the core of metabolism. In this thesis we propose efficient computational methods for small molecules in metabolic applications. We discuss four distinct studies covering two major themes: the atom-level description of biochemical reactions, and the analysis of tandem mass spectrometric measurements of metabolites. In the first part we study atom-level descriptions of organic reactions. We begin by proposing an optimal algorithm for determining the atom-to-atom correspondences between the reactant and product metabolites of organic reactions. In addition, we introduce a graph edit distance based cost as the mathematical formalism for determining the optimality of atom mappings. We continue by proposing a compact single-graph representation of reactions using the atom mappings. We investigate the utility of the new representation in a reaction function classification task, where a descriptive category of the reaction's function is predicted. To facilitate the prediction, we introduce the first feasible path-based graph kernel, which describes reactions as path sequences and achieves high classification accuracy. In the second part we turn our focus to analysing tandem mass spectrometric measurements of metabolites. In a tandem mass spectrometer, an input molecule structure is fragmented into substructures or fragments, whose masses are observed. We begin by studying the fragment identification problem. A combinatorial algorithm is presented to enumerate candidate substructures based on the given masses.
We also demonstrate the usefulness of approximated bond energies as a cost function to rank the candidate structures according to their chemical feasibility. We propose fragmentation tree models to describe the dependencies between fragments for higher identification accuracy. We continue by studying a closely related problem where an unknown metabolite is elucidated based on its tandem mass spectrometric fragment signals. This metabolite identification task is an important problem in metabolomics, underpinning the subsequent modelling and analysis efforts. We propose an automatic machine learning framework to predict a set of structural properties of the unknown metabolite. The properties are turned into candidate structures by a novel statistical model. We introduce the first mass spectral kernels and explore three feature classes to facilitate the prediction. The kernels introduce support for high-accuracy mass spectrometric measurements for enhanced predictive accuracy.
This thesis presents efficient computational methods for small molecules in metabolic applications. Metabolism is the system of chemical reactions that sustains life at the cellular level. Metabolic processes break down nutrients into energy and into building blocks for producing the molecules that cells need. The small molecules modified by chemical reactions are called metabolites. The thesis contains four independent studies, divided thematically into the atom-level description of biochemical reactions and the analysis of mass spectrometric measurements of metabolites. The first part of the thesis deals with atom-level descriptions of biochemical reactions. The thesis introduces an optimal algorithm for determining the atom mappings between the reactants and products of reactions. Optimality is defined by a cost function based on graph edit distance. An optimal atom mapping makes it possible to represent a reaction unambiguously as a single graph. The new reaction representation is used in a reaction function prediction task, where the aim is to determine automatically a category that describes the reaction in words. The thesis presents a path-based graph kernel that describes reactions as path sequences of atoms, in contrast to the earlier walk sequences, and achieves better prediction accuracy. The second part of the thesis analyses tandem mass spectrometric measurements of metabolites. A tandem mass spectrometer breaks the input molecule under analysis into fragments and measures their mass-to-charge ratios. The thesis presents a thorough combinatorial algorithm for identifying fragments. The cost function of the method is based on comparing the bond energies of the fragments. Finally, the thesis presents fragmentation trees, which can be used to model the relationships between fragments and to achieve better identification accuracy. Alongside fragment identification, the metabolites under analysis can also be identified. This problem is significant and a prerequisite for analyses of metabolism. The thesis presents a machine learning method that predicts features describing the structure of an unknown metabolite and derives structure predictions from them statistically. The method introduces the first kernel functions designed specifically for mass spectrometric data and achieves good prediction accuracy.

    Improved Cross-Linking Mass Spectrometry Algorithms for Probing Protein Structures and Interactions

    Proteins are the most active molecules in living bodies. They catalyze chemical reactions, provide structural support for cells and allow organisms to move. Their function is intrinsically linked to their folded structure. Resolving the structures of proteins and protein complexes is crucial for our understanding of basic biological processes and diseases. Cross-Linking Mass Spectrometry (XL-MS) is a method to gain structural insights into protein complexes. XL-MS data analysis software is not yet as established as the software for many other methods in proteomics. It has significant room for improvement in terms of sensitivity, efficiency and standardization of file formats and workflows to facilitate interoperability and reproducibility. In this thesis we present a new XL-MS search engine, OpenPepXL. We develop an algorithm that scores all candidate cross-linked peptide pairs and is efficient enough to be used on a standard desktop PC for most applications. OpenPepXL supports the standardized XL-MS identification file format defined as a part of the mzIdentML 1.2 specifications that were developed in collaboration with the Proteomics Standards Initiative. We benchmark OpenPepXL against other state-of-the-art XL-MS identification tools on multiple datasets that allow cross-link validation through structures or other means. We show that our exhaustive approach, although not the quickest one, is superior in sensitivity to other tools. We suggest this is due to some tools improving their processing time by discarding too many candidates in early steps of the data analysis. We apply XL-MS analysis with OpenPepXL to multiple protein complexes related to meiosis and the type III secretion system. The first project involved several proteins with unknown structures, some of which are expected to be at least partially intrinsically disordered and therefore difficult to investigate using most traditional structural research methods.
Unfortunately, we could not find cross-links between the interaction sites we were most interested in, but we were able to identify many others in these complexes and gained some structural insights. In the second project we used the photo-cross-linking amino acid pBpa to test very specific hypotheses about interactions within the type III secretion system. We have not yet been able to gain any new structural information. However, we could confirm that this is a viable approach: it is possible to identify, at residue-level resolution, cross-links between a pBpa residue incorporated into a protein sequence and the residue it cross-links to.

    Plant Proteomic Research

    Plants, being sessile in nature, are constantly exposed to environmental challenges resulting in substantial yield loss. To cope with harsh environments, plants have developed a wide range of adaptation strategies involving morpho-anatomical, physiological, and biochemical traits. In recent years, there has been phenomenal progress in the understanding of plant responses to environmental cues at the protein level. This progress has been fueled by advances in mass spectrometry techniques, complemented by genome-sequence data, modern bioinformatics analysis, and improved sample preparation and fractionation strategies. As proteins ultimately regulate cellular functions, it is perhaps of greater importance to understand the changes that occur at the protein-abundance level, rather than the modulation of mRNA expression. This Special Issue on "Plant Proteomic Research" brings together a selection of insightful papers that address some of these issues related to applications of proteomic techniques in elucidating master regulator proteins and the pathways associated with plant development and stress responses. This Issue includes four reviews and 13 original articles, primarily on environmental proteomic studies.

    An interactive online software platform for the analysis of small molecules using hyphenated mass spectrometry: MeltDB and ALLocator

    Kessler N. An interactive online software platform for the analysis of small molecules using hyphenated mass spectrometry: MeltDB and ALLocator. Bielefeld: Universität Bielefeld; 2018

    Qualitative and quantitative screening of side-chain profiles of cereal grain arabinoxylans

    Arabinoxylans are the major hemicellulosic component of grasses' cell walls. Side-chain profile differences between arabinoxylans alter their functionality in foods and biomaterials. Therefore, both knowledge of the side-chain elements present and quantification of these elements are important. In this work, new arabinoxylan structural elements were identified, and chromatographic and spectroscopic screening methods for both previously described and novel side-chain components were developed.

    The Origin and Early Evolution of Life

    What is life? How, where, and when did life arise? These questions have remained among the most fascinating of the last hundred years. Systems chemistry offers a way to better understand this problem and to try to answer the unsolved question of the origin of life. Self-organization, thanks to the role of lipid boundaries, made possible the rise of protocells. The role of these boundaries is to separate and co-locate micro-environments, and make them spatially distinct; to protect and keep them at defined concentrations; and to enable a multitude of often competing and interfering biochemical reactions to occur simultaneously. The aim of this Special Issue is to summarize the latest discoveries in the field of the prebiotic chemistry of biomolecules, self-organization, protocells and the origin of life. In recent years, thousands of excellent reviews and articles have appeared in the literature and some breakthroughs have already been achieved. However, a great deal of work remains to be carried out. Beyond the borders of the traditional domains of scientific activity, the multidisciplinary character of the present Special Issue leaves space for anyone to creatively contribute to any aspect of these and related relevant topics. We hope that the presented works will be stimulating for a new generation of scientists who are taking their first steps in this fascinating field.