Evolution of Metabolic Networks: A Computational Framework
Background: The metabolic architectures of extant organisms share many key pathways such as the citric acid
cycle, glycolysis, or the biosynthesis of most amino acids. Several competing hypotheses for the evolutionary
mechanisms that shape metabolic networks have been discussed in the literature, each of which finds support
from comparative analysis of extant genomes. Alternatively, the principles of metabolic evolution can be studied
by direct computer simulation. This requires, however, an explicit implementation of all pertinent components: a
universe of chemical reactions upon which the metabolism is built, an explicit representation of the enzymes that
implement the metabolism, of a genetic system that encodes these enzymes, and of a fitness function that can
be selected for.
Results: We describe here a simulation environment that implements all these components in a simplified way so
that large-scale evolutionary studies are feasible. We employ an artificial chemistry that views chemical reactions as
graph rewriting operations and utilizes a toy-version of quantum chemistry to derive thermodynamic parameters.
Minimalist organisms with simple string-encoded genomes produce model ribozymes whose catalytic activity is
determined by an ad hoc mapping between their secondary structure and the transition state graphs that they
stabilize. Fitness is computed utilizing the ideas of metabolic flux analysis. We present an implementation of the
complete system and first simulation results.
Conclusions: The simulation system presented here allows coherent investigations into the evolutionary mechanisms of the first steps of metabolic evolution using a self-consistent toy universe.
Automatic Assignment of EC Numbers
A wide range of research areas in molecular biology and medical biochemistry require a reliable enzyme classification system, e.g., drug design, metabolic network reconstruction, and systems biology. When research scientists in these areas wish to refer unambiguously to an enzyme and its function, they use the EC number introduced by the Nomenclature Committee of the International Union of Biochemistry and Molecular Biology (IUBMB). However, each of these applications depends critically on the consistency and reliability of the underlying data. We have developed tools for the validation of the EC number classification scheme. In this paper, we present validated data for 3788 enzymatic reactions covering 229 sub-subclasses of the EC classification system. Over 80% agreement was found between our assignment and the EC classification. For 61 (i.e., only 2.5%) reactions we found that their assignment was inconsistent with the rules of the nomenclature committee; they have to be transferred to other sub-subclasses. We demonstrate that our validation results can be used to initiate corrections and improvements to the EC number classification scheme.
A critical evaluation of automatic atom mapping algorithms and tools
Master's dissertation in Bioinformatics. The identification of the atoms that change position in chemical reactions is
important knowledge within the field of Metabolic Engineering (ME). It can lead to new
advances at different levels, from the reconstruction of metabolic networks to the classification
of chemical reactions, through the identification of the atomic changes inside a reaction.
The Atom Mapping approach was initially developed in the 1960s, but it has recently
seen important advances and has been used in diverse biological and biotechnological studies.
The main methodologies used for the atom mapping process are the Maximum Common
Substructure (MCS) and the Linear Optimization methods, both of which require computational
know-how and powerful resources to run the underlying tools.
In this work, we assessed a number of previously implemented atom mapping frameworks
and built a framework capable of managing the different data inputs and outputs, as
well as the mapping process provided by each of these third-party tools. We also evaluated
the admissibility of the atom maps calculated by different algorithms, assessing whether
different approaches were capable of returning equivalent atom maps for the same chemical
reaction.
Automated reaction mechanism generation : improving accuracy and broadening scope
Thesis (Ph.D.), Massachusetts Institute of Technology, Dept. of Chemical Engineering, 2012, by Gregory Russell Magoon. Includes bibliographical references (p. 169-186). Chemical kinetic modeling plays an important role in the study of reactive chemical systems. Thus, an automated means of constructing chemical kinetic models forms a useful tool in the engineering and science surrounding such systems. This document describes work to further develop one such tool, known as RMG (Reaction Mechanism Generator). Focus is placed on improving the accuracy of parameter estimation in the mechanism generation process and expanding the scope of applicability of the tool. In particular, effort has targeted the generation and use of explicit three-dimensional molecular structures for chemical species considered during reaction mechanism generation. This work has resulted in a software system integrated with RMG that can automatically generate and use such structures with quantum chemistry or force field codes to obtain more reliable thermochemistry estimates for cyclic structures without human intervention. Ultimately, the result of these updates is improved usefulness and reliability of the software system as a predictive tool. An application of the tool to the high temperature oxidation of JP-10, a jet fuel often used in military applications, is described. Using the newly refined RMG system, a detailed chemical kinetic model was constructed for this system. The resulting model represents a significant improvement upon existing work for JP-10 oxidation by capturing detailed chemistry for this system. Simulations with this model have been found to produce results for ignition delay and product distribution that compare favorably with experimental results. The successful application of the refined RMG software system to this system demonstrates the practical utility of these updates.
Automated computation of materials properties
Materials informatics offers a promising pathway towards rational materials
design, replacing the current trial-and-error approach and accelerating the
development of new functional materials. Through the use of sophisticated data
analysis techniques, underlying property trends can be identified, facilitating
the formulation of new design rules. Such methods require large sets of
consistently generated, programmatically accessible materials data.
Computational materials design frameworks using standardized parameter sets are
the ideal tools for producing such data. This work reviews the state-of-the-art
in computational materials design, with a focus on these automated
frameworks. Features such as structural prototyping and
automated error correction that enable rapid generation of large datasets are
discussed, and the way in which integrated workflows can simplify the
calculation of complex properties, such as thermal conductivity and mechanical
stability, is demonstrated. The organization of large datasets composed of
calculations, and the tools that render them
programmatically accessible for use in statistical learning applications, are
also described. Finally, recent advances in leveraging existing data to predict
novel functional materials, such as entropy stabilized ceramics, bulk metallic
glasses, thermoelectrics, superalloys, and magnets, are surveyed. Comment: 25 pages, 7 figures, chapter in a book
Automatic learning for the classification of chemical reactions and in statistical thermodynamics
This Thesis describes the application of automatic learning methods for a) the classification of organic and metabolic reactions, and b) the mapping of Potential Energy Surfaces (PES). The classification of reactions was approached with two distinct methodologies: a representation of chemical reactions based on NMR data, and a representation of chemical reactions from the reaction equation based on the physico-chemical and topological features of chemical bonds.
NMR-based classification of photochemical and enzymatic reactions. Photochemical
and metabolic reactions were classified by Kohonen Self-Organizing Maps (Kohonen
SOMs) and Random Forests (RFs) taking as input the difference between the 1H
NMR spectra of the products and the reactants. The development of such a representation can be applied in automatic analysis of changes in the 1H NMR spectrum of a mixture and their interpretation in terms of the chemical reactions taking place. Examples of possible applications are the monitoring of reaction processes, evaluation of the stability of chemicals, or even the interpretation of metabonomic data.
A Kohonen SOM trained with a data set of metabolic reactions catalysed by transferases
was able to correctly classify 75% of an independent test set in terms of the EC
number subclass. Random Forests improved the correct predictions to 79%. With photochemical reactions classified into 7 groups, an independent test set was classified with 86-93% accuracy. The data set of photochemical reactions was also used to simulate mixtures with two reactions occurring simultaneously. Kohonen SOMs and Feed-Forward Neural Networks (FFNNs) were trained to classify the reactions occurring in a mixture based on the 1H NMR spectra of the products and reactants. Kohonen SOMs allowed the correct assignment of 53-63% of the mixtures (in a test set). Counter-Propagation Neural Networks (CPNNs) gave similar results. Supervised learning techniques improved the results to 77% correct assignments when an ensemble of ten FFNNs was used and to 80% when Random Forests were used.
This study was performed with NMR data simulated from the molecular structure by
the SPINUS program. In the design of one test set, simulated data was combined with
experimental data. The results support the proposal of linking databases of chemical
reactions to experimental or simulated NMR data for automatic classification of reactions and mixtures of reactions.
Genome-scale classification of enzymatic reactions from their reaction equation.
The MOLMAP descriptor relies on a Kohonen SOM that defines types of bonds on the basis of their physico-chemical and topological properties. The MOLMAP descriptor of a molecule represents the types of bonds available in that molecule. The MOLMAP
descriptor of a reaction is defined as the difference between the MOLMAPs of the products and the reactants, and numerically encodes the pattern of bonds that are broken,
changed, and made during a chemical reaction.
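The difference descriptor described above can be sketched in a few lines of Python. This is only an illustration under simplifying assumptions: in the thesis the bond types are defined by a trained Kohonen SOM, whereas here they are replaced by hand-assigned labels, and the function names `molecule_descriptor` and `reaction_descriptor` are hypothetical.

```python
from collections import Counter

def molecule_descriptor(bond_types):
    """Count how many bonds of each (hand-labelled) type a molecule contains."""
    return Counter(bond_types)

def reaction_descriptor(reactants, products):
    """Bond-type counts of the products minus those of the reactants.

    Negative entries are bonds lost, positive entries are bonds gained;
    bond types whose count does not change are dropped.
    """
    total = Counter()
    for mol in products:
        total.update(molecule_descriptor(mol))
    for mol in reactants:
        total.subtract(molecule_descriptor(mol))
    return {bond: n for bond, n in total.items() if n != 0}

# Toy example: hydrogenation of ethene, C2H4 + H2 -> C2H6
ethene = ["C=C"] + ["C-H"] * 4
hydrogen = ["H-H"]
ethane = ["C-C"] + ["C-H"] * 6

print(reaction_descriptor([ethene, hydrogen], [ethane]))
```

In this toy case the descriptor records one C=C and one H-H bond lost, and one C-C and two C-H bonds gained, which is the kind of bond-change pattern the text refers to.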
The automatic perception of chemical similarities between metabolic reactions is required for a variety of applications ranging from the computer validation of classification systems, genome-scale reconstruction (or comparison) of metabolic pathways, to the classification of enzymatic mechanisms. Catalytic functions of proteins are generally described by the EC numbers that are simultaneously employed as identifiers of reactions, enzymes, and enzyme genes, thus linking metabolic and genomic information. Different methods
should be available to automatically compare metabolic reactions and for the automatic
assignment of EC numbers to reactions still not officially classified.
In this study, the genome-scale data set of enzymatic reactions available in the KEGG
database was encoded by the MOLMAP descriptors, and was submitted to Kohonen
SOMs to compare the resulting map with the official EC number classification, to explore
the possibility of predicting EC numbers from the reaction equation, and to assess the
internal consistency of the EC classification at the class level.
A general agreement with the EC classification was observed, i.e. a relationship between the similarity of MOLMAPs and the similarity of EC numbers. At the same time, MOLMAPs were able to discriminate between EC sub-subclasses. EC numbers could be assigned at the class, subclass, and sub-subclass levels with accuracies up to 92%, 80%, and 70% for independent test sets. The correspondence between chemical similarity of metabolic reactions and their MOLMAP descriptors was applied to the identification of a number of reactions mapped into the same neuron but belonging to different EC classes, which demonstrated the ability of the MOLMAP/SOM approach to verify the internal consistency of classifications in databases of metabolic reactions.
RFs were also used to assign the four levels of the EC hierarchy from the reaction
equation. EC numbers were correctly assigned in 95%, 90%, 85% and 86% of the cases
(for independent test sets) at the class, subclass, sub-subclass and full EC number level, respectively. Experiments for the classification of reactions from the main reactants and products were performed with RFs - EC numbers were assigned at the class, subclass and sub-subclass level with accuracies of 78%, 74% and 63%, respectively.
In the course of the experiments with metabolic reactions we suggested that the
MOLMAP / SOM concept could be extended to the representation of other levels of
metabolic information such as metabolic pathways. Following the MOLMAP idea, the pattern of neurons activated by the reactions of a metabolic pathway is a representation of the reactions involved in that pathway - a descriptor of the metabolic pathway. This reasoning enabled the comparison of different pathways, the automatic classification of pathways, and a classification of organisms based on their biochemical machinery. The three levels of classification (from bonds to metabolic pathways) made it possible to map and perceive chemical similarities between metabolic pathways even for pathways of different
types of metabolism and pathways that do not share similarities in terms of EC numbers.
Mapping of PES by neural networks (NNs). In a first series of experiments, ensembles of Feed-Forward NNs (EnsFFNNs) and Associative Neural Networks (ASNNs) were trained to reproduce PES represented by the Lennard-Jones (LJ) analytical potential
function. The accuracy of the method was assessed by comparing the results of molecular dynamics simulations (thermal, structural, and dynamic properties) obtained from the NNs-PES and from the LJ function.
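For reference, the standard 12-6 form of the Lennard-Jones pair potential is V(r) = 4ε[(σ/r)^12 - (σ/r)^6]. The sketch below evaluates it in Python; the function name and the reduced units (ε = σ = 1 by default) are assumptions made for illustration, and the abstract does not state which parameter values were used in the thesis.

```python
def lennard_jones(r, epsilon=1.0, sigma=1.0):
    """Lennard-Jones pair potential V(r) = 4*eps*((sigma/r)**12 - (sigma/r)**6)."""
    if r <= 0:
        raise ValueError("distance must be positive")
    sr6 = (sigma / r) ** 6
    return 4.0 * epsilon * (sr6 * sr6 - sr6)

# The potential crosses zero at r = sigma and has its minimum,
# of depth -epsilon, at r = 2**(1/6) * sigma.
r_min = 2 ** (1 / 6)
print(lennard_jones(1.0), lennard_jones(r_min))
```

A function of this kind would supply the reference energies against which a trained network's predictions can be compared.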
The results indicated that for LJ-type potentials, NNs can be trained to generate
accurate PES to be used in molecular simulations. EnsFFNNs and ASNNs gave better
results than single FFNNs. The NN models show a remarkable ability to interpolate between distant curves and to accurately reproduce potentials for use in molecular simulations.
The purpose of the first study was to systematically analyse the accuracy of different NNs. Our main motivation, however, is reflected in the next study: the mapping
of multidimensional PES by NNs to simulate, by Molecular Dynamics or Monte Carlo,
the adsorption and self-assembly of solvated organic molecules on noble-metal electrodes.
Indeed, for such complex and heterogeneous systems the development of suitable analytical functions that fit quantum mechanical interaction energies is a non-trivial or even impossible task.
The data consisted of energy values, from Density Functional Theory (DFT) calculations,
at different distances, for several molecular orientations and three electrode
adsorption sites. The results indicate that NNs require a data set large enough to cover
well the diversity of possible interaction sites, distances, and orientations. NNs trained with such data sets can perform equally well or even better than analytical functions.
Therefore, they can be used in molecular simulations, particularly for the ethanol/Au(111)
interface, which is the case studied in the present Thesis. Once properly trained,
the networks are able to produce, as output, any required number of energy points for
accurate interpolations.
Computational methods for small molecules
Metabolism is the system of chemical reactions sustaining life in the cells of living organisms. It is responsible for cellular processes that break down nutrients for energy and produce building blocks for necessary molecules. The study of metabolism is vital to many disciplines in medicine and pharmacy. Chemical reactions operate on small molecules called metabolites, which form the core of metabolism. In this thesis we propose efficient computational methods for small molecules in metabolic applications. We discuss four distinct studies covering two major themes: the atom-level description of biochemical reactions, and the analysis of tandem mass spectrometric measurements of metabolites.
In the first part we study atom-level descriptions of organic reactions. We begin by proposing an optimal algorithm for determining the atom-to-atom correspondences between the reactant and product metabolites of organic reactions. In addition, we introduce a graph edit distance based cost as the mathematical formalism to determine optimality of atom mappings. We continue by proposing a compact single-graph representation of reactions using the atom mappings. We investigate the utility of the new representation in a reaction function classification task, where a descriptive category of the reaction's function is predicted. To facilitate the prediction, we introduce the first feasible path-based graph kernel, which describes the reactions as path sequences to high classification accuracy.
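The idea of scoring an atom mapping by an edit-distance-like cost can be illustrated with a small Python sketch. It is not the thesis's algorithm: the search below is a plain brute force over element-preserving permutations, bond orders are ignored, and the names `mapping_cost` and `best_mapping` as well as the toy molecules are assumptions made for illustration.

```python
from itertools import permutations

def mapping_cost(r_bonds, p_bonds, mapping):
    """Number of bonds broken or formed under a reactant-to-product atom mapping."""
    mapped = {frozenset((mapping[a], mapping[b])) for a, b in r_bonds}
    target = {frozenset(bond) for bond in p_bonds}
    return len(mapped ^ target)  # symmetric difference = edited bonds

def best_mapping(r_elems, p_elems, r_bonds, p_bonds):
    """Brute-force search for a minimum-cost, element-preserving mapping."""
    r_atoms = sorted(r_elems)
    best = None
    for perm in permutations(sorted(p_elems)):
        mapping = dict(zip(r_atoms, perm))
        if any(r_elems[a] != p_elems[b] for a, b in mapping.items()):
            continue  # atoms may only map to atoms of the same element
        cost = mapping_cost(r_bonds, p_bonds, mapping)
        if best is None or cost < best[0]:
            best = (cost, mapping)
    return best

# Toy rearrangement of a C-C-O skeleton into a C-O-C skeleton
r_elems = {1: "C", 2: "C", 3: "O"}
r_bonds = [(1, 2), (2, 3)]
p_elems = {"a": "C", "b": "C", "c": "O"}
p_bonds = [("a", "c"), ("b", "c")]

cost, mapping = best_mapping(r_elems, p_elems, r_bonds, p_bonds)
print(cost, mapping)
```

Brute force is only feasible for a handful of atoms, since the number of permutations grows factorially; it serves here to make the cost function concrete rather than to suggest a practical method.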
In the second part we turn our focus to analysing tandem mass spectrometric measurements of metabolites. In a tandem mass spectrometer, an input molecule structure is fragmented into substructures or fragments, whose masses are observed. We begin by studying the fragment identification problem. A combinatorial algorithm is presented to enumerate candidate substructures based on the given masses. We also demonstrate the usefulness of utilising approximated bond energies as a cost function to rank the candidate structures according to their chemical feasibility. We propose fragmentation tree models to describe the dependencies between fragments for higher identification accuracy.
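A toy version of enumerating candidate substructures for an observed mass can be sketched as follows. This is a naive exhaustive enumeration under strong simplifications, not the algorithm of the thesis: hydrogens are folded into heavy-atom groups with nominal integer masses, hydrogen rearrangements and bond-energy ranking are ignored, and the function names and tolerance parameter are hypothetical.

```python
from itertools import combinations

def is_connected(atoms, bonds):
    """True if the atoms form one connected piece using only bonds among them."""
    atoms = set(atoms)
    seen, stack = set(), [next(iter(atoms))]
    while stack:
        u = stack.pop()
        if u in seen:
            continue
        seen.add(u)
        for a, b in bonds:
            if a == u and b in atoms:
                stack.append(b)
            elif b == u and a in atoms:
                stack.append(a)
    return seen == atoms

def candidate_fragments(masses, bonds, target, tol=0.5):
    """Enumerate connected atom subsets whose total mass is within tol of target."""
    hits = []
    for k in range(1, len(masses) + 1):
        for subset in combinations(sorted(masses), k):
            total = sum(masses[a] for a in subset)
            if abs(total - target) <= tol and is_connected(subset, bonds):
                hits.append(subset)
    return hits

# Toy ethanol skeleton CH3-CH2-OH with nominal group masses
masses = {"CH3": 15, "CH2": 14, "OH": 17}
bonds = [("CH3", "CH2"), ("CH2", "OH")]

print(candidate_fragments(masses, bonds, 31))
print(candidate_fragments(masses, bonds, 32))
```

The second call returns no candidates even though CH3 and OH together have the right mass, because those two groups are not bonded to each other; the connectivity check is what separates a substructure from an arbitrary subset of atoms.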
We continue by studying a closely related problem where an unknown metabolite is elucidated based on its tandem mass spectrometric fragment signals. This metabolite identification task is an important problem in metabolomics, underpinning the subsequent modelling and analysis efforts. We propose an automatic machine learning framework to predict a set of structural properties of the unknown metabolite. The properties are turned into candidate structures by a novel statistical model. We introduce the first mass spectral kernels and explore three feature classes to facilitate the prediction. The kernels introduce support for high-accuracy mass spectrometric measurements for enhanced predictive accuracy.
This thesis presents efficient computational methods for small molecules in metabolic applications. Metabolism is the system of chemical reactions that sustains life at the cellular level. Metabolic processes break down nutrients into energy and into building blocks for producing the molecules that cells need. The small molecules modified by chemical reactions are called metabolites. The thesis contains four independent studies, divided thematically between the atom-level description of biochemical reactions and the analysis of mass spectrometric measurements of metabolites.
The first part of the thesis deals with atom-level descriptions of biochemical reactions. The thesis presents an optimal algorithm for determining the atom mappings between the reactants and products of reactions. Optimality is defined by a cost function based on graph edit distance. The optimal atom mapping makes it possible to describe a reaction unambiguously with a single graph. The new reaction representation is used in a reaction function prediction task, which aims to determine automatically a category that describes the reaction in words. The thesis presents a path-based graph kernel that describes reactions as path sequences of atoms, in contrast to the walk sequences used previously, and achieves better prediction accuracy.
The second part of the thesis analyses tandem mass spectrometric measurements of metabolites. A tandem mass spectrometer breaks the input molecule under analysis into fragments and measures their mass-to-charge ratios. The thesis presents a thorough combinatorial algorithm for identifying the fragments. The cost function of the method is based on comparing the bond energies of the fragments. Finally, the thesis presents fragmentation trees, which can be used to model the relationships between fragments and to achieve better identification accuracy.
Alongside fragment identification, the metabolites under analysis can also be identified. The problem is significant and a prerequisite for analyses of metabolism. The thesis presents a machine learning method that predicts features describing the structure of an unknown metabolite and, on that basis, forms structure predictions statistically. The method introduces the first kernel functions specifically suited to mass spectrometry data and achieves good prediction accuracy.
Research and Technology
Langley Research Center is engaged in the basic and applied research necessary for the advancement of aeronautics and space flight, generating advanced concepts for the accomplishment of related national goals, and providing research advice, technological support, and assistance to other NASA installations, other government agencies, and industry. Highlights of major accomplishments and applications are presented.
Nature’s Optics and Our Understanding of Light
Optical phenomena visible to everyone abundantly illustrate important ideas in science and mathematics. The phenomena considered include rainbows, sparkling reflections on water, green flashes, earthlight on the moon, glories, daylight, crystals, and the squint moon. The concepts include refraction, wave interference, numerical experiments, asymptotics, Regge poles, polarisation singularities, conical intersections, and visual illusions