11,668 research outputs found

    A critical evaluation of automatic atom mapping algorithms and tools

    The identification of the atoms that change their position in chemical reactions is important knowledge within the field of Metabolic Engineering. It can lead to advances at several levels, from the reconstruction of metabolic networks to the classification of chemical reactions, through the identification of the atomic changes inside a reaction. The atom mapping approach was initially developed in the 1960s and has recently undergone important advances, being used in diverse biological and biotechnological studies. The main methodologies used for atom mapping are the Maximum Common Substructure and Linear Optimization methods, both of which require computational know-how and powerful resources to run the underlying tools. In this work, we assessed a number of previously implemented atom mapping frameworks and built a framework capable of managing the different data inputs and outputs, as well as the mapping process provided by each of these third-party tools. We evaluated the admissibility of the atom maps calculated by the different algorithms, also assessing whether different approaches were capable of returning equivalent atom maps for the same chemical reaction. (ERDF, European Regional Development Fund, UID/BIO/04469/2013)
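
    As a rough illustration of the Maximum Common Substructure idea mentioned above, the sketch below uses RDKit (an assumed dependency, not necessarily one of the tools evaluated in this work) to pair the atoms that a reactant and a product have in common; the esterification reaction and the resulting partial atom map are purely illustrative.

        # Minimal MCS-based atom mapping sketch with RDKit (assumed available).
        from rdkit import Chem
        from rdkit.Chem import rdFMCS

        reactant = Chem.MolFromSmiles("CC(=O)O")    # acetic acid (illustrative)
        product = Chem.MolFromSmiles("CC(=O)OC")    # methyl acetate (illustrative)

        # Maximum common substructure shared by reactant and product.
        mcs = rdFMCS.FindMCS([reactant, product])
        query = Chem.MolFromSmarts(mcs.smartsString)

        # Atoms matched by the MCS are assumed to keep their identity across
        # the reaction; pairing the two matches gives a (partial) atom map.
        r_match = reactant.GetSubstructMatch(query)
        p_match = product.GetSubstructMatch(query)
        atom_map = dict(zip(r_match, p_match))
        print(atom_map)    # reactant atom index -> product atom index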

    A critical evaluation of automatic atom mapping algorithms and tools

    Master's dissertation in Bioinformatics. The identification of the atoms that change their position in chemical reactions is important knowledge within the field of Metabolic Engineering (ME). It can lead to advances at several levels, from the reconstruction of metabolic networks to the classification of chemical reactions, through the identification of the atomic changes inside a reaction. The atom mapping approach was initially developed in the 1960s and has recently undergone important advances, being used in diverse biological and biotechnological studies. The main methodologies used for the atom mapping process are the Maximum Common Substructure (MCS) and Linear Optimization methods, both of which require computational know-how and powerful resources to run the underlying tools. In this work, we assessed a number of previously implemented atom mapping frameworks and built a framework capable of managing the different data inputs and outputs, as well as the mapping process provided by each of these third-party tools. We also evaluated the admissibility of the atom maps calculated by the different algorithms, assessing whether different approaches were capable of returning equivalent atom maps for the same chemical reaction.

    Retrosynthetic reaction prediction using neural sequence-to-sequence models

    We describe a fully data-driven model that learns to perform a retrosynthetic reaction prediction task, treated as a sequence-to-sequence mapping problem. The end-to-end trained model has an encoder-decoder architecture consisting of two recurrent neural networks, an architecture that has previously shown great success in solving other sequence-to-sequence prediction tasks such as machine translation. The model is trained on 50,000 experimental reaction examples from the United States patent literature, spanning 10 broad reaction types that are commonly used by medicinal chemists. We find that our model performs comparably with a rule-based expert system baseline model, and also overcomes certain limitations associated with rule-based expert systems and with any machine learning approach that contains a rule-based expert system component. Our model provides an important first step towards solving the challenging problem of computational retrosynthetic analysis.
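
    The following is a minimal sketch of the encoder-decoder idea described above, assuming PyTorch is available: one GRU encodes the product token sequence and a second GRU decodes the reactant tokens. The vocabulary size, layer dimensions, and random token batches are placeholders, not the full setup used in the paper.

        import torch
        import torch.nn as nn

        class Seq2SeqRetro(nn.Module):
            def __init__(self, vocab_size=64, emb_dim=128, hidden_dim=256):
                super().__init__()
                self.embed = nn.Embedding(vocab_size, emb_dim)
                self.encoder = nn.GRU(emb_dim, hidden_dim, batch_first=True)
                self.decoder = nn.GRU(emb_dim, hidden_dim, batch_first=True)
                self.out = nn.Linear(hidden_dim, vocab_size)

            def forward(self, product_tokens, reactant_tokens):
                # Encode the product sequence into a final hidden state.
                _, h = self.encoder(self.embed(product_tokens))
                # Decode the reactant sequence conditioned on that state
                # (teacher forcing during training).
                dec_out, _ = self.decoder(self.embed(reactant_tokens), h)
                return self.out(dec_out)    # logits over the token vocabulary

        model = Seq2SeqRetro()
        product = torch.randint(0, 64, (2, 40))     # batch of 2 product token sequences
        reactant = torch.randint(0, 64, (2, 35))    # corresponding reactant targets
        logits = model(product, reactant)           # shape: (2, 35, 64)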

    BlogForever D2.6: Data Extraction Methodology

    This report outlines an inquiry into the area of web data extraction, conducted within the context of blog preservation. It reviews theoretical advances and practical developments for implementing data extraction, and extends the inquiry through an experiment that demonstrates the effectiveness and feasibility of implementing some of the suggested approaches. More specifically, the report discusses an approach based on unsupervised machine learning that employs the RSS feeds and HTML representations of blogs. It outlines the possibilities of extracting the semantics available in blogs and demonstrates the benefits of exploiting available standards such as microformats and microdata. The report then proposes a methodology for extracting and processing blog data to further inform the design and development of the BlogForever platform.
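
    As a small illustration of the microdata extraction mentioned above, the sketch below (assuming BeautifulSoup is available) collects schema.org itemprop values from a blog post's HTML; the sample HTML is invented for the example, and the BlogForever pipeline itself is considerably more involved.

        from bs4 import BeautifulSoup

        html = """
        <article itemscope itemtype="http://schema.org/BlogPosting">
          <h1 itemprop="headline">Example post</h1>
          <span itemprop="author">Jane Doe</span>
          <time itemprop="datePublished" datetime="2012-05-01">1 May 2012</time>
        </article>
        """

        soup = BeautifulSoup(html, "html.parser")
        properties = {}
        for tag in soup.find_all(attrs={"itemprop": True}):
            # Prefer machine-readable attribute values (datetime, content)
            # over the element's visible text.
            value = tag.get("datetime") or tag.get("content") or tag.get_text(strip=True)
            properties[tag["itemprop"]] = value

        print(properties)    # {'headline': 'Example post', 'author': 'Jane Doe', ...}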

    An MDL framework for sparse coding and dictionary learning

    The power of sparse signal modeling with learned over-complete dictionaries has been demonstrated in a variety of applications and fields, from signal processing to statistical inference and machine learning. However, the statistical properties of these models, such as under-fitting or over-fitting of given sets of data, are still not well characterized in the literature. As a result, the success of sparse modeling depends on hand-tuning critical parameters for each data set and application. This work addresses this issue by providing a practical and objective characterization of sparse models by means of the Minimum Description Length (MDL) principle -- a well-established information-theoretic approach to model selection in statistical inference. The resulting framework yields a family of efficient sparse coding and dictionary learning algorithms which, by virtue of the MDL principle, are completely parameter free. Furthermore, the framework makes it possible to incorporate additional prior information into existing models, such as Markovian dependencies, or to define completely new problem formulations, including in the area of matrix analysis, in a natural way. These virtues are demonstrated with parameter-free algorithms for the classic image denoising and classification problems, and for low-rank matrix recovery in video applications.
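
    To make the MDL idea concrete, the sketch below (NumPy assumed) encodes a signal greedily over a fixed dictionary and stops at the sparsity level that minimizes a crude two-part codelength: bits for a Gaussian-modeled residual plus bits for the coefficient indices and values. It is a simplified illustration under these assumptions, not the codelength formulation actually derived in the paper.

        import numpy as np

        def mdl_sparse_code(x, D, max_k=20):
            """Greedy sparse coding of x over dictionary D (columns are atoms),
            keeping the sparsity level with the smallest two-part codelength."""
            n, m = D.shape
            residual = x.copy()
            support = []
            best_codelength, best_coeffs = np.inf, np.zeros(m)
            for _ in range(max_k):
                # Select the atom most correlated with the current residual.
                idx = int(np.argmax(np.abs(D.T @ residual)))
                if idx not in support:
                    support.append(idx)
                # Re-fit coefficients on the current support by least squares.
                coeffs, *_ = np.linalg.lstsq(D[:, support], x, rcond=None)
                residual = x - D[:, support] @ coeffs
                # Two-part codelength: residual bits under a Gaussian model plus
                # index bits and a fixed 16-bit precision budget per coefficient
                # (an arbitrary choice for this sketch).
                var = max(np.mean(residual ** 2), 1e-12)
                codelength = 0.5 * n * np.log2(var) + len(support) * (np.log2(m) + 16)
                if codelength < best_codelength:
                    a = np.zeros(m)
                    a[support] = coeffs
                    best_codelength, best_coeffs = codelength, a
            return best_coeffs

        # Example with a random normalized dictionary (illustrative only).
        rng = np.random.default_rng(0)
        D = rng.standard_normal((64, 256))
        D /= np.linalg.norm(D, axis=0)
        x = D[:, [3, 40, 100]] @ np.array([1.0, -0.5, 0.8])
        a = mdl_sparse_code(x, D)
        print(np.flatnonzero(a))    # atoms retained by the MDL criterion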

    Towards automatic Markov reliability modeling of computer architectures

    The analysis and evaluation of reliability measures using time-varying Markov models is required for Processor-Memory-Switch (PMS) structures that have competing processes such as standby redundancy and repair, or renewal processes such as transient or intermittent faults. The task of generating these models is tedious and prone to human error due to the large number of states and transitions involved in any reasonable system. Therefore, model formulation is a major analysis bottleneck, and model verification is a major validation problem. The general unfamiliarity of computer architects with Markov modeling techniques further increases the need to automate model formulation. This paper presents an overview of the Automated Reliability Modeling (ARM) program, under development at NASA Langley Research Center. ARM will accept as input a description of the PMS interconnection graph, the behavior of the PMS components, the fault-tolerant strategies, and the operational requirements. The output of ARM will be the reliability or availability Markov model, formulated for direct use by evaluation programs. The advantages of such an approach are (a) utility to a large class of users, not necessarily expert in reliability analysis, and (b) a lower probability of human error in the computation.
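
    The models ARM formulates are meant to be handed to numerical evaluation programs. As a toy-scale illustration of that evaluation step (not the ARM program itself), the sketch below assumes NumPy and SciPy, builds the generator matrix of a two-unit standby system with repair by hand, and computes mission reliability from the transient state probabilities; the rates and mission time are invented for the example.

        import numpy as np
        from scipy.linalg import expm

        # States: 0 = both units up, 1 = one unit failed (under repair),
        # 2 = system failed (absorbing).
        lam, mu = 1e-3, 1e-1    # failure and repair rates per hour (illustrative)

        # Generator matrix Q: Q[i, j] is the rate from state i to state j,
        # and each row sums to zero.
        Q = np.array([
            [-lam,         lam,  0.0],
            [  mu, -(mu + lam),  lam],
            [ 0.0,         0.0,  0.0],
        ])

        p0 = np.array([1.0, 0.0, 0.0])    # start with both units up
        t = 1000.0                        # mission time in hours
        pt = p0 @ expm(Q * t)             # transient state probabilities at time t
        reliability = pt[:2].sum()        # probability the system has not failed
        print(f"R({t:.0f} h) = {reliability:.6f}")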

    Automated computation of materials properties

    Materials informatics offers a promising pathway towards rational materials design, replacing the current trial-and-error approach and accelerating the development of new functional materials. Through the use of sophisticated data analysis techniques, underlying property trends can be identified, facilitating the formulation of new design rules. Such methods require large sets of consistently generated, programmatically accessible materials data. Computational materials design frameworks using standardized parameter sets are the ideal tools for producing such data. This work reviews the state of the art in computational materials design, with a focus on these automated ab-initio frameworks. Features such as structural prototyping and automated error correction that enable rapid generation of large datasets are discussed, and the way in which integrated workflows can simplify the calculation of complex properties, such as thermal conductivity and mechanical stability, is demonstrated. The organization of large datasets composed of ab-initio calculations, and the tools that render them programmatically accessible for use in statistical learning applications, are also described. Finally, recent advances in leveraging existing data to predict novel functional materials, such as entropy-stabilized ceramics, bulk metallic glasses, thermoelectrics, superalloys, and magnets, are surveyed.
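
    As a hedged illustration of the statistical-learning use of such datasets, the sketch below fits an off-the-shelf regressor with scikit-learn to synthetic data standing in for a table of ab-initio computed descriptors and a target property; it is a stand-in for the far richer workflows the review describes, and every quantity in it is invented.

        from sklearn.datasets import make_regression
        from sklearn.ensemble import RandomForestRegressor
        from sklearn.model_selection import train_test_split

        # Synthetic stand-in for an ab-initio dataset: rows are materials,
        # columns are numeric descriptors, y is a computed target property.
        X, y = make_regression(n_samples=500, n_features=20, noise=0.1, random_state=0)

        X_train, X_test, y_train, y_test = train_test_split(
            X, y, test_size=0.2, random_state=0)

        model = RandomForestRegressor(n_estimators=200, random_state=0)
        model.fit(X_train, y_train)
        print("held-out R^2:", model.score(X_test, y_test))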
