31 research outputs found

    Inductive queries for a drug designing robot scientist

    It is increasingly clear that machine learning algorithms need to be integrated in an iterative scientific discovery loop, in which data is queried repeatedly by means of inductive queries and where the computer provides guidance to the experiments that are being performed. In this chapter, we summarise several key challenges in achieving this integration of machine learning and data mining algorithms in methods for the discovery of Quantitative Structure Activity Relationships (QSARs). We introduce the concept of a robot scientist, in which all steps of the discovery process are automated; we discuss the representation of molecular data such that knowledge discovery tools can analyse it; and we discuss the adaptation of machine learning and data mining algorithms to guide QSAR experiments.

    Enumerating molecules.

    Women in Science 2016

    Women in Science 2016 summarizes research done by Smith College’s Summer Research Fellowship (SURF) Program participants. Ever since its 1967 start, SURF has been a cornerstone of Smith’s science education. In 2016, 150 students participated in SURF (144 hosted on campus and nearby field sites), supervised by 56 faculty mentor-advisors drawn from the Clark Science Center and connected to its eighteen science, mathematics, and engineering departments and programs and associated centers and units. At summer’s end, SURF participants were asked to summarize their research experiences for this publication.

    The Polytope Formalism: isomerism and associated unimolecular isomerisation

    This thesis concerns the ontology of isomerism, which encompasses the conceptual frameworks and relationships that comprise the subject matter; the necessary formal definitions, nomenclature, and representations that have impacts reaching into unexpected areas such as drug registration and patent specifications; the requisite controlled and precise vocabulary that facilitates nuanced communication; and the digital/computational formalisms that underpin the chemistry software and database tools that empower chemists to perform much of their work. Using conceptual tools taken from Combinatorics and Graph Theory, a unified description of isomerism and associated unimolecular isomerisation, called the Polytope Formalism, is presented that spans both constitutional isomerism and stereoisomerism. This includes unification of the varying approaches historically taken to describe and understand stereoisomerism in organic and inorganic compounds. Work for this Thesis began with the synthesis, isolation, and characterisation of compounds not adequately describable using existing IUPAC recommendations. Generalisation of the polytopal-rearrangements model of stereoisomerisation used for inorganic chemistry led to prescriptions that could deal with the synthesised compounds, revealing an unrecognised fundamental form of isomerism called akamptisomerism. Following on, this Thesis describes how attempting to place akamptisomerism within the context of existing stereoisomerism reveals significant systematic deficiencies in the IUPAC recommendations. These shortcomings have limited the conceptualisation of broad classes of compounds and hindered development of molecules for medicinal and technological applications. It is shown how the Polytope Formalism can be applied to the description of constitutional isomerism in a practical manner. Finally, a radically different medicinal chemistry design strategy with broad application, based upon these principles, is described.

    Nenad Trinajstić – Pioneer of Chemical Graph Theory

    We present a brief overview of the many contributions of Nenad Trinajstić to Chemical Graph Theory, an important and fast-developing branch of Theoretical Chemistry. In addition, we briefly outline the various activities of Trinajstić within the chemical community of Croatia. As can be seen, his scientific work has been very productive and has not abated despite the hostility towards Chemical Graph Theory in certain chemical circles over the past 30 years. On the contrary, Trinajstić continued and widened the areas of his research interest, which started with investigating the close relationship between Graph Theory and HMO, and demonstrated the importance of Chemical Graph Theory for chemistry. In more than one way he has proven the opponents of Chemical Graph Theory wrong, though some continue to fail to recognize the importance of Graph Theory in Chemistry.

    Estimation method for the thermochemical properties of polycyclic aromatic molecules

    Thesis (Ph. D.)--Massachusetts Institute of Technology, Dept. of Chemical Engineering, 2005. Includes bibliographical references. Polycyclic aromatic molecules, including polycyclic aromatic hydrocarbons (PAHs), have attracted considerable attention in the past few decades. They are formed during the incomplete combustion of hydrocarbon fuels and are precursors of soot. Some PAHs are known carcinogens, and control of their emissions is an important issue. These molecules are found in many materials, including coal, fuel oils, lubricants, and carbon black. They are also implicated in the formation of fullerenes, one of the most chemically versatile classes of molecules known. Clearly, models that provide predictive capability for their formation and growth are highly desirable. Thermochemical properties of the species in the model are often the most important parameters, particularly for high-temperature processes such as the formation of PAHs and other aromatic molecules. Thermodynamic consistency requires that reverse rate constants be calculated from the forward rate constants and from the equilibrium constants. The latter are obtained from the thermochemical properties of reactants and products. The predictive ability of current kinetic models is significantly limited by the scarcity of available thermochemical data. In this work we present the development of a Bond-Centered Group Additivity method for the estimation of the thermochemical properties of polycyclic aromatic molecules, including PAHs, molecules with the furan substructure, molecules with triple bonds, substituted PAHs, and radicals. This method is based on thermochemical values of about two hundred polycyclic aromatic molecules and radicals obtained from quantum chemical calculations at the B3LYP/6-31G(d) level. A consistent set of homodesmic reactions has been developed to accurately calculate the heat of formation from the absolute energy. The entropies calculated from the B3LYP/6-31G(d) vibrational frequencies are shown to be at least as reliable as the few available experimental values. This new Bond-Centered Group Additivity method predicts the thermochemistry of C₆₀ and C₇₀ fullerenes, as well as smaller aromatic molecules, with accuracy comparable to both experiments and the best quantum calculations. The method is also shown to extrapolate reasonably to infinite graphene sheets. The Bond-Centered Group Additivity method has been implemented as computer code within the automatic Reaction Mechanism Generation software (RMG) developed in our group. The database has been organized as a tree structure, making its maintenance and possible extension very straightforward. This computer code allows the fast and easy use of this estimation method by non-expert users. Moreover, since it is incorporated into RMG, it will allow users to generate reaction mechanisms that include aromatic molecules whose thermochemical properties are calculated using the Bond-Centered Group Additivity method. Exploratory equilibrium studies were performed. Equilibrium concentrations of individual species depend strongly on the thermochemistry of the individual species, emphasizing the importance of consistent thermochemistry for all the species involved in the calculations. Equilibrium calculations can provide many interesting insights into the relationship between PAHs and fullerenes in combustion. By Joanna Yu. Ph.D.
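The thermodynamic-consistency requirement mentioned in the abstract above can be illustrated with a minimal sketch. It is not taken from the thesis or from RMG. It only applies the standard relations K_p = exp(-ΔG°/RT) and k_rev = k_fwd / K_c. The function names, the 1 bar standard state, and the SI concentration units (mol/m³) are assumptions made for this illustration.

```python
import math

R = 8.314462618   # molar gas constant, J/(mol K)
P0 = 1.0e5        # standard pressure in Pa (assumed: 1 bar standard state)


def equilibrium_constant(delta_h, delta_s, temperature):
    """Dimensionless K_p from reaction enthalpy (J/mol) and entropy (J/(mol K))."""
    delta_g = delta_h - temperature * delta_s
    return math.exp(-delta_g / (R * temperature))


def reverse_rate_constant(k_forward, delta_h, delta_s, temperature, delta_n=0):
    """Reverse rate constant consistent with the forward one.

    delta_n is the change in the number of gas-phase moles (products minus
    reactants); it converts K_p to the concentration-based K_c in mol/m^3 units.
    """
    k_p = equilibrium_constant(delta_h, delta_s, temperature)
    k_c = k_p * (P0 / (R * temperature)) ** delta_n
    return k_forward / k_c


if __name__ == "__main__":
    # Hypothetical isomerisation A <=> B (delta_n = 0) with made-up thermochemistry.
    print(reverse_rate_constant(1.0e6, -40.0e3, -10.0, 1500.0))
```

Because the reverse rate constant depends exponentially on ΔG°/RT, errors in the estimated enthalpies or entropies carry straight through to the kinetics. This is consistent with the abstract's emphasis on reliable thermochemical data.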

    Development and Improvement of Tools and Algorithms for the Problem of Atom Type Perception and for the Assessment of Protein-Ligand-Complex Geometries

    In the context of the present work, a scoring function for protein-ligand complexes has been developed that is aimed not at affinity prediction but at a good recognition rate of near-native geometries. The developed program DSX makes use of the same formalism as the knowledge-based scoring function DrugScore, hence using the knowledge from crystallographic databases and atom-type-specific, distance-dependent distribution functions. It is based on newly defined atom types. Additionally, the program is augmented by two novel potentials which evaluate the torsion angles and (de-)solvation effects. Validation of DSX is based on a comprehensive data set known from the literature that allows for comparison with other popular scoring functions. DSX is intended for the recognition of near-native binding modes. In this important task, DSX outperforms the competitors, and it is also among the best scoring functions regarding the ranking of different compounds. Another essential step in the development of DSX was the automatic assignment of the new atom types. A powerful programming framework was implemented to fulfill this task. Validation was done on a data set known from the literature and showed superior efficiency and quality compared to similar programs where this data was available. The front-end fconv was developed to share this functionality with the scientific community. Multiple features useful in computational drug-design workflows are also included, and fconv was made freely available as an Open Source project. Based on the potentials developed for DSX, a number of further applications were created and implemented. The program HotspotsX calculates favorable interaction fields in protein binding pockets that can be used as a starting point for pharmacophoric models and that indicate possible directions for the optimization of lead structures. The program DSFP calculates scores based on fingerprints for given binding geometries. These fingerprints are compared with reference fingerprints that are derived from DSX interactions in known crystal structures of the particular target. Finally, the program DSX_wat was developed to predict stable water networks within a binding pocket. DSX interaction fields are used to calculate the putative water positions.