31 research outputs found

    Machine Learning in Discrete Molecular Spaces

    The past decade has seen an explosion of machine learning in chemistry. Whether it is in property prediction, synthesis, molecular design, or any other subdivision, machine learning seems poised to become an integral, if not a dominant, component of future research efforts. This extraordinary capacity rests on the interaction between machine learning models and the underlying chemical data landscape commonly referred to as chemical space. Chemical space has multiple incarnations, but is generally considered the space of all possible molecules. In this sense, it is one example of a molecular set: an arbitrary collection of molecules. This thesis is devoted to precisely these objects, and particularly how they interact with machine learning models. This work is predicated on the idea that by better understanding the relationship between molecular sets and the models trained on them we can improve models, achieve greater interpretability, and further break down the walls between data-driven and human-centric chemistry. The hope is that this enables the full predictive power of machine learning to be leveraged while continuing to build our understanding of chemistry. The first three chapters of this thesis introduce and review the necessary machine learning theory, particularly the tools that have been specially designed for chemical problems. This is followed by an extensive literature review in which the contributions of machine learning to multiple facets of chemistry over the last two decades are explored. Chapters 4-7 explore the research conducted throughout this PhD.
Here we explore how we can meaningfully describe the properties of an arbitrary set of molecules through information theory; how we can determine the most informative data points in a set of molecules; how graph signal processing can be used to understand the relationship between the chosen molecular representation, the property, and the machine learning model; and finally how this approach can be brought to bear on protein space. Each of these sub-projects briefly explores the necessary mathematical theory before leveraging it to provide approaches that resolve the posed problems. We conclude with a summary of the contributions of this work and outline fruitful avenues for further exploration.

    Nonequilibrium Green Functions Simulations on the Next Level: Theoretical Advances and Applications to Finite Lattice Systems

    This thesis is devoted to the description of correlated finite lattice systems under nonequilibrium conditions. In this context, the lack of small parameters in the corresponding standard many-body equations makes it difficult to construct suitable approximations for theoretical tools, which renders the computation of relevant observables numerically costly and impractical. At the same time, rigorous predictions for the ultrafast dynamics in correlated lattices are highly valuable for the understanding of many state-of-the-art experiments. The nonequilibrium Green functions (NEGF) technique is particularly well-suited to meet the challenging demands that come with the description of the nontrivial interplay between quantum correlations and nonequilibrium effects in excited lattice systems. However, in order to apply the approach on a practically relevant scale, several methodological improvements become indispensable. The present thesis contains these theoretical advances of the NEGF method, along with the applications they make accessible: ultracold atoms in optical lattices and excited finite graphene nanostructures.

    Theoretical-experimental study on protein-ligand interactions based on thermodynamics methods, molecular docking and perturbation models

    The current doctoral thesis focuses on understanding the thermodynamic events of protein-ligand interactions, which have been of paramount importance from traditional Medicinal Chemistry to Nanobiotechnology. Particular attention has been paid to the application of state-of-the-art methodologies to thermodynamic studies of protein-ligand interactions by integrating structure-based molecular docking techniques, classical fractal approaches to solve protein-ligand complementarity problems, perturbation models to study allosteric signal propagation, and predictive nano-quantitative structure-toxicity relationship models, coupled with powerful experimental validation techniques. The contributions provided by this work could open new horizons in the fields of Drug Discovery, Materials Sciences, Molecular Diagnosis, and Environmental Health Sciences.

    Medical Informatics and Data Analysis

    In recent years, the use of advanced data analysis methods has increased in clinical and epidemiological research. This book emphasizes the practical aspects of new data analysis methods, and provides insight into new challenges in biostatistics, epidemiology, health sciences, dentistry, and clinical medicine. It is a readable text that gives advice on the reporting of new data analytical methods and on data presentation. The book consists of 13 articles. Each article is self-contained and may be read independently according to the needs of the reader. The book is essential reading for postgraduate students as well as researchers from medicine and other sciences where statistical data analysis plays a central role.

    Kinetic model construction using chemoinformatics

    Kinetic models of chemical processes not only provide an alternative to costly experiments; they also have the potential to accelerate the pace of innovation in developing new chemical processes or in improving existing ones. Kinetic models are most powerful when they reflect the underlying chemistry by incorporating elementary pathways between individual molecules. The downside of this high level of detail is that the complexity and size of the models also steadily increase, such that the models eventually become too difficult to be manually constructed. Instead, computers are programmed to automate the construction of these models, and make use of graph theory to translate chemical entities such as molecules and reactions into computer-understandable representations. This work studies the use of automated methods to construct kinetic models. More particularly, the need to account for the three-dimensional arrangement of atoms in molecules and reactions of kinetic models is investigated and illustrated by two case studies. First of all, the thermal rearrangement of two monoterpenoids, cis- and trans-2-pinanol, is studied. A kinetic model that accounts for the differences in reactivity and selectivity of both pinanol diastereomers is proposed. Secondly, a kinetic model for the pyrolysis of the fuel “JP-10” is constructed and highlights the use of state-of-the-art techniques for the automated estimation of thermochemistry of polycyclic molecules. A new code is developed for the automated construction of kinetic models and takes advantage of the advances made in the field of chemo-informatics to tackle fundamental issues of previous approaches. Novel algorithms are developed for three important aspects of automated construction of kinetic models: the estimation of symmetry of molecules and reactions, the incorporation of stereochemistry in kinetic models, and the estimation of thermochemical and kinetic data using scalable structure-property methods. 
Finally, the application of the code is illustrated by the automated construction of a kinetic model for alkylsulfide pyrolysis.

    Principal Component Analysis

    This book aims to raise awareness among researchers, scientists and engineers of the benefits of Principal Component Analysis (PCA) in data analysis. In this book, the reader will find applications of PCA in fields such as taxonomy, biology, pharmacy, finance, agriculture, ecology, health and architecture.

    Book of abstracts of the 10th International Chemical and Biological Engineering Conference: CHEMPOR 2008

    This book contains the extended abstracts presented at the 10th International Chemical and Biological Engineering Conference - CHEMPOR 2008, held in Braga, Portugal, over 3 days, from the 4th to the 6th of September, 2008. Previous editions took place in Lisboa (1975, 1989, 1998), Braga (1978), Póvoa de Varzim (1981), Coimbra (1985, 2005), Porto (1993), and Aveiro (2001). The conference was jointly organized by the University of Minho, “Ordem dos Engenheiros”, and the IBB - Institute for Biotechnology and Bioengineering, with the usual support of the “Sociedade Portuguesa de Química” and, for the first time, of the “Sociedade Portuguesa de Biotecnologia”. Thirty years have elapsed since CHEMPOR was held at the University of Minho, organized by T.R. Bott, D. Allen, A. Bridgwater, J.J.B. Romero, L.J.S. Soares and J.D.R.S. Pinheiro. We are fortunate to have Profs. Bott, Soares and Pinheiro in the Honor Committee of this 10th edition, under the high Patronage of his Excellency the President of the Portuguese Republic, Prof. Aníbal Cavaco Silva. The opening ceremony will present Prof. Bott with a “Long Term Achievement” award acknowledging the important contribution he has made over more than 30 years to the development of Chemical Engineering science, to the launch of the CHEMPOR series and especially to the University of Minho. Prof. Bott’s inaugural lecture will address the importance of effective energy management in processing operations, particularly the effectiveness of heat recovery and the associated reduction in greenhouse gas emissions from combustion processes. The CHEMPOR series traditionally brings together both young and established researchers and end users to discuss recent developments in different areas of Chemical Engineering. The scope of this edition has been broadened to include Biological Engineering research.
One of the major core areas of the conference program is quality of life, owing to the important role that Chemical and Biological Engineering plays in this area. “Integration of Life Sciences & Engineering” and “Sustainable Process-Product Development through Green Chemistry” are two of the leading themes, with papers addressing these important issues. They are complemented by additional leading themes, including “Advancing the Chemical and Biological Engineering Fundamentals”, “Multi-Scale and/or Multi-Disciplinary Approach to Process-Product Innovation”, “Systematic Methods and Tools for Managing the Complexity”, and “Educating Chemical and Biological Engineers for Coming Challenges”, which define the arrangement of the extended abstracts in this book. A total of 516 extended abstracts are included in the book, consisting of 7 invited lectures, 15 keynotes, and 105 short oral presentations given in 5 parallel sessions, along with 6 slots for viewing 389 poster presentations. Full papers are included in the companion Proceedings on CD-ROM. All papers have been reviewed, and we are grateful to the members of the scientific and organizing committees for their evaluations. It was an intensive task, since 610 submitted abstracts from 45 countries were received. It has been an honor for us to contribute to setting up CHEMPOR 2008 over almost two years. We wish to thank the authors who have contributed to the high scientific standard of the program. We are thankful to the sponsors who have contributed decisively to this event. We also extend our gratitude to all those who, through their dedicated efforts, have assisted us in this task. On behalf of the Scientific and Organizing Committees, we hope that, together with interesting reading, the scientific program and the social moments organized will be memorable for all. Fundação para a Ciência e a Tecnologia (FCT)

    Proceedings of the 10th International Chemical and Biological Engineering Conference - CHEMPOR 2008

    This volume contains full papers presented at the 10th International Chemical and Biological Engineering Conference - CHEMPOR 2008, held in Braga, Portugal, between September 4th and 6th, 2008.