21 research outputs found

    Rethinking drug design in the artificial intelligence era

    Artificial intelligence (AI) tools are increasingly being applied in drug discovery. While some protagonists point to vast opportunities potentially offered by such tools, others remain sceptical, waiting for a clear impact to be shown in drug discovery projects. The reality is probably somewhere in between these extremes, yet it is clear that AI is providing new challenges not only for the scientists involved but also for the biopharma industry and its established processes for discovering and developing new medicines. This article presents the views of a diverse group of international experts on the 'grand challenges' in small-molecule drug discovery with AI and the approaches to address them

    Structural Pattern Recognition for Chemical-Compound Virtual Screening

    Molecules are naturally shaped as networks, which makes them well suited to study through their graph representations, where nodes represent atoms and edges represent chemical bonds. An alternative to this straightforward representation is the extended reduced graph, which summarizes chemical structures using pharmacophore-type node descriptions to encode the relevant molecular properties. Once molecules are suitably represented as graphs, the right tool must be chosen to compare and analyze them. Graph edit distance is used to solve error-tolerant graph matching; this methodology estimates a distance between two graphs by determining the minimum number of modifications required to transform one graph into the other. These modifications (known as edit operations) each have an associated edit cost (also known as transformation cost), which must be determined depending on the problem. This study investigates the effectiveness of a graph-only molecular comparison that employs extended reduced graphs and graph edit distance as a tool for ligand-based virtual screening applications. Those applications estimate the bioactivity of a chemical from the bioactivity of similar compounds. An essential part of this study focuses on using machine learning and natural language processing techniques to optimize the transformation costs used in the molecular comparisons with the graph edit distance. Overall, this work presents a framework that combines graph reduction and comparison with optimization tools and natural language processing to identify bioactivity similarities in a structurally diverse group of molecules. We confirm the efficiency of this framework with several chemoinformatic tests applied to regression and classification problems over different publicly available datasets
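
The edit-distance idea described in this abstract can be sketched on toy graphs. The following is a minimal brute-force illustration in Python, not the thesis's implementation: the function name `graph_edit_distance`, the uniform unit costs, and the pharmacophore-style labels are all assumptions made for this example, whereas the study itself learns the transformation costs. It enumerates every assignment of nodes (including deletions and insertions), so it is only practical for very small graphs without self-loops.

```python
from itertools import permutations


def graph_edit_distance(labels1, edges1, labels2, edges2,
                        node_sub=1.0, node_indel=1.0, edge_indel=1.0):
    """Exact graph edit distance for tiny undirected graphs (brute force).

    labels*: dict mapping node id -> label; edges*: iterable of node pairs.
    The cost parameters are illustrative placeholders, not learned values.
    """
    nodes1, nodes2 = list(labels1), list(labels2)
    # Pad both sides with None so any node can be deleted or inserted.
    left = nodes1 + [None] * len(nodes2)
    right = nodes2 + [None] * len(nodes1)
    e1 = {frozenset(e) for e in edges1}
    e2 = {frozenset(e) for e in edges2}

    best = float("inf")
    for perm in permutations(right):
        cost = 0.0
        mapping = {}
        for u, v in zip(left, perm):
            if u is None and v is None:
                continue  # dummy matched with dummy: no operation
            if u is None or v is None:
                cost += node_indel  # node insertion or deletion
            elif labels1[u] != labels2[v]:
                cost += node_sub  # node label substitution
            if u is not None:
                mapping[u] = v
        # An edge of graph 1 is preserved only if its image exists in graph 2.
        preserved = set()
        for edge in e1:
            a, b = tuple(edge)
            image = frozenset((mapping[a], mapping[b]))
            if None not in image and image in e2:
                preserved.add(image)
            else:
                cost += edge_indel  # edge deletion
        cost += edge_indel * (len(e2) - len(preserved))  # edge insertions
        best = min(best, cost)
    return best


# Hypothetical reduced graphs with pharmacophore-style node labels.
g1_labels = {0: "donor", 1: "aromatic", 2: "acceptor"}
g1_edges = [(0, 1), (1, 2)]
g2_labels = {0: "donor", 1: "aromatic"}
g2_edges = [(0, 1)]

distance = graph_edit_distance(g1_labels, g1_edges, g2_labels, g2_edges)
```

With these unit costs the two toy graphs differ by one node deletion and one edge deletion. Changing `node_sub`, `node_indel`, or `edge_indel` changes which edit path is cheapest, which is why the choice of costs matters for the comparison.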

    A multiscale methodology for the preliminary screening of alternative process designs from a sustainability viewpoint adopting molecular and process simulation along with data envelopment analysis

    Research activity in chemical engineering focuses both on refining the theories and techniques currently in use and on developing new tools to solve open problems in the production of goods and services by the chemical, biochemical and pharmaceutical industries. In this context, multiscale approaches have proved very useful, since they embrace theories from quantum mechanics at the nanoscale to classical mechanics at the macroscale, covering wide perspectives and allowing each theory to be adapted to many disparate applications. Furthermore, the recognition of sustainability as a cornerstone of future development has led to a wide diffusion of sustainability evaluation methodologies, which aim to account for social and environmental concerns alongside the traditional economic estimates in chemical process assessments. This thesis therefore deals with the development of a multiscale framework for the preliminary screening of chemical process designs, promoting the adoption of various computational tools along with sustainability considerations. The purpose of this methodology is to meet a need typical of any production site: evaluating a production process and its possible modifications from different perspectives, in order to identify as quickly as possible the most efficient design with respect to economic, social and environmental concerns. The reader is guided through this topic chapter by chapter. Chapter I presents the concepts of sustainability and sustainable development, followed by their application from the wider perspective of institutions to that of industry, concluding with relevant examples from chemical process engineering. Chapter II describes each step required to evaluate the sustainability of the process alternatives: retrieving the promising process designs, implementing each flowsheet in a process simulator, calculating several indicators based on the sustainability pillars, employing a mathematical tool (DEA) to select the most efficient designs, and finally investigating how to improve the sub-optimal alternatives through a retrofit analysis. Chapter III deals with the application of different molecular simulation techniques to estimate the octanol-water partition coefficient (Kow), an essential parameter for the calculation of several sustainability indicators. Chapter IV presents three case studies in detail. The first belongs to the pharmaceutical field and deals with the production of pioglitazone hydrochloride, considering different synthesis routes from various patents. The second concerns the biochemical industry and optimizes the operating conditions of a reactor employed for the production of biodiesel from vegetable oil. The last explores the synthesis of nanomaterials, evaluating from a sustainability viewpoint several reaction parameters involved in the laboratory production of CdSe quantum dots. Concluding remarks and future perspectives are given in the final Chapter V

    Aqueous hydrocarbon systems: Experimental measurements and quantitative structure-property relationship modeling

    Scope and Method of Study: The experimental objectives of this work were to (a) evaluate existing mutual hydrocarbon-water liquid-liquid equilibrium (LLE) data, and (b) develop an experimental apparatus capable of accurately measuring hydrocarbon-water LLE mutual solubilities. The hydrocarbon-water systems studied included benzene-water, toluene-water, and 3-methylpentane-water. The modeling efforts in this study focused on developing quantitative structure-property relationship (QSPR) models for the prediction of infinite-dilution activity coefficient values (γ_i^∞) of hydrocarbon-water systems. Specifically, case studies were constructed to investigate the efficacy of (a) QSPR models using multiple linear regression analyses and non-linear neural networks; and (b) a theory-based QSPR model, where the Bader-Gasem activity coefficient model derived from a modified Peng-Robinson equation of state (EOS) is used to model the phase behavior, and QSPR neural networks are used to generalize the EOS binary interaction parameters. The database used in the modeling efforts consisted of 1400 infinite-dilution activity coefficients at temperatures ranging from 283 K to 373 K. Findings and Conclusions: A continuous flow apparatus was used to measure the LLE mutual solubilities at temperatures ranging from ambient to 500 K, which is near the three-phase critical end point of the benzene-water and toluene-water systems. The well-documented benzene-water system was used to validate the reliability of the sampling and analytical techniques employed. Generally, adequate agreement with literature data was observed for the benzene-water, toluene-water, and 3-methylpentane-water systems. An error propagation analysis for the three systems indicated maximum expected uncertainties of 4% and 8% in the water phase and organic phase solubility measurements, respectively. In general, the non-linear QSPR models developed in this work were satisfactory and compared favorably to the majority of predictive models found in the literature; however, these models did not account for temperature dependence. The Bader-Gasem activity coefficient model fitted with QSPR-generalized binary interaction parameters was capable of providing accurate predictions for the infinite-dilution activity coefficients of hydrocarbons in water. Careful validation of the model predictions over the full temperature range of the data considered yielded absolute average deviations of 3.4% in ln γ_i^∞ and 15% in γ_i^∞, which is about twice the estimated experimental uncertainty. This study provides valuable LLE mutual solubility data and further demonstrates the effectiveness of theory-framed QSPR modeling of thermophysical properties
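
The abstract reports absolute average deviations both in the logarithm of the activity coefficient and in the coefficient itself. The sketch below shows one common way such a percentage could be computed and why the two figures differ. It assumes the usual definition (mean of |predicted − experimental| / |experimental|, times 100), which the abstract does not spell out; the helper name `percent_aad` is an assumption, and the numbers are hypothetical, not data from the study.

```python
import math


def percent_aad(predicted, experimental):
    """Percent absolute average deviation between two equal-length sequences.

    Assumed definition: 100/N * sum(|pred - exp| / |exp|).
    """
    if len(predicted) != len(experimental) or len(predicted) == 0:
        raise ValueError("inputs must be non-empty and of equal length")
    total = sum(abs(p - e) / abs(e) for p, e in zip(predicted, experimental))
    return 100.0 * total / len(predicted)


# Hypothetical infinite-dilution activity coefficients (illustration only).
gamma_exp = [2400.0, 9200.0, 150000.0]
gamma_pred = [2700.0, 8100.0, 171000.0]

# Deviation in the coefficient itself.
aad_gamma = percent_aad(gamma_pred, gamma_exp)

# Deviation in its natural logarithm, the quantity many models fit directly.
aad_ln_gamma = percent_aad([math.log(g) for g in gamma_pred],
                           [math.log(g) for g in gamma_exp])
```

Because the coefficients span orders of magnitude, a small relative error in the logarithm corresponds to a much larger relative error in the coefficient, which is consistent with the two figures quoted in the abstract being so different.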

    Étude de la structure et des propriétés SAR/QSAR de quelques molécules à visée thérapeutique

    Recently, a series of carbazole derivatives containing chalcone analogues (CDCAs) were synthesized as potent anticancer agents and apoptosis inducers. These compounds target the inhibition of topoisomerase II and show cytotoxic activity. After comparison with experiment, we validated the use of B3LYP, a density functional theory-based approach, to describe the structure and molecular properties of the carbazole subunit and of the CDCAs of interest. We then derived relationships between the chemical descriptors and the activity of these carbazole derivatives using multi-parameter optimization and quantitative structure-activity relationship (QSAR) approaches. For the QSAR studies, we used multiple linear regression and artificial neural network statistical modelling. Our predicted activities are in good agreement with the experimental ones. We found that the most important parameter influencing the activity of the considered compounds is the octanol-water partition coefficient, highlighting the importance of flexibility as a key molecular parameter to favor cell membrane crossing and enhance the action of these CDCAs against topoisomerase II. Our results provide useful guidelines for designing new orally active CDCA drugs for cytotoxic inhibition

    Computer aided drug design: Drug target directed in silico approaches

    Ph.D., Doctor of Philosophy

    Machine Learning in Discrete Molecular Spaces

    The past decade has seen an explosion of machine learning in chemistry. Whether it is in property prediction, synthesis, molecular design, or any other subdivision, machine learning seems poised to become an integral, if not a dominant, component of future research efforts. This extraordinary capacity rests on the interaction between machine learning models and the underlying chemical data landscape commonly referred to as chemical space. Chemical space has multiple incarnations, but is generally considered the space of all possible molecules. In this sense, it is one example of a molecular set: an arbitrary collection of molecules. This thesis is devoted to precisely these objects, and particularly to how they interact with machine learning models. This work is predicated on the idea that by better understanding the relationship between molecular sets and the models trained on them we can improve models, achieve greater interpretability, and further break down the walls between data-driven and human-centric chemistry. The hope is that this enables the full predictive power of machine learning to be leveraged while continuing to build our understanding of chemistry. The first three chapters of this thesis introduce and review the necessary machine learning theory, particularly the tools that have been specially designed for chemical problems. This is followed by an extensive literature review in which the contributions of machine learning to multiple facets of chemistry over the last two decades are explored. Chapters 4-7 explore the research conducted throughout this PhD. Here we explore how we can meaningfully describe the properties of an arbitrary set of molecules through information theory; how we can determine the most informative data points in a set of molecules; how graph signal processing can be used to understand the relationship between the chosen molecular representation, the property, and the machine learning model; and finally how this approach can be brought to bear on protein space. Each of these sub-projects briefly explores the necessary mathematical theory before leveraging it to provide approaches that resolve the posed problems. We conclude with a summary of the contributions of this work and outline fruitful avenues for further exploration
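
As a rough illustration of what describing a set of molecules through information theory can mean, the sketch below computes the Shannon entropy of a discrete property over a molecular set. This is an assumed, simplified stand-in and not the measure developed in the thesis: the function name `shannon_entropy_bits` and the ring-count values are hypothetical.

```python
from collections import Counter
import math


def shannon_entropy_bits(values):
    """Shannon entropy (in bits) of the empirical distribution of `values`."""
    if len(values) == 0:
        raise ValueError("values must be non-empty")
    counts = Counter(values)
    total = len(values)
    # H = -sum(p * log2(p)) over the observed categories.
    return -sum((c / total) * math.log2(c / total) for c in counts.values())


# Hypothetical ring counts for two small molecular sets.
diverse_set = [0, 1, 2, 3]   # every molecule has a different ring count
uniform_set = [2, 2, 2, 2]   # every molecule has the same ring count

h_diverse = shannon_entropy_bits(diverse_set)
h_uniform = shannon_entropy_bits(uniform_set)
```

Under this simple measure, a set whose members all share one property value carries no information about that property, while a set spread evenly over many values has the highest entropy.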

    Développement de méthodes et d’outils chémoinformatiques pour l’analyse et la comparaison de chimiothèques

    New fields have emerged at the interface of biology, chemistry and computer science to address the many problems involved in drug research. This thesis lies at the interface of several of these fields, grouped under the banner of chemoinformatics. Although recent on a human timescale, the field is already an integral part of pharmaceutical research. As with bioinformatics, its founding pillar remains the storage, representation, management and computational exploitation of chemical data. Chemoinformatics is now used mainly in the upstream phases of drug research. By combining methods from different fields (chemistry, computer science, mathematics, machine learning, statistics, etc.), it enables computational tools adapted to the specific problems and data of chemistry, such as storing chemical information in databases, substructure search, data visualization, and the prediction of physicochemical and biological properties. Within this multidisciplinary framework, the work presented in this thesis addresses two important aspects of chemoinformatics: (1) the development of new methods that facilitate the visualization, analysis and interpretation of data on sets of molecules, more commonly called chemical libraries, and (2) the development of software tools that implement these methods

    Estudio conformacional de compuestos organofosforados y sus mecanismos de acción tóxica

    Spectroscopic studies of small organophosphorus compounds (OPs) reveal the existence of third conformers at room temperature. These are the first experimental results to show a multiconformational behavior already predicted by theoretical results. The analysis of this phenomenon relates it to molecular flexibility, which decreases as the weight of the chalcogen double-bonded to the phosphorus atom increases. From these results, and by extension to the biological behavior of OPs, fundamental relationships have been found between differences in conformational freedom and the mechanisms of toxic action of small OPs determined by acetylcholinesterase inhibition, so that conformational behavior can be sufficient to explain certain aspects of the interaction of these organophosphorus compounds with the active site of this enzyme. These results were obtained by applying QSAR (Quantitative Structure Activity Relationships) methodologies together with the development of various Conformational Descriptors. Facultad de Ciencias Exacta