22 research outputs found

    Chemoinformatics Research at the University of Sheffield: A History and Citation Analysis

    This paper reviews the work of the Chemoinformatics Research Group in the Department of Information Studies at the University of Sheffield, focusing particularly on the work carried out in the period 1985-2002. Four major research areas are discussed, involving the development of methods for: substructure searching in databases of three-dimensional structures, including both rigid and flexible molecules; the representation and searching of the Markush structures that occur in chemical patents; similarity searching in databases of both two-dimensional and three-dimensional structures; and compound selection and the design of combinatorial libraries. An analysis of citations to 321 publications from the Group shows that it attracted a total of 3725 residual citations during the period 1980-2002. These citations appeared in 411 different journals and involved 910 different citing organizations from 54 different countries, demonstrating the widespread impact of the Group's work.

    Intelligent data acquisition for drug design through combinatorial library design

    A problem that occurs in machine learning methods for drug discovery is the need for standardized data. Methods for, and interest in, producing new data exist, but because of material and budget constraints each iteration of data production should be as efficient as possible. In this thesis, we present two papers detailing methods for different problems of selecting which data to produce. We investigate Active Learning for models that use the margin in model decisiveness to measure model uncertainty and guide data acquisition. We demonstrate that the models perform better with Active Learning than with random acquisition of data, regardless of the machine learning model and the starting knowledge. We also study the multi-objective optimization problem of combinatorial library design. Here we present a framework that can process the output of generative models for molecular design and return an optimized library design. The results show that the framework successfully optimizes a library based on molecule availability, which the framework also attempts to determine using retrosynthesis prediction. We conclude that the next step in intelligent data acquisition is to combine the two methods and create a library design model that uses the information from previous libraries to guide subsequent designs.
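Margin-based acquisition, as mentioned in the abstract above, can be illustrated with a minimal sketch. The abstract does not give the exact margin definition used in the thesis, so this example assumes the common one: the gap between the two highest predicted class probabilities. The function names and the probability values are hypothetical.

```python
def margin(probs):
    """Gap between the two highest class probabilities (small gap = uncertain)."""
    top, second = sorted(probs, reverse=True)[:2]
    return top - second


def select_by_margin(pool_probs, batch_size):
    """Return indices of the batch_size pool items with the smallest margin."""
    ranked = sorted(range(len(pool_probs)), key=lambda i: margin(pool_probs[i]))
    return ranked[:batch_size]


# Hypothetical predicted probabilities (inactive, active) for five unlabeled compounds.
pool = [
    [0.95, 0.05],
    [0.55, 0.45],
    [0.70, 0.30],
    [0.51, 0.49],
    [0.10, 0.90],
]

# The two compounds the model is least decisive about are queried first.
picked = select_by_margin(pool, batch_size=2)
print(picked)
```

In an actual loop, the selected compounds would be measured, added to the training set, and the model retrained before the next round of selection.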

    Algorithm-supported, mass and sequence diversity-oriented random peptide library design.

    Random peptide libraries that cover large search spaces are often used for the discovery of new binders, even when the target is unknown. To ensure an accurate population representation, there is a tendency to use large libraries. However, parameters such as the synthesis scale, the number of library members, the sequence deconvolution, and peptide structure elucidation become challenging as the library size increases. To tackle these challenges, we propose an algorithm-supported approach to peptide library design based on molecular mass and amino acid diversity. The aim is to simplify the tedious identification of permutations in complex mixtures when mass spectrometry is used, by avoiding mass redundancy. For this purpose, we applied multi-objective (two- and three-objective) genetic algorithms to discriminate between library members based on defined parameters. The optimizations led to diverse random libraries by maximizing the number of amino acid permutations and minimizing the mass and/or sequence overlap. The algorithm-suggested designs offer the user a choice of appropriate compromise solutions depending on the experimental needs. This implies that diversity, rather than library size, is the key element when designing peptide libraries for the discovery of potential novel biologically active peptides.
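The mass-redundancy problem described above can be made concrete with a small sketch. It is not the authors' genetic algorithm; it only shows one way to detect library members that mass spectrometry alone could not tell apart. The helper names, the example library, and the rounding tolerance are assumptions; the residue values are standard monoisotopic masses.

```python
from collections import defaultdict

# Monoisotopic residue masses in Da for a few amino acids.
RESIDUE_MASS = {
    "G": 57.02146,
    "A": 71.03711,
    "S": 87.03203,
    "P": 97.05276,
    "V": 99.06841,
    "L": 113.08406,
}
WATER = 18.01056  # added once per linear peptide


def peptide_mass(seq):
    """Monoisotopic mass of a linear peptide given as a one-letter string."""
    return sum(RESIDUE_MASS[aa] for aa in seq) + WATER


def mass_collisions(library, decimals=3):
    """Group sequences whose masses agree after rounding (assumed tolerance)."""
    groups = defaultdict(list)
    for seq in library:
        groups[round(peptide_mass(seq), decimals)].append(seq)
    return [members for members in groups.values() if len(members) > 1]


# Hypothetical library: GAS/SAG are permutations, GL/AV are isobaric.
library = ["GAS", "SAG", "GL", "AV", "PP"]
collisions = mass_collisions(library)
print(collisions)
```

A count of such collision groups is one candidate objective that an optimizer could minimize alongside a diversity measure.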

    The effects of combinatorial chemistry and technologies on drug discovery and biotechnology: A mini review

    The review focuses on the aspects of combinatorial chemistry and technologies that are most relevant to the modern pharmaceutical process. A historical, critical introduction is followed by three chapters, dealing with the use of combinatorial chemistry/high-throughput synthesis in medicinal chemistry; the rational design of combinatorial libraries using computer-assisted combinatorial drug design; and the use of combinatorial technologies in biotechnology. The impact of "combinatorial thinking" on drug discovery in general, and in the examples reported in detail, is critically discussed. Finally, an expert opinion on current and future trends in combinatorial chemistry and combinatorial technologies is provided.

    Statistical strategies for avoiding false discoveries in metabolomics and related experiments

    COMBINATORIAL LIBRARY DESIGN OF MUTATION-RESISTANT HIV PROTEASE INHIBITORS.

    The emergence in the past few years of HIV strains that are resistant to current HIV protease inhibitors has become a major concern in AIDS treatment. The goal of this project is to design a combinatorial library of potential lead compounds that can bind to both the wild-type and mutant proteases and that can resist further mutations. A recent crystallographic study of complexes of HIV protease with its substrates has provided structural insights into the differential recognition of the substrates and inhibitors. It has been proposed that clinical resistance is a consequence of the inhibitors' failure to stay within the consensus substrate volume. In this work, we devised a quantitative indicator of the degree to which a candidate ligand falls outside the consensus substrate volume, and determined its correlation with the inhibitor's sensitivity to clinically relevant resistance mutations. The validation of this hypothesis encouraged us to use this strategy in our design of a combinatorial library of inhibitors. The compounds in a typical combinatorial library are built around a common structural scaffold possessing multiple connection points where substituents can be added by reliable synthetic steps. As the number of compounds encompassed by such a combinatorial scheme frequently exceeds what can actually be synthesized and tested, virtual screening methods are sought to shortlist the compounds. Even though these methods require only seconds to minutes of CPU time per compound, exhaustive screening of an entire virtual combinatorial library is computationally demanding. We therefore implemented a simple algorithm that combines substituents that have been optimized independently for each substituent site. This method was compared with a genetic algorithm, a global optimization method, and was found to be equally efficient. This simple method was hence chosen for the design process. A combinatorial library based on these ideas and methods has been synthesized and tested. It includes four compounds with nanomolar inhibition constants. Two of them were shown to retain affinity against a panel of treatment-resistant mutations.
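The idea of optimizing each substituent site independently and then combining the winners can be sketched as follows. This is an illustration under a strong assumption, namely that the score of a compound is the sum of per-site contributions; the abstract does not describe the actual scoring function, and the substituent names and score values here are hypothetical.

```python
from itertools import product
from math import prod

# Hypothetical per-site substituent scores (lower is better), one dict per site.
site_scores = [
    {"H": 0.0, "CH3": -1.2, "Ph": -2.5},
    {"OH": -0.8, "NH2": -1.9},
    {"F": -0.3, "Cl": -0.7, "Br": -0.6, "I": -0.1},
]


def best_per_site(sites):
    """Pick the best substituent at each site independently and combine them."""
    return tuple(min(site, key=site.get) for site in sites)


def exhaustive_best(sites):
    """Score every combination in the virtual library (additive score assumed)."""
    def total(combo):
        return sum(site[sub] for site, sub in zip(sites, combo))
    return min(product(*sites), key=total)


independent = best_per_site(site_scores)
exhaustive = exhaustive_best(site_scores)

# Evaluations needed: sum of site sizes versus their product.
n_independent = sum(len(site) for site in site_scores)
n_exhaustive = prod(len(site) for site in site_scores)

print(independent, exhaustive, n_independent, n_exhaustive)
```

With an additive score both searches agree, while the independent search evaluates far fewer candidates. When substituents at different sites interact, the two results can differ, which is presumably why the authors compared their approach against a global optimizer.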

    Optimized Design of Combinatorial Compound Libraries Using Genetic Algorithms and Their Evaluation Based on Knowledge-Based Protein-Ligand Binding Profiles

    In this work, two new computational methods, DrugScore Fingerprint (DrugScoreFP) and GARLig, are presented and validated with respect to their theory and operation. DrugScoreFP is a novel approach for scoring computer-generated binding modes of potential ligands for a given target structure. The program is based on the established scoring function DrugScoreCSD and differs from it in that a reference vector is generated from already known crystal structures of the receptor under investigation; this vector contains, for each binding-pocket atom, potential values for all possible interactions. A corresponding vector can be generated for each new computer-generated binding mode of a ligand. Its distance to the reference vector is a measure of how similar the generated binding modes are to already known ones. An experimental validation of the ligands predicted as similar by DrugScoreFP yielded, for the protein structures studied in our group (trypsin, thermolysin, and tRNA-guanine transglycosylase (TGT)), six fragment-sized inhibitors and one thermolysin crystal structure in complex with one of the fragments found. The program GARLig developed in this work is a genetic-algorithm-based method for efficiently carrying out chemical side-chain modifications of small-molecule compounds with respect to a receptor under investigation. The aim here is to assemble a compound library that represents a subset, of user-defined size, of all possible chemical modifications of ligand-like scaffolds. Ligand geometries generated by docking, and their scores from protein-ligand scoring functions, serve as the central quality criterion for individual members of the compound library. In several validation scenarios on the proteins trypsin, thrombin, factor Xa, plasmin, and cathepsin D, it was shown that efficiently assembling receptor-specific substrate or ligand libraries requires searching less than 8% of the given search spaces, and that GARLig is nevertheless able to enrich known inhibitors in the target library.
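The fingerprint comparison described for DrugScoreFP, ranking candidate binding modes by their distance to a reference vector, can be sketched as below. The abstract does not state which distance measure the program uses; Euclidean distance is assumed here purely for illustration, and the vectors, pose names, and function name are hypothetical.

```python
import math


def euclidean(u, v):
    """Euclidean distance between two equal-length interaction vectors."""
    if len(u) != len(v):
        raise ValueError("vectors must have the same length")
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(u, v)))


# Hypothetical reference vector: one potential value per binding-pocket interaction.
reference = [-1.0, -0.5, 0.0, -2.0]

# Hypothetical vectors for three docked poses of candidate ligands.
poses = {
    "pose_a": [-1.0, -0.5, 0.0, -2.0],
    "pose_b": [-1.0, 0.5, 0.0, -2.0],
    "pose_c": [2.0, -0.5, 4.0, -2.0],
}

# Poses closest to the reference resemble known binding modes most.
ranked = sorted(poses, key=lambda name: euclidean(poses[name], reference))
print(ranked)
```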