11 research outputs found

    IMPROVING MOLECULAR FINGERPRINT SIMILARITY VIA ENHANCED FOLDING

    Drug discovery depends on scientists finding similarity between molecular fingerprints and the drug target. A new way to improve the accuracy of molecular fingerprint folding is presented, with the goal of alleviating a growing challenge posed by excessively long fingerprints. The improved method generates a new, shorter fingerprint that is more accurate than the basic folded fingerprint. Information gathered during preprocessing is used to determine an optimal attribute order; the most commonly used blocks of bits can then be organized and used to generate a new, improved fingerprint for more optimal folding. We then apply the widely used Tanimoto similarity search algorithm to benchmark our results, and show an improvement over other traditional folding methods.
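    The baseline operations the abstract builds on can be sketched in a few lines. This is a minimal illustration of the standard OR-fold and the Tanimoto coefficient only; the paper's contribution, the preprocessing step that reorders attributes before folding, is not reproduced here.

```python
# Minimal sketch: OR-folding a binary fingerprint and comparing two
# folded fingerprints with the Tanimoto coefficient. The attribute
# reordering described in the abstract is NOT implemented here.

def fold(bits, target_len):
    """OR-fold a binary fingerprint down to target_len bits."""
    folded = [0] * target_len
    for i, b in enumerate(bits):
        if b:
            folded[i % target_len] = 1
    return folded

def tanimoto(a, b):
    """Tanimoto coefficient for two binary fingerprints."""
    common = sum(x & y for x, y in zip(a, b))
    denom = sum(a) + sum(b) - common
    return common / denom if denom else 0.0

fp1 = [1, 0, 1, 1, 0, 0, 1, 0]
fp2 = [1, 0, 0, 1, 0, 1, 1, 0]
print(tanimoto(fold(fp1, 4), fold(fp2, 4)))  # folding can inflate similarity
```

    Folding trades length for collisions: distinct bits that land in the same position after the modulo become indistinguishable, which is why the choice of attribute order before folding can matter.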

    DPRESS: Localizing estimates of predictive uncertainty

    Background: The need for a quantitative estimate of the uncertainty of prediction for QSAR models is steadily increasing, in part because such predictions are being widely distributed as tabulated values disconnected from the models used to generate them. Classical statistical theory assumes that the error in the population being modeled is independent and identically distributed (IID), but this is often not actually the case. Such inhomogeneous error (heteroskedasticity) can be addressed by providing an individualized estimate of predictive uncertainty for each particular new object u: the standard error of prediction s_u can be estimated as the non-cross-validated error s_t* for the closest object t* in the training set, adjusted for its separation d from u in the descriptor space relative to the size of the training set. [The display equation appears as a graphic in the source and is not reproduced here.] The predictive uncertainty factor γ_t* is obtained by distributing the internal predictive error sum of squares across objects in the training set based on the distances between them, hence the acronym: Distributed PRedictive Error Sum of Squares (DPRESS). Note that s_t* and γ_t* are characteristic of each training set compound contributing to the model of interest.
    Results: The method was applied to partial least-squares models built using 2D (molecular hologram) or 3D (molecular field) descriptors applied to mid-sized training sets (N = 75) drawn from a large (N = 304), well-characterized pool of cyclooxygenase inhibitors. The observed variation in predictive error for the external 229-compound test sets was compared with the uncertainty estimates from DPRESS. Good qualitative and quantitative agreement was seen between the distributions of predictive error observed and those predicted using DPRESS. Inclusion of the distance-dependent term was essential to obtaining good agreement between the estimated uncertainties and the observed distributions of predictive error. The uncertainty estimates derived by DPRESS were conservative even when the training set was biased, but not excessively so.
    Conclusion: DPRESS is a straightforward and powerful way to reliably estimate individual predictive uncertainties for compounds outside the training set, based on their distance to the training set and the internal predictive uncertainty associated with their nearest neighbor in that set. It represents a sample-based, a posteriori approach to defining applicability domains in terms of localized uncertainty.
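    The nearest-neighbor logic of a DPRESS-style estimate can be sketched as follows. Note that the exact functional form of the distance adjustment is given by the paper's display equation, which is not reproduced in this abstract; the multiplicative form s_u = s_t* · (1 + γ_t* · d) used below is purely an illustrative assumption, not the published formula.

```python
import math

# Hedged sketch of a localized predictive-uncertainty estimate in the
# spirit of DPRESS: find the nearest training-set object t*, then scale
# its non-cross-validated error s_t* by a distance-dependent factor.
# The multiplicative form used here is an ASSUMPTION for illustration.

def predictive_uncertainty(x_new, training):
    """training: list of (descriptor_vector, s_t, gamma_t) tuples."""
    def dist(a, b):
        return math.sqrt(sum((p - q) ** 2 for p, q in zip(a, b)))
    # nearest training-set object t* in descriptor space
    x_t, s_t, gamma_t = min(training, key=lambda t: dist(x_new, t[0]))
    d = dist(x_new, x_t)
    return s_t * (1.0 + gamma_t * d)  # assumed adjustment, see lead-in

train = [([0.0, 0.0], 0.4, 0.2), ([1.0, 1.0], 0.6, 0.1)]
print(predictive_uncertainty([0.1, 0.0], train))
```

    The key property, regardless of the exact formula, is the one the abstract states: the estimate grows with the query's distance from its nearest training-set neighbor, so compounds far from the training set receive larger uncertainties.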

    Cavity-based negative images in molecular docking

    In drug development, computer-based methods are constantly evolving as a result of increasing computing power and the cumulative costs of generating new pharmaceuticals. With virtual screening (VS), it is possible to screen even hundreds of millions of compounds and select the best molecule candidates for in vitro testing, instead of investing time and resources in analysing all molecules systematically in laboratories. However, there is a constant need to generate more reliable and effective software for VS. For example, molecular docking, one of the most central methods in structure-based VS, can be a very successful approach for certain targets while failing completely with others. Often it is not the docking sampling but the scoring of the docking poses that is the bottleneck. In this thesis, a novel rescoring method, negative image-based rescoring (R-NiB), is introduced, which generates a negative image of the ligand binding cavity and compares the shape and electrostatic similarity between the generated model and the docked molecule pose. The performance of the method is tested comprehensively using several different protein targets, benchmarking sets and docking software, and it is compared to other rescoring methods. R-NiB is shown to be a fast and effective method to rescore docking poses, producing a notable improvement in active molecule recognition. Furthermore, a NIB model optimization method based on a greedy algorithm is introduced, which uses a set of known active and inactive molecules as a training set. This approach, brute force negative image-based optimization (BR-NiB), is shown to work remarkably well, producing impressive in silico results even with very limited active molecule training sets.
    Importantly, the results suggest that the in silico hit rates of the optimized models in docking rescoring are at the level needed in real-world VS and drug discovery projects.
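    The core comparison in negative image-based rescoring can be illustrated with a simplified shape-only stand-in: rasterize the cavity's negative image and the docked pose onto a common grid and take the Tanimoto coefficient of the occupied cells. The actual R-NiB method also weighs electrostatic similarity and operates on real cavity models; this grid overlap is an illustrative sketch, not the published implementation.

```python
# Simplified stand-in for negative image-based rescoring: shape overlap
# between a cavity negative image and a docked pose, both reduced to
# occupied voxels on a common grid. Electrostatics are omitted here.

def to_voxels(points, cell=1.0):
    """Map 3D points to the set of grid cells they occupy."""
    return {tuple(int(c // cell) for c in p) for p in points}

def shape_tanimoto(pose_pts, nib_pts, cell=1.0):
    """Tanimoto coefficient of the two occupied-voxel sets."""
    a, b = to_voxels(pose_pts, cell), to_voxels(nib_pts, cell)
    union = len(a | b)
    return len(a & b) / union if union else 0.0

nib = [(0.2, 0.1, 0.0), (1.3, 0.2, 0.1), (2.4, 0.3, 0.0)]   # cavity image
pose = [(0.4, 0.3, 0.2), (1.1, 0.4, 0.3), (3.5, 0.1, 0.2)]  # docked atoms
print(shape_tanimoto(pose, nib))
```

    Rescoring then amounts to ranking docked poses by this similarity to the cavity model instead of (or in addition to) the docking program's own score.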

    Evaluation of Potential Inhibitors of Escherichia coli RecA to Attenuate the Rate of Antibiotic Resistance Development and to Sensitize Escherichia coli to Current Antibiotics

    Antibacterials are invaluable for treating infectious diseases. However, bacteria have a profound ability to alter their susceptibility to antibiotics, rendering themselves resistant to one or more currently available antibiotics [4,7,26,27]. Treatment of bacterial infections may be improved by attenuating bacterial resistance mechanisms and sensitizing bacteria to current antibiotics. RecA, a recombinase enzyme involved in DNA repair, horizontal gene transfer and the induction of SOS mutagenesis [19-25], seems to be a promising target whose inhibition would attenuate resistance development and sensitize bacteria to antibiotics. In this study, we evaluated whether cell-permeable RecA inhibitors could prevent the transfer of genetic material [27,37] from heat-killed antibiotic-resistant E. coli to live, susceptible E. coli. One inhibitor identified from a previous screen (A1) attenuated the rate at which E. coli developed resistance to chloramphenicol in both the presence and absence of heat-killed chloramphenicol-resistant cells. Collaborative studies were also undertaken to identify prospective next-generation RecA inhibitors among virtual compound libraries.
    Master of Science

    Estudio comparativo de la regulación transcripcional en procesos de biodegradación [Comparative study of transcriptional regulation in biodegradation processes]

    Unpublished doctoral thesis. Universidad Autónoma de Madrid, Facultad de Ciencias, Departamento de Biología Molecular. Date of defense: 16-02-200

    Evaluation of Similarity Measures for Ligand-Based Virtual Screening


    Clustering for 2D chemical structures

    The clustering of chemical structures is important and widely used in several areas of chemoinformatics. A little-discussed aspect of clustering is standardization, which ensures that all descriptors in a chemical representation make a comparable contribution to the measurement of similarity. The initial study compares the effectiveness of seven different standardization procedures that have been suggested previously, and the results were also compared with unstandardized datasets. It was found that no single standardization method consistently offered the best performance. Comparative studies of clustering effectiveness are helpful in assessing the suitability of different methods and in providing guidelines for their use. In order to examine the suitability of different clustering methods for application in chemoinformatics, especially methods that had not previously been applied there, the second study carries out an effectiveness comparison of nine clustering methods. The results revealed that it is unlikely that a single clustering method can consistently provide the best partition under all circumstances. Consensus clustering is a technique that combines multiple input partitions of the same set of objects into a single clustering, which is expected to provide a more robust and more generally effective representation of the submitted partitions. The third study reports the use of seven different consensus clustering methods that had not previously been applied to sets of chemical compounds represented by 2D fingerprints. Their effectiveness was compared with some of the traditional clustering methods discussed in the second study. Again, no consistently best consensus clustering method was found.
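    The role of standardization can be made concrete with one common procedure. The sketch below z-scores each descriptor column so that descriptors measured on very different scales contribute comparably to a distance calculation; this is one of many possible procedures, not a reproduction of the seven compared in the study.

```python
# Sketch of z-score standardization of a descriptor matrix (rows =
# compounds, columns = descriptors). Without it, a descriptor measured
# in hundreds would dominate one measured in single digits.

def standardize_columns(matrix):
    """Z-score each column: (x - mean) / population stdev."""
    cols = list(zip(*matrix))
    out_cols = []
    for col in cols:
        n = len(col)
        mean = sum(col) / n
        sd = (sum((x - mean) ** 2 for x in col) / n) ** 0.5
        # a constant column carries no information; map it to zeros
        out_cols.append([(x - mean) / sd if sd else 0.0 for x in col])
    return [list(row) for row in zip(*out_cols)]

data = [[1.0, 100.0], [2.0, 200.0], [3.0, 300.0]]
print(standardize_columns(data))  # both columns now on the same scale
```

    After standardization the two columns above become identical, so each contributes equally to any subsequent similarity or clustering step.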

    The Application of Spectral Clustering in Drug Discovery

    The application of clustering algorithms to chemical datasets is well established and has been reviewed extensively. Recently, a number of 'modern' clustering algorithms have been reported in other fields. One example is spectral clustering, which has yielded promising results in areas such as protein library analysis. The term spectral clustering describes any clustering algorithm that uses the eigenpairs of a matrix as the basis for partitioning a dataset. This thesis describes the development and optimisation of a non-overlapping spectral clustering method based upon a study by Brewer. The initial version of the algorithm was closely related to Brewer's method and used a full matrix diagonalisation procedure to identify the eigenpairs of an input matrix. This spectral clustering method was compared to the k-means and Ward's algorithms, producing encouraging results; for example, when coupled with extended connectivity fingerprints, it outperformed the other clustering algorithms according to the QCI measure. Although the spectral clustering algorithm showed promising results, its operational costs restricted its application to small datasets. Hence, the method was optimised in successive studies. Firstly, the effect of matrix sparsity on spectral clustering was examined, showing that sparse input matrices can lead to an improvement in the results. Despite this improvement, the costs of spectral clustering remained prohibitive, so the full matrix diagonalisation procedure was replaced with the Lanczos algorithm, which has lower associated costs, as suggested by Brewer. This led to a significant decrease in computational cost when identifying a small number of clusters; however, a number of issues remained, leading to the adoption of an SVD-based eigendecomposition method.
    The SVD-based algorithm was shown to be highly efficient, accurate and scalable through a number of studies.
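    The eigenpair-based partitioning idea at the heart of spectral clustering can be shown in miniature: build a similarity matrix, form the graph Laplacian L = D - W, and split the data on the sign of the Fiedler vector (the eigenvector of the second-smallest eigenvalue). The thesis's method (Brewer-style, with Lanczos and SVD-based eigensolvers and QCI evaluation) is far more elaborate; this two-cluster sketch uses plain power iteration on a shifted Laplacian purely to illustrate the principle.

```python
import math

# Spectral bipartitioning sketch: the Fiedler vector of the graph
# Laplacian L = D - W is found by power iteration on M = c*I - L
# (after projecting out the constant eigenvector of eigenvalue 0),
# and its sign pattern splits the dataset into two clusters.

def spectral_bipartition(points, sigma=1.0):
    n = len(points)
    def d2(a, b):
        return sum((p - q) ** 2 for p, q in zip(a, b))
    # Gaussian similarity matrix with zero diagonal
    W = [[0.0 if i == j else math.exp(-d2(points[i], points[j]) / (2 * sigma ** 2))
          for j in range(n)] for i in range(n)]
    deg = [sum(row) for row in W]
    c = 2 * max(deg) + 1  # shift so all eigenvalues of M are positive
    v = [math.sin(i + 1) for i in range(n)]  # arbitrary start vector
    for _ in range(500):
        mean = sum(v) / n
        v = [x - mean for x in v]  # deflate the constant eigenvector
        # w = M @ v  where  M = c*I - (D - W)
        w = [(c - deg[i]) * v[i] + sum(W[i][j] * v[j] for j in range(n))
             for i in range(n)]
        norm = math.sqrt(sum(x * x for x in w)) or 1.0
        v = [x / norm for x in w]
    return [1 if x > 0 else 0 for x in v]  # sign of the Fiedler vector

pts = [[0.0, 0.0], [0.1, 0.0], [0.0, 0.1],
       [5.0, 5.0], [5.1, 5.0], [5.0, 5.1]]
print(spectral_bipartition(pts))  # two tight groups, split cleanly
```

    The cost concern the thesis addresses is visible even here: a full eigendecomposition is O(n^3), which is why sparse matrices and Lanczos- or SVD-based solvers matter once datasets grow beyond a few thousand compounds.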

    Biophysical studies of protein-ligand interactions and the discovery of FKBP12 inhibitors

    The principal aim of this study was to discover, through virtual screening, new nonimmunosuppressive inhibitors for the human immunophilin FKBP12, a target of the immunosuppressant drugs rapamycin and FK506. The enzyme acts as a peptidyl-prolyl isomerase, catalysing protein folding in the cell. Structurally similar isomerase domains are important for molecular recognition in multi-domain chaperone proteins. FKBP inhibitors have been shown to have protective effects against nerve damage and are therefore interesting targets for the treatment of neurodegenerative diseases. Virtual screening has been used to discover novel inhibitors for protein drug targets. Recent advances in computational power and the availability of large virtual libraries, such as the EDULISS database at Edinburgh University, have enhanced the appeal of this approach. X-ray structures of known protein-ligand complexes were examined to obtain an understanding of the key non-covalent interactions in the FKBP12 binding pocket. Virtual screening hits were selected using macromolecular docking and programs that employed a ligand-based approach. The bulk of the virtual screening in this study used Edinburgh University's in-house program LIDAEUS. In the course of this study nearly three hundred compounds were screened in the laboratory using biophysical and biochemical binding assays. Thirty-four compounds were found to have an affinity for FKBP12 of less than one hundred micromolar. To test virtual hits, it was necessary to select the most appropriate medium-throughput biophysical assay. The aim was to employ methods with sufficient sensitivity to detect compounds with affinity in the order of one hundred micromolar, coupled with the capacity to screen hundreds of compounds in a week. This study used a wide variety of biophysical techniques, including electrospray ionisation mass spectrometry, surface plasmon resonance and isothermal titration calorimetry.
    There was a particular emphasis on the quality of data from electrospray ionisation mass spectrometry. A correlation was found between the cone voltage that gave 50% dissociation of the complex and the enthalpic contribution to the free energy of binding. By carefully examining the differences in charge-state distributions between a pure protein and a protein-ligand mixture, it was possible to determine whether a protein-ligand complex had been present in solution prior to dissociation during the electrospray process. This observation provides the basis for an assay that could be of general utility in detecting very weak inhibitors.