
    Tautomerism in large databases

    We have used the Chemical Structure DataBase (CSDB) of the NCI CADD Group, an aggregated collection of over 150 small-molecule databases totaling 103.5 million structure records, to conduct tautomerism analyses on one of the largest currently existing sets of real (i.e. not computer-generated) compounds. This analysis was carried out using calculable chemical structure identifiers developed by the NCI CADD Group, based on hash codes available in the chemoinformatics toolkit CACTVS and a newly developed scoring scheme to define a canonical tautomer for any encountered structure. We used the CACTVS tautomerism definition, a set of 21 transform rules expressed in SMIRKS line notation, which takes a comprehensive stance as to the possible types of tautomeric interconversion included. Tautomerism was found to be possible for more than two-thirds of the unique structures in the CSDB. A total of 680 million tautomers were calculated from, and including, the original structure records. Tautomeric overlap within the same individual database (i.e. at least one other entry was present that was really only a different tautomeric representation of the same compound) was found at an average rate of 0.3% of the original structure records, with values as high as nearly 2% for some of the databases in CSDB. Projected onto the set of unique structures (by FICuS identifier), this still occurred in about 1.5% of the cases. Tautomeric overlap across all constituent databases in CSDB was found for nearly 10% of the records in the collection.
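The within-database overlap rate described in this abstract can be illustrated with a minimal sketch. It assumes each record already carries two precomputed keys: a tautomer-sensitive `structure_id` and a tautomer-invariant `tautomer_key`. Both names are hypothetical stand-ins for the CACTVS-based identifiers, which are not computed here.

```python
from collections import defaultdict


def within_database_overlap(records):
    """Return {database: fraction of records with a tautomeric duplicate}.

    records: iterable of (database, structure_id, tautomer_key) tuples.
    structure_id distinguishes tautomers; tautomer_key does not.
    Both identifiers are assumed to be precomputed elsewhere.
    """
    rows = list(records)
    # database -> tautomer_key -> distinct structure ids seen under that key
    groups = defaultdict(lambda: defaultdict(set))
    counts = defaultdict(int)
    for db, structure_id, tautomer_key in rows:
        groups[db][tautomer_key].add(structure_id)
        counts[db] += 1

    # A record overlaps if another, differently drawn structure in the
    # same database shares its tautomer-invariant key.
    overlapping = defaultdict(int)
    for db, structure_id, tautomer_key in rows:
        if len(groups[db][tautomer_key]) > 1:
            overlapping[db] += 1

    return {db: overlapping[db] / counts[db] for db in counts}


# Toy records with made-up identifiers.
records = [
    ("A", "s1", "t1"),
    ("A", "s2", "t1"),  # same compound as s1, drawn as another tautomer
    ("A", "s3", "t2"),
    ("A", "s4", "t3"),
    ("B", "s1", "t1"),
    ("B", "s5", "t4"),
]
rates = within_database_overlap(records)
```

In this toy input, two of the four records in database A are tautomeric representations of one compound, while database B has no such pair.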

    A high-throughput synthetic platform enables the discovery of proteomimetic cell penetrating peptides and bioportides

    Collectively, cell penetrating peptide (CPP) vectors and intrinsically active bioportides possess tremendous potential for drug delivery applications and the discrete modulation of intracellular targets including the sites of protein–protein interactions (PPIs). Such sequences are usually relatively short (< 25 AA), polycationic in nature and able to access the various intracellular compartments of eukaryotic cells without detrimental influences upon cellular biology. The high-throughput platform for bioportide discovery described herein exploits the discovery that many human proteins are an abundant source of potential CPP sequences which are reliably predicted using QSAR algorithms or other methods. Subsequently, microwave-enhanced solid-phase peptide synthesis provides a high-throughput source of novel proteomimetic CPPs for screening purposes. By focussing upon cationic helical domains, often located within the molecular interfaces that facilitate PPIs, bioportides which act by a dominant-negative mechanism at such sites can be reliably identified within small libraries of CPPs. Protocols that employ fluorescent peptides, routinely prepared by N-terminal acylation with carboxytetramethylrhodamine, further enable both the quantification of cellular uptake kinetics and the identification of specific site(s) of intracellular accretion. Chemical modifications of linear peptides, including strategies to promote and stabilise helicity, are compatible with the synthesis of second-generation bioportides with improved drug-like properties to further exploit the inherent selectivity of biologics.

    Combinatorial Clustering of Residue Position Subsets Predicts Inhibitor Affinity across the Human Kinome

    The protein kinases are a large family of enzymes that play fundamental roles in propagating signals within the cell. Because of the high degree of binding site similarity shared among protein kinases, designing drug compounds with high specificity among the kinases has proven difficult. However, computational approaches to comparing the 3-dimensional geometry and physicochemical properties of key binding site residue positions have been shown to be informative of inhibitor selectivity. The Combinatorial Clustering Of Residue Position Subsets (CCORPS) method, introduced here, provides a semi-supervised learning approach for identifying structural features that are correlated with a given set of annotation labels. Here, CCORPS is applied to the problem of identifying structural features of the kinase ATP binding site that are informative of inhibitor binding. CCORPS is demonstrated to make perfect or near-perfect predictions for the binding affinity profile of 8 of the 38 kinase inhibitors studied, while having overall poor predictive ability for only 1 of the 38 compounds. Additionally, CCORPS is shown to identify shared structural features across phylogenetically diverse groups of kinases that are correlated with binding affinity for particular inhibitors; such instances of structural similarity among phylogenetically diverse kinases are also shown not to be rare. Finally, these function-specific structural features may serve as potential starting points for the development of highly specific kinase inhibitors.

    Structure-guided selection of specificity determining positions in the human kinome

    Background: The human kinome contains many important drug targets. It is well-known that inhibitors of protein kinases bind with very different selectivity profiles. This is also the case for inhibitors of many other protein families. The increased availability of protein 3D structures has provided much information on the structural variation within a given protein family. However, the relationship between structural variations and binding specificity is complex and incompletely understood. We have developed a structural bioinformatics approach which provides an analysis of key determinants of binding selectivity as a tool to enhance the rational design of drugs with a specific selectivity profile. Results: We propose a greedy algorithm that computes a subset of residue positions in a multiple sequence alignment such that structural and chemical variation in those positions helps explain known binding affinities. By providing this information, the main purpose of the algorithm is to provide experimentalists with possible insights into how the selectivity profile of certain inhibitors is achieved, which is useful for lead optimization. In addition, the algorithm can also be used to predict binding affinities for structures whose affinity for a given inhibitor is unknown. The algorithm’s performance is demonstrated using an extensive dataset for the human kinome. Conclusion: We show that the binding affinity of 38 different kinase inhibitors can be explained with consistently high precision and accuracy using the variation of at most six residue positions in the kinome binding site. We show for several inhibitors that we are able to identify residues that are known to be functionally important
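The greedy subset selection mentioned in this abstract can be sketched generically as forward selection. The abstract does not specify the scoring function, so the `lookup_accuracy` score below is purely an illustrative assumption, as are the position names and toy data. The default cap of six positions mirrors the "at most six residue positions" stated above.

```python
from collections import Counter, defaultdict


def greedy_select(candidates, score, max_size=6):
    """Greedy forward selection: repeatedly add the candidate that most
    improves score(subset); stop when nothing improves or max_size is hit."""
    selected = []
    best = score(selected)
    while len(selected) < max_size:
        trials = [(score(selected + [c]), c) for c in candidates if c not in selected]
        if not trials:
            break
        top_score, top_candidate = max(trials)
        if top_score <= best:
            break  # no remaining candidate improves the score
        selected.append(top_candidate)
        best = top_score
    return selected, best


def lookup_accuracy(positions, kinases):
    """Hypothetical score: accuracy of predicting the binding label from
    the residues at the chosen positions by majority vote per residue tuple."""
    buckets = defaultdict(Counter)
    for residues, binds in kinases:
        key = tuple(residues[p] for p in positions)
        buckets[key][binds] += 1
    correct = sum(max(counter.values()) for counter in buckets.values())
    return correct / len(kinases)


# Made-up alignment columns (p1..p3) and binding labels for four kinases.
kinases = [
    ({"p1": "A", "p2": "K", "p3": "E"}, True),
    ({"p1": "A", "p2": "R", "p3": "E"}, True),
    ({"p1": "G", "p2": "K", "p3": "E"}, False),
    ({"p1": "G", "p2": "R", "p3": "D"}, False),
]
selected, accuracy = greedy_select(
    ["p1", "p2", "p3"], lambda subset: lookup_accuracy(subset, kinases)
)
```

On this toy input, position `p1` alone separates binders from non-binders, so the selection stops after one step.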

    "Psychic Degenerate": Why G. Was Interned

    This chapter explains how homosexuality was pathologised: to do this, it traces the origins of the "effeminate male" stereotype, explaining how the socio-cultural concept of degeneration was extended to include "sexual inversion". Through the doctors' words, G.'s biography starts to take shape and it becomes clear how it matched the "degenerate" and "effeminate pederast" stereotypical description.

    Let’s not forget tautomers

    A compound exhibits tautomerism if it can be represented by two structures that are related by an intramolecular movement of hydrogen from one atom to another. The different tautomers of a molecule usually have different molecular fingerprints, hydrophobicities and pKa’s as well as different 3D shape and electrostatic properties; additionally, proteins frequently preferentially bind a tautomer that is present in low abundance in water. As a result, the proper treatment of molecules that can tautomerize, ~25% of a database, is a challenge for every aspect of computer-aided molecular design. Library design that focuses on molecular similarity or diversity might inadvertently include similar molecules that happen to be encoded as different tautomers. Physical property measurements might not establish the properties of individual tautomers, with the result that algorithms based on these measurements may be less accurate for molecules that can tautomerize; this problem influences the accuracy of filtering for library design and also traditional QSAR. Any 2D or 3D QSAR analysis must involve the decision of whether and how to adjust the observed Ki or IC50 for the tautomerization equilibria. QSARs and recursive partitioning methods also involve the decision as to which tautomer(s) to use to calculate the molecular descriptors. Docking-based virtual screening must involve the decision as to which tautomers to include in the docking and how to account for tautomerization in the scoring. All of these decisions are more difficult because there is no extensive database of measured tautomeric ratios in both water and non-aqueous solvents and there is no consensus as to the best computational method to calculate tautomeric ratios in different environments.

    Identifying Compound-Target Associations by Combining Bioactivity Profile Similarity Search and Public Databases Mining

    Molecular target identification is of central importance to drug discovery. Here, we developed a computational approach, named bioactivity profile similarity search (BASS), for associating targets to small molecules by using the known target annotations of related compounds from public databases. To evaluate BASS, a bioactivity profile database was constructed using 4296 compounds that were commonly tested in the US National Cancer Institute 60 human tumor cell line anticancer drug screen (NCI-60). Each compound was used as a query to search against the entire bioactivity profile database, and reference compounds with similar bioactivity profiles above a threshold of 0.75 were considered as neighbor compounds of the query. Potential targets were subsequently linked to the identified neighbor compounds by using the known targets o
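The neighbor search described in this abstract can be sketched as follows. The abstract gives the 0.75 threshold but does not name the similarity measure, so Pearson correlation is an assumption here. The compound IDs, profile values and target labels are all made up for illustration.

```python
from math import sqrt


def pearson(x, y):
    """Pearson correlation of two equal-length numeric profiles."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sd_x = sqrt(sum((a - mean_x) ** 2 for a in x))
    sd_y = sqrt(sum((b - mean_y) ** 2 for b in y))
    if sd_x == 0 or sd_y == 0:
        return 0.0  # a flat profile carries no correlation information
    return cov / (sd_x * sd_y)


def neighbors(query_id, profiles, threshold=0.75):
    """IDs of reference compounds whose profile similarity to the query
    exceeds the threshold (the query itself is excluded)."""
    query = profiles[query_id]
    return sorted(
        cid
        for cid, profile in profiles.items()
        if cid != query_id and pearson(query, profile) > threshold
    )


def candidate_targets(query_id, profiles, known_targets, threshold=0.75):
    """Union of the annotated targets of the query's neighbor compounds."""
    targets = set()
    for cid in neighbors(query_id, profiles, threshold):
        targets |= known_targets.get(cid, set())
    return targets


# Toy activity profiles across four hypothetical cell lines.
profiles = {
    "q": [1.0, 2.0, 3.0, 4.0],
    "a": [2.0, 4.0, 6.0, 8.5],  # tracks the query closely
    "b": [4.0, 3.0, 2.0, 1.0],  # anti-correlated
    "c": [1.0, 4.0, 2.0, 3.0],  # weakly correlated (r = 0.4)
}
known_targets = {"a": {"target_X"}, "b": {"target_Y"}}
hits = candidate_targets("q", profiles, known_targets)
```

Only compound `a` passes the threshold in this toy set, so the query inherits the annotation of `a` alone.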

    A physicochemical descriptor-based scoring scheme for effective and rapid filtering of kinase-like chemical space

    Background: The current chemical space of known small molecules is estimated to exceed 10^60 structures. Though the largest physical compound repositories contain only a few tens of millions of unique compounds, virtual screening of databases of this size is still difficult. In recent years, the application of physicochemical descriptor-based profiling, such as Lipinski's rule-of-five for drug-likeness and Oprea's criteria of lead-likeness, as early stage filters in drug discovery has gained widespread acceptance. In the current study, we outline a kinase-likeness scoring function based on known kinase inhibitors. Results: The method employs a collection of 22,615 known kinase inhibitors from the ChEMBL database. A kinase-likeness score is computed using statistical analysis of nine key physicochemical descriptors for these inhibitors. Based on this score, the kinase-likeness of four publicly and commercially available databases, i.e., National Cancer Institute database (NCI), the Natural Products database (NPD), the National Institutes of Health's Molecular Libraries Small Molecule Repository (MLSMR), and the World Drug Index (WDI) database, is analyzed. Three of these databases, i.e., NCI, NPD, and MLSMR, are frequently used in the virtual screening of kinase inhibitors, while the fourth, the WDI database, is included for comparison since it covers a wide range of known chemical space. Based on the kinase-likeness score, a kinase-focused library is also developed and tested against three different kinase targets selected from three different branches of the human kinome tree. Conclusions: Our proposed methodology is one of the first that explores how the narrow chemical space of kinase inhibitors and its relevant physicochemical information can be utilized to build kinase-focused libraries and prioritize pre-existing compound databases for screening. We have shown that focused libraries generated by filtering compounds using the kinase-likeness score have, on average, better docking scores than an equivalent number of randomly selected compounds. Beyond library design, our findings also impact the broader efforts to identify kinase inhibitors by screening pre-existing compound libraries. Currently, the NCI library is the most commonly used database for screening kinase inhibitors. Our research suggests that other libraries, such as MLSMR, are more kinase-like and should be given priority in kinase screenings.
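As an illustration of the descriptor-based filtering this abstract builds on, here is a minimal sketch of Lipinski's rule-of-five check. It is not the kinase-likeness score itself, whose nine descriptors and statistics are not given here. The descriptor values are assumed to be precomputed, the example compounds are hypothetical, and tolerating one violation is a common convention adopted as an assumption.

```python
def rule_of_five_violations(mol_weight, logp, h_donors, h_acceptors):
    """Count violations of Lipinski's rule of five.

    Limits: molecular weight <= 500 Da, logP <= 5,
    hydrogen-bond donors <= 5, hydrogen-bond acceptors <= 10.
    """
    return sum(
        [
            mol_weight > 500,
            logp > 5,
            h_donors > 5,
            h_acceptors > 10,
        ]
    )


def passes_rule_of_five(mol_weight, logp, h_donors, h_acceptors, max_violations=1):
    """True if the compound has at most max_violations violations
    (one tolerated violation is an assumed convention)."""
    count = rule_of_five_violations(mol_weight, logp, h_donors, h_acceptors)
    return count <= max_violations


# Hypothetical descriptor values for two made-up compounds.
small = dict(mol_weight=350.4, logp=3.2, h_donors=2, h_acceptors=5)
large = dict(mol_weight=720.9, logp=6.1, h_donors=6, h_acceptors=12)

small_ok = passes_rule_of_five(**small)
large_ok = passes_rule_of_five(**large)
```

A likeness score of the kind described above would replace these hard cut-offs with statistics derived from a reference set of known inhibitors.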

    Advances in structure elucidation of small molecules using mass spectrometry

    The structural elucidation of small molecules using mass spectrometry plays an important role in modern life sciences and bioanalytical approaches. This review covers different soft and hard ionization techniques and figures of merit for modern mass spectrometers, such as mass resolving power, mass accuracy, isotopic abundance accuracy, accurate mass multiple-stage MS(n) capability, as well as hybrid mass spectrometric and orthogonal chromatographic approaches. The latter part discusses mass spectral data handling strategies, which include background and noise subtraction, adduct formation and detection, charge state determination, accurate mass measurements, elemental composition determinations, and complex data-dependent setups with ion maps and ion trees. The importance of mass spectral library search algorithms for tandem mass spectra and multiple-stage MS(n) mass spectra, as well as mass spectral tree libraries that combine multiple-stage mass spectra, is outlined. The subsequent chapter discusses mass spectral fragmentation pathways, biotransformation reactions and drug metabolism studies, the mass spectral simulation and generation of in silico mass spectra, expert systems for mass spectral interpretation, and the use of computational chemistry to explain gas-phase phenomena. A single chapter discusses data handling for hyphenated approaches including mass spectral deconvolution for clean mass spectra, cheminformatics approaches and structure retention relationships, and retention index predictions for gas and liquid chromatography. The last section reviews the current state of electronic data sharing of mass spectra and discusses the importance of software development for the advancement of structure elucidation of small molecules.
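Mass accuracy, one of the figures of merit listed in this abstract, is commonly reported as a relative error in parts per million. The sketch below shows that calculation. The m/z values and the 5 ppm tolerance are hypothetical and are not taken from the review.

```python
def mass_error_ppm(measured_mz, theoretical_mz):
    """Relative mass error in parts per million (signed)."""
    return (measured_mz - theoretical_mz) / theoretical_mz * 1e6


def within_tolerance(measured_mz, theoretical_mz, tolerance_ppm=5.0):
    """True if the absolute ppm error does not exceed the tolerance."""
    return abs(mass_error_ppm(measured_mz, theoretical_mz)) <= tolerance_ppm


# Hypothetical values: a 0.0006 offset at m/z 200 is about 3 ppm,
# while a 0.0020 offset is about 10 ppm.
error_close = mass_error_ppm(200.0006, 200.0000)
error_far = mass_error_ppm(200.0020, 200.0000)
```

A tolerance check like this is typically the first filter when ranking candidate elemental compositions against an accurate mass measurement.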