3,710 research outputs found

    Bayesian matching of unlabeled marked point sets using random fields, with an application to molecular alignment

    Statistical methodology is proposed for comparing unlabeled marked point sets, with an application to aligning steroid molecules in chemoinformatics. Methods from statistical shape analysis are combined with techniques for predicting random fields in spatial statistics in order to define a suitable measure of similarity between two marked point sets. Bayesian modeling of the predicted field overlap between pairs of point sets is proposed, and posterior inference of the alignment is carried out using Markov chain Monte Carlo simulation. By representing the fields in reproducing kernel Hilbert spaces, the degree of overlap can be computed without expensive numerical integration. Superimposing entire fields rather than the configuration matrices of point coordinates thereby avoids the problem that there is usually no clear one-to-one correspondence between the points. In addition, mask parameters are introduced in the model, so that partial matching of the marked point sets can be carried out. We also propose an adaptation of the generalized Procrustes analysis algorithm for the simultaneous alignment of multiple point sets. The methodology is illustrated with a simulation study and then applied to a data set of 31 steroid molecules, where the relationship between shape and binding activity to the corticosteroid binding globulin receptor is explored. Comment: Published at http://dx.doi.org/10.1214/11-AOAS486 in the Annals of Applied Statistics (http://www.imstat.org/aoas/) by the Institute of Mathematical Statistics (http://www.imstat.org)

    Automatic generation of alignments for 3D QSAR analyses

    Many 3D QSAR methods require the alignment of the molecules in a dataset, which can require a fair amount of manual effort in deciding upon a rational basis for the superposition. This paper describes the use of FBSS, a program for field-based similarity searching in chemical databases, for generating such alignments automatically. CoMFA and CoMSIA experiments with several literature datasets show that the QSAR models resulting from the FBSS alignments are broadly comparable in predictive performance with the models resulting from manual alignments.

    Predicting the accuracy of protein-ligand docking on homology models

    Ligand-protein docking is increasingly used in drug discovery. The initial limitations imposed by a reduced availability of target protein structures have been overcome by the use of theoretical models, especially those derived by homology modeling techniques. While this greatly extended the use of docking simulations, it also introduced the need for general and robust criteria to estimate the reliability of docking results given the model quality. To this end, a large-scale experiment was performed on a diverse set including experimental structures and homology models for a group of representative ligand-protein complexes. A wide spectrum of model quality was sampled using templates at different evolutionary distances and different strategies for target-template alignment and modeling. The obtained models were scored by a selection of the most used model quality indices. The binding geometries were generated using AutoDock, one of the most common docking programs. An important result of this study is that indeed quantitative and robust correlations exist between the accuracy of docking results and the model quality, especially in the binding site. Moreover, state-of-the-art indices for model quality assessment are already an effective tool for an a priori prediction of the accuracy of docking experiments in the context of groups of proteins with conserved structural characteristics. Contract/grant sponsor: National Institutes of Health; contract/grant numbers: ES00768

    Predicting Flavonoid UGT Regioselectivity with Graphical Residue Models and Machine Learning.

    Machine learning is applied to a challenging and biologically significant protein classification problem: the prediction of flavonoid UGT acceptor regioselectivity from primary protein sequence. Novel indices characterizing graphical models of protein residues are introduced. The indices are compared with existing amino acid indices and found to cluster residues appropriately. A variety of models employing the indices are then investigated by examining their performance when analyzed using nearest neighbor, support vector machine, and Bayesian neural network classifiers. Improvements over nearest neighbor classifications relying on standard alignment similarity scores are reported

    Recent Trends in In-silico Drug Discovery

    Drug design is a process in which new leads (potential drugs) with therapeutic benefit in a diseased condition are discovered. With the development of various computational tools and the availability of databases holding information about the 3D structures of various molecules, drug discovery has become a comparatively faster process. The two major drug development methods are structure-based and ligand-based drug design. Structure-based methods try to make predictions based on the three-dimensional structure of the target molecules. The major approach of structure-based drug design is molecular docking, a method based on several sampling algorithms and scoring functions. Docking can be performed in several ways depending on whether the ligand and receptor are treated as rigid or flexible. Hotspot grafting is another method of drug design. It is preferred when the structure of a native binding protein and target protein complex is available and the hotspots on the interface are known. In the absence of information about the three-dimensional structure of the target molecule, ligand-based methods are used. Two common methods used in ligand-based drug design are pharmacophore modelling and QSAR. Pharmacophore modelling captures only the essential features of an active ligand, whereas a QSAR model determines the effect of a certain property on the activity of the ligand. Fragment-based drug design is a de novo approach to building new lead compounds using fragments within the active site of the protein. All candidate leads obtained by the various drug design methods need to satisfy ADMET properties to be developed as drugs. In-silico ADMET prediction tools have made ADMET profiling an easier and faster process. This review also lists various software packages available for drug design and ADMET property prediction.

    Structure- and Ligand-Based Design of Novel Antimicrobial Agents

    The use of computer based techniques in the design of novel therapeutic agents is a rapidly emerging field. Although the drug-design techniques utilized by Computational Medicinal Chemists vary greatly, they can roughly be classified into structure-based and ligand-based approaches. Structure-based methods utilize a solved structure of the design target, protein or DNA, usually obtained by X-ray or NMR methods to design or improve compounds with activity against the target. Ligand-based methods use active compounds with known affinity for a target that may yet be unresolved. These methods include Pharmacophore-based searching for novel active compounds or Quantitative Structure-Activity Relationship (QSAR) studies. The research presented here utilized both structure and ligand-based methods against two bacterial targets: Bacillus anthracis and Mycobacterium tuberculosis. The first part of this thesis details our efforts to design novel inhibitors of the enzyme dihydropteroate synthase from B. anthracis using crystal structures with known inhibitors bound. The second part describes a QSAR study that was performed using a series of novel nitrofuranyl compounds with known, whole-cell, inhibitory activity against M. tuberculosis. Dihydropteroate synthase (DHPS) catalyzes the addition of p-amino benzoic acid (pABA) to dihydropterin pyrophosphate (DHPP) to form pteroic acid as a key step in bacterial folate biosynthesis. It is the traditional target of the sulfonamide class of antibiotics. Unfortunately, bacterial resistance and adverse effects have limited the clinical utility of the sulfonamide antibiotics. Although six bacterial crystal structures are available, the flexible loop regions that enclose pABA during binding and contain key sulfonamide resistance sites have yet to be visualized in their functional conformation. 
To gain a new understanding of the structural basis of sulfonamide resistance and the molecular mechanism of DHPS action, and to generate a structure for high-throughput virtual screening, molecular dynamics simulations were applied to model the conformations of the unresolved loops in the active site. Several series of molecular dynamics simulations were designed and performed utilizing enzyme substrates and inhibitors, a transition state analog, and a pterin-sulfamethoxazole adduct. The positions of key mutation sites conserved across several bacterial species were closely monitored during these analyses. These residues were shown to interact closely with the sulfonamide binding site. The simulations helped us gain new understanding of the positions of the flexible loops during inhibitor binding, which has allowed the development of a DHPS structural model that could be used for high-throughput virtual screening (HTVS). Additionally, insights gained on the location and possible function of key mutation sites on the flexible loops will facilitate the design of new, potent inhibitors of DHPS that can bypass resistance mutations that render sulfonamides inactive. Prior to performing high-throughput virtual screening, the docking and scoring functions to be used were validated using established techniques against the B. anthracis DHPS target. In this validation study, five commonly used docking programs, FlexX, Surflex, Glide, GOLD, and DOCK, as well as nine scoring functions, were evaluated for their utility in virtual screening against the novel pterin binding site. Their performance in ligand docking and virtual screening against this target was examined by their ability to reproduce a known inhibitor conformation and to correctly detect known active compounds seeded into three separate decoy sets. Enrichment was demonstrated by calculated enrichment factors at 1% and Receiver Operating Characteristic (ROC) curves.
The effectiveness of post-docking relaxation prior to rescoring and consensus scoring were also evaluated. Of the docking and scoring functions evaluated, Surflex with SurflexScore and Glide with GlideScore performed best overall for virtual screening against the DHPS target. The next phase of the DHPS structure-based drug design project involved high-throughput virtual screening against the DHPS structural model previously developed and docking methodology validated against this target. Two general virtual screening methods were employed. First, large, virtual libraries were pre-filtered by 3D pharmacophore and modified Rule-of-Three fragment constraints. Nearly 5 million compounds from the ZINC databases were screened generating 3,104 unique, fragment-like hits that were subsequently docked and ranked by score. Second, fragment docking without pharmacophore filtering was performed on almost 285,000 fragment-like compounds obtained from databases of commercial vendors. Hits from both virtual screens with high predicted affinity for the pterin binding pocket, as determined by docking score, were selected for in vitro testing. Activity and structure-activity relationship of the active fragment compounds have been developed. Several compounds with micromolar activity were identified and taken to crystallographic trials. Finally, in our ligand-based research into M. tuberculosis active agents, a series of nitrofuranylamide and related aromatic compounds displaying potent activity was investigated utilizing 3-Dimensional Quantitative Structure-Activity Relationship (3D-QSAR) techniques. Comparative Molecular Field Analysis (CoMFA) and Comparative Molecular Similarity Indices Analysis (CoMSIA) methods were used to produce 3D-QSAR models that correlated the Minimum Inhibitory Concentration (MIC) values against M. tuberculosis with the molecular structures of the active compounds. 
A training set of 95 active compounds was used to develop the models, which were then evaluated by a series of internal and external cross-validation techniques. A test set of 15 compounds was used for the external validation. Different alignment and ionization rules were investigated as well as the effect of global molecular descriptors including lipophilicity (cLogP, LogD), Polar Surface Area (PSA), and steric bulk (CMR), on model predictivity. Models with greater than 70% predictive ability, as determined by external validation, and high internal validity (cross-validated r2 > 0.5) were developed. Incorporation of lipophilicity descriptors into the models had negligible effects on model predictivity. The models developed will be used to predict the activity of proposed new structures and advance the development of next generation nitrofuranyl and related nitroaromatic anti-tuberculosis agents.

    A QSTR-based expert system to predict sweetness of molecules

    This work describes a novel approach based on advanced molecular similarity to predict the sweetness of chemicals. The proposed Quantitative Structure-Taste Relationship (QSTR) model is an expert system developed keeping in mind the five principles defined by the Organization for Economic Co-operation and Development (OECD) for the validation of (Q)SARs. The 649 sweet and non-sweet molecules were described by both conformation-independent extended-connectivity fingerprints (ECFPs) and molecular descriptors. In particular, the molecular similarity in the ECFPs space showed a clear association with molecular taste and it was exploited for model development. Molecules lying in the subspaces where the taste assignment was more difficult were modeled through a consensus between linear and local approaches (Partial Least Squares-Discriminant Analysis and N-nearest-neighbor classifier). The expert system, which was thoroughly validated through a Monte Carlo procedure and an external set, gave satisfactory results in comparison with the state-of-the-art models. Moreover, the QSTR model can be leveraged into a greater understanding of the relationship between molecular structure and sweetness, and into the design of novel sweeteners. Instituto de Investigaciones Fisicoquímicas Teóricas y Aplicadas; Facultad de Ciencias Exacta

    High-Throughput Atomistic Modeling of Biomolecular Structure and Association

    The reliability of many protein models arising from structure prediction methods is unclear. Here we present a method for absolute quality control of theoretical protein models, which can significantly contribute to their acceptance in life-science research. We apply these methods to gain insight into the family of hydrophobins and modify them for increased cell adhesion to allow for the coating of implants. The novel proteins were shown to bind cells while impeding bacterial adhesion.

    Origin and higher-level diversification of acariform mites – evidence from nuclear ribosomal genes, extensive taxon sampling, and secondary structure alignment

    Abstract Background Acariformes is the most species-rich and morphologically diverse radiation of chelicerate arthropods, known from the oldest terrestrial ecosystems. It is also a key lineage in understanding the evolution of this group, with the most vexing question being whether mites, or Acari (Parasitiformes and Acariformes), are monophyletic. Previous molecular studies recovered Acari either as monophyletic or non-monophyletic, albeit with a limited taxon sampling. Similarly, relationships between basal acariform groups (including little-known, deep-soil 'endeostigmatan' mites) and major lineages of Acariformes (Sarcoptiformes, Prostigmata) are virtually unknown. We infer phylogeny of chelicerate arthropods, using a large and representative dataset, comprising all main in- and outgroups (228 taxa). Basal diversity of Acariformes is particularly well sampled. With this dataset, we conduct a series of phylogenetically explicit tests of chelicerate and acariform relationships and present a phylogenetic framework for internal relationships of acariform mites. Results Our molecular data strongly support a diphyletic Acari, with Acariformes as the sister group to Solifugae (PP = 1.0; BP = 100), the so-called Poecilophysidea. Among Acariformes, some representatives of the basal group Endeostigmata (mainly deep-soil mites) were recovered as sister-groups to the remaining Acariformes (i.e., Trombidiformes + and most of Sarcoptiformes). Desmonomatan oribatid mites (soil and litter mites) were recovered as the monophyletic sister group of Astigmata (e.g., stored product mites, house dust mites, mange mites, feather and fur mites). Trombidiformes (Sphaerolichida + Prostigmata) is strongly supported (PP = 1.0; BP = 98–100). Labidostommatina was inferred as the basal lineage of Prostigmata. Eleutherengona (e.g., spider mites) and Parasitengona (e.g., chiggers, fresh water mites) were recovered as monophyletic. By contrast, Eupodina (e.g., snout mites and relatives) was not.
Marine mites (Halacaridae) were traditionally regarded as the sister-group to Bdelloidea (Eupodina), but our analyses show their close relationships to Parasitengona. Conclusions Non-trivial relationships recovered by our analyses with high support (i.e., basal arrangement of endeostigmatid lineages, the position of marine mites, polyphyly of Eupodina) had been proposed by previous underappreciated morphological studies. Thus, we update the currently accepted taxonomic classification to reflect these results: the superfamily Halacaroidea Murray, 1877 is moved from the infraorder Eupodina Krantz, 1978 to Anystina van der Hammen, 1972; and the subfamily Erythracarinae Oudemans, 1936 (formerly in Anystidae Oudemans, 1902) is elevated to family rank, Erythracaridae stat. ressur., leaving Anystidae only with the nominal subfamily. Our study also shows that a clade comprising early derivative Endeostigmata (Alycidae, Nanorchestidae, Nematalycidae, and maybe Alicorhagiidae) should be treated as a taxon with the same rank as Sarcoptiformes and Trombidiformes, and the scope of the superfamily Bdelloidea should be changed. Before turning those findings into nomenclatural changes, however, we consider that our study calls for (i) finding shared apomorphies of the early derivative Endeostigmata clade and the clade including the remaining Acariformes; (ii) a well-supported hypothesis for Alicorhagiidae placement; (iii) sampling the families Proterorhagiidae, Proteonematalycidae and Grandjeanicidae not yet included in molecular analyses; (iv) undertaking a denser sampling of clades traditionally placed in Eupodina, Anystina (Trombidiformes) and Palaeosomata (Sarcoptiformes), since consensus networks and Internode certainty (IC) and IC All (ICA) indices indicate high levels of conflict in these tree regions.
Our study shows that regions of ambiguous alignment may provide useful phylogenetic signal when secondary structure information is used to guide the alignment procedure, and it provides an R implementation of the Bayesian Relative Rates test. http://deepblue.lib.umich.edu/bitstream/2027.42/113097/1/12862_2015_Article_458.pd