45 research outputs found

    Selective prediction of interaction sites in protein structures with THEMATICS

    Background: Methods are now available for the prediction of interaction sites in protein 3D structures. While many of these methods report high success rates for site prediction, often these predictions are not very selective and have low precision. Precision in site prediction is addressed using Theoretical Microscopic Titration Curves (THEMATICS), a simple computational method for the identification of active sites in enzymes. Recall and precision are measured and compared with other methods for the prediction of catalytic sites.
    Results: Using a test set of 169 enzymes from the original Catalytic Residue Dataset (CatRes) it is shown that THEMATICS can deliver precise, localised site predictions. Furthermore, adjustment of the cut-off criteria can improve the recall rates for catalytic residues with only a small sacrifice in precision. Recall rates for CatRes/CSA annotated catalytic residues are 41.1%, 50.4%, and 54.2% for Z score cut-off values of 1.00, 0.99, and 0.98, respectively. The corresponding precision rates are 19.4%, 17.9%, and 16.4%. The success rate for catalytic sites is higher, with correct or partially correct predictions for 77.5%, 85.8%, and 88.2% of the enzymes in the test set, corresponding to the same respective Z score cut-offs, if only the CatRes annotations are used as the reference set. Incorporation of additional literature annotations into the reference set gives total success rates of 89.9%, 92.9%, and 94.1%, again for corresponding cut-off values of 1.00, 0.99, and 0.98. False positive rates for a 75-protein test set are 1.95%, 2.60%, and 3.12% for Z score cut-offs of 1.00, 0.99, and 0.98, respectively.
    Conclusion: With a preferred cut-off value of 0.99, THEMATICS achieves a high success rate of interaction site prediction, about 86% correct or partially correct using CatRes/CSA annotations only and about 93% with an expanded reference set. Success rates for catalytic residue prediction are similar to those of other structure-based methods, but with substantially better precision and lower false positive rates. THEMATICS performs well across the spectrum of E.C. classes. The method requires only the structure of the query protein as input. THEMATICS predictions may be obtained via the web from structures in PDB format at: http://pfweb.chem.neu.edu/thematics/submit.html
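
The recall, precision, and false positive rates quoted in this abstract are residue-level counts against a reference annotation. As a minimal sketch of how such figures are usually computed, the Python below assumes the standard textbook definitions; the function name `site_metrics` and the residue numbers are hypothetical, and the abstract does not state the exact protocol used (for example, how the 75-protein false positive rate was derived).

```python
def site_metrics(predicted, annotated, all_residues):
    """Return (recall, precision, false_positive_rate) for one protein.

    predicted:    residues flagged by the predictor (hypothetical input)
    annotated:    residues listed as catalytic in the reference set
    all_residues: every residue considered for prediction
    """
    predicted = set(predicted)
    annotated = set(annotated)
    all_residues = set(all_residues)

    tp = len(predicted & annotated)                  # correctly flagged
    fp = len(predicted - annotated)                  # flagged, not annotated
    fn = len(annotated - predicted)                  # annotated, missed
    tn = len(all_residues - predicted - annotated)   # correctly left alone

    # Guard against empty denominators rather than raising.
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    fpr = fp / (fp + tn) if (fp + tn) else 0.0
    return recall, precision, fpr


if __name__ == "__main__":
    # Toy protein with residues 1..100; the numbers are illustrative only.
    r, p, f = site_metrics(
        predicted={10, 20, 50, 60, 70},
        annotated={10, 20, 30, 40},
        all_residues=range(1, 101),
    )
    print(f"recall={r:.3f} precision={p:.3f} fpr={f:.4f}")
```

Loosening a cut-off enlarges the predicted set, which tends to raise recall while lowering precision, the trade-off the abstract reports across the three Z score cut-offs.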

    Identification of Functional Subclasses in the DJ-1 Superfamily Proteins

    Genomics has posed the challenge of determination of protein function from sequence and/or 3-D structure. Functional assignment from sequence relationships can be misleading, and structural similarity does not necessarily imply functional similarity. Proteins in the DJ-1 family, many of which are of unknown function, are examples of proteins with both sequence and fold similarity that span multiple functional classes. THEMATICS (theoretical microscopic titration curves), an electrostatics-based computational approach to functional site prediction, is used to sort proteins in the DJ-1 family into different functional classes. Active site residues are predicted for the eight distinct DJ-1 proteins with available 3-D structures. Placement of the predicted residues onto a structural alignment for six of these proteins reveals three distinct types of active sites. Each type overlaps only partially with the others, with only one residue in common across all six sets of predicted residues. Human DJ-1 and YajL from Escherichia coli have very similar predicted active sites and belong to the same probable functional group. Protease I, a known cysteine protease from Pyrococcus horikoshii, and PfpI/YhbO from E. coli, a hypothetical protein of unknown function, belong to a separate class. THEMATICS predicts a set of residues that is typical of a cysteine protease for Protease I; the prediction for PfpI/YhbO bears some similarity. YDR533Cp from Saccharomyces cerevisiae, of unknown function, and the known chaperone Hsp31 from E. coli constitute a third group with nearly identical predicted active sites. While the first four proteins have predicted active sites at dimer interfaces, YDR533Cp and Hsp31 both have predicted sites contained within each subunit. Although YDR533Cp and Hsp31 form different dimers with different orientations between the subunits, the predicted active sites are superimposable within the monomer structures. 
Thus, the three predicted functional classes form four different types of quaternary structures. The computational prediction of the functional sites for protein structures of unknown function provides valuable clues for functional classification.

    Reintegrating Biology through the Nexus of Energy, Information, and Matter

    Information, energy, and matter are fundamental properties of all levels of biological organization, and life emerges from the continuous flux of matter, energy, and information. This perspective piece defines and explains each of the three pillars of this nexus. We propose that a quantitative characterization of the complex interconversions between matter, energy, and information that compose this nexus will help us derive biological insights that connect phenomena across different levels of biological organization. We articulate examples from multiple biological scales that highlight how this nexus approach leads to a more complete understanding of the biological system. Metrics of energy, information, and matter can provide a common currency that helps link phenomena across levels of biological organization. The propagation of energy and information through levels of biological organization can result in emergent properties and system-wide changes that impact other hierarchical levels. Deeper consideration of measured imbalances in energy, information, and matter can help researchers identify key factors that influence system function at one scale, highlighting avenues to link phenomena across levels of biological organization and develop predictive models of biological systems.

    Best Practices to Diversify Chemistry Faculty

    Many academic institutions have looked at various ways to make their faculty a more diverse and inclusive group of people that better reflects the demographics of their current and future student bodies. This is even more important in chemistry departments, where there has long been a discussion on the “leaky pipeline” for women and underrepresented groups. The work presented here examines programs and policies at various departments aimed at increasing the diversity of their faculty applicant pool, and compares them against the reception of the general scientific community by way of applicant demographics and a survey instrument designed to identify the advertisement language that leads to a more diverse applicant pool. The combination of these results is then used to generate a list of best practices that administrations and academic search committees can use to improve their ability to attract diverse talent.

    Partial Order Optimum Likelihood (POOL): Maximum Likelihood Prediction of Protein Active Site Residues Using 3D Structure and Sequence Properties

    A new monotonicity-constrained maximum likelihood approach, called Partial Order Optimum Likelihood (POOL), is presented and applied to the problem of functional site prediction in protein 3D structures, an important current challenge in genomics. The input consists of electrostatic and geometric properties derived from the 3D structure of the query protein alone. Sequence-based conservation information, where available, may also be incorporated. Electrostatics features from THEMATICS are combined with multidimensional isotonic regression to form maximum likelihood estimates of probabilities that specific residues belong to an active site. This allows likelihood ranking of all ionizable residues in a given protein based on THEMATICS features. The corresponding ROC curves and statistical significance tests demonstrate that this method outperforms prior THEMATICS-based methods, which in turn have been shown previously to outperform other 3D-structure-based methods for identifying active site residues. Then it is shown that the addition of one simple geometric property, the size rank of the cleft in which a given residue is contained, yields improved performance. Extension of the method to include predictions of non-ionizable residues is achieved through the introduction of environment variables. This extension results in even better performance than THEMATICS alone and constitutes to date the best functional site predictor based on 3D structure only, achieving nearly the same level of performance as methods that use both 3D structure and sequence alignment data. Finally, the method also easily incorporates such sequence alignment data, and when this information is included, the resulting method is shown to outperform the best current methods using any combination of sequence alignments and 3D structures. 
Included is an analysis demonstrating that when THEMATICS features, cleft size rank, and alignment-based conservation scores are used individually or in combination, THEMATICS features represent the single most important component of such classifiers.
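
The abstract above describes combining features with multidimensional isotonic regression to obtain monotonicity-constrained maximum likelihood estimates. As a rough illustration of the underlying idea only, the Python sketch below runs the classic pool-adjacent-violators procedure on a single feature (a total order, not the partial order POOL works with). The function name `pava` and the toy labels are assumptions for illustration, not the authors' implementation.

```python
def pava(labels):
    """Pool-adjacent-violators on one feature.

    `labels` must already be sorted by increasing feature value
    (e.g. 1 = annotated active-site residue, 0 = not). Returns the
    non-decreasing sequence of fitted values; for 0/1 labels these can
    be read as monotone probability estimates.
    """
    blocks = []  # each block holds [sum of labels, number of labels]
    for y in labels:
        blocks.append([float(y), 1])
        # Merge backwards while the previous block's mean exceeds the
        # current block's mean, i.e. while monotonicity is violated.
        while (len(blocks) > 1 and
               blocks[-2][0] / blocks[-2][1] > blocks[-1][0] / blocks[-1][1]):
            s, n = blocks.pop()
            blocks[-1][0] += s
            blocks[-1][1] += n

    fitted = []
    for s, n in blocks:
        fitted.extend([s / n] * n)  # every member gets its block mean
    return fitted


if __name__ == "__main__":
    # Toy residues ordered by a hypothetical feature score.
    print(pava([0, 0, 1, 0, 1, 1]))
```

Residues can then be ranked by their fitted values. Handling several features at once requires respecting a partial order between residues, which this one-dimensional sketch does not attempt.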

    Evaluating protein cross-linking as a therapeutic strategy to stabilize SOD1 variants in a mouse model of familial ALS

    Mutations in the gene encoding Cu-Zn superoxide dismutase 1 (SOD1) cause a subset of familial amyotrophic lateral sclerosis (fALS) cases. A shared effect of these mutations is that SOD1, which is normally a stable dimer, dissociates into toxic monomers that seed toxic aggregates. Considerable research effort has been devoted to developing compounds that stabilize the dimer of fALS SOD1 variants, but unfortunately, this has not yet resulted in a treatment. We hypothesized that cyclic thiosulfinate cross-linkers, which selectively target a rare, 2 cysteine-containing motif, can stabilize fALS-causing SOD1 variants in vivo. We created a library of chemically diverse cyclic thiosulfinates and determined structure-cross-linking-activity relationships. A pre-lead compound, “S-XL6,” was selected based upon its cross-linking rate and drug-like properties. Co-crystallographic structure clearly establishes the binding of S-XL6 at Cys 111 bridging the monomers and stabilizing the SOD1 dimer. Biophysical studies reveal that the degree of stabilization afforded by S-XL6 (up to 24°C) is unprecedented for fALS, and to our knowledge, for any protein target of any kinetic stabilizer. Gene silencing and protein degrading therapeutic approaches require careful dose titration to balance the benefit of diminished fALS SOD1 expression with the toxic loss-of-enzymatic function. We show that S-XL6 does not share this liability because it rescues the activity of fALS SOD1 variants. No pharmacological agent has been proven to bind to SOD1 in vivo. Here, using a fALS mouse model, we demonstrate oral bioavailability; rapid engagement of SOD1G93A by S-XL6 that increases SOD1G93A’s in vivo half-life; and that S-XL6 crosses the blood–brain barrier. S-XL6 demonstrated a degree of selectivity by avoiding off-target binding to plasma proteins. 
Taken together, our results indicate that cyclic thiosulfinate-mediated SOD1 stabilization should receive further attention as a potential therapeutic approach for fALS.

    A community effort in SARS-CoV-2 drug discovery.

    Peer reviewed
    The COVID-19 pandemic continues to pose a substantial threat to human lives and is likely to do so for years to come. Despite the availability of vaccines, searching for efficient small-molecule drugs that are widely available, including in low- and middle-income countries, is an ongoing challenge. In this work, we report the results of an open science community effort, the "Billion molecules against Covid-19 challenge", to identify small-molecule inhibitors against SARS-CoV-2 or relevant human receptors. Participating teams used a wide variety of computational methods to screen a minimum of 1 billion virtual molecules against 6 protein targets. Overall, 31 teams participated, and they suggested a total of 639,024 molecules, which were subsequently ranked to find 'consensus compounds'. The organizing team coordinated with various contract research organizations (CROs) and collaborating institutions to synthesize and test 878 compounds for biological activity against proteases (Nsp5, Nsp3, TMPRSS2), nucleocapsid N, RdRP (only the Nsp12 domain), and (alpha) spike protein S. Overall, 27 compounds with weak inhibition/binding were experimentally identified by binding-, cleavage-, and/or viral suppression assays and are presented here. Open science approaches such as the one presented here contribute to the knowledge base of future drug discovery efforts in finding better SARS-CoV-2 treatments.
    R-AGR-3826 - COVID19-14715687-CovScreen (01/06/2020 - 31/01/2021) - GLAAB Enric