579 research outputs found

    Accounting for epistatic interactions improves the functional analysis of protein structures

    Motivation: The constraints under which sequence, structure and function coevolve are not fully understood. Bringing this mutual relationship to light can reveal the molecular basis of binding, catalysis and allostery, thereby identifying function and rationally guiding protein redesign. Underlying these relationships are the epistatic interactions that occur when the consequences of a mutation to a protein are determined by the genetic background in which it occurs. Based on prior data, we hypothesize that epistatic forces operate most strongly between residues nearby in the structure, resulting in smooth evolutionary importance across the structure. Methods and Results: We find that when residue scores of evolutionary importance are distributed smoothly between nearby residues, functional site prediction accuracy improves. Accordingly, we designed a novel measure of evolutionary importance that focuses on the interaction between pairs of structurally neighboring residues. This measure, which we term pair-interaction Evolutionary Trace, yields greater functional site overlap and better structure-based proteome-wide functional predictions. Conclusions: Our data show that the structural smoothness of evolutionary importance is a fundamental feature of the coevolution of sequence, structure and function. Mutations operate on individual residues, but selective pressure depends in part on the extent to which a mutation perturbs interactions with neighboring residues. In practice, this principle led us to redefine the importance of a residue in terms of the importance of its epistatic interactions with neighbors, yielding better annotation of functional residues, motivating experimental validation of a novel functional site in LexA and refining protein function prediction. Supplementary information: Supplementary data are available at Bioinformatics online.
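
The notion of scores being "distributed smoothly between nearby residues" can be made concrete with a small sketch. This is not the pair-interaction Evolutionary Trace measure, whose definition is not given in the abstract. It is a generic roughness statistic (mean squared score difference over structurally neighboring pairs), and the coordinates, scores, and 4.0 distance cutoff are all hypothetical values chosen for illustration.

```python
import math
from itertools import combinations


def neighbor_pairs(coords, cutoff=4.0):
    """Index pairs of residues whose representative atoms lie within `cutoff`.

    `coords` is a list of (x, y, z) tuples; the cutoff value is an assumption.
    """
    return [
        (i, j)
        for i, j in combinations(range(len(coords)), 2)
        if math.dist(coords[i], coords[j]) <= cutoff
    ]


def roughness(scores, pairs):
    """Mean squared score difference over neighbor pairs (lower = smoother)."""
    if not pairs:
        return 0.0
    return sum((scores[i] - scores[j]) ** 2 for i, j in pairs) / len(pairs)


# Hypothetical toy structure: four residues spaced 3 units apart on a line.
coords = [(0.0, 0.0, 0.0), (3.0, 0.0, 0.0), (6.0, 0.0, 0.0), (9.0, 0.0, 0.0)]
pairs = neighbor_pairs(coords)

# Two hypothetical score assignments containing the same set of values.
smooth_scores = [1.0, 0.75, 0.5, 0.25]   # importance decays gradually
rough_scores = [1.0, 0.25, 0.75, 0.5]    # same values, scattered

smooth_value = roughness(smooth_scores, pairs)
rough_value = roughness(rough_scores, pairs)
print(smooth_value, rough_value)
```

Under these assumptions, the gradually decaying assignment yields the lower roughness value, which is the sense in which one scoring can be called smoother than another over the same structure.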

    Evolutionary Trace Annotation Server: automated enzyme function prediction in protein structures using 3D templates

    Summary: The Evolutionary Trace Annotation (ETA) Server predicts enzymatic activity. ETA starts with a structure of unknown function, such as those from structural genomics, and with no prior knowledge of its mechanism uses the phylogenetic Evolutionary Trace (ET) method to extract key functional residues and propose a function-associated 3D motif, called a 3D template. ETA then searches previously annotated structures for geometric template matches that suggest molecular and thus functional mimicry. In order to maximize the predictive value of these matches, ETA next applies distinctive specificity filters: evolutionary similarity, function plurality and match reciprocity. In large scale controls on enzymes, prediction coverage is 43% but the positive predictive value rises to 92%, thus minimizing false annotations. Users may modify any search parameter, including the template. ETA thus expands the ET suite for protein structure annotation, and can contribute to the annotation efforts of metaservers.
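
To make the trade-off between coverage and positive predictive value concrete, here is a minimal sketch of how the two quantities are usually computed. It assumes that coverage means the fraction of query structures that receive any prediction and that positive predictive value means the fraction of those predictions that are correct. The counts are hypothetical, chosen only to land near the percentages quoted above, and are not taken from the ETA benchmark.

```python
from dataclasses import dataclass


@dataclass
class BenchmarkCounts:
    """Hypothetical tallies from a function-prediction benchmark."""

    queries: int    # structures submitted for annotation
    predicted: int  # queries that received a prediction after filtering
    correct: int    # predictions that agree with the known annotation

    def coverage(self) -> float:
        """Fraction of queries for which any prediction was made."""
        return self.predicted / self.queries if self.queries else 0.0

    def ppv(self) -> float:
        """Positive predictive value: correct predictions / all predictions."""
        return self.correct / self.predicted if self.predicted else 0.0


# Illustrative numbers only (assumed, not from the source).
counts = BenchmarkCounts(queries=1000, predicted=430, correct=396)
print(f"coverage = {counts.coverage():.0%}, PPV = {counts.ppv():.0%}")
```

Stricter filters typically lower `predicted` (and hence coverage) while raising the share of predictions that are correct, which is the behavior the summary describes.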

    Structure and evolutionary trace-assisted screening of a residue swapping the substrate ambiguity and chiral specificity in an esterase

    Our understanding of enzymes with high substrate ambiguity remains limited because their large active sites allow substrate docking freedom to an extent that seems incompatible with stereospecificity. One possibility is that some of these enzymes evolved a set of evolutionarily fitted sequence positions that stringently allow switching substrate ambiguity and chiral specificity. To explore this hypothesis, we targeted for mutation a serine ester hydrolase (EH) that exhibits an impressive 71-substrate repertoire but is not stereospecific (e.e. 50%). We used structural actions and the computational evolutionary trace method to explore specificity-swapping sequence positions and hypothesized that position I244 was critical. Driven by evolutionary action analysis, this position was substituted to leucine, which together with isoleucine appears to be the amino acid most commonly present in the closest homologous sequences (max. identity, ca. 67.1%), and to phenylalanine, which appears in distant homologues. While the I244L mutation did not have any functional consequences, the I244F mutation allowed the esterase to maintain a remarkable 53-substrate range while gaining stereospecificity properties (e.e. 99.99%). These data support the possibility that some enzymes evolve sequence positions that control the substrate scope and stereospecificity. Such residues, which can be evolutionarily screened, may serve as starting points for further designing substrate-ambiguous, yet chiral-specific, enzymes that are greatly appreciated in biotechnology and synthetic chemistry.
    MF acknowledges the grant ‘INMARE’ from the European Union’s Horizon 2020 (grant agreement no. 634486), the grants PCIN-2017-078 (within the Marine Biotechnology ERA-NET) and BIO2017-85522-R from the Ministerio de Economía, Industria y Competitividad, Agencia Estatal de Investigación (AEI), Fondo Europeo de Desarrollo Regional (FEDER) and the European Union (EU), and the grant 2020AEP061 from the Agencia Estatal CSIC. J.S-A. acknowledges grant PID2019-105838RB-C33 from the Ministerio de Ciencia e Innovación, Agencia Estatal de Investigación (AEI), Fondo Europeo de Desarrollo Regional (FEDER) and the European Union (EU). P.N.G. acknowledges the support of the Era-Net IB Project MetaCat funded through UK Biotechnology and Biological Sciences Research Council (BBSRC), grant No. BB/M029085/1, and the Centre for Environmental Biotechnology Project, co-funded by European Regional Development Fund (ERDF) via the Welsh Government (WEFO); R.B. acknowledges the Supercomputing Wales project, co-funded by ERDF via WEFO. OL and PK were supported by the National Institutes of Health (NIH) grants 5R01AG061105, 5R01GM066099, and 5R01GM079656. C. Coscolín thanks the Ministerio de Economía y Competitividad and FEDER for a PhD fellowship (Grant BES-2015-073829). Staff of the Synchrotron Radiation Source at Alba (Barcelona, Spain) for assistance at the BL13-XALOC beamline

    Evolutionary action and structural basis of the allosteric switch controlling β(2)AR functional selectivity

    Functional selectivity of G-protein-coupled receptors is believed to originate from ligand-specific conformations that activate only subsets of signaling effectors. In this study, to identify molecular motifs playing important roles in transducing ligand binding into distinct signaling responses, we combined in silico evolutionary lineage analysis and structure-guided site-directed mutagenesis with large-scale functional signaling characterization and non-negative matrix factorization clustering of signaling profiles. Clustering based on the signaling profiles of 28 variants of the β(2)-adrenergic receptor reveals three clearly distinct phenotypical clusters, showing selective impairments of either the Gi or βarrestin/endocytosis pathways with no effect on Gs activation. Robustness of the results is confirmed using simulation-based error propagation. The structural changes resulting from functionally biasing mutations centered around the DRY, NPxxY, and PIF motifs, selectively linking these micro-switches to unique signaling profiles. Our data identify different receptor regions that are important for the stabilization of distinct conformations underlying functional selectivity.

    SigniSite: Identification of residue-level genotype-phenotype correlations in protein multiple sequence alignments

    Identifying which mutation(s) within a given genotype is responsible for an observable phenotype is important in many aspects of molecular biology. Here, we present SigniSite, an online application for subgroup-free residue-level genotype–phenotype correlation. In contrast to similar methods, SigniSite does not require any pre-definition of subgroups or binary classification. Input is a set of protein sequences where each sequence has an associated real number, quantifying a given phenotype. SigniSite will then identify which amino acid residues are significantly associated with the data set phenotype. As output, SigniSite displays a sequence logo, depicting the strength of the phenotype association of each residue and a heat-map identifying ‘hot’ or ‘cold’ regions. SigniSite was benchmarked against SPEER, a state-of-the-art method for the prediction of specificity determining positions (SDP) using a set of human immunodeficiency virus protease-inhibitor genotype–phenotype data and corresponding resistance mutation scores from the Stanford University HIV Drug Resistance Database, and a data set of protein families with experimentally annotated SDPs. For both data sets, SigniSite was found to outperform SPEER. SigniSite is available at: http://www.cbs.dtu.dk/services/SigniSite/

    Background frequencies for residue variability estimates: BLOSUM revisited

    Background: Shannon entropy applied to columns of multiple sequence alignments as a score of residue conservation has proven one of the most fruitful ideas in bioinformatics. This straightforward and intuitively appealing measure clearly shows the regions of a protein under increased evolutionary pressure, highlighting their functional importance. The inability of the column entropy to differentiate between residue types, however, limits its resolution power. Results: In this work we suggest generalizing Shannon's expression to a function with similar mathematical properties that, at the same time, includes observed propensities of residue types to mutate to each other. To do that, we revisit the original construction of BLOSUM matrices, and re-interpret them as mutation probability matrices. These probabilities are then used as background frequencies in the revised residue conservation measure. Conclusion: We show that joint entropy with BLOSUM-proportional probabilities as a reference distribution enables detection of protein functional sites comparable in quality to a time-costly maximum-likelihood evolution simulation method (rate4site), and offers greater resolution than the Shannon entropy alone, in particular in the cases when the available sequences are of narrow evolutionary scope.
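
For readers unfamiliar with the baseline measure, the following is a minimal sketch of plain Shannon entropy computed per alignment column. It illustrates only the standard column entropy that the abstract starts from, not the BLOSUM-based revision it proposes. The toy alignment, the function names, and the choice to ignore gap characters are assumptions made for illustration.

```python
import math
from collections import Counter


def column_entropy(column: str, gap: str = "-") -> float:
    """Shannon entropy, in bits, of one alignment column (gaps ignored)."""
    residues = [c for c in column if c != gap]
    if not residues:
        return 0.0
    total = len(residues)
    counts = Counter(residues)
    # H = -sum(p * log2(p)) over the observed residue types.
    return sum(-(n / total) * math.log2(n / total) for n in counts.values())


def alignment_entropies(sequences):
    """Entropy of every column in an alignment of equal-length sequences."""
    if len({len(s) for s in sequences}) > 1:
        raise ValueError("all aligned sequences must have the same length")
    return [column_entropy("".join(col)) for col in zip(*sequences)]


# Hypothetical four-sequence alignment: two invariant and two variable columns.
msa = ["ACDK", "ACEK", "ACDR", "ACER"]
entropies = alignment_entropies(msa)
print(entropies)
```

Low entropy marks a conserved column. Note that the third and fourth columns score identically even though they swap different residue pairs, which illustrates the limitation the abstract points out: plain column entropy cannot tell residue types apart.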