18 research outputs found

    Mutating a Highly Conserved Residue in Diverse Cytochrome P450s Facilitates Diastereoselective Olefin Cyclopropanation

    Cytochrome P450s and other heme-containing proteins have recently been shown to have promiscuous activity for the cyclopropanation of olefins using diazoacetate reagents. Despite the progress made thus far, engineering selective catalysts for all possible stereoisomers of the cyclopropanation reaction remains a considerable challenge. Previous investigations of a model P450 (P450BM3) revealed that mutation of a conserved active site threonine (Thr268) to alanine transformed the enzyme into a highly active and selective cyclopropanation catalyst. By incorporating this mutation into a diverse panel of P450 scaffolds, we were able to quickly identify enantioselective catalysts for all possible diastereomers in the model reaction of styrene with ethyl diazoacetate. Some alanine variants exhibited selectivities that were markedly different from the wild-type enzyme, with a few possessing moderate to high diastereoselectivity and enantioselectivities up to 97% for synthetically challenging cis-cyclopropane diastereomers.

    DASP3: identification of protein sequences belonging to functionally relevant groups.

    Background: Development of automatable processes for clustering proteins into functionally relevant groups is a critical hurdle as an increasing number of sequences are deposited into databases. Experimental function determination is exceptionally time-consuming and cannot keep pace with the identification of protein sequences. A tool, DASP (Deacon Active Site Profiler), was previously developed to identify protein sequences with active site similarity to a query set. Development of two iterative, automatable methods for clustering proteins into functionally relevant groups exposed algorithmic limitations in DASP. Results: The accuracy and efficiency of DASP were significantly improved through six algorithmic enhancements implemented in two stages: DASP2 and DASP3. Validation demonstrated that DASP3 provides greater score separation between true positives and false positives than earlier versions. In addition, DASP3 shows similar performance to previous versions in clustering protein structures into isofunctional groups (validated against manual curation), but DASP3 gathers and clusters protein sequences into isofunctional groups more efficiently than DASP and DASP2. Conclusions: DASP algorithmic enhancements resulted in improved efficiency and accuracy of identifying proteins that contain active site features similar to those of the query set. These enhancements provide incremental improvement in structure database searches and initial sequence database searches; however, the enhancements show significant improvement in iterative sequence searches, suggesting DASP3 is an appropriate tool for the iterative processes required for clustering proteins into isofunctional groups.

    An Atlas of Peroxiredoxins Created Using an Active Site Profile-Based Approach to Functionally Relevant Clustering of Proteins

    Peroxiredoxins (Prxs or Prdxs) are a large protein superfamily of antioxidant enzymes that rapidly detoxify damaging peroxides and/or affect signal transduction and, thus, have roles in proliferation, differentiation, and apoptosis. Prx superfamily members are widespread across phylogeny and multiple methods have been developed to classify them. Here we present an updated atlas of the Prx superfamily identified using a novel method called MISST (Multi-level Iterative Sequence Searching Technique). MISST is an iterative search process developed to be both agglomerative, to add sequences containing similar functional site features, and divisive, to split groups when functional site features suggest distinct functionally-relevant clusters. Superfamily members need not be identified initially: MISST begins with a minimal representative set of known structures and searches GenBank iteratively. Further, the method’s novelty lies in the manner in which isofunctional groups are selected; rather than use a single or shifting threshold to identify clusters, the groups are deemed isofunctional when they pass a self-identification criterion, such that the group identifies itself and nothing else in a search of GenBank. The method was preliminarily validated on the Prxs, as the Prxs presented challenges of both agglomeration and division. For example, previous sequence analysis clustered the Prx functional families Prx1 and Prx6 into one group. Subsequent expert analysis clearly identified Prx6 as a distinct functionally relevant group. The MISST process distinguishes these two closely related, though functionally distinct, families. Through MISST search iterations, over 38,000 Prx sequences were identified, which the method divided into six isofunctional clusters, consistent with previous expert analysis. The results represent the most complete computational functional analysis of proteins comprising the Prx superfamily. The feasibility of this novel method is demonstrated by the Prx superfamily results, laying the foundation for potential functionally relevant clustering of the universe of protein sequences.
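The self-identification criterion described in this abstract can be illustrated with a small sketch. This is not the authors' implementation: the function name `is_self_identifying`, the dictionary-of-scores input, and the default cutoff of 1e-12 (borrowed from the Search0 threshold quoted in a caption further down this listing) are all assumptions made for illustration. It only shows the set comparison the abstract implies, namely that the significant hits of a group's search should be exactly the group's own members.

```python
def is_self_identifying(group_members, search_hits, threshold=1e-12):
    """Return True if a search seeded by the group finds the group and nothing else.

    group_members: iterable of sequence identifiers in the candidate group.
    search_hits:   dict mapping sequence identifier -> search score, where a
                   smaller score is more significant (assumed convention).
    threshold:     hypothetical significance cutoff; hits with score <= threshold
                   are treated as significant.
    """
    significant = {pid for pid, score in search_hits.items() if score <= threshold}
    return significant == set(group_members)


# Toy data (made-up identifiers and scores, not real search results).
group = {"seqA", "seqB"}
clean_search = {"seqA": 1e-20, "seqB": 1e-15, "seqC": 1e-3}
leaky_search = {"seqA": 1e-20, "seqB": 1e-15, "seqC": 1e-13}

print(is_self_identifying(group, clean_search))  # group found, nothing extra
print(is_self_identifying(group, leaky_search))  # an outside sequence also scores significantly
```

Under these assumptions, a group that fails the check because of extra significant hits would be a candidate for further agglomeration or division in a later iteration.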

    PSSM Analysis subdivides Rlx6 into AhpE and PrxQ groups.

    (A) The DASP2 score distribution of the Rlx6 Search1 results is shown with bars colored by known functional annotations (see legend). The blue and green boxes represent the two groups identified by PSSM Analysis. (B) The DASP2 score distributions that result from Search2, which uses as input the ASPs composed of the proteins in the blue and green boxes from (A). Search2 results illustrate the separation of the AhpE (orange) and PrxQ (pink) subgroups. An inset shows more detail for scoring bins 1e-25 to 1e-12 for the Rlx6_AhpE Search2 histogram.

    Agglomeration of Tpx sequences and loss of PrxQ sequences in Sct2_Tpx during MISST search iterations.

    The proteins identified in Sct2_Tpx Search0 (A) and Search1 (B) are displayed as histograms with bars colored to show previously known functional groups. Dotted black lines signify the DASP search score threshold of ≤1e-12 for Search0 and ≤1e-14 for Search1. (C) The number of total proteins identified by Sct2_Tpx at significant DASP2 search scores is shown for searches 0 through 3.

    Signature conservation graphs highlight potential specificity determining positions (SDPs) in each of the six Prx subgroups.

    Pseudo-signatures (see Methods) for the significantly scoring proteins (post cross hit analysis) in each MISST group were used to construct signature conservation graphs (signature logos of the active site profiles). Letter height indicates the residue conservation in that position. Colored braces indicate motifs discussed in the text. The clusters on the left show the proteins used to create the signature logos, colored by previously defined subgroup; the number in parentheses represents the number of proteins in each cluster. The signature logos were created using WebLogo version 2.8.2 with default settings and with the y-axis not shown.

    Comparison of AhpE and PrxQ signatures suggests why 54 previously annotated PrxQ proteins are identified in the Rlx6_AhpE MISST group.

    Signature conservation graphs were made for all proteins previously annotated as PrxQ in the Rlx6_PrxQ group and all proteins previously annotated as PrxQ or AhpE in the Rlx6_AhpE MISST group. Gray highlights represent the key residues used to initiate TuLIP. Orange highlights represent positions in which Rlx6_AhpE proteins annotated as PrxQ share more similarity with the AhpE subgroup than the PrxQ subgroup. Signature conservation graphs were made using WebLogo version 2.8.2 [61] with default settings, including small sample correction.

    MISST and PSSM Analysis flowcharts describe the process of agglomerative identification of sequences as members of functionally relevant groups.

    (A) Flow chart of the MISST process for identifying functionally relevant groups within a protein superfamily. (B) An illustration of the agreement criterion: a scatterplot of all proteins identified by DASP2 searches using two ASPs, Group 4A and Group 4B, that were subdivided in the previous MISST iteration. Red lines indicate the significance threshold used to label proteins as “significant” or “not significant” in each group. Sequences in the yellow quadrants are those identified in both searches at similar (significant or not) DASP2 scores. Those sequences in the cyan quadrants differ in significance. This metric is used to determine if a group that is subdivided by PSSM Analysis produces truly distinct search results. (C) Flow chart of PSSM Analysis for identifying when and how to divide clusters into functionally relevant groups.
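The quadrant logic in panel (B) can be sketched as a comparison of significance labels across two searches. This is a rough illustration only, not the published procedure: the function name `classify_agreement`, the single shared cutoff of 1e-12, and the choice to treat a protein missing from one search as "not significant" there are all assumptions. The caption does not say how the resulting counts are turned into a decision, so the sketch stops at the classification.

```python
def classify_agreement(scores_a, scores_b, threshold=1e-12):
    """Split proteins into those whose significance agrees or differs across two searches.

    scores_a, scores_b: dicts mapping protein identifier -> search score for the
                        two searches (smaller = more significant, assumed).
    threshold:          hypothetical cutoff applied to both searches.
    Returns (agree, differ) as two sets of identifiers.
    """
    agree, differ = set(), set()
    for pid in set(scores_a) | set(scores_b):
        # Assumption: a protein absent from a search is not significant in it.
        sig_a = scores_a.get(pid, float("inf")) <= threshold
        sig_b = scores_b.get(pid, float("inf")) <= threshold
        if sig_a == sig_b:
            agree.add(pid)   # "yellow" quadrants: same label in both searches
        else:
            differ.add(pid)  # "cyan" quadrants: labels differ
    return agree, differ


# Toy data (made-up identifiers and scores).
group_4a = {"p1": 1e-20, "p2": 1e-3, "p3": 1e-18}
group_4b = {"p1": 1e-15, "p2": 1e-2, "p3": 1e-5, "p4": 1e-30}

agree, differ = classify_agreement(group_4a, group_4b)
print(sorted(agree), sorted(differ))
```

A large `differ` set would suggest, under these assumptions, that the two subdivided groups retrieve distinct sets of sequences.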