558 research outputs found

    An automated stochastic approach to the identification of the protein specificity determinants and functional subfamilies

    Background: Recent progress in sequencing and 3D structure determination techniques stimulated development of approaches aimed at more precise annotation of proteins, that is, prediction of exact specificity to a ligand or, more broadly, to a binding partner of any kind. Results: We present a method, SDPclust, for identification of protein functional subfamilies coupled with prediction of specificity-determining positions (SDPs). SDPclust predicts specificity in a phylogeny-independent stochastic manner, which allows for the correct identification of the specificity for proteins that are separated on a phylogenetic tree, but still bind the same ligand. SDPclust is implemented as a Web-server (http://bioinf.fbb.msu.ru/SDPfoxWeb/) and a stand-alone Java application available from the website. Conclusions: SDPclust performs a simultaneous identification of specificity determinants and specificity groups in a statistically robust and phylogeny-independent manner.

    AlloRep: A Repository of Sequence, Structural and Mutagenesis Data for the LacI/GalR Transcription Regulators

    Protein families evolve functional variation by accumulating point mutations at functionally important amino acid positions. Homologs in the LacI/GalR family of transcription regulators have evolved to bind diverse DNA sequences and allosteric regulatory molecules. In addition to playing key roles in bacterial metabolism, these proteins have been widely used as a model family for benchmarking structural and functional prediction algorithms. We have collected manually curated sequence alignments for >3000 sequences, in vivo phenotypic and biochemical data for >5750 LacI/GalR mutational variants, and noncovalent residue contact networks for 65 LacI/GalR homolog structures. Using this rich data resource, we compared the noncovalent residue contact networks of the LacI/GalR subfamilies to design and experimentally validate an allosteric mutant of a synthetic LacI/GalR repressor for use in biotechnology. The AlloRep database (freely available at www.AlloRep.org) is a key resource for future evolutionary studies of LacI/GalR homologs and for benchmarking computational predictions of functional change.

    Functional classification of CATH superfamilies: a domain-based approach for protein function annotation

    Computational approaches that can predict protein functions are essential to bridge the widening function annotation gap, especially since <1.0% of all proteins in UniProtKB have been experimentally characterised. We present a domain-based method for protein function classification and prediction of functional sites that exploits functional subclassification of CATH superfamilies. The superfamilies are subclassified into functional families (FunFams) using a hierarchical clustering algorithm supervised by a new classification method, FunFHMMer.

    Determinants of protein function revealed by combinatorial entropy optimization

    A new algorithm is presented that allows protein specificity residues to be assigned from multiple sequence alignments alone. This information can be used, amongst other things, to infer protein functions.

    Disentangling evolutionary signals: conservation, specificity determining positions and coevolution. Implication for catalytic residue prediction

    Background: A large panel of methods exists that aim to identify residues with critical impact on protein function based on evolutionary signals, sequence and structure information. However, it is not clear to what extent these different methods overlap, and if any of the methods have higher predictive potential compared to others when it comes to, in particular, the identification of catalytic residues (CR) in proteins. Using a large set of enzymatic protein families and measures based on different evolutionary signals, we sought to break up the different components of the information content within a multiple sequence alignment to investigate their predictive potential and degree of overlap. Results: Our results demonstrate that the different methods included in the benchmark in general can be divided into three groups with a limited mutual overlap. One group containing real-value Evolutionary Trace (rvET) methods and conservation, another containing mutual information (MI) methods, and the last containing methods designed explicitly for the identification of specificity determining positions (SDPs): integer-value Evolutionary Trace (ivET), SDPfox, and XDET. In terms of prediction of CR, we find using a proximity score integrating structural information (as the sum of the scores of residues located within a given distance of the residue in question) that only the methods from the first two groups displayed a reliable performance. Next, we investigated to what degree proximity scores for conservation, rvET and cumulative MI (cMI) provide complementary information capable of improving the performance for CR identification. We found that integrating conservation with proximity scores for rvET and cMI achieved the highest performance. The proximity conservation score contained no complementary information when integrated with proximity rvET. Moreover, the signal from rvET provided only a limited gain in predictive performance when integrated with mutual information and conservation proximity scores. Combined, these observations demonstrate that the rvET and cMI scores add complementary information to the prediction system. Conclusions: This work contributes to the understanding of the different signals of evolution and also shows that it is possible to improve the detection of catalytic residues by integrating structural and higher order sequence evolutionary information with sequence conservation.
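
The proximity score described in this abstract (the sum of the scores of residues located within a given distance of the residue in question) can be sketched as follows. This is a minimal illustration, not the authors' implementation: the function name `proximity_scores`, the use of one 3D coordinate per residue, the distance cutoff, and the choice to include the residue itself in its own sum are all assumptions made here.

```python
import math

def proximity_scores(coords, scores, radius):
    """For each residue, sum the scores of all residues within `radius`.

    coords: one (x, y, z) tuple per residue (assumed representation).
    scores: per-residue values, e.g. conservation, rvET or cMI.
    The residue itself is included in its own sum (an assumption).
    """
    result = []
    for ci in coords:
        total = 0.0
        for cj, sj in zip(coords, scores):
            if math.dist(ci, cj) <= radius:
                total += sj
        result.append(total)
    return result

# Toy data: three residues on a line; the first two are 3 units apart,
# the third is far from both.
coords = [(0.0, 0.0, 0.0), (3.0, 0.0, 0.0), (10.0, 0.0, 0.0)]
raw_scores = [1.0, 0.5, 0.25]
prox = proximity_scores(coords, raw_scores, radius=5.0)
print(prox)  # [1.5, 1.5, 0.25]
```

With a cutoff of 5.0, the first two residues pool their scores while the isolated third residue keeps only its own, which is how a spatial cluster of moderately scoring residues can outrank a single high-scoring one.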

    Bayesian classification of residues associated with protein functional divergence: Arf and Arf-like GTPases

    Background: Certain residues within proteins are highly conserved across very distantly related organisms, yet their (presumably critical) structural or mechanistic roles are completely unknown. To obtain clues regarding such residues within Arf and Arf-like (Arf/Arl) GTPases--which function as on/off switches regulating vesicle trafficking, phospholipid metabolism and cytoskeletal remodeling--I apply a new sampling procedure for comparative sequence analysis, termed multiple category Bayesian Partitioning with Pattern Selection (mcBPPS). Results: The mcBPPS sampler classified sequences within the entire P-loop GTPase class into multiple categories by identifying those evolutionarily-divergent residues most likely to be responsible for functional specialization. Here I focus on categories of residues that most distinguish various Arf/Arl GTPases from other GTPases. This identified residues whose specific roles have been previously proposed (and in some cases corroborated experimentally and that thus serve as positive controls), as well as several categories of co-conserved residues whose possible roles are first hinted at here. For example, Arf/Arl/Sar GTPases are most distinguished from other GTPases by a conserved aspartate residue within the phosphate binding loop (P-loop) and by co-conserved residues nearby that, together, can form a network of salt-bridge and hydrogen bond interactions centered on the GTPase active site. Residues corresponding to an N-[VI] motif that is conserved within Arf/Arl GTPases may play a role in the interswitch toggle characteristic of the Arf family, whereas other, co-conserved residues may modulate the flexibility of the guanine binding loop. Arl8 GTPases conserve residues that strikingly diverge from those typically found in other Arf/Arl GTPases and that form structural interactions suggestive of a novel interswitch toggle mechanism. Conclusions: This analysis suggests specific mutagenesis experiments to explore mechanisms underlying GTP hydrolysis, nucleotide exchange and interswitch toggling within Arf/Arl GTPases. More generally, it illustrates how the mcBPPS sampler can complement traditional evolutionary analyses by providing an objective, quantitative and statistically rigorous way to explore protein functional-divergence in molecular detail. Because the sampler classifies the input sequences at the same time, it can be used to generate subgroup profiles, in which functionally-divergent categories of residues are annotated automatically. Reviewers: This article was reviewed by Frank Eisenhaber, L Aravind and Daniel Gaston (nominated by Eric Bapteste). For the full reviews, go to the Reviewers' comments section.

    Functional classification of protein domain superfamilies for protein function annotation

    Proteins are made up of domains that are generally considered to be independent evolutionary and structural units having distinct functional properties. It is now well established that analysis of domains in proteins provides an effective approach to understand protein function using a 'domain grammar'. Towards this end, evolutionarily-related protein domains have been classified into homologous superfamilies in CATH and SCOP databases. An ideal functional sub-classification of the domain superfamilies into 'functional families' can not only help in function annotation of uncharacterised sequences but also provide a useful framework for understanding the diversity and evolution of function at the domain level. This work describes the development of a new protocol (FunFHMMer) for identifying functional families in CATH superfamilies that makes use of sequence patterns only and hence, is unaffected by the incompleteness of function annotations, annotation biases or misannotations existing in the databases. The resulting family classification was validated using known functional information and was found to generate more functionally coherent families than other domain-based protein resources. A protein function prediction pipeline was developed exploiting the functional annotations provided by the domain families which was validated by a database rollback benchmark set of proteins and an independent assessment by CAFA 2. The functional classification was found to capture the functional diversity of superfamilies well in terms of sequence, structure and the protein-context. This aided studies on evolution of protein domain function both at the superfamily level and in specific proteins of interest. The conserved positions in the functional family alignments were found to be enriched in catalytic site residues and ligand-binding site residues which led to the development of a functional site prediction tool. Lastly, the function prediction tools were assessed for annotation of moonlighting functions of proteins and a classification of moonlighting proteins was proposed based on their structure-function relationships.

    Phosphorylation of CRN2 by CK2 regulates F-actin and Arp2/3 interaction and inhibits cell migration

    CRN2 (synonyms: coronin 1C, coronin 3) functions in the re-organization of the actin network and is implicated in cellular processes like protrusion formation, secretion, migration and invasion. We demonstrate that CRN2 is a binding partner and substrate of protein kinase CK2, which phosphorylates CRN2 at S463 in its C-terminal coiled coil domain. Phosphomimetic S463D CRN2 loses the wild-type CRN2 ability to inhibit actin polymerization, to bundle F-actin, and to bind to the Arp2/3 complex. As a consequence, S463D mutant CRN2 changes the morphology of the F-actin network in the front of lamellipodia. Our data imply that CK2-dependent phosphorylation of CRN2 is involved in the modulation of the local morphology of complex actin structures and thereby inhibits cell migration

    Multi-Harmony: detecting functional specificity from sequence alignment

    Many protein families contain sub-families with functional specialization, such as binding different ligands or being involved in different protein–protein interactions. A small number of amino acids generally determine functional specificity. The identification of these residues can aid the understanding of protein function and help find targets for experimental analysis. Here, we present multi-Harmony, an interactive web server for detecting sub-type-specific sites in proteins starting from a multiple sequence alignment. Combining our Sequence Harmony (SH) and multi-Relief (mR) methods in one web server allows simultaneous analysis and comparison of specificity residues; furthermore, both methods have been significantly improved and extended. SH has been extended to cope with more than two sub-groups. mR has been changed from a sampling implementation to a deterministic one, making it more consistent and user friendly. For both methods Z-scores are reported. The multi-Harmony web server produces a dynamic output page, which includes interactive connections to the Jalview and Jmol applets, thereby allowing interactive analysis of the results. Multi-Harmony is available at http://www.ibi.vu.nl/programs/shmrwww.

    Light Microscopy Combined with Computational Image Analysis Uncovers Virus-Specific Infection Phenotypes and Host Cell State Variability

    The study of virus infection phenotypes and variability plays a critical role in understanding viral pathogenesis and host response. Virus-host interactions can be investigated by light and various label-free microscopy methods, which provide a powerful tool for the spatiotemporal analysis of patterns at the cellular and subcellular levels in live or fixed cells. Analysis of microscopy images is increasingly complemented by sophisticated statistical methods and leverages artificial intelligence (AI) to address the tasks of image denoising, segmentation, classification, and tracking. Work in this thesis demonstrates that combining microscopy with AI techniques enables models that accurately detect and quantify viral infection due to the virus-induced cytopathic effect (CPE). Furthermore, it shows that statistical analysis of microscopy image data can disentangle stochastic and deterministic factors that contribute to viral infection variability, such as the cellular state. In summary, the integration of microscopy and computational image analysis offers a powerful and flexible approach for studying virus infection phenotypes and variability, ultimately contributing to a more advanced understanding of infection processes and creating possibilities for the development of more effective antiviral strategies.