48,211 research outputs found

    PiRaNhA: A server for the computational prediction of RNA-binding residues in protein sequences

    The PiRaNhA web server is a publicly available online resource that automatically predicts the location of RNA-binding residues (RBRs) in protein sequences. The goal of functional annotation of sequences in the field of RNA binding is to provide predictions of high accuracy that require only small numbers of targeted mutations for verification. The PiRaNhA server uses a support vector machine (SVM), with position-specific scoring matrices, residue interface propensity, predicted residue accessibility and residue hydrophobicity as features. The server allows the submission of up to 10 protein sequences, and the predictions for each sequence are provided on a web page and via email. The prediction results are provided in sequence format with predicted RBRs highlighted, in text format with the SVM threshold score indicated and as a graph which enables users to quickly identify those residues above any specific SVM threshold. The graph effectively enables the increase or decrease of the false positive rate. When tested on a non-redundant data set of 42 protein sequences not used in training, the PiRaNhA server achieved an accuracy of 85%, specificity of 90% and a Matthews correlation coefficient of 0.41 and outperformed other publicly available servers. The PiRaNhA prediction server is freely available at http://www.bioinformatics.sussex.ac.uk/PIRANHA. © The Author(s) 2010. Published by Oxford University Press
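
    The reported figures (accuracy, specificity, Matthews correlation coefficient) and the effect of moving the SVM threshold can be illustrated with a minimal sketch. It is not the PiRaNhA implementation: the function names, the scores and the labels are all invented for illustration, and it assumes that residues scoring at or above the threshold are called RNA-binding.

```python
from math import sqrt


def confusion_counts(scores, labels, threshold):
    """Count (TP, FP, TN, FN), calling a residue 'binding' when score >= threshold."""
    tp = fp = tn = fn = 0
    for score, is_binding in zip(scores, labels):
        predicted = score >= threshold
        if predicted and is_binding:
            tp += 1
        elif predicted and not is_binding:
            fp += 1
        elif not predicted and is_binding:
            fn += 1
        else:
            tn += 1
    return tp, fp, tn, fn


def metrics(tp, fp, tn, fn):
    """Return (accuracy, specificity, MCC) from confusion-matrix counts."""
    total = tp + fp + tn + fn
    accuracy = (tp + tn) / total
    specificity = tn / (tn + fp) if (tn + fp) else 0.0
    denom = sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    mcc = (tp * tn - fp * fn) / denom if denom else 0.0
    return accuracy, specificity, mcc


# Invented per-residue scores and true labels (1 = RNA-binding residue).
toy_scores = [1.2, 0.4, -0.3, -0.8, 0.9, -1.5, 0.1, -0.2]
toy_labels = [1, 1, 0, 0, 0, 0, 1, 0]

for threshold in (0.0, 1.0):
    counts = confusion_counts(toy_scores, toy_labels, threshold)
    acc, spec, mcc = metrics(*counts)
    print(f"threshold={threshold:+.1f} acc={acc:.3f} spec={spec:.3f} mcc={mcc:.3f}")
```

    On this toy data, raising the threshold removes the one false positive at the cost of two missed binding residues, which is the trade-off the server's graph lets users explore.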

    Kernel-based machine learning protocol for predicting DNA-binding proteins

    DNA-binding proteins (DNA-BPs) play a pivotal role in various intra- and extra-cellular activities ranging from DNA replication to gene expression control. Attempts have been made to identify DNA-BPs based on their sequence and structural information with moderate accuracy. Here we develop a machine learning protocol for the prediction of DNA-BPs in which the classifier is a support vector machine (SVM). Information used for classification is derived from characteristics that include surface and overall composition, overall charge and positive potential patches on the protein surface. In total, 121 DNA-BPs and 238 non-binding proteins are used to build and evaluate the protocol. In the self-consistency test, an accuracy of 100% is achieved. For cross-validation (CV) optimization over the entire dataset, we report an accuracy of 90%. Using leave 1-pair holdout evaluation, an accuracy of 86.3% is achieved. When we restrict the dataset to less than 20% sequence identity amongst the proteins, the holdout accuracy is 85.8%. Furthermore, seven DNA-BPs with unbound structures are all correctly predicted. The current performance is better than previously published results. The higher accuracy achieved here originates from two factors: the ability of the SVM to handle features that demonstrate a wide range of discriminatory power, and a different definition of the positive patch. Since our protocol does not rely on sequence or structural homology, it can be used to identify or predict proteins with DNA-binding function(s) regardless of their homology to known ones

    Annotating Protein Functional Residues by Coupling High-Throughput Fitness Profile and Homologous-Structure Analysis.

    Identification and annotation of functional residues are fundamental questions in protein sequence analysis. Sequence and structure conservation provides valuable information to tackle these questions. It is, however, limited by the incomplete sampling of sequence space in natural evolution. Moreover, proteins often have multiple functions, with overlapping sequences that present challenges to accurate annotation of the exact functions of individual residues by conservation-based methods. Using the influenza A virus PB1 protein as an example, we developed a method to systematically identify and annotate functional residues. We used saturation mutagenesis and high-throughput sequencing to measure the replication capacity of single nucleotide mutations across the entire PB1 protein. After predicting protein stability upon mutations, we identified functional PB1 residues that are essential for viral replication. To further annotate the functional residues important to the canonical or noncanonical functions of viral RNA-dependent RNA polymerase (vRdRp), we performed a homologous-structure analysis with 16 different vRdRp structures. We achieved high sensitivity in annotating the known canonical polymerase functional residues. Moreover, we identified a cluster of noncanonical functional residues located in the loop region of the PB1 β-ribbon. We further demonstrated that these residues were important for PB1 protein nuclear import through the interaction with Ran-binding protein 5. In summary, we developed a systematic and sensitive method to identify and annotate functional residues that are not restrained by sequence conservation. Importantly, this method is generally applicable to other proteins about which homologous-structure information is available.
    Importance: To fully comprehend the diverse functions of a protein, it is essential to understand the functionality of individual residues. Current methods are highly dependent on evolutionary sequence conservation, which is usually limited by sampling size. Sequence conservation-based methods are further confounded by structural constraints and multifunctionality of proteins. Here we present a method that can systematically identify and annotate functional residues of a given protein. We used a high-throughput functional profiling platform to identify essential residues. Coupling it with homologous-structure comparison, we were able to annotate multiple functions of proteins. We demonstrated the method with the PB1 protein of influenza A virus and identified novel functional residues in addition to its canonical function as an RNA-dependent RNA polymerase. Not limited to virology, this method is generally applicable to other proteins that can be functionally selected and about which homologous-structure information is available

    De novo prediction of PTBP1 binding and splicing targets reveals unexpected features of its RNA recognition and function.

    The splicing regulator Polypyrimidine Tract Binding Protein (PTBP1) has four RNA binding domains that each binds a short pyrimidine element, allowing recognition of diverse pyrimidine-rich sequences. This variation makes it difficult to evaluate PTBP1 binding to particular sites based on sequence alone and thus to identify target RNAs. Conversely, transcriptome-wide binding assays such as CLIP identify many in vivo targets, but do not provide a quantitative assessment of binding and are informative only for the cells where the analysis is performed. A general method of predicting PTBP1 binding and possible targets in any cell type is needed. We developed computational models that predict the binding and splicing targets of PTBP1. A Hidden Markov Model (HMM), trained on CLIP-seq data, was used to score probable PTBP1 binding sites. Scores from this model are highly correlated (ρ = -0.9) with experimentally determined dissociation constants. Notably, we find that the protein is not strictly pyrimidine specific, as interspersed Guanosine residues are well tolerated within PTBP1 binding sites. This model identifies many previously unrecognized PTBP1 binding sites, and can score PTBP1 binding across the transcriptome in the absence of CLIP data. Using this model to examine the placement of PTBP1 binding sites in controlling splicing, we trained a multinomial logistic model on sets of PTBP1 regulated and unregulated exons. Applying this model to rank exons across the mouse transcriptome identifies known PTBP1 targets and many new exons that were confirmed as PTBP1-repressed by RT-PCR and RNA-seq after PTBP1 depletion. We find that PTBP1 dependent exons are diverse in structure and do not all fit previous descriptions of the placement of PTBP1 binding sites. Our study uncovers new features of RNA recognition and splicing regulation by PTBP1. This approach can be applied to other multi-RRM domain proteins to assess binding site degeneracy and multifactorial splicing regulation
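
    The negative sign of the reported correlation (ρ = -0.9) is what one would expect if higher model scores go with lower dissociation constants, that is, tighter binding. The sketch below shows how such a coefficient could be computed, assuming ρ denotes a Spearman rank correlation (the abstract does not say which coefficient was used). The function names and all numbers are invented for illustration and are not taken from the study.

```python
from math import sqrt


def average_ranks(values):
    """1-based ranks; tied values share the mean of the positions they occupy."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j over the run of values equal to the one at position i.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        shared = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = shared
        i = j + 1
    return ranks


def pearson(x, y):
    """Pearson correlation of two equal-length sequences."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    return cov / sqrt(var_x * var_y)


def spearman(x, y):
    """Spearman rank correlation: Pearson correlation of the average ranks."""
    return pearson(average_ranks(x), average_ranks(y))


# Invented example: binding-site scores and dissociation constants (nM).
toy_model_scores = [8.1, 6.5, 5.0, 3.2, 1.4]
toy_kd_nm = [12.0, 30.0, 25.0, 150.0, 900.0]

print(f"rho = {spearman(toy_model_scores, toy_kd_nm):.2f}")
```

    A rank-based coefficient is a natural choice here because dissociation constants span orders of magnitude, so only the ordering of sites is compared.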

    Molecular modelling and Function Prediction of hABH7, human homologue of _E. coli_ ALKB7

    Human homologues of the ALKB protein have been reported to play a prime role in the action of DNA-damaging drugs used for cancer therapy. Little is known about the structure and function of hABH7, one of the members of this superfamily. Therefore, in the present study we aim to predict its structure and function using various bioinformatics tools. Modeling was done with MODELLER 9v7 to predict the 3D structure of the hABH7 protein. The tertiary structure model of hABH7, ALKBH7.B99990002.pdb, was predicted and evaluated. Validation results showed 97.8% of residues in the favored and additional allowed regions of the Ramachandran plot. Ligand-binding residue prediction showed four ligand clusters, with 25 ligands in cluster 1. Importantly, the presence of a conserved Phe120-Gly121-Gly122 pattern in the functional domain was detected. In the predicted structural model of hABH7, arginine residues at positions 57, 58, 59 and 60, along with tyrosine at 61, were predicted to lie in RNA-binding sites. The predicted and validated model of the human homologue hABH7 resulting from this study may unveil the mechanism of DNA damage repair in humans and accelerate research on designing appropriate inhibitors that aid in chemotherapy and cancer-related diseases

    A creature with a hundred waggly tails: intrinsically disordered proteins in the ribosome

    This article is made available for unrestricted research re-use and secondary analysis in any form or by any means with acknowledgement of the original source. These permissions are granted for the duration of the World Health Organization (WHO) declaration of COVID-19 as a global pandemic. Intrinsic disorder (i.e., lack of a unique 3-D structure) is a common phenomenon, and many biologically active proteins are disordered as a whole, or contain long disordered regions. These intrinsically disordered proteins/regions constitute a significant part of all proteomes, and their functional repertoire is complementary to functions of ordered proteins. In fact, intrinsic disorder represents an important driving force for many specific functions. An illustrative example of such disorder-centric functional class is RNA-binding proteins. In this study, we present the results of comprehensive bioinformatics analyses of the abundance and roles of intrinsic disorder in 3,411 ribosomal proteins from 32 species. We show that many ribosomal proteins are intrinsically disordered or hybrid proteins that contain ordered and disordered domains. Predicted globular domains of many ribosomal proteins contain noticeable regions of intrinsic disorder. We also show that disorder in ribosomal proteins has different characteristics compared to other proteins that interact with RNA and DNA including overall abundance, evolutionary conservation, and involvement in protein–protein interactions. Furthermore, intrinsic disorder is not only abundant in the ribosomal proteins, but we demonstrate that it is absolutely necessary for their various functions

    Understanding diversity of human innate immunity receptors: analysis of surface features of leucine-rich repeat domains in NLRs and TLRs.

    Background: The human innate immune system uses a system of extracellular Toll-like receptors (TLRs) and intracellular Nod-like receptors (NLRs) to match the appropriate level of immune response to the level of threat from the current environment. Almost all NLRs and TLRs have a domain consisting of multiple leucine-rich repeats (LRRs), which is believed to be involved in ligand binding. LRRs, found also in thousands of other proteins, form a well-defined "horseshoe"-shaped structural scaffold that can be used for a variety of functions, from binding specific ligands to performing a general structural role. The specific functional roles of LRR domains in NLRs and TLRs are thus defined by their detailed surface features. While experimental crystal structures of four human TLRs have been solved, no structure data are available for NLRs. Results: We report a quantitative, comparative analysis of the surface features of LRR domains in human NLRs and TLRs, using predicted three-dimensional structures for NLRs. Specifically, we calculated amino acid hydrophobicity, charge, and glycosylation distributions within LRR domain surfaces and assessed their similarity by clustering. Despite differences in structural and genomic organization, comparison of LRR surface features in NLRs and TLRs allowed us to hypothesize about their possible functional similarities. We find agreement between predicted surface similarities and similar functional roles in NLRs and TLRs with known agonists, and suggest possible binding partners for uncharacterized NLRs. Conclusion: Despite its low resolution, our approach permits comparison of molecular surface features in the absence of crystal structure data. Our results illustrate diversity of surface features of innate immunity receptors and provide hints for function of NLRs whose specific role in innate immunity is yet unknown

    Structure and function prediction of human homologue hABH5 of _E. coli_ ALKB5 using in silico approach

    Newly discovered human homologues of the ALKB protein have been reported to affect the activity of DNA-damaging drugs used for cancer therapy. Little is known about the structure and function of hABH5, one of the members of this superfamily. Therefore, in the present study we aim to predict its structure and function using various bioinformatics tools. Modeling was done with MODELLER 9v7 to predict the 3D structure of the hABH5 protein. The 3D model of hABH5, ALKBH5.B99990005.pdb, was predicted and evaluated. Validation results showed 96.8% of residues in the favored and additional allowed regions of the Ramachandran plot. Ligand-binding residue prediction showed four ligand clusters, with 25 ligands in cluster 1. Importantly, a conserved Pro158-X-Asp160-Xn-His266 pattern in the functional domain was detected. DNA and RNA binding sites were also predicted in the model. The predicted and validated model of the human homologue hABH5 resulting from this study may unveil the mechanism of DNA damage repair in humans and accelerate research on designing appropriate inhibitors that aid in chemotherapy and cancer-related diseases