10,537 research outputs found

    D2P2: database of disordered protein predictions

    Get PDF
    We present the Database of Disordered Protein Prediction (D2P2), available at http://d2p2.pro (including website source code). A battery of disorder predictors and their variants, VL-XT, VSL2b, PrDOS, PV2, Espritz and IUPred, were run on all protein sequences from 1765 complete proteomes (to be updated as more genomes are completed). Integrated with these results are all of the predicted (mostly structured) SCOP domains using the SUPERFAMILY predictor. These disorder/structure annotations together enable comparison of the disorder predictors with each other and examination of the overlap between disordered predictions and SCOP domains on a large scale. D2P2 will increase our understanding of the interplay between disorder and structure, the genomic distribution of disorder, and its evolutionary history. The parsed data are made available in a unified format for download as flat files or SQL tables either by genome, by predictor, or for the complete set. An interactive website provides a graphical view of each protein annotated with the SCOP domains and disordered regions from all predictors overlaid (or shown as a consensus). There are statistics and tools for browsing and comparing genomes and their disorder within the context of their position on the tree of life. Ā© The Author(s) 2012. Published by Oxford University Press

    The Phyre2 web portal for protein modeling, prediction and analysis

    Get PDF
    Phyre2 is a suite of tools available on the web to predict and analyze protein structure, function and mutations. The focus of Phyre2 is to provide biologists with a simple and intuitive interface to state-of-the-art protein bioinformatics tools. Phyre2 replaces Phyre, the original version of the server for which we previously published a paper in Nature Protocols. In this updated protocol, we describe Phyre2, which uses advanced remote homology detection methods to build 3D models, predict ligand binding sites and analyze the effect of amino acid variants (e.g., nonsynonymous SNPs (nsSNPs)) for a user's protein sequence. Users are guided through results by a simple interface at a level of detail they determine. This protocol will guide users from submitting a protein sequence to interpreting the secondary and tertiary structure of their models, their domain composition and model quality. A range of additional available tools is described to find a protein structure in a genome, to submit large number of sequences at once and to automatically run weekly searches for proteins that are difficult to model. The server is available at http://www.sbg.bio.ic.ac.uk/phyre2. A typical structure prediction will be returned between 30 min and 2 h after submission

    A Combination of Compositional Index and Genetic Algorithm for Predicting Transmembrane Helical Segments

    Get PDF
    Transmembrane helix (TMH) topology prediction is becoming a focal problem in bioinformatics because the structure of TM proteins is difficult to determine using experimental methods. Therefore, methods that can computationally predict the topology of helical membrane proteins are highly desirable. In this paper we introduce TMHindex, a method for detecting TMH segments using only the amino acid sequence information. Each amino acid in a protein sequence is represented by a Compositional Index, which is deduced from a combination of the difference in amino acid occurrences in TMH and non-TMH segments in training protein sequences and the amino acid composition information. Furthermore, a genetic algorithm was employed to find the optimal threshold value for the separation of TMH segments from non-TMH segments. The method successfully predicted 376 out of the 378 TMH segments in a dataset consisting of 70 test protein sequences. The sensitivity and specificity for classifying each amino acid in every protein sequence in the dataset was 0.901 and 0.865, respectively. To assess the generality of TMHindex, we also tested the approach on another standard 73-protein 3D helix dataset. TMHindex correctly predicted 91.8% of proteins based on TM segments. The level of the accuracy achieved using TMHindex in comparison to other recent approaches for predicting the topology of TM proteins is a strong argument in favor of our proposed method. Availability: The datasets, software together with supplementary materials are available at: http://faculty.uaeu.ac.ae/nzaki/TMHindex.htm

    FLEXc: Protein Flexibility Prediction Using Context-Based Statistics, Predicted Structural Features, and Sequence Information

    Get PDF
    The fluctuation of atoms around their average positions in protein structures provides important information regarding protein dynamics. This flexibility of protein structures is associated with various biological processes. Predicting flexibility of residues from protein sequences is significant for analyzing the dynamic properties of proteins which will be helpful in predicting their functions

    Machine Learning and Graph Theory Approaches for Classification and Prediction of Protein Structure

    Get PDF
    Recently, many methods have been proposed for the classification and prediction problems in bioinformatics. One of these problems is the protein structure prediction. Machine learning approaches and new algorithms have been proposed to solve this problem. Among the machine learning approaches, Support Vector Machines (SVM) have attracted a lot of attention due to their high prediction accuracy. Since protein data consists of sequence and structural information, another most widely used approach for modeling this structured data is to use graphs. In computer science, graph theory has been widely studied; however it has only been recently applied to bioinformatics. In this work, we introduced new algorithms based on statistical methods, graph theory concepts and machine learning for the protein structure prediction problem. A new statistical method based on z-scores has been introduced for seed selection in proteins. A new method based on finding common cliques in protein data for feature selection is also introduced, which reduces noise in the data. We also introduced new binary classifiers for the prediction of structural transitions in proteins. These new binary classifiers achieve much higher accuracy results than the current traditional binary classifiers
    • ā€¦
    corecore