461 research outputs found

Alignment of helical membrane protein sequences using AlignMe

Authors: Forrest Lucy R.; Khafizov Kamil; Stamm Marcus; Staritzbichler René
Publication date: 01/01/2013

Few sequence alignment methods have been designed specifically for integral membrane proteins, even though these important proteins have distinct evolutionary and structural properties that might affect their alignments. Existing approaches typically account for membrane-related information either by using membrane-specific substitution matrices or by assigning distinct penalties for gap creation in transmembrane and non-transmembrane regions. Here, we ask whether favoring matching of predicted transmembrane segments within a standard dynamic programming algorithm can improve the accuracy of pairwise membrane protein sequence alignments. We tested various strategies using a specifically designed program called AlignMe. An updated set of homologous membrane protein structures, called HOMEP2, was used as a reference for optimizing the gap penalties. The best of the membrane-protein-optimized approaches were then tested on an independent reference set of membrane protein sequence alignments from the BAliBASE collection. When secondary structure (S) matching was combined with evolutionary information (using a position-specific substitution matrix (P)), in an approach we called AlignMePS, the resultant pairwise alignments were typically among the most accurate over a broad range of sequence similarities when compared to available methods. Matching transmembrane predictions (T), in addition to evolutionary information and secondary-structure predictions, in an approach called AlignMePST, generally reduces the accuracy of the alignments of closely related proteins in the BAliBASE set relative to AlignMePS, but may be useful in cases of extremely distantly related proteins for which sequence information is less informative. The open source AlignMe code is available at https://sourceforge.net/projects/alignme/ and at http://www.forrestlab.org, along with an online server and the HOMEP2 data set.
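The idea of favoring matched transmembrane predictions inside a standard dynamic programming alignment can be sketched as follows. This is a minimal illustration and not the AlignMe implementation: it assumes simple match/mismatch scoring in place of a position-specific substitution matrix, a single linear gap penalty in place of region-specific gap penalties, and a hypothetical `tm_weight` bonus. The function name and all parameter values are illustrative assumptions.

```python
def align_with_tm_bonus(seq_a, seq_b, tm_a, tm_b,
                        match=2.0, mismatch=-1.0, gap=-2.0, tm_weight=1.0):
    """Global (Needleman-Wunsch) alignment with a bonus for aligned TM residues.

    tm_a and tm_b are per-residue prediction strings: 'M' marks a residue
    predicted to lie in the membrane, any other character marks the rest.
    All scoring values are illustrative assumptions.
    """
    n, m = len(seq_a), len(seq_b)
    assert len(tm_a) == n and len(tm_b) == m

    def pair_score(i, j):
        # Sequence term plus a bonus when both residues are predicted TM.
        s = match if seq_a[i] == seq_b[j] else mismatch
        if tm_a[i] == "M" and tm_b[j] == "M":
            s += tm_weight
        return s

    # Fill the dynamic programming matrix.
    score = [[0.0] * (m + 1) for _ in range(n + 1)]
    for i in range(1, n + 1):
        score[i][0] = i * gap
    for j in range(1, m + 1):
        score[0][j] = j * gap
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            score[i][j] = max(score[i - 1][j - 1] + pair_score(i - 1, j - 1),
                              score[i - 1][j] + gap,
                              score[i][j - 1] + gap)

    # Trace back from the bottom-right corner to recover one optimal alignment.
    i, j = n, m
    out_a, out_b = [], []
    while i > 0 or j > 0:
        if i > 0 and j > 0 and score[i][j] == score[i - 1][j - 1] + pair_score(i - 1, j - 1):
            out_a.append(seq_a[i - 1]); out_b.append(seq_b[j - 1])
            i -= 1; j -= 1
        elif i > 0 and (j == 0 or score[i][j] == score[i - 1][j] + gap):
            out_a.append(seq_a[i - 1]); out_b.append("-")
            i -= 1
        else:
            out_a.append("-"); out_b.append(seq_b[j - 1])
            j -= 1
    return score[n][m], "".join(reversed(out_a)), "".join(reversed(out_b))
```

Setting `tm_weight=0.0` reduces the sketch to a plain sequence-only alignment, which gives a simple way to compare alignments with and without the transmembrane term.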

Directory of Open Access Journals

PubMed Central

MPG.PuRe

Hochschulschriftenserver - Universität Frankfurt am Main

FigShare

Improving protein order-disorder classification using charge-hydropathy plots

Authors: Dunker A. Keith; Hsu Wei-Lun; Huang Fei; Liu Xiaowen; Meng Jingwei; Oldfield Christopher J.; Romero Pedro; Shen Li; Uversky Vladimir N.; Xue Bin
Publication venue: Springer Science and Business Media LLC
Publication date: 01/01/2014

BACKGROUND: The earliest whole protein order/disorder predictor (Uversky et al., Proteins, 41: 415-427 (2000)), herein called the charge-hydropathy (C-H) plot, was originally developed using the Kyte-Doolittle (1982) hydropathy scale (Kyte & Doolittle, J. Mol. Biol., 157: 105-132 (1982)). Here the goal is to determine whether the performance of the C-H plot in separating structured and disordered proteins can be improved by using an alternative hydropathy scale. RESULTS: Using the performance of the C-H plot as the metric, we compared 19 alternative hydropathy scales, with the finding that the Guy (1985) hydropathy scale (Guy, Biophys. J., 47: 61-70 (1985)) was the best of the tested hydropathy scales for separating large collections of structured proteins and intrinsically disordered proteins (IDPs) on the C-H plot. Next, we developed a new scale, named IDP-Hydropathy, which further improves the discrimination between structured proteins and IDPs. Applying the C-H plot to a dataset containing 109 IDPs and 563 non-homologous fully structured proteins, the Kyte-Doolittle (1982) hydropathy scale, the Guy (1985) hydropathy scale, and the IDP-Hydropathy scale gave balanced two-state classification accuracies of 79%, 84%, and 90%, respectively, indicating that a substantial overall improvement is obtained by using different hydropathy scales. A correlation study shows that IDP-Hydropathy is strongly correlated with other hydropathy scales, suggesting that IDP-Hydropathy probably has only minor contributions from amino acid properties other than hydropathy. CONCLUSION: We suggest that IDP-Hydropathy would likely be the best scale to use for any type of algorithm developed to predict protein disorder.
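A C-H plot classifier of the kind described above can be sketched in a few lines. The sketch uses the Kyte-Doolittle scale rescaled to the 0 to 1 range. Several details are assumptions that the abstract does not state: the boundary line (mean net charge = 2.785 × mean hydropathy − 1.151, the commonly quoted form), the charge definition (only K, R, D and E are counted), and the reading of "balanced accuracy" as the mean of sensitivity and specificity. The IDP-Hydropathy scale itself is not reproduced here.

```python
# Kyte-Doolittle (1982) hydropathy values for the 20 standard amino acids.
KYTE_DOOLITTLE = {
    "A": 1.8, "R": -4.5, "N": -3.5, "D": -3.5, "C": 2.5,
    "Q": -3.5, "E": -3.5, "G": -0.4, "H": -3.2, "I": 4.5,
    "L": 3.8, "K": -3.9, "M": 1.9, "F": 2.8, "P": -1.6,
    "S": -0.8, "T": -0.7, "W": -0.9, "Y": -1.3, "V": 4.2,
}


def mean_hydropathy(seq, scale=KYTE_DOOLITTLE):
    """Mean per-residue hydropathy after rescaling the scale to [0, 1]."""
    lo, hi = min(scale.values()), max(scale.values())
    return sum((scale[aa] - lo) / (hi - lo) for aa in seq) / len(seq)


def mean_net_charge(seq):
    """Absolute mean net charge, counting K/R as +1 and D/E as -1 (assumption)."""
    positive = sum(seq.count(aa) for aa in "KR")
    negative = sum(seq.count(aa) for aa in "DE")
    return abs(positive - negative) / len(seq)


def predict_disordered(seq, slope=2.785, intercept=-1.151, scale=KYTE_DOOLITTLE):
    """True if the protein falls on the disordered side of the assumed boundary."""
    return mean_net_charge(seq) > slope * mean_hydropathy(seq, scale) + intercept


def balanced_accuracy(y_true, y_pred):
    """Mean of sensitivity and specificity for a two-state classification."""
    pos = [p for t, p in zip(y_true, y_pred) if t]
    neg = [p for t, p in zip(y_true, y_pred) if not t]
    sensitivity = sum(1 for p in pos if p) / len(pos)
    specificity = sum(1 for p in neg if not p) / len(neg)
    return (sensitivity + specificity) / 2
```

Passing a different dictionary as `scale` is how an alternative hydropathy scale would be compared in this sketch. The boundary parameters would normally be refitted for each scale rather than reused.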

USFSP Digital Archive

IUPUIScholarWorks

Springer - Publisher Connector

PubMed Central

Scholar Commons - University of South Florida

Investigation of transmembrane proteins using a computational approach

Authors: Deng Youping; Dunker A Keith; Huang Xudong; Yang Jack Y; Yang Mary Qu
Publication venue: BioMed Central
Publication date: 20/03/2008

Background: An important subfamily of membrane proteins are the transmembrane α-helical proteins, in which the membrane-spanning regions are made up of α-helices. Given the obvious biological and medical significance of these proteins, it is of tremendous practical importance to identify the location of transmembrane segments. The difficulty of inferring the secondary or tertiary structure of transmembrane proteins using experimental techniques has led to a surge of interest in applying techniques from machine learning and bioinformatics to infer secondary structure from primary structure in these proteins. We are therefore interested in determining which physicochemical properties are most useful for discriminating transmembrane segments from non-transmembrane segments in transmembrane proteins, and for discriminating intrinsically unstructured segments from intrinsically structured segments in transmembrane proteins, and in using the results of these investigations to develop classifiers to identify transmembrane segments in transmembrane proteins. Results: We determined that the most useful properties for discriminating transmembrane segments from non-transmembrane segments and for discriminating intrinsically unstructured segments from intrinsically structured segments in transmembrane proteins were hydropathy, polarity, and flexibility, and used the results of this analysis to construct classifiers to discriminate transmembrane segments from non-transmembrane segments using four classification techniques: two variants of the Self-Organizing Global Ranking algorithm, a decision tree algorithm, and a support vector machine algorithm. All four techniques exhibited good performance, with out-of-sample accuracies of approximately 75%. 
Conclusions: Several interesting observations emerged from our study: intrinsically unstructured segments and transmembrane segments tend to have opposite properties; transmembrane proteins appear to be much richer in intrinsically unstructured segments than other proteins; and, in approximately 70% of transmembrane proteins that contain intrinsically unstructured segments, the intrinsically unstructured segments are close to transmembrane segments.
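Hydropathy is named above as one of the most useful properties for separating transmembrane from non-transmembrane segments. A classic sliding-window baseline shows how that property can be turned into a per-segment signal. This is not one of the four classifiers used in the study: it is a simple Kyte-Doolittle windowing sketch, and the window length of 19 and the threshold of 1.6 are assumed, conventional choices rather than values from the abstract.

```python
# Kyte-Doolittle (1982) hydropathy values for the 20 standard amino acids.
KD = {
    "A": 1.8, "R": -4.5, "N": -3.5, "D": -3.5, "C": 2.5,
    "Q": -3.5, "E": -3.5, "G": -0.4, "H": -3.2, "I": 4.5,
    "L": 3.8, "K": -3.9, "M": 1.9, "F": 2.8, "P": -1.6,
    "S": -0.8, "T": -0.7, "W": -0.9, "Y": -1.3, "V": 4.2,
}


def windowed_hydropathy(seq, window=19):
    """Mean hydropathy of every full-length window, indexed by window start."""
    values = [KD[aa] for aa in seq]
    return [sum(values[i:i + window]) / window
            for i in range(len(values) - window + 1)]


def candidate_tm_windows(seq, window=19, threshold=1.6):
    """Start positions of windows whose mean hydropathy exceeds the threshold."""
    return [i for i, h in enumerate(windowed_hydropathy(seq, window))
            if h > threshold]
```

A window-level feature such as this mean could serve as one input to a decision tree or support vector machine, alongside analogous polarity and flexibility features.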

Harvard University - DASH

Springer - Publisher Connector

PubMed Central

Machine learning can guide experimental approaches for protein digestibility estimations

Authors: Balaguer Maria Angels de Luis; Bhagavathula Anvita; Chandra Ranveer; Malvar Sara; Sharma Swati
Publication date: 01/11/2022

Food protein digestibility and bioavailability are critical aspects in addressing human nutritional demands, particularly when seeking sustainable alternatives to animal-based proteins. In this study, we propose a machine learning approach to predict the true ileal digestibility coefficient of food items. The model makes use of a unique curated dataset that combines nutritional information from different foods with FASTA sequences of some of their protein families. We extracted the biochemical properties of the proteins and combined these properties with embeddings from a Transformer-based protein Language Model (pLM). In addition, we used SHAP to identify features that contribute most to the model prediction and provide interpretability. This first AI-based model for predicting food protein digestibility has an accuracy of 90% compared to existing experimental techniques. With this accuracy, our model can eliminate the need for lengthy in-vivo or in-vitro experiments, making the process of creating new foods faster, cheaper, and more ethical. Comment: 50 pages, submitted to Nature Food

arXiv.org e-Print Archive

Software tools for simultaneous data visualization and T cell epitopes and disorder prediction in proteins

Authors: Jandrlić Davorka; Lazić Goran M.; Mitić Nenad S.; Pavlović Mirjana D.
Publication venue: Elsevier BV
Publication date: 01/01/2016

We have developed EpDis and MassPred, extendable open source software tools that support bioinformatic research and enable parallel use of different methods for the prediction of T cell epitopes, disorder and disordered binding regions, and for hydropathy calculation. These tools offer a semi-automated installation of chosen sets of external predictors and an interface allowing for easy application of the prediction methods, which can be applied either to individual proteins or to datasets containing a large number of proteins. In addition to access to prediction methods, the tools also provide visualization of the obtained results, calculation of a consensus from the results of different methods, as well as import of experimental data and their comparison with results obtained with different predictors. The tools also offer a graphical user interface and the possibility to store data and the results obtained using all of the integrated methods in a relational database or flat file for further analysis. The MassPred part enables a massively parallel application of all integrated predictors to a set of proteins. Both tools can be downloaded from http://bioinfo.matf.bg.ac.rs/home/downloads.wafl?cat=Software. Appendix A includes the technical description of the created tools and a list of supported predictors.

Machinery - Repository of the Faculty of Mechanical Engineering, University of Belgrade

Factors Affecting Synonymous Codon Usage Bias in Chloroplast Genome of Oncidium Gower Ramsey

Authors: Bierne N.; Bulmer M.; Carlini D.B.; Greenacre M.J.; Hou Z.C.; Kinshuk C.N.; Maria D.; Peng J.; Pär K.
Publication venue: Libertas Academica
Publication date: 01/01/2011

Oncidium Gower Ramsey is a fascinating and important ornamental flower in the floral industry. In this research, the complete nucleotide sequence of the chloroplast genome of Oncidium Gower Ramsey was studied and then analyzed using the CodonW software. Correspondence analysis and the effective number of codons (Nc-plot) method were used to analyze synonymous codon usage. According to the correspondence analysis, codon bias in the chloroplast genome of Oncidium Gower Ramsey is related to gene length, mutation bias, the hydropathy level of each protein, and gene function; selection or gene expression only subtly affects codon usage. This study will provide insights for the study of molecular evolution and for high-level transgene expression.
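An Nc-plot compares each gene's observed effective number of codons (Nc) with the value expected if codon usage were shaped by GC content at third codon positions alone. The sketch below computes that expected curve using Wright's (1990) formula, Nc = 2 + s + 29 / (s² + (1 − s)²), where s is the third-position GC fraction. The function names are illustrative. As a simplifying assumption, GC3 is computed over all codons rather than over synonymous positions only (GC3s), and the observed Nc is assumed to come from a tool such as CodonW.

```python
def gc3(cds):
    """Fraction of G or C at third codon positions of an in-frame coding sequence.

    Simplification: all codons are counted, including Met, Trp and stop codons,
    which a strict GC3s calculation would exclude.
    """
    thirds = cds.upper()[2::3]
    return sum(base in "GC" for base in thirds) / len(thirds)


def expected_nc(s):
    """Expected effective number of codons under GC3 composition bias alone."""
    return 2 + s + 29 / (s ** 2 + (1 - s) ** 2)


def below_expected_curve(observed_nc, cds):
    """True if a gene lies below the expected curve on the Nc-plot."""
    return observed_nc < expected_nc(gc3(cds))
```

Genes that fall well below the expected curve are usually read as showing codon bias beyond what mutational composition explains, while genes on or near the curve are consistent with mutation bias as the main factor.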

CiteSeerX

Crossref

Directory of Open Access Journals

PubMed Central

A Review on the Role of Amino Acids in Gas Hydrate Inhibition, CO2 Capture and Sequestration, and Natural Gas Storage

Authors: Bavoh Cornelius B.; Lal Bhajan; Mukhtar Hilmi; Osei Harrison; Sabil Khalik Mohamad
Publication venue: Elsevier BV
Publication date: 01/04/2019

Heriot Watt Pure

Sensitivity of Water Dynamics to Biologically Significant Surfaces of Monomeric Insulin: Role of Topology and Electrostatic Interactions

Authors: Bagchi Kushal; Roy Susmita
Publication date: 21/12/2013

In addition to the biologically active monomer of the protein Insulin circulating in human blood, the molecule also exists in dimeric and hexameric forms that are used as storage. The Insulin monomer contains two distinct surfaces, namely the dimer forming surface (DFS) and the hexamer forming surface (HFS), that are specifically designed to facilitate the formation of the dimer and the hexamer, respectively. In order to characterize the structural and dynamical behaviour of interfacial water molecules near these two surfaces (DFS and HFS), we performed atomistic molecular dynamics simulations of Insulin with explicit water. Dynamical characterization reveals that the structural relaxation of the hydrogen bonds formed between the residues of the DFS and the interfacial water molecules is faster than that of the hydrogen bonds formed between water and the residues of the HFS. Furthermore, the residence times of water molecules in the protein hydration layer for both the DFS and the HFS are found to be significantly higher than those for some of the other proteins studied so far, such as HP-36 and lysozyme. The surface topography and the arrangement of amino acid residues work together to organize the water molecules in the hydration layer in order to provide them with a preferred orientation. The HFS, having a large polar solvent accessible surface area and a convex, extensive nonpolar region, drives the surrounding water molecules to acquire a predominantly clathrate-like structure. In contrast, near the DFS, the surrounding water molecules acquire an inverted orientation owing to the flat curvature of the hydrophobic surface and the interrupted alignment of hydrophilic residues. We have followed the escape trajectories of several such quasi-bound water molecules from both surfaces and constructed free energy surfaces of these water molecules. These free energy surfaces reveal the differences between the two hydration layers. Comment: 34 pages, 10 figures

arXiv.org e-Print Archive

Open Access Repository of IISc Research Publications