15 research outputs found
Predicting Residue-Residue Contacts and Helix-Helix Interactions in Transmembrane Proteins Using an Integrative Feature-Based Random Forest Approach
Integral membrane proteins constitute 25–30% of genomes and play crucial roles in many biological processes. However, less than 1% of membrane protein structures are in the Protein Data Bank. In this context, it is important to develop reliable computational methods for predicting the structures of membrane proteins. Here, we present the first application of random forest (RF) to residue-residue contact prediction in transmembrane proteins, which we term TMhhcp. Rigorous cross-validation tests indicate that the RF models outperform two state-of-the-art methods, TMHcon and MEMPACK. Under a strict leave-one-protein-out jackknifing procedure, the models reached top L/5 prediction accuracies of 49.5% and 48.8% for two different residue contact definitions, respectively. The predicted residue contacts were then used to predict interacting helical pairs, achieving Matthews correlation coefficients of 0.430 and 0.424 for the two residue contact definitions, respectively. For the benefit of the academic community, the TMhhcp server has been made freely accessible at http://protein.cau.edu.cn/tmhhcp
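The two metrics quoted in this abstract can be illustrated with a minimal sketch. The function names, the toy scores and the data layout below are assumptions made for illustration; they are not taken from TMhhcp or its server.

```python
import math

def top_l5_accuracy(scored_pairs, true_contacts, seq_len):
    """Fraction of the top L/5 highest-scoring residue pairs that are true contacts.

    scored_pairs: list of ((i, j), score) tuples (hypothetical layout).
    true_contacts: set of (i, j) pairs observed in the structure.
    seq_len: sequence length L.
    """
    k = max(1, seq_len // 5)
    ranked = sorted(scored_pairs, key=lambda item: item[1], reverse=True)[:k]
    if not ranked:
        return 0.0
    hits = sum(1 for pair, _ in ranked if pair in true_contacts)
    return hits / len(ranked)

def matthews_cc(tp, fp, tn, fn):
    """Matthews correlation coefficient from a 2x2 confusion matrix."""
    denom = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    return 0.0 if denom == 0 else (tp * tn - fp * fn) / denom

# Toy data, invented for this example only.
scored = [((1, 9), 0.9), ((2, 8), 0.8), ((3, 7), 0.4), ((4, 6), 0.2)]
truth = {(1, 9), (3, 7)}
acc = top_l5_accuracy(scored, truth, seq_len=10)
mcc = matthews_cc(tp=5, fp=1, tn=5, fn=1)
```

With L = 10, only the two highest-scoring pairs are evaluated, so a method is rewarded for ranking true contacts first rather than for predicting many pairs.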
A computational analysis of non-genomic plasma membrane progestin binding proteins: Signaling through ion channel-linked cell surface receptors
A number of plasma membrane progestin receptors linked to non-genomic events have been identified. These include: (1) α1-subunit of the Na+/K+-ATPase (ATP1A1), (2) progestin binding PAQR proteins, (3) membrane progestin receptor alpha (mPRα), (4) progesterone receptor MAPR proteins and (5) the association of nuclear receptor (PRB) with the plasma membrane. This study compares the pore-lining regions (ion channels), transmembrane (TM) helices, caveolin binding (CB) motifs and leucine-rich repeats (LRRs) of putative progesterone receptors. ATP1A1 contains 10 TM helices (TM-2, 4, 5, 6 and 8 are pores) and 4 CB motifs, whereas PAQR5, PAQR6, PAQR7, PAQR8 and fish mPRα each contain 8 TM helices (TM-3 is a pore) and 2–4 CB motifs. MAPR proteins contain a single TM helix but lack pore-lining regions and CB motifs. PRB contains one or more TM helices in the steroid binding region, one of which is a pore. ATP1A1, PAQR5/7/8, mPRα, and MAPR-1 contain highly conserved leucine-rich repeats (LRRs, common to plant membrane proteins) that are ligand binding sites for ouabain-like steroids associated with LRR kinases. LRR domains are within or overlap TM helices predicted to be ion channels (pore-lining regions), with the variable LRR sequence either at the C-terminus (PAQR and MAPR-1) or within an external loop (ATP1A1). Since ouabain-like steroids are produced by animal cells, our findings suggest that ATP1A1, PAQR5/7/8 and mPRα represent ion channel-linked receptors that respond physiologically to ouabain-like steroids (not progestin), similar to those known to regulate developmental and defense-related processes in plants
Molecular models for the core components of the flagellar type-III secretion complex
We show that by using a combination of computational methods, consistent three-dimensional molecular models can be proposed for the core proteins of the type-III secretion system. We employed a variety of approaches to reconcile disparate, and sometimes inconsistent, data sources into a coherent picture that, for most of the proteins, indicated a unique solution to the constraints. The range of difficulty spanned from the trivial (FliQ) to the difficult (FlhA and FliP). The uncertainties encountered with FlhA were largely the result of the greater number of helix packing possibilities allowed in a large protein; for FliP, however, there remains an uncertainty in how to reconcile the large displacement predicted between its two main helical hairpins with their ability to sit together happily across the bacterial membrane. As there is still no high-resolution structural information on any of these proteins, we hope our predicted models may be of some use in aiding the interpretation of electron microscope images and in rationalising mutation data and experiments
C19orf12 mutation leads to a pallido-pyramidal syndrome.
Pallido-pyramidal syndromes combine dystonia with or without parkinsonism and spasticity as part of a mixed neurodegenerative disorder. Several causative genes have been shown to lead to pallido-pyramidal syndromes, including FBXO7, ATP13A2, PLA2G6, PRKN and SPG11. Among these, ATP13A2 and PLA2G6 are inconsistently associated with brain iron deposition. Using homozygosity mapping and direct sequencing in a multiplex consanguineous Saudi Arabian family with a pallido-pyramidal syndrome, iron deposition and cerebellar atrophy, we identified a homozygous p.G53R mutation in C19orf12. Our findings add to the phenotypic spectrum associated with C19orf12 mutations
The PSIPRED Protein Analysis Workbench: 20 years on
The PSIPRED Workbench is a web server that has offered a range of predictive methods to the bioscience community for 20 years. Here, we present the work we have completed to update the PSIPRED Protein Analysis Workbench and make it ready for the next 20 years. The main focus of our recent website upgrade work has been the acceleration of analyses in the face of increasing protein sequence database size. We additionally discuss any new software, the new hardware infrastructure, our web services and website. Lastly, we survey updates to some of the key predictive algorithms available through our website
Transmembrane protein structure prediction using machine learning
This thesis describes the development and application of machine learning-based
methods for the prediction of alpha-helical transmembrane protein
structure from sequence alone. It is divided into six chapters.
Chapter 1 provides an introduction to membrane structure and dynamics,
membrane protein classes and families, and membrane protein structure prediction.
Chapter 2 describes a topological study of the transmembrane protein
CLN3 using a consensus of bioinformatic approaches constrained by experimental
data. Mutations in CLN3 can cause juvenile neuronal ceroid
lipofuscinosis, or Batten disease, an inherited neurodegenerative lysosomal
storage disease affecting children; such studies are therefore important
for directing further experimental work into this incurable illness.
Chapter 3 explores the possibility of using biologically meaningful signatures
described as regular expressions to influence the assignment of inside
and outside loop locations during transmembrane topology prediction. Using
this approach, it was possible to modify a recent topology prediction method,
leading to a 6% improvement in prediction accuracy on a standard data set.
Chapter 4 describes the development of a novel support vector machine-based
topology predictor that integrates both signal peptide and re-entrant helix prediction,
benchmarked with full cross-validation on a novel data set of sequences with
known crystal structures. The method achieves state-of-the-art performance in predicting
topology and discriminating between globular and transmembrane proteins.
We also present the results of applying these tools to a number of complete genomes.
Chapter 5 describes a novel approach to predict lipid exposure, residue
contacts, helix-helix interactions and finally the optimal helical packing
arrangement of transmembrane proteins. It is based on two support vector
machine classifiers that predict per residue lipid exposure and residue contacts,
which are used to determine helix-helix interaction with up to 65%
accuracy. The method is also able to discriminate native from decoy helical
packing arrangements with up to 70% accuracy. Finally, a force-directed
algorithm is employed to construct the optimal helical packing arrangement
which demonstrates success for proteins containing up to 13 transmembrane helices.
The final chapter summarises the major contributions of this thesis to biology,
before future perspectives for TM protein structure prediction are discussed
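The idea in Chapter 3, using regular-expression signatures to bias inside/outside loop assignment, can be sketched as follows. This is a minimal illustration under stated assumptions, not the thesis's method: the single signature used here (the N-glycosylation sequon, which occurs on the non-cytoplasmic side), the function name choose_orientation, the loop coordinates and the toy sequence are all chosen for the example.

```python
import re

# Hypothetical signature table: (compiled pattern, side where the motif is expected).
# N-x-[ST] with x != P is the N-glycosylation sequon; it is used here only as an
# example of a location-specific motif.
SIGNATURES = [(re.compile(r"N[^P][ST][^P]"), "outside")]

def choose_orientation(sequence, loops):
    """Pick the side of the first loop that best agrees with the signatures.

    loops: list of (start, end) index pairs, end exclusive, in sequence order.
    Loops are assumed to alternate sides, as in a helical bundle topology.
    Returns (best_first_side, scores), where scores counts agreeing matches.
    """
    scores = {"inside": 0, "outside": 0}
    for first_side in scores:
        side = first_side
        for start, end in loops:
            segment = sequence[start:end]
            for pattern, expected_side in SIGNATURES:
                if expected_side == side:
                    scores[first_side] += len(pattern.findall(segment))
            # Crossing a transmembrane helix flips the side of the next loop.
            side = "outside" if side == "inside" else "inside"
    return max(scores, key=scores.get), scores

# Toy sequence: two 20-residue hydrophobic stretches around a loop with a sequon.
seq = "MKT" + "L" * 20 + "ANGTA" + "L" * 20 + "KKR"
loops = [(0, 3), (23, 28), (48, 51)]
best, scores = choose_orientation(seq, loops)
```

Here the sequon sits in the middle loop, so the orientation that places that loop outside (first loop inside) scores higher. A real predictor would combine such votes with its own per-residue scores rather than rely on them alone.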
Designing novel construction for cell surface display of protein E on Escherichia coli using non-classical pathway based on Lpp-OmpA
Additional file 2. pNon-OmpA-PE.ab1. Sequencing result of pNon-OmpA-PE
Structural bioinformatics analysis of the SARS-CoV-2 proteome evolution to characterize the emerging variants of the virus and to suggest possible therapeutic strategies
SARS-CoV-2 is a new coronavirus responsible for the global COVID-19 pandemic; it was detected in China in December 2019 and has spread rapidly across the world. Our unit, with its specific expertise in structural bioinformatics and molecular modelling, has collaborated with epidemiology and molecular genetics groups to study the SARS-CoV-2 proteome and to suggest possible molecular strategies able to inhibit virus infection. All coronaviruses, including SARS-CoV-2, evolve and adapt to the host through the accumulation of mutations generated by characteristics of the virus RNA polymerase. This work can be divided into two parts: the first part focuses on
predicting the potential effects of the mutations on the functions of the SARS-CoV-2 Spike glycoprotein, whereas the second part focuses on suggesting possible therapeutic strategies. In particular, I performed docking analyses to study the possible mode and sites of interaction of inorganic polyphosphates with ACE2 and the SARS-CoV-2 RNA-dependent RNA polymerase (RdRp), because the molecular genetics group with whom we collaborate suggested that polyphosphates can enhance ACE2 proteasomal degradation and impair synthesis of viral RNA. In addition, I developed a pipeline to predict the most frequent sites of interaction between the Spike glycoprotein and neutralizing
monoclonal antibodies in order to propose more specific and selective therapeutic alternatives