
    Alignment of helical membrane protein sequences using AlignMe

    Few sequence alignment methods have been designed specifically for integral membrane proteins, even though these important proteins have distinct evolutionary and structural properties that might affect their alignments. Existing approaches typically consider membrane-related information either by using membrane-specific substitution matrices or by assigning distinct penalties for gap creation in transmembrane and non-transmembrane regions. Here, we ask whether favoring matching of predicted transmembrane segments within a standard dynamic programming algorithm can improve the accuracy of pairwise membrane protein sequence alignments. We tested various strategies using a specifically designed program called AlignMe. An updated set of homologous membrane protein structures, called HOMEP2, was used as a reference for optimizing the gap penalties. The best of the membrane-protein-optimized approaches were then tested on an independent reference set of membrane protein sequence alignments from the BAliBASE collection. When secondary structure (S) matching was combined with evolutionary information (using a position-specific substitution matrix (P)), in an approach we called AlignMePS, the resultant pairwise alignments were typically among the most accurate over a broad range of sequence similarities when compared to available methods. Matching transmembrane predictions (T) in addition to evolutionary information and secondary-structure predictions, in an approach called AlignMePST, generally reduces the accuracy of the alignments of closely related proteins in the BAliBASE set relative to AlignMePS, but may be useful in cases of extremely distantly related proteins for which sequence information is less informative. The open source AlignMe code is available at https://sourceforge.net/projects/alignme/ and at http://www.forrestlab.org, along with an online server and the HOMEP2 data set.
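
The idea of favoring matches between predicted transmembrane segments inside a standard dynamic programming alignment can be illustrated with a minimal sketch. This is not AlignMe's scoring scheme: the function name `align_score`, the identity-based match/mismatch scores, the single linear gap penalty, the `tm_bonus` value, and the use of `"M"` to mark predicted membrane positions are all assumptions made for illustration.

```python
def align_score(seq_a, seq_b, tm_a, tm_b,
                match=1.0, mismatch=-1.0, gap=-2.0, tm_bonus=0.5):
    """Global (Needleman-Wunsch) alignment score for two sequences.

    tm_a and tm_b are per-residue prediction strings of the same length as
    the sequences; 'M' marks a predicted transmembrane position (assumed
    encoding). Aligning two 'M' positions earns an extra tm_bonus.
    """
    n, m = len(seq_a), len(seq_b)
    score = [[0.0] * (m + 1) for _ in range(n + 1)]

    # Leading gaps: first row and column accumulate the linear gap penalty.
    for i in range(1, n + 1):
        score[i][0] = i * gap
    for j in range(1, m + 1):
        score[0][j] = j * gap

    for i in range(1, n + 1):
        for j in range(1, m + 1):
            s = match if seq_a[i - 1] == seq_b[j - 1] else mismatch
            if tm_a[i - 1] == "M" and tm_b[j - 1] == "M":
                s += tm_bonus  # reward matching predicted TM positions
            score[i][j] = max(score[i - 1][j - 1] + s,  # align residue pair
                              score[i - 1][j] + gap,    # gap in seq_b
                              score[i][j - 1] + gap)    # gap in seq_a
    return score[n][m]
```

Setting `tm_bonus=0.0` recovers a plain identity-scored alignment, which makes it easy to compare scores with and without the membrane term.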

    Folding and insertion thermodynamics of the transmembrane WALP peptide

    The anchor of most integral membrane proteins consists of one or several helices spanning the lipid bilayer. The WALP peptide, GWW(LA)_n(L)WWA, is a common model helix to study the fundamentals of protein insertion and folding, as well as helix-helix association in the membrane. Its structural properties have been illuminated in a large number of experimental and simulation studies. In this combined coarse-grained and atomistic simulation study, we probe the thermodynamics of a single WALP peptide, focusing on both the insertion across the water-membrane interface and folding in both water and a membrane. The potential of mean force characterizing the peptide's insertion into the membrane shows qualitatively similar behavior across peptides and three force fields. However, the Martini force field exhibits a pronounced secondary minimum for an adsorbed interfacial state, which may even become the global minimum, in contrast to both atomistic simulations and the alternative PLUM force field. Even though the two coarse-grained models reproduce the free energy of insertion of individual amino acid side chains, they both underestimate the corresponding value for the full peptide (as compared with atomistic simulations), hinting at cooperative physics beyond the residue level. Folding of WALP in the two environments indicates the helix as the most stable structure, though with different relative stabilities and chain-length dependence. Comment: 12 pages, 5 figures
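
The WALP sequence pattern quoted in the abstract can be expanded into explicit sequences with a few lines of code. The sketch below assumes that the parenthesized (L) denotes an optional leucine following the (LA) repeats; the function name `walp_sequence` and its parameters are hypothetical.

```python
def walp_sequence(n_repeats, extra_leu=True):
    """Expand the WALP pattern GWW(LA)_n(L)WWA into a one-letter sequence.

    n_repeats is the number of LA repeats; extra_leu controls whether the
    optional leucine (assumed reading of the '(L)' term) is included.
    """
    core = "LA" * n_repeats + ("L" if extra_leu else "")
    return "GWW" + core + "WWA"
```

With eight repeats and the extra leucine this gives a 23-residue peptide; with five repeats and no extra leucine it gives a 16-residue one.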

    Novel Strategies for Model-Building of G Protein-Coupled Receptors

    The G protein-coupled receptors (GPCRs) still constitute the most densely populated protein family, encompassing numerous disease-relevant drug targets. Consequently, medicinal chemistry is expected to pursue targets from this protein family, for which hits need to be generated and subsequently optimized towards viable clinical candidates for a variety of therapeutic areas. For the purpose of rationalizing structure-activity relationships within such optimization programs, structural information derived from the ligand's as well as the macromolecule's perspective is essential. While it is relatively straightforward to define pharmacophore hypotheses based on comparative modelling of structurally and biologically characterized low-molecular-weight ligands, a deeper understanding of the underlying molecular recognition event remains challenging, since experimentally derived structural data on GPCRs are extremely scarce when compared to, e.g., soluble enzymes. In this context, the protein modelling methodologies introduced, developed, optimized, and applied in this thesis provide structural models that are capable of assisting in the development of structural hypotheses on ligand-receptor complexes. As such they provide a valuable structural framework not only for a more detailed insight into ligand-GPCR interaction, but also for guiding the design process towards next-generation compounds which should display enhanced affinity. The model building procedure developed in this thesis systematically follows a hierarchical approach, sequentially generating a 1D topology, followed by a 2D topology that is finally converted into a 3D topology. The determination of a 1D topology is based on a compartmentalization of the linear amino acid sequence of a GPCR of interest into the extracellular, intracellular, and transmembrane sequence stretches.
    Chapter 3 of this study elaborates on the strengths and weaknesses of applying automated prediction tools for the purpose of identifying the transmembrane sequence domains. Once a 1D topology has been derived, a type of in-plane projection structure for the seven transmembrane helices can be derived with the aid of calculated vectorial property moments, yielding the 2D topology. Thorough bioinformatics studies revealed that only a consensus approach, based on a conceptual combination of different methods employing a carefully made selection of parameter sets, gave reliable results, emphasizing the danger of fully automating a GPCR modelling procedure. Chapter 4 describes a procedure to further expand the 2D topological findings into 3D space, exemplified on the human CCK-B receptor protein. This particular GPCR was chosen as the receptor of interest, since an enormous experimentally derived and structurally relevant data set was available. Within the computational refinement procedure of constructed GPCR models, major emphasis was laid on the explicit treatment of a non-isotropic solvent environment during molecular mechanics calculations (i.e. energy minimization and molecular dynamics simulations). The majority of simulations were therefore carried out in a tri-phasic solvent box accounting for a central lipid environment, flanked by two aqueous compartments mimicking the extracellular and cytoplasmic space. Chapter 5 introduces the reference compound set, comprising low-molecular-weight compounds modulating CCK receptors, that was used for validation of the generated models of the receptor protein. Chapter 6 describes how the generated model of the CCK-B receptor was subjected to intensive docking studies employing the compound series introduced in chapter 5. It turned out that by applying the DRAGHOME methodology, viable structural hypotheses on putative receptor-ligand complexes could be generated.
    Based on the methodology pursued in this thesis, a detailed model of the receptor binding site could be devised that accounts qualitatively for known structure-activity relationships as well as for results obtained by site-directed mutagenesis studies. The overall study presented in this thesis is primarily intended as a feasibility study on generating model structures of GPCRs by a conceptual combination of tailor-made bioinformatics techniques with the toolbox of protein modelling, exemplified on the human CCK-B receptor. The generated structures should be envisioned as models only, not necessarily providing a detailed image of reality. However, consistent models, when verified and refined against experimental data, deliver an extremely useful structural platform on which different scientific disciplines such as medicinal chemistry, molecular biology, and biophysics can effectively communicate.
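
One widely used example of a vectorial property moment for orienting helices is the hydrophobic moment, in which each residue contributes a vector whose length is its hydrophobicity and whose direction advances by a fixed angle per residue. The sketch below is an assumption about what such a calculation could look like, not the thesis's own procedure: it uses the Kyte-Doolittle hydropathy scale and an ideal α-helical step of 100° per residue, and the function name `hydrophobic_moment` is hypothetical.

```python
import math

# Kyte-Doolittle hydropathy values (one-letter amino acid codes).
KYTE_DOOLITTLE = {
    "A": 1.8, "R": -4.5, "N": -3.5, "D": -3.5, "C": 2.5,
    "Q": -3.5, "E": -3.5, "G": -0.4, "H": -3.2, "I": 4.5,
    "L": 3.8, "K": -3.9, "M": 1.9, "F": 2.8, "P": -1.6,
    "S": -0.8, "T": -0.7, "W": -0.9, "Y": -1.3, "V": 4.2,
}


def hydrophobic_moment(seq, delta_deg=100.0, scale=KYTE_DOOLITTLE):
    """Return (magnitude, direction in degrees) of the hydrophobic moment.

    Residue i is placed at angle i * delta_deg around the helix axis and
    contributes a vector of length scale[residue]; the moment is the
    vector sum over the segment.
    """
    x = y = 0.0
    for i, aa in enumerate(seq):
        angle = math.radians(i * delta_deg)
        x += scale[aa] * math.cos(angle)
        y += scale[aa] * math.sin(angle)
    magnitude = math.hypot(x, y)
    direction = math.degrees(math.atan2(y, x)) % 360.0
    return magnitude, direction
```

A uniform segment spanning a whole number of turns (for example 18 identical residues at 100° per residue, which is five full turns) has a moment of essentially zero, whereas an amphipathic helix gives a large moment pointing towards its hydrophobic face.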

    Prediction of transmembrane helix orientation in polytopic membrane proteins

    BACKGROUND: Membrane proteins make up as much as 30% of coding sequences within genomes. However, their structure determination lags behind that of soluble proteins because of experimental difficulties. Therefore, it is important to develop reliable computational methods to predict structures of membrane proteins. RESULTS: We present a method for prediction of the TM helix orientation, which is an essential step in ab initio modeling of membrane proteins. Our method is based on a canonical model of the heptad repeat originally developed for coiled coils. We identify the helical surface patches that interface with lipid molecules at an accuracy of about 88% from the sequence information alone, using an empirical scoring function LIPS (LIPid-facing Surface), which combines lipophilicity and conservation of residues in the helix. We test and discuss results of prediction of helix-lipid interfaces on 162 transmembrane helices from 18 polytopic membrane proteins and present predicted orientations of TM helices in the TRPV1 channel. We also apply our method to two structures of homologous cytochrome b(6)f complexes and find a discrepancy in the assignment of TM helices from subunits PetG, PetN and PetL. The results of LIPS calculations and analysis of packing and H-bonding interactions support the helix assignment found in the cytochrome b(6)f structure from the green alga but not the assignment of TM helices in the cyanobacterium b(6)f structure. CONCLUSION: LIPS calculations can be used for the prediction of helix orientation in ab initio modeling of polytopic membrane proteins. We also show with the example of two cytochrome b(6)f structures that our method can identify questionable helix assignments in membrane proteins. The LIPS server is available online at
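
As a toy illustration of ranking helix faces by combining lipophilicity with sequence variability, the sketch below groups positions into seven faces by their index modulo 7 and scores each face by the product of its mean lipophilicity and mean variability. This is not the published LIPS scoring function: the modulo-7 face assignment, the product score, the input value ranges, and the function name `rank_helix_faces` are all assumptions made for illustration.

```python
from collections import defaultdict


def rank_helix_faces(lipophilicity, variability):
    """Rank seven helix faces from most to least likely lipid-facing.

    lipophilicity and variability are per-residue lists of equal length
    (assumed non-negative; higher variability means lower conservation).
    Position i is assigned to face i % 7 (a simplified heptad model).
    Returns (faces sorted by descending score, {face: score}).
    """
    lip_by_face = defaultdict(list)
    var_by_face = defaultdict(list)
    for i, (lip, var) in enumerate(zip(lipophilicity, variability)):
        lip_by_face[i % 7].append(lip)
        var_by_face[i % 7].append(var)

    scores = {}
    for face in lip_by_face:
        mean_lip = sum(lip_by_face[face]) / len(lip_by_face[face])
        mean_var = sum(var_by_face[face]) / len(var_by_face[face])
        # Hypothetical score: lipophilic and variable faces rank highest.
        scores[face] = mean_lip * mean_var

    ranking = sorted(scores, key=scores.get, reverse=True)
    return ranking, scores
```

The design choice here reflects the general expectation that lipid-exposed residues tend to be both lipophilic and less conserved than residues buried at helix-helix interfaces.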

    MemBrain: Improving the Accuracy of Predicting Transmembrane Helices

    Prediction of transmembrane helices (TMH) in α-helical membrane proteins provides valuable information about the protein topology when high-resolution structures are not available. Many predictors have been developed based on either amino acid hydrophobicity scales or purely statistical approaches. While these predictors perform reasonably well in identifying the number of TMHs in a protein, they are generally inaccurate in predicting the ends of TMHs, or TMHs of unusual length. To improve the accuracy of TMH detection, we developed a machine-learning-based predictor, MemBrain, which integrates a number of modern bioinformatics approaches including sequence representation by multiple sequence alignment matrix, the optimized evidence-theoretic K-nearest neighbor prediction algorithm, fusion of multiple prediction window sizes, and classification by dynamic threshold. MemBrain demonstrates an overall improvement of about 20% in prediction accuracy, particularly in predicting the ends of TMHs and TMHs that are shorter than 15 residues. It also has the capability to detect N-terminal signal peptides. The MemBrain predictor is a useful sequence-based analysis tool for functional and structural characterization of helical membrane proteins; it is freely available at http://chou.med.harvard.edu/bioinf/MemBrain/.

    Knowledge-Based Potential for Positioning Membrane-Associated Structures and Assessing Residue-Specific Energetic Contributions

    The complex hydrophobic and hydrophilic milieus of membrane-associated proteins pose experimental and theoretical challenges to their understanding. Here we produce a non-redundant database to compute knowledge-based asymmetric cross-membrane potentials from the per-residue distributions of Cβ, Cγ and functional group atoms. We predict transmembrane and peripherally associated regions from genomic sequence and position peptides and protein structures relative to the bilayer (available at http://www.degradolab.org/ez). The pseudo-energy topological landscapes underscore positional stability and functional mechanisms, demonstrated here for antimicrobial peptides, transmembrane proteins, and viral fusion proteins. Moreover, experimental effects of point mutations on the relative ratio changes of dual-topology proteins are quantitatively reproduced. The functional group potential and the membrane-exposed residues display the largest energetic changes, enabling the detection of native-like structures among decoys. Hence, focusing on the uniqueness of membrane-associated proteins and peptides, we quantitatively parameterize their cross-membrane propensity, thus facilitating structural refinement, characterization, prediction and design.

    Experimental Determination of the Topology of the HIV-1 gp41 C-Terminal Tail

    The C-terminal tail (CTT) of HIV gp41 has been traditionally viewed as a cytoplasmic domain. Genetic studies demonstrating functional interactions between the CTT and various intracellular partners have implicitly reinforced this view. However, antibody neutralization data and biochemical studies have suggested that the CTT is, or can be, externally localized under certain conditions. Additionally, other studies have demonstrated that the CTT is dispensable for in vitro virus replication. After nearly three decades of HIV research, the function and structure of the CTT remain elusive. Our goals, then, were twofold: (i) to determine the overall conservation of the CTT in an attempt to provide an understanding of the functional and structural relevance of the CTT; and (ii) to provide an experimental topological map of the CTT in an attempt to understand and align observed CTT topology(ies) with the functional necessity of a cytoplasmic CTT. We believe that we made significant contributions to the understanding of CTT topology and its relationship to currently published functional studies. The initial studies demonstrated that the CTT sequence is conserved at a level that is intermediate between the highly variable gp120 region and the relatively conserved gp41 ectodomain. Additionally, physicochemical and structural properties of CTT sequences were found to be conserved in spite of the relatively high sequence variability. These studies demonstrated for the first time that the CTT sequence, while highly variable, contains highly conserved structural and chemical properties that suggest a functional requirement for the CTT. Topology studies of the CTT indicated that the topology of the CTT can differ between the surface of Env-expressing cells and viral particles. Additionally, dynamic rearrangement of the CTT was observed as a function of antibody neutralization.
    These findings prompted a theoretical study of gp41 CTT predicted topology and the proposal of a topological model that we believe is consistent with all published studies regarding the localization of the CTT.