172 research outputs found

    Toward biologically realistic computational membrane protein structure prediction and design

    Membrane proteins function as gates and checkpoints that control the transit of molecules and information across the lipid bilayer. Understanding their structures will provide mechanistic insights into how to keep cells healthy and defend against disease. However, experimental difficulties have slowed the progress of structure determination. Previous work has demonstrated the promise of computational modeling for elucidating membrane protein structures. A remaining challenge is to model proteins coupled with the heterogeneous cell membrane environment. In the first half of this dissertation, I detail the development, testing and integration of a biologically realistic implicit lipid bilayer model in Rosetta. First, I describe the initial iteration of the implicit model that captures the anisotropic structure, shape of water-filled pores, and nanoscale dimensions of membranes with different lipid compositions. Second, I explain my approach to energy function benchmarking and optimization given the challenge of sparse and low-quality experimental data. Third, I outline the second generation that incorporates a new electrostatics and pH model. All of these developments have advanced the accuracy of Rosetta membrane protein structure prediction and design. In the second half of this dissertation, I investigate three challenging biological and engineering applications involving membrane proteins. In the first application, I examine mutation-induced stability changes in the integral membrane zinc metalloprotease ZMPSTE24: a protein with a large voluminous chamber that is not captured by current implicit models. In the second application, I model interactions between the SERCA2a calcium pump and the regulatory transmembrane protein phospholamban: a key membrane protein-protein interaction implicated in the heart’s response to adrenaline.
Finally, I explore the challenge of membrane protein design to engineer a self-assembling transmembrane protein pore for nanotechnology applications. These applications highlight the next steps required to improve computational membrane protein modeling tools. Taken together, my work in both methods development and applications has advanced our understanding of, and ability to model and design, membrane protein structures.

    Technology development for the over-expression, purification and crystallisation of human membrane proteins

    Currently, the field of mammalian membrane protein structural biology is in its infancy. Existing technologies and experiences have shown that it is possible to obtain the structures of mammalian membrane proteins if sufficient work and thought have been invested. However, there is still an urgent need to develop new methodologies and approaches to improve all aspects of this important area of biological research. Here, a series of novel technologies for the overproduction, purification and crystallisation of human membrane proteins are described which have been tested with a representative member from each of the G-protein coupled receptor (adenosine 2a receptor (A2aR)) and membrane enzyme (sterol isomerase (SI)) superfamilies. The methylotrophic yeast Pichia pastoris is an excellent host cell for the overproduction of recombinant proteins including membrane proteins of mammalian origin. However, the commercially available expression vectors are far from what is required to maximise the production levels as well as simplify the detergent extraction and purification of human membrane proteins. Here, a series of related expression constructs were made that had different combinations of tags at both ends of the recombinant protein. The final optimised expression vectors had a C3 protease-iLOV-biotin acceptor-His10 (CLBH) tag fused to the C-terminus of the recombinant protein. The -CLBH vectors gave high-level production of both test proteins (one Nin – hSI; one Nout – hA2aR) that could be rapidly purified to homogeneity using a generic protocol. The position of the His10 tag did not affect the expression level of the recombinant protein. In contrast, fusion of the biotin acceptor domain to the C-terminus of the recombinant protein increased its expression by a factor of 2 to 4.
The biotin acceptor domain could also be fully biotinylated in vitro using recombinantly expressed biotin ligase, allowing purification/immobilisation of the target protein with streptavidin beads. Removal of the expression/purification tags from the recombinant proteins with C3 protease occurred more efficiently than when TEV protease was used. An optimised protocol was developed that gave maximal production of our target proteins in fermenter culture at an induction temperature of 22°C. Care was taken to find a methanol feed rate that gave the highest levels of protein production without causing the accumulation of excess methanol in the culture (which is known to be toxic to the yeast). Using this protocol it was possible to make both hSI and hA2aR with a production level >10 mg of recombinant protein per litre of culture. As most MPs are colourless, target protein identification is usually performed by methods such as radioligand binding and/or Western blotting. However, these techniques can be time-consuming, use a lot of protein and do not give any information on the aggregation state of the protein in detergent solution. Previously, it has been shown that the processes of identifying and analysing membrane proteins in detergent solution can be accelerated by attaching green fluorescent protein to the C-terminus of the recombinant MP. Here, the potential of the recently described iLOV fluorescence tag for membrane protein applications was assessed. iLOV was shown to be a useful tool for optimising processes such as yeast clonal selection, protein production in fermenter culture, detergent and construct screening as well as tracking recombinant MPs through the purification process. Of note, the iLOV tag allowed a direct assessment of the stability and dispersity state of both target MPs in a range of detergents by fluorescence size exclusion chromatography (FSEC).
Using this approach, it was shown that wild-type hA2aR solubilised using a combination of dodecyl-β-D-maltoside (DDM) and cholesteryl-hemisuccinate (CHS) aggregated during purification on a Ni2+ column. Furthermore, it was shown that the hA2aR agonist-conformationally-fixed mutant Rag23 is stable in DDM without any CHS present. Moreover, Rag23 was found to be monodisperse in a series of short-chain detergents (decyl-β-D-maltoside, nonyl-β-D-maltoside (NM) and β-octylglucoside), suggesting that this mutant is well-suited to structural studies. SI was remarkably robust in short-chain detergents, demonstrating a reasonable level of stability in the short-chain detergent NM. The FSEC experiments showed that wild-type SI has considerably higher intrinsic stability than native hA2aR, suggesting that membrane enzymes will prove to be more amenable to structural analysis than GPCRs. Rag23 and SI were both purified to homogeneity in a simple four-step procedure: i) Ni2+ purification, ii) cleavage with C3 protease, iii) reverse Ni2+ purification and iv) gel-filtration chromatography. A buffer/salt screen was devised that allowed those conditions where SI had maximal thermostability in detergent solution to be identified. SI was found to have greatest stability in sodium phosphate buffer at acidic pH. Using this information, it was possible to purify monodisperse SI in DM, suggesting that this protein may make an excellent candidate for structural studies too. Crystallisation trials with SI were performed using the commercially available sparse matrix screen MemSys/MemStart. In addition, a lipidic-sponge phase sparse-matrix crystallisation screen that was developed in collaboration with Prof. Richard Neutze (University of Chalmers, Sweden) was tested using SI. Cholesterol could be incorporated into all of the sponges that make up the screen up to a concentration of 10%. (This is important as the activity of many mammalian membrane proteins is cholesterol-dependent.)
To date, no diffracting crystals of SI have been obtained with either the conventional or lipidic-sponge phase crystallisation approaches. In short, a series of novel technologies/methodologies have been developed that will act as a platform for future efforts to solve the structures of a wide range of human membrane proteins.

    Exploiting Advanced Methods for Membrane Protein Structure Prediction

    Recent strides in computational structural biology have opened up an opportunity to understand previously uncharacterised proteins. The under-representation of transmembrane proteins in the Protein Data Bank highlights the need to apply new and advanced bioinformatics methods to shed light on their structure and function. A protein’s structural information is crucial to understand its function and evolution. Currently, there is only experimental structural data for a tiny fraction of proteins. For instance, membrane proteins are encoded by 30% of the protein-coding genes of the human genome, but they only have a 3.5% representation in the Protein Data Bank (PDB). Membrane protein families are particularly poorly understood due to experimental difficulties, such as over-expression, which can result in toxicity to host cells, as well as difficulty in finding a suitable membrane mimetic to reconstitute the protein. Additionally, membrane proteins are much less conserved across species compared to water-soluble proteins, making sequence-based homologue identification a challenge, and in turn rendering homology modelling of these proteins more difficult. Until the structure of poorly characterised protein families can be elucidated experimentally, ab initio protein modelling can be used to predict a fold allowing for structure based function inferences. Such methods have made significant strides recently due to the availability of contact predictions, with these methods addressing larger targets than conventional fragment-assembly-based ab initio methods. 
This study initially focusses on the structure and function of transmembrane proteins, specifically in the process of autophagosome construction, and demonstrates how covariance prediction data have multiple roles in modern structural bioinformatics: not just by acting as restraints for model making and serving for validation of the final models but by predicting domain boundaries and revealing the presence of cryptic internal repeats not evidenced by sequence analysis. Furthermore, we characterised a contact map feature characteristic of a re-entrant helix which may in future allow detection of this feature in other protein families. The recent innovations in computational structural biology were employed further, giving rise to an opportunity to revise our current understanding of the structure and function of clinically important proteins. Through the modelling of the transmembrane Pfam families and subsequent mining of their structural libraries we identified the human Oca2 protein as a protein of interest. Oca2 is located on mature melanosomal membranes and mutations of Oca2 can result in a form of oculocutaneous albinism which is the most prevalent and visually identifiable form of albinism. Sequence analysis predicts Oca2 to be a member of the SLC13 transporter family but it has not been classified into any existing SLC families. The modelling of Oca2 with AlphaFold2 and other advanced methods shows that, like SLC13 members, it consists of a scaffold and transport domain and displays a pseudo inverted repeat topology that includes re-entrant loops. This finding contradicts the prevailing consensus view of its topology. In addition to the scaffold and transport domains, the presence of a cryptic GOLD domain is revealed that is likely responsible for its trafficking from the endoplasmic reticulum to the Golgi prior to localisation at the melanosomes and possesses known glycosylation sites.
Analysis of the putative ligand binding site of the model shows the presence of highly conserved key asparagine residues that suggest Oca2 may be a Na+/dicarboxylate symporter. Known critical pathogenic mutations map to structural features present in the repeat regions that form the transport domain. Exploiting the AlphaFold2 multimeric modelling protocol in combination with conventional homology modelling allowed the building of a plausible homodimer in both an inward- and outward-facing conformation, supporting an elevator-type transport mechanism.

    Cell-free expression and molecular modeling of the γ-secretase complex and G-protein-coupled receptors

    Alzheimer’s disease (AD), which was first reported more than a century ago by Alzheimer, is one of the most common forms of dementia and affects >30 million people globally (>8 million in Europe). The origin and pathogenesis of AD are poorly understood and there is no cure available for the disease. AD is characterized by the accumulation of senile plaques composed of amyloid beta peptides (Aβ 37-43), which are formed by the gamma secretase (GS) complex by cleaving amyloid precursor protein. Therefore GS can be an attractive drug target. Since GS processes several other substrates like Notch, CD44 and Cadherins, nonspecific inhibition of GS has many side effects. Due to the lack of a crystal structure of GS, which is attributed to the extreme difficulties in purifying it, molecular modeling can be useful to understand its architecture. So far, only low-resolution cryo-EM structures of the complex have been solved, which provide only a rough structure of the complex at 12-15 Å resolution. Furthermore, the activity of GS in vitro can be achieved by means of cell-free (CF) expression. GS comprises catalytic subunits, namely presenilins, and supporting elements containing Pen-2, Aph-1 and Nicastrin. The origin of AD is hidden in the regulated intramembrane proteolysis (RIP), which is involved in various physiological processes and also in leukemia. So far growth factors, cytokines, receptors, viral proteins, cell adhesion proteins, signal peptides and GS have been shown to undergo RIP. During RIP, the target proteins undergo extracellular shedding and intramembrane proteolysis. This thesis is based on molecular modeling, molecular dynamics (MD) simulations, cell-free (CF) expression, mass spectrometry, NMR, crystallization, activity assays, etc. of the components of the GS complex and G-protein coupled receptors (GPCRs).
First I validated the NMR structure of PS1 CTF in detergent micelles and lipid bilayers using coarse-grained MD simulations with the MARTINI force field implemented in GROMACS. CTF was simulated in DPC micelles and in DPPC and DLPC lipid bilayers. Starting from a random configuration of detergent and lipids, a micelle and a lipid bilayer were formed, respectively, in the presence of CTF, and CTF was oriented properly to the micelle and bilayer during the simulation. Around DPC molecules formed a micelle around CTF, in agreement with the experimental results, in which 80-85 DPC molecules are required to form micelles. The structure obtained in DPC was similar to the NMR structure but differed in the bilayer simulations, which showed the possibility of substrate docking at the conserved PAL motif. Simulations of CTF in an implicit membrane (IMM1) in CHARMM yielded a similar structure to that from coarse-grained MD. I performed cell-free expression optimization, crystallization and NMR spectroscopy of Pen-2 in various detergent micelles. Additionally, Pen-2 was modeled by a combination of the Rosetta membrane ab initio method, HHpred distant homology modeling and incorporation of NMR constraints. The models were validated by all-atom and coarse-grained MD simulations both in detergent micelles and POPC/DPPC lipid bilayers using the MARTINI force field. The GS operon consisting of all four subunits was co-expressed in CF and purified. The presence of GS subunits after pull-down with Aph-1 was determined by western blotting (Pen-2) and mass spectrometry (Presenilin-1 and Aph-1). I also studied interactions of especially PS1 CTF, APP and NTF by docking and MD. I also made models and interfaces of Pen-2 with PS1 NTF, checked their stability by MD simulations and compared them with experimental results.
The goal is to model the interfaces between GS subunits using molecular modeling approaches based on available experimental data like cross-linking, mutations and the NMR structures of the C-terminal fragment of PS1 and the transmembrane part of APP. The obtained interfaces of GS subunits may explain its catalysis mechanism, which can be exploited for novel lead design. Due to the lack of a crystal/NMR structure of the GS subunits except the PS1 CTF, it is not possible to predict the effect of mutations in terms of APP cleavage. So I also developed a sequence-based machine learning approach using a support vector machine to predict the effect of PS1 CTF L383 mutations in terms of the Aβ40/Aβ42 ratio with 88% accuracy. Mutational data derived from the Molgen database of Presenilin 1 mutations were used for training. GPCRs (also called 7TM receptors) form a large superfamily of membrane proteins, which can be activated by small molecules, lipids, hormones, peptides, light, pain, taste and smell etc. Although 50% of the drugs on the market target GPCRs, only few are targeted therapeutically. Such a wide range of targets is due to the involvement of GPCRs in signaling pathways related to many diseases, i.e. dementia (like Alzheimer's disease), metabolic (like diabetes) including endocrinological disorders, immunological including viral infections, cardiovascular, inflammatory, senses disorders, pain and cancer. Cannabinoid and adrenergic receptors belong to the class A (similar to rhodopsin) GPCRs. Docking of agonists and antagonists to CB1 and CB2 cannabinoid receptors revealed the importance of a centrally located rotamer toggle switch, and its possible role in the mechanism of agonist/antagonist recognition. The switch is composed of two residues, F3.36 and W6.48, located on opposite transmembrane helices TM3 and TM6 in the central part of the membranous domain of cannabinoid receptors. The CB1 and CB2 receptor models were constructed based on the adenosine A2A receptor template.
The two best scored conformations of each receptor were used for the docking procedure. In all poses (ligand-receptor conformations) characterized by the lowest ligand-receptor intermolecular energy and free energy of binding the ligand type matched the state of the rotamer toggle switch: antagonists maintained an inactive state of the switch, whereas agonists changed it. In case of agonists of β2AR, the (R,R) and (S,S) stereoisomers of fenoterol, the molecular dynamics simulations provided evidence of different binding modes while preserving the same average position of ligands in the binding site. The (S,S) isomer was much more labile in the binding site and only one stable hydrogen bond was created. Such dynamical binding modes may also be valid for ligands of cannabinoid receptors because of the hydrophobic nature of their ligand-receptor interactions. However, only very long molecular dynamics simulations could verify the validity of such binding modes and how they affect the process of activation. Human N-formyl peptide receptors (FPRs) are G protein-coupled receptors (GPCRs) involved in many physiological processes, including host defense against bacterial infection and resolving inflammation. The three human FPRs (FPR1, FPR2 and FPR3) share significant sequence homology and perform their action via coupling to Gi protein. Activation of FPRs induces a variety of responses, which are dependent on the agonist, cell type, receptor subtype, and also species involved. FPRs are expressed mainly by phagocytic leukocytes. Together, these receptors bind a large number of structurally diverse groups of agonistic ligands, including N-formyl and nonformyl peptides of different composition, that chemoattract and activate phagocytes. For example, N-formyl-Met-Leu-Phe (fMLF), an FPR1 agonist, activates human phagocyte inflammatory responses, such as intracellular calcium mobilization, production of cytokines, generation of reactive oxygen species, and chemotaxis. 
This ligand can efficiently activate the major bactericidal neutrophil functions and it was one of the first characterized bacterial chemotactic peptides. Whereas fMLF is by far the most frequently used chemotactic peptide in studies of neutrophil functions, atomistic descriptions for fMLF-FPR1 binding mode are still scarce mainly because of the absence of a crystal structure of this receptor. Elucidating the binding modes may contribute to designing novel and more efficient non-peptide FPR1 drug candidates. Molecular modeling of FPR1, on the other hand, can provide an efficient way to reveal details of ligand binding and activation of the receptor. However, recent modelings of FPRs were confined only to bovine rhodopsin as a template. To locate specific ligand-receptor interactions based on a more appropriate template than rhodopsin we generated the homology models of FPR1 using the crystal structure of the chemokine receptor CXCR4, which shares over 30% sequence identity with FPR1 and is located in the same γ branch of phylogenetic tree of GPCRs (rhodopsin is located in α branch). Docking and model refinement procedures were pursued afterward. Finally, 40 ns full-atom MD simulations were conducted for the Apo form as well as for complexes of fMLF (agonist) and tBocMLF (antagonist) with FPR1 in the membrane. Based on locations of the N- and C-termini of the ligand the FPR1 extracellular pocket can be divided into two zones, namely, the anchor and activation regions. The formylated M1 residue of fMLF bound to the activation region led to a series of conformational changes of conserved residues. Internal water molecules participating in extended hydrogen bond networks were found to play a crucial role in transmitting the agonist-receptor interactions. A mechanism of initial steps of the activation concurrent with ligand binding is proposed. 
I accurately predicted the structure and ligand binding pose of dopamine receptor 3 (RMSD to the crystal structure: 2.13 Å) and chemokine receptor 4 (CXCR4, RMSD to the crystal structure: 3.21 Å) in the GPCR-Dock 2010 competition. The homology model of the dopamine receptor 3 was 8th best overall in the competition.

    The Molecular Basis of Familial Danish Dementia


    Mass & secondary structure propensity of amino acids explain their mutability and evolutionary replacements

    Why is an amino acid replacement in a protein accepted during evolution? The answer given by bioinformatics relies on the frequency of change of each amino acid by another one and the propensity of each to remain unchanged. We propose that these replacement rules are recoverable from the secondary structural trends of amino acids. A distance measure between high-resolution Ramachandran distributions reveals that structurally similar residues coincide with those found in substitution matrices such as BLOSUM: Asn ↔ Asp, Phe ↔ Tyr, Lys ↔ Arg, Gln ↔ Glu, Ile ↔ Val, Met → Leu; with Ala, Cys, His, Gly, Ser, Pro, and Thr as structurally idiosyncratic residues. We also found a high average correlation ($\overline{R} = 0.85$) between thirty amino acid mutability scales and the mutational inertia ($I_X$), which measures the energetic cost weighted by the number of observations at the most probable amino acid conformation. These results indicate that amino acid substitutions follow two optimally-efficient principles: (a) amino acid interchangeability privileges secondary structural similarity, and (b) amino acid mutability depends directly on biosynthetic energy cost, and inversely on frequency. These two principles are the underlying rules governing the observed amino acid substitutions. © 2017 The Author(s)

    Markov field models of molecular kinetics

    Computer simulations such as molecular dynamics (MD) provide a possible means to understand protein dynamics and mechanisms on an atomistic scale. The resulting simulation data can be analyzed with Markov state models (MSMs), yielding a quantitative kinetic model that, e.g., encodes state populations and transition rates. However, the larger an investigated system, the more data is required to estimate a valid kinetic model. In this work, we show that this scaling problem can be escaped when decomposing a system into smaller ones, leveraging weak couplings between local domains. Our approach, termed independent Markov decomposition (IMD), is a first-order approximation neglecting couplings, i.e., it represents a decomposition of the underlying global dynamics into a set of independent local ones. We demonstrate that for truly independent systems, IMD can reduce the sampling by three orders of magnitude. IMD is applied to two biomolecular systems. First, synaptotagmin-1 is analyzed, a rapid calcium switch from the neurotransmitter release machinery. Within its C2A domain, local conformational switches are identified and modeled with independent MSMs, shedding light on the mechanism of its calcium-mediated activation. Second, the catalytic site of the serine protease TMPRSS2 is analyzed with a local drug-binding model. Equilibrium populations of different drug-binding modes are derived for three inhibitors, mirroring experimentally determined drug efficiencies. IMD is subsequently extended to an end-to-end deep learning framework called iVAMPnets, which learns a domain decomposition from simulation data and simultaneously models the kinetics in the local domains. We finally classify IMD and iVAMPnets as Markov field models (MFM), which we define as a class of models that describe dynamics by decomposing systems into local domains. 
Overall, this thesis introduces a local approach to Markov modeling that enables quantitative assessment of the kinetics of large macromolecular complexes, opening up possibilities to tackle current and future computational molecular biology questions.
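
The independence assumption described in this abstract can be illustrated with a toy calculation. The sketch below is not the thesis's IMD implementation; it only shows the standard fact that, if two subsystems evolve independently, the transition matrix of the joint system is the Kronecker product of the local transition matrices, so the joint stationary distribution factorises. The matrices `T_A` and `T_B` and the helper names `kron` and `stationary` are hypothetical, chosen for illustration.

```python
def kron(a, b):
    """Kronecker product of two square matrices given as lists of rows."""
    na, nb = len(a), len(b)
    return [
        [a[i][j] * b[k][l] for j in range(na) for l in range(nb)]
        for i in range(na)
        for k in range(nb)
    ]


def stationary(t, iters=2000):
    """Stationary distribution of a row-stochastic matrix via power iteration."""
    n = len(t)
    pi = [1.0 / n] * n
    for _ in range(iters):
        pi = [sum(pi[i] * t[i][j] for i in range(n)) for j in range(n)]
    return pi


# Two hypothetical, independent two-state subsystems (toy numbers, not data).
T_A = [[0.9, 0.1], [0.2, 0.8]]
T_B = [[0.7, 0.3], [0.4, 0.6]]

# Joint 4-state model; joint state index = 2 * (state of A) + (state of B).
T_global = kron(T_A, T_B)

pi_A = stationary(T_A)
pi_B = stationary(T_B)
pi_global = stationary(T_global)

# For independent subsystems the joint distribution is the product of the local ones.
pi_product = [pa * pb for pa in pi_A for pb in pi_B]
```

With N independent two-state subsystems, the joint model has 2^N states while the local models together have only 2N, which is one way to see why estimating local models separately can need far less data.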

    Protein Structure

    Since the dawn of recorded history, and probably even before, men and women have been grasping at the mechanisms by which they themselves exist. Only relatively recently did this grasp yield anything of substance, and only within the last several decades did the proteins play a pivotal role in this existence. In this exposé on the topic of protein structure some of the current issues in this scientific field are discussed. The aim is that a non-expert can gain some appreciation for the intricacies involved, and for the current state of affairs. The expert meanwhile, we hope, can gain a deeper understanding of the topic.

    Investigating the protein subcomplexes from a conjugative Type IV Secretion System

    Type IV secretion systems (T4SSs) are versatile nanomachines that enable the efficient transport of substrates in bacteria. In general, they are formed from two major membrane-embedded subassemblies: an outer membrane core complex (OMCC) and an inner membrane complex (IMC). The conjugative T4SS encoded by the F plasmid is of particular interest due to its clinical relevance, as it facilitates the spread of antibiotic resistance amongst bacterial populations. Despite its importance, atomic details of the F-T4SS structure and protein-protein interactions were rudimentary, which in turn precluded a thorough understanding of how conjugation is orchestrated. Therefore, this thesis aimed to improve knowledge on the F-T4SS by studying the structure of the F-OMCC and investigating other proteins the complex may interact with. After optimising the detergent solubilisation of the F-OMCC expressed from the pED208 F-like plasmid, and improving the purification of the complex, a cryo-EM dataset was collected. Using single particle analysis, the structure was solved with an overall resolution of 3.3 Å. The F-OMCC is formed from two concentric rings which have two distinct symmetries. The outer ring adopts 13-fold symmetry whereas the inner ring showed 17-fold symmetry; together they form a 2.1 MDa complex. The atomic models of TraB, TraK and TraV were built into the structure, and they revealed a unique stoichiometric arrangement. Interestingly, TraV and TraK proteins were found to adopt two different conformations within the outer ring. TraV and TraB were found to accommodate the symmetry mismatch by existing in both F-OMCC rings, and also appeared to confer flexibility. This makes the F-OMCC a dynamic complex, which is likely to have important implications in the pilus and T4SS activity during conjugation. The interactions between the F-OMCC and other Tra/Trb proteins were also investigated to decipher how the concerted dynamics of the pilus may be connected to the complex.
A potential interaction between F-OMCC and the proteins TraH and TraN was observed by pull-down assays. Furthermore, initial work on TraG found that it seems to assemble as a higher-order oligomer in solution. The results are reminiscent of a hexameric protein, which may be functionally important. Together, the findings of this thesis reveal novel insights into the F-T4SS and its subassemblies. The approach used to purify the F-OMCC and study the interactions will act as the basis of future work on the F-T4SS and is directly applicable to the other protein complexes within the conjugative nanomachine.