
    A biologically-validated HCV E1E2 heterodimer structural model

    The design of vaccine strategies and the development of drugs targeting the early stages of Hepatitis C virus (HCV) infection are hampered by the lack of structural information about its surface glycoproteins E1 and E2, the two constituents of the HCV entry machinery. Despite the recent crystal structures of truncated versions of both proteins, a complete picture of the E1E2 complex is still missing. Here we combined deep computational analysis of E1E2 secondary, tertiary and quaternary structure with functional and immunological mutational analysis across E1E2 in order to propose an in silico model for the ectodomain of the E1E2 heterodimer. Our model describes the E1-E2 ectodomain dimerization interfaces, provides a structural explanation of E1 and E2 immunogenicity and sheds light on the molecular processes and disulfide bridge isomerization underlying the conformational changes required for fusion. Comprehensive alanine mutational analysis across 553 residues of E1E2 also identified the epitope maps of diverse mAbs and the disulfide connectivity underlying the native E1E2 conformation. The predicted structure unveils the E1 and E2 structures in complex, thus representing a step towards the rational design of immunogens and drugs inhibiting HCV entry.

    Searching for New Z-DNA/Z-RNA Binding Proteins Based on Structural Similarity to Experimentally Validated Zα Domain.

    Z-DNA and Z-RNA are functionally important left-handed structures of nucleic acids, which play a significant role in several molecular and biological processes including DNA replication, gene expression regulation and viral nucleic acid sensing. Most proteins that have been proven to interact with Z-DNA/Z-RNA contain the so-called Zα domain, which is structurally well conserved. To date, only eight proteins with a Zα domain have been described, within a few organisms (including human, mouse, Danio rerio, Trypanosoma brucei and some viruses). Therefore, this paper aimed to search for new Z-DNA/Z-RNA binding proteins in the complete PDB structures database and among the AlphaFold2 protein models. A structure-based similarity search found 14 proteins with a highly similar Zα domain structure among experimentally defined proteins and 185 proteins with a putative Zα domain among the AlphaFold2 models. Structure-based alignment and molecular docking confirmed high functional conservation of amino acids involved in Z-DNA/Z-RNA, suggesting that Z-DNA/Z-RNA recognition may play an important role in a variety of cellular processes.

    From in vitro evolution to protein structure

    At the nanoscale, the machinery of life is mainly composed of macromolecules and macromolecular complexes that, through their shapes, create a network of interconnected mechanisms of biological processes. The relationship between the shape and function of a biological molecule is the foundation of structural biology, which aims at studying the structure of a protein or a macromolecular complex to unveil the molecular mechanism through which it exerts its function. What about the reverse: is it possible to deduce a protein's structure by exploiting the function for which it was naturally selected? To this aim we developed a method, called CAMELS (Coupling Analysis by Molecular Evolution Library Sequencing), able to obtain the structural features of a protein from an artificial selection based on that protein's function. With CAMELS we tried to reconstruct the TEM-1 beta lactamase fold exclusively by generating and sequencing large libraries of mutational variants. Theoretically, with this method it is possible to reconstruct the structure of a protein regardless of the species of origin or the phylogenetic time of emergence, provided a functional phenotypic selection for the protein is available. CAMELS allows us to obtain protein structures without needing to purify the protein beforehand.

    Nat Struct Mol Biol

    β-sheet proteins carry out critical functions in biology, and hence are attractive scaffolds for computational protein design. Despite this potential, de novo design of all-β-sheet proteins from first principles lags far behind the design of all-α or mixed-αβ domains owing to their non-local nature and the tendency of exposed β-strand edges to aggregate. Through study of loops connecting unpaired β-strands (β-arches), we have identified a series of structural relationships between loop geometry, side chain directionality and β-strand length that arise from hydrogen bonding and packing constraints on regular β-sheet structures. We use these rules to de novo design jellyroll structures with double-stranded β-helices formed by eight antiparallel β-strands. The nuclear magnetic resonance structure of a hyperthermostable design closely matched the computational model, demonstrating accurate control over the β-sheet structure and loop geometry. Our results open the door to the design of a broad range of non-local β-sheet protein structures.
    Funding: Howard Hughes Medical Institute; R35 GM125034 (NIGMS, NIH); S10 OD018455 (Office of the Director). PMID 30374087; PMCID PMC6219906.

    The Proteomic Code: a molecular recognition code for proteins

    Background: The Proteomic Code is a set of rules by which information in genetic material is transferred into the physico-chemical properties of amino acids. It determines how individual amino acids interact with each other during folding and in specific protein-protein interactions. The Proteomic Code is part of the redundant Genetic Code. Review: The 25-year-old history of this concept is reviewed from the first independent suggestions by Biro and Mekler, through the works of Blalock, Root-Bernstein, Siemion, Miller and others, followed by the discovery of a Common Periodic Table of Codons and Nucleic Acids in 2003 and culminating in the recent conceptualization of partial complementary coding of interacting amino acids as well as the theory of the nucleic acid-assisted protein folding. Methods and conclusions: A novel cloning method for the design and production of specific, high-affinity-reacting proteins (SHARP) is presented. This method is based on the concept of proteomic codes and is suitable for large-scale, industrial production of specifically interacting peptides.

    Frustration in Biomolecules

    Biomolecules are the prime information processing elements of living matter. Most of these inanimate systems are polymers that compute their structures and dynamics using as input seemingly random character strings of their sequence, following which they coalesce and perform integrated cellular functions. In large computational systems with finite interaction codes, the appearance of conflicting goals is inevitable. Simple conflicting forces can lead to quite complex structures and behaviors, leading to the concept of "frustration" in condensed matter. We present here some basic ideas about frustration in biomolecules and how the frustration concept leads to a better appreciation of many aspects of the architecture of biomolecules, and of how structure connects to function. These ideas are simultaneously both seductively simple and perilously subtle to grasp completely. The energy landscape theory of protein folding provides a framework for quantifying frustration in large systems and has been implemented at many levels of description. We first review the notion of frustration from the areas of abstract logic and its uses in simple condensed matter systems. We then discuss how the frustration concept applies specifically to heteropolymers, testing folding landscape theory in computer simulations of protein models and in experimentally accessible systems. Studying the aspects of frustration averaged over many proteins provides ways to infer energy functions useful for reliable structure prediction. We discuss how frustration affects folding, how a large part of the biological functions of proteins are related to subtle local frustration effects and how frustration influences the appearance of metastable states, the nature of binding processes, catalysis and allosteric transitions. We hope to illustrate how frustration is a fundamental concept in relating function to structural biology. Comment: 97 pages, 30 figures.

    Biochemical analysis of bacteriophage Orf family recombinases

    Bacterial pathogen evolution is an important field of study in light of the emergence of new diseases and drug-resistant strains. Genomic rearrangements facilitated by temperate bacteriophages provide a major route for bacteria to acquire new pathogenic traits and disseminate them across species barriers. In phage λ, the Red pathway, comprising the exo, bet and gam genes, promotes recombination in Escherichia coli at elevated frequency and with only limited sequence homology, leading to numerous illegitimate exchanges. λ encodes another recombinase, Orf, which appears to supply functions similar to those of the host RecFOR proteins but has been considerably less well studied. This thesis describes the characterisation of mutations in Orf located within a proposed central DNA binding channel and a RecA interaction module. The mutations were found to impact upon single-stranded DNA binding by Orf and, in some cases, affect homodimerisation. Similar characterisation of Orf homologues from phage φETA from Staphylococcus aureus (ETA20) and E. coli K-12 cryptic prophage DLP12 (Orf151) was undertaken. In addition, another DNA binding protein (NinH) has been subjected to bioinformatic analysis, uncovering a relationship with helix-turn-helix proteins involved in site-specific recombination and gene regulation.

    Interaction of Spliceosomal U2 snRNP Protein p14 with Its Branch Site RNA Target

    Newly transcribed precursor messenger RNA (pre-mRNA) molecules contain coding sequences (exons) interspersed with non-coding intervening sequences (introns). These introns must be removed in order to generate a continuous coding sequence prior to translation of the message into protein. The mechanism through which these introns are removed is known as pre-mRNA splicing, a two-step reaction catalyzed by a large macromolecular machine, the spliceosome, located in the nucleus of eukaryotic cells. The spliceosome is a protein-directed ribozyme composed of small nuclear RNAs (snRNA) and hundreds of proteins that assemble in a very dynamic process. One of these snRNAs, the U2 snRNA, is an important component of the human spliceosomal catalytic core that pairs with the intron through the branch site interacting region. These interactions are stabilized by the presence of several protein splicing factors. One splicing factor, p14, is the only protein shown to interact directly with the branch site in the fully assembled spliceosome. In this research we have used electrophoretic mobility shift assays (EMSA) and nuclear magnetic resonance (NMR) under non-denaturing conditions to establish the structural and/or functional role of this splicing factor. Our EMSA results show that p14, which contains an RNA recognition motif (RRM), binds duplex RNA representing the branch site helix (yBP) with weak affinity (KD in the range of 200-400 mM). However, p14 also binds single-stranded (ss) RNA and even a non-related double-stranded (ds) DNA; therefore, any binding appears to be nonspecific for sequence or pairing status. The p14 protein also interacted with a fragment representing the SF3b155 protein, a natural binding partner in the spliceosome, forming a stable and strong complex. Our NMR studies show that ten cross-peaks of 15N-labeled p14 were perturbed upon interaction with yBP. Calculations of the magnitude of the chemical shift changes upon titration of RNA into the protein solution suggested KD values of ~150 mM. However, perturbations in the presence of ss intron, ssU2 snRNA or dsDNA of a different sequence are similar to those with the branch site duplex, further supporting the finding from EMSA that the interaction is non-specific for sequence, pairing status, or even nucleic acid. In the presence of the SF3b155 fragment, most of the p14 cross-peaks were perturbed, consistent with extensive protein-protein contact. In this case, addition of the RNA duplex resulted in shifts in only a subset of the p14 cross-peaks seen in the absence of SF3b155, but with similar affinity. Our NMR data imply that p14 interacts with RNA through very electropositive regions located in its RNP2 motif and a β-loop, with or without the SF3b155 fragment. However, no residues on β3, the RNP1 motif that usually interacts with ssRNA in RRM proteins, showed significant perturbation. The affinities, as determined by NMR titration, for yBP and an RNA duplex without the branch site (yBPΔA) were very similar. However, the overall magnitude of chemical shift perturbations was larger for yBP than for yBPΔA, which we speculate is related to the highly negative surface potential of yBP in the major groove. Taken together, p14 interacts with the branch site RNA, and the binding appears to be of an electrostatic nature, between the electropositive patch of RNP2 and the negative backbone of the RNA. Thus, we speculate that the role of p14 in the human spliceosome is that of an electrostatic spacer, acting as a cofactor of SF3b155 to screen backbone charges of the branch site RNA during spliceosome assembly and so protect the branch site from premature chemical activity prior to formation of the spliceosome's active site.

    Development of computational approaches for structural classification, analysis and prediction of molecular recognition regions in proteins

    The vast and growing volume of 3D protein structural data stored in the PDB contains abundant information about macromolecular complexes, and hence, data about protein interfaces. Non-covalent contacts between amino acids are the basis of protein interactions, and they are responsible for binding affinity and specificity in biological processes. In addition, water networks in protein interfaces can also complement direct interactions, contributing significantly to molecular recognition, although their exact role is still not well understood. It is estimated that protein complexes in the PDB are substantially underrepresented due to the difficulty of crystallizing them. Methods for automatic classification and description of protein complexes are essential to study protein interfaces and to propose putative binding regions. Due to this strong need, several protein-protein interaction databases have been developed. However, most of them do not take into account protein-peptide complexes, solvent information or a proper classification of the binding regions, which are fundamental components of an accurate description of protein interfaces. In the first stage of my thesis, I developed the SCOWLP platform, a database and web application that structurally classifies protein binding regions at family level and accurately defines protein interfaces at atomic detail. The analysis of the results showed that protein-peptide complexes are substantially represented in the PDB, and are the only source of interaction information for several families. By clustering the family binding regions, I could identify 9,334 binding regions and 79,803 protein interfaces in the PDB. Interestingly, I observed that 65% of protein families interact with other molecules through more than one region, and in 22% of the cases the same region recognizes different protein families. The database and web application are open to the research community (www.scowlp.org) and can tremendously facilitate high-throughput comparative analysis of protein binding regions, as well as individual analysis of protein interfaces. SCOWLP and the other databases collect and classify the protein binding regions at family level, where sequence and structure homology exist. Interestingly, it has been observed that many protein families also present structural resemblances to each other, mostly across folds. Likewise, structurally similar interacting motifs (binding regions) have been identified among proteins with different folds and functions. For these reasons, I decided to explore the possibility of inferring protein binding regions independently of their fold classification. Thus, I performed the first systematic analysis of binding region conservation within all protein families that are structurally similar, calculated using non-sequential structural alignment methods. My results indicate that there is substantial molecular recognition information that could potentially be inferred among proteins beyond family level. I obtained a 6- to 8-fold enrichment of binding regions, and identified putative binding regions for 728 protein families that lack binding information. Within the results, I found protein complexes from different folds that present similar interfaces, confirming the predictive use of the methodology. The data obtained with my approach may complement the SCOWLP family binding regions by suggesting alternative binding regions, and can be used to assist protein-protein docking experiments and facilitate rational ligand design. In the last part of my thesis, I used the interaction information contained in the SCOWLP database to help understand the role that water plays in protein interactions in terms of affinity and specificity. I carried out one of the first high-throughput analyses of solvent in protein interfaces for a curated dataset of transient and obligate protein complexes. Surprisingly, the results highlight the abundance of water-bridged residues in protein interfaces (40.1% of the interfacial residues), which reinforces the importance of including solvent in protein interaction studies (14.5% extra residues interacting only through water). Interestingly, I also observed that obligate and transient interfaces present a comparable amount of solvent, which contrasts with the earlier view that obligate protein complexes should resemble protein cores in having dry and hydrophobic interfaces. I characterized novel features of water-bridged residues in terms of secondary structure, temperature factors, residue composition, and pairing preferences that differed from direct residue-residue interactions. The results also showed relevant aspects of the mobility and energetics of water-bridged interfacial residues. Collectively, my doctoral thesis work can be summarized in the following points: 1. I developed SCOWLP, an improved framework that identifies protein interfaces and classifies protein binding regions at family level. 2. I developed a novel methodology to predict alternative binding regions among structurally similar protein families independently of the fold they belong to. 3. I performed a high-throughput analysis of water-bridged interactions contained in SCOWLP to study the role of solvent in protein interfaces. These three components of my thesis represent novel methods for exploiting existing structural information to gain insights into protein-protein interactions, key mechanisms for understanding biological processes.