4,161 research outputs found

    Template Based Modeling and Structural Refinement of Protein-Protein Interactions.

    Determining protein structures from sequence is a fundamental problem in molecular biology, as protein structure is essential to understanding protein function. In this study, I developed one of the first fully automated pipelines for template-based quaternary structure prediction starting from sequence. Two critical steps in template-based modeling are identifying the correct homologous structures by threading, which generates sequence-to-structure alignments, and refining the initial threading template coordinates closer to the native conformation. I developed SPRING (single-chain-based prediction of interactions and geometries), a program that maps monomer threading results to dimer templates, and compared it to the dimer co-threading program COTH on 1838 non-homologous target complex structures. SPRING's similarity score outperformed COTH in first-place ranking of templates, correctly identifying 798 interfaces versus 527 for COTH. More importantly, the results were complementary, and the two programs could be combined into a consensus-based threading program that showed a 5.1% improvement over SPRING alone. Template-based modeling requires a structural analog to be present in the PDB. A full search of the PDB, using threading and structural alignment, revealed that only 48.7% of the PDB has a suitable template, whereas only 39.4% of the PDB has templates that can be identified by threading. To circumvent this, I added intramolecular domain-domain interfaces to the PDB library to boost template recognition of protein dimers; merging the two classes of interfaces improved recognition of heterodimers by 40% under benchmark settings. Next, the template-based assembly of protein complexes pipeline, TACOS, was created. The pipeline combines threading templates and domain knowledge from the PDB into a knowledge-based energy score.
The energy score is integrated into a Monte Carlo sampling simulation that drives the initial template closer to the native topology. The full pipeline was benchmarked on 350 non-homologous structures and compared to two state-of-the-art programs for dimeric structure prediction: ZDOCK and MODELLER. On average, TACOS models have better global and interface structure quality than the models generated by MODELLER and ZDOCK.
PhD, Bioinformatics, University of Michigan, Horace H. Rackham School of Graduate Studies
http://deepblue.lib.umich.edu/bitstream/2027.42/135847/1/bgovi_1.pd

    An environmental control box for serial crystallography enables multi-dimensional experiments

    We present a new environmental enclosure for fixed-target serial crystallography that enables full control of both temperature and humidity. While maintaining the relative humidity to within a percent, this enclosure provides access to X-ray diffraction experiments over a wide temperature range, from below 10 °C to above 80 °C. Coupled with the LAMA method, time-resolved serial crystallography experiments can now be carried out at truly physiological temperatures, providing fundamentally new insight into protein function. Using the hyperthermophile enzyme xylose isomerase, we demonstrate changes in the electron density as a function of increasing temperature and time. This method provides the necessary tools to successfully carry out multi-dimensional serial crystallography.

    SENSING AND MAPPING OF SURFACE HYDROPHOBICITY OF PROTEINS BY FLUORESCENT PROBES

    Surface hydrophobic interactions in proteins play a critical role in molecular recognition, influence biological functions, and are central to many protein misfolding diseases. As the significance of surface hydrophobic interactions in age-related proteinopathies becomes clear, demand has grown for better probes and tools to sense and characterize protein surface hydrophobicity. Commercially available fluorescent probes such as 8-anilino-1-naphthalene sulfonic acid (ANS), 4,4′-dianilino-1,1′-binaphthyl-5,5′-disulfonic acid (Bis-ANS), 6-propionyl-2-(N,N-dimethylamino)naphthalene (PRODAN), a tetraphenylethene derivative, and Nile Red can sense a protein's average hydrophobicity. However, the limitations of these probes prevent their use for measuring protein surface hydrophobicity. Some of their major deficiencies are poor solubility in water, overestimation of the fluorescence signal due to contributions from electrostatic as well as hydrophobic interactions, and a weak signal when bound to the solvent-exposed hydrophobic surface of proteins due to quenching. As a consequence, these dyes do not provide an accurate measure of protein surface hydrophobicity. Therefore, in this study we focused on designing and testing novel fluorescent probes that selectively report the surface hydrophobicity of proteins. For the first project, we chose 4,4-difluoro-4-bora-3a,4a-diaza-s-indacene (BODIPY)-based fluorescent probes, as these are highly fluorescent in both non-polar and polar media. To increase water solubility, we substituted a 2-methoxyethylamine group at the 3,5-position of the BODIPY core. To increase hydrophobic sensing, we focused on substitutions at the meso position of the BODIPY dye. These BODIPY-based surface hydrophobicity sensors (HPsensors) showed a much stronger signal than ANS, a commonly used hydrophobic probe.
The probes showed a 10- to 60-fold increase in signal strength compared to ANS for the BSA protein. For the second project, we modified the commercially available ANS dye with a succinimide-functionalized ethynyl derivative that reacts readily with amine residues of proteins at physiological pH. This modification of ANS with a reactive NHS ester favors crosslinking of the dye to the protein surface at lysine or arginine residues near surface hydrophobic regions. SDS-PAGE results show that the dye is covalently linked to the proteins. To map the hydrophobic surface of proteins, covalently modified proteins will be digested and analyzed by mass spectrometry. The hydrophobic surface will then be visualized using a crystallographic structure database for in silico screening of small-molecule libraries. These small molecules will be tailored to fit the exposed hydrophobic surface by a rational drug design approach and explored for novel therapeutic avenues.

    On vital aid: the why, what and how of validation

    The need for validation of macromolecular crystal structures is discussed. A general approach to validation is presented, together with examples of its implementation in the special case of macromolecular crystallography

    Structural Studies of Nep1/Emg1 RNA Methyltransferase and Atypical Rio3 Kinase

    Nucleolar Essential Protein 1 (Nep1) is required for small subunit (SSU) ribosomal RNA (rRNA) maturation and is mutated in Bowen-Conradi Syndrome. Although yeast (Saccharomyces cerevisiae) Nep1 interacts with a consensus sequence found in three regions of the SSU rRNA, the molecular details of the interaction are unknown. Nep1 is a SPOUT RNA methyltransferase and can catalyze methylation at the N1 position of pseudouridine. Nep1 is also involved in the assembly of Rps19, an SSU ribosomal protein, into the SSU. Mutations in Nep1 that decrease methyl donor binding are not lethal, suggesting that enzymatic activity may not be required for function and that RNA binding may play a more important role. To study these interactions, the crystal structures of the ScNep1 dimer and its complexes with RNA were determined. The results demonstrate that Nep1 recognizes its RNA site via base-specific interactions and stabilizes a stem-loop in the bound RNA. Furthermore, the observed RNA structure contradicts the structures of the Nep1-binding sites within mature rRNA, suggesting that Nep1 changes rRNA structure upon binding. Finally, a uridine base is bound in the active site of Nep1, positioned for a methyltransfer at the C5 position, supporting Nep1's role as an N1-specific pseudouridine methyltransferase. In addition to the work completed on the Nep1 project, structural characterization of Rio3 Kinase is reported, as well as collaborative work on the structure determination of Ubiquitin and Ubiquitin complexes and Iodotyrosine Deiodinase.

    Bioinformatics as a Tool for the Structural and Evolutionary Analysis of Proteins

    This chapter covers bioinformatics: the computational, mathematical, and statistical tools applied to biology that are essential for analyzing and characterizing biological molecules, in particular proteins, which play an important role in all cellular and evolutionary processes of organisms. In recent decades, next-generation sequencing technologies and bioinformatics have facilitated the collection and analysis of large amounts of genomic, transcriptomic, proteomic, and metabolomic data from different organisms. These data have allowed predictions about the regulation of expression, transcription, translation, structure, and mechanisms of action of proteins, as well as about homology, mutations, and the evolutionary processes that generate structural and functional changes over time. Although the information in databases grows every day, bioinformatics tools continue to be constantly modified to improve performance and yield more accurate predictions of protein functionality, which is why bioinformatics research remains a great challenge.

    Genome-Wide Analysis of RNA Secondary Structure in Eukaryotes

    The secondary structure of an RNA molecule plays an integral role in its maturation, regulation, and function. Over the past decades, myriad studies have revealed specific examples of structural elements that direct the expression and function of both protein-coding messenger RNAs (mRNAs) and non-coding RNAs (ncRNAs). In this work, we develop and apply a novel high-throughput, sequencing-based, structure mapping approach to study RNA secondary structure in three eukaryotic organisms. First, we assess global patterns of secondary structure across protein-coding transcripts and identify a conserved mark of strongly reduced base pairing at transcription start and stop sites, which we hypothesize helps with ribosome recruitment and function. We also find empirical evidence for reduced base pairing within microRNA (miRNA) target sites, lending further support to the notion that even mRNAs have additional selective pressures outside of their protein coding sequence. Next, we integrate our structure mapping approaches with transcriptome-wide sequencing of ribosomal RNA-depleted (RNA-seq), small (smRNA-seq), and ribosome-bound (ribo-seq) RNA populations to investigate the impact of RNA secondary structure on gene expression regulation in the model organism Arabidopsis thaliana. We find that secondary structure and mRNA abundance are strongly anti-correlated, which is likely due to the propensity for highly structured transcripts to be degraded and/or processed into smRNAs. Finally, we develop a likelihood model and Bayesian Markov chain Monte Carlo (MCMC) algorithm that utilizes the sequencing data from our structure mapping approaches to generate single-nucleotide resolution predictions of RNA secondary structure. We show that this likelihood framework resolves ambiguities that arise from the sequencing protocol and leads to significantly increased prediction accuracy. 
In total, our findings provide, on a global scale, both validation of existing hypotheses regarding RNA biology and new insights into the regulatory and functional consequences of RNA secondary structure. Furthermore, the development of a statistical approach to structure prediction from sequencing data offers the promise of true genome-wide determination of RNA secondary structure.

    Production and analysis of synthetic Cascade variants

    CRISPR (clustered regularly interspaced short palindromic repeats)-Cas (CRISPR associated) is an adaptive immune system of Archaea and Bacteria. It targets and destroys foreign genetic material with ribonucleoprotein complexes consisting of CRISPR RNAs (crRNAs) and certain Cas proteins. CRISPR-Cas systems are divided into two major classes and multiple types, according to the Cas proteins involved. In type I systems, a ribonucleoprotein complex called Cascade (CRISPR associated complex for antiviral defence) scans for invading viral DNA during a recurring infection and binds the sequence complementary to the incorporated crRNA.
After target recognition, the nuclease/helicase Cas3 is recruited and subsequently destroys the viral DNA in a step termed interference. Multiple subtypes of type I exist that differ in Cascade composition. This work focuses on a minimal Cascade variant found in Shewanella putrefaciens CN-32. In comparison to the well-studied type I-E Cascade from Escherichia coli, this complex is missing two proteins usually required for target recognition, yet it is still able to provide immunity. Recombinant I-Fv Cascade was previously purified from E. coli, and it was possible to modulate the complex by extending or shortening the backbone, resulting in synthetic variants with altered protein stoichiometry. In the present study, I-Fv Cascade was further analyzed by in vitro methods. Target binding was observed, and the 3D structure revealed structural variations that replace the missing subunits, potentially to evade viral anti-CRISPR proteins. The nuclease/helicase of this system, Cas2/3fv, is a fusion of the Cas3 protein with the interference-unrelated protein Cas2. A standalone Cas3fv was purified without the Cas2 domain, and in vitro cleavage assays showed that Cas3fv degrades both free ssDNA and Cascade-bound substrates. The complete Cas2/3fv protein forms a complex with the protein Cas1 and showed reduced cleavage of free ssDNA, potentially as a regulatory mechanism against unspecific cleavage. Furthermore, we established a process termed "RNA wrapping". Synthetic Cascade assemblies can be created by directing the general RNA-binding ability of the characteristic Cas7fv backbone protein to an RNA of choice, such as reporter gene transcripts. Specific complex formation can be initiated in vivo by including a repeat sequence from the crRNA upstream of a given target sequence and by binding of the Cas5fv protein. The resulting complexes contain the initial 100 nt of the tagged RNA, which can be isolated afterwards.
While incorporated in complexes, RNA is stabilized and protected from degradation by RNases. Complex formation can be used to silence reporter gene transcripts. Furthermore, we provided initial indications that the backbone of synthetic complexes can be modified by the addition of reporter proteins.

    Structural Prediction of Protein–Protein Interactions by Docking: Application to Biomedical Problems

    A huge amount of genetic information is available thanks to recent advances in sequencing technologies and greater computational capabilities, but the interpretation of such genetic data at the phenotypic level remains elusive. One of the reasons is that proteins do not act alone, but interact specifically with other proteins and biomolecules, forming intricate interaction networks that are essential for the majority of cell processes and pathological conditions. Thus, characterizing such interaction networks is an important step in understanding how information flows from gene to phenotype. Indeed, structural characterization of protein–protein interactions at atomic resolution has many applications in biomedicine, from diagnosis and vaccine design to drug discovery. However, despite advances in experimental structure determination, the number of interactions for which structural data are available is still very small. In this context, a complementary approach is computational modeling of protein interactions by docking, which is usually composed of two major phases: (i) sampling of the possible binding modes between the interacting molecules and (ii) scoring for the identification of the correct orientations. In addition, prediction of interface and hot-spot residues is very useful for guiding and interpreting mutagenesis experiments, as well as for understanding functional and mechanistic aspects of the interaction. Computational docking is already being applied to specific biomedical problems within the context of personalized medicine, for instance, helping to interpret pathological mutations involved in protein–protein interactions, or providing modeled structural data for drug discovery targeting protein–protein interactions.
Spanish Ministry of Economy grant number BIO2016-79960-R; D.B.B. is supported by a predoctoral fellowship from CONACyT; M.R. is supported by an FPI fellowship from the Severo Ochoa program. We are grateful to the Joint BSC-CRG-IRB Programme in Computational Biology.
Peer Reviewed. Postprint (author's final draft)