9 research outputs found

    Annotation Error in Public Databases: Misannotation of Molecular Function in Enzyme Superfamilies

    Due to the rapid release of new data from genome sequencing projects, the majority of protein sequences in public databases have not been experimentally characterized; rather, sequences are annotated using computational analysis. The level of misannotation and the types of misannotation in large public databases are currently unknown and have not been analyzed in depth. We have investigated the misannotation levels for molecular function in four public protein sequence databases (UniProtKB/Swiss-Prot, GenBank NR, UniProtKB/TrEMBL, and KEGG) for a model set of 37 enzyme families for which extensive experimental information is available. The manually curated database Swiss-Prot shows the lowest annotation error levels (close to 0% for most families); the two other protein sequence databases (GenBank NR and TrEMBL) and the protein sequences in the KEGG pathways database exhibit similar and surprisingly high levels of misannotation that average 5%–63% across the six superfamilies studied. For 10 of the 37 families examined, the level of misannotation in one or more of these databases is >80%. Examination of the NR database over time shows that misannotation has increased from 1993 to 2005. The types of misannotation that were found fall into several categories, most associated with “overprediction” of molecular function. These results suggest that misannotation in enzyme superfamilies containing multiple families that catalyze different reactions is a larger problem than has been recognized. Strategies are suggested for addressing some of the systematic problems contributing to these high levels of misannotation.

    Directed evolution of G-protein-coupled receptors for expression and stability

    G-protein-coupled receptors (GPCRs) represent the largest superfamily of cell surface receptors in the human genome. They mediate the cellular responses to an enormous diversity of endogenous signaling molecules such as hormones and neurotransmitters, as well as environmental signals such as pheromones, smells, tastes and light. Because of the enormous diversity in transmitted signals, GPCR signaling is involved in nearly every physiological process, and deregulated signaling readily leads to pathologic conditions. Currently, about 25% of all drugs in clinical use exert their action by targeting GPCRs. Understanding the molecular mechanisms of ligand binding and signal transduction at the structural level of single atoms holds great promise for the detailed characterization of GPCR function and structure-based drug discovery. However, functional characterization and effective drug design of these receptors is strongly restricted by the limited availability of high-resolution structural information. This limitation is mainly a consequence of the enormous experimental difficulties inherent to the process of atomic-resolution determination of GPCR structure. This thesis describes the development and application of powerful directed evolution methods aimed at improving the biophysical properties of GPCRs to make these receptors amenable to atomic-resolution structural studies. The initial studies have focused on engineering a water-soluble analog of the β2 adrenergic receptor (ADRB2) by a complete redesign of its hydrophobic phospholipid-contacting surface. Computational methods were used to design a combinatorial gene library from which water-soluble ADRB2 analogs can be selected by modern in vitro selection technologies and further evolved towards the desired biophysical and functional properties.
The more recent studies presented in this thesis describe the development of a powerful directed evolution method for producing well-expressed and stable GPCRs in the inner membrane of the expression host Escherichia coli. The method is based on introducing random amino acid changes to a GPCR sequence and selecting improved receptor variants by fluorescence-activated cell sorting. The performance of the method was first critically assessed on the model protein neurotensin receptor NTR1 and then successfully applied to three additional human GPCRs, some of which were hardly expressed or quickly lost their native fold when solubilized in detergent micelles. For all four receptors, variants could be evolved that showed strongly increased functional expression levels (up to 30-fold) and greater receptor stability. The improvements are achieved by a combinatorial approach that does not require the input of rational design principles. In summary, by introducing the concepts of directed evolution to the field of integral membrane proteins, this work has produced powerful protein design strategies for GPCRs and a set of well-behaved GPCR proteins that are amenable to crystallographic studies. Existing roadblocks in structural and biophysical studies of these important receptors can now be removed by providing sufficient quantities of correctly folded and stable receptor protein. The ability to obtain atomic-resolution structures may help to elucidate the molecular basis for activation, inactivation, or pathology associated with that receptor, and may also provide valuable templates for drug design.
Summary: G-protein-coupled receptors (GPCRs) constitute the largest protein superfamily in the human genome. Their task is to relay external signals, for example from hormones or neurotransmitters but also sensory stimuli such as smell or light, into the cell interior and to trigger a physiological response in the cell.
Because of their ability to process a multitude of extremely diverse signals, most physiological processes in the human body depend on the activity of GPCRs. If a particular GPCR-dependent signaling process gets out of control, however, pathological processes can easily be triggered. This makes GPCRs prominent targets in the pharmaceutical industry and also explains why one in four drugs on the market intervenes in GPCR-dependent signaling. If one were able to describe certain functional aspects of G-protein-coupled receptors more precisely at the structural level of individual atoms (e.g., the spatial configuration of a ligand binding site), this would yield not only valuable insights for basic research but also new opportunities for structure-based drug development. Today, however, detailed structural insights are possible only to a very limited extent, because despite great efforts only a few high-resolution crystal structures of GPCRs have been determined. This limitation exists mainly because of the great experimental difficulties in X-ray structure analysis of GPCRs. The present work describes the development and application of new protein engineering methods intended to improve the biophysical properties of GPCRs and thereby enable structural analysis of these proteins. The first part of this doctoral thesis aimed to construct water-soluble analogs of the β2-adrenergic receptor by greatly reducing the problematic hydrophobic portion of the protein surface. Using computer-aided analyses, a combinatorial gene library of the receptor was designed. From this library, functional water-soluble analogs of the β2-adrenergic receptor should be selectable using in vitro selection methods.
The central part of this doctoral thesis comprises the more recent studies on the development of an efficient selection method for producing highly expressed and stable GPCRs in the expression organism Escherichia coli. In this method, the amino acid sequence of an experimentally difficult-to-handle GPCR is first mutated at random. Using a flow cytometer, those mutants that show increased functional protein expression can then be isolated. The efficiency of the method was first demonstrated on the model protein neurotensin receptor NTR1 and then successfully applied to three additional human GPCRs. For all four starting molecules, new variants could be evolved that showed strongly improved functional protein expression (up to 30 times higher than the starting molecule) and higher protein stability. In summary, this work resulted in new, efficient strategies for the protein engineering of G-protein-coupled receptors. Because the evolved proteins are very stable and can be produced in sufficient quantity, some of the greatest hurdles in the process of X-ray structure analysis of GPCRs are now removed. The discovery of new GPCR crystal structures would thereby be simplified and could help to better understand the molecular mechanisms of receptor activation, receptor inactivation, or pathological states. The development of improved drugs using structure-based methods would also be advanced by detailed receptor structures.

    RNA Thermometers for the PURExpress System

    Cell-free synthetic biology approaches enable engineering of biomolecular systems exhibiting complex, cell-like behaviors in the absence of living entities. Often essential to these systems are user-controllable mechanisms to regulate gene expression. Here we describe synthetic RNA thermometers that enable temperature-dependent translation in the PURExpress in vitro protein synthesis system. Previously described cellular thermometers lie wholly in the 5′ untranslated region and do not retain their intended function in PURExpress. By contrast, we designed hairpins between the Shine–Dalgarno sequence and complementary sequences within the gene of interest. The resulting thermometers enable high-yield, cell-free protein expression in an inducible temperature range compatible with in vitro translation systems (30–37 °C). Moreover, expression efficiency and switching behavior are tunable via small variations to the coding sequence. Our approach and resulting thermometers provide new tools for exploiting temperature as a rapid, external trigger for in vitro gene regulation.

    Reverse Micelles As a Platform for Dynamic Nuclear Polarization in Solution NMR of Proteins

    Despite tremendous advances in recent years, solution NMR remains fundamentally restricted due to its inherent insensitivity. Dynamic nuclear polarization (DNP) potentially offers significant improvements in this respect. The basic DNP strategy is to irradiate the EPR transitions of a stable radical and transfer this nonequilibrium polarization to the hydrogen spins of water, which will in turn transfer polarization to the hydrogens of the macromolecule. Unfortunately, these EPR transitions lie in the microwave range of the electromagnetic spectrum where bulk water absorbs strongly, often resulting in catastrophic heating. Furthermore, the residence times of water on the surface of the protein in bulk solution are generally too short for efficient transfer of polarization. Here we take advantage of the properties of solutions of encapsulated proteins dissolved in low viscosity solvents to implement DNP in liquids. Such samples are largely transparent to the microwave frequencies required and thereby avoid significant heating. Nitroxide radicals are introduced into the reverse micelle system in three ways: attached to the protein, embedded in the reverse micelle shell, and free in the aqueous core. Significant enhancements of the water resonance ranging up to ∼−93 at 0.35 T were observed. We also find that the hydration properties of encapsulated proteins allow for efficient polarization transfer from water to the protein. These and other observations suggest that merging reverse micelle encapsulation technology with DNP offers a route to a significant increase in the sensitivity of solution NMR spectroscopy of proteins and other biomolecules.
    Funding: National Institutes of Health (U.S.) (Grant P41 EB002026); National Institutes of Health (U.S.) (Grant P41 EB002804); Netherlands Organization for Scientific Research (Rubicon Fellowship)

    Optimized Reverse Micelle Surfactant System for High-Resolution NMR Spectroscopy of Encapsulated Proteins and Nucleic Acids Dissolved in Low Viscosity Fluids

    An optimized reverse micelle surfactant system has been developed for solution nuclear magnetic resonance studies of encapsulated proteins and nucleic acids dissolved in low viscosity fluids. Comprising the nonionic 1-decanoyl-rac-glycerol and the zwitterionic lauryldimethylamine-N-oxide (10MAG/LDAO), this mixture is shown to efficiently encapsulate a diverse set of proteins and nucleic acids. Chemical shift analyses of these systems show that high structural fidelity is achieved upon encapsulation. The 10MAG/LDAO surfactant system reduces the molecular reorientation time for encapsulated macromolecules larger than ∼20 kDa leading to improved overall NMR performance. The 10MAG/LDAO system can also be used for solution NMR studies of lipid-modified proteins. New and efficient strategies for optimization of encapsulation conditions are described. 10MAG/LDAO performs well in both the low viscosity pentane and ultralow viscosity liquid ethane and therefore will serve as a general surfactant system for initiating solution NMR studies of proteins and nucleic acids.