509 research outputs found

    Improving Protein Therapeutics Through Quantitative Molecular Engineering Approaches and A Cell-Based Oral Delivery Platform

    Proteins, with their ability to perform a variety of highly specific biological functions, have emerged as an important class of therapeutics. However, to fully harness their therapeutic potential, proteins often need to be optimized by molecular engineering; therapeutic efficacy can be improved by modulating protein properties such as binding affinity/specificity, half-life, bioavailability, and immunogenicity. In this work, we first present an introductory example in which a mechanistic mathematical model was used to improve target selection for directed evolution of an aglycosylated Fc domain of an antibody to enhance phagocytosis of tumor cells. Several aspects of directed evolution experimental methods were then optimized. A model-guided ligation strategy was developed to maximize ligation yield in DNA library construction, and this design tool is freely available through a web server. Streamlined protocols for mRNA display and ribosome display, which are powerful in vitro selection methods, were also created to allow robust selection of a variety of therapeutic proteins, including monomeric Fc domains, designed ankyrin repeat proteins, a single-chain insulin analog (SCI-57), and leptin. Anti-ICAM-1 scFv antibody fragments were also optimized for ribosome display by grafting complementarity determining regions onto a stable human framework. In addition to engineering the proteins themselves, effective delivery systems are essential for maximizing the therapeutic benefit of these proteins in a clinical setting. We therefore also developed a novel oral delivery platform based on the food-grade bacterium Lactococcus lactis. SCI-57, leptin, and SCI-57-leptin fusion proteins have been successfully secreted from this host in vitro, and preliminary studies in a diabetic mouse model show reduced glucose levels after oral administration of L. lactis secreting SCI-57. We then further improved the secretion potential of this host through directed evolution of an L. lactis signal peptide. In summary, our studies have provided important advances to the field of protein engineering through the development of mechanistic mathematical models, streamlined experimental methodologies, and polypeptides with improved properties. This work has also opened up the possibility of systemic delivery of protein therapeutics using live microorganisms.

    Alternative splicing of U2AF1 reveals a shared repression mechanism for duplicated exons

    The auxiliary factor of U2 small nuclear ribonucleoprotein (U2AF) facilitates branch point (BP) recognition and formation of lariat introns. The gene for the 35-kD subunit of U2AF gives rise to two protein isoforms (termed U2AF35a and U2AF35b) that are encoded by alternatively spliced exons 3 and Ab, respectively. The splicing recognition sequences of exon 3 are less favorable than those of exon Ab, yet U2AF35a expression is higher than U2AF35b across tissues. We show that U2AF35b repression is facilitated by weak, closely spaced BPs next to a long polypyrimidine tract of exon Ab. Each BP lacked canonical uridines at position -2 relative to the BP adenines, with efficient U2 base-pairing interactions predicted only for shifted registers reminiscent of programmed ribosomal frameshifting. The BP cluster was compensated by interactions involving unpaired cytosines in an upstream, EvoFold-predicted stem loop (termed ESL) that binds FUBP1/2. Exon Ab inclusion correlated with predicted free energies of mutant ESLs, suggesting that the ESL operates as a conserved rheostat between long inverted repeats upstream of each exon. The isoform-specific U2AF35 expression was U2AF65-dependent, required interactions between the U2AF-homology motif (UHM) and the α6 helix of U2AF35, and was fine-tuned by exon Ab/3 variants. Finally, we identify tandem homologous exons regulated by U2AF and show that their preferential responses to U2AF65-related proteins and SRSF3 are associated with unpaired pre-mRNA segments upstream of U2AF-repressed 3′ss. These results provide new insights into tissue-specific subfunctionalization of duplicated exons in vertebrate evolution and expand the repertoire of exon repression mechanisms that control alternative splicing.

    Protein Ligand Interactions Probed by NMR: A Dissertation

    Molecular recognition, defined as the specific interactions between two or more molecules, is at the center of many biological processes including catalysis, signal transduction, gene regulation and allostery. Allosteric regulation is the modification of function caused by an intermolecular interaction. Allosteric proteins modify their activity in response to a biological signal that is often transmitted through the interaction with a small effector molecule. Therefore, determination of the origins of intermolecular interactions involved in molecular recognition and allostery is essential for understanding biological processes. Classically, molecular recognition and allosteric regulation have been associated with structural changes of the system. NMR spectroscopic methods have indicated that changes in protein dynamics may also contribute to molecular recognition and allostery. This thesis is an investigation of the contributions of both structure and dynamics in molecular binding phenomena. In Chapter I, I describe molecular recognition, allostery and examples of allostery and cooperativity. Then I discuss the contribution of protein dynamics to function with a special focus on allosteric regulation. Lastly, I introduce the hemoglobin homodimer, HbI of Scapharca inaequivalvis, and the mRNA binding protein TIS11d. Chapter II, the primary focus of this thesis, addresses the contribution of protein dynamics to allostery in the dimeric hemoglobin of Scapharca inaequivalvis, HbI. Thereafter I concentrate on the mechanism of adenine recognition of the Tristetraprolin-like (TTP) protein TIS11d; this study is detailed in Chapter III. In Chapter IV I discuss broader impacts and future directions of my research. This thesis presents an example of the use of protein NMR spectroscopy to probe ligand binding. The studies presented in this thesis emphasize the importance of dynamics in understanding protein function.
Measurements of protein motions will be an element of future studies to understand protein function in health and disease.

    Expanding the repertoire of bacterial (non-)coding RNAs

    The detection of non-protein-coding RNA (ncRNA) genes in bacteria and their diverse regulatory modes of action has moved the experimental and bio-computational analysis of ncRNAs into the focus of attention. Regulatory ncRNA transcripts are not translated to proteins but function directly on the RNA level. These typically small RNAs have been found to be involved in diverse processes such as (post-)transcriptional regulation and modification, translation, protein translocation, protein degradation and sequestration. Bacterial ncRNAs either arise from independent primary transcripts or their mature sequence is generated via processing from a precursor. Besides these autonomous transcripts, RNA regulators (e.g. riboswitches and RNA thermometers) also form chimeras with protein-coding sequences. These structured regulatory elements are encoded within the messenger RNA and directly regulate the expression of their “host” gene. The quality and completeness of genome annotation is essential for all subsequent analyses. In contrast to protein-coding genes, ncRNAs lack clear statistical signals on the sequence level. Thus, sophisticated tools have been developed to automatically identify ncRNA genes. Unfortunately, these tools are not part of generic genome annotation pipelines, and therefore computational searches for known ncRNA genes are the starting point of each study. Moreover, prokaryotic genome annotation lacks essential features of protein-coding genes. Many known ncRNAs regulate translation via base-pairing to the 5’ UTR (untranslated region) of mRNA transcripts. Eukaryotic 5’ UTRs have been routinely annotated by sequencing of ESTs (expressed sequence tags) for more than a decade. Only recently have experimental setups been developed to systematically identify these elements on a genome-wide scale in prokaryotes.
The first part of this thesis describes three experimental surveys, exploratory field studies that analyze transcript organization in pathogenic bacteria. To identify ncRNAs in Pseudomonas aeruginosa we used a combination of an experimental RNomics approach and ncRNA prediction. Besides already known ncRNAs we identified and validated the expression of six novel RNA genes. Global detection of transcripts by next-generation RNA sequencing techniques unraveled an unexpectedly complex transcript organization in many bacteria. These ultra high-throughput methods give us the appealing opportunity to analyze the complete RNA output of any species at once. The development of the differential RNA sequencing (dRNA-seq) approach enabled us to analyze the primary transcriptome of Helicobacter pylori and Xanthomonas campestris. For the first time we generated a comprehensive and precise transcription start site (TSS) map for both species and provide a general framework for the analysis of dRNA-seq data. Focusing on computer-aided analysis we developed new tools to annotate TSS, detect small protein-coding genes and infer homology of newly detected transcripts. We discovered hundreds of TSS in intergenic regions, upstream of protein-coding genes, within operons and antisense to annotated genes. Analysis of 5’ UTRs (spanning from the TSS to the start codon of the adjacent protein-coding gene) revealed an unexpected size diversity ranging from zero to several hundred nucleotides. We identified and validated the expression of about 60 and about 20 ncRNA candidates in Helicobacter and Xanthomonas, respectively. Among these ncRNA candidates we found several small protein-coding genes that have previously evaded annotation in both species. We showed that the combination of dRNA-seq and computational analysis is a powerful method to examine prokaryotic transcriptomes. Experimental setups are time-consuming and often costly.
Another limitation of experimental approaches is that genes which are expressed in specific developmental stages or stress conditions are likely to be missed. Bioinformatic tools offer an alternative that overcomes such restraints. General approaches usually depend on comparative genomic data, and evolutionary signatures are used to analyze the (non-)coding potential of multiple sequence alignments. In the second part of my thesis we present our major update of the widely used ncRNA gene finder RNAz and introduce RNAcode, an efficient tool to assess local protein-coding potential of genomic regions. RNAz has been successfully used to identify structured RNA elements in all domains of life. However, our own experience and the user feedback not only demonstrated the applicability of the RNAz approach, but also helped us to identify limitations of the current implementation. Using a much larger training set and a new classification model we significantly improved the prediction accuracy of RNAz. During transcriptome analysis we repeatedly identified small protein-coding genes that have not been annotated so far. Only a few of those genes are known to date and standard protein-coding gene finding tools suffer from the lack of training data. To avoid an excess of false positive predictions, gene finding software is usually run with an arbitrary cutoff of 40-50 amino acids and therefore misses the small protein-coding genes. We have implemented RNAcode, which is optimized for emerging applications not covered by standard protein-coding gene annotation software. In addition to complementing classical protein gene annotation, a major field of application of RNAcode is the functional classification of transcribed regions. RNA sequencing analyses are likely to falsely report transcript fragments (e.g. mRNA degradation products) as non-coding. Hence, an evaluation of the protein-coding potential of these fragments is an essential task.
RNAcode reports local regions of high coding potential instead of complete protein-coding genes. Training on known protein-coding sequences is not necessary, and RNAcode can therefore be applied to any species. We showed this with our analysis of the Escherichia coli genome, where the current annotation could be accurately reproduced. We furthermore identified novel small protein-coding genes with RNAcode in this extensively studied genome. Using transcriptome and proteome data we found compelling evidence that several of the identified candidates are bona fide proteins. In summary, this thesis clearly demonstrates that bioinformatic methods are mandatory to analyze the huge amount of transcriptome data and to identify novel (non-)coding RNA genes. With the major update of RNAz and the implementation of RNAcode we helped complete the repertoire of gene finding software, which will help to unearth hidden treasures of the RNA World.

    Investigating the Contribution of Disordered Domains to the Biological Activity of RNA-binding Proteins

    Many proteins contain disordered domains under physiological conditions. These disordered regions may be functional, although under pathological conditions they may lead to protein aggregation and degradation, as observed in proteins related to neurodegenerative diseases. In my thesis study, I aimed to understand how the primary sequence of these proteins encodes a diverse ensemble of conformations rather than a stable folded state. I focused on the role of disordered domains in the activity of RNA-binding proteins that are involved in post-transcriptional regulation but may lead to pathogenesis in many diseases. The human TIS11 proteins bind to AU-rich elements in the 3′ UTR of mRNAs through a CCCH-type tandem zinc finger (TZF) domain. Mutations in these proteins have been linked to cancer. A member of this protein family, Tristetraprolin (TTP), is partially unfolded in the C-terminal zinc finger in the apo state, but folds upon RNA binding. The homologous protein TIS11d is folded in both free and bound states. Previous studies have shown that the extent of structure of the TZF domain in the apo state does not affect the affinity to target RNA in vitro; however, it modulates the activity of the protein in the cell. To understand which interactions determine the zinc affinity of the C-terminal zinc fingers of TTP and TIS11d, I investigated the stability of their TZF domains using homology modeling and molecular dynamics (MD) simulations. I found that, in the C-terminal zinc finger of TIS11d, a hydrogen bond is necessary to allow for π-π stacking between the side chains of a conserved phenylalanine and the zinc-coordinating histidine. Using mutagenesis and nuclear magnetic resonance (NMR) spectroscopy, I demonstrated that the lack of this hydrogen bond is responsible for the reduced zinc affinity, and thus lack of structure, of the C-terminal zinc finger in TTP.
These results suggest that the CCCH-type TZF domains in different proteins have evolved to differentiate their function through a disorder-to-order transition. In Caenorhabditis elegans several RNA-binding proteins contain a TZF domain homologous to the RNA-binding domain of TIS11 proteins, but have different RNA-binding specificity. I characterized the structure and the dynamics of the C. elegans protein MEX-5 using NMR spectroscopy and MD simulations. I found that MEX-5, like its mammalian counterpart TTP, contains a zinc finger that is partially unfolded in the free state but that folds upon RNA binding. To assess whether the disorder-to-order transition upon RNA binding contributes to MEX-5 function, I designed a variant MEX-5 in which both zinc fingers are stably folded in the absence of RNA. I characterized the RNA-binding activity of this variant MEX-5 and found that the binding affinity and specificity are unchanged compared to the wild-type protein. Together with Ryder's lab, we used CRISPR-hr to introduce this variant into the endogenous C. elegans mex-5 locus. Homozygous animals are sterile, form massive uterine tumors within a few days of reaching adulthood, and often die by bursting. These results show that the unfolded state of MEX-5 is critical to its function in vivo by a mechanism distinct from its RNA-binding activity. To further investigate how the equilibrium between structural order and disorder affects the function of a protein in the cell, I focused on the human protein TDP-43, a major component of the cellular proteinaceous aggregates found in amyotrophic lateral sclerosis and other neurodegenerative diseases. Previous studies have shown, both in vitro and in vivo, that the second RNA recognition motif (RRM2) of TDP-43 contains peptide regions that are particularly prone to fibril formation. In addition, RRM2 has been shown to populate, to a small degree, one or more partially folded states under native conditions.
To determine whether the partially folded states of TDP-43 RRM2 contribute to the formation of aggregates observed in the human diseases, I characterized the structures of these states using MD simulations including enhanced sampling methods and restraints from experimental chemical shifts. I found that in these states the protein exposes to the solvent aggregation-prone regions that are instead buried in the protein core in the native state. These results suggest a role in fibrogenesis for the transient partially folded states of TDP-43 RRM2.

    The Mechanism of Cu,Zn Superoxide Dismutase Aggregation in Familial Amyotrophic Lateral Sclerosis

    Amyotrophic lateral sclerosis (ALS) is a degenerative disease of the motor neurons characterized by the progressive loss of muscle strength and eventual death due to selective killing of motor neurons in the brain stem and spinal cord. ALS consists of both sporadic and familial subtypes that share the same clinical progression of symptoms. Of the 10% of ALS cases considered familial ALS (FALS), 1 in 5 is the result of a mutation in the enzyme Cu,Zn superoxide dismutase (SOD1). Over 100 mutations have been identified, and though they are distributed evenly throughout the homodimeric structure of SOD1, the mutations have the general property of inducing SOD1 aggregation and toxicity in motor neurons and surrounding glial cells. In recent years, a shift has occurred in ALS research and the broader field of protein aggregation diseases toward the hypothesis that soluble oligomers, rather than the end products of aggregation, are the species responsible for the patterns of toxicity observed in these diseases. Previous studies of SOD1 oligomerization have thus far focused on large-scale oligomers and ignored the earliest stages of oligomerization during which the transition from the native state of SOD1 occurs. Knowledge of structural transformations that initiate SOD1 aggregation, as well as the structure of early oligomeric intermediates, is essential for the design of strategies to prevent the aggregation of SOD1 in FALS. The following chapters contain a multifaceted description of the initiation of SOD1 oligomerization including "first-principles" computational approaches for modeling the formation of aberrant SOD1 dimers, in vitro mechanistic studies of SOD1 oligomerization, as well as the characterization of the in vivo modification state of SOD1. 
By calling attention to the fact that SOD1 is highly post-translationally modified in vivo and showing that mutations allow SOD1 to access altogether different oligomeric intermediates than wild type, we lay the groundwork for significant advances in understanding the structural basis of SOD1 oligomerization in ALS.

    Structural characterization and selective drug targeting of higher-order DNA G-quadruplex systems.

    There is now substantial evidence that guanine-rich regions of DNA form non-B DNA structures known as G-quadruplexes in cells. G-quadruplexes (G4s) are tetraplex DNA structures that form amid four runs of guanines, which are stabilized via Hoogsteen hydrogen bonding to form stacked tetrads. DNA G4s have roles in key genomic functions such as regulating gene expression, replication, and telomere homeostasis. Because of their apparent role in disease, G4s are now viewed as important molecular targets for anticancer therapeutics. To date, the structures of many important G4 systems have been solved by NMR or X-ray crystallographic techniques. Small molecules developed to target these structures have shown promising results in treating cancer in vitro and in vivo; however, these compounds commonly lack the selectivity required for clinical success. There is now evidence that long single-stranded G-rich regions can stack or otherwise interact intramolecularly to form G4-multimers, opening a new avenue for rational drug design. For a variety of reasons, G4 multimers are not amenable to NMR or X-ray crystallography. In the current dissertation, I apply a variety of biophysical techniques in an integrative structural biology (ISB) approach to determine the primary conformation of two disputed higher-order G4 systems: (1) the extended human telomere G-quadruplex and (2) the G4-multimer formed within the human telomerase reverse transcriptase (hTERT) gene core promoter. Using the higher-order human telomere structure in virtual drug discovery approaches, I demonstrate that novel small molecule scaffolds can be identified which bind to this sequence in vitro. I subsequently summarize the current state of G-quadruplex focused virtual drug discovery in a review that highlights successes and pitfalls of in silico drug screens.
I then present the results of a massive virtual drug discovery campaign targeting the hTERT core promoter G4 multimer and show that discovering selective small molecules that target its loops and grooves is feasible. Lastly, I demonstrate that one of these small molecules is effective in down-regulating hTERT transcription in breast cancer cells. Taken together, I present here a rigorous ISB platform that allows for the characterization of higher-order DNA G-quadruplex structures as unique targets for anticancer therapeutic discovery.

    Interactions between the Translation Machinery and a Translational preQ1 Riboswitch.

    Gene expression is highly regulated, with diverse regulatory mechanisms acting at the RNA level. In bacteria, regulation of mRNA translation into protein often occurs through RNA sequence features such as the Shine-Dalgarno (SD) sequence and local structural features. Translational riboswitches in bacteria exemplify such cis-acting regulation. This work looks at how structural features of a preQ1 riboswitch affect regulation through interactions with the translation machinery. Broader questions about how individual translational machinery components, such as ribosomal protein S1 and the 30S ribosomal subunit, interact with structured RNAs are also addressed. We sought a more detailed mechanistic view of the interplay between the translational preQ1 riboswitch found in the 5′ UTR of an mRNA from T. tengcongensis, its ligand preQ1, and the SD sequence accessibility. To this end, we developed SiM-KARTS, a generalized strategy to interrogate site-specific structural dynamics of RNA molecules based on probe hybridization kinetics. Intriguingly, we found that the riboswitch expression platform alternates between conformations with differing SD accessibility, which are distinguished by “bursts” of probe binding, the pattern of which is modulated by ligand. This challenges the assumption that riboswitches behave in simple ON/OFF fashion and thus has broader implications for how we think about translational riboswitch regulation. The folding and unfolding of RNA structure influences other cellular processes besides translation. Ribosomal protein S1 performs other roles outside of the context of translation, which are related to its RNA binding or unfolding capacity. We used the well-characterized preQ1 riboswitch as a model pseudoknot to study how S1 interacts with defined, stable tertiary structure. S1 is able to bind and at least partially unfold this pseudoknot in a manner that is limited by RNA structural stability.
Lastly, we investigated the influence of S1 on translation of preQ1 riboswitch-containing mRNAs and found that the effects of ligand on translation are not potentiated by the loss of S1. There is, however, a dramatic effect on translational coupling, invoking a role for S1 in polycistronic mRNA translation. These results highlight the need for additional techniques, such as assays at the single molecule level, to monitor early 30S-mRNA interactions during translation. PhD, Chemical Biology, University of Michigan, Horace H. Rackham School of Graduate Studies. http://deepblue.lib.umich.edu/bitstream/2027.42/116677/1/palund_1.pd