6,387 research outputs found

    Genome-wide histone acetylation data improve prediction of mammalian transcription factor binding sites

    Motivation: Histone acetylation (HAc) is associated with open chromatin, and HAc has been shown to facilitate transcription factor (TF) binding in mammalian cells. In the context of the innate immune system, epigenetic studies strongly implicate HAc in the transcriptional response of activated macrophages. We hypothesized that using data from large-scale sequencing of a HAc chromatin immunoprecipitation assay (ChIP-Seq) would improve the performance of computational prediction of binding locations of TFs mediating the response to a signaling event, namely, macrophage activation.

    Proteome-Wide Prediction of Acetylation Substrates

    Eukaryotic DNA is packaged with proteins and RNA to form a substance called chromatin. This packaging is dynamic and regulates access to DNA for essential cellular processes such as transcription, replication, and repair. In recent years, studies have shown that regulated changes in the chemical and physical properties of chromatin often lead to dynamic changes in multiple cellular processes by affecting the accessibility of the DNA. These changes can be brought about in part through post-translational modifications of histone proteins, which act by disrupting chromatin contacts or by recruiting effector proteins to chromatin. Acetylation is one of the well-studied post-translational modifications associated with chromatin-related processes, notably gene regulation. Many studies have contributed to our knowledge of the enzymology underlying acetylation, including efforts to understand the molecular mechanism of substrate recognition by several acetyltransferases, but traditional experiments to determine intrinsic features of substrate and site specificity have proven challenging. In my thesis work, I hypothesize that the primary amino acid sequence surrounding an acetylated lysine plays a critical role in acetylation site selection, and I ask whether there are sequence preferences that enable a lysine acetyltransferase to recognize target lysines. A computational method was devised to examine this hypothesis, and an experimental approach was taken to test my computationally derived predictions. In Chapter 2, I describe my basic computational methods, using a clustering analysis of protein sequences to predict lysine acetylation based on the sequence characteristics of acetylated lysines within histones. I define a local amino acid sequence composition that represents potential acetylation sites by implementing a clustering analysis of histone and non-histone sequences. I demonstrate that this sequence composition has predictive power on two independent experimental datasets of acetylation marks. In Chapter 3, I describe the experimental validation approach used to detect acetylation in histone and non-histone proteins using mass spectrometry. I also report several novel non-histone acetylated substrates in S. cerevisiae. My approach, combined with more traditional experimental methods, may be useful for identifying additional proteins in the acetylome. Finally, in Chapter 4, I describe two bioinformatics approaches: one to predict additional chromatin-associated effector proteins, and another to further understand the evolutionary history and complexity of the Polycomb Group (PcG) proteins in multicellular organisms in order to infer gene expansion, co-evolution, and deletion events.
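    The idea of a "local amino acid sequence composition" around a lysine can be illustrated with a minimal sketch. This is not the thesis's actual method: the function names, the flank size, and the padding character are all assumptions made for illustration, and the example sequence is simply the first 21 residues of the histone H3 N-terminal tail. The sketch only extracts a fixed-width window around each lysine and computes residue frequencies, the kind of feature a clustering analysis could take as input.

```python
from collections import Counter


def lysine_windows(sequence, flank=6, pad="-"):
    """Return (1-based position, window) for every lysine (K) in a sequence.

    Each window holds `flank` residues on either side of the lysine;
    positions past the ends of the protein are filled with `pad`.
    (Hypothetical helper; the flank size is an arbitrary choice.)
    """
    seq = sequence.upper()
    padded = pad * flank + seq + pad * flank
    windows = []
    for i, residue in enumerate(seq):
        if residue == "K":
            # In the padded string the lysine sits at index i + flank,
            # so the window starts at index i and spans 2 * flank + 1.
            windows.append((i + 1, padded[i:i + 2 * flank + 1]))
    return windows


def composition(window, pad="-"):
    """Return the fraction of each residue in a window, ignoring padding."""
    counts = Counter(r for r in window if r != pad)
    total = sum(counts.values())
    return {residue: n / total for residue, n in counts.items()}


if __name__ == "__main__":
    h3_tail = "ARTKQTARKSTGGKAPRKQLA"  # illustrative input only
    for position, window in lysine_windows(h3_tail, flank=3):
        print(position, window, composition(window))
```

    Composition vectors of this kind could then be compared between known acetylated and non-acetylated lysines; which features and clustering procedure the thesis actually used is not stated in the abstract.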

    Computational study of associations between histone modification and protein-DNA binding in yeast genome by integrating diverse information

    Background: In parallel with the rapid development of high-throughput technologies, in vivo (in vitro) experiments for genome-wide identification of protein-DNA interactions have been developed. Nevertheless, a few questions remain in the field, such as how to distinguish true protein-DNA binding (functional binding) from non-specific protein-DNA binding (non-functional binding). Previous studies tackled the problem by integrated analysis of multiple available sources. However, few systematic studies have been carried out to examine the possible relationships between histone modification and protein-DNA binding. Here this issue was investigated using publicly available histone modification data in yeast. Results: Two separate histone modification datasets were studied, at both the open reading frame (ORF) and the promoter region of binding targets for 37 yeast transcription factors. Both results revealed a distinct histone modification pattern between the functional protein-DNA binding sites and non-functional ones for almost half of all TFs tested. This difference is much stronger at the ORF than at the promoter region. In addition, a protein-histone modification interaction pathway can only be inferred from the functional protein binding targets. Conclusions: Overall, the results suggest that histone modification information can be used to distinguish functional protein-DNA binding from non-functional binding, and that the regulation of various proteins is controlled by the modification of different histone lysines, as reflected in the protein-specific histone modification levels.

    TREEOME: A framework for epigenetic and transcriptomic data integration to explore regulatory interactions controlling transcription

    Motivation: Predictive modelling of gene expression is a powerful framework for the in silico exploration of transcriptional regulatory interactions through the integration of high-throughput -omics data. A major limitation of previous approaches is their inability to handle conditional and synergistic interactions that emerge when collectively analysing genes subject to different regulatory mechanisms. This limitation reduces overall predictive power and thus the reliability of downstream biological inference. Results: We introduce an analytical modelling framework (TREEOME: tree of models of expression) that integrates epigenetic and transcriptomic data by separating genes into putative regulatory classes. Current predictive modelling approaches have found both DNA methylation and histone modification epigenetic data to provide little or no improvement in accuracy of prediction of transcript abundance despite, for example, distinct anti-correlation between mRNA levels and promoter-localised DNA methylation. To improve on this, in TREEOME we evaluate four possible methods of formulating gene-level DNA methylation metrics, which provide a foundation for identifying gene-level methylation events and subsequent differential analysis, whereas most previous techniques operate at the level of individual CpG dinucleotides. We demonstrate TREEOME by integrating gene-level DNA methylation (bisulfite-seq) and histone modification (ChIP-seq) data to accurately predict genome-wide mRNA transcript abundance (RNA-seq) for H1-hESC and GM12878 cell lines. Availability: TREEOME is implemented using open-source software and made available as a pre-configured bootable reference environment. All scripts and data presented in this study are available online at http://sourceforge.net/projects/budden2015treeome/. Comment: 14 pages, 6 figures

    Systematic analysis of lysine acetyltransferases


    Prevention and prediction of production instability of CHO-K1 cell lines by the examination of epigenetic mechanisms

    The CHO-K1 cell line is the most common expression system for therapeutic proteins in the pharmaceutical industry. For economic reasons, the cell lines and the vector design are subject to constant change to increase product quality and quantity. During cultivation, production cell lines are susceptible to decreasing productivity over time. Often the loss of production can be associated with a reduction of copy number and the silencing of transgenes. During cell line development, the most promising cell lines are cultivated in large batch culture. Consequently, the loss of a stable production cell line can be very cost-intensive. For this reason I developed different strategies to avoid reduced productivity. Instability of production cell lines can be predicted by the degree of CpG methylation of the driving promoter. Considering that DNA methylation is at the end of an epigenetic cascade and associated with the maintenance of the repressive state, I investigated the upstream signals of histone modifications, on the assumption that they would offer higher predictive power for production instability. For this reason I performed chromatin immunoprecipitation of the histone modifications H3K9me3 and H3K27me3 as repressive signals and H3ac as well as H3K4me3 as active marks. The accumulation of those signals was measured close to the hCMV-MIE at the beginning of the cultivation and was then compared with the loss of productivity over two months. I found that the degree of H3 acetylation (H3ac) correlated best with production stability. Furthermore, I was able to identify an H3ac threshold to exclude most of the unstable producers. In the second project I aimed to improve the vector design by considering epigenetic mechanisms. To this end I designed, on the one hand, a target-oriented histone acetyltransferase to enforce an open and active chromatin status at the transgene. 
On the other hand, I point-mutated methylation-susceptible CpGs within the hCMV-MIE to impede the maintenance of inactive heterochromatin formation. Remarkably, the C to G mutation located 179 bp upstream of the transcription start site resulted in very stable antibody-producing cell lines. In addition, the examination of cell pools expressing eGFP showed that G-179 promoter variants were less prone to general methylation and gene amplification, which illustrates the dominating effect of one single CpG in epigenetic mechanisms. The last project was performed to localize stable integration sites within the CHO-K1 genome. In so doing I could show that transfection leads predominantly to integration into inactive regions. Furthermore, I identified promising integration sites with a high potential to induce stable expression. However, those results are preliminary and must be viewed with caution. Further examination is needed to confirm these results. Considering the results of all three projects, I propose that the interplay of metabolic burden and selection pressure at an early time point of cultivation plays an important role in cell line development. Small alterations of selection pressure can lead to a decisive change of cell properties. Therefore, stable cells are less susceptible than weak producers. An increase in selection pressure leads to a compensatory effect through gene amplification in the unstable cell lines. The resulting adjustment of productivity masks the truly stable cells, which precludes the selection of the right cell lines. For this reason the selection pressure, the copy number, and the growth rate should be considered to minimize repressive effects.

    German abstract (translated): The CHO-K1 cell line is the most frequently used expression system for therapeutic proteins in the pharmaceutical industry. For economic reasons, the cell line and the vectors used are constantly being improved to increase product quality and quantity. During the cultivation phase, production cell lines tend to lose productivity. This loss of productivity is often associated with a reduction in copy number or the silencing of transgenes. During cell line development, promising cell lines are selected and cultivated at large scale. A loss of productivity in such cells is therefore very cost-intensive. To minimize this risk, I developed different strategies aimed at avoiding the loss of productivity. Our group had already been able to predict production instability from the degree of CpG methylation at the CMV promoter. DNA methylation is probably required to maintain an inactive chromatin state and stands at the end of an epigenetic cascade. In contrast, histone modifications appear earlier in the signaling cascade and could therefore be more informative about stability. For this reason I measured histone modifications at the hCMV-MIE promoter and enhancer at the beginning of the cultivation phase. H3K4me3 and H3ac are histone modifications associated with expression, whereas H3K27me3 and H3K9me3 are generally linked to an inactive chromatin state. The degree of the different modifications was compared with the loss of productivity arising over two months. The degree of histone H3 acetylation turned out to show the highest correlation with stability. Furthermore, I was able to define a threshold for H3 acetylation that allows most of the unstable production cell lines to be excluded. In the second project the vector design was modified with epigenetic aspects in mind. I constructed a targeted histone acetyltransferase to induce an open and active state in the chromatin region of the transgene. In addition, I mutated methylation-susceptible CpGs of the hCMV-MIE promoter and enhancer to prevent methylation and, as a consequence, an inactive chromatin state. The C to G conversion at base pair 179 upstream of the transcription start site led to remarkably stable antibody expression in clonal cell lines. With the same promoter variant I also observed lower methylation and gene amplification in eGFP-expressing cell pools. This demonstrated for the first time how sensitive the effect of a single CpG can be. In the last project, expression stability was examined as a function of the integration site of the transgene. I was able to show that the random integration performed by default either takes place preferentially in inactive regions of euchromatin, or that the gene amplification induced by selection pressure takes place mainly in heterochromatin. I further suspect that the two events occur in sequence, with the low activity of the transgene in inactive euchromatin promoting gene amplification in heterochromatin. By examining the chromatin landscape and the transgenes it contains, I was able to identify promising active regions that probably promote expression stability. However, these results would need to be confirmed in further experiments. Taking the three projects together, the interplay between the metabolic burden on the cell and the selection pressure in the early cultivation phase proves decisive for the cells' further development. Small changes in selection pressure can influence the cells substantially. Stably expressing cells are less vulnerable than weakly expressing cells. When the selection pressure is increased, the poorer production cell lines compensate for their disadvantage through gene amplification. The resulting adjustment of productivity masks the stable cells, which makes the right selection more difficult. For this reason the selection pressure, the copy number, and the growth rate should be included in the selection criteria in order to minimize repressive effects.

    Engineering Open Chromatin with Synthetic Pioneer Factors: Enhancing Mammalian Transgene Expression and Improving Cas9-Mediated Genome Editing in Closed Chromatin

    Abstract: Chromatin is the dynamic structure of proteins and nucleic acids into which eukaryotic genomes are organized. For those looking to engineer mammalian genomes, chromatin is both an opportunity and an obstacle. While chromatin provides another tool with which to control gene expression, regional density can lead to variability in genome editing efficiency by CRISPR/Cas9 systems. Many groups have attempted to de-silence chromatin to regulate genes and enhance DNA's accessibility to nucleases, but inconsistent results leave outstanding questions. Here, I test different types of activators to analyze the changes in chromatin features that result from chromatin opening, and to identify the critical biochemical features that support artificially generated open, transcriptionally active chromatin. I designed, built, and tested a panel of synthetic pioneer factors (SPiFs) to open condensed, repressive chromatin with the aims of 1) activating repressed transgenes in mammalian cells and 2) reversing the inhibitory effects of closed chromatin on Cas9-endonuclease activity. Pioneer factors are unique in their ability to bind DNA in closed chromatin. In order to repurpose this natural function, I designed SPiFs from a Gal4 DNA binding domain, which has inherent pioneer functionality, fused with chromatin-modifying peptides with distinct functions. SPiFs with transcriptional activation as their primary mechanism were able to reverse this repression and induced a stably active state. My work also revealed the active site from proto-oncogene MYB as a novel transgene activator. To determine if MYB could be used generally to restore transgene expression, I fused it to a deactivated Cas9 and targeted a silenced transgene in native heterochromatin. The resulting activator was able to reverse silencing and can be chemically controlled with a small molecule drug. Other SPiFs in my panel did not increase gene expression. However, pretreatment with several of these expression-neutral SPiFs increased Cas9-mediated editing in closed chromatin, suggesting a crucial difference between chromatin that is accessible and that which contains genes being actively transcribed. Understanding this distinction will be vital to the engineering of stable transgenic cell lines for product production and disease modeling, as well as therapeutic applications such as restoring epigenetic order to misregulated disease cells. Dissertation/Thesis: Doctoral Dissertation, Biological Design, 201