Comparative analysis of human transcription factors using ChIP-seq data and investigation of the complex topological arrangement of transcription factors on DNA
Transcription factors (TFs) are proteins that bind to specific regions of the genome and influence the expression of the corresponding genes. The ChIP-seq technique makes it possible to identify the genomic locations of these proteins, which are detected during the procedure as so-called peak regions. Because of the low resolution of the procedure, however, these regions indicate only the approximate position of the proteins. Identifying the base with the highest fragment coverage within such a region (the peak summit) helps to locate the actual DNA-protein interaction point.
We developed a summit-based technique with which the precise genomic position of a protein can be extracted from ChIP-seq data at near base-pair resolution. We used this method to study the CTCF/cohesin complex, which plays a major role in the formation of DNA loops. We determined the relative positions of the components of the complex with 1 base-pair accuracy and then displayed the results on a three-dimensional DNA model. This allowed us to infer the topology of the complex and to build our model of CTCF/cohesin-mediated DNA looping. In this model we used the so-called double ring theory, in which two chromatin rings take part in forming the DNA loop.
We then extended the summit-based determination of protein positions to other proteins. Our aim was to build a database that contains as many identified transcription factor binding sites as possible and allows the proteins correlated with them to be examined in different cell types. To this end we downloaded more than 3700 human ChIP-seq data sets from sequence databases and used the JASPAR CORE motif collection to locate the binding sites. The data were processed uniformly so that they would be comparable. As a result, we were able to identify the genomic positions of the cistromes genome-wide for 292 different transcription factors. At the identified binding sites one can examine which factors occur in the given regions, in which cell types, and at which preferred positions they lie relative to the motif center and to each other.
We built an interactive web interface for the database, through which the processed ChIP-seq data are not only publicly available and downloadable, but our results can also be browsed in several different views: http://summit.med.unideb.hu/summitdb/.
Transcription factors (TFs) are proteins that recognize specific regions of the genome and influence the expression of the corresponding genes. The role of these factors can be investigated by analysing ChIP-seq experiments. ChIP-seq is a powerful technique for the genome-wide measurement of transcription factor binding sites (TFBSs). We detect the genomic localization of specific proteins as so-called peak regions. The summit of a peak has the highest coverage in the region and is known to coincide more or less with the bound DNA element.
We developed a peak summit-based analysis method to identify the most likely location of the genomic contact region of the DNA-protein interactions at base-pair resolution from ChIP-seq data. We applied this method for analyzing the CTCF/cohesin complex, which holds together DNA loops. The relative positions of the constituents of the complex were determined with one-basepair estimated accuracy. Mapping the positions on a 3D model of DNA made it possible to deduce the approximate local topology of the complex that allowed us to predict how the CTCF/cohesin complex locks the DNA loops. As the positioning of the proteins was not compatible with previous models of loop closure, we proposed a plausible “double embrace” model in which the DNA loop is held together by two adjacent cohesin rings in such a way that the ring anchored by CTCF to one DNA duplex encircles the other DNA double helix and vice versa.
We then extended our analysis to other transcription factors and created a genome-wide transcription factor binding site database by combining the JASPAR non-redundant TF binding profile set with data from the bioinformatic analysis of more than 3700 ChIP-seq data sets downloaded from the SRA database. The data were processed uniformly, which makes the results comparable. The result of the data processing is not only a transcription factor binding site (TFBS) database but also a set of complete pipelines for deep ChIP-seq analysis. The pipelines automate data collection from the SRA database, proper naming of experiments, basic ChIP-seq analysis (peak and summit prediction), binding site prediction with motif optimization (identification of real transcription factor binding sites), and distance measurement for topological data extraction.
The database contains position information (distances measured in base pairs) on the ChIP-seq summits (targeting different factors in several cell lines) surrounding each identified transcription factor binding site. Our data are freely downloadable and viewable.
The ChIPSummit database has an intuitive web interface with six different views (Motif, PairShift, VennDiagram, Experiment, Genome and SNP) and is freely available for browsing at the following address: http://summit.med.unideb.hu/summitdb/.
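The abstract above defines the peak summit as the base with the highest fragment coverage within a peak region. A minimal Python sketch of that idea follows. It is an illustration only, not the authors' pipeline: the function name `peak_summit`, the half-open fragment intervals, and the choice of the leftmost position when several bases share the maximum are all assumptions made for this example.

```python
def peak_summit(fragments, region_start, region_end):
    """Return (position, coverage) of the best-covered base in a region.

    fragments: iterable of (start, end) half-open intervals (hypothetical input).
    The region is the half-open interval [region_start, region_end).
    Ties are resolved by taking the leftmost position (an assumption).
    """
    length = region_end - region_start
    # Difference array: +1 where a fragment starts, -1 where it ends.
    diff = [0] * (length + 1)
    for start, end in fragments:
        s = max(start, region_start)
        e = min(end, region_end)
        if s >= e:
            continue  # fragment lies outside the region
        diff[s - region_start] += 1
        diff[e - region_start] -= 1

    best_pos, best_cov, running = region_start, 0, 0
    for offset in range(length):
        running += diff[offset]  # running sum gives the coverage at this base
        if running > best_cov:
            best_pos, best_cov = region_start + offset, running
    return best_pos, best_cov


# Toy fragments, made up for illustration.
toy_fragments = [(100, 200), (150, 250), (180, 220)]
print(peak_summit(toy_fragments, 100, 250))
```

Real peak callers typically work on extended or shifted reads and may report the middle of a coverage plateau instead of its left edge, so the tie rule here should be read as a simplification.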
Motif oriented high-resolution analysis of ChIP-seq data reveals the topological order of CTCF and cohesin proteins on DNA
BACKGROUND: ChIP-seq provides a wealth of information on the approximate location of DNA-binding proteins genome-wide. It is known that the targeted motifs in most cases can be found at the peak centers. A high resolution mapping of ChIP-seq peaks could in principle allow the fine mapping of the protein constituents within protein complexes, but the current ChIP-seq analysis pipelines do not target the basepair resolution strand specific mapping of peak summits. RESULTS: The approach proposed here is based on i) locating regions that are bound by a sufficient number of proteins constituting a complex; ii) determining the position of the underlying motif using either a direct or a de novo motif search approach; and iii) determining the exact location of the peak summits with respect to the binding motif in a strand specific manner. We applied this method for analyzing the CTCF/cohesin complex, which holds together DNA loops. The relative positions of the constituents of the complex were determined with one-basepair estimated accuracy. Mapping the positions on a 3D model of DNA made it possible to deduce the approximate local topology of the complex that allowed us to predict how the CTCF/cohesin complex locks the DNA loops. As the positioning of the proteins was not compatible with previous models of loop closure, we proposed a plausible "double embrace" model in which the DNA loop is held together by two adjacent cohesin rings in such a way that the ring anchored by CTCF to one DNA duplex encircles the other DNA double helix and vice versa. CONCLUSIONS: A motif-centered, strand specific analysis of ChIP-seq data improves the accuracy of determining peak positions. 
If a genome contains a large number of binding sites for a given protein complex, such as transcription factor heterodimers or transcription factor/cofactor complexes, the relative position of the constituent proteins on the DNA can be established with an accuracy that allows one to deduce the local topology of the protein complex. The proposed high resolution mapping approach of ChIP-seq data is applicable for detecting the contact topology of DNA-binding protein complexes.
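Step iii) of the approach described above, locating the peak summit with respect to the binding motif in a strand-specific manner, can be sketched as a signed distance that is flipped for motifs on the minus strand. This is a hedged illustration rather than the published implementation: the function name `summit_offset`, the half-open motif coordinates, and the sign convention (positive means downstream of the motif center in motif orientation) are assumptions.

```python
def summit_offset(summit, motif_start, motif_end, strand):
    """Signed distance in bp from the motif center to the peak summit.

    The motif occupies the half-open interval [motif_start, motif_end).
    For a motif on the '-' strand the sign is flipped, so that offsets
    from both strands are expressed in the motif's own orientation.
    """
    if strand not in ("+", "-"):
        raise ValueError("strand must be '+' or '-'")
    center = (motif_start + motif_end - 1) / 2  # midpoint of the covered bases
    offset = summit - center
    return offset if strand == "+" else -offset


# Toy coordinates, made up for illustration: a 19 bp motif at 100..118.
print(summit_offset(115, 100, 119, "+"))
print(summit_offset(115, 100, 119, "-"))
```

Orienting every offset along the motif is what lets offsets from many binding sites be pooled: without the sign flip, sites on opposite strands would cancel each other out.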
ChIPSummitDB: a ChIP-seq-based database of human transcription factor binding sites and the topological arrangements of the proteins bound to them.
ChIP-seq reveals genomic regions where proteins, e.g. transcription factors (TFs), interact with DNA. A substantial fraction of these regions, however, do not contain the cognate binding site for the TF of interest. This phenomenon might be explained by protein-protein interactions and co-precipitation of interacting gene regulatory elements. We uniformly processed 3727 human ChIP-seq data sets and determined the cistrome of 292 TFs, as well as the distances between the TF binding motif centers and the ChIP-seq peak summits. ChIPSummitDB enables the analysis of ChIP-seq data using multiple approaches. The 292 cistromes and corresponding ChIP-seq peak sets can be browsed in GenomeView. Overlapping SNPs can be inspected in dbSNPView. Most importantly, the MotifView and PairShiftView pages show the average distance between motif centers and overlapping ChIP-seq peak summits and the distance distributions thereof, respectively. In addition to providing a comprehensive human TF binding site collection, the ChIPSummitDB database and web interface allow for the examination of the topological arrangement of TF complexes genome-wide. ChIPSummitDB is freely accessible at http://summit.med.unideb.hu/summitdb/. The database will be regularly updated and extended with newly available human and mouse ChIP-seq data sets.
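The two summaries mentioned above, an average motif-center-to-summit distance and the distribution of those distances, can be computed from a list of per-site offsets. The sketch below shows one plausible way to do so; it does not reproduce ChIPSummitDB's code or data. The function name `summarize_offsets`, the dictionary layout, and the experiment labels `factor_A` and `factor_B` are hypothetical, and the numbers are toy values.

```python
from collections import Counter
from statistics import mean


def summarize_offsets(offsets_by_experiment):
    """Summarize motif-center-to-summit offsets (bp) per experiment.

    Returns, for each experiment, the mean offset and a histogram
    (offset -> number of binding sites with that offset).
    """
    summary = {}
    for name, offsets in offsets_by_experiment.items():
        if not offsets:
            continue  # skip experiments without overlapping summits
        summary[name] = {
            "mean": mean(offsets),
            "histogram": Counter(offsets),
        }
    return summary


# Toy offsets, made up for illustration; not values from the database.
toy = {
    "factor_A": [-1, 0, 0, 1],
    "factor_B": [18, 20, 22],
}
for name, stats in summarize_offsets(toy).items():
    print(name, stats["mean"], sorted(stats["histogram"].items()))
```

Comparing the mean offsets of two factors at the same motif gives a rough estimate of how far apart they sit on the DNA, while the histogram shows how tightly each factor is positioned.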
PRMT1 and PRMT8 regulate retinoic acid-dependent neuronal differentiation with implications to neuropathology.
Retinoids are morphogens and have been implicated in cell fate commitment of embryonic stem cells (ESCs) to neurons. Their effects are mediated by RAR and RXR nuclear receptors. However, transcriptional cofactors required for cell- and gene-specific retinoid signaling are not known. Here we show that protein arginine methyl transferase (PRMT) 1 and 8 have key roles in determining retinoid-regulated gene expression and cellular specification in a multistage neuronal differentiation model of murine ESCs. PRMT1 acts as a selective modulator, providing the cells with a mechanism to reduce the potency of retinoid signals on regulatory "hotspots." PRMT8 is a retinoid receptor target gene itself and acts as a cell type-specific transcriptional coactivator of retinoid signaling at later stages of differentiation. Lack of either of them leads to reduced nuclear arginine methylation, dysregulated neuronal gene expression, and altered neuronal activity. Importantly, depletion of PRMT8 results in altered expression of a distinct set of genes, including markers of gliomagenesis. PRMT8 is almost entirely absent in human glioblastoma tissues. We propose that PRMT1 and PRMT8 serve as a rheostat of retinoid signaling to determine neuronal cell specification in a context-dependent manner and might also be relevant in the development of human brain malignancy.
PRMT1 and PRMT8 regulate retinoic acid-dependent neuronal differentiation with implications to neuropathology
Retinoids are morphogens and have been implicated in cell fate commitment of embryonic stem cells (ESCs) to neurons. Their effects are mediated by RAR and RXR nuclear receptors. However, transcriptional co-factors required for cell and gene-specific retinoid signaling are not known. Here we show that Protein aRginine Methyl Transferase (PRMT) 1 and 8 have key roles in determining retinoid regulated gene expression and cellular specification in a multistage neuronal differentiation of murine ESCs. PRMT1 acts as a selective modulator, providing the cells with a mechanism to reduce the potency of retinoid signals on regulatory "hotspots". PRMT8 is a retinoid receptor target gene itself and acts as a cell type specific transcriptional co-activator of retinoid signaling at later stages of differentiation. Lack of either of them leads to reduced nuclear arginine methylation, dysregulated neuronal gene expression and altered neuronal activity. Importantly, depletion of PRMT8 results in altered expression of a distinct set of genes, including markers of gliomagenesis. PRMT8 is almost entirely absent in human glioblastoma tissues. We propose that PRMT1 and PRMT8 serve as a rheostat of retinoid signaling to determine neuronal cell specification in a context-dependent manner, and might also be relevant in the development of human brain malignancy.
Prolonged activity of the transposase helper may raise safety concerns during DNA transposon-based gene therapy
DNA transposon-based gene delivery vectors represent a promising new branch of randomly integrating vector development for gene therapy. For the side-by-side evaluation of the piggyBac and Sleeping Beauty systems (the only DNA transposons currently employed in clinical trials) during therapeutic intervention, we treated the mouse model of tyrosinemia type I with liver-targeted gene delivery using both transposon vectors. For genome-wide mapping of transposon insertion sites we developed a new next-generation sequencing procedure called streptavidin-based enrichment sequencing, which allowed us to identify approximately one million integration sites for both systems. We revealed that a high proportion of piggyBac integrations are clustered in hot regions and found that they are frequently recurring at the same genomic positions among treated animals, indicating that the genome-wide distribution of Sleeping Beauty-generated integrations is closer to random. We also revealed that the piggyBac transposase protein exhibits prolonged activity, which predicts the risk of oncogenesis by generating chromosomal double-strand breaks. Safety concerns associated with prolonged transpositional activity draw attention to the importance of squeezing the active state of the transposase enzymes into a narrower time window.
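The observation above, that integrations recur at the same genomic positions among treated animals, amounts to counting in how many animals each insertion site was seen. A minimal sketch of such a count follows. It does not reproduce the study's analysis: the function name `recurrent_sites`, the `(chromosome, position, strand)` site representation, the `min_animals` threshold, and the animal labels are all assumptions made for illustration.

```python
from collections import Counter


def recurrent_sites(sites_by_animal, min_animals=2):
    """Return insertion sites observed in at least `min_animals` animals.

    sites_by_animal maps an animal label to a list of sites, each given
    as a (chromosome, position, strand) tuple. A site is counted once
    per animal, however many reads support it in that animal.
    """
    counts = Counter()
    for sites in sites_by_animal.values():
        counts.update(set(sites))  # de-duplicate within one animal
    return {site: n for site, n in counts.items() if n >= min_animals}


# Toy insertion sites, made up for illustration.
toy = {
    "mouse_1": [("chr1", 100, "+"), ("chr2", 50, "-")],
    "mouse_2": [("chr1", 100, "+"), ("chr1", 100, "+")],
    "mouse_3": [("chr1", 100, "+"), ("chr2", 50, "-"), ("chr3", 7, "+")],
}
print(recurrent_sites(toy))
```

Under a near-random distribution, few sites would be expected to pass the threshold, so the fraction of recurrent sites is one simple way to contrast two vector systems.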
Analysis of the β-lactamases of Staphylococcus aureus
We studied the β-lactamases of borderline methicillin-resistant Staphylococcus aureus strains. In these strains, resistance is caused by the overproduction of β-lactamases and, presumably, by the production of methicillinases. Our group used two-dimensional gel electrophoresis to analyze the β-lactamases found in the fermentation broth and membrane of several S. aureus strains. We detected the positions of the enzymes on the gels by renaturing them and applying nitrocefin, a chromogenic β-lactam compound, and then identified them from parallel gels by in-gel digestion of the corresponding protein spot followed by MALDI-TOF analysis.
Motif oriented high-resolution analysis of ChIP-seq data reveals the topological order of CTCF and cohesin proteins on DNA
BACKGROUND: ChIP-seq provides a wealth of information on the approximate location of DNA-binding proteins genome-wide. It is known that the targeted motifs in most cases can be found at the peak centers. A high resolution mapping of ChIP-seq peaks could in principle allow the fine mapping of the protein constituents within protein complexes, but the current ChIP-seq analysis pipelines do not target the basepair resolution strand specific mapping of peak summits. RESULTS: The approach proposed here is based on i) locating regions that are bound by a sufficient number of proteins constituting a complex; ii) determining the position of the underlying motif using either a direct or a de novo motif search approach; and iii) determining the exact location of the peak summits with respect to the binding motif in a strand specific manner. We applied this method for analyzing the CTCF/cohesin complex, which holds together DNA loops. The relative positions of the constituents of the complex were determined with one-basepair estimated accuracy. Mapping the positions on a 3D model of DNA made it possible to deduce the approximate local topology of the complex that allowed us to predict how the CTCF/cohesin complex locks the DNA loops. As the positioning of the proteins was not compatible with previous models of loop closure, we proposed a plausible "double embrace" model in which the DNA loop is held together by two adjacent cohesin rings in such a way that the ring anchored by CTCF to one DNA duplex encircles the other DNA double helix and vice versa. CONCLUSIONS: A motif-centered, strand specific analysis of ChIP-seq data improves the accuracy of determining peak positions.
If a genome contains a large number of binding sites for a given protein complex, such as transcription factor heterodimers or transcription factor/cofactor complexes, the relative position of the constituent proteins on the DNA can be established with an accuracy that allows one to deduce the local topology of the protein complex. The proposed high resolution mapping approach of ChIP-seq data is applicable for detecting the contact topology of DNA-binding protein complexes.