31 research outputs found

    RECLU: a pipeline to discover reproducible transcriptional start sites and their alternative regulation using capped analysis of gene expression (CAGE)

    BACKGROUND: Next-generation sequencing-based technologies are being used extensively to study transcriptomes. Among these, cap analysis of gene expression (CAGE) specializes in detecting the most 5' ends of RNA molecules. After the sequenced reads are mapped back to a reference genome, CAGE data highlight the transcriptional start sites (TSSs) and their usage at single-nucleotide resolution. RESULTS: We propose a pipeline to group single-nucleotide TSSs into larger reproducible peaks and compare their usage across biological states. Importantly, our pipeline discovers broad peaks as well as the fine structure of individual transcriptional start sites embedded within them. We assess the performance of our approach on a large CAGE dataset including 156 primary cell types and two cell lines with biological replicas. We demonstrate that genes have complicated structures of transcription initiation events. In particular, we discover that narrow peaks embedded in broader regions of transcriptional activity can be differentially used even if the larger region is not. CONCLUSIONS: By examining the reproducible fine-scale organization of TSSs we can detect many differentially regulated peaks undetected by previous approaches.

    On-the-fly selection of cell-specific enhancers, genes, miRNAs and proteins across the human body using SlideBase

    Genomics consortia have produced large datasets profiling the expression of genes, microRNAs, enhancers and more across human tissues or cells. There is a need for intuitive tools to select the subsets of such data that are most relevant for specific studies. To this end, we present SlideBase, a web tool which offers a new way of selecting genes, promoters, enhancers and microRNAs that are preferentially expressed/used in a specified set of cells/tissues, based on the use of interactive sliders. With the help of sliders, SlideBase enables users to define custom expression thresholds for individual cell types/tissues, producing sets of genes, enhancers etc. which satisfy these constraints. Changes in slider settings result in simultaneous changes in the selected sets, updated in real time. SlideBase is linked to major databases from genomics consortia, including FANTOM, GTEx, The Human Protein Atlas and BioGPS. Database URL: http://slidebase.binf.ku.d
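
The slider-based selection described in this abstract can be pictured as a per-tissue range filter. The following is only a minimal sketch of that idea, not SlideBase's actual implementation: the function name `select_features`, the gene names, the tissue names and the expression values are all hypothetical and chosen purely for illustration.

```python
from typing import Dict, List, Tuple

# Hypothetical expression table: feature -> {tissue: expression value}
Expression = Dict[str, Dict[str, float]]
# Hypothetical slider settings: tissue -> (lower bound, upper bound)
Sliders = Dict[str, Tuple[float, float]]


def select_features(expression: Expression, sliders: Sliders) -> List[str]:
    """Return the features whose expression lies inside every slider range."""
    selected = []
    for feature, profile in expression.items():
        # A feature passes only if all slider constraints hold at once;
        # a tissue missing from the profile is treated as expression 0.0.
        if all(low <= profile.get(tissue, 0.0) <= high
               for tissue, (low, high) in sliders.items()):
            selected.append(feature)
    return sorted(selected)


# Toy data (made up for this example, not taken from any database).
expression = {
    "GENE_A": {"liver": 90.0, "brain": 2.0},
    "GENE_B": {"liver": 40.0, "brain": 35.0},
    "GENE_C": {"liver": 5.0, "brain": 80.0},
}

# "High in liver, low in brain" expressed as two slider ranges.
sliders = {"liver": (50.0, 100.0), "brain": (0.0, 10.0)}
liver_specific = select_features(expression, sliders)
```

Moving a slider corresponds to calling the function again with a changed range, which is how the selected set would update as the settings change.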

    Building promoter aware transcriptional regulatory networks using siRNA perturbation and deepCAGE

    Perturbation and time-course data sets, in combination with computational approaches, can be used to infer transcriptional regulatory networks which ultimately govern the developmental pathways and responses of cells. Here, we individually knocked down the four transcription factors PU.1, IRF8, MYB and SP1 in the human monocyte leukemia THP-1 cell line and profiled the genome-wide transcriptional response of individual transcription starting sites using deep sequencing based Cap Analysis of Gene Expression. From the proximal promoter regions of the responding transcription starting sites, we derived de novo binding-site motifs, characterized their biological function and constructed a network. We found a previously described composite motif for PU.1 and IRF8 that explains the overlapping set of transcriptional responses upon knockdown of either factor

    Age-associated DNA methylation changes in immune genes, histone modifiers and chromatin remodeling factors within 5 years after birth in human blood leukocytes

    Background: Age-related changes in DNA methylation occurring in blood leukocytes during early childhood may reflect epigenetic maturation. We hypothesized that some of these changes involve gene networks of critical relevance in leukocyte biology and conducted a prospective study to elucidate the dynamics of DNA methylation. Serial blood samples were collected at 3, 6, 12, 24, 36, 48 and 60 months after birth in ten healthy girls born in Finland and participating in the Type 1 Diabetes Prediction and Prevention Study. DNA methylation was measured using the HumanMethylation450 BeadChip. Results: After filtering for the presence of polymorphisms and cell-lineage-specific signatures, 794 CpG sites showed significant DNA methylation differences as a function of age in all children (41.6% age-methylated and 58.4% age-demethylated, Bonferroni-corrected P value <0.01). Age-methylated CpGs were more frequently located in gene bodies and within +5 to +50 kilobases (kb) of transcription start sites (TSS) and enriched in developmental, neuronal and plasma membrane genes. Age-demethylated CpGs were associated with promoters and DNAse-I hypersensitivity sites, located within −5 to +5 kb of the nearest TSS and enriched in genes related to immunity, antigen presentation, the polycomb-group protein complex and cytoplasm. Conclusions: This study reveals that susceptibility loci for complex inflammatory diseases (for example, IRF5, NOD2, and PTGER4) and genes encoding histone modifiers and chromatin remodeling factors (for example, HDAC4, KDM2A, KDM2B, JARID2, ARID3A, and SMARCD3) undergo DNA methylation changes in leukocytes during early childhood. These results open new perspectives to understand leukocyte maturation and provide a catalogue of CpG sites that may need to be corrected for age effects when performing DNA methylation studies in children.

    The Constrained Maximal Expression Level Owing to Haploidy Shapes Gene Content on the Mammalian X Chromosome.

    X chromosomes are unusual in many regards, not least of which is their nonrandom gene content. The causes of this bias are commonly discussed in the context of sexual antagonism and the avoidance of activity in the male germline. Here, we examine the notion that, at least in some taxa, functionally biased gene content may more profoundly be shaped by limits imposed on gene expression owing to haploid expression of the X chromosome. Notably, if the X, as in primates, is transcribed at rates comparable to the ancestral rate (per promoter) prior to the X chromosome formation, then the X is not a tolerable environment for genes with very high maximal net levels of expression, owing to transcriptional traffic jams. We test this hypothesis using The Encyclopedia of DNA Elements (ENCODE) and data from the Functional Annotation of the Mammalian Genome (FANTOM5) project. As predicted, the maximal expression of human X-linked genes is much lower than that of genes on autosomes: on average, maximal expression is three times lower on the X chromosome than on autosomes. Similarly, autosome-to-X retroposition events are associated with lower maximal expression of retrogenes on the X than seen for X-to-autosome retrogenes on autosomes. Also as expected, X-linked genes have a lesser degree of increase in gene expression than autosomal ones (compared to the human/chimpanzee common ancestor) if highly expressed, but not if lowly expressed. The traffic jam model also explains the known lower breadth of expression for genes on the X (and the Z of birds), as genes with broad expression are, on average, those with high maximal expression. As then further predicted, highly expressed tissue-specific genes are also rare on the X and broadly expressed genes on the X tend to be lowly expressed, both indicating that the trend is shaped by the maximal expression level, not the breadth of expression per se. Importantly, a limit to the maximal expression level explains biased tissue of expression profiles of X-linked genes. Tissues whose tissue-specific genes are very highly expressed (e.g., secretory tissues, tissues abundant in structural proteins) are also tissues in which gene expression is relatively rare on the X chromosome. These trends cannot be fully accounted for in terms of alternative models of biased expression. In conclusion, the notion that it is hard for genes on the Therian X to be highly expressed, owing to transcriptional traffic jams, provides a simple yet robustly supported rationale of many peculiar features of X's gene content, gene expression, and evolution.
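
The central comparison in this abstract, maximal expression of X-linked versus autosomal genes, can be illustrated with a short sketch. This is not the authors' analysis: the function name `max_expression_by_group`, the gene labels, the chromosome assignments and all values below are hypothetical, and the real study works on ENCODE and FANTOM5 data rather than a toy table.

```python
from statistics import mean
from typing import Dict, List


def max_expression_by_group(profiles: Dict[str, List[float]],
                            chromosome_of: Dict[str, str]) -> Dict[str, float]:
    """Average each gene's maximal expression across tissues, split into X vs autosome."""
    groups: Dict[str, List[float]] = {"X": [], "autosome": []}
    for gene, values in profiles.items():
        group = "X" if chromosome_of[gene] == "X" else "autosome"
        # The quantity of interest is the peak level over all tissues,
        # not the mean level or the number of tissues expressing the gene.
        groups[group].append(max(values))
    return {name: mean(peaks) for name, peaks in groups.items() if peaks}


# Toy per-tissue expression values (made up for illustration).
profiles = {
    "g1": [1.0, 4.0, 2.0],
    "g2": [10.0, 30.0],
    "g3": [6.0, 2.0],
    "g4": [5.0, 8.0],
}
chromosome_of = {"g1": "X", "g2": "chr1", "g3": "X", "g4": "chr2"}

summary = max_expression_by_group(profiles, chromosome_of)
ratio = summary["autosome"] / summary["X"]
```

The ratio of the two group averages is the kind of summary behind a statement such as "maximal expression is three times lower on the X"; the toy numbers here are not meant to reproduce that figure.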

    Characterization of the human RFX transcription factor family by regulatory and target gene analysis

    Background: Evolutionarily conserved RFX transcription factors (TFs) regulate their target genes through a DNA sequence motif called the X-box. Thereby they regulate cellular specialization and terminal differentiation. Here, we provide a comprehensive analysis of all eight human RFX genes (RFX1–8), their spatial and temporal expression profiles, potential upstream regulators and target genes. Results: We extracted all known human RFX1–8 gene expression profiles from the FANTOM5 database derived from transcription start site (TSS) activity as captured by Cap Analysis of Gene Expression (CAGE) technology. RFX genes are broadly (RFX1–3, RFX5, RFX7) and specifically (RFX4, RFX6) expressed in different cell types, with high expression in four organ systems: immune system, gastrointestinal tract, reproductive system and nervous system. Tissue type specific expression profiles link defined RFX family members with the target gene batteries they regulate. We experimentally confirmed novel TSS locations and characterized the previously undescribed RFX8 to be lowly expressed. RFX tissue and cell type specificity arises mainly from differences in TSS architecture. RFX transcript isoforms lacking a DNA binding domain (DBD) open up new possibilities for combinatorial target gene regulation. Our results favor a new grouping of the RFX family based on protein domain composition. We uncovered and experimentally confirmed the TFs SP2 and ESR1 as upstream regulators of specific RFX genes. Using TF binding profiles from the JASPAR database, we determined relevant patterns of X-box motif positioning with respect to gene TSS locations of human RFX target genes. Conclusions: The wealth of data we provide will serve as the basis for precisely determining the roles RFX TFs play in human development and disease.