198 research outputs found
ADGO 2.0: interpreting microarray data and list of genes using composite annotations
ADGO 2.0 is a web-based tool that provides composite interpretations for microarray data comparing two sample groups as well as lists of genes from diverse sources of biological information. Some other tools also incorporate composite annotations solely for interpreting lists of genes but usually provide highly redundant information. This new version has the following additional features: first, it provides multiple gene set analysis methods for microarray inputs as well as enrichment analyses for lists of genes. Second, it screens redundant composite annotations when generating and prioritizing them. Third, it incorporates union and subtracted sets as well as intersection sets. Lastly, users can upload their own gene sets (e.g. predicted miRNA targets) to generate and analyze new composite sets. The first two features are unique to ADGO 2.0. Using our tool, we demonstrate analyses of a microarray dataset and a list of genes for T-cell differentiation. The new ADGO is available at http://www.btool.org/ADGO2
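The composite sets described above (intersection, union and subtracted sets) and a list-of-genes enrichment test can be sketched as follows. This is only an illustration and not ADGO's implementation: the abstract does not say which enrichment statistic the tool uses, so the upper-tail hypergeometric test here is an assumption, and the function names `composite_sets` and `enrichment_p` are hypothetical.

```python
from math import comb


def composite_sets(a, b):
    """Build composite annotations from two gene sets (Python sets)."""
    return {
        "intersection": a & b,
        "union": a | b,
        "a_minus_b": a - b,  # subtracted set: in a but not in b
        "b_minus_a": b - a,  # subtracted set: in b but not in a
    }


def enrichment_p(gene_list, gene_set, universe):
    """Upper-tail hypergeometric p-value (assumed test, not taken from ADGO).

    Probability of seeing at least the observed overlap between gene_list
    and gene_set when gene_list is drawn at random from universe.
    """
    N = len(universe)                        # all genes considered
    K = len(gene_set & universe)             # genes carrying the annotation
    n = len(gene_list & universe)            # size of the input list
    k = len(gene_list & gene_set & universe) # observed overlap
    tail = sum(comb(K, i) * comb(N - K, n - i) for i in range(k, min(K, n) + 1))
    return tail / comb(N, n)
```

A composite set produced by `composite_sets` can be passed to `enrichment_p` as `gene_set`, which is one way to score an intersection or subtracted set against a list of genes.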
Interferon regulatory factors are transcriptional regulators of adipogenesis
We have sought to identify transcriptional pathways in adipogenesis using an integrated experimental and computational approach. Here, we employ high-throughput DNase hypersensitivity analysis to find regions of altered chromatin structure surrounding key adipocyte genes. Regions that display differentiation-dependent changes in hypersensitivity were used to predict binding sites for proteins involved in adipogenesis. A high-scoring example was a binding motif for interferon regulatory factor (IRF) family members. Expression of all nine mammalian IRF mRNAs is regulated during adipogenesis, and several bind to the identified motifs in a differentiation-dependent manner. Furthermore, several IRF proteins repress differentiation. This analysis suggests an important role for IRF proteins in adipocyte biology and demonstrates the utility of this approach in identifying cis- and trans-acting factors not previously suspected to participate in adipogenesis
Neural Potential of a Stem Cell Population in the Hair Follicle
The bulge region of the hair follicle serves as a repository for epithelial stem cells that can regenerate the follicle in each hair growth cycle and contribute to epidermis regeneration upon injury. Here we describe a population of multipotential stem cells in the hair follicle bulge region; these cells can be identified by fluorescence in transgenic nestin-GFP mice. The morphological features of these cells suggest that they maintain close associations with each other and with the surrounding niche. Upon explantation, these cells can give rise to neurosphere-like structures in vitro. When these cells are permitted to differentiate, they produce several cell types, including cells with neuronal, astrocytic, oligodendrocytic, smooth muscle, adipocytic, and other phenotypes. Furthermore, upon implantation into the developing nervous system of chick, these cells generate neuronal cells in vivo. We used transcriptional profiling to assess the relationship between these cells and embryonic and postnatal neural stem cells and to compare them with other stem cell populations of the bulge. Our results show that nestin-expressing cells in the bulge region of the hair follicle have stem cell-like properties, are multipotent, and can effectively generate cells of neural lineage in vitro and in vivo
Preferential Nucleosome Occupancy at High Values of DNA Helical Rise
Nucleosomes are the basic structural units of eukaryotic chromatin and play a key role in the regulation of gene expression. Nucleosome formation depends on several factors, including properties of the sequence itself, but also physical constraints and epigenetic factors such as chromatin-remodelling enzymes. In this view, a sequence-dependent approach is able to capture a general tendency of a region to bind a histone octamer. A reference data set of positioned nucleosomes of Saccharomyces cerevisiae was used to study the role of DNA helical rise in histone–DNA interaction. Genomic sequences were transformed into arrays of helical rise values by a tetranucleotide code and then turned into profiles of mean helical rise values. These profiles resemble maps of nucleosome occupancy, suggesting that intrinsic histone–DNA interactions are linked to helical rise. The obtained results show that preferential nucleosome occupancy occurs where the mean helical rise reaches its largest values. Mean helical rise profiles obtained by using maps of positioned nucleosomes of the Drosophila melanogaster and Plasmodium falciparum genomes, as well as Homo sapiens chromosome 20 confirm that nucleosomes are mainly located where the mean helical rise reaches its largest values
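The two-step transformation described above (sequence to helical rise array, then array to mean helical rise profile) might look roughly like the sketch below. The actual tetranucleotide code is not given in the abstract, so the lookup table is passed in as a parameter; the function names, the `default` handling and the window size are assumptions made for illustration.

```python
def rise_array(seq, table, default=None):
    """Map each overlapping tetranucleotide in seq to a helical rise value.

    table is a dict from 4-letter strings to rise values; it must be
    supplied by the caller because the real code is not reproduced here.
    Tetranucleotides missing from the table get `default`.
    """
    seq = seq.upper()
    return [table.get(seq[i:i + 4], default) for i in range(len(seq) - 3)]


def mean_profile(values, window):
    """Sliding-window mean over a list of numeric rise values."""
    if window <= 0 or window > len(values):
        return []
    running = sum(values[:window])
    profile = [running / window]
    for i in range(window, len(values)):
        # Slide the window by one position: add the new value, drop the oldest.
        running += values[i] - values[i - window]
        profile.append(running / window)
    return profile
```

Under this sketch, positions where `mean_profile` reaches its largest values would be the candidate regions of preferential nucleosome occupancy.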
Inherent Signals in Sequencing-Based Chromatin-ImmunoPrecipitation Control Libraries
The growth of sequencing-based Chromatin Immuno-Precipitation studies calls for a more in-depth understanding of the nature of the technology and of the resultant data to reduce false positives and false negatives. Control libraries are typically constructed to complement such studies in order to mitigate the effect of systematic biases that might be present in the data. In this study, we explored multiple control libraries to obtain a better understanding of what they truly represent. First, we analyzed the genome-wide profiles of various sequencing-based libraries at a low resolution of 1 Mbp, and compared them with each other as well as against aCGH data. We found that copy number has a major influence on both ChIP-enriched and control libraries. Following that, we inspected the repeat regions to assess the extent of mapping bias. Next, significantly tag-rich 5 kbp regions were identified and associated with various genomic landmarks. For instance, we discovered that gene boundaries were surprisingly enriched with sequenced tags. Further, profiles of different cell types were noticeably distinct even though the cell types were related. We found that control libraries bear traces of systematic biases. The biases can be attributed to genomic copy number, inherent sequencing bias, plausible mapping ambiguity, and cell-type specific chromatin structure. Our results suggest that careful analysis of control libraries can reveal promising biological insights
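The low-resolution genome-wide profiles mentioned above amount to counting mapped tags in fixed-size bins. A minimal sketch, assuming tags are available as `(chromosome, position)` pairs; the function name `bin_counts` and the input format are hypothetical and not taken from the study.

```python
from collections import Counter


def bin_counts(tags, bin_size=1_000_000):
    """Count sequenced tags per fixed-size genomic bin.

    tags: iterable of (chromosome, 0-based position) pairs.
    Returns a Counter keyed by (chromosome, bin index).
    """
    counts = Counter()
    for chrom, pos in tags:
        counts[(chrom, pos // bin_size)] += 1
    return counts
```

Calling the same function with `bin_size=5_000` gives the finer 5 kbp resolution used for locating tag-rich regions; deciding which bins are significantly tag-rich would need a statistical model that the abstract does not specify.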
Nucleosome eviction from MHC class II promoters controls positioning of the transcription start site
Nucleosome depletion at transcription start sites (TSS) has been documented genome-wide in multiple eukaryotic organisms. However, the mechanisms that mediate this nucleosome depletion and its functional impact on transcription remain largely unknown. We have studied these issues at human MHC class II (MHCII) genes. Activation-induced nucleosome free regions (NFR) encompassing the TSS were observed at all MHCII genes. Nucleosome depletion was exceptionally strong, attaining over 250-fold, at the promoter of the prototypical HLA-DRA gene. The NFR was induced primarily by the transcription factor complex that assembles on the conserved promoter-proximal enhancer situated upstream of the TSS. Functional analyses performed in the context of native chromatin demonstrated that displacing the NFR without altering the sequence of the core promoter induced a shift in the position of the TSS. The NFR thus appears to play a critical role in transcription initiation because it directs correct TSS positioning in vivo. Our results provide support for a novel mechanism in transcription initiation whereby the position of the TSS is controlled by nucleosome eviction rather than by promoter sequence
The Histone Database: an integrated resource for histones and histone fold-containing proteins
Eukaryotic chromatin is composed of DNA and protein components—core histones—that act to compactly pack the DNA into nucleosomes, the fundamental building blocks of chromatin. These nucleosomes are connected to adjacent nucleosomes by linker histones. Nucleosomes are highly dynamic and, through various core histone post-translational modifications and incorporation of diverse histone variants, can serve as epigenetic marks to control processes such as gene expression and recombination. The Histone Sequence Database is a curated collection of sequences and structures of histones and non-histone proteins containing histone folds, assembled from major public databases. Here, we report a substantial increase in the number of sequences and taxonomic coverage for histone and histone fold-containing proteins available in the database. Additionally, the database now contains an expanded dataset that includes archaeal histone sequences. The database also provides comprehensive multiple sequence alignments for each of the four core histones (H2A, H2B, H3 and H4), the linker histones (H1/H5) and the archaeal histones. The database also includes current information on solved histone fold-containing structures. The Histone Sequence Database is an inclusive resource for the analysis of chromatin structure and function focused on histones and histone fold-containing proteins
Bivalent-Like Chromatin Markers Are Predictive for Transcription Start Site Distribution in Human
Deep sequencing of 5′ capped transcripts has revealed a variety of transcription initiation patterns, from narrow, focused promoters to wide, broad promoters. Attempts have already been made to model empirically classified patterns, but virtually no quantitative models for transcription initiation have been reported. Even though both genetic and epigenetic elements have been associated with such patterns, the organization of regulatory elements is largely unknown. Here, linear regression models were derived from a pool of regulatory elements, including genomic DNA features, nucleosome organization, and histone modifications, to predict the distribution of transcription start sites (TSS). Importantly, models including both active and repressive histone modification markers, e.g. H3K4me3 and H4K20me1, were consistently found to be much more predictive than models with only single-type histone modification markers, indicating the possibility of “bivalent-like” epigenetic control of transcription initiation. The nucleosome positions are proposed to be coded in the active component of such bivalent-like histone modification markers. Finally, we demonstrated that models trained on one cell type could successfully predict TSS distribution in other cell types, suggesting that these models may have a broader application range
Integrative genomic analysis of human ribosomal DNA
The transcription of ribosomal RNA (rRNA) is critical to life. Despite its importance, ribosomal DNA (rDNA) is not included in current genome assemblies and, consequently, genomic analyses to date have excluded rDNA. Here, we show that short sequence reads can be aligned to a genome assembly containing a single rDNA repeat. Integrated analysis of ChIP-seq, DNase-seq, MNase-seq and RNA-seq data reveals several novel findings. First, the coding region of active rDNA is contained within nucleosome-depleted open chromatin that is highly transcriptionally active. Second, histone modifications are located not only at the rDNA promoter but also at novel sites within the intergenic spacer. Third, the distributions of active modifications are more similar within and between different cell types than repressive modifications. Fourth, UBF, a positive regulator of rRNA transcription, binds to sites throughout the genome. Lastly, the insulator binding protein CTCF associates with the spacer promoter of rDNA, suggesting that transcriptional insulation plays a role in regulating the transcription of rRNA. Taken together, these analyses confirm and expand the results of previous ChIP studies of rDNA and provide novel avenues for exploration of chromatin-mediated regulation of rDNA
Transcription factor site dependencies in human, mouse and rat genomes
Background: It is known that transcription factors frequently act together to regulate gene expression in eukaryotes. In this paper we describe a computational analysis of transcription factor site dependencies in human, mouse and rat genomes. Results: Our approach for quantifying tendencies of transcription factor binding sites to co-occur is based on a binding site scoring function that incorporates dependencies between positions and uses information about the structural class of each transcription factor (major/minor groove binder); we also considered the possible implications of varying GC content of the sequences. Significant tendencies (dependencies) were detected by non-parametric statistical methodology (permutation tests). The results were evaluated in several ways: reports from the literature (many of the significant dependencies between transcription factors have previously been confirmed experimentally); a check that dependencies between transcription factors are not biased by similarities in their DNA-binding sites; the finding that the number of dependent transcription factors belonging to the same functional and structural class is significantly higher than would be expected by chance; and supporting evidence from GO clustering of target genes. Based on dependencies between two transcription factor binding sites (second-order dependencies), it is possible to construct higher-order dependencies (networks). Moreover, results on transcription factor binding site dependencies can be used to predict groups of dependent transcription factors on a given promoter sequence. Our results, as well as a scanning tool for predicting groups of dependent transcription factor binding sites, are available on the Internet. Conclusion: We show that the computational analysis of transcription factor site dependencies is a valuable complement to experimental approaches for discovering transcription regulatory interactions and networks. Scanning promoter sequences with dependent groups of transcription factor binding sites improves the quality of transcription factor predictions
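The permutation testing mentioned in the abstract above can be illustrated with a simple co-occurrence test. This is a sketch under stated assumptions, not the authors' procedure: it assumes each promoter is reduced to two presence/absence flags (site A, site B), uses the count of promoters containing both sites as the statistic, and the name `cooccurrence_pvalue` and its parameters are hypothetical.

```python
import random


def cooccurrence_pvalue(has_a, has_b, n_perm=1000, seed=0):
    """One-sided permutation p-value for co-occurrence of two binding sites.

    has_a, has_b: equal-length lists of booleans, one entry per promoter.
    The labels in has_b are shuffled across promoters to build the null
    distribution of the co-occurrence count.
    """
    rng = random.Random(seed)
    observed = sum(a and b for a, b in zip(has_a, has_b))
    shuffled = list(has_b)
    hits = 0
    for _ in range(n_perm):
        rng.shuffle(shuffled)
        if sum(a and b for a, b in zip(has_a, shuffled)) >= observed:
            hits += 1
    # Add-one correction so the estimated p-value is never exactly zero.
    return (hits + 1) / (n_perm + 1)
```

Shuffling only one of the two label vectors keeps the marginal frequency of each site fixed, so a small p-value reflects co-occurrence beyond what the individual site frequencies would produce. It does not account for GC content or binding-site similarity, which the study treats separately.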