34 research outputs found

    Inferring regulatory signal from genomic data

    Ph.D. thesis (Doctor of Philosophy)

    Inherent Signals in Sequencing-Based Chromatin-ImmunoPrecipitation Control Libraries

    The growth of sequencing-based chromatin immunoprecipitation (ChIP) studies calls for a more in-depth understanding of the technology and of the resulting data in order to reduce false positives and false negatives. Control libraries are typically constructed to complement such studies and to mitigate the effect of systematic biases that might be present in the data. In this study, we explored multiple control libraries to better understand what they truly represent. First, we analyzed the genome-wide profiles of various sequencing-based libraries at a low resolution of 1 Mbp, and compared them with each other as well as against aCGH data. We found that copy number has a major influence on both ChIP-enriched and control libraries. Following that, we inspected the repeat regions to assess the extent of mapping bias. Next, significantly tag-rich 5 kbp regions were identified and associated with various genomic landmarks. For instance, we discovered that gene boundaries were surprisingly enriched with sequenced tags. Further, profiles from different cell types were noticeably distinct even though the cell types were related. We found that control libraries bear traces of systematic biases. The biases can be attributed to genomic copy number, inherent sequencing bias, plausible mapping ambiguity, and cell-type-specific chromatin structure. Our results suggest that careful analysis of control libraries can reveal promising biological insights.
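The low-resolution profile comparison described in this abstract can be illustrated with a minimal sketch. It is not the authors' pipeline: the function names (`bin_tags`, `pearson`, `profile_correlation`), the `(chromosome, position)` tag representation, and the choice of Pearson correlation as the similarity measure are all assumptions made for illustration. Bins that are empty in both libraries are simply left out of the comparison.

```python
from collections import Counter
from math import sqrt

BIN_SIZE = 1_000_000  # 1 Mbp, the low resolution mentioned in the abstract


def bin_tags(tags, bin_size=BIN_SIZE):
    """Count tags per (chromosome, bin index).

    `tags` is an iterable of (chromosome, position) pairs; this input
    format is a hypothetical simplification of mapped reads.
    """
    return Counter((chrom, pos // bin_size) for chrom, pos in tags)


def pearson(xs, ys):
    """Plain Pearson correlation coefficient of two equal-length lists."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined for a constant profile")
    return cov / sqrt(var_x * var_y)


def profile_correlation(tags_a, tags_b, bin_size=BIN_SIZE):
    """Correlate the binned tag-count profiles of two libraries."""
    counts_a = bin_tags(tags_a, bin_size)
    counts_b = bin_tags(tags_b, bin_size)
    # Use every bin seen in either library; a Counter returns 0 for missing bins.
    bins = sorted(set(counts_a) | set(counts_b))
    return pearson([counts_a[b] for b in bins], [counts_b[b] for b in bins])


# Toy data (invented for illustration): library B has twice the depth of A.
library_a = [("chr1", 10), ("chr1", 20), ("chr1", 1_500_000), ("chr2", 5)]
library_b = library_a * 2
similarity = profile_correlation(library_a, library_b)
```

In a real analysis the counts would typically be normalized for sequencing depth and compared against an independent copy-number estimate such as aCGH, as the abstract describes; the sketch only shows the binning and comparison step.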

    Multiplatform genome-wide identification and modeling of functional human estrogen receptor binding sites

    BACKGROUND: Transcription factor binding sites (TFBS) impart specificity to cellular transcriptional responses and have largely been defined by consensus motifs derived from a handful of validated sites. The low specificity of the computational predictions of TFBSs has been attributed to the ubiquity of the motifs and the relaxed sequence requirements for binding. We posited that the inadequacy is due to limited input of empirically verified sites, and demonstrated a multiplatform approach to constructing a robust model. RESULTS: Using the TFBS for the estrogen receptor (ER)α (estrogen response element [ERE]) as a model system, we extracted EREs from multiple molecular and genomic platforms whose binding to ERα has been experimentally confirmed or rejected. In silico analyses revealed significant sequence information flanking the standard binding consensus, discriminating ERE-like sequences that bind ERα from those that are nonbinders. We extended the ERE consensus by three bases, bearing a terminal G at the third position 3' and an initiator C at the third position 5', which were further validated using surface plasmon resonance spectroscopy. Our functional human ERE prediction algorithm (h-ERE) outperformed existing predictive algorithms and produced fewer than 5% false negatives upon experimental validation. CONCLUSION: Building upon a larger experimentally validated ERE set, the h-ERE algorithm is able to better demarcate the universe of ERE-like sequences that are potential ER binders. Only 14% of the predicted optimal binding sites were utilized under the experimental conditions employed, pointing to other selective criteria not related to EREs. Other factors, in addition to primary nucleotide sequence, will ultimately determine binding site selection.

    Discovery of estrogen receptor α target genes and response elements in breast tumor cells

    BACKGROUND: Estrogens and their receptors are important in human development, physiology and disease. In this study, we utilized an integrated genome-wide molecular and computational approach to characterize the interaction between the activated estrogen receptor (ER) and the regulatory elements of candidate target genes. RESULTS: Of around 19,000 genes surveyed in this study, we observed 137 ER-regulated genes in T-47D cells, of which only 89 were direct target genes. Meta-analysis of heterogeneous in vitro and in vivo datasets showed that the expression profiles in T-47D and MCF-7 cells are remarkably similar and overlap with genes differentially expressed between ER-positive and ER-negative tumors. Computational analysis revealed a significant enrichment of putative estrogen response elements (EREs) in the cis-regulatory regions of direct target genes. Chromatin immunoprecipitation confirmed ligand-dependent ER binding at the computationally predicted EREs in our highest ranked ER direct target genes, NRIP1, GREB1 and ABCA3. Wider examination of the cis-regulatory regions flanking the transcriptional start sites showed species conservation in mouse-human comparisons in only 6% of predicted EREs. CONCLUSIONS: Only a small core set of human genes, validated across experimental systems and closely associated with ER status in breast tumors, appears to be sufficient to induce ER effects in breast cancer cells. That the cis-regulatory regions of these core ER target genes are poorly conserved suggests that different evolutionary mechanisms are operative at transcriptional control elements than at coding regions. These results predict that certain biological effects of estrogen signaling will differ between mouse and human to a larger extent than previously thought.

    Whole-Genome Cartography of Estrogen Receptor α Binding Sites

    Using a chromatin immunoprecipitation-paired end diTag cloning and sequencing strategy, we mapped estrogen receptor α (ERα) binding sites in MCF-7 breast cancer cells. We identified 1,234 high confidence binding clusters of which 94% are projected to be bona fide ERα binding regions. Only 5% of the mapped estrogen receptor binding sites are located within 5 kb upstream of the transcriptional start sites of adjacent genes, regions containing the proximal promoters, whereas the vast majority of the sites are mapped to intronic or distal locations (>5 kb from 5′ and 3′ ends of adjacent transcript), suggesting transcriptional regulatory mechanisms over significant physical distances. Of all the identified sites, 71% harbored putative full estrogen response elements (EREs), 25% bore ERE half sites, and only 4% had no recognizable ERE sequences. Genes in the vicinity of ERα binding sites were enriched for regulation by estradiol in MCF-7 cells, and their expression profiles in patient samples segregate ERα-positive from ERα-negative breast tumors. The expression dynamics of the genes adjacent to ERα binding sites suggest a direct induction of gene expression through binding to ERE-like sequences, whereas transcriptional repression by ERα appears to be through indirect mechanisms. Our analysis also indicates a number of candidate transcription factor binding sites adjacent to occupied EREs at frequencies much greater than by chance, including the previously reported FOXA1 sites, and demonstrates the potential involvement of one such putative adjacent factor, Sp1, in the global regulation of ERα target genes. Unexpectedly, we found that only 22%–24% of the bona fide human ERα binding sites overlapped conserved regions in whole genome vertebrate alignments, which suggests limited conservation of functional binding sites. Taken together, this genome-scale analysis suggests complex but definable rules governing ERα binding and gene regulation.

    De-Novo Identification of PPARγ/RXR Binding Sites and Direct Targets during Adipogenesis

    BACKGROUND: The pathophysiology of obesity and type 2 diabetes mellitus is associated with abnormalities in endocrine signaling in adipose tissue, and one of the key signaling effectors operative in these disorders is the nuclear hormone transcription factor peroxisome proliferator-activated receptor-gamma (PPARgamma). PPARgamma has pleiotropic functions affecting a wide range of fundamental biological processes including the regulation of genes that modulate insulin sensitivity, adipocyte differentiation, inflammation and atherosclerosis. To date, only a limited number of direct targets for PPARgamma have been identified through research using the well-established pre-adipogenic cell line, 3T3-L1. In order to obtain a genome-wide view of PPARgamma binding sites, we applied the paired-end tagging technology (ChIP-PET) to map PPARgamma binding sites in 3T3-L1 preadipocyte cells. METHODOLOGY/PRINCIPAL FINDINGS: Coupling gene expression profile analysis with ChIP-PET, we identified in a genome-wide manner over 7700 DNA binding sites of the transcription factor PPARgamma and its heterodimeric partner RXR during the course of adipocyte differentiation. Our validation studies prove that the identified sites are bona fide binding sites for both PPARgamma and RXR and that they are functionally capable of driving PPARgamma-specific transcription. Our results strongly indicate that PPARgamma is the predominant heterodimerization partner for RXR during late stages of adipocyte differentiation. Additionally, we find that PPARgamma/RXR association is enriched within the proximity of the 5' region of the transcription start site and that this association is significantly linked to transcriptional up-regulation of genes involved in fatty acid and lipid metabolism, confirming the role of PPARgamma as the master transcriptional regulator of adipogenesis. Evolutionary conservation of these binding sites is greater when they are adjacent to up-regulated genes than to down-regulated genes, suggesting that the primordial function of PPARgamma/RXR is in the induction of genes. Our functional validations identified novel PPARgamma direct targets that have not been previously reported to promote adipogenic differentiation. CONCLUSIONS/SIGNIFICANCE: We have identified in a genome-wide manner the binding sites of PPARgamma and RXR during the course of adipogenic differentiation in 3T3-L1 cells, and provide an important resource for the study of PPARgamma function in the context of adipocyte differentiation.

    Zebrafish Whole-Adult-Organism Chemogenomics for Large-Scale Predictive and Discovery Chemical Biology

    The ability to perform large-scale, expression-based chemogenomics on whole adult organisms, as in invertebrate models (worm and fly), is highly desirable for a vertebrate model, but its feasibility and potential have not been demonstrated. We performed expression-based chemogenomics on the whole adult organism of a vertebrate model, the zebrafish, and demonstrated its potential for large-scale predictive and discovery chemical biology. Focusing on two classes of compounds with wide implications for human health, polycyclic (halogenated) aromatic hydrocarbons [P(H)AHs] and estrogenic compounds (ECs), we generated robust prediction models that can discriminate compounds of the same class from those of different classes in two large independent experiments. The robust expression signatures led to the identification of biomarkers for potent aryl hydrocarbon receptor (AHR) and estrogen receptor (ER) agonists, respectively, and were validated in multiple targeted tissues. Knowledge-based data mining of human homologs of zebrafish genes revealed highly conserved chemical-induced biological responses/effects, health risks, and novel biological insights associated with AHR and ER that could be extrapolated to humans. Thus, our study presents an effective, high-throughput strategy for capturing molecular snapshots of chemical-induced biological states of a whole adult vertebrate, providing information on biomarkers of effects, deregulated signaling pathways, possibly affected biological functions, perturbed physiological systems, and increased health risks. These findings place zebrafish in a strategic position to bridge the wide gap between cell-based and rodent models in chemogenomics research and applications, especially in preclinical drug discovery and toxicology.