
    Sequence context affects the rate of short insertions and deletions in flies and primates

    Analysis of a large collection of short insertions and deletions in primates and flies shows that the rate of insertions or deletions of specific lengths can vary by more than 100-fold, depending on the surrounding sequence.

    Evolution and Selection in Yeast Promoters: Analyzing the Combined Effect of Diverse Transcription Factor Binding Sites

    In comparative genomics one jointly analyzes evolutionarily related species in order to identify conserved and diverged sequences and to infer their function. While such studies enabled the detection of conserved sequences in large genomes, the evolutionary dynamics of regulatory regions as a whole remain poorly understood. Here we present a probabilistic model for the evolution of promoter regions in yeast, combining the effects of regulatory interactions of many different transcription factors. The model expresses explicitly the selection forces acting on transcription factor binding sites in the context of a dynamic evolutionary process. We develop algorithms to compute likelihood and to learn de novo collections of transcription factor binding motifs and their selection parameters from alignments. Using the new techniques, we examine the evolutionary dynamics in Saccharomyces species promoters. Analyses of an evolutionary model constructed using all known transcription factor binding motifs and of a model learned from the data automatically reveal relatively weak selection on most binding sites. Moreover, according to our estimates, strong binding sites are constraining only a fraction of the yeast promoter sequence that is under selection. Our study demonstrates how complex evolutionary dynamics in noncoding regions emerges from formalization of the evolutionary consequences of known regulatory mechanisms.

    Integrative analysis of genome-wide experiments in the context of a large high-throughput data compendium

    Biological systems are orchestrated by heterogeneous regulatory programs that control complex processes and adapt to a dynamic environment. Recent advances in high-throughput experimental methods provide genome-wide perspectives on such regulatory programs. A considerable amount of data on the behavior of model systems in a variety of conditions is rapidly accumulating. Still, the dominant paradigm is to analyze new genome-wide experiments separately from any other extant data, for example, by clustering the new data alone. Here we introduce a new methodology for analyzing the results of a new functional genomic study vis-à-vis a large compendium of previously published results from heterogeneous experimental techniques. We demonstrate our methodology on Saccharomyces cerevisiae, using a compendium of some 2000 experiments from 60 different publications. Most importantly, we show how the integrated analysis reveals unexpected connections among biological processes, and differentiates between novel and known effects in the analyzed experiments. Such characterization is impossible when new data sets are studied in isolation. Our results exemplify the power of the integrative approach in the analysis of genomic-scale data sets and call for a paradigm shift in their study.

    Formation of regulatory modules by local sequence duplication

    Turnover of regulatory sequence and function is an important part of molecular evolution. But what are the modes of sequence evolution leading to rapid formation and loss of regulatory sites? Here, we show that a large fraction of neighboring transcription factor binding sites in the fly genome have formed from a common sequence origin by local duplications. This mode of evolution is found to produce regulatory information: duplications can seed new sites in the neighborhood of existing sites. Duplicate seeds evolve subsequently by point mutations, often towards binding a different factor than their ancestral neighbor sites. These results are based on a statistical analysis of 346 cis-regulatory modules in the Drosophila melanogaster genome, and a comparison set of intergenic regulatory sequence in Saccharomyces cerevisiae. In fly regulatory modules, pairs of binding sites show significantly enhanced sequence similarity up to distances of about 50 bp. We analyze these data in terms of an evolutionary model with two distinct modes of site formation: (i) evolution from independent sequence origin and (ii) divergent evolution following duplication of a common ancestor sequence. Our results suggest that pervasive formation of binding sites by local sequence duplications distinguishes the complex regulatory architecture of higher eukaryotes from the simpler architecture of unicellular organisms.

    EXPANDER – an integrative program suite for microarray data analysis

    BACKGROUND: Gene expression microarrays are a prominent experimental tool in functional genomics which has opened the opportunity for gaining global, systems-level understanding of transcriptional networks. Experiments that apply this technology typically generate overwhelming volumes of data, unprecedented in biological research. Therefore the task of mining meaningful biological knowledge out of the raw data is a major challenge in bioinformatics. Of special need are integrative packages that provide biologist users with an advanced yet easy-to-use set of algorithms, together covering the whole range of steps in microarray data analysis. RESULTS: Here we present the EXPANDER 2.0 (EXPression ANalyzer and DisplayER) software package. EXPANDER 2.0 is an integrative package for the analysis of gene expression data, designed as a 'one-stop shop' tool that implements various data analysis algorithms ranging from the initial steps of normalization and filtering, through clustering and biclustering, to high-level functional enrichment analysis that points to biological processes that are active in the examined conditions, and to promoter cis-regulatory elements analysis that elucidates transcription factors that control the observed transcriptional response. EXPANDER is available with pre-compiled functional Gene Ontology (GO) and promoter sequence-derived data files for yeast, worm, fly, rat, mouse and human, supporting high-level analysis applied to data obtained from these six organisms. CONCLUSION: EXPANDER's integrated capabilities and its built-in support of multiple organisms make it a very powerful tool for analysis of microarray data. The package is freely available for academic users a

    Constitutive Nucleosome Depletion and Ordered Factor Assembly at the GRP78 Promoter Revealed by Single Molecule Footprinting

    Chromatin organization and transcriptional regulation are interrelated processes. A shortcoming of current experimental approaches to these complex events is the lack of methods that can capture the activation process on single promoters. We have recently described a method that combines methyltransferase M.SssI treatment of intact nuclei and bisulfite sequencing allowing the representation of replicas of single promoters in terms of protected and unprotected footprint modules. Here we combine this method with computational analysis to study single-molecule dynamics of transcriptional activation in the stress-inducible GRP78 promoter. We show that a 350–base pair region upstream of the transcription initiation site is constitutively depleted of nucleosomes, regardless of the induction state of the promoter, providing one of the first examples for such a promoter in mammals. The 350–base pair nucleosome-free region can be dissected into modules, identifying transcription factor binding sites and their combinatorial organization during endoplasmic reticulum stress. The interaction of the transcriptional machinery with the GRP78 core promoter is highly organized, represented by six major combinatorial states. We show that the TATA box is frequently occupied in the noninduced state, that stress induction results in sequential loading of the endoplasmic reticulum stress response elements, and that a substantial portion of these elements is no longer occupied following recruitment of factors to the transcription initiation site. Studying the positioning of nucleosomes and transcription factors at the single promoter level provides a powerful tool to gain novel insights into the transcriptional process in eukaryotes.

    Functional Enhancers at the Gene-Poor 8q24 Cancer-Linked Locus

    Multiple discrete regions at 8q24 were recently shown to contain alleles that predispose to many cancers including prostate, breast, and colon. These regions are far from any annotated gene and their biological activities have been unknown. Here we profiled a 5-megabase chromatin segment encompassing all the risk regions for RNA expression, histone modifications, and locations occupied by RNA polymerase II and androgen receptor (AR). This led to the identification of several transcriptional enhancers, which were verified using reporter assays. Two enhancers in one risk region were occupied by AR and responded to androgen treatment; one contained a single nucleotide polymorphism (rs11986220) that resides within a FoxA1 binding site, with the prostate cancer risk allele facilitating both stronger FoxA1 binding and stronger androgen responsiveness. The study reported here exemplifies an approach that may be applied to any risk-associated allele in non-protein coding regions as it emerges from genome-wide association studies to better understand the genetic predisposition of complex diseases.