188 research outputs found
Sequence context affects the rate of short insertions and deletions in flies and primates
Analysis of a large collection of short insertions and deletions in primates and flies shows that the rate of insertions or deletions of specific lengths can vary by more than 100-fold, depending on the surrounding sequence
Evolution and Selection in Yeast Promoters: Analyzing the Combined Effect of Diverse Transcription Factor Binding Sites
In comparative genomics, one jointly analyzes evolutionarily related species in order to identify conserved and diverged sequences and to infer their function. While such studies have enabled the detection of conserved sequences in large genomes, the evolutionary dynamics of regulatory regions as a whole remain poorly understood. Here we present a probabilistic model for the evolution of promoter regions in yeast, combining the effects of regulatory interactions of many different transcription factors. The model explicitly expresses the selection forces acting on transcription factor binding sites in the context of a dynamic evolutionary process. We develop algorithms to compute likelihood and to learn de novo collections of transcription factor binding motifs and their selection parameters from alignments. Using the new techniques, we examine the evolutionary dynamics in Saccharomyces species promoters. Analyses of an evolutionary model constructed using all known transcription factor binding motifs, and of a model learned automatically from the data, reveal relatively weak selection on most binding sites. Moreover, according to our estimates, strong binding sites constrain only a fraction of the yeast promoter sequence that is under selection. Our study demonstrates how complex evolutionary dynamics in noncoding regions emerge from formalization of the evolutionary consequences of known regulatory mechanisms
Integrative analysis of genome-wide experiments in the context of a large high-throughput data compendium
Biological systems are orchestrated by heterogeneous regulatory programs that control complex processes and adapt to a dynamic environment. Recent advances in high-throughput experimental methods provide genome-wide perspectives on such regulatory programs. A considerable amount of data on the behavior of model systems in a variety of conditions is rapidly accumulating. Still, the dominant paradigm is to analyze new genome-wide experiments separately from any other extant data, for example, by clustering the new data alone. Here we introduce a new methodology for analyzing the results of a new functional genomic study vis-à-vis a large compendium of previously published results from heterogeneous experimental techniques. We demonstrate our methodology on Saccharomyces cerevisiae, using a compendium of some 2000 experiments from 60 different publications. Most importantly, we show how the integrated analysis reveals unexpected connections among biological processes, and differentiates between novel and known effects in the analyzed experiments. Such characterization is impossible when new data sets are studied in isolation. Our results exemplify the power of the integrative approach in the analysis of genomic scale data sets and call for a paradigm shift in their study
Formation of regulatory modules by local sequence duplication
Turnover of regulatory sequence and function is an important part of molecular evolution. But what are the modes of sequence evolution leading to rapid formation and loss of regulatory sites? Here, we show that a large fraction of neighboring transcription factor binding sites in the fly genome have formed from a common sequence origin by local duplications. This mode of evolution is found to produce regulatory information: duplications can seed new sites in the neighborhood of existing sites. Duplicate seeds evolve subsequently by point mutations, often towards binding a different factor than their ancestral neighbor sites. These results are based on a statistical analysis of 346 cis-regulatory modules in the Drosophila melanogaster genome, and a comparison set of intergenic regulatory sequence in Saccharomyces cerevisiae. In fly regulatory modules, pairs of binding sites show significantly enhanced sequence similarity up to distances of about 50 bp. We analyze these data in terms of an evolutionary model with two distinct modes of site formation: (i) evolution from independent sequence origin and (ii) divergent evolution following duplication of a common ancestor sequence. Our results suggest that pervasive formation of binding sites by local sequence duplications distinguishes the complex regulatory architecture of higher eukaryotes from the simpler architecture of unicellular organisms
EXPANDER – an integrative program suite for microarray data analysis
BACKGROUND: Gene expression microarrays are a prominent experimental tool in functional genomics that has opened the opportunity for gaining a global, systems-level understanding of transcriptional networks. Experiments that apply this technology typically generate overwhelming volumes of data, unprecedented in biological research. The task of mining meaningful biological knowledge out of the raw data is therefore a major challenge in bioinformatics. Especially needed are integrative packages that provide biologist users with an advanced yet easy-to-use set of algorithms, together covering the whole range of steps in microarray data analysis. RESULTS: Here we present the EXPANDER 2.0 (EXPression ANalyzer and DisplayER) software package. EXPANDER 2.0 is an integrative package for the analysis of gene expression data, designed as a 'one-stop shop' tool that implements various data analysis algorithms ranging from the initial steps of normalization and filtering, through clustering and biclustering, to high-level functional enrichment analysis that points to biological processes that are active in the examined conditions, and to promoter cis-regulatory element analysis that elucidates transcription factors that control the observed transcriptional response. EXPANDER is available with pre-compiled functional Gene Ontology (GO) and promoter sequence-derived data files for yeast, worm, fly, rat, mouse and human, supporting high-level analysis applied to data obtained from these six organisms. CONCLUSION: EXPANDER's integrated capabilities and its built-in support of multiple organisms make it a very powerful tool for analysis of microarray data. The package is freely available for academic users a
Constitutive Nucleosome Depletion and Ordered Factor Assembly at the GRP78 Promoter Revealed by Single Molecule Footprinting
Chromatin organization and transcriptional regulation are interrelated processes. A shortcoming of current experimental approaches to these complex events is the lack of methods that can capture the activation process on single promoters. We have recently described a method that combines methyltransferase M.SssI treatment of intact nuclei and bisulfite sequencing allowing the representation of replicas of single promoters in terms of protected and unprotected footprint modules. Here we combine this method with computational analysis to study single molecule dynamics of transcriptional activation in the stress inducible GRP78 promoter. We show that a 350–base pair region upstream of the transcription initiation site is constitutively depleted of nucleosomes, regardless of the induction state of the promoter, providing one of the first examples for such a promoter in mammals. The 350–base pair nucleosome-free region can be dissected into modules, identifying transcription factor binding sites and their combinatorial organization during endoplasmic reticulum stress. The interaction of the transcriptional machinery with the GRP78 core promoter is highly organized, represented by six major combinatorial states. We show that the TATA box is frequently occupied in the noninduced state, that stress induction results in sequential loading of the endoplasmic reticulum stress response elements, and that a substantial portion of these elements is no longer occupied following recruitment of factors to the transcription initiation site. Studying the positioning of nucleosomes and transcription factors at the single promoter level provides a powerful tool to gain novel insights into the transcriptional process in eukaryotes
TATTOO-seq delineates spatial and cell type-specific regulatory programs in the developing limb
The coordinated differentiation of progenitor cells into specialized cell types and their spatial organization into distinct domains is central to embryogenesis. Here, we developed and applied an unbiased spatially resolved single-cell transcriptomics method to identify the genetic programs underlying the emergence of specialized cell types during mouse limb development and their spatial integration. We identify multiple transcription factors whose expression patterns are predominantly associated with cell type specification or spatial position, suggesting two parallel yet highly interconnected regulatory systems. We demonstrate that the embryonic limb undergoes a complex multiscale reorganization upon perturbation of one of its spatial organizing centers, including the loss of specific cell populations, alterations of preexisting cell states' molecular identities, and changes in their relative spatial distribution. Our study shows how multidimensional single-cell, spatially resolved molecular atlases can allow the deconvolution of spatial identity and cell fate and reveal the interconnected genetic networks that regulate organogenesis and its reorganization upon genetic alterations
A Probabilistic Methodology for Integrating Knowledge and Experiments on Biological Networks
Functional Enhancers at the Gene-Poor 8q24 Cancer-Linked Locus
Multiple discrete regions at 8q24 were recently shown to contain alleles that predispose to many cancers including prostate, breast, and colon. These regions are far from any annotated gene and their biological activities have been unknown. Here we profiled a 5-megabase chromatin segment encompassing all the risk regions for RNA expression, histone modifications, and locations occupied by RNA polymerase II and androgen receptor (AR). This led to the identification of several transcriptional enhancers, which were verified using reporter assays. Two enhancers in one risk region were occupied by AR and responded to androgen treatment; one contained a single nucleotide polymorphism (rs11986220) that resides within a FoxA1 binding site, with the prostate cancer risk allele facilitating both stronger FoxA1 binding and stronger androgen responsiveness. The study reported here exemplifies an approach that may be applied to any risk-associated allele in non-protein coding regions as it emerges from genome-wide association studies to better understand the genetic predisposition of complex diseases
