Adaptive evolution of transcription factor binding sites

Author: Berg Johannes
Lässig Michael
Willmann Stana
Publication date: 01/01/2004

The regulation of a gene depends on the binding of transcription factors to specific sites located in the regulatory region of the gene. The generation of these binding sites and of cooperativity between them are essential building blocks in the evolution of complex regulatory networks. We study a theoretical model for the sequence evolution of binding sites by point mutations. The approach is based on biophysical models for the binding of transcription factors to DNA. Hence we derive empirically grounded fitness landscapes, which enter a population genetics model including mutations, genetic drift, and selection. We show that the selection for factor binding generically leads to specific correlations between nucleotide frequencies at different positions of a binding site. We demonstrate the possibility of rapid adaptive evolution generating a new binding site for a given transcription factor by point mutations. The evolutionary time required is estimated in terms of the neutral (background) mutation rate, the selection coefficient, and the effective population size. The efficiency of binding site formation is seen to depend on two joint conditions: the binding site motif must be short enough and the promoter region must be long enough. These constraints on promoter architecture are indeed seen in eukaryotic systems. Furthermore, we analyse the adaptive evolution of genetic switches and of signal integration through binding cooperativity between different sites. Experimental tests of this picture involving the statistics of polymorphisms and phylogenies of sites are discussed. Comment: published version
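As a rough illustration of how mutation rate, selection coefficient, and effective population size combine into an evolutionary timescale, the sketch below uses the textbook diffusion approximation for the fixation probability of a single new mutant in a haploid population. This is an assumption made for illustration; the abstract does not state the exact formulas used in the paper, and the function names and parameter values are hypothetical.

```python
import math


def fixation_probability(s: float, n: int) -> float:
    """Fixation probability of one new mutant with selection coefficient s.

    Assumption: haploid population of size n, diffusion approximation,
    u = (1 - exp(-2s)) / (1 - exp(-2ns)). For s -> 0 this tends to 1/n.
    """
    if abs(s) < 1e-12:
        return 1.0 / n
    exponent = -2.0 * n * s
    if exponent > 700.0:
        # Strongly deleterious: exp() would overflow and the probability is ~0.
        return 0.0
    return math.expm1(-2.0 * s) / math.expm1(exponent)


def substitution_rate(mu: float, s: float, n: int) -> float:
    """Rate at which mutations arise (mu * n per generation) and then fix."""
    return mu * n * fixation_probability(s, n)


# Hypothetical parameter values, chosen only to show the three regimes.
neutral = substitution_rate(mu=1e-8, s=0.0, n=1000)
beneficial = substitution_rate(mu=1e-8, s=0.01, n=1000)
deleterious = substitution_rate(mu=1e-8, s=-0.01, n=1000)
```

Under these assumptions the neutral substitution rate equals the mutation rate, while a beneficial mutation fixes faster and a deleterious one far more slowly. The inverse of the rate gives an expected waiting time per substitution.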

arXiv.org e-Print Archive

Springer - Publisher Connector

Directory of Open Access Journals

PubMed Central

Evidence for convergent nucleotide evolution and high allelic turnover rates at the complementary sex determiner (csd) gene of western and Asian honey bees

Author: Beye Martin
Hasselmann Martin
Koeniger Gudrun
Koeniger Nikolaus
Pflugfelder Jochen
Tingek Salim
Vekemans Xavier
Publication date: 01/01/2008

Our understanding of the impact of recombination, mutation, genetic drift and selection on the evolution of a single gene is still limited. Here we investigate the impact of all of these evolutionary forces at the complementary sex determiner (csd) gene which evolves under a balancing mode of selection. Females are heterozygous at the csd gene and males are hemizygous; diploid males are lethal and occur when csd is homozygous. Rare alleles thus have a selective advantage, are seldom lost by the effect of genetic drift and are maintained over extended periods of time when compared to neutral polymorphisms. Here, we report on the analysis of 17, 19 and 15 csd alleles of Apis cerana, Apis dorsata and Apis mellifera honey bees respectively. We observed great heterogeneity of synonymous (pi S) and nonsynonymous (pi N) polymorphisms across the gene, with a consistent peak in exons 6 and 7. We propose that exons 6 and 7 encode the potential specifying domain (csd-PSD) which has accumulated elevated nucleotide polymorphisms over time by balancing selection. We observed no direct evidence that balancing selection favors the accumulation of nonsynonymous changes at csd-PSD (pi N/pi S ratios are all < 1, ranging from 0.6 to 0.95). We observed an excess of shared nonsynonymous changes, which suggests that strong evolutionary constraints are operating at csd-PSD resulting in the independent accumulation of the same nonsynonymous changes in different alleles across species (convergent evolution). Analysis of a csd-PSD genealogy revealed relatively short average coalescence times (~6 million years), low average synonymous nucleotide diversity (pi S < 0.09) and a lack of trans-specific alleles which substantially contrasts with previously analyzed loci under strong balancing selection.
We excluded the possibility of a burst of diversification after population bottlenecking and intragenic recombination as explanatory factors, leaving high turnover rates as the explanation for this observation. By comparing observed allele richness and average coalescence times with a simplified model of csd-coalescence, we found that small long-term population sizes (i.e. Ne < 10^4), but not high mutation rates, can explain short maintenance times, implicating a strong impact of genetic drift on the molecular evolution of highly social honey bees.

Hochschulschriftenserver - Universität Frankfurt am Main


TITER: predicting translation initiation sites by deep learning.

Author: Hu Hailin
Jiang Tao
Zeng Jianyang
Zhang Lei
Zhang Sai
Publication venue: eScholarship, University of California
Publication date: 01/07/2017

Motivation: Translation initiation is a key step in the regulation of gene expression. In addition to the annotated translation initiation sites (TISs), the translation process may also start at multiple alternative TISs (including both AUG and non-AUG codons), which makes it challenging to predict TISs and study the underlying regulatory mechanisms. Meanwhile, the advent of several high-throughput sequencing techniques for profiling initiating ribosomes at single-nucleotide resolution, e.g. GTI-seq and QTI-seq, provides abundant data for systematically studying the general principles of translation initiation and for developing computational methods for TIS identification.

Methods: We have developed a deep learning-based framework, named TITER, for accurately predicting TISs on a genome-wide scale based on QTI-seq data. TITER extracts the sequence features of translation initiation from the surrounding sequence contexts of TISs using a hybrid neural network and further integrates the prior preference of TIS codon composition into a unified prediction framework.

Results: Extensive tests demonstrated that TITER can greatly outperform the state-of-the-art prediction methods in identifying TISs. In addition, TITER was able to identify important sequence signatures for individual types of TIS codons, including a Kozak-sequence-like motif for the AUG start codon. Furthermore, the TITER prediction score can be related to the strength of translation initiation in various biological scenarios, including the repressive effect of the upstream open reading frames on gene expression and the mutational effects influencing translation initiation efficiency.

Availability and implementation: TITER is available as open-source software and can be downloaded from https://github.com/zhangsaithu/titer

Contact: [email protected] or [email protected]

Supplementary information: Supplementary data are available at Bioinformatics online.

eScholarship - University of California

Edge usage, motifs and regulatory logic for cell cycling genetic networks

Author: Krzywicki A.
Martin O. C.
Zagorski M.
Publication venue: 'American Physical Society (APS)'
Publication date: 01/01/2013

The cell cycle is a tightly controlled process, yet its underlying genetic network shows marked differences across species. Which of the associated structural features follow solely from the ability to impose the appropriate gene expression patterns? We tackle this question in silico by examining the ensemble of all regulatory networks which satisfy the constraint of producing a given sequence of gene expressions. We focus on three cell cycle profiles coming from baker's yeast, fission yeast and mammals. First, we show that the networks in each of the ensembles use just a few interactions that are repeatedly reused as building blocks. Second, we find an enrichment in network motifs that is similar in the two yeast cell cycle systems investigated. These motifs do not have autonomous functions, but nevertheless they reveal a regulatory logic for cell cycling based on a feed-forward cascade of activating interactions. Comment: 9 pages, 9 figures, to be published in Phys. Rev.
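The idea of an ensemble of networks constrained to produce a given expression sequence can be sketched with a brute-force enumeration on a toy system. The update rule below (signed interaction weights with a threshold at zero) is a common Boolean-network convention assumed here for illustration; it is not necessarily the model used in the paper, and the two-gene target sequence is hypothetical.

```python
from itertools import product


def step(state, weights):
    """One synchronous update: gene i switches on if its summed input is
    positive, off if negative, and keeps its value if the input is zero."""
    new_state = []
    for i, row in enumerate(weights):
        total = sum(w * s for w, s in zip(row, state))
        if total > 0:
            new_state.append(1)
        elif total < 0:
            new_state.append(0)
        else:
            new_state.append(state[i])
    return tuple(new_state)


def networks_reproducing(target):
    """Enumerate all weight matrices with entries in {-1, 0, 1} whose
    dynamics reproduce every consecutive transition in `target`."""
    n = len(target[0])
    matches = []
    for flat in product((-1, 0, 1), repeat=n * n):
        weights = tuple(tuple(flat[i * n:(i + 1) * n]) for i in range(n))
        if all(step(a, weights) == b for a, b in zip(target, target[1:])):
            matches.append(weights)
    return matches


# Hypothetical two-gene oscillation: gene 0 on, then gene 1 on, then back.
TARGET = [(1, 0), (0, 1), (1, 0)]
matches = networks_reproducing(TARGET)
```

For this toy target only one of the 81 candidate matrices survives: mutual activation combined with self-repression. Exhaustive enumeration grows as 3^(n*n), so realistic network sizes would need sampling rather than enumeration.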

arXiv.org e-Print Archive

Hal-Diderot

Measuring reproducibility of high-throughput experiments

Author: Bickel Peter J.
Brown James B.
Huang Haiyan
Li Qunhua
Publication venue: 'Institute of Mathematical Statistics'
Publication date: 21/10/2011

Reproducibility is essential to reliable scientific discovery in high-throughput experiments. In this work we propose a unified approach to measure the reproducibility of findings identified from replicate experiments and identify putative discoveries using reproducibility. Unlike the usual scalar measures of reproducibility, our approach creates a curve, which quantitatively assesses when the findings are no longer consistent across replicates. Our curve is fitted by a copula mixture model, from which we derive a quantitative reproducibility score, which we call the "irreproducible discovery rate" (IDR) analogous to the FDR. This score can be computed at each set of paired replicate ranks and permits the principled setting of thresholds both for assessing reproducibility and combining replicates. Since our approach permits an arbitrary scale for each replicate, it provides useful descriptive measures in a wide variety of situations to be explored. We study the performance of the algorithm using simulations and give a heuristic analysis of its theoretical properties. We demonstrate the effectiveness of our method in a ChIP-seq experiment. Comment: Published at http://dx.doi.org/10.1214/11-AOAS466 in the Annals of Applied Statistics (http://www.imstat.org/aoas/) by the Institute of Mathematical Statistics (http://www.imstat.org
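To make the notion of a rank-based reproducibility curve concrete, here is a simplified sketch that measures how many items appear in the top fraction of both replicates as that fraction grows. It uses only ranks, so each replicate can be on its own scale. It is not the copula mixture model or the IDR score itself; the function names and the example scores are hypothetical.

```python
def top_k(scores, k):
    """Indices of the k highest-scoring items."""
    order = sorted(range(len(scores)), key=lambda i: scores[i], reverse=True)
    return set(order[:k])


def overlap_curve(scores_a, scores_b, fractions):
    """For each fraction t, return (t, share of all items that fall in the
    top t of both replicates). Perfect agreement gives a share equal to t."""
    n = len(scores_a)
    if n != len(scores_b):
        raise ValueError("replicates must score the same items")
    curve = []
    for t in fractions:
        k = int(round(t * n))
        shared = len(top_k(scores_a, k) & top_k(scores_b, k))
        curve.append((t, shared / n))
    return curve


# Hypothetical signal scores for the same eight candidates in two replicates.
rep_a = [9.0, 8.0, 7.0, 6.0, 2.0, 1.5, 1.0, 0.5]
rep_b = [8.5, 9.1, 6.5, 7.2, 0.4, 1.1, 1.9, 0.9]
curve = overlap_curve(rep_a, rep_b, [0.25, 0.5, 0.75, 1.0])
```

In this example the curve follows the diagonal through the top half and falls below it at 0.75, which is the kind of departure that signals where findings stop being consistent across replicates.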

arXiv.org e-Print Archive

Crossref

Bayesian variable selection and data integration for biological regulatory networks

Author: Chen Guang
Jensen Shane T.
Stoeckert Jr, Christian J.
Publication venue: 'Institute of Mathematical Statistics'
Publication date: 01/01/2006

A substantial focus of research in molecular biology is gene regulatory networks: the set of transcription factors and target genes which control the involvement of different biological processes in living cells. Previous statistical approaches for identifying gene regulatory networks have used gene expression data, ChIP binding data or promoter sequence data, but each of these resources provides only partial information. We present a Bayesian hierarchical model that integrates all three data types in a principled variable selection framework. The gene expression data are modeled as a function of the unknown gene regulatory network which has an informed prior distribution based upon both ChIP binding and promoter sequence data. We also present a variable weighting methodology for the principled balancing of multiple sources of prior information. We apply our procedure to the discovery of gene regulatory relationships in Saccharomyces cerevisiae (Yeast) for which we can use several external sources of information to validate our results. Our inferred relationships show greater biological relevance on the external validation measures than previous data integration methods. Our model also estimates synergistic and antagonistic interactions between transcription factors, many of which are validated by previous studies. We also evaluate the results from our procedure for weighting multiple sources of prior information. Finally, we discuss our methodology in the context of previous approaches to data integration and Bayesian variable selection. Comment: Published at http://dx.doi.org/10.1214/07-AOAS130 in the Annals of Applied Statistics (http://www.imstat.org/aoas/) by the Institute of Mathematical Statistics (http://www.imstat.org

arXiv.org e-Print Archive

CiteSeerX

ScholarlyCommons@Penn

Biophysical Fitness Landscapes for Transcription Factor Binding Sites

Author: Haldane Allan
Manhart Michael
Morozov Alexandre V.
Publication venue: 'Public Library of Science (PLoS)'
Publication date: 02/12/2013

Evolutionary trajectories and phenotypic states available to cell populations are ultimately dictated by intermolecular interactions between DNA, RNA, proteins, and other molecular species. Here we study how evolution of gene regulation in a single-cell eukaryote S. cerevisiae is affected by the interactions between transcription factors (TFs) and their cognate genomic sites. Our study is informed by high-throughput in vitro measurements of TF-DNA binding interactions and by a comprehensive collection of genomic binding sites. Using an evolutionary model for monomorphic populations evolving on a fitness landscape, we infer fitness as a function of TF-DNA binding energy for a collection of 12 yeast TFs, and show that the shape of the predicted fitness functions is in broad agreement with a simple thermodynamic model of two-state TF-DNA binding. However, the effective temperature of the model is not always equal to the physical temperature, indicating selection pressures in addition to biophysical constraints caused by TF-DNA interactions. We find little statistical support for the fitness landscape in which each position in the binding site evolves independently, showing that epistasis is common in evolution of gene regulation. Finally, by correlating TF-DNA binding energies with biological properties of the sites or the genes they regulate, we are able to rule out several scenarios of site-specific selection, under which binding sites of the same TF would experience a spectrum of selection pressures depending on their position in the genome. These findings argue for the existence of universal fitness landscapes which shape evolution of all sites for a given TF, and whose properties are determined in part by the physics of protein-DNA interactions
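The two-state thermodynamic picture mentioned in this abstract can be sketched as a Fermi-Dirac-like occupancy function of binding energy. The sketch assumes an additive per-position energy matrix and treats the inverse temperature `beta` as a free parameter, which is one way to read the "effective temperature" in the abstract. The toy matrix, chemical potential, and function names are hypothetical and are not taken from the paper.

```python
import math


def site_energy(site, energy_matrix):
    """Binding energy of a site as a sum of per-position contributions
    (additive model, assumed here for illustration)."""
    return sum(energy_matrix[i][base] for i, base in enumerate(site))


def binding_probability(energy, mu, beta=1.0):
    """Two-state occupancy: probability that the site is bound, given its
    energy, a chemical potential mu, and an inverse temperature beta."""
    return 1.0 / (1.0 + math.exp(beta * (energy - mu)))


# Hypothetical 3-position matrix: consensus ACG costs 0, each mismatch costs 2.
TOY_MATRIX = [
    {"A": 0.0, "C": 2.0, "G": 2.0, "T": 2.0},
    {"A": 2.0, "C": 0.0, "G": 2.0, "T": 2.0},
    {"A": 2.0, "C": 2.0, "G": 0.0, "T": 2.0},
]

p_consensus = binding_probability(site_energy("ACG", TOY_MATRIX), mu=1.0)
p_mismatch = binding_probability(site_energy("TCG", TOY_MATRIX), mu=1.0)
```

Even with additive energies, the occupancy is a nonlinear function of energy, so the effect of one mismatch depends on how many others are already present. A lower `beta` (higher effective temperature) flattens the curve and weakens the distinction between strong and weak sites.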

arXiv.org e-Print Archive

Directory of Open Access Journals

PubMed Central

FigShare