94 research outputs found

    Sequence context affects the rate of short insertions and deletions in flies and primates

    Get PDF
    Analysis of a large collection of short insertions and deletions in primates and flies shows that the rate of insertions or deletions of specific lengths can vary by more than 100 fold, depending on the surrounding sequence

    Shake It, Donā€™t Break It: Positive Feedback and the Evolution of Oscillator Design

    Get PDF
    In cell cycle control, a negative feedback oscillator design is shown to be reinforced with a positive feedback loop, giving a robust oscillatory architecture that is surprisingly common in biology

    Conservation of regulatory elements between two species of Drosophila

    Get PDF
    BACKGROUND: One of the important goals in the post-genomic era is to determine the regulatory elements within the non-coding DNA of a given organism's genome. The identification of functional cis-regulatory modules has proven difficult since the component factor binding sites are small and the rules governing their arrangement are poorly understood. However, the genomes of suitably diverged species help to predict regulatory elements based on the generally accepted assumption that conserved blocks of genomic sequence are likely to be functional. To judge the efficacy of strategies that prefilter by sequence conservation it is important to know to what extent the converse assumption holds, namely that functional elements common to both species will fall within these conserved blocks. The recently completed sequence of a second Drosophila species provides an opportunity to test this assumption for one of the experimentally best studied regulatory networks in multicellular organisms, the body patterning of the fly embryo. RESULTS: We find that 50%ā€“70% of known binding sites reside in conserved sequence blocks, but these percentages are not greatly enriched over what is expected by chance. Finally, a computational genome-wide search in both species for regulatory modules based on clusters of binding sites suggests that genes central to the regulatory network are consistently recovered. CONCLUSIONS: Our results indicate that binding sites remain clustered for these "core modules" while not necessarily residing in conserved blocks. This is an important clue as to how regulatory information is encoded in the genome and how modules evolve

    Developmental regulatory bits

    Get PDF

    Self-organization of stem cells into embryos: A window on early mammalian development.

    Get PDF
    Embryonic development is orchestrated by robust and complex regulatory mechanisms acting at different scales of organization. In vivo studies are particularly challenging for mammals after implantation, owing to the small size and inaccessibility of the embryo. The generation of stem cell models of the embryo represents a powerful system with which to dissect this complexity. Control of geometry, modulation of the physical environment, and priming with chemical signals reveal the intrinsic capacity of embryonic stem cells to make patterns. Adding the stem cells for the extraembryonic lineages generates three-dimensional models that are more autonomous from the environment and recapitulate many features of the pre- and postimplantation mouse embryo, including gastrulation. Here, we review the principles of self-organization and how they set cells in motion to create an embryo.M.N.S received funding from an Early Career Leverhulme Trust fellowship and an Advanced EMBO fellowship. Work in the laboratory of M.Z-G. is funded by the Wellcome Trust (207415/Z/17/Z) and the European Research Council (ERC grant 669198). Work of E.D.S. is funded by NIH grant GM101653

    Computational detection of genomic cis-regulatory modules applied to body patterning in the early Drosophila embryo

    Get PDF
    BACKGROUND: Regulation of gene transcription is crucial for the function and development of all organisms. While gene prediction programs that identify protein coding sequence are used with remarkable success in the annotation of genomes, the development of computational methods to analyze noncoding regions and to delineate transcriptional control elements is still in its infancy. RESULTS: Here we present novel algorithms to detect cis-regulatory modules through genome wide scans for clusters of transcription factor binding sites using three levels of prior information. When binding sites for the factors are known, our statistical segmentation algorithm, Ahab, yields about 150 putative gap gene regulated modules, with no adjustable parameters other than a window size. If one or more related modules are known, but no binding sites, repeated motifs can be found by a customized Gibbs sampler and input to Ahab, to predict genes with similar regulation. Finally using only the genome, we developed a third algorithm, Argos, that counts and scores clusters of overrepresented motifs in a window of sequence. Argos recovers many of the known modules, upstream of the segmentation genes, with no training data. CONCLUSIONS: We have demonstrated, in the case of body patterning in the Drosophila embryo, that our algorithms allow the genome-wide identification of regulatory modules. We believe that Ahab overcomes many problems of recent approaches and we estimated the false positive rate to be about 50%. Argos is the first successful attempt to predict regulatory modules using only the genome without training data. Complete results and module predictions across the Drosophila genome are available at http://uqbar.rockefeller.edu/~siggia/

    PhyloGibbs: A Gibbs Sampling Motif Finder That Incorporates Phylogeny

    Get PDF
    A central problem in the bioinformatics of gene regulation is to find the binding sites for regulatory proteins. One of the most promising approaches toward identifying these short and fuzzy sequence patterns is the comparative analysis of orthologous intergenic regions of related species. This analysis is complicated by various factors. First, one needs to take the phylogenetic relationship between the species into account in order to distinguish conservation that is due to the occurrence of functional sites from spurious conservation that is due to evolutionary proximity. Second, one has to deal with the complexities of multiple alignments of orthologous intergenic regions, and one has to consider the possibility that functional sites may occur outside of conserved segments. Here we present a new motif sampling algorithm, PhyloGibbs, that runs on arbitrary collections of multiple local sequence alignments of orthologous sequences. The algorithm searches over all ways in which an arbitrary number of binding sites for an arbitrary number of transcription factors (TFs) can be assigned to the multiple sequence alignments. These binding site configurations are scored by a Bayesian probabilistic model that treats aligned sequences by a model for the evolution of binding sites and ā€œbackgroundā€ intergenic DNA. This model takes the phylogenetic relationship between the species in the alignment explicitly into account. The algorithm uses simulated annealing and Monte Carlo Markov-chain sampling to rigorously assign posterior probabilities to all the binding sites that it reports. In tests on synthetic data and real data from five Saccharomyces species our algorithm performs significantly better than four other motif-finding algorithms, including algorithms that also take phylogeny into account. Our results also show that, in contrast to the other algorithms, PhyloGibbs can make realistic estimates of the reliability of its predictions. Our tests suggest that, running on the five-species multiple alignment of a single gene's upstream region, PhyloGibbs on average recovers over 50% of all binding sites in S. cerevisiae at a specificity of about 50%, and 33% of all binding sites at a specificity of about 85%. We also tested PhyloGibbs on collections of multiple alignments of intergenic regions that were recently annotated, based on ChIP-on-chip data, to contain binding sites for the same TF. We compared PhyloGibbs's results with the previous analysis of these data using six other motif-finding algorithms. For 16 of 21 TFs for which all other motif-finding methods failed to find a significant motif, PhyloGibbs did recover a motif that matches the literature consensus. In 11 cases where there was disagreement in the results we compiled lists of known target genes from the literature, and found that running PhyloGibbs on their regulatory regions yielded a binding motif matching the literature consensus in all but one of the cases. Interestingly, these literature gene lists had little overlap with the targets annotated based on the ChIP-on-chip data. The PhyloGibbs code can be downloaded from http://www.biozentrum.unibas.ch/~nimwegen/cgi-bin/phylogibbs.cgi or http://www.imsc.res.in/~rsidd/phylogibbs. The full set of predicted sites from our tests on yeast are available at http://www.swissregulon.unibas.ch

    Cross-species comparison significantly improves genome-wide prediction of cis-regulatory modules in Drosophila

    Get PDF
    BACKGROUND: The discovery of cis-regulatory modules in metazoan genomes is crucial for understanding the connection between genes and organism diversity. It is important to quantify how comparative genomics can improve computational detection of such modules. RESULTS: We run the Stubb software on the entire D. melanogaster genome, to obtain predictions of modules involved in segmentation of the embryo. Stubb uses a probabilistic model to score sequences for clustering of transcription factor binding sites, and can exploit multiple species data within the same probabilistic framework. The predictions are evaluated using publicly available gene expression data for thousands of genes, after careful manual annotation. We demonstrate that the use of a second genome (D. pseudoobscura) for cross-species comparison significantly improves the prediction accuracy of Stubb, and is a more sensitive approach than intersecting the results of separate runs over the two genomes. The entire list of predictions is made available online. CONCLUSION: Evolutionary conservation of modules serves as a filter to improve their detection in silico. The future availability of additional fruitfly genomes therefore carries the prospect of highly specific genome-wide predictions using Stubb
    • ā€¦
    corecore