2,888 research outputs found

    High resolution mapping of Twist to DNA in Drosophila embryos: Efficient functional analysis and evolutionary conservation

    Cis-regulatory modules (CRMs) function by binding sequence specific transcription factors, but the relationship between in vivo physical binding and the regulatory capacity of factor-bound DNA elements remains uncertain. We investigate this relationship for the well-studied Twist factor in Drosophila melanogaster embryos by analyzing genome-wide factor occupancy and testing the functional significance of Twist occupied regions and motifs within regions. Twist ChIP-seq data efficiently identified previously studied Twist-dependent CRMs and robustly predicted new CRM activity in transgenesis, with newly identified Twist-occupied regions supporting diverse spatiotemporal patterns (>74% positive, n = 31). Some, but not all, candidate CRMs require Twist for proper expression in the embryo. The Twist motifs most favored in genome ChIP data (in vivo) differed from those most favored by Systematic Evolution of Ligands by EXponential enrichment (SELEX) (in vitro). Furthermore, the majority of ChIP-seq signals could be parsimoniously explained by a CABVTG motif located within 50 bp of the ChIP summit and, of these, CACATG was most prevalent. Mutagenesis experiments demonstrated that different Twist E-box motif types are not fully interchangeable, suggesting that the ChIP-derived consensus (CABVTG) includes sites having distinct regulatory outputs. Further analysis of position, frequency of occurrence, and sequence conservation revealed significant enrichment and conservation of CABVTG E-box motifs near Twist ChIP-seq signal summits, preferential conservation of ±150 bp surrounding Twist occupied summits, and enrichment of GA- and CA-repeat sequences near Twist occupied summits. Our results show that high resolution in vivo occupancy data can be used to drive efficient discovery and dissection of global and local cis-regulatory logic.
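
    The degenerate consensus CABVTG quoted in this abstract uses standard IUPAC codes (B = C, G or T; V = A, C or G). A minimal sketch of how such a motif could be located near a ChIP summit follows. The function names, the 0-based coordinates, and the choice to measure distance from the motif start to the summit are illustrative assumptions, not the authors' pipeline.

```python
import re

# Standard IUPAC nucleotide codes (only the subset needed here).
IUPAC = {
    "A": "A", "C": "C", "G": "G", "T": "T",
    "B": "[CGT]",   # not A
    "V": "[ACG]",   # not T
    "N": "[ACGT]",  # any base
}


def motif_to_regex(motif):
    """Compile an IUPAC motif into a regex; the lookahead allows overlapping hits."""
    body = "".join(IUPAC[base] for base in motif.upper())
    return re.compile("(?=(" + body + "))")


def motifs_near_summit(seq, summit, motif="CABVTG", window=50):
    """Return (start, matched_sequence) for hits starting within `window` bp of `summit`.

    Assumption: `summit` is a 0-based index into `seq`, and distance is
    measured from the motif start. A real analysis might use the motif centre.
    """
    pattern = motif_to_regex(motif)
    hits = []
    for match in pattern.finditer(seq.upper()):
        if abs(match.start() - summit) <= window:
            hits.append((match.start(), match.group(1)))
    return hits


if __name__ == "__main__":
    # Toy sequence: one E-box near the summit, one far from it.
    toy = "TTTT" + "CACATG" + "A" * 100 + "CAGCTG"
    print(motifs_near_summit(toy, summit=10))
```

    Because the reverse complement of CABVTG is again CABVTG (B and V are complementary codes), scanning a single strand covers both orientations for this particular motif.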

    Evolutionary divergence and limits of conserved non-coding sequence detection in plant genomes

    The discovery of regulatory motifs embedded in upstream regions of plants is a particularly challenging bioinformatics task. Previous studies have shown that motifs in plants are short compared with those found in vertebrates. Furthermore, plant genomes have undergone several diversification mechanisms such as genome duplication events which impact the evolution of regulatory motifs. In this article, a systematic phylogenomic comparison of upstream regions is conducted to further identify features of the plant regulatory genomes, the component of genomes regulating gene expression, to enable future de novo discoveries. The findings highlight differences in upstream region properties between major plant groups and the effects of divergence times and duplication events. First, clear differences in upstream region evolution can be detected between monocots and dicots, thus suggesting that a separation of these groups should be made when searching for novel regulatory motifs, particularly since universal motifs such as the TATA box are rare. Second, investigating the decay rate of significantly aligned regions suggests that a divergence time of 100 mya sets a limit for reliable conserved non-coding sequence (CNS) detection. Insights presented here will set a framework to help identify embedded motifs of functional relevance by understanding the limits of bioinformatics detection for CNSs.

    Evolution of YY1, YY2, REX1 and DNA-binding motifs in vertebrate genomes

    Transcription factors are important for many aspects of gene regulation in eukaryotes. YY1 (Yin-Yang 1) is a particularly interesting example of a highly conserved zinc-finger transcription factor, involved in transcriptional activation, repression, initiation, and in chromatin modification. YY1 is ubiquitously expressed in mammals, and its binding sites are found in ~10% of human genes as well as in repetitive elements. It is a targeting protein of the Polycomb complex and is involved in mammalian genomic imprinting. First, we explored the evolutionary history of YY1 using 62 species and formation of its paralogs, YY2 and REX1, which are found in mammals, and Pho and Phol, which are found in Drosophila. We confirmed the specificity of the consensus YY1 binding site and the differences of the target binding motifs of YY2 and REX1 which are reflected in their amino acid sequences. We found that the core motif, CCAT, is conserved for all three homologs and that YY2 and REX1 were produced via retrotransposition events early in the mammalian lineage. Second, we identified unusual clusters of YY1-binding motifs found in the coding regions of olfactory receptor genes (OLFRs) in mammals but not in fish. Olfactory genes provide scent detection and are the largest class of genes in mammals. Statistical analysis indicates that the core of the YY1-binding motifs cannot be accounted for by conserved amino acid motifs or overall protein homology. Thus selection has acted at the DNA level rather than at the protein level in preserving these YY1-binding sites within coding regions. Therefore, YY1 is likely to play a crucial role in regulating the expression of OLFRs. Third, we produced a new method of microarray data analysis predicated on the positions of genes along a chromosome as well as their expression levels. This technique is supplementary to traditional microarray data analysis and adds a new dimension to finding target genes of interest by looking for co-regulation. Overall, this work provides a coherent background to the evolution of YY1 and its homologs. It provides strong evidence that coding sequences of genes can encode information both at the DNA level and the protein level.

    The Evolution and Mechanics of Translational Control in Plants

    The expression of numerous plant mRNAs is attenuated by RNA sequence elements located in the 5' and 3' untranslated regions (UTRs). For example, in plants and many higher eukaryotes, roughly 35% of genes encode mRNAs that contain one or more upstream open reading frames (uORFs) in the 5' UTR. For this dissertation I have analyzed the pattern of conservation of such mRNA sequence elements. In the first set of studies, I have taken a comparative transcriptomics approach to address which RNA sequence elements are conserved between various families of angiosperm plants. Such conservation indicates an element's fundamental importance to plant biology, points to pathways for which it is most vital, and suggests the mechanism by which it acts. Conserved motifs were detected in 3% of genes. These include di-purine repeat motifs, uORF-associated motifs, putative binding sites for PUMILIO-like RNA binding proteins, small RNA targets, and a wide range of other sequence motifs. Due to the scanning process that precedes translation initiation, uORFs are often translated, thereby repressing initiation at the mRNA's main ORF. As one might predict, I found a clear bias against the AUG start codon within the 5' untranslated region (5' UTR) among all plants examined. Further supporting this finding, comparative analysis indicates that, for ~42% of genes, AUGs and their resultant uORFs reduce carrier fitness. Interestingly, for at least 5% of genes, uORFs are not only tolerated, but enriched. The remaining uORFs appear to be neutral. Because of their tangible impact on plant biology, it is critical to differentiate how uORFs affect translation and how, in many cases, their inhibitory effects are neutralized. In pursuit of this aim, I developed a computational model of the initiation process that uses five parameters to account for uORF presence. In vivo translation efficiency data from uORF-containing reporter constructs were used to estimate the model's parameters in wild-type Arabidopsis. In addition, the model was applied to identify salient defects associated with a mutation in subunit h of eukaryotic initiation factor 3 (eIF3h). The model indicates that eIF3h, by supporting re-initiation during uORF elongation, facilitates uORF tolerance.
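
    As an illustration of what an upstream open reading frame is, the sketch below scans a 5' UTR for AUG codons and follows each one in frame to the first stop codon. It is a simplified, hypothetical helper: `find_uorfs` and its coordinate conventions are assumptions for this example, and it is not the five-parameter initiation model described in the abstract.

```python
STOP_CODONS = {"UAA", "UAG", "UGA"}


def find_uorfs(mrna, main_start):
    """Return (start, end) pairs for uORFs whose AUG lies entirely in the 5' UTR.

    Assumptions for this sketch:
      * `mrna` is an RNA or cDNA string (T is converted to U);
      * `main_start` is the 0-based index of the main ORF's AUG;
      * `end` is the index just past the in-frame stop codon, so a uORF may
        terminate downstream of the main AUG (an overlapping uORF);
      * AUGs with no in-frame stop in the transcript are skipped.
    """
    rna = mrna.upper().replace("T", "U")
    uorfs = []
    # Valid start indices are 0 .. main_start - 3, so the AUG ends before the main ORF.
    for i in range(main_start - 2):
        if rna[i:i + 3] != "AUG":
            continue
        # Walk codon by codon, in frame with this AUG, until a stop codon.
        for j in range(i + 3, len(rna) - 2, 3):
            if rna[j:j + 3] in STOP_CODONS:
                uorfs.append((i, j + 3))
                break
    return uorfs


if __name__ == "__main__":
    # Toy transcript: uORF "AUG AAA UAA" at index 2, main AUG at index 15.
    print(find_uorfs("GGAUGAAAUAAGGCCAUGGCCUAA", main_start=15))
```

    Counting how often such hits occur relative to chance is one way a bias against upstream AUGs could be quantified, although the dissertation's actual statistics are not given in this abstract.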

    Multidimensional chromatin profiling of zebrafish pancreas to uncover and investigate disease-relevant enhancers

    The pancreas is a central organ for human diseases. Most alleles uncovered by genome-wide association studies of pancreatic dysfunction traits overlap with non-coding sequences of DNA. Many contain epigenetic marks of cis-regulatory elements active in pancreatic cells, suggesting that alterations in these sequences contribute to pancreatic diseases. Animal models greatly help to understand the role of non-coding alterations in disease. However, interspecies identification of equivalent cis-regulatory elements faces fundamental challenges, including lack of sequence conservation. Here we combine epigenetic assays with reporter assays in zebrafish and human pancreatic cells to identify interspecies functionally equivalent cis-regulatory elements, regardless of sequence conservation. Among other potential disease-relevant enhancers, we identify a zebrafish ptf1a distal-enhancer whose deletion causes pancreatic agenesis, a phenotype previously found to be induced by mutations in a distal-enhancer of PTF1A in humans, further supporting the causality of this condition in vivo. This approach helps to uncover interspecies functionally equivalent cis-regulatory elements and their potential role in human disease. This study was supported by the European Research Council (ERC) under the European Union’s Horizon 2020 research and innovation programme (ERC-2015-StG-680156-ZPR and ERC-2016-AdG-740041-EvoLand to J.L.G.-S.). J.B. is supported by an FCT CEEC grant (CEECIND/03482/2018). J.L.G.-S. is supported by the Spanish Ministerio de Economía y Competitividad (BFU2016-74961-P), the Marató TV3 Fundacion (Grant 201611) and the institutional grant Unidad de Excelencia María de Maeztu (MDM-2016-0687). R.B.C. was funded by FCT (ON2201403-CTO-BPD), IBMC (BIM/04293-UID991520-BPD) and EMBO (Short-Term Fellowship). J.Tx. (SFRH/BD/126467/2016), M.D. (SFRH/BD/135957/2018), A.E. (SFRH/BD/147762/2019), and F.J.F. (PD/BD/105745/2014) are PhD fellows from FCT. M.G. was supported by the EnvMetaGen project via the European Union’s Horizon 2020 research and innovation programme (grant 668981). This work was funded by National Funds through FCT—Fundação para a Ciência e a Tecnologia, I.P., under the project UIDB/04293/2020.

    Motifs and cis-regulatory modules mediating the expression of genes co-expressed in presynaptic neurons

    An integrative strategy of comparative genomics, experimental and computational approaches reveals aspects of a regulatory network controlling neuronal-specific expression in presynaptic neurons.