186 research outputs found

    Amphioxus functional genomics and the origins of vertebrate gene regulation.

    Vertebrates have greatly elaborated the basic chordate body plan and evolved highly distinctive genomes that have been sculpted by two whole-genome duplications. Here we sequence the genome of the Mediterranean amphioxus (Branchiostoma lanceolatum) and characterize DNA methylation, chromatin accessibility, histone modifications and transcriptomes across multiple developmental stages and adult tissues to investigate the evolution of the regulation of the chordate genome. Comparisons with vertebrates identify an intermediate stage in the evolution of differentially methylated enhancers, and a high conservation of gene expression and its cis-regulatory logic between amphioxus and vertebrates that occurs maximally at an earlier mid-embryonic phylotypic period. We analyse regulatory evolution after whole-genome duplications, and find that, in vertebrates, over 80% of broadly expressed gene families with multiple paralogues derived from whole-genome duplications have members that restricted their ancestral expression, and underwent specialization rather than subfunctionalization. Counter-intuitively, paralogues that restricted their expression increased the complexity of their regulatory landscapes. These data pave the way for a better understanding of the regulatory principles that underlie key vertebrate innovations.

    Identification and annotation of conserved promoters and macrophage-expressed genes in the pig genome.

    BACKGROUND: The FANTOM5 consortium used Cap Analysis of Gene Expression (CAGE) tag sequencing to produce a comprehensive atlas of promoters and enhancers within the human and mouse genomes. We reasoned that the mapping of these regulatory elements to the pig genome could provide useful annotation and evidence to support assignment of orthology. RESULTS: For human transcription start sites (TSS) associated with annotated human-mouse orthologs, 17% mapped to the pig genome but not to the mouse, 10% mapped only to the mouse, and 55% mapped to both pig and mouse. Around 17% did not map to either species. The mapping percentages were lower where there was no clear orthology relationship, but in every case, mapping to pig was greater than to mouse, and the degree of homology was also greater. Combined mapping of mouse and human CAGE-defined promoters identified at least one putative conserved TSS for >16,000 protein-coding genes. About 54% of the predicted locations of regulatory elements in the pig genome were supported by CAGE and/or RNA-Seq analysis from pig macrophages. CONCLUSIONS: Comparative mapping of promoters and enhancers from humans and mice can provide useful preliminary annotation of other animal genomes. The data also confirm extensive gain and loss of regulatory elements between species, and the likelihood that pigs provide a better model than mice for human gene regulation and function.

    Conservation of different mechanisms of Hox cluster regulation within chordates

    In this thesis we have covered the importance of finding underlying conservation events to better understand the regulatory mechanisms of important developmental orchestrators like the Hox cluster. As examples of this non-evident conservation, we present two cases, described below. The first case studied, after developing software able to detect homologous long noncoding RNAs by means of microsynteny analyses, is the conservation of Hotairm1 in Chordata. To assess the homology of this lncRNA, we first had to identify the lncRNA fraction within the B. lanceolatum transcriptome. With a reliable lincRNA dataset, we used our pipeline, LincOFinder, to identify orthologs between human and amphioxus through microsynteny. After the identification of Hotairm1 as one of the lincRNAs with conserved microsynteny, we used Xenopus as a proxy to analyse the homologies in expression and function. We had to proceed this way due to the difficulties associated with the inhibition of genes in B. lanceolatum, and the unavailability of expression patterns for Hotairm1 in the literature. After we successfully characterised Hotairm1 expression in amphioxus and Xenopus, we injected morpholino oligonucleotides to target and inhibit the splicing of Hotairm1 to promote an isoform imbalance. Through the phenotype obtained and qPCR assays, we were able to deduce the mechanism of Hotairm1 and successfully relate this mechanism to the one described in human cells. With all the data obtained we were able to strongly suggest that the amphioxus Hotairm1 is homologous to the Xenopus and human Hotairm1, thus being conserved in most of the lineages within chordates. The second case studied was the conservation of the regulation of the Hox cluster mediated by Cdx. When analysing the B. floridae knockouts of Cdx and Pdx obtained using the TALEN technique, we found a severe phenotype of the developing larvae in Cdx-/- and a mild phenotype in Pdx-/-.
The Cdx-/- phenotype consisted of the disruption of posterior gut development, as well as an underdevelopment of the postanal tail, coupled with a non-opening anus. When looking at changes in the expression of the Hox cluster in these Cdx-/- embryos, we found collinear misregulation of the expressed Hox genes, with the most anterior Hox cluster genes upregulated, and the most posterior ones downregulated. This is very similar to findings seen in triple morpholino knockdowns of the Cdx genes in Xenopus, indicating that in both Xenopus and amphioxus, Cdx regulates the Hox cluster through a homologous mechanism.

    High resolution temporal transcriptomics of mouse embryoid body development reveals complex expression dynamics of coding and noncoding loci.

    Cellular responses to stimuli are rapid and continuous, and yet the vast majority of investigations of transcriptional responses during developmental transitions use long-interval time courses, limiting the available interpretive power. Moreover, such experiments typically focus on protein-coding transcripts, ignoring the important impact of long noncoding RNAs. We therefore evaluated coding and noncoding expression dynamics at unprecedented temporal resolution (6-hourly) in differentiating mouse embryonic stem cells and report new insight into molecular processes and genome organization. We present a highly resolved differentiation cascade that exhibits coding and noncoding transcriptional alterations, transcription factor network interactions and alternative splicing events, little of which can be resolved by long-interval developmental time-courses. We describe novel short-lived and cycling patterns of gene expression and dissect temporally ordered gene expression changes in response to transcription factors. We elucidate patterns in gene co-expression across the genome, describe asynchronous transcription at bidirectional promoters and functionally annotate known and novel regulatory lncRNAs. These findings highlight the complex and dynamic molecular events underlying mammalian differentiation that can only be observed through a temporally resolved time course.

    Promoter architecture and gene expression dynamics in embryonic development

    Genes indispensable for proper embryonic development show intricate patterns of expression throughout the time, space and magnitude of their activity. This diversity is enabled by elaborate regulatory mechanisms that guide their expression. They also possess a distinct type of core promoter that enables the integration of all regulatory inputs. However, it is still not clear how coordination of regulation is achieved. The first step towards understanding this process is to characterise the dynamics of expression and the core promoter features that process the regulation. In this thesis, I explored the diversity of spatio-temporal gene expression during zebrafish development. I defined a novel measure of anatomical specificity that defines how precisely an anatomical structure is defined in the Anatomical Ontology system. Using this anatomical specificity measure, I quantified gene expression dynamics from mRNA in situ hybridisation data. Gene expression divergence from in situs was used to predict expression levels from RNA-seq expression data. This analysis allowed me to propose a measure of gene expression complexity which showed that genes with the highest complexity score are developmental genes, whereas genes with low complexity score are involved in housekeeping functions. Next, I developed a method that reports significantly enriched core promoter elements in a group of genes. Using this method, I compared differences in core promoter composition in active genes expressed in different developmental periods. In addition, this method found groups of genes with a specific core promoter structure that are specified for a biological process. Finally, I used scRNA-seq data from zebrafish development to identify patterns of gene co-expression across different cell clusters. Co-expression suggests that a gene pair possesses a common regulatory programme.
I show that genes with the most divergent co-expression patterns across development are developmental genes and that housekeeping genes have the least diverse co-expression patterns. I went further to create co-expression networks, which allowed me to analyse co-expression patterns in more detail.

    Genome-Wide Computational Prediction and Analysis of Core Promoter Elements across Plant Monocots and Dicots

    Transcription initiation, essential to gene expression regulation, involves recruitment of basal transcription factors to the core promoter elements (CPEs). The distribution of currently known CPEs across plant genomes is largely unknown. This is the first large scale genome-wide report on the computational prediction of CPEs across eight plant genomes to help better understand the transcription initiation complex assembly. The distribution of thirteen known CPEs across four monocots (Brachypodium distachyon, Oryza sativa ssp. japonica, Sorghum bicolor, Zea mays) and four dicots (Arabidopsis thaliana, Populus trichocarpa, Vitis vinifera, Glycine max) reveals the structural organization of the core promoter in relation to the TATA-box as well as with respect to other CPEs. The distribution of known CPE motifs with respect to transcription start site (TSS) exhibited positional conservation within monocots and dicots with slight differences across all eight genomes. Further, a more refined subset of annotated genes based on orthologs of the model monocot (O. sativa ssp. japonica) and dicot (A. thaliana) genomes supported the positional distribution of these thirteen known CPEs. DNA free energy profiles provided evidence that the structural properties of promoter regions are distinctly different from those of the non-regulatory genome sequence. They also showed that monocot core promoters have lower DNA free energy than dicot core promoters. The comparison of monocot and dicot promoter sequences highlights both the similarities and differences in the core promoter architecture irrespective of the species-specific nucleotide bias. This study will be useful for future work related to genome annotation projects and can inspire research efforts aimed at better understanding regulatory mechanisms of transcription.

    Uncovering the functional constraints underlying the genomic organisation of the Odorant-Binding Protein genes

    Animal olfactory systems have a critical role in the survival and reproduction of individuals. In insects, the odorant-binding proteins (OBPs) are encoded by a moderately sized gene family, and mediate the first steps of the olfactory processing. Most OBPs are organized in clusters of a few paralogs, which are conserved over time. Currently, the biological mechanism explaining the close physical proximity among OBPs is not yet established. Here, we conducted a comprehensive study aiming to gain insights into the mechanisms underlying the OBP genomic organization. We found that the OBP clusters are embedded within large conserved arrangements. These organizations also include other non-OBP genes, which often encode proteins integral to plasma membrane. Moreover, the conservation degree of such large clusters is related to the following: 1) the promoter architecture of the confined genes, 2) a characteristic transcriptional environment, and 3) the chromatin conformation of the chromosomal region. Our results suggest that chromatin domains may restrict the location of OBP genes to regions having the appropriate transcriptional environment, leading to the OBP cluster structure. However, the appropriate transcriptional environment for OBP and the other neighbor genes is not dominated by reduced levels of expression noise. Indeed, the stochastic fluctuations in the OBP transcript abundance may have a critical role in the combinatorial nature of the olfactory coding process.

    Non-coding genome contributions to the development and evolution of mammalian organs

    Protein-coding sequences only cover 1-2% of a typical mammalian genome. The remaining non-coding space hides thousands of genomic elements, some of which act via their DNA sequence while others are transcribed into non-coding RNAs. Many well-characterized non-coding elements are involved in the regulation of other genes, a process essential for the emergence of different cell types and organs during development. Changes in the expression of conserved genes during development are in turn thought to facilitate evolutionary innovation in form and function. Thus, non-coding genomic elements are hypothesized to play important roles in developmental and evolutionary processes. However, challenges related to the identification and characterization of these elements, in particular in non-model organisms, have limited the study of their overall contributions to mammalian organ development and evolution. During my dissertation work, I addressed this gap by studying two major classes of non-coding elements, long non-coding RNAs (lncRNAs) and cis-regulatory elements (CREs). In the first part of my thesis, I analyzed the expression profiles of lncRNAs during the development of seven major organs in six mammals and a bird. I showed that, unlike protein-coding genes, only a small fraction of lncRNAs is expressed in reproducibly dynamic patterns during organ development. These lncRNAs are enriched for a series of features associated with functional relevance, including increased evolutionary conservation and regulatory complexity, highlighting them as candidates for further molecular characterization. I then associated these lncRNAs with specific genes and functions based on their spatiotemporal expression profiles. My analyses also revealed differences in lncRNA contributions across organs and developmental stages, identifying a developmental transition from broadly expressed and conserved lncRNAs towards an increasing number of lineage- and organ-specific lncRNAs.
Following up on these global analyses, I then focused on a newly-identified lncRNA in the marsupial opossum, Female Specific on chromosome X (FSX). The broad and likely autonomous female-specific expression of FSX suggests a role in marsupial X-chromosome inactivation (XCI). I showed that FSX shares many expression and sequence features with another lncRNA, RSX — a known regulator of XCI in marsupials. Comparisons to other marsupials revealed that both RSX and FSX emerged in the common marsupial ancestor and have since been preserved in marsupial genomes, while their broad and female-specific expression has been retained for at least 76 million years of evolution. Taken together, my analyses highlighted FSX as a novel candidate for regulating marsupial XCI. In the third part of this work, I shifted my focus to CREs and their cell type-specific activities in the developing mouse cerebellum. After annotating cerebellar cell types and states based on single-cell chromatin accessibility data, I identified putative CREs and characterized their spatiotemporal activity across cell types and developmental stages. Focusing on progenitor cells, I described temporal changes in CRE activity that are shared between early germinal zones, supporting a model of cell fate induction through common developmental cues. By examining chromatin accessibility dynamics during neuronal differentiation, I revealed a gradual divergence in the regulatory programs of major cerebellar neuron types. In the final part, I explored the evolutionary histories of CREs and their potential contributions to gene expression changes between species. By comparing mouse CREs to vertebrate genomes and chromatin accessibility profiles from the marsupial opossum, I identified a temporal decrease in CRE conservation, which is shared across cerebellar cell types. However, I also found differences in constraint between cell types, with microglia having the fastest evolving CREs in the mouse cerebellum. 
Finally, I used deep learning models to study the regulatory grammar of cerebellar cell types in human and mouse, showing that the sequence rules determining CRE activity are conserved across mammals. I then used these models to retrace the evolutionary changes leading to divergent CRE activity between species. Collectively, my PhD work provides insights into the evolutionary dynamics of non-coding genes and regulatory elements, the processes associated with their conservation, and their contributions to the development and evolution of mammalian cell types and organs.