1,138 research outputs found

    Shuffling of cis-regulatory elements is a pervasive feature of the vertebrate lineage

    BACKGROUND: All vertebrates share a remarkable degree of similarity in their development as well as in the basic functions of their cells. Despite this, attempts at unearthing genome-wide regulatory elements conserved throughout the vertebrate lineage using BLAST-like approaches have thus far detected noncoding conservation in only a few hundred genes, mostly associated with regulation of transcription and development. RESULTS: We used a unique combination of tools to obtain regional global-local alignments of orthologous loci. This approach takes into account shuffling of regulatory regions that are likely to occur over evolutionary distances greater than those separating mammalian genomes. This approach revealed one order of magnitude more vertebrate conserved elements than was previously reported in over 2,000 genes, including a high number of genes found in the membrane and extracellular regions. Our analysis revealed that 72% of the elements identified have undergone shuffling. We tested the ability of the elements identified to enhance transcription in zebrafish embryos and compared their activity with a set of control fragments. We found that more than 80% of the elements tested were able to enhance transcription significantly, prevalently in a tissue-restricted manner corresponding to the expression domain of the neighboring gene. CONCLUSION: Our work elucidates the importance of shuffling in the detection of cis-regulatory elements. It also elucidates how similarities across the vertebrate lineage, which go well beyond development, can be explained not only within the realm of coding genes but also in that of the sequences that ultimately govern their expression.

    Genetic and epigenetic regulation of gene expression in pancreatic islets

    T2D is a complex disease with evidence of a strong genetic basis. Studies of the human pancreatic islets have provided valuable insight into the islet regulatory landscape and identified enrichment of T2D-associated variants in islet enhancers. The relationship between cis-regulatory variation and changes in gene expression, however, remains unclear. This question is challenging to address in humans, as it calls for a systematic analysis of cis-regulatory variation and the impact it has on gene expression in the native genomic context. While a mouse model can be used for this purpose, the regulatory landscape of the mouse pancreatic islets has been poorly characterized. This thesis addresses the questions of genetic and epigenetic regulation of gene expression in pancreatic islets in two parts. First, a genome-wide map of several histone modifications and transcription factor binding sites is created for the mouse pancreatic islet. This enabled identification of the mouse islet regulatory elements and characterization of enhancer clustering, transcription factor occupancy and conservation. A systematic comparison of enhancer clustering between mouse and human islets identified species-common and species-specific subsets of the islet regulatory program, each associated with a distinct biological function of the pancreatic islet. Second, a hybrid mouse was used as a model where naturally occurring genetic variation drives changes in gene expression. High-density mapping of allelic regulatory activity, chromatin accessibility and transcription provided insight into the properties of both genes subject to cis-regulatory variation and the regulatory elements driving the change in expression. As a result, tissue-specific genes associated with clustered enhancers were shown to be most influenced by cis-regulatory variation. Additionally, enhancer clustering emerged as the dominant property of regulatory elements associated with changes in gene expression.
Overall, this thesis advanced our knowledge of the mouse islet regulatory landscape and provided novel insight into the properties of functional cis-regulatory variation.

    Role of Cis-regulatory Elements in Transcriptional Regulation: From Evolution to 4D Interactions

    Transcriptional regulation is the principal mechanism for establishing cell-type-specific gene activity, exploring with high precision an almost infinite space of combinations of regulatory elements and transcription factors. Recent efforts have mapped thousands of candidate regulatory elements, a large portion of which are cell-type specific, yet it is still unclear what fraction of these elements is functional, what genes these elements regulate, or how they are established in a cell-type-specific manner. In this dissertation, I will discuss methods and approaches I developed to better understand the role of regulatory elements and transcription factors in gene expression regulation. First, by comparing the transcriptome and chromatin landscape between mouse and human innate immune cells, I showed that specific gene expression programs are regulated by highly conserved regulatory elements that contain a set of constrained sequence motifs, which can successfully classify gene induction in both species. Next, using chromatin interactions, I accurately defined functional enhancers and their target genes. This fine mapping dramatically improved the prediction of transcriptional changes. Finally, we built a supervised learning approach to detect the short DNA sequence motifs that regulate the activation of regulatory elements following LPS stimulation. This approach detected several transcription factors to be critical in remodeling the epigenetic landscape both across time and individuals. Overall, this thesis addresses several important aspects of cis-regulatory elements in transcriptional regulation and starts to derive principles and models of gene-expression regulation that address the fundamental question: “How do cis-regulatory elements drive cell-type-specific transcription?”

    The evolutionary dynamics of genomic regulatory blocks in metazoan genomes

    Developmental genes require intricate control of the timing, location and magnitude of their expression. This is provided by multiple evolutionarily conserved enhancers, known as conserved non-coding elements (CNEs). CNEs cluster around their target genes, forming long syntenic arrays known as genomic regulatory blocks (GRBs). Current methods for GRB identification rely on the selection of arbitrary minimum conservation thresholds, impeding their performance in many contexts. In this thesis, I propose a novel measure of pairwise genome conservation that eliminates the need for conservation thresholds, and use this measure to study the evolutionary dynamics of GRBs in metazoa. I define sets of GRBs based on their rate of regulatory turnover, high turnover GRBs (htGRBs) and low turnover GRBs (ltGRBs), in three independent metazoan lineages. I show that ht- and ltGRBs target functionally distinct classes of genes, and that these genes tend to be expressed during late and early development respectively, potentially contributing to their differing tolerance of regulatory turnover. Moreover, the differences between ht- and ltGRBs are consistent across all three lineages, suggesting that similar evolutionary pressures have defined the rate of turnover in these GRBs since their emergence in the metazoan ancestor. Next I identify GRBs in the extremely compact Caenorhabditis elegans and Oikopleura dioica genomes for the first time, and use these GRBs to investigate the effects of genome compaction on GRB size and composition. I show that GRB size scales proportionally with genome size and that GRBs exhibit similar enrichment and depletion of specific genomic features. This suggests that regardless of background genome content, GRBs are under similar pressure to maintain a permissive environment for long-range gene regulation.
The development of a threshold-free GRB identification method has facilitated the analysis of GRBs in both closely related species and compact genomes, providing further insights into their origin and evolution.

    SINEUPs, a new class of antisense long non-coding RNAs that enhance synthesis of target proteins in cells: molecular mechanisms and applications

    Thanks to continuous advances in sequencing technologies, we know that a huge number of non-coding RNAs are transcribed from mammalian genomes. Of these, long non-coding RNAs (lncRNAs) represent the widest and most heterogeneous class. An increasing number of studies are unveiling lncRNA functions, supporting their active role in regulating gene expression. Regardless of lncRNAs' specific functional features, their organization into discrete domains seems to represent a common denominator. Through such domains lncRNAs can recruit and coordinate the activity of multiple effectors, thus working as “flexible modular scaffolds”. This model has globally driven towards the quest for regulatory elements within lncRNAs, with a special attention on functional cues deriving from RNA folding. Since transposable elements (TEs) represent 40% of nucleotides of lncRNA sequences, they have been proposed as candidate functional modules. Carrieri and colleagues recently reported that an embedded inverted SINEB2 element acts as a functional domain in antisense (AS) Uchl1, an AS lncRNA able to increase translation of partially-overlapping protein-coding sense Uchl1 mRNA. AS Uchl1 regulatory properties depend on two RNA domains. A 5' overlapping sequence to the sense transcript is the Binding Domain (BD) and drives specificity of action. An embedded inverted SINEB2 element functions as Effector Domain (ED) conferring translational activation power. AS Uchl1 is the representative member of a new class of lncRNAs, named SINEUPs, as they rely on a SINEB2 element to UP-regulate translation. AS Uchl1 activity can be transferred to a synthetic construct by manipulating the AS sequence in the BD, suggesting the potential use of AS Uchl1-derived synthetic SINEUPs as tools to increase translation of selected targets.
This work was the first example of a specific biological function assigned to an embedded TE leading to the hypothesis that embedded TEs provide functional modules to lncRNAs. A major limit to the application of SINEUPs is represented by the poor knowledge of the basic mechanisms underlying the biological activity of the ED. A crucial challenge becomes the identification of secondary structures that may confer characteristic protein binding properties. Protein partners would modulate SINEUPs action and contribute to achieve specific functional outputs. In this thesis, I focus on understanding the molecular basis of SINEUPs activity in cells and I discuss the potential applications of synthetic SINEUPs as translation enhancers. First, I investigated the structural basis for translation activation mediated by the ED of SINEUPs. I pointed out that specific structural regions, containing a short terminal hairpin, are involved in the ability of natural and synthetic SINEUPs to increase translation of target mRNAs. Next, I identified protein partners modulating the activity of SINEUPs in cells. I found that AS Uchl1 interacts with the interleukin enhancer-binding factor 3 (ILF3) and that the presence of the inverted SINEB2 favors binding in vivo. In particular, I demonstrated that the AS Uchl1-embedded TEs, inverted SINEB2 and Alu, direct AS Uchl1 localization to ILF3-containing complexes, thus contributing to AS Uchl1 bias towards nuclear localization. I thus suggest that nuclear retention could represent a possible mechanism regulating SINEUP activity. I also validated the scalability of synthetic SINEUPs as tools to increase protein synthesis of targets of choice. I showed that SINEUP technology can be adapted to a broader number of targets, with interesting potential applications in different fields, from biotechnology to therapy. SINEUPs function in an array of cell lines and can be efficiently directed toward N-terminally tagged proteins. 
Their biological activity is retained in a miniaturized version within the length range of small RNAs. Their modular structure can be exploited to successfully design synthetic SINEUPs against selected endogenous targets, supporting their efficacy as tools to modulate gene expression in vitro and in vivo. Hence, I propose SINEUPs as versatile tools to enhance translation of mRNAs of choice.

    Comparative transcriptomics in plants

    Comparative genomics is the study of the structural and functional relationships between the genomes of different species or strains. Recently microarray experiments have yielded massive amounts of expression information for many genes under various conditions or in different tissues for different model species. Expression compendia grouping multiple microarray experiments performed in similar (or different) experimental conditions make it possible to define correlated expression patterns between genes. Genes within such a coexpression cluster are expected to have more similar functionality compared to genes lacking expression similarity. In this thesis the different steps required to systematically compare expression data across species are described and some future applications of plant comparative transcriptomics are highlighted. Then we analyzed whether functionally related genes show coexpression in Arabidopsis and rice and developed a general framework to measure expression context conservation (ECC) for orthologous genes. Additionally, we studied the evolutionary parameters influencing ECC conservation and compared expression with sequence evolution. At the end, a new method is presented to define high quality tissue specific genes in seven different plant species: A. thaliana (Arabidopsis), Z. mays (Maize), M. truncatula (Medicago), P. trichocarpa (Poplar), O. sativa (Rice), G. max (Soybean) and V. vinifera (Grape) using Affymetrix microarray expression profiles. We also performed an in-depth study on the relationship between leaf tissue specific genes coexpression clusters, within a species and in comparison with other species for a set of strictly selected genes.

    Non-coding genome contributions to the development and evolution of mammalian organs

    Protein-coding sequences only cover 1-2% of a typical mammalian genome. The remaining non-coding space hides thousands of genomic elements, some of which act via their DNA sequence while others are transcribed into non-coding RNAs. Many well-characterized non-coding elements are involved in the regulation of other genes, a process essential for the emergence of different cell types and organs during development. Changes in the expression of conserved genes during development are in turn thought to facilitate evolutionary innovation in form and function. Thus, non-coding genomic elements are hypothesized to play important roles in developmental and evolutionary processes. However, challenges related to the identification and characterization of these elements, in particular in non-model organisms, have limited the study of their overall contributions to mammalian organ development and evolution. During my dissertation work, I addressed this gap by studying two major classes of non-coding elements, long non-coding RNAs (lncRNAs) and cis-regulatory elements (CREs). In the first part of my thesis, I analyzed the expression profiles of lncRNAs during the development of seven major organs in six mammals and a bird. I showed that, unlike protein-coding genes, only a small fraction of lncRNAs is expressed in reproducibly dynamic patterns during organ development. These lncRNAs are enriched for a series of features associated with functional relevance, including increased evolutionary conservation and regulatory complexity, highlighting them as candidates for further molecular characterization. I then associated these lncRNAs with specific genes and functions based on their spatiotemporal expression profiles. My analyses also revealed differences in lncRNA contributions across organs and developmental stages, identifying a developmental transition from broadly expressed and conserved lncRNAs towards an increasing number of lineage- and organ-specific lncRNAs.
Following up on these global analyses, I then focused on a newly-identified lncRNA in the marsupial opossum, Female Specific on chromosome X (FSX). The broad and likely autonomous female-specific expression of FSX suggests a role in marsupial X-chromosome inactivation (XCI). I showed that FSX shares many expression and sequence features with another lncRNA, RSX, a known regulator of XCI in marsupials. Comparisons to other marsupials revealed that both RSX and FSX emerged in the common marsupial ancestor and have since been preserved in marsupial genomes, while their broad and female-specific expression has been retained for at least 76 million years of evolution. Taken together, my analyses highlighted FSX as a novel candidate for regulating marsupial XCI. In the third part of this work, I shifted my focus to CREs and their cell type-specific activities in the developing mouse cerebellum. After annotating cerebellar cell types and states based on single-cell chromatin accessibility data, I identified putative CREs and characterized their spatiotemporal activity across cell types and developmental stages. Focusing on progenitor cells, I described temporal changes in CRE activity that are shared between early germinal zones, supporting a model of cell fate induction through common developmental cues. By examining chromatin accessibility dynamics during neuronal differentiation, I revealed a gradual divergence in the regulatory programs of major cerebellar neuron types. In the final part, I explored the evolutionary histories of CREs and their potential contributions to gene expression changes between species. By comparing mouse CREs to vertebrate genomes and chromatin accessibility profiles from the marsupial opossum, I identified a temporal decrease in CRE conservation, which is shared across cerebellar cell types. However, I also found differences in constraint between cell types, with microglia having the fastest evolving CREs in the mouse cerebellum.
Finally, I used deep learning models to study the regulatory grammar of cerebellar cell types in human and mouse, showing that the sequence rules determining CRE activity are conserved across mammals. I then used these models to retrace the evolutionary changes leading to divergent CRE activity between species. Collectively, my PhD work provides insights into the evolutionary dynamics of non-coding genes and regulatory elements, the processes associated with their conservation, and their contributions to the development and evolution of mammalian cell types and organs.