6 research outputs found

    G-quadruplex occurrence and conservation : more than just a question of guanine–cytosine content

    Abstract : G-quadruplexes are motifs found in DNA and RNA that can fold into tertiary structures. Until now, they have been studied experimentally mainly in humans and a few other species. Recently, predictions have been made with bacterial and archaeal genomes. Nevertheless, a global comparison of predicted G4s (pG4s) across and within the three living kingdoms has not been addressed. In this study, we aimed to predict G4s in genes and transcripts of all kingdoms of living organisms and investigated the differences in their distributions. The relation of the predictions with GC content was studied. It appears that GC content is not the only parameter impacting G4 predictions and abundance. The distribution of pG4 densities varies depending on the class of transcripts and the group of species. Indeed, we have observed that, in coding transcripts, there are more predicted G4s than expected for eukaryotes but not for archaea and bacteria, while in noncoding transcripts, there are as many or fewer predicted G4s in all species groups. We even noticed that some species with the same GC content presented different pG4 profiles. For instance, Leishmania major and Chlamydomonas reinhardtii both have 60% of GC content, but the former has a pG4 density of 0.07 and the latter 1.16
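The point that two sequences can share a GC content yet differ in pG4 density can be illustrated with a minimal sketch. The abstract does not state which predictor or density unit was used, so everything here is an assumption: the canonical motif (four runs of at least three guanines separated by loops of 1 to 7 nucleotides) is a simple stand-in for the study's predictor, density is expressed per kilobase, and the function names and toy sequences are hypothetical.

```python
import re

# ASSUMPTION: a canonical G4 motif used only for illustration, not the
# predictor used in the study. Four runs of >=3 G, loops of 1-7 nucleotides.
G4_MOTIF = re.compile(r"(?:G{3,}[ACGTU]{1,7}){3}G{3,}")


def gc_content(seq: str) -> float:
    """Return the fraction of G and C nucleotides in seq (0.0 if empty)."""
    seq = seq.upper()
    if not seq:
        return 0.0
    return sum(1 for base in seq if base in "GC") / len(seq)


def count_g4(seq: str) -> int:
    """Count non-overlapping matches of the canonical motif in seq."""
    return sum(1 for _ in G4_MOTIF.finditer(seq.upper()))


def g4_density(seq: str) -> float:
    """Return motif matches per kilobase (ASSUMED unit; 0.0 if empty)."""
    if not seq:
        return 0.0
    return 1000.0 * count_g4(seq) / len(seq)


# Two hypothetical toy sequences with identical GC content.
seq_a = "GGGAGGGAGGGAGGG"  # guanines clustered into runs
seq_b = "GCGAGCGAGCGAGCG"  # same composition class, no G runs

for name, seq in (("seq_a", seq_a), ("seq_b", seq_b)):
    print(name, gc_content(seq), count_g4(seq), g4_density(seq))
```

Both toy sequences have the same GC fraction, but only the first contains guanine runs that satisfy the motif, which is why GC content alone cannot account for the number of predictions.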

    GAIA : G-quadruplexes in alive creature database

    Abstract: G-quadruplexes (G4s) are 3D structures that are found in both DNA and RNA. Interest in this structure has grown over the past few years due to both its implication in diverse biological mechanisms and its potential use as a therapeutic target, to name two examples. G4s in humans have been widely studied; however, they remain relatively little studied in other species. That said, progress in this field has resulted in the prediction of G4 structures in various species, ranging from bacteria to eukaryotes. These predictions were analysed in a previous study, which revealed that G4s are present in all living kingdoms. To date, eleven different databases have grouped G4s according to their structures, the proteins that might bind them, or their location in the various genomes. However, none of these databases contains information on their location in the transcriptome of many of the implicated species. The GAIA database was designed to make this data available online in a user-friendly manner. Through its web interface, users can query GAIA to filter G4s, which, we hope, will help research in this field. GAIA is available at: https://gaia.cobius.usherbrooke.c

    Development of a highly optimized procedure for the discovery of RNA G-quadruplexes by combining several strategies

    Abstract: RNA G-quadruplexes (rG4s) are non-canonical secondary structures that are formed by the self-association of guanine quartets and that are stabilized by monovalent cations (e.g. potassium). rG4s are key elements in several post-transcriptional regulation mechanisms, including both messenger RNA (mRNA) and microRNA processing, mRNA transport and translation, to name but a few examples. Over the past few years, multiple high-throughput approaches have been developed in order to identify rG4s, including bioinformatic prediction, in vitro assays and affinity capture experiments coupled to RNA sequencing. Each individual approach had its limits, and thus yielded only a fraction of the potential rG4s that are further confirmed (i.e., there is a significant level of false positives). This report aims to benefit from the strengths of several existing approaches to identify rG4s with a high potential of being folded in cells. Briefly, rG4s were pulled down from cell lysates using the biotinylated biomimetic G4 ligand BioTASQ and the sequences thus isolated were then identified by RNA sequencing. Then, a novel bioinformatic pipeline that included DESeq2 to identify rG4 enriched transcripts, MACS2 to identify rG4 peaks, rG4-seq to increase rG4 formation probability and G4RNA Screener to detect putative rG4s was performed. This workflow uncovers new rG4 candidates whose rG4-folding was then confirmed in vitro using an array of established biophysical methods. Clearly, this workflow led to the identification of novel rG4s in a highly specific and reliable manner.

    Where are G-quadruplexes located in the human transcriptome?

    Abstract: It has been demonstrated that RNA G-quadruplexes (G4s) are structural motifs present in transcriptomes and play important regulatory roles in several posttranscriptional mechanisms. However, the full picture of RNA G4 locations and the extent of their implication remain elusive. Only computational prediction analysis of the whole transcriptome may reveal all potential G4s, since experimental identifications are always limited to specific conditions or specific cell lines. The present study reports the first in-depth computational prediction of potential G4 regions across the complete human transcriptome. Despite the use of a relatively stringent approach based on three prediction scores that account for the composition of G4 sequences, the composition of their neighboring sequences, and the various forms of G4, over 1.1 million potential G4s (pG4s) were predicted. The abundance of G4s was computationally confirmed in both the 5′ and 3′ UTRs as well as at the splicing junctions of mRNAs, was observed for the first time in long ncRNAs, and was almost absent from most of the small ncRNA families. The present results constitute an important step toward a full understanding of the roles of G4s in posttranscriptional mechanisms.

    G-quadruplexes conservation and evolution

    Abstract: For a long time, the only known function of RNA was its role as an intermediary. However, it has since been discovered that, thanks to its single-stranded form, RNA can fold onto itself to form structures that confer diverse functions. Among these structures is the G-quadruplex (G4), which is built from Hoogsteen base pairs instead of Watson-Crick base pairs. G4s can fold in RNA, but also in DNA, where they can have multiple functions. The study of G4s remained limited until recent years, when interest surged following the discovery of their presence in telomeres and of their impact on cancer. G4s are also highly enriched in regulatory regions such as promoters in genomes and untranslated regions in transcripts. Some studies suggest that this distribution is similar among all eukaryotes but not in bacteria. The evolution of G4s remains poorly studied, mainly because G4 sequences are poorly conserved, which makes them harder to study. In the present thesis, the distribution of G4s among the living kingdoms was explored, along with different approaches to better understand their evolution. To do so, the first step was to predict G4s in the human transcriptome and to validate these predictions. This strategy was then applied to diverse species across the three living kingdoms. These data were analysed and made available to the research community in a database named GAIA. Finally, an approach was developed to try to better understand the evolution of G4s.

    Development of a highly optimized procedure for the discovery of RNA G-quadruplexes by combining several strategies

    Abstract: RNA G-quadruplexes (rG4s) are non-canonical secondary structures that are formed by the self-association of guanine quartets and that are stabilized by monovalent cations (e.g. potassium). rG4s are key elements in several post-transcriptional regulation mechanisms, including both messenger RNA (mRNA) and microRNA processing, mRNA transport and translation, to name but a few examples. Over the past few years, multiple high-throughput approaches have been developed in order to identify rG4s, including bioinformatic prediction, in vitro assays and affinity capture experiments coupled to RNA sequencing. Each individual approach had its limits, and thus yielded only a fraction of the potential rG4s that are further confirmed (i.e., there is a significant level of false positives). This report aims to benefit from the strengths of several existing approaches to identify rG4s with a high potential of being folded in cells. Briefly, rG4s were pulled down from cell lysates using the biotinylated biomimetic G4 ligand BioTASQ and the sequences thus isolated were then identified by RNA sequencing. Then, a novel bioinformatic pipeline that included DESeq2 to identify rG4 enriched transcripts, MACS2 to identify rG4 peaks, rG4-seq to increase rG4 formation probability and G4RNA Screener to detect putative rG4s was performed. This workflow uncovers new rG4 candidates whose rG4-folding was then confirmed in vitro using an array of established biophysical methods. Clearly, this workflow led to the identification of novel rG4s in a highly specific and reliable manner.