26 research outputs found

    Motif independent identification of potential RNA G-quadruplexes by G4RNA screener

    Abstract: Motivation: G-quadruplex structures in RNA molecules are known to have regulatory impacts in cells but are difficult to locate in the genome. The minimal requirements for G-quadruplex folding in RNA (G≥3 N1-7 G≥3 N1-7 G≥3 N1-7 G≥3) are being challenged by observations made on specific examples in recent years. The definition of potential G-quadruplex sequences has major repercussions on the observation of the structure since it introduces a bias. The canonical motif only describes a sub-population of the reported G-quadruplexes. To address these issues, we propose an RNA G-quadruplex prediction strategy that does not rely on a motif definition. Results: We trained an artificial neural network with sequences of experimentally validated G-quadruplexes from the G4RNA database encoded using an abstract definition of their sequence. This artificial neural network, G4NN, evaluates the similarity of a given sequence to known G-quadruplexes and reports it as a score. G4NN has a predictive power comparable to the reported G richness and G/C skewness evaluations that are the current state-of-the-art for the identification of potential RNA G-quadruplexes. We combined these approaches in the G4RNA screener, a program designed to manage and evaluate the sequences to identify potential G-quadruplexes.
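
    For illustration only, the canonical motif that this abstract argues against (four runs of at least three guanines separated by loops of 1 to 7 nucleotides) can be written as a regular expression. This is a minimal sketch and not the G4RNA screener code: the function name, the T-to-U normalization, and the choice to report non-overlapping matches are assumptions made for the example.

```python
import re

# Canonical motif: G>=3 N1-7 G>=3 N1-7 G>=3 N1-7 G>=3.
# Loop nucleotides (N) may be any of A, C, G, U.
CANONICAL_G4 = re.compile(
    r"G{3,}[ACGU]{1,7}G{3,}[ACGU]{1,7}G{3,}[ACGU]{1,7}G{3,}"
)

def find_canonical_g4(sequence):
    """Return (start, matched_sequence) for non-overlapping canonical motifs.

    Hypothetical helper: DNA-style input is accepted and converted to RNA.
    """
    rna = sequence.upper().replace("T", "U")
    return [(m.start(), m.group(0)) for m in CANONICAL_G4.finditer(rna)]
```

    A sequence whose first loop is 8 nt long is not reported by this pattern, which is exactly the kind of bias a motif-independent predictor is meant to avoid.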

    Aven recognition of RNA G-quadruplexes regulates translation of the mixed lineage leukemia protooncogenes

    Abstract: G-quadruplexes (G4) are extremely stable secondary structures forming stacks of guanine tetrads. DNA G4 structures have been extensively studied; however, less is known about G4 motifs in mRNAs, especially in their coding sequences. Herein, we show that Aven stimulates the mRNA translation of the mixed lineage leukemia (MLL) proto-oncogene in an arginine methylation-dependent manner. The Aven RGG/RG motif bound G4 structures within the coding regions of the MLL1 and MLL4 mRNAs, increasing their polysomal association and translation and resulting in the induction of transcription of leukemic genes. The DHX36 RNA helicase associated with the Aven complex and was required for optimal translation of G4 mRNAs. Depletion of Aven led to a decrease in synthesis of MLL1 and MLL4 proteins, resulting in reduced proliferation of leukemic cells. These findings identify an Aven-centered complex that stimulates the translation of G4-harboring mRNAs, thereby promoting survival of leukemic cells.

    G4RNA: an RNA G-quadruplex database

    Abstract: G-quadruplexes (G4) are tetrahelical structures formed from planar arrangements of guanines in nucleic acids. A simple, regular motif was originally proposed to describe G4-forming sequences. More recently, however, formation of G4 was discovered to depend, at least in part, on the contextual backdrop of neighboring sequences. Prediction of G4 folding is thus becoming more challenging as G4 outlier structures, not described by the originally proposed motif, are increasingly reported. Recent observations thus call for a comprehensive tool, capable of consolidating the expanding information on tested G4s, in order to conduct systematic comparative analyses of G4-promoting sequences. The G4RNA Database we propose was designed to help meet the need for easily retrievable data on known RNA G4s. A user-friendly, flexible query system allows for data retrieval on experimentally tested sequences, from many separate genes, to assess G4-folding potential. Query output sorts data according to sequence position, G4 likelihood, experimental outcomes and associated bibliographical references. G4RNA also provides an ideal foundation to collect and store additional sequence and experimental data, considering the growing interest G4s currently generate.

    Where are G-quadruplexes located in the human transcriptome?

    Abstract: It has been demonstrated that RNA G-quadruplexes (G4) are structural motifs present in transcriptomes and play important regulatory roles in several posttranscriptional mechanisms. However, the full picture of RNA G4 locations and the extent of their implication remain elusive. Only computational prediction across the whole transcriptome may reveal all potential G4, since experimental identifications are always limited to specific conditions or specific cell lines. The present study reports the first in-depth computational prediction of potential G4 regions across the complete human transcriptome. Despite the use of a relatively stringent approach based on three prediction scores that account for the composition of G4 sequences, the composition of their neighboring sequences, and the various forms of G4, over 1.1 million potential G4 (pG4) were predicted. The abundance of G4 was computationally confirmed in both 5′ and 3′ UTRs as well as at splicing junctions of mRNAs, and was observed for the first time in long ncRNAs, whereas G4 were almost absent from most small ncRNA families. The present results constitute an important step toward a full understanding of the roles of G4 in posttranscriptional mechanisms.

    The folding of 5′-UTR human G-quadruplexes possessing a long central loop

    Abstract: G-quadruplexes are widespread four-stranded structures that are adopted by G-rich regions of both DNA and RNA and are involved in essential biological processes such as mRNA translation. They are formed by the stacking of two or more G-quartets that are linked together by three loops. Although the maximal loop length is usually fixed to 7 nt in most G-quadruplex-predicting software, it has already been demonstrated that artificial DNA G-quadruplexes containing two distal loops that are limited to 1 nt each and a central loop up to 30 nt long are likely to form in vitro. This report demonstrates that such structures possessing a long central loop are actually found in the 5′-UTRs of human mRNAs. Firstly, 1453 potential G-quadruplex-forming sequences (PG4s) were identified through a bioinformatic survey that searched for sequences respecting the requirement for two 1-nt long distal loops and a long central loop of 2–90 nt in length. Secondly, in vitro in-line probing experiments confirmed and characterized the folding of eight candidates possessing central loops of 10–70 nt. Finally, the biological effect of several G-quadruplexes with a long central loop on mRNA expression was studied in cellulo using a luciferase gene reporter assay. Clearly, the current definition of G-quadruplex-forming sequences is too conservative and must be expanded to include the long central loop. This greatly expands the number of expected PG4s in the transcriptome. Consideration of these new candidates might aid in elucidating the potentially important biological implications of the G-quadruplex structure.
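
    The search criterion described in this abstract (two 1-nt distal loops and a central loop of 2 to 90 nt) can be sketched as a regular expression. This is only an illustration, not the survey actually used in the study: the abstract does not state the required G-run length, so runs of at least three guanines are assumed here, and the function name and T-to-U normalization are likewise hypothetical.

```python
import re

# Assumption: each G-run has at least three guanines (not stated in the abstract).
# Loops 1 and 3 are exactly 1 nt; the captured central loop is 2-90 nt.
LONG_CENTRAL_LOOP_G4 = re.compile(
    r"G{3,}[ACGU]G{3,}([ACGU]{2,90})G{3,}[ACGU]G{3,}"
)

def find_long_central_loop_g4(sequence):
    """Return (start, central_loop_length) for non-overlapping candidates.

    Hypothetical helper: DNA-style input is accepted and converted to RNA.
    """
    rna = sequence.upper().replace("T", "U")
    return [
        (m.start(), len(m.group(1)))
        for m in LONG_CENTRAL_LOOP_G4.finditer(rna)
    ]
```

    Because the central loop is matched greedily, the reported loop length is the longest one compatible with the pattern at that position; a real survey would also need a policy for overlapping candidates.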

    Prediction of irregular RNA G-quadruplexes

    Abstract: Since RNA was shown to perform genuine biological functions beyond passively carrying information between DNA and protein decoding, understanding RNA folding has become an important field of study for grasping its impact in molecular biology. RNA structures are more complex than their DNA counterparts because of the absence of a complementary strand. The single-stranded RNA folds on itself into a variety of intramolecular structures, including the G-quadruplex, a tetrahelix that relies on the ability of guanines to interact four at a time. This interaction generates a cyclic plane which, upon stacking, forms the G-quadruplex. The first G-quadruplex sequences identified fit the motif G≥3 N1-7 G≥3 N1-7 G≥3 N1-7 G≥3, which rapidly became the standard for identifying potential G-quadruplexes. However, this definition is too narrow to accommodate recently reported examples, and it does not consider the nucleotide environment in which the motif is found, which is known to influence G-quadruplex folding. This thesis presents my contributions to improving the identification of new potential G-quadruplexes. To gain a better view of the problem, a repository of sequences whose G-quadruplex folding has been verified in the literature was released online for the community. These sequences supported the training of the artificial neural network G4NN, a G-quadruplex prediction tool developed with particular attention to unusual G-quadruplexes by reducing the bias associated with the motif. G4NN is available along with two other complementary methods as part of the G4RNA screener tool, which is offered as a web page for occasional use or as a local command-line version for extensive use.

    Irregular G-quadruplexes found in the untranslated regions of human mRNAs influence translation

    Abstract: G-quadruplex structures are composed of coplanar guanines and are found in both DNA and RNA. They are formed by the stacking of two or more G-quartets that are linked together by three loops. The current belief is that RNA G-quadruplexes include loops of 1 to 7 nucleotides in length, although recent evidence indicates that the central loop (loop 2) can be longer if loops 1 and 3 are limited to a single nucleotide each. With the objective of broadening the definition of irregular RNA G-quadruplexes, a bioinformatic search was performed to find potential G-quadruplexes located in the untranslated regions of human mRNAs (i.e. in the 5′- and 3′-UTRs) that contain either a long loop 1 or a long loop 3 of up to 40 nucleotides in length. RNA molecules including the potential sequences were then synthesized and examined in vitro by in-line probing for the formation of G-quadruplex structures. The sequences that adopted a G-quadruplex structure were cloned into a luciferase dual vector and examined for their ability to modulate translation in cellulo. Some irregular G-quadruplexes were observed to either promote or repress translation regardless of the position or the size of the long loop they possessed. Even if the composition of an RNA G-quadruplex is not yet completely understood, the results presented in this report clearly demonstrate that what defines an RNA G-quadruplex is much broader than previously believed.

    Guanine nucleotide-binding protein-like 1 (GNL1) binds RNA G-quadruplex structures in genes associated with Parkinson’s disease

    Abstract: RNAs are highly regulated at the post-transcriptional level in neurodegenerative diseases, and just a few mutations can significantly affect the fate of neuronal cells. To date, the impact of G-quadruplex (G4) regulation in neurodegenerative diseases like Parkinson’s disease (PD) has not been analysed. In this study, potential G4s located in deregulated genes related to the nervous system were initially identified in silico and were found to be significantly enriched. Several G4 sequences found in the 5ʹ untranslated regions (5ʹUTR) of mRNAs associated with Parkinson’s disease were demonstrated to fold in vitro by biochemical assays. Subcloning of the full-length 5ʹUTRs of these candidates upstream of a luciferase reporter system led to the demonstration that the G4s of both Parkin RBR E3 Ubiquitin Protein Ligase (PRKN) and Vacuolar Protein Sorting-Associated Protein 35 (VPS35) significantly repressed the translation of both genes in SH-SY5Y cells. Subsequently, label-free RNA affinity purification assays using either of these two G4 sequences as bait isolated the Guanine Nucleotide-Binding Protein-Like 1 (GNL1). The latter was shown to have a higher affinity for the G4 sequences than for their mutated versions. This study sheds light on new RNA G-quadruplexes located in genes dysregulated in Parkinson’s disease and on a new G4-binding protein, GNL1.

    G-Quadruplexes influence pri-microRNA processing

    Abstract: RNA G-quadruplexes (G4) have been shown to possess many biological functions, including the regulation of microRNA (miRNA) biogenesis and function. However, their impact on pri-miRNA processing remains unknown. We identified G4 located near the Drosha cleavage site in three distinct pri-miRNAs: pri-mir200c, pri-mir451a, and pri-mir497. The folding of the potential G4 motifs was determined in solution. Subsequently, mutations disrupting G4 folding led to important changes in the mature miRNA levels in cells. Moreover, using small antisense oligonucleotides binding to the pri-miRNA, it was possible to modulate, either positively or negatively, the mature miRNA levels. Together, these data demonstrate that G4 motifs could contribute to the regulation of pri-miRNA processing, a novel role for G4. Considering that bioinformatics screening indicates that between 9% and 50% of all pri-miRNAs contain a putative G4, these structures possess interesting potential as future therapeutic targets.