13 research outputs found

    Intrasplicing coordinates alternative first exons with alternative splicing in the protein 4.1R gene

    In the protein 4.1R gene, alternative first exons splice differentially to alternative 3' splice sites far downstream in exon 2'/2 (E2'/2). We describe a novel intrasplicing mechanism by which exon 1A (E1A) splices exclusively to the distal E2'/2 acceptor via two nested splicing reactions regulated by novel properties of exon 1B (E1B). E1B behaves as an exon in the first step, using its consensus 5' donor to splice to the proximal E2'/2 acceptor. A long region of downstream intron is excised, juxtaposing E1B with E2'/2 to generate a new composite acceptor containing the E1B branchpoint/pyrimidine tract and E2 distal 3' AG-dinucleotide. Next, the upstream E1A splices over E1B to this distal acceptor, excising the remaining intron plus E1B and E2' to form mature E1A/E2 product. We mapped branch points for both intrasplicing reactions and demonstrated that mutation of the E1B 5' splice site or branchpoint abrogates intrasplicing. In the 4.1R gene, intrasplicing ultimately determines N-terminal protein structure and function. More generally, intrasplicing represents a new mechanism whereby alternative promoters can be coordinated with downstream alternative splicing

    Local assembly and pre-mRNA splicing analyses by high-throughput sequencing data

    Next generation sequencing (NGS) approaches have become one of the most widely used tools in biotechnology. With high-throughput sequencing, researchers can analyze non-model species at an unprecedentedly high resolution. NGS provides fast, deep and cheap sequencing solutions, and it has been used to answer various biological questions. In this thesis, I have developed a set of tools and used them to study several interesting research topics. First, de novo whole-genome assembly is still a very challenging technical task. For eukaryotic genomes, de novo assembly typically requires computational resources with very large memory and fast processors. Instead of trying to assemble the whole genome as done in previous approaches, I focus on efficiently reconstructing the genomic regions related to the homologous protein or cDNA sequences. I have developed SRAssembler, a local assembly program using the iterative chromosome walking strategy to assemble the loci of interest directly. Second, I used high-throughput RNA sequencing (referred to as RNA-Seq) data to analyze different intron splicing models and their relative frequency of occurrence. The first mechanism I explored is the recursive splicing pattern in large introns. I have implemented a pipeline called RSSFinder, which can search for recursive sites confirmed by RNA-Seq data. My study suggests the prevalence of recursive splicing in different species. These predicted recursive sites can also be used to investigate certain diseases associated with abnormal splicing of transcripts. In addition, I have demonstrated the use of RNA-Seq data to decipher the detailed mechanisms involved in splicing and their relationship with transcription. Here I proposed mathematical models to estimate the distribution of mRNA splicing intermediates. I evaluated my models with simulated data and an Arabidopsis thaliana dataset. My results indicate that co-transcriptional splicing is widespread in Arabidopsis thaliana

    Nested introns in an intron: Evidence of multi-step splicing in a large intron of the human dystrophin pre-mRNA

    The mechanisms by which huge human introns are spliced out precisely are poorly understood. We analyzed large intron 7 (110199 nucleotides) generated from the human dystrophin (DMD) pre-mRNA by RT-PCR. We identified branching between the authentic 5′ splice site and the branch point; however, the sequences far from the branch site were not detectable. This RT-PCR product was resistant to exoribonuclease (RNase R) digestion, suggesting that the detected lariat intron has a closed loop structure but contains gaps in its sequence. Transient and concomitant generation of at least two branched fragments from nested introns within large intron 7 suggests internal nested splicing events before the ultimate splicing at the authentic 5′ and 3′ splice sites. Nested splicing events, which bring the authentic 5′ and 3′ splice sites into close proximity, could be one of the splicing mechanisms for the extremely large introns

    The Peculiarities of Large Intron Splicing in Animals

    In mammals a considerable 92% of genes contain introns, with hundreds of these introns reaching sizes of over 50,000 nucleotides. These “large introns” must be spliced out of the pre-mRNA in a timely fashion, which involves bringing together distant 5′ donor and 3′ acceptor splice sites. In invertebrates, especially Drosophila, it has been shown that larger introns can be spliced efficiently through a process known as recursive splicing: a consecutive splicing from the 5′-end at a series of combined donor-acceptor splice sites called RP-sites. Using a computational analysis of the genomic sequences, we show that vertebrates lack the proper enrichment of RP-sites in their large introns, and, therefore, require some other method to aid splicing. We analyzed over 15,000 non-redundant, large introns from six mammals, 1,600 from chicken and zebrafish, and 560 non-redundant large introns from five invertebrates. Our bioinformatic investigation demonstrates that, unlike the studied invertebrates, the studied vertebrate genomes contain consistently abundant direct and complementary strand interspersed repetitive elements (mainly SINEs and LINEs) that may form stems with each other in large introns. This examination showed that predicted stems are indeed abundant and stable in the large introns of mammals. We hypothesize that such stems with long loops within large introns allow intron splice sites to find each other more quickly by folding the intronic RNA upon itself at smaller intervals and, thus, reducing the distance between donor and acceptor sites

    Lessons from non-canonical splicing

    Recent improvements in experimental and computational techniques that are used to study the transcriptome have enabled an unprecedented view of RNA processing, revealing many previously unknown non-canonical splicing events. This includes cryptic events located far from the currently annotated exons and unconventional splicing mechanisms that have important roles in regulating gene expression. These non-canonical splicing events are a major source of newly emerging transcripts during evolution, especially when they involve sequences derived from transposable elements. They are therefore under precise regulation and quality control, which minimizes their potential to disrupt gene expression. We explain how non-canonical splicing can lead to aberrant transcripts that cause many diseases, and also how it can be exploited for new therapeutic strategies

    A spliceosomal twin intron (stwintron) participates in both exon skipping and evolutionary exon loss

    Spliceosomal twin introns (stwintrons) are introns where any of the three consensus sequences involved in splicing is interrupted by another intron (internal intron). In Aspergillus nidulans, a donor-disrupted stwintron (intron-1) is extant in the transcript encoding a reticulon-like protein. The orthologous transcript of Aspergillus niger can be alternatively spliced; the exon downstream of the stwintron could be skipped by excising a sequence that comprises this stwintron, the neighbouring intron-2, and the exon bounded by these. This process involves the use of alternative 3' splice sites for the internal intron, the resulting alternative intervening sequence being a longer 3'-extended stwintron. In 29 species of Onygenales, a multi-step splicing process occurs in the orthologous transcript, in which a complex intervening sequence including the stwintron and neighbouring intron-2 generates, by three splicing reactions, a "second order intron" which must then be excised with a fourth splicing event. The gene model in two species can be envisaged as one canonical intron (intron-1) evolved from this complex intervening sequence of nested canonical introns found elsewhere in Onygenales. Postulated splicing intermediates were experimentally verified in one or more species. This work illustrates a role of stwintrons in both alternative splicing and the evolution of intron structure

    Transcriptional regulation of the overlapping human genes OPTN and CCDC3

    Human genes OPTN and CCDC3, encoding optineurin and coiled-coil domain-containing protein 3, respectively, are part of the PDB6 locus, a genetic hotspot strongly associated with Paget's disease of bone (PDB), the second most prevalent metabolic bone disease after osteoporosis. OPTN and CCDC3 genes share a head-to-head configuration and partially overlapping sequences, located on opposite strands of this locus. We first defined the molecular structure of the two genes based on the in silico identified mRNAs, which included several alternatively spliced transcripts. The task was performed with the aid of bioinformatics tools and online databases, such as expressed sequence tags (ESTs), AceView gene browser and Splign software; as a result, we obtained a comprehensive map of OPTN and CCDC3, emphasizing the size and position of introns and exons of each transcript. Next, we assessed the activity of CCDC3 and OPTN promoter regions; due to their head-to-head disposition, the two genes share a common regulatory sequence. A putative CCDC3 alternative promoter, located downstream and exclusive for certain CCDC3 transcripts, was identified by analysing the gene structure obtained in silico. The activity of the promoter regions was validated by transiently transfecting pGL3 reporter constructs, containing the promoter sequences under analysis, into HEK 293 cells, followed by luciferase assays. Trans-acting regulatory proteins, e.g. transcription factors (TFs), putatively involved in the regulation of the two genes, were identified in silico by analyzing the promoter sequences through bioinformatics software. The analysis revealed several putative TF binding sites, including for NF-κB, a TF known to play a role in the pathogenesis of PDB. Transient co-transfection of HEK 293 cells with pGL3 reporter constructs and transcription factor NF-κB expression vectors, followed by luciferase assays, was performed in order to confirm the role of NF-κB as a trans-regulator of the target promoters, and to unveil the presence of a possible co-regulation

    A telescope for the RNA universe : novel bioinformatic approaches to analyze RNA sequencing data

    In this thesis I focus on the application of bioinformatics to analyze RNA. The type of experimental data of interest is sequencing data generated with various Next Generation Sequencing techniques: nuclear RNA, cytoplasmic RNA, captured polyadenylated RNA fragments, etc. I highlight the necessity of developing new tools (e.g., to analyze nuclear RNA) and give a showcase example of implementing such a tool and showing its usability on a real biological experiment. The thesis also covers existing tools to perform various types of RNA analysis and shows how these tools can be tweaked and expanded to answer certain biological questions (e.g., studying changes in RNA specific to human aging). I also show how current bioinformatic approaches can be used in a particularly complex study such as investigating cancer (in this thesis, breast cancer) pathogenesis.

    Putting the Pieces Together: Exons and piRNAs: A Dissertation

    Analysis of gene expression has undergone a technological revolution. What was impossible 6 years ago is now routine. High-throughput DNA sequencing machines capable of generating hundreds of millions of reads allow, indeed force, a major revision toward the study of the genome’s functional output—the transcriptome. This thesis examines the history of DNA sequencing, measurement of gene expression by sequencing, isoform complexity driven by alternative splicing and mammalian piRNA precursor biogenesis. Examination of these topics is framed around development of a novel RNA-templated DNA-DNA ligation assay (SeqZip) that allows for efficient analysis of abundant, complex, and functional long RNAs. The discussion focuses on the future of transcriptome analysis, development and applications of SeqZip, and challenges presented to biomedical researchers by extremely large and rich datasets