71 research outputs found

    Comparative Analysis of Alternative Splicing in Homo sapiens, Mus musculus and Rattus norvegicus Transcriptomes

    Analyzing transcriptomes in the context of all available genome and transcript sequence data has the potential to reveal biologically meaningful insight into functional properties of genes and complexity of genomes. Alternative splicing is one of the major mechanisms contributing to the complexity of genomes. This important cellular process generates several different messenger RNA transcripts from a single gene, expression of which produces structurally and functionally different proteins. Regulation of alternative splicing could be tissue-specific, developmental stage and/or physiological condition dependent. Comprehensive analysis of alternative splicing is essential to understand fully the capacity of genomes and thus proteomes. Comparative analyses of alternative splicing across species can provide significant biological insight not only to evolution of alternative splicing, but also to its regulation and functional significance. For comprehensive analyses of alternatively spliced genes, we developed and utilized databases of alternatively spliced transcripts in transcriptomes of Homo sapiens, Mus musculus and Rattus norvegicus. Our databases allow in-depth analyses of alternative and constitutive exons within alternatively spliced genes. Interactive web implementation of our databases brings to end-users the ability to instantly identify orthologous human-mouse, human-rat and mouse-rat gene-pairs with their corresponding exons. A novel visualization method we introduce provides easy access to conserved alternative splicing data and a tool to explore the evolutionary significance, regulation and function of this important biological process. Our statistical analysis showed high prevalence of variant loci in human, mouse and rat transcriptomes. 81% of human loci are variant, as are 74% of mouse loci and 58% of rat loci, revealing widespread presence of alternative splicing in all three transcriptomes.
    We further showed that alternative splicing events are mainly due to the presence or absence of cassette exons. More than 60% of alternative exons are cassette exons in all three transcriptomes. Specifically, to analyze the impact of alternative splicing on transcription factor protein structure, we studied the effect of cassette exons on protein domain architectures of mouse transcription factors. We showed that alternative splicing preferentially adds or deletes domains important in DNA-binding function of the transcription factors. 75% of the domains affected by cassette exons are DNA-binding domains. Further, we showed that there is a single transcription factor isoform within a given tissue and isoforms differ across different tissues, indicating tissue-specificity of alternatively spliced transcription factors. These results indicate that alternative splicing might contribute to differential gene expression via creation of tissue-specific transcription factor isoforms. In addition, we showed that in the human transcriptome, there is a high prevalence of transcript sequence data from cancer tissues. More than 80% of human variant loci contain transcripts from cancer tissues. We showed that cancer transcripts introduce variation beyond normal alternative splicing via cancer-specific cassette exons. In the majority of tissues, more than 20% of the cassette exons are from cancer transcripts only. Our results quantitatively validate presence of aberrant alternative splicing in cancer sequence data. Lastly, through a comparative analysis of alternatively spliced genes in transcriptomes of Drosophila melanogaster, Caenorhabditis elegans, Arabidopsis thaliana and Plasmodium falciparum to those in human, mouse and rat transcriptomes, we showed that there is more alternative splicing in genomes of more complex organisms and that there is an elevation of alternative splicing in mammalian genomes.

    Quantitative profiling of the UGT transcriptome in human drug metabolizing tissues

    Alternative splicing as a means to control gene expression and diversify function is suspected to considerably influence drug response and clearance. We report the quantitative expression profiles of the human UGT genes, including alternatively spliced variants not previously annotated, established by deep RNA-sequencing in tissues of pharmacological importance. We reveal a comprehensive quantification of the alternative UGT transcriptome that differs across tissues and among individuals. Alternative transcripts that comprise novel in-frame sequences, associated or not with truncations of the 5’ and/or 3’ termini, significantly contribute to the total expression levels of each UGT1 and UGT2 gene, averaging 21% in normal tissues, with expression of UGT2 variants surpassing those of UGT1. Quantitative data expose preferential tissue expression patterns and remodelling in favour of alternative variants upon tumorigenesis. These complex alternative splicing programs have the strong potential to contribute to interindividual variability in drug metabolism in addition to diversifying the UGT proteome.

    Bayesian nonparametric discovery of isoforms and individual specific quantification

    Most human protein-coding genes can be transcribed into multiple distinct mRNA isoforms. These alternative splicing patterns encourage molecular diversity, and dysregulation of isoform expression plays an important role in disease etiology. However, isoforms are difficult to characterize from short-read RNA-seq data because they share identical subsequences and occur in different frequencies across tissues and samples. Here, we develop BIISQ, a Bayesian nonparametric model for isoform discovery and individual specific quantification from short-read RNA-seq data. BIISQ does not require isoform reference sequences but instead estimates an isoform catalog shared across samples. We use stochastic variational inference for efficient posterior estimates and demonstrate superior precision and recall for simulations compared to state-of-the-art isoform reconstruction methods. BIISQ shows the most gains for low abundance isoforms, with 36% more isoforms correctly inferred at low coverage versus a multi-sample method and 170% more versus single-sample methods. We estimate isoforms in the GEUVADIS RNA-seq data and validate inferred isoforms by associating genetic variants with isoform ratios

    Computational analysis of alternative splicing in human and mice

    In the first part, transcript splice sites were examined with the aim of distinguishing alternative from reference splice sites. The results show that the two classes of splice sites can be separated by a splice site score and by an increased occurrence of splicing factor binding motifs in the vicinity of the splice sites. In addition, a positive correlation between the frequency with which particular splice sites are used and the splice site score was demonstrated in both comparison classes. This dependence implies that the accuracy of the annotation of alternative splice variants increases with the number of observed transcripts. In the second part, the splice signal motif GYNNGY was examined, which accounts for more than 40% of all overlapping donor splice signals. In silico analyses and experimental validation confirmed the plausibility of this subtle splicing pattern. Comparison with other human splice variants and with tandem donors in mouse transcripts also revealed pronounced differences in splice site score, conservation, and the occurrence of splicing factor binding motifs. The reading-frame shift caused by alternative splicing at GYNNGY donors suggests a complex role in the RNA maturation process. In the third part, reactions of the spliceosomal macrocomplex were compiled from published experimental data and represented in a qualitative model using Petri net theory. Assuming a steady-state system, minimal semipositive T-invariants were computed and used to validate the model. Because the reaction network is completely covered by T-invariants, further structural features such as maximal common transition sets and T-clusters could be computed, which reflect important stages of spliceosome assembly.

    Recurrent structural variation, clustered sites of selection, and disease risk for the complement factor H (CFH) gene family

    Data deposition: The data reported in this paper have been deposited as a National Center for Biotechnology Information BioProject (accession no. PRJNA401648). Author contributions: S.C. and E.E.E. designed research; S.C., C.B., L.H., K.P., K.M.M., M.S., A.E.W., V.D., T.A.G.-L., and R.K.W. performed research; S.C., J.H., C.B., L.H., K.P., K.M.M., M.S., A.E.W., V.D., F.G., A.J.R., R.H.G., T.A.G.-L., R.K.W., B.H.F.W., P.N.B., R.A., and E.E.E. contributed new reagents/analytic tools; S.C., B.J.N., J.H., and E.E.E. analyzed data; and S.C., B.J.N., and E.E.E. wrote the paper. Peer reviewed.

    Characterization of the ubiquitin-like protein Hub1 and its role in pre-mRNA splicing in human cells
