94 research outputs found

    ASTALAVISTA: dynamic and flexible analysis of alternative splicing events in custom gene datasets

    In the process of establishing more and more complete annotations of eukaryotic genomes, a constantly growing number of alternative splicing (AS) events has been reported over the last decade. Consequently, the increasing transcript coverage also revealed the real complexity of some variations in the exon–intron structure between transcript variants and the need for computational tools to address ‘complex’ AS events. ASTALAVISTA (alternative splicing transcriptional landscape visualization tool) employs an intuitive and complete notation system to univocally identify such events. The method extracts AS events dynamically from custom gene annotations, classifies them into groups of common types and visualizes a comprehensive picture of the resulting AS landscape. Thus, ASTALAVISTA can characterize AS for whole transcriptome data from reference annotations (GENCODE, REFSEQ, ENSEMBL) as well as for genes selected by the user according to common functional/structural attributes of interest: http://genome.imim.es/astalavist

    Modeling of autosomal-dominant retinitis pigmentosa in Caenorhabditis elegans uncovers a nexus between global impaired functioning of certain splicing factors and cell type-specific apoptosis

    Retinitis pigmentosa (RP) is a rare genetic disease that causes gradual blindness through retinal degeneration. Intriguingly, seven of the 24 genes identified as responsible for the autosomal-dominant form (adRP) are ubiquitous spliceosome components whose impairment causes disease only in the retina. The fact that these proteins are essential in all organisms hampers genetic, genomic, and physiological studies, but we addressed these difficulties by using RNAi in Caenorhabditis elegans. Our study of worm phenotypes produced by RNAi of splicing-related adRP (s-adRP) genes functionally distinguishes between components of U4 and U5 snRNP complexes, because knockdown of U5 proteins produces a stronger phenotype. RNA-seq analyses of worms in which s-adRP genes were partially inactivated by RNAi revealed mild intron retention in developing animals but not in adults, suggesting a positive correlation between intron retention and transcriptional activity. Interestingly, RNAi of s-adRP genes produces an increase in the expression of atl-1 (homolog of human ATR), which is normally activated in response to replicative stress and certain DNA-damaging agents. The up-regulation of atl-1 correlates with the ectopic expression of the pro-apoptotic gene egl-1 and apoptosis in hypodermal cells, which produce the cuticle, but not in other cell types. Our model in C. elegans resembles s-adRP in two aspects: the phenotype caused by global knockdown of s-adRP genes is cell type-specific and associated with high transcriptional activity. Finally, along with a reduced production of mature transcripts, we propose a model in which the retina-specific cell death in s-adRP patients can be induced through genomic instability.

    Characterization of 3D genomic interactions in fetal pig muscle

    Genome sequence alone is not sufficient to explain the overall coordination of nuclear activity in a particular tissue. The nuclear organisation and genomic long-range intra- and inter-chromosomal interactions play an important role in the regulation of gene expression and the activation of tissue-specific gene networks. Here we present an overview of the pig genome architecture in muscle at two late developmental stages. The muscle maturation process occurs between the 90th day and the end of gestation (114 days), a key period for survival at birth. To characterise this period we profiled chromatin interactions genome-wide with in situ Hi-C (High Throughput Chromosome Conformation Capture) in muscle samples collected at 90 and 110 days of gestation, specific moments where a drastic change in gene expression has been reported. About 200 million read pairs per library were generated (3 replicates per condition). This allowed: (a) the design of an experimental Hi-C protocol optimized for frozen fetal tissues, (b) the first Hi-C contact heatmaps in fetal porcine muscle cells, and (c) the profiling of Topologically Associated Domains (TADs), defined as genomic domains with high levels of chromatin interactions. Using the new assembly version Sus scrofa v11, we could map 82% of the Hi-C reads on the reference genome. After filtering, 49% of valid read pairs were used to infer the genomic interactions in both developmental stages. In addition, ChIP-seq experiments were performed to map the binding of the structural protein CTCF, known to regulate genome structure by promoting interactions between genes and distal enhancers. The Hi-C and ChIP-seq data were analysed in combination with the results of a previous transcriptome analysis, focusing on the hundreds of genes that were reported as differentially expressed during muscle maturation. We will report the observed general differences between both developmental stages in terms of transcription and structure.

    Profiling the landscape of transcription, chromatin accessibility and chromosome conformation of cattle, pig, chicken and goat genomes [FAANG pilot project]

    Functional annotation of livestock genomes is a critical and obvious next step to derive maximum benefit for agriculture, animal science, animal welfare and human health. The aim of the Fr-AgENCODE project is to generate multi-species functional genome annotations by applying high-throughput molecular assays on three target tissues/cells relevant to the study of immune and metabolic traits. An extensive collection of stored samples from other tissues is available for further use (FAANG Biosamples ‘FR-AGENCODE’). From each of two males and two females per species (pig, cattle, goat, chicken), strand-oriented RNA-seq and chromatin accessibility ATAC-seq assays were performed on liver tissue and on two T-cell types (CD3+CD4+ and CD3+CD8+) sorted from blood (mammals) or spleen (chicken). Chromosome Conformation Capture (in situ Hi-C) was also carried out on liver. Sequencing reads from the three assays were processed using standard processing pipelines. While most (50–70%) RNA-seq reads mapped to annotated exons, thousands of novel transcripts and genes were found, including extensions of annotated protein-coding genes and new lncRNAs (see abstract #69857). Consistency of ATAC-seq results was confirmed by the significant proportion of called peaks in promoter regions (36–66%) and by the specific accumulation pattern of peaks around gene starts (TSS) vs. gene ends (TTS). Principal Component Analyses for RNA-seq (based on quantified gene expression) and ATAC-seq (based on quantified chromatin accessibility) highlighted clusters characterised by cell type and sex in all species. From Hi-C data, we generated 40kb-resolution interaction maps, profiled a genome-wide Directionality Index and identified from 4,100 (chicken) to 12,100 (pig) topologically associating domains (TADs). Correlations were reported between RNA-seq and ATAC-seq results (see abstract #71581).
    In summary, we present here an overview of the first multi-species and -tissue annotations of chromatin accessibility and genome architecture related to gene expression for farm animals.

    A General Definition and Nomenclature for Alternative Splicing Events

    Understanding the molecular mechanisms responsible for the regulation of the transcriptome present in eukaryotic cells is one of the most challenging tasks in the postgenomic era. In this regard, alternative splicing (AS) is a key phenomenon contributing to the production of different mature transcripts from the same primary RNA sequence. As a plethora of different transcript forms is available in databases, a first step to uncover the biology that drives AS is to identify the different types of reflected splicing variation. In this work, we present a general definition of the AS event along with a notation system that involves the relative positions of the splice sites. This nomenclature univocally and dynamically assigns a specific “AS code” to every possible pattern of splicing variation. On the basis of this definition and the corresponding codes, we have developed a computational tool (AStalavista) that automatically characterizes the complete landscape of AS events in a given transcript annotation of a genome, thus providing a platform to investigate the transcriptome diversity across genes, chromosomes, and species. Our analysis reveals that a substantial part—in human more than a quarter—of the observed splicing variations are ignored in common classification pipelines. We have used AStalavista to investigate and to compare the AS landscape of different reference annotation sets in human and in other metazoan species and found that proportions of AS events change substantially depending on the annotation protocol, species-specific attributes, and coding constraints acting on the transcripts. The AStalavista system therefore provides a general framework to conduct specific studies investigating the occurrence, impact, and regulation of AS
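The idea of a notation built from the relative positions of splice sites can be illustrated with a minimal sketch. It is not the AStalavista implementation: the function name `as_code`, the input format, the symbols (`^` for a donor, `-` for an acceptor, `0` for a variant that uses none of the variable sites) and the ordering of variants within the code are all assumptions made for illustration, and the sketch considers forward-strand coordinates only.

```python
# Hypothetical sketch of a position-based "AS code"; names and conventions
# are illustrative assumptions, not the AStalavista API.

SYMBOLS = {"donor": "^", "acceptor": "-"}


def as_code(variants):
    """Build a code string for one splicing event.

    variants: one list per transcript variant, each holding the
    (position, kind) splice sites that differ between the variants,
    where kind is "donor" or "acceptor".
    """
    # Rank all variable sites by genomic position, starting at 1.
    all_sites = sorted({site for variant in variants for site in variant})
    rank = {site: i + 1 for i, site in enumerate(all_sites)}

    codes = []
    for variant in variants:
        if not variant:
            # This variant uses none of the variable sites.
            codes.append("0")
        else:
            codes.append(
                "".join(f"{rank[s]}{SYMBOLS[s[1]]}" for s in sorted(variant))
            )
    # Assumed ordering: plain lexicographic sort of the per-variant strings
    # (adequate for events with fewer than ten variable sites).
    return ",".join(sorted(codes))


# Exon skipping: one variant includes an exon (acceptor, then donor).
print(as_code([[], [(100, "acceptor"), (200, "donor")]]))  # 0,1-2^
# Alternative donor: the two variants use different donor sites.
print(as_code([[(100, "donor")], [(150, "donor")]]))       # 1^,2^
# Intron retention: one variant splices out an intron (donor, then acceptor).
print(as_code([[], [(100, "donor"), (200, "acceptor")]]))  # 0,1^2-
```

Because the code depends only on the order and type of the variable sites, not on their absolute coordinates, two events with the same splicing pattern at different loci receive the same string, which is what allows events to be grouped into common types.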

    Evidence for Transcript Networks Composed of Chimeric RNAs in Human Cells

    The classic organization of a gene structure has followed the Jacob and Monod bacterial gene model proposed more than 50 years ago. Since then, empirical determinations of the complexity of the transcriptomes found in yeast to human have blurred the definition and physical boundaries of genes. Using multiple analysis approaches we have characterized individual gene boundaries mapping on human chromosomes 21 and 22. Analyses of the locations of the 5′ and 3′ transcriptional termini of 492 protein coding genes revealed that for 85% of these genes the boundaries extend beyond the current annotated termini, most often connecting with exons of transcripts from other well annotated genes. The biological and evolutionary importance of these chimeric transcripts is underscored by (1) the non-random interconnections of genes involved, (2) the greater phylogenetic depth of the genes involved in many chimeric interactions, (3) the coordination of the expression of connected genes and (4) the close in vivo and three-dimensional proximity of the genomic regions being transcribed and contributing to parts of the chimeric RNAs. The non-random nature of the connection of the genes involved suggests that chimeric transcripts should not be studied in isolation, but together, as an RNA network.

    Integrating alternative splicing detection into gene prediction

    Alternative splicing (AS) is now considered a major actor in transcriptome/proteome diversity and cannot be neglected in the annotation process of a new genome. Despite considerable progress in terms of accuracy in computational gene prediction, the ability to reliably predict AS variants when there is local experimental evidence of them remains an open challenge for gene finders. We have used a new integrative approach that allows AS detection to be incorporated into ab initio gene prediction. This method relies on the analysis of genomically aligned transcript sequences (ESTs and/or cDNAs), and has been implemented in the dynamic programming algorithm of the graph-based gene finder EuGENE. Given a genomic sequence and a set of aligned transcripts, this new version identifies the set of transcripts carrying evidence of alternative splicing events, and provides, in addition to the classical optimal gene prediction, alternative optimal predictions (among those which are consistent with the AS events detected). This allows for multiple annotations of a single gene such that each predicted variant is supported by transcript evidence (but not necessarily with full-length coverage). This automatic combination of experimental data analysis and ab initio gene finding offers an ideal integration of alternatively spliced gene prediction inside a single annotation pipeline.

    Localisation de gènes et variants par intégration d'informations (Localisation of genes and variants through information integration)

    As more genomes are sequenced, exploiting the resulting data is a major challenge for modern biology and bioinformatics, and a decisive step of the annotation process is the exact localisation of protein-coding genes in genomic DNA sequences. This thesis is based on a gene-finding software (EuGène) that integrates several sources of information in a non-probabilistic model, which represents the set of potential gene structures of a DNA sequence as a graph (DAG). A method for estimating the model's weighting parameters, based on a stochastic optimization process, has been designed to allow the incorporation of new types of data, such as inter- and intra-genomic homology information. The problem of predicting several alternatively spliced variants for one gene has also been addressed by including a transcript data analysis in the global gene-finding process, resulting in a new approach that integrates extrinsic and intrinsic methods.