
    Landscape of transcription in human cells

    Eukaryotic cells make many types of primary and processed RNAs that are found either in specific subcellular compartments or throughout the cells. A complete catalogue of these RNAs is not yet available and their characteristic subcellular localizations are also poorly understood. Because RNA represents the direct output of the genetic information encoded by genomes and a significant proportion of a cell’s regulatory capabilities are focused on its synthesis, processing, transport, modification and translation, the generation of such a catalogue is crucial for understanding genome function. Here we report evidence that three-quarters of the human genome is capable of being transcribed, as well as observations about the range and levels of expression, localization, processing fates, regulatory regions and modifications of almost all currently annotated and thousands of previously unannotated RNAs. These observations, taken together, prompt a redefinition of the concept of a gene.

    Characterization of 3D genomic interactions in fetal pig muscle

    Genome sequence alone is not sufficient to explain the overall coordination of nuclear activity in a particular tissue. The nuclear organisation and genomic long-range intra- and inter-chromosomal interactions play an important role in the regulation of gene expression and the activation of tissue-specific gene networks. Here we present an overview of the pig genome architecture in muscle at two late developmental stages. The muscle maturation process occurs between the 90th day and the end of gestation (114 days), a key period for survival at birth. To characterise this period we profiled chromatin interactions genome-wide with in situ Hi-C (High Throughput Chromosome Conformation Capture) in muscle samples collected at 90 and 110 days of gestation, two time points between which a drastic change in gene expression has been reported. About 200 million read pairs per library were generated (3 replicates per condition). This allowed (a) the design of an experimental Hi-C protocol optimized for frozen fetal tissues, (b) the generation of the first Hi-C contact heatmaps in fetal porcine muscle cells, and (c) the profiling of Topologically Associated Domains (TADs), defined as genomic domains with high levels of chromatin interactions. Using the new assembly version Sus scrofa v11, we could map 82% of the Hi-C reads to the reference genome. After filtering, 49% of valid read pairs were used to infer the genomic interactions in both developmental stages. In addition, ChIP-seq experiments were performed to map the binding of the structural protein CTCF, known to regulate genome structure by promoting interactions between genes and distal enhancers. The Hi-C and ChIP-seq data were analysed in combination with the results of a previous transcriptome analysis, focusing on the hundreds of genes that were reported as differentially expressed during muscle maturation. We will report the observed general differences between both developmental stages in terms of transcription and structure.

    Profiling the landscape of transcription, chromatin accessibility and chromosome conformation of cattle, pig, chicken and goat genomes [FAANG pilot project]

    Functional annotation of livestock genomes is a critical and obvious next step to derive maximum benefit for agriculture, animal science, animal welfare and human health. The aim of the Fr-AgENCODE project is to generate multi-species functional genome annotations by applying high-throughput molecular assays on three target tissues/cells relevant to the study of immune and metabolic traits. An extensive collection of stored samples from other tissues is available for further use (FAANG Biosamples ‘FR-AGENCODE’). From each of two males and two females per species (pig, cattle, goat, chicken), strand-oriented RNA-seq and chromatin accessibility ATAC-seq assays were performed on liver tissue and on two T-cell types (CD3+CD4+ and CD3+CD8+) sorted from blood (mammals) or spleen (chicken). Chromosome Conformation Capture (in situ Hi-C) was also carried out on liver. Sequencing reads from the 3 assays were processed using standard processing pipelines. While most (50–70%) RNA-seq reads mapped to annotated exons, thousands of novel transcripts and genes were found, including extensions of annotated protein-coding genes and new lncRNAs (see abstract #69857). Consistency of ATAC-seq results was confirmed by the significant proportion of called peaks in promoter regions (36–66%) and by the specific accumulation pattern of peaks around gene starts (TSS) versus gene ends (TTS). Principal Component Analyses for RNA-seq (based on quantified gene expression) and ATAC-seq (based on quantified chromatin accessibility) highlighted clusters characterised by cell type and sex in all species. From Hi-C data, we generated 40 kb-resolution interaction maps, profiled a genome-wide Directionality Index and identified from 4,100 (chicken) to 12,100 (pig) topologically-associating domains (TADs). Correlations were reported between RNA-seq and ATAC-seq results (see abstract #71581).
    In summary, we present here an overview of the first multi-species and -tissue annotations of chromatin accessibility and genome architecture related to gene expression for farm animals.

    Transcriptome characterization by RNA sequencing identifies a major molecular and clinical subdivision in chronic lymphocytic leukemia

    Chronic lymphocytic leukemia (CLL) has heterogeneous clinical and biological behavior. Whole-genome and -exome sequencing has contributed to the characterization of the mutational spectrum of the disease, but the underlying transcriptional profile is still poorly understood. We have performed deep RNA sequencing in different subpopulations of normal B-lymphocytes and CLL cells from a cohort of 98 patients, and characterized the CLL transcriptional landscape with unprecedented resolution. We detected thousands of transcriptional elements differentially expressed between the CLL and normal B cells, including protein-coding genes, noncoding RNAs, and pseudogenes. Transposable elements are globally derepressed in CLL cells. In addition, two thousand genes, most of which are not differentially expressed, exhibit CLL-specific splicing patterns. Genes involved in metabolic pathways showed higher expression in CLL, while genes related to spliceosome, proteasome, and ribosome were among the most down-regulated in CLL. Clustering of the CLL samples according to RNA-seq derived gene expression levels unveiled two robust molecular subgroups, C1 and C2. C1/C2 subgroups and the mutational status of the immunoglobulin heavy variable (IGHV) region were the only independent variables in predicting time to treatment in a multivariate analysis with main clinico-biological features. This subdivision was validated in an independent cohort of patients monitored through DNA microarrays. Further analysis shows that B-cell receptor (BCR) activation in the microenvironment of the lymph node may be at the origin of the C1/C2 differences.

    Evidence for Transcript Networks Composed of Chimeric RNAs in Human Cells

    The classic organization of a gene structure has followed the Jacob and Monod bacterial gene model proposed more than 50 years ago. Since then, empirical determinations of the complexity of the transcriptomes found from yeast to human have blurred the definition and physical boundaries of genes. Using multiple analysis approaches we have characterized individual gene boundaries mapping on human chromosomes 21 and 22. Analyses of the locations of the 5′ and 3′ transcriptional termini of 492 protein coding genes revealed that for 85% of these genes the boundaries extend beyond the current annotated termini, most often connecting with exons of transcripts from other well-annotated genes. The biological and evolutionary importance of these chimeric transcripts is underscored by (1) the non-random interconnections of genes involved, (2) the greater phylogenetic depth of the genes involved in many chimeric interactions, (3) the coordination of the expression of connected genes and (4) the close in vivo and three-dimensional proximity of the genomic regions being transcribed and contributing to parts of the chimeric RNAs. The non-random nature of the connection of the genes involved suggests that chimeric transcripts should not be studied in isolation, but together, as an RNA network.

    Enhanced Transcriptome Maps from Multiple Mouse Tissues Reveal Evolutionary Constraint in Gene Expression for Thousands of Genes

    We characterized by RNA-seq the transcriptional profiles of a large and heterogeneous collection of mouse tissues, augmenting the mouse transcriptome with thousands of novel transcript candidates. Comparison with transcriptome profiles obtained in human cell lines reveals substantial conservation of transcriptional programs, and uncovers a distinct class of genes whose levels of expression across cell types and species have been constrained early in vertebrate evolution. This core set of genes captures a substantial and constant fraction of the transcriptional output of mammalian cells, and participates in basic functional and structural housekeeping processes common to all cell types. Perturbation of these constrained genes is associated with significant phenotypes including embryonic lethality and cancer. Evolutionary constraint in gene expression levels is not reflected in the conservation of the genomic sequences, but is associated with strong and conserved epigenetic marking, as well as with a characteristic post-transcriptional regulatory program in which sub-cellular localization and alternative splicing play comparatively large roles.

    Landscape of transcription in human cells

    Eukaryotic cells make many types of primary and processed RNAs that are found either in specific sub-cellular compartments or throughout the cells. A complete catalogue of these RNAs is not yet available and their characteristic sub-cellular localizations are also poorly understood. Since RNA represents the direct output of the genetic information encoded by genomes and a significant proportion of a cell’s regulatory capabilities are focused on its synthesis, processing, transport, modifications and translation, the generation of such a catalogue is crucial for understanding genome function. Here we report evidence that three-quarters of the human genome is capable of being transcribed, as well as observations about the range and levels of expression, localization, processing fates, regulatory regions and modifications of almost all currently annotated and thousands of previously unannotated RNAs. These observations, taken together, prompt a redefinition of the concept of a gene.

    The genomic landscape of balanced cytogenetic abnormalities associated with human congenital anomalies

    Despite the clinical significance of balanced chromosomal abnormalities (BCAs), their characterization has largely been restricted to cytogenetic resolution. We explored the landscape of BCAs at nucleotide resolution in 273 subjects with a spectrum of congenital anomalies. Whole-genome sequencing revised 93% of karyotypes and demonstrated complexity that was cryptic to karyotyping in 21% of BCAs, highlighting the limitations of conventional cytogenetic approaches. At least 33.9% of BCAs resulted in gene disruption that likely contributed to the developmental phenotype, 5.2% were associated with pathogenic genomic imbalances, and 7.3% disrupted topologically associated domains (TADs) encompassing known syndromic loci. Remarkably, BCA breakpoints in eight subjects altered a single TAD encompassing MEF2C, a known driver of 5q14.3 microdeletion syndrome, resulting in decreased MEF2C expression. We propose that sequence-level resolution dramatically improves prediction of clinical outcomes for balanced rearrangements and provides insight into new pathogenic mechanisms, such as altered regulation due to changes in chromosome topology.

    Comparative analysis of the transcriptome across distant species

    The transcriptome is the readout of the genome. Identifying common features in it across distant species can reveal fundamental principles. To this end, the ENCODE and modENCODE consortia have generated large amounts of matched RNA-sequencing data for human, worm and fly. Uniform processing and comprehensive annotation of these data allow comparison across metazoan phyla, extending beyond earlier within-phylum transcriptome comparisons and revealing ancient, conserved features. Specifically, we discover co-expression modules shared across animals, many of which are enriched in developmental genes. Moreover, we use expression patterns to align the stages in worm and fly development and find a novel pairing between worm embryo and fly pupae, in addition to the embryo-to-embryo and larvae-to-larvae pairings. Furthermore, we find that the extent of non-canonical, non-coding transcription is similar in each organism, per base pair. Finally, we find in all three organisms that the gene-expression levels, both coding and non-coding, can be quantitatively predicted from chromatin features at the promoter using a 'universal model' based on a single set of organism-independent parameters.