57 research outputs found

    "gtrellis": an R/Bioconductor package for making genome-level Trellis graphics

    BACKGROUND: Trellis graphics are a visualization method that splits data by one or more categorical variables and displays subsets of the data in a grid of panels. Trellis graphics are broadly used in genomic data analysis to compare statistics over different categories in parallel and reveal multivariate relationships. However, current software packages to produce Trellis graphics have not been designed with genomic data in mind and lack some functionality that is required for effective visualization of genomic data. RESULTS: Here we introduce the gtrellis package which provides an efficient and extensible way to visualize genomic data in a Trellis layout. gtrellis provides highly flexible Trellis layouts which allow efficient arrangement of genomic categories on the plot. It supports multiple-track visualization, which makes it straightforward to visualize several properties of genomic data in parallel to explain complex relationships. In addition, gtrellis provides an extensible framework that allows adding user-defined graphics. CONCLUSIONS: The gtrellis package provides an easy and effective way to visualize genomic data and reveal high dimensional relationships on a genome-wide scale. gtrellis can be flexibly extended and thus can also serve as a base package for highly specific purposes. gtrellis makes it easy to produce novel visualizations, which can lead to the discovery of previously unrecognized patterns in genomic data. ELECTRONIC SUPPLEMENTARY MATERIAL: The online version of this article (doi:10.1186/s12859-016-1051-4) contains supplementary material, which is available to authorized users

    EnrichedHeatmap: an R/Bioconductor package for comprehensive visualization of genomic signal associations

    Background: High-throughput sequencing data are dramatically increasing in volume. Thus, there is an urgent need for efficient tools to perform fast and integrative analysis of multiple data types. An enriched heatmap is a specific form of heatmap that visualizes how genomic signals are enriched over specific target regions. It is commonly used and is efficient at revealing enrichment patterns, especially for high dimensional genomic and epigenomic datasets. Results: We present a new R package named EnrichedHeatmap that efficiently visualizes genomic signal enrichment. It provides advanced solutions for normalizing genomic signals within target regions as well as offering highly customizable visualizations. The major advantage of EnrichedHeatmap is the ability to conveniently generate parallel heatmaps as well as complex annotations, which makes it easy to integrate and visualize comprehensive overviews of the patterns and associations within and between complex datasets. Conclusions: EnrichedHeatmap facilitates comprehensive understanding of high dimensional genomic and epigenomic data. The power of EnrichedHeatmap is demonstrated by visualization of the complex associations between DNA methylation, gene expression and various histone modifications.

    Evolutionary Trajectories of IDHWT Glioblastomas Reveal a Common Path of Early Tumorigenesis Instigated Years ahead of Initial Diagnosis

    We studied how intratumoral genetic heterogeneity shapes tumor growth and therapy response for isocitrate dehydrogenase (IDH)-wild-type glioblastoma, a rapidly regrowing tumor. We inferred the evolutionary trajectories of matched pairs of primary and relapsed tumors based on deep whole-genome-sequencing data. This analysis suggests both a distant origin of de novo glioblastoma, up to 7 years before diagnosis, and a common path of early tumorigenesis, with one or more of chromosome 7 gain, 9p loss, or 10 loss, at tumor initiation. TERT promoter mutations often occurred later as a prerequisite for rapid growth. In contrast to this common early path, relapsed tumors acquired no stereotypical pattern of mutations and typically regrew from oligoclonal origins, suggesting sparse selective pressure by therapeutic measures

    Impact of cancer mutational signatures on transcription factor motifs in the human genome

    Background: Somatic mutations in cancer genomes occur through a variety of molecular mechanisms, which contribute to different mutational patterns. To summarize these, mutational signatures have been defined using a large number of cancer genomes and related to distinct mutagenic processes. Each cancer genome can be compared to this reference dataset and its exposure to each signature determined. Given the very different mutational patterns of these signatures, we anticipate that they will have distinct impacts on genomic elements, in particular motifs for transcription factor binding sites (TFBS). Methods: We used the 30 mutational signatures from the COSMIC database and derived a theoretical framework to infer the impact of these signatures on the alteration of transcription factor (TF) binding motifs from the JASPAR database. To do so, we translated the trinucleotide mutation frequencies of the signatures into alteration frequencies of specific TF binding motifs, leading either to creation or disruption of these motifs. Results: Motif families show different susceptibility to alterations induced by the mutational signatures. For certain motifs, a high correlation is observed between the TFBS motif creation and disruption events, related to the information content of the motif. Moreover, we observe striking patterns regarding, for example, the Ets-motif family, for which a high impact of UV-induced signatures is observed. Our model also confirms the susceptibility of specific transcription factor motifs to deamination processes. Conclusion: Our results show that the mutational signatures have different impacts on the binding motifs of transcription factors and that for certain high complexity motifs there is a strong correlation between creation and disruption, related to the information content of the motif. This study represents a background estimation of the alterations due purely to mutational signatures in the absence of additional contributions, e.g. from evolutionary processes.

    Aromatic and proteomic analyses corroborate the distinction between Mediterranean landraces and modern varieties of durum wheat

    In this paper, volatile organic compounds (VOCs) from durum wheat cultivars and landraces were analyzed using PTR-TOF-MS. The aim was to characterize the VOC profile of the wholemeal flour and of the kernel to find out whether any VOCs were specific to varieties and sample matrices. The VOC data are accompanied by SDS-PAGE analyses of the storage proteins (gliadins and glutenins). Statistical analysis was carried out both on the signals obtained by MS and on the protein profiles. The difference between the VOC profiles of two cultivars, or of two preparations (matrices) of the same sample, in this case kernel vs wholemeal flour, can be very subtle; the high resolution of PTR-TOF-MS (down to levels as low as pptv) made it possible to recognize these differences. The effects of grinding on the VOC profiles were analyzed using SIMPER and Tanglegram statistical methods. Our results show that it is possible to describe samples using VOC profiles and protein data.

    ACEseq – allele specific copy number estimation from whole genome sequencing

    ACEseq is a computational tool for allele-specific copy number estimation in tumor genomes based on whole genome sequencing. In contrast to other tools, it features GC-bias correction, unique replication timing-bias correction and integration of structural variant (SV) breakpoints for improved genome segmentation. ACEseq clearly outperforms widely used state-of-the-art methods, provides a fully automated estimation of tumor cell content and ploidy, and additionally computes homologous recombination deficiency scores.