13 research outputs found

    escheR: unified multi-dimensional visualizations with Gestalt principles

    SUMMARY: The creation of effective visualizations is a fundamental component of data analysis. In biomedical research, new challenges are emerging to visualize multi-dimensional data in a 2D space, but current data visualization tools have limited capabilities. To address this problem, we leverage Gestalt principles to improve the design and interpretability of multi-dimensional data in 2D data visualizations, layering aesthetics to display multiple variables. The proposed visualization can be applied to spatially-resolved transcriptomics data, but also broadly to data visualized in 2D space, such as embedding visualizations. We provide an open source R package escheR, which is built on the state-of-the-art ggplot2 visualization framework and can be seamlessly integrated into genomics toolboxes and workflows. AVAILABILITY AND IMPLEMENTATION: The open source R package escheR is freely available on Bioconductor (https://bioconductor.org/packages/escheR).

    A data-driven single-cell and spatial transcriptomic map of the human prefrontal cortex

    The molecular organization of the human neocortex has historically been studied in the context of its histological layers. However, emerging spatial transcriptomic technologies have enabled unbiased identification of transcriptionally defined spatial domains that move beyond classic cytoarchitecture. We used the Visium spatial gene expression platform to generate a data-driven molecular neuroanatomical atlas across the anterior-posterior axis of the human dorsolateral prefrontal cortex. Integration with paired single-nucleus RNA-sequencing data revealed distinct cell type compositions and cell-cell interactions across spatial domains. Using PsychENCODE and publicly available data, we mapped the enrichment of cell types and genes associated with neuropsychiatric disorders to discrete spatial domains.

    LieberInstitute/deconvo_review-paper: v0_preprint

    Updated release to reflect the addition of the MIT license.

    Somatic evolutionary timings of driver mutations

    Background: A unified analysis of DNA sequences from hundreds of tumors concluded that the driver mutations primarily occur in the earliest stages of cancer formation, with relatively few driver mutation events detected in the late-arising subclones. However, emerging evidence from the sequencing of multiple tumors and tumor regions per individual suggests that late-arising subclones with additional driver mutations are underestimated in single-sample analyses. Methods: To test whether driver mutations generally map to early tumor development, we examined multi-regional tumor sequencing data from 101 individuals reported in 11 published studies. Following previous studies, we annotated mutations as early-arising when all tumors/regions had those mutations (ubiquitous). We then inferred the fraction of mutations occurring early and compared it with late-arising mutations that were found in only single tumors/regions. Results: While a large fraction of driver mutations in tumors occurred relatively early in cancers, later driver mutations occurred at least as frequently as the early drivers in a substantial number of patients. This result was robust to many different approaches to annotate driver mutations. The relative frequency of early and late driver mutations varied among patients of the same cancer type and in different cancer types. We found that previous reports of the preponderance of early driver mutations were primarily informed by analysis of single tumor variant allele profiles, with which it is challenging to clearly distinguish between early and late drivers. Conclusions: The origin and preponderance of new driver mutations are not limited to early stages of tumor evolution, with different tumors and regions showing distinct driver mutations and, consequently, distinct characteristics. Therefore, tumors with extensive intratumor heterogeneity appear to have many newly acquired drivers.

    LieberInstitute/Habenula_Pilot: v0_preprint

    Pre-print initial release to create a Zenodo DOI + badge.

    A new method for inferring timetrees from temporally sampled molecular sequences.

    Pathogen timetrees are phylogenies scaled to time. They reveal the temporal history of a pathogen's spread through populations as captured in the evolutionary history of strains. These timetrees are inferred by using molecular sequences of pathogenic strains sampled at different times. That is, temporally sampled sequences enable the inference of sequence divergence times. Here, we present a new approach (RelTime with Dated Tips [RTDT]) to estimating pathogen timetrees based on a relative rate framework underlying the RelTime approach that is algebraic in nature and distinct from all other current methods. RTDT does not require many of the priors demanded by Bayesian approaches, and it has light computing requirements. In analyses of an extensive collection of computer-simulated datasets, we found the accuracy of RTDT time estimates and the coverage probabilities of their confidence intervals (CIs) to be excellent. In analyses of empirical datasets, RTDT produced dates that were similar to those reported in the literature. In comparative benchmarking with Bayesian and non-Bayesian methods (LSD, TreeTime, and treedater), we found that no method performed the best in every scenario. So, we provide a brief guideline for users to select the most appropriate method in empirical data analysis. RTDT is implemented for use via a graphical user interface and in high-throughput settings in the newest release of cross-platform MEGA X software, freely available from http://www.megasoftware.net.

    Challenges and opportunities to computationally deconvolve heterogeneous tissue with varying cell sizes using single cell RNA-sequencing datasets

    Deconvolution of cell mixtures in "bulk" transcriptomic samples from homogenate human tissue is important for understanding the pathologies of diseases. However, several experimental and computational challenges remain in developing and implementing transcriptomics-based deconvolution approaches, especially those using a single cell/nuclei RNA-seq reference atlas, which are becoming rapidly available across many tissues. Notably, deconvolution algorithms are frequently developed using samples from tissues with similar cell sizes. However, brain tissue or immune cell populations have cell types with substantially different cell sizes, total mRNA expression, and transcriptional activity. When existing deconvolution approaches are applied to these tissues, these systematic differences in cell sizes and transcriptomic activity confound accurate cell proportion estimates and instead may quantify total mRNA content. Furthermore, there is a lack of standard reference atlases and computational approaches to facilitate integrative analyses, including not only bulk and single cell/nuclei RNA-seq data, but also new data modalities from spatial -omic or imaging approaches. New multi-assay datasets need to be collected with orthogonal data types generated from the same tissue block and the same individual, to serve as a "gold standard" for evaluating new and existing deconvolution methods. Below, we discuss these key challenges and how they can be addressed with the acquisition of new datasets and approaches to analysis. Comment: 28 pages; 4 figures