35,744 research outputs found

Mapping Topographic Structure in White Matter Pathways with Level Set Trees

Author: Kent Brian P.
Rinaldo Alessandro
Verstynen Timothy
Yeh Fang-Cheng
Publication venue: 'Public Library of Science (PLoS)'
Publication date: 20/11/2013

Fiber tractography on diffusion imaging data offers rich potential for describing white matter pathways in the human brain, but characterizing the spatial organization in these large and complex data sets remains a challenge. We show that level set trees, which provide a concise representation of the hierarchical mode structure of probability density functions, offer a statistically principled framework for visualizing and analyzing topography in fiber streamlines. Using diffusion spectrum imaging data collected on neurologically healthy controls (N=30), we mapped white matter pathways from the cortex into the striatum using a deterministic tractography algorithm that estimates fiber bundles as dimensionless streamlines. Level set trees were used for interactive exploration of patterns in the endpoint distributions of the mapped fiber tracks and for an efficient segmentation of the tracks that has empirical accuracy comparable to standard nonparametric clustering methods. We show that level set trees can also be generalized to model pseudo-density functions in order to analyze a broader array of data types, including entire fiber streamlines. Finally, resampling methods show the reliability of the level set tree as a descriptive measure of topographic structure, illustrating its potential as a statistical descriptor in brain imaging analysis. These results highlight the broad applicability of level set trees for visualizing and analyzing high-dimensional data like fiber tractography output.
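
The idea of a hierarchical mode structure can be illustrated with a toy sketch. The code below is not the authors' method: it assumes a density already evaluated on a one-dimensional grid (the values are made up for illustration), whereas the paper works with estimated densities over high-dimensional tractography data. The function name `upper_level_set_components` is hypothetical. It finds the connected components of the upper level set {x : f(x) >= level}; as the level rises, one component splits into two, which is the branching a level set tree records.

```python
def upper_level_set_components(density, level):
    """Return (start, end) index runs where density >= level.

    On a 1D grid, each maximal run of consecutive grid points at or above
    the level is one connected component of the upper level set.
    """
    components = []
    start = None
    for i, value in enumerate(density):
        if value >= level:
            if start is None:
                start = i  # a new component begins here
        elif start is not None:
            components.append((start, i - 1))  # the component just ended
            start = None
    if start is not None:
        components.append((start, len(density) - 1))
    return components


# Made-up bimodal density values on a grid (illustrative only).
density = [0.1, 0.4, 0.9, 0.4, 0.2, 0.5, 0.7, 0.3, 0.1]

for level in (0.15, 0.35, 0.6):
    print(level, upper_level_set_components(density, level))
```

At the lowest level the set is a single component; at the two higher levels it has two components, one around each mode.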

arXiv.org e-Print Archive

Directory of Open Access Journals

PubMed Central

Superheat: An R package for creating beautiful and extendable heatmaps for visualizing complex data

Author: Barter Rebecca L
Yu Bin
Publication date: 26/01/2017

The technological advancements of the modern era have enabled the collection of huge amounts of data in science and beyond. Extracting useful information from such massive datasets is an ongoing challenge as traditional data visualization tools typically do not scale well in high-dimensional settings. An existing visualization technique that is particularly well suited to visualizing large datasets is the heatmap. Although heatmaps are extremely popular in fields such as bioinformatics for visualizing large gene expression datasets, they remain a severely underutilized visualization tool in modern data analysis. In this paper we introduce superheat, a new R package that provides an extremely flexible and customizable platform for visualizing large datasets using extendable heatmaps. Superheat enhances the traditional heatmap by providing a platform to visualize a wide range of data types simultaneously, adding to the heatmap a response variable as a scatterplot, model results as boxplots, correlation information as barplots, text information, and more. Superheat allows the user to explore their data to greater depths and to take advantage of the heterogeneity present in the data to inform analysis decisions. The goal of this paper is two-fold: (1) to demonstrate the potential of the heatmap as a default visualization method for a wide range of data types using reproducible examples, and (2) to highlight the customizability and ease of implementation of the superheat package in R for creating beautiful and extendable heatmaps. The capabilities and fundamental applicability of the superheat package will be explored via three case studies, each based on publicly available data sources and accompanied by a file outlining the step-by-step analytic pipeline (with code). Comment: 26 pages, 10 figures

arXiv.org e-Print Archive

eScholarship - University of California

A Single Visualization Technique for Displaying Multiple Metabolite-Phenotype Associations.

Author: Antonelli Joseph
Cheng Susan
Claggett Brian L
Demler Olga
Demosthenes Emmanuella J
Henglin Mir
Jain Mohit
Lagerborg Kim A
Larson Martin G
Niiranen Teemu
Vasan Ramachandran S
von Jeinsen Beatrice
Watrous Jeramie D
Publication venue: eScholarship, University of California
Publication date: 01/07/2019

To assist with management and interpretation of human metabolomics data, which are rapidly increasing in quantity and complexity, we need better visualization tools. Using a dataset of several hundred metabolite measures profiled in a cohort of ~1500 individuals sampled from a population-based community study, we performed association analyses with eight demographic and clinical traits and outcomes. We compared frequently used existing graphical approaches with a novel 'rain plot' approach to display the results of these analyses. The 'rain plot' combines features of a raindrop plot and a conventional heatmap to convey results of multiple association analyses. A rain plot can simultaneously indicate effect size, directionality, and statistical significance of associations between metabolites and several traits. This approach enables visual comparison across all metabolites examined with a given trait. The rain plot extends prior approaches and offers complementary information for data interpretation. Additional work is needed in data visualizations for metabolomics to assist investigators in understanding and conveying large-scale analysis results effectively, feasibly, and practically.

eScholarship - University of California

Possibilities and Limits in Visualizing Large Amounts of Multidimensional Data

Author: Grinstein G.
Keim Daniel A.
Kriegel Hans-Peter
Levkowitz H.
Publication venue: Ludwig-Maximilians-Universität München
Publication date: 01/01/1994

Open Access LMU

Modeling and visualizing uncertainty in gene expression clusters using Dirichlet process mixtures

Author: De la Cruz Bernard J.
Ghahramani Zoubin
Rasmussen Carl Edward
Wild David L.
Publication venue: 'Institute of Electrical and Electronics Engineers (IEEE)'
Publication date: 01/01/2009

Although the use of clustering methods has rapidly become one of the standard computational approaches in the literature of microarray gene expression data, little attention has been paid to uncertainty in the results obtained. Dirichlet process mixture (DPM) models provide a nonparametric Bayesian alternative to the bootstrap approach to modeling uncertainty in gene expression clustering. Most previously published applications of Bayesian model-based clustering methods have been to short time series data. In this paper, we present a case study of the application of nonparametric Bayesian clustering methods to the clustering of high-dimensional non-time-series gene expression data using full Gaussian covariances. We use the probability that two genes belong to the same cluster in a DPM model as a measure of the similarity of these gene expression profiles. Conversely, this probability can be used to define a dissimilarity measure, which, for the purposes of visualization, can be input to one of the standard linkage algorithms used for hierarchical clustering. Biologically plausible results that extend previously published cluster analyses of these data are obtained from the Rosetta compendium of expression profiles.
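
The step from co-clustering probability to dissimilarity can be sketched as follows. This is a minimal illustration, not the authors' implementation: it assumes that cluster assignments have already been drawn from a fitted DPM model (the four draws below are invented), estimates the co-clustering probability as the fraction of draws in which two genes share a label, and takes one minus that fraction as the dissimilarity. The function name `coclustering_dissimilarity` is hypothetical. The resulting matrix is the kind of input a standard linkage algorithm would accept.

```python
from itertools import combinations


def coclustering_dissimilarity(samples):
    """Estimate pairwise dissimilarity from sampled cluster assignments.

    samples: list of label lists, one list per posterior draw, each holding
    one cluster label per gene. Returns an n x n matrix whose (i, j) entry
    is 1 minus the fraction of draws in which genes i and j share a label.
    """
    n_items = len(samples[0])
    n_draws = len(samples)
    # counts[i][j] = number of draws in which items i and j co-cluster
    counts = [[0] * n_items for _ in range(n_items)]
    for labels in samples:
        for i in range(n_items):
            counts[i][i] += 1  # an item always co-clusters with itself
        for i, j in combinations(range(n_items), 2):
            if labels[i] == labels[j]:
                counts[i][j] += 1
                counts[j][i] += 1
    return [[1.0 - c / n_draws for c in row] for row in counts]


# Invented assignment draws for four genes (illustrative only).
draws = [
    [0, 0, 1, 1],
    [0, 0, 0, 1],
    [0, 1, 1, 1],
    [0, 0, 1, 1],
]
dissim = coclustering_dissimilarity(draws)
for row in dissim:
    print(row)
```

Genes that share a cluster in most draws end up close together (small dissimilarity), while genes that never co-cluster get the maximum value of 1.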

Crossref

Warwick Research Archives Portal Repository

MPG.PuRe