Data analytics enhanced data visualization and interrogation with parallel coordinates plots

Authors: Akbar MS; Gabrys B
Publication venue: 'Institute of Electrical and Electronics Engineers (IEEE)'
Publication date: 08/02/2019

© 2018 IEEE. Parallel coordinates plots (PCPs) suffer from the curse of dimensionality when used with larger multidimensional datasets. The curse of dimensionality results in clutter, which hides important visual data trends among coordinates. A number of solutions to this problem have been proposed, including filtering, aggregation, and dimension reordering. These solutions, however, have their own limitations with regard to exploring relationships and trends among the coordinates in PCPs. Correlation-based coordinate reordering techniques are among the most popular and have been widely used in PCPs to reduce clutter, though the experiments conducted in this research have identified some of their limitations. To achieve better visualization with reduced clutter, we propose and evaluate a dimension reordering approach based on minimizing the number of crossing pairs. In the last step, k-means clustering is combined with the reordered coordinates to highlight key trends and patterns. The comparative analysis shows that the minimum crossing pairs approach performed much better than the other coordinate reordering techniques applied and, when combined with k-means clustering, resulted in better visualization with significantly reduced clutter.
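The abstract does not spell out how crossing pairs are counted or minimized, so the following is only a minimal sketch under stated assumptions: two records are taken to cross between adjacent axes when their relative order flips from one axis to the next, and the best axis order is found by exhaustive search, which is feasible only for a handful of dimensions. The function names and the toy data are hypothetical, and the k-means step is omitted.

```python
from itertools import combinations, permutations


def crossings(col_a, col_b):
    """Count record pairs whose line segments cross between two adjacent axes.

    Assumption: a pair crosses when its order on axis A is the reverse of
    its order on axis B (ties are not counted as crossings).
    """
    return sum(
        1
        for i, j in combinations(range(len(col_a)), 2)
        if (col_a[i] - col_a[j]) * (col_b[i] - col_b[j]) < 0
    )


def total_crossings(columns, order):
    """Sum the crossings over every pair of neighbouring axes in `order`."""
    return sum(crossings(columns[a], columns[b]) for a, b in zip(order, order[1:]))


def best_order(columns):
    """Exhaustive search over axis orders; only practical for few dimensions."""
    return min(
        permutations(range(len(columns))),
        key=lambda order: total_crossings(columns, order),
    )


# Hypothetical toy data: three dimensions, four records each.
columns = [
    [1, 2, 3, 4],  # dimension 0
    [4, 3, 2, 1],  # dimension 1: reverse of dimension 0
    [1, 2, 4, 3],  # dimension 2: almost the same order as dimension 0
]
order = best_order(columns)
```

On this toy data the search places dimension 2 between dimensions 0 and 1, since the near-identical ordering of dimensions 0 and 2 produces few crossings when they are neighbours.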

OPUS - University of Technology Sydney

Evaluation of a Bundling Technique for Parallel Coordinates

Authors: Heinrich Julian; Kirkpatrick Arthur E.; Luo Yuan; Weiskopf Daniel; Zhang Hao
Publication date: 27/09/2011

We describe a technique for bundled curve representations in parallel-coordinates plots and present a controlled user study evaluating their effectiveness. Replacing the traditional C^0 polygonal lines by C^1 continuous piecewise Bezier curves makes it easier to visually trace data points through each coordinate axis. The resulting Bezier curves can then be bundled to visualize data with given cluster structures. Curve bundles are efficient to compute, provide visual separation between data clusters, reduce visual clutter, and present a clearer overview of the dataset. A controlled user study with 14 participants confirmed the effectiveness of curve bundling for parallel-coordinates visualization: 1) compared to polygonal lines, it is equally capable of revealing correlations between neighboring data attributes; 2) its geometric cues can be effective in displaying cluster information. For some datasets curve bundling allows the color perceptual channel to be applied to other data attributes, while for complex cluster patterns, bundling and color can represent clustering far more clearly than either alone
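To illustrate what replacing C^0 polygonal lines with C^1 piecewise Bezier curves involves, here is a minimal sketch. It is not the authors' construction: it assumes evenly spaced axes, estimates a slope at each axis by finite differences, and places the inner control points of each cubic segment along that slope so that neighbouring segments share a tangent where they meet. The `tension` parameter and all function names are hypothetical, and bundling by cluster is not shown.

```python
def control_points(values, spacing=1.0, tension=1 / 3):
    """Cubic Bezier control points for one record crossing evenly spaced axes.

    Assumes at least two axes. Returns one list of four (x, y) points per
    pair of neighbouring axes.
    """
    n = len(values)
    # Slope at each axis: central difference inside, one-sided at the ends.
    slopes = []
    for k in range(n):
        lo, hi = max(k - 1, 0), min(k + 1, n - 1)
        slopes.append((values[hi] - values[lo]) / ((hi - lo) * spacing))
    d = tension * spacing  # horizontal offset of the inner control points
    segments = []
    for k in range(n - 1):
        x0, x1 = k * spacing, (k + 1) * spacing
        segments.append([
            (x0, values[k]),
            (x0 + d, values[k] + slopes[k] * d),
            (x1 - d, values[k + 1] - slopes[k + 1] * d),
            (x1, values[k + 1]),
        ])
    return segments


def start_tangent(seg):
    """Derivative of a cubic Bezier at t = 0, which is 3 * (P1 - P0)."""
    (x0, y0), (x1, y1) = seg[0], seg[1]
    return (3 * (x1 - x0), 3 * (y1 - y0))


def end_tangent(seg):
    """Derivative of a cubic Bezier at t = 1, which is 3 * (P3 - P2)."""
    (x2, y2), (x3, y3) = seg[2], seg[3]
    return (3 * (x3 - x2), 3 * (y3 - y2))


# Hypothetical record with normalized values on four axes.
segments = control_points([0.2, 0.9, 0.1, 0.5])
```

Because the end tangent of one segment equals the start tangent of the next, the curve passes through each axis without the kink a polygonal line would have there.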

arXiv.org e-Print Archive

CiteSeerX

Visual Multi-Metric Grouping of Eye-Tracking Data

Authors: Burch Michael; Kumar Ayush; Mueller Klaus; Netzel Rudolf; Weiskopf Daniel
Publication venue: University of Bern
Publication date: 14/02/2018

We present an algorithmic and visual grouping of participants and eye-tracking metrics derived from recorded eye-tracking data. Our method utilizes two well-established visualization concepts. First, parallel coordinates are used to provide an overview of the metrics used, their interactions, and their similarities, which helps select suitable metrics that describe characteristics of the eye-tracking data. Furthermore, parallel coordinates plots enable an analyst to test the effects of combining a subset of metrics into a newly derived eye-tracking metric. Second, a similarity matrix visualization is used to visually represent the affine combination of metrics, utilizing an algorithmic grouping of subjects that leads to distinct visual groups of similar behavior. To keep the diagrams of the matrix visualization simple and understandable, we visually encode our eye-tracking data into the cells of a similarity matrix of participants. The algorithmic grouping is performed with a clustering based on the affine combination of metrics, which is also the basis for computing the similarity values of the similarity matrix. To illustrate the usefulness of our visualization, we applied it to an eye-tracking data set involving the reading behavior of metro maps of up to 40 participants. Finally, we discuss limitations and scalability issues of the approach, focusing on visual and perceptual issues.
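As a rough illustration of deriving a metric by affine combination and building a participant similarity matrix from it, consider the sketch below. The abstract does not define the similarity measure, so the one used here (one minus the score difference scaled by the score range) is purely an assumption, as are the function names and the toy numbers.

```python
def affine_combination(metrics, weights):
    """Combine per-participant metric rows into one derived score each.

    An affine combination requires the weights to sum to 1.
    """
    if abs(sum(weights) - 1.0) > 1e-9:
        raise ValueError("weights of an affine combination must sum to 1")
    return [sum(w * m for w, m in zip(weights, row)) for row in metrics]


def similarity_matrix(scores):
    """Pairwise similarity in [0, 1] between participants.

    Assumed measure: 1 - |difference| / range of the derived scores.
    """
    span = (max(scores) - min(scores)) or 1.0  # avoid dividing by zero
    return [[1.0 - abs(a - b) / span for b in scores] for a in scores]


# Hypothetical normalized metrics: one row per participant, two metrics each.
metrics = [
    [0.0, 1.0],
    [1.0, 0.0],
    [0.5, 0.5],
]
scores = affine_combination(metrics, [0.25, 0.75])
similarity = similarity_matrix(scores)
```

Changing the weights changes the derived scores and therefore which participants end up close together in the matrix.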

arXiv.org e-Print Archive

Journal of Eye Movement Research

Understanding Clusters in Multidimensional Spaces: Making Meaning by Combining Insights from Coordinated Views of Domain Knowledge (2004)

Authors: Seo Jinwook; Shneiderman Ben
Publication date: 01/01/2005

Cluster analysis of multidimensional data is widely used in many research areas, including financial, economic, sociological, and biological analyses. Finding natural subclasses in a data set not only reveals interesting patterns but also serves as a basis for further analyses. One of the troubles with cluster analysis is that evaluating how interesting a clustering result is to researchers is subjective, application-dependent, and difficult to measure. This problem generally gets worse as the dimensionality and the number of items grow. The remedy is to enable researchers to apply domain knowledge to facilitate insight into the significance of the clustering result. This article presents a way to better understand a clustering result by combining insights from two interactively coordinated visual displays of domain knowledge. The first is a parallel coordinates view powered by a direct-manipulation search. The second is a domain knowledge view containing well-understood and meaningful tabular or hierarchical information for the same data set. Our examples depend on hierarchical clustering of gene expression data, coordinated with a parallel coordinates view and with the gene annotation and gene ontology.

LDAExplore: Visualizing Topic Models Generated Using Latent Dirichlet Allocation

Authors: Brantley Kiante; Chen Jian; Ganesan Ashwinkumar; Pan Shimei
Publication date: 23/07/2015

We present LDAExplore, a tool to visualize topic distributions in a given document corpus that are generated using topic modeling methods. Latent Dirichlet Allocation (LDA) is one of the basic methods predominantly used to generate topics. One of the problems with methods like LDA is that users who apply them may not understand the topics that are generated. Also, users may find it difficult to search for correlated topics and correlated documents. LDAExplore tries to alleviate these problems by visualizing topic and word distributions generated from the document corpus and allowing the user to interact with them. The system is designed for users who have minimal knowledge of LDA or topic modeling methods. To evaluate our design, we ran a pilot study using the abstracts of 322 Information Visualization papers, where every abstract is considered a document. The generated topics are then explored by users. The results show that users are able to find correlated documents and group them based on similar topics.

arXiv.org e-Print Archive

CiteSeerX

Measuring Visual Complexity of Cluster-Based Visualizations

Authors: Chen M.; Dasgupta A.; Duffy B.; Kosara R.; Walton S.
Publication date: 01/01/2012

Handling visual complexity is a challenging problem in visualization owing to the subjectiveness of its definition and the difficulty in devising generalizable quantitative metrics. In this paper we address this challenge by measuring the visual complexity of two common forms of cluster-based visualizations: scatter plots and parallel coordinates. We conceptualize visual complexity as a form of visual uncertainty, which is a measure of the degree of difficulty for humans to interpret a visual representation correctly. We propose an algorithm for estimating visual complexity for the aforementioned visualizations using Allen's interval algebra. We first establish a set of primitive 2-cluster cases in scatter plots and another set for parallel coordinates based on symmetric isomorphism. We confirm that both are the minimal sets and verify the correctness of their members computationally. We score the uncertainty of each primitive case based on its topological properties, including the existence of overlapping regions, splitting regions, and meeting points or edges. We compare a few optional scoring schemes against a set of subjective scores by humans, and identify the one that is the most consistent with the subjective scores. Finally, we extend the 2-cluster measure to a k-cluster measure as a general-purpose estimator of visual complexity for these two forms of cluster-based visualization.
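Allen's interval algebra distinguishes thirteen qualitative relations between two intervals. The sketch below classifies the relation between the extents of two clusters along one axis. It covers only the interval relations themselves, not the paper's primitive cases or scoring schemes, and the function name and the representation of a cluster extent as a `(start, end)` pair are assumptions.

```python
def allen_relation(a, b):
    """Return the Allen relation of interval a = (start, end) relative to b.

    Both intervals must have start < end.
    """
    (a0, a1), (b0, b1) = a, b
    if a0 >= a1 or b0 >= b1:
        raise ValueError("intervals must satisfy start < end")
    # Disjoint or touching intervals.
    if a1 < b0:
        return "before"
    if a1 == b0:
        return "meets"
    if b1 < a0:
        return "after"
    if b1 == a0:
        return "met-by"
    # From here on the intervals share interior points.
    if a0 == b0 and a1 == b1:
        return "equals"
    if a0 == b0:
        return "starts" if a1 < b1 else "started-by"
    if a1 == b1:
        return "finishes" if a0 > b0 else "finished-by"
    if b0 < a0 and a1 < b1:
        return "during"
    if a0 < b0 and b1 < a1:
        return "contains"
    return "overlaps" if a0 < b0 else "overlapped-by"


# Hypothetical extents of two clusters on one axis.
relation = allen_relation((0, 4), (3, 5))
```

Relations such as "overlaps" and "meets" correspond to the overlapping regions and meeting points that the abstract names as sources of visual uncertainty.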

arXiv.org e-Print Archive

CiteSeerX

Oxford University Research Archive