Data analytics enhanced data visualization and interrogation with parallel coordinates plots
© 2018 IEEE. Parallel coordinates plots (PCPs) suffer from the curse of dimensionality when used with larger multidimensional datasets. The curse of dimensionality results in clutter, which hides important visual data trends among coordinates. A number of solutions to this problem have been proposed, including filtering, aggregation, and dimension reordering. These solutions, however, have their own limitations with regard to exploring relationships and trends among the coordinates in PCPs. Correlation-based coordinate reordering techniques are among the most popular and have been widely used in PCPs to reduce clutter, though the experiments conducted in this research identified some of their limitations. To achieve better visualization with reduced clutter, we propose and evaluate a dimension reordering approach based on minimizing the number of crossing pairs. In the last step, k-means clustering is combined with the reordered coordinates to highlight key trends and patterns. The comparative analysis shows that the minimum crossing pairs approach performed much better than the other coordinate reordering techniques applied and, when combined with k-means clustering, resulted in better visualization with significantly reduced clutter
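As a rough illustration of the crossing-minimization idea in the abstract above, the sketch below counts how many pairs of records cross between two adjacent axes and then searches for the axis order with the fewest total crossings. The function names, the exhaustive search, and the toy data are assumptions made for this example; the abstract does not state how the authors search the ordering space, and the k-means step is left out.

```python
from itertools import combinations, permutations


def crossings(col_a, col_b):
    """Count record pairs whose polylines cross between two adjacent axes.

    Two records cross when their relative order on axis A is the
    opposite of their relative order on axis B.
    """
    return sum(
        1
        for i, j in combinations(range(len(col_a)), 2)
        if (col_a[i] - col_a[j]) * (col_b[i] - col_b[j]) < 0
    )


def total_crossings(columns, order):
    """Sum the crossings over every pair of neighbouring axes in `order`."""
    return sum(
        crossings(columns[a], columns[b]) for a, b in zip(order, order[1:])
    )


def best_order(columns):
    """Exhaustive search over axis orders (assumption: only a few axes)."""
    return min(
        permutations(range(len(columns))),
        key=lambda order: total_crossings(columns, order),
    )


# Hypothetical toy data: one list of values per axis, four records each.
columns = [[1, 2, 3, 4], [4, 3, 2, 1], [1, 2, 3, 5], [5, 3, 2, 1]]
order = best_order(columns)
print(order, total_crossings(columns, order))
```

Exhaustive search grows factorially with the number of axes, so a heuristic would be needed for anything beyond a handful of dimensions.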
Evaluation of a Bundling Technique for Parallel Coordinates
We describe a technique for bundled curve representations in
parallel-coordinates plots and present a controlled user study evaluating their
effectiveness. Replacing the traditional C^0 polygonal lines by C^1 continuous
piecewise Bezier curves makes it easier to visually trace data points through
each coordinate axis. The resulting Bezier curves can then be bundled to
visualize data with given cluster structures. Curve bundles are efficient to
compute, provide visual separation between data clusters, reduce visual
clutter, and present a clearer overview of the dataset. A controlled user study
with 14 participants confirmed the effectiveness of curve bundling for
parallel-coordinates visualization: 1) compared to polygonal lines, it is
equally capable of revealing correlations between neighboring data attributes;
2) its geometric cues can be effective in displaying cluster information. For
some datasets curve bundling allows the color perceptual channel to be applied
to other data attributes, while for complex cluster patterns, bundling and
color can represent clustering far more clearly than either alone
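To make the C^0 versus C^1 distinction in the abstract above concrete, here is a minimal sketch that replaces a polyline through a record's axis points with piecewise cubic Bezier segments whose tangents match at every axis. The tangent rule (a central difference, as in a Catmull-Rom spline) is an assumption chosen for this example; the paper's own curve construction and its bundling toward cluster structures are not reproduced here.

```python
def bezier_segments(points):
    """Return cubic Bezier segments (p0, c1, c2, p3) through 2-D points.

    Tangents come from central differences (one-sided at the ends), so
    neighbouring segments share the same derivative at each axis point.
    Requires at least two points.
    """
    n = len(points)

    def tangent(k):
        lo, hi = max(k - 1, 0), min(k + 1, n - 1)
        span = hi - lo
        return (
            (points[hi][0] - points[lo][0]) / span,
            (points[hi][1] - points[lo][1]) / span,
        )

    segments = []
    for k in range(n - 1):
        (x0, y0), (x3, y3) = points[k], points[k + 1]
        tx0, ty0 = tangent(k)
        tx1, ty1 = tangent(k + 1)
        c1 = (x0 + tx0 / 3, y0 + ty0 / 3)
        c2 = (x3 - tx1 / 3, y3 - ty1 / 3)
        segments.append(((x0, y0), c1, c2, (x3, y3)))
    return segments


def evaluate(segment, t):
    """Evaluate one cubic Bezier segment at parameter t in [0, 1]."""
    p0, c1, c2, p3 = segment
    u = 1 - t
    return tuple(
        u**3 * a + 3 * u**2 * t * b + 3 * u * t**2 * c + t**3 * d
        for a, b, c, d in zip(p0, c1, c2, p3)
    )


# Hypothetical record: x is the axis position, y the normalized value.
record = [(0, 0.2), (1, 0.8), (2, 0.4), (3, 0.9)]
segments = bezier_segments(record)
```

Sampling `evaluate` at several values of `t` per segment yields the points to draw in place of the straight line between two axes.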
Visual Multi-Metric Grouping of Eye-Tracking Data
We present an algorithmic and visual grouping of participants and eye-tracking metrics derived from recorded eye-tracking data. Our method utilizes two well-established visualization concepts. First, parallel coordinates are used to provide an overview of the used metrics, their interactions, and similarities, which helps select suitable metrics that describe characteristics of the eye-tracking data. Furthermore, parallel coordinates plots enable an analyst to test the effects of creating a combination of a subset of metrics resulting in a newly derived eye-tracking metric. Second, a similarity matrix visualization is used to visually represent the affine combination of metrics utilizing an algorithmic grouping of subjects that leads to distinct visual groups of similar behavior. To keep the diagrams of the matrix visualization simple and understandable, we visually encode our eye-tracking data into the cells of a similarity matrix of participants. The algorithmic grouping is performed with a clustering based on the affine combination of metrics, which is also the basis for the similarity value computation of the similarity matrix. To illustrate the usefulness of our visualization, we applied it to an eye-tracking data set involving the reading behavior of metro maps of up to 40 participants. Finally, we discuss limitations and scalability issues of the approach focusing on visual and perceptual issues
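The abstract above builds a similarity matrix from an affine combination of metrics, that is, a weighted sum whose weights add up to 1. The sketch below shows one way this could look. The similarity formula `1 / (1 + |difference|)`, the function names, and the sample values are assumptions for illustration only; the abstract does not specify how similarity values are computed.

```python
def affine_combination(metrics, weights):
    """Combine one participant's metric values with weights that sum to 1."""
    if abs(sum(weights) - 1.0) > 1e-9:
        raise ValueError("affine weights must sum to 1")
    return sum(w * m for w, m in zip(weights, metrics))


def similarity_matrix(participants, weights):
    """Pairwise similarity of participants based on their combined score.

    Assumption: similarity is 1 / (1 + |score difference|), so identical
    scores give 1.0 and larger differences approach 0.
    """
    scores = [affine_combination(p, weights) for p in participants]
    return [[1.0 / (1.0 + abs(a - b)) for b in scores] for a in scores]


# Hypothetical normalized metrics per participant (two metrics each).
participants = [[0.2, 0.4], [0.2, 0.4], [0.8, 0.6]]
matrix = similarity_matrix(participants, weights=[0.5, 0.5])
```

The resulting matrix is symmetric with ones on the diagonal, so a clustering step could reorder its rows and columns to reveal groups of similar participants.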
Understanding Clusters in Multidimensional Spaces: Making Meaning by Combining Insights from Coordinated Views of Domain Knowledge (2004)
Cluster analysis of multidimensional data is widely used in many research areas including financial, economic, sociological, and biological analyses. Finding natural subclasses in a data set not only reveals interesting patterns but also serves as a basis for further analyses. One of the troubles with cluster analysis is that evaluating how interesting a clustering result is to researchers is subjective, application-dependent, and even difficult to measure. This problem generally gets worse as dimensionality and the number of items grow. The remedy is to enable researchers to apply domain knowledge to facilitate insight about the significance of the clustering result. This article presents a way to better understand a clustering result by combining insights from two interactively coordinated visual displays of domain knowledge. The first is a parallel coordinates view powered by a direct-manipulation search. The second is a domain knowledge view containing well-understood and meaningful tabular or hierarchical information for the same data set. Our examples depend on hierarchical clustering of gene expression data, coordinated with a parallel coordinates view and with the gene annotation and gene ontology
LDAExplore: Visualizing Topic Models Generated Using Latent Dirichlet Allocation
We present LDAExplore, a tool to visualize topic distributions in a given
document corpus that are generated using Topic Modeling methods. Latent
Dirichlet Allocation (LDA) is one of the basic methods that is predominantly
used to generate topics. One of the problems with methods like LDA is that
users who apply them may not understand the topics that are generated. Also,
users may find it difficult to search correlated topics and correlated
documents. LDAExplore tries to alleviate these problems by visualizing topic
and word distributions generated from the document corpus and allowing the user
to interact with them. The system is designed for users who have minimal
knowledge of LDA or Topic Modeling methods. To evaluate our design, we run a
pilot study which uses the abstracts of 322 Information Visualization papers,
where every abstract is considered a document. The topics generated are then
explored by users. The results show that users are able to find correlated
documents and group them based on topics that are similar
Measuring Visual Complexity of Cluster-Based Visualizations
Handling visual complexity is a challenging problem in visualization owing to
the subjectiveness of its definition and the difficulty in devising
generalizable quantitative metrics. In this paper we address this challenge by
measuring the visual complexity of two common forms of cluster-based
visualizations: scatter plots and parallel coordinates. We conceptualize
visual complexity as a form of visual uncertainty, which is a measure of the
degree of difficulty for humans to interpret a visual representation correctly.
We propose an algorithm for estimating visual complexity for the aforementioned
visualizations using Allen's interval algebra. We first establish a set of
primitive 2-cluster cases in scatter plots and another set for parallel
coordinates based on symmetric isomorphism. We confirm that both are the
minimal sets and verify the correctness of their members computationally. We
score the uncertainty of each primitive case based on its topological
properties, including the existence of overlapping regions, splitting regions
and meeting points or edges. We compare a few optional scoring schemes against
a set of subjective scores by humans, and identify the one that is the most
consistent with the subjective scores. Finally, we extend the 2-cluster measure
to a k-cluster measure as a general-purpose estimator of visual complexity for
these two forms of cluster-based visualization
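Allen's interval algebra, which the abstract above uses to enumerate primitive cluster cases, defines thirteen possible relations between two intervals. The sketch below classifies the relation between two intervals, for example the extents of two clusters along one axis. Treating cluster extents as `(start, end)` pairs with `start < end` is an assumption of this example, and the paper's uncertainty scoring is not reproduced.

```python
def allen_relation(a, b):
    """Name the Allen relation of interval a with respect to interval b.

    Both intervals are (start, end) pairs with start < end.
    """
    (a0, a1), (b0, b1) = a, b
    if a1 < b0:
        return "before"
    if b1 < a0:
        return "after"
    if a1 == b0:
        return "meets"
    if b1 == a0:
        return "met-by"
    if a0 == b0 and a1 == b1:
        return "equals"
    if a0 == b0:
        return "starts" if a1 < b1 else "started-by"
    if a1 == b1:
        return "finishes" if a0 > b0 else "finished-by"
    if b0 < a0 and a1 < b1:
        return "during"
    if a0 < b0 and b1 < a1:
        return "contains"
    # Only partial overlap remains at this point.
    return "overlaps" if a0 < b0 else "overlapped-by"


# Hypothetical cluster extents on a single axis.
print(allen_relation((0, 4), (3, 5)))
```

Relations such as "overlaps" and "meets" correspond to the overlapping regions and meeting points the abstract mentions as inputs to the uncertainty score.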