10,990 research outputs found

    VizRank: Data Visualization Guided by Machine Learning

    Get PDF
    Data visualization plays a crucial role in identifying interesting patterns in exploratory data analysis. Its use is, however, made difficult by the large number of possible data projections showing different attribute subsets that must be evaluated by the data analyst. In this paper, we introduce a method called VizRank, which is applied on classified data to automatically select the most useful data projections. VizRank can be used with any visualization method that maps attribute values to points in a two-dimensional visualization space. It assesses possible data projections and ranks them by their ability to visually discriminate between classes. The quality of class separation is estimated by computing the predictive accuracy of k-nearest neighbor classifier on the data set consisting of x and y positions of the projected data points and their class information. The paper introduces the method and presents experimental results which show that VizRank's ranking of projections highly agrees with subjective rankings by data analysts. The practical use of VizRank is also demonstrated by an application in the field of functional genomics

    Overlap Removal of Dimensionality Reduction Scatterplot Layouts

    Full text link
    Dimensionality Reduction (DR) scatterplot layouts have become a ubiquitous visualization tool for analyzing multidimensional data items with presence in different areas. Despite its popularity, scatterplots suffer from occlusion, especially when markers convey information, making it troublesome for users to estimate items' groups' sizes and, more importantly, potentially obfuscating critical items for the analysis under execution. Different strategies have been devised to address this issue, either producing overlap-free layouts, lacking the powerful capabilities of contemporary DR techniques in uncover interesting data patterns, or eliminating overlaps as a post-processing strategy. Despite the good results of post-processing techniques, the best methods typically expand or distort the scatterplot area, thus reducing markers' size (sometimes) to unreadable dimensions, defeating the purpose of removing overlaps. This paper presents a novel post-processing strategy to remove DR layouts' overlaps that faithfully preserves the original layout's characteristics and markers' sizes. We show that the proposed strategy surpasses the state-of-the-art in overlap removal through an extensive comparative evaluation considering multiple different metrics while it is 2 or 3 orders of magnitude faster for large datasets.Comment: 11 pages and 9 figure

    Possibilities and Limits in Visualizing Large Amounts of Multidimensional Data

    Get PDF
    • …
    corecore