2,074 research outputs found

    Approximated and User Steerable tSNE for Progressive Visual Analytics

    Full text link
    Progressive Visual Analytics aims at improving the interactivity in existing analytics techniques by means of visualization as well as interaction with intermediate results. One key method for data analysis is dimensionality reduction, for example, to produce 2D embeddings that can be visualized and analyzed efficiently. t-Distributed Stochastic Neighbor Embedding (tSNE) is a well-suited technique for the visualization of several high-dimensional data. tSNE can create meaningful intermediate results but suffers from a slow initialization that constrains its application in Progressive Visual Analytics. We introduce a controllable tSNE approximation (A-tSNE), which trades off speed and accuracy, to enable interactive data exploration. We offer real-time visualization techniques, including a density-based solution and a Magic Lens to inspect the degree of approximation. With this feedback, the user can decide on local refinements and steer the approximation level during the analysis. We demonstrate our technique with several datasets, in a real-world research scenario and for the real-time analysis of high-dimensional streams to illustrate its effectiveness for interactive data analysis

    Methods of Hierarchical Clustering

    Get PDF
    We survey agglomerative hierarchical clustering algorithms and discuss efficient implementations that are available in R and other software environments. We look at hierarchical self-organizing maps, and mixture models. We review grid-based clustering, focusing on hierarchical density-based approaches. Finally we describe a recently developed very efficient (linear time) hierarchical clustering algorithm, which can also be viewed as a hierarchical grid-based algorithm.Comment: 21 pages, 2 figures, 1 table, 69 reference

    Projections as visual aids for classification system design.

    Get PDF
    Dimensionality reduction is a compelling alternative for high-dimensional data visualization. This method provides insight into high-dimensional feature spaces by mapping relationships between observations (high-dimensional vectors) to low (two or three) dimensional spaces. These low-dimensional representations support tasks such as outlier and group detection based on direct visualization. Supervised learning, a subfield of machine learning, is also concerned with observations. A key task in supervised learning consists of assigning class labels to observations based on generalization from previous experience. Effective development of such classification systems depends on many choices, including features descriptors, learning algorithms, and hyperparameters. These choices are not trivial, and there is no simple recipe to improve classification systems that perform poorly. In this context, we first propose the use of visual representations based on dimensionality reduction (projections) for predictive feedback on classification efficacy. Second, we propose a projection-based visual analytics methodology, and supportive tooling, that can be used to improve classification systems through feature selection. We evaluate our proposal through experiments involving four datasets and three representative learning algorithms

    TR-2009005: Visual Analytics: A Multi-Faceted Overview

    Full text link
    • …
    corecore