    How to evaluate a subspace visual projection in interactive visual systems? A position paper

    This position paper discusses methods for evaluating subspace projections in interactive visual systems. We focus on how to evaluate the real information rendered through a visual data projection when mining high-dimensional data sets. To do this, we investigate automatic techniques that select the best visual projection, and we discuss how they evaluate projections to help the user before interaction begins. With high-dimensional data sets, the number of potential projections exceeds the limit of human interpretation. There are two ways to find the optimal subspace representation. The first is to find the subspace that reproduces what really exists in the original data, so that the existing clusters and/or outliers appear in the projection. The second is to search for subspaces according to the knowledge discovery process: discovering novel but meaningful information, such as clusters and/or outliers, from the projection. The problem is that the visual projection may not be consistent with the subspaces. In some cases, the visual projection shows structures that do not really exist in the original data space (which can be considered artifacts). The mapping between the visual structure and the real data structure is as important as the efficiency and accuracy of the visualization. We examine and discuss the literature of the information visualization, visual analytics, high-dimensional data visualization, and interactive data mining and machine learning communities on how to evaluate the faithfulness of the information in a visual projection.

    Quality-based guidance for exploratory dimensionality reduction

    High-dimensional data sets containing hundreds of variables are difficult to explore, as traditional visualization methods are often unable to represent such data effectively. This is commonly addressed by employing dimensionality reduction prior to visualization. Numerous dimensionality reduction methods are available. However, few reduction approaches take the importance of several structures into account, and few provide an overview of the structures existing in the full high-dimensional data set. For exploratory analysis, as well as for many other tasks, several structures may be of interest. Exploration of the full high-dimensional data set without reduction may also be desirable. This paper presents flexible methods for exploratory analysis and interactive dimensionality reduction. Automated methods are employed to analyse the variables, using a range of quality metrics, providing one or more measures of ‘interestingness’ for individual variables. Through ranking, a single value of interestingness is obtained, based on several quality metrics, that is usable as a threshold for the most interesting variables. An interactive environment is presented in which the user is provided with many possibilities to explore and gain understanding of the high-dimensional data set. Guided by this, the analyst can explore the high-dimensional data set and interactively select a subset of the potentially most interesting variables, employing various methods for dimensionality reduction. The system is demonstrated through a use-case analysing data from a DNA sequence-based study of bacterial populations.
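
The ranking step described in this abstract (several quality metrics combined into one interestingness value that can then be thresholded) might look roughly like the sketch below. This is an illustration only, not the paper's method: the abstract does not say which metrics are used or how ranks are combined, so the mean-rank combination, the metric names `variance` and `outlier_score`, the variable names, and the threshold are all assumptions.

```python
def rank_scores(values):
    """Rank values so that 1 is the highest; tied values share the average rank."""
    order = sorted(range(len(values)), key=lambda i: values[i], reverse=True)
    ranks = [0.0] * len(values)
    pos = 0
    while pos < len(order):
        end = pos
        # Extend the group while the next value ties with the current one.
        while end + 1 < len(order) and values[order[end + 1]] == values[order[pos]]:
            end += 1
        average_rank = (pos + end) / 2 + 1
        for j in range(pos, end + 1):
            ranks[order[j]] = average_rank
        pos = end + 1
    return ranks


def combined_interestingness(metrics):
    """Combine per-variable metric scores (higher = more interesting) into one value.

    Assumption: the combination is the mean rank across metrics, rescaled to
    [0, 1], where 1 means the variable ranked first on every metric.
    """
    n = len(next(iter(metrics.values())))
    per_metric_ranks = [rank_scores(scores) for scores in metrics.values()]
    mean_rank = [
        sum(ranks[i] for ranks in per_metric_ranks) / len(per_metric_ranks)
        for i in range(n)
    ]
    if n == 1:
        return [1.0]
    return [(n - r) / (n - 1) for r in mean_rank]


def select_variables(names, scores, threshold):
    """Keep the variables whose combined interestingness reaches the threshold."""
    return [name for name, score in zip(names, scores) if score >= threshold]


# Hypothetical variables and metric values, for illustration only.
names = ["v1", "v2", "v3", "v4"]
metrics = {
    "variance": [0.9, 0.1, 0.5, 0.3],
    "outlier_score": [0.2, 0.1, 0.8, 0.4],
}
scores = combined_interestingness(metrics)
selected = select_variables(names, scores, threshold=0.6)
print(selected)
```

Working on ranks rather than raw values keeps metrics with different scales comparable, which is one plausible reason to rank before combining; other combinations (weighted ranks, best rank per variable) would fit the abstract's description equally well.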