
    VizRank: Data Visualization Guided by Machine Learning

    Data visualization plays a crucial role in identifying interesting patterns in exploratory data analysis. Its use is, however, made difficult by the large number of possible data projections, each showing a different attribute subset, that the data analyst must evaluate. In this paper, we introduce a method called VizRank, which is applied to classified data to automatically select the most useful data projections. VizRank can be used with any visualization method that maps attribute values to points in a two-dimensional visualization space. It assesses possible data projections and ranks them by their ability to visually discriminate between classes. The quality of class separation is estimated by computing the predictive accuracy of a k-nearest neighbor classifier on the data set consisting of the x and y positions of the projected data points and their class information. The paper introduces the method and presents experimental results which show that VizRank's ranking of projections agrees closely with subjective rankings by data analysts. The practical use of VizRank is also demonstrated by an application in the field of functional genomics.
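
The scoring idea in this abstract can be illustrated with a minimal sketch. It assumes that projections are plain scatterplots of attribute pairs and that accuracy is estimated by leave-one-out evaluation; neither choice is stated in the abstract, and the function names `knn_loo_accuracy` and `rank_projections` are hypothetical.

```python
from collections import Counter
from itertools import combinations
from math import dist


def knn_loo_accuracy(points, labels, k=3):
    """Leave-one-out accuracy of a k-NN classifier on projected 2D points.

    Assumption: leave-one-out is used here for simplicity; the abstract
    only says that k-NN predictive accuracy is computed.
    """
    correct = 0
    for i, p in enumerate(points):
        # Indices of the k points closest to p, excluding p itself.
        neighbors = sorted(
            (j for j in range(len(points)) if j != i),
            key=lambda j: dist(p, points[j]),
        )[:k]
        votes = Counter(labels[j] for j in neighbors)
        predicted = votes.most_common(1)[0][0]
        if predicted == labels[i]:
            correct += 1
    return correct / len(points)


def rank_projections(rows, labels, k=3):
    """Score every attribute pair (assumed scatterplot projection), best first."""
    n_attrs = len(rows[0])
    scored = []
    for a, b in combinations(range(n_attrs), 2):
        points = [(r[a], r[b]) for r in rows]
        scored.append((knn_loo_accuracy(points, labels, k), (a, b)))
    return sorted(scored, reverse=True)
```

Each entry of the returned list is a `(score, (attribute_a, attribute_b))` pair, so projections that separate the classes well appear first.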

    Application of Data Visualization and Big Data Analysis in Intelligent Agriculture

    Intelligent agriculture can renovate agricultural production and management, making agricultural production truly scientific and efficient. The existing data mining technology for agricultural information is powerful and professional, but it is not well adapted to intelligent agriculture. Therefore, this paper introduces data visualization and big data analysis into the application scenarios of intelligent agriculture. First, an intelligent agriculture data visualization system was established, and the RadViz data visualization method was detailed for intelligent agriculture. Next, the intelligent agriculture data were reduced in dimensionality through principal component analysis (PCA) and further optimized through k-means clustering (KMC). Finally, the crop yield was predicted using the multiple regression algorithm and the residual principal component regression algorithm. Experiments showed the crop yield prediction model to be effective.

    Enhancement in Visualization of Parallel Coordinates using Curves

    In this paper I analyze techniques for visualizing large data sets with parallel coordinates. Parallel coordinates is an interesting method that can be widely used, not only in research but also in other fields such as business, markets, and finance. The aim of this research work is to implement parallel coordinates and to refine them using curves. In parallel coordinates, a data set is visualized using straight lines. These lines are then replaced with a collection of smooth curves across the attribute axes, allowing individual data elements to be traced, under certain limitations, where this is normally impossible due to the “crossing problem”. The notion of spreading out points on axes with few discrete values is then introduced, which leads to a simple filter technique when the user selects a value on such an axis. In this paper I propose a new concept for visualizing large data sets with parallel coordinates. Parallel coordinates were proposed by Alfred Inselberg as a new way to represent multidimensional information. A parallel coordinates visualization assigns one vertical axis to each variable and spaces these axes evenly along the horizontal. This is in contrast to the traditional Cartesian coordinate system, where all axes are mutually perpendicular. By drawing the axes parallel to one another, one can represent data in far more than three dimensions. Each variable is plotted on its own axis, and the values of the variables on adjacent axes are connected by straight lines. Thus, a point in an n-dimensional space becomes a polygonal line laid out across the n parallel axes, with n-1 line segments connecting the n data values. In this way, the search for relations among the variables is transformed into a 2-D pattern recognition problem, and the variables become amenable to visualization.
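
The basic straight-line construction described in this abstract (one vertical axis per variable, n vertices joined by n-1 segments) can be sketched as follows. This is only an illustration of the classic layout, not the curve-based refinement the paper proposes; the names `to_polyline` and `segments`, the per-axis min/max normalization, and the `axis_spacing` parameter are assumptions.

```python
def to_polyline(record, mins, maxs, axis_spacing=1.0):
    """Map an n-dimensional record to n vertices, one per parallel axis.

    Axis i is the vertical line x = i * axis_spacing. Each value is
    normalized to [0, 1] along its own axis (an assumed convention).
    """
    vertices = []
    for i, (value, lo, hi) in enumerate(zip(record, mins, maxs)):
        # A constant attribute has no range, so place it mid-axis.
        y = (value - lo) / (hi - lo) if hi > lo else 0.5
        vertices.append((i * axis_spacing, y))
    return vertices


def segments(vertices):
    """Return the n-1 straight segments joining consecutive axis vertices."""
    return list(zip(vertices, vertices[1:]))
```

For a three-attribute record, `segments(to_polyline(...))` yields two segments, matching the n-1 count given in the abstract.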

    Three-dimensional Radial Visualization of High-dimensional Datasets with Mixed Features

    We develop methodology for 3D radial visualization (RadViz) of high-dimensional datasets. Our display engine is called RadViz3D and extends the classical 2D RadViz that visualizes multivariate data in the 2D plane by mapping every record to a point inside the unit circle. We show that distributing anchor points at least approximately uniformly on the 3D unit sphere provides a better visualization with minimal artificial visual correlation for data with uncorrelated variables. Our RadViz3D methodology therefore places equi-spaced anchor points, one for every feature, exactly for the five Platonic solids, and approximately via a Fibonacci grid for the other cases. Our Max-Ratio Projection (MRP) method then utilizes the group information in high dimensions to provide distinctive lower-dimensional projections that are then displayed using RadViz3D. Our methodology is extended to datasets with discrete and continuous features where a Gaussianized distributional transform is used in conjunction with copula models before applying MRP and visualizing the result using RadViz3D. An R package radviz3d implementing our complete methodology is available. Comment: 12 pages, 10 figures, 1 tabl
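
As an illustration of approximately uniform anchor placement, the sketch below uses one common Fibonacci-lattice construction on the unit sphere. The exact grid variant used by RadViz3D is not given in the abstract, so this formula and the function name `fibonacci_sphere` are assumptions, and the code is not the radviz3d package's implementation.

```python
from math import cos, pi, sin, sqrt


def fibonacci_sphere(n):
    """Return n approximately equi-spaced points on the unit sphere.

    Assumed construction: heights z are evenly spaced in (-1, 1) and each
    successive point is rotated about the z-axis by the golden angle.
    """
    golden_angle = pi * (3.0 - sqrt(5.0))
    points = []
    for i in range(n):
        z = 1.0 - (2.0 * i + 1.0) / n   # evenly spaced heights
        r = sqrt(1.0 - z * z)           # radius of the circle at height z
        theta = golden_angle * i
        points.append((r * cos(theta), r * sin(theta), z))
    return points
```

Calling `fibonacci_sphere(p)` for a dataset with `p` features would give one candidate anchor per feature.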

    Three-dimensional Radial Visualization of High-dimensional Continuous or Discrete Data

    This paper develops methodology for 3D radial visualization of high-dimensional datasets. Our display engine is called RadViz3D and extends the classic RadViz that visualizes multivariate data in the 2D plane by mapping every record to a point inside the unit circle. The classic RadViz display has equally-spaced anchor points on the unit circle, with each of them associated with an attribute or feature of the dataset. RadViz3D obtains equi-spaced anchor points exactly for the five Platonic solids and approximately for the other cases via a Fibonacci grid. We show that distributing anchor points at least approximately uniformly on the 3D unit sphere provides a better visualization than in 2D. We also propose a Max-Ratio Projection (MRP) method that utilizes the group information in high dimensions to provide distinctive lower-dimensional projections that are then displayed using RadViz3D. Our methodology is extended to datasets with discrete and mixed features where a generalized distributional transform is used in conjunction with copula models before applying MRP and RadViz3D visualization.
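
The classic 2D RadViz mapping mentioned in this abstract can be sketched as below: each record is placed at the average of the anchor positions, weighted by its attribute values. The sketch assumes the values are already scaled to [0, 1] and places an all-zero record at the origin; the function names are hypothetical.

```python
from math import cos, pi, sin


def radviz_anchors(n):
    """Equally spaced anchor points on the unit circle, one per attribute."""
    return [(cos(2 * pi * j / n), sin(2 * pi * j / n)) for j in range(n)]


def radviz_point(values):
    """Map one record to its 2D RadViz position.

    `values` are assumed to be non-negative and scaled to [0, 1]. The
    position is the value-weighted mean of the anchors, so it always lies
    inside the unit circle.
    """
    anchors = radviz_anchors(len(values))
    total = sum(values)
    if total == 0:
        # Assumption: a record with no pull from any anchor sits at the center.
        return (0.0, 0.0)
    x = sum(v * ax for v, (ax, _) in zip(values, anchors)) / total
    y = sum(v * ay for v, (_, ay) in zip(values, anchors)) / total
    return (x, y)
```

A record dominated by a single attribute lands near that attribute's anchor, while a record with equal values on all attributes lands at the center.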

    ICE: An Interactive Configuration Explorer for High Dimensional Categorical Parameter Spaces

    There are many applications where users seek to explore the impact of the settings of several categorical variables with respect to one dependent numerical variable. For example, a computer systems analyst might want to study how the type of file system or storage device affects system performance. A common choice is the method of Parallel Sets, designed to visualize multivariate categorical variables. However, we found that the magnitude of the parameter impacts on the numerical variable cannot be easily observed with it. We also attempted a dimension reduction approach based on Multiple Correspondence Analysis but found that the SVD-generated 2D layout resulted in a loss of information. We hence propose a novel approach, the Interactive Configuration Explorer (ICE), which directly addresses the need of analysts to learn how the dependent numerical variable is affected by the parameter settings given multiple optimization objectives. No information is lost as ICE shows the complete distribution and statistics of the dependent variable in context with each categorical variable. Analysts can interactively filter the variables to optimize for certain goals such as achieving a system with maximum performance, low variance, etc. Our system was developed in tight collaboration with a group of systems performance researchers and its final effectiveness was evaluated with expert interviews, a comparative user study, and two case studies. Comment: 10 pages, Published by IEEE at VIS 2019 (Vancouver, BC, Canada

    Visualising Mutually Non-dominating Solution Sets in Many-objective Optimisation

    Copyright © 2013 IEEE. Personal use of this material is permitted. Permission from IEEE must be obtained for all other uses, including reprinting/republishing this material for advertising or promotional purposes, creating new collective works for resale or redistribution to servers or lists, or reuse of any copyrighted components of this work in other works. As many-objective optimization algorithms mature, the problem owner is faced with visualizing and understanding a set of mutually nondominating solutions in a high dimensional space. We review existing methods and present new techniques to address this problem. We address a common problem with the well-known heatmap visualization, namely that the often arbitrary ordering of rows and columns renders the heatmap unclear, by using spectral seriation to rearrange the solutions and objectives and thus enhance the clarity of the heatmap. A multiobjective evolutionary optimizer is used to further enhance the simultaneous visualization of solutions in objective and parameter space. Two methods for visualizing multiobjective solutions in the plane are introduced. First, we use RadViz and exploit interpretations of barycentric coordinates for convex polygons and simplices to map a mutually nondominating set to the interior of a regular convex polygon in the plane, providing an intuitive representation of the solutions and objectives. Second, we introduce a new measure of the similarity of solutions, the dominance distance, which captures the order relations between solutions. This metric provides an embedding in Euclidean space, which is shown to yield coherent visualizations in two dimensions. The methods are illustrated on standard test problems and data from a benchmark many-objective problem.

    Concentric RadViz: visual exploration of multi-task classification

    The discovery of patterns in large data collections is a difficult task. Visualization and machine learning techniques have emerged as a way to facilitate data analysis, providing tools to uncover relevant patterns from the data. This paper presents Concentric RadViz, a general purpose class visualization system that takes into account multi-class, multi-label and multi-task classifiers. Concentric RadViz uses a force attenuation scheme, which minimizes cluttering and ambiguity in the visual layout. In addition, the user can add concentric circles to the layout in order to represent classification tasks. Our validation results and the application of Concentric RadViz to two real collections suggest that this tool can reveal important data patterns and relations. In our application, the user can interact with the visualization by selecting regions of interest according to specific criteria and changing projection parameters. FAPESP (#2011/22749-8, #2012/17961-0, #2012/24801-0, #2014/09546-9, #2014/18665-1) CNPq (#132239/2013-2, #305796/2013-5, #302643/2013-3