    Understanding High Dimensional Spaces through Visual Means Employing Multidimensional Projections

    Data visualisation helps in understanding data represented by multiple variables, also called features, stored in a large matrix where individuals are stored in rows and variable values in columns. These data structures are frequently called multidimensional spaces. In this paper, we illustrate ways of employing the visual results of multidimensional projection algorithms to understand and fine-tune the parameters of their mathematical framework. Some of the mathematical concepts common to these approaches are Laplacian matrices, Euclidean distance, cosine distance, and statistical methods such as Kullback-Leibler divergence, employed to fit probability distributions and reduce dimensions. Two of the relevant algorithms in the data visualisation field are t-distributed stochastic neighbour embedding (t-SNE) and Least-Square Projection (LSP). These algorithms can be used to understand several ranges of mathematical functions including their impact on datasets. In this article, mathematical parameters of underlying techniques such as Principal Component Analysis (PCA) behind t-SNE and mesh reconstruction methods behind LSP are adjusted to reflect the properties afforded by the mathematical formulation. The results, supported by illustrative methods of the processes of LSP and t-SNE, are meant to inspire students in understanding the mathematics behind such methods, in order to apply them in effective data analysis tasks in multiple applications.

    Studies on dimension reduction and feature spaces

    Today's world produces and stores huge amounts of data, which calls for methods that can tackle both growing sizes and growing dimensionalities of data sets. Dimension reduction aims at answering the challenges posed by the latter. Many dimension reduction methods consist of a metric transformation part followed by optimization of a cost function. Several classes of cost functions have been developed and studied, while metrics have received less attention. We promote the view that metrics should be lifted to a more independent role in dimension reduction research. The subject of this work is the interaction of metrics with dimension reduction. The work is built on a series of studies on current topics in dimension reduction and neural network research. Neural networks are used both as a tool and as a target for dimension reduction. When the results of modeling or clustering are represented as a metric, they can be studied using dimension reduction, or they can be used to introduce new properties into a dimension reduction method. We give two examples of such use: visualizing results of hierarchical clustering, and creating supervised variants of existing dimension reduction methods by using a metric that is built on the feature space of a neural network. Combining clustering with dimension reduction results in a novel way for creating space-efficient visualizations, that tell both about hierarchical structure and about distances of clusters. We study feature spaces used in a recently developed neural network architecture called extreme learning machine. We give a novel interpretation for such neural networks, and recognize the need to parameterize extreme learning machines with the variance of network weights. This has practical implications for use of extreme learning machines, since the current practice emphasizes the role of hidden units and ignores the variance. 
A current trend in the research of deep neural networks is to use cost functions from dimension reduction methods to train the network for supervised dimension reduction. We show that equally good results can be obtained by training a bottlenecked neural network for classification or regression, which is faster than using a dimension reduction cost. We demonstrate that, contrary to the current belief, using sparse distance matrices for creating fast dimension reduction methods is feasible, if a proper balance between short-distance and long-distance entries in the sparse matrix is maintained. This observation opens up a promising research direction, with the possibility of using modern dimension reduction methods on much larger data sets than are manageable today.

    Comparison of visualization methods of genome-wide SNP profiles in childhood acute lymphoblastic leukaemia

    Data mining and knowledge discovery have been applied to datasets in various industries including biomedical data. Modelling, data mining and visualization in biomedical data address the problem of extracting knowledge from large and complex biomedical data. The current challenge of dealing with such data is to develop statistical-based and data mining methods that search and browse the underlying patterns within the data. In this paper, we employ several data reduction methods for visualizing genome-wide Single Nucleotide Polymorphism (SNP) datasets based on state-of-the-art data reduction techniques. The visualization approach has been selected based on the trustworthiness of the resultant visualizations. To deal with large amounts of genetic variation data, we have chosen to apply different data reduction methods to deal with the problem induced by high dimensionality. Based on the trustworthiness metric we found that the Neighbor Retrieval Visualizer (NeRV) outperformed other methods. This method optimizes the retrieval quality of Stochastic Neighbor Embedding. The quality measure of the visualization (i.e. NeRV) showed excellent results, even though the dataset was reduced from 13917 to 2 dimensions. The visualization results will assist clinicians and biomedical researchers in understanding the systems biology of patients and how to compare different groups of clusters in visualizations. © 2008, Australian Computer Society, Inc.

    Dynamic Composite Data Physicalization Using Wheeled Micro-Robots

    This paper introduces dynamic composite physicalizations, a new class of physical visualizations that use collections of self-propelled objects to represent data. Dynamic composite physicalizations can be used both to give physical form to well-known interactive visualization techniques, and to explore new visualizations and interaction paradigms. We first propose a design space characterizing composite physicalizations based on previous work in the fields of Information Visualization and Human Computer Interaction. We illustrate dynamic composite physicalizations in two scenarios demonstrating potential benefits for collaboration and decision making, as well as new opportunities for physical interaction. We then describe our implementation using wheeled micro-robots capable of locating themselves and sensing user input, before discussing limitations and opportunities for future work

    A Descriptive Framework for Temporal Data Visualizations Based on Generalized Space-Time Cubes

    We present the generalized space-time cube, a descriptive model for visualizations of temporal data. Visualizations are described as operations on the cube, which transform the cube's 3D shape into readable 2D visualizations. Operations include extracting subparts of the cube, flattening it across space or time, or transforming the cube's geometry and content. We introduce a taxonomy of elementary space-time cube operations and explain how these operations can be combined and parameterized. The generalized space-time cube has two properties: (1) it is purely conceptual without the need to be implemented, and (2) it applies to all datasets that can be represented in two dimensions plus time (e.g. geo-spatial, videos, networks, multivariate data). The proper choice of space-time cube operations depends on many factors, for example, density or sparsity of a cube. Hence, we propose a characterization of structures within space-time cubes, which allows us to discuss strengths and limitations of operations. We finally review interactive systems that support multiple operations, allowing users to customize their view on the data. With this framework, we hope to facilitate the description, criticism and comparison of temporal data visualizations, as well as encourage the exploration of new techniques and systems. This paper is an extension of Bach et al.'s (2014) work.

    Visualization of Time-Varying Data from Atomistic Simulations and Computational Fluid Dynamics

    Time-varying data from simulations of dynamical systems are rich in spatio-temporal information. A key challenge is how to analyze such data to extract useful information and display spatially evolving features in the space-time domain of interest. We develop and implement multiple approaches toward visualization-based analysis of time-varying data obtained from two common types of dynamical simulations: molecular dynamics (MD) and computational fluid dynamics (CFD). We also present application case studies. Parallel first-principles molecular dynamics simulations produce massive amounts of time-varying three-dimensional scattered data representing atomic (molecular) configurations for the material system being simulated. Rendering the atomic position-time series along with the extracted additional information helps us understand the microscopic processes in complex material systems at atomic length and time scales. Radial distribution functions, coordination environments, and clusters are computed and rendered for visualizing structural behavior of the simulated material systems. Atom (particle) trajectories and displacement data are extracted and rendered for visualizing dynamical behavior of the system. While improving our atomistic visualization system to make it versatile, stable and scalable, we focus mainly on atomic trajectories. Trajectory rendering can represent complete simulation information in a single display; however, trajectories get crowded and the associated clutter/occlusion problem becomes serious for even moderate data sizes. We present and assess various approaches for clutter reduction including constrained rendering, basic and adaptive position merging, and information encoding. A data model with HDF5 and partial I/O, and GLSL shading, are adopted to enhance the rendering speed and quality of the trajectories.
For applications, a detailed visualization-based analysis is carried out for simulated silicate melts such as model basalt systems. On the other hand, CFD produces temporally and spatially resolved numerical data for fluid systems consisting of a million to tens of millions of cells (mesh points). We implement time surfaces (in particular, evolving surfaces of spheres) for visualizing the vector (flow) field to study the simulated mixing of fluids in the stirred tank

    Recognizing the Design Patterns of Complex Vaults: Drawing, Survey and Modeling. Experiments on Palazzo Mazzonis’ Atrium in Turin

    This paper shows the results of research advances on complex vaulted systems produced by the integration of laser scanner survey techniques and three-dimensional modeling for the geometric interpretation of built architecture, in order to recognize the geometric matrices of the design conception. The integration between TLS techniques and digital modeling methods led to the definition of new workflows, aimed at optimizing the use of data and at refining the quality of the geometrical interpretation. The process incorporates the traditional activities of freehand drawing of eidotypes, aimed at a deep understanding of the peculiar characteristics of the artifact. In particular, from these procedures new opportunities for the research arise to better understand the relationships between survey data, geometric matrices and compositional rules. The case study presented here, the atrium of Palazzo Mazzonis in Turin, was chosen among a small number of atria that present characteristics of originality and uniqueness in a panorama of realizations strongly characterized by compliance with well-established compositional schemes.