
    Statistical Inference using the Morse-Smale Complex

    The Morse-Smale complex of a function f decomposes the sample space into cells where f is increasing or decreasing. When applied to nonparametric density estimation and regression, it provides a way to represent, visualize, and compare multivariate functions. In this paper, we present some statistical results on estimating Morse-Smale complexes. This allows us to derive new results for two existing methods: mode clustering and Morse-Smale regression. We also develop two new methods based on the Morse-Smale complex: a visualization technique for multivariate functions and a two-sample, multivariate hypothesis test. Comment: 45 pages, 13 figures. Accepted to Electronic Journal of Statistics.
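As a rough illustration of mode clustering, one of the two existing methods named in the abstract above, the sketch below moves each sample point uphill on a Gaussian kernel density estimate with mean-shift updates and groups points by the mode they reach. It is not the paper's implementation: the function names, the bandwidth, the merge radius, and the synthetic two-cluster sample are all assumptions made for this example.

```python
import numpy as np


def mean_shift_modes(points, bandwidth, n_iter=200, tol=1e-6):
    """Move every point uphill on a Gaussian kernel density estimate."""
    points = np.asarray(points, dtype=float)
    current = points.copy()
    for _ in range(n_iter):
        # Differences between the current positions and the fixed data points.
        diff = current[:, None, :] - points[None, :, :]
        weights = np.exp(-0.5 * np.sum(diff ** 2, axis=2) / bandwidth ** 2)
        # Mean-shift update: kernel-weighted average of the data points.
        shifted = weights @ points / weights.sum(axis=1, keepdims=True)
        converged = np.max(np.abs(shifted - current)) < tol
        current = shifted
        if converged:
            break
    return current


def mode_clusters(points, bandwidth, merge_radius=None):
    """Label each point by the density mode its ascent path ends at."""
    if merge_radius is None:
        merge_radius = bandwidth / 2.0  # assumed tolerance for merging endpoints
    endpoints = mean_shift_modes(points, bandwidth)
    modes, labels = [], []
    for endpoint in endpoints:
        for k, mode in enumerate(modes):
            if np.linalg.norm(endpoint - mode) < merge_radius:
                labels.append(k)
                break
        else:
            # No existing mode is close enough, so record a new one.
            modes.append(endpoint)
            labels.append(len(modes) - 1)
    return np.array(labels), np.array(modes)


# Synthetic sample (hypothetical): two well-separated Gaussian blobs in 2D.
rng = np.random.default_rng(0)
sample = np.vstack([
    rng.normal(loc=-3.0, scale=0.5, size=(50, 2)),
    rng.normal(loc=3.0, scale=0.5, size=(50, 2)),
])
labels, modes = mode_clusters(sample, bandwidth=1.0)
```

Each cluster here is the set of points whose ascent paths end at the same mode. The result depends strongly on the bandwidth, which is fixed by hand in this sketch.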

    Geoscience after IT: Part F. Familiarization with quantitative analysis

    Numbers, measurement and calculation extend our view of the world. Statistical methods describe the properties of sets of quantitative data, and can test models (particularly the model that observed relationships arose by chance) and help us to draw conclusions. Links between spatial and quantitative methods, through coordinate geometry and matrix algebra, lead to graphical representations for visualizing and exploring relationships. Multivariate statistics tie into visualization to look at patterns among many properties.

    Multivariate Spatial Visualization using GeoIcons and Image Charts

    Spatial databases are growing in size and complexity, yet current visual data mining methods are challenged when it comes to multivariate spatial data. The specific research question addressed in this thesis is: how can spatial multivariate data be effectively visualized using an icon-based, non-fused co-visualization approach? The thesis presents a Python-based design and implementation of a visualization program termed GeoIcon Viewer. The program incorporates two different visualization methods: GeoIcon Image Map and Region-of-Interest Image Layers Chart. The GeoIcon Image Map technique uses an icon to co-visualize up to nine attributes at a single location. The Region-of-Interest Image Layers Chart method uses a small-multiples approach to support the GeoIcon Image Map technique for data with negligible value differences. The thesis demonstrates the successful implementation of the GeoIcon Viewer with a case study involving remote sensing digital image analysis of a copper deposit. With the two visualization methods and eight input attributes, the GeoIcon Viewer generated real-time interactive visualization outputs that can aid a user in multivariate spatial data mining.

    Teaching Stats for Data Science

    “Data science” is a useful catchword for methods and concepts original to the field of statistics, but typically being applied to large, multivariate, observational records. Such datasets call for techniques not often part of an introduction to statistics: modeling, consideration of covariates, sophisticated visualization, and causal reasoning. This article re-imagines introductory statistics as an introduction to data science and proposes a sequence of 10 blocks that together compose a suitable course for extracting information from contemporary data. Recent extensions to the mosaic packages for R together with tools from the “tidyverse” provide a concise and readable notation for wrangling, visualization, model-building, and model interpretation: the fundamental computational tasks of data science

    A Multi-Code Analysis Toolkit for Astrophysical Simulation Data

    The analysis of complex multiphysics astrophysical simulations presents a unique and rapidly growing set of challenges: reproducibility, parallelization, and vast increases in data size and complexity chief among them. In order to meet these challenges, and in order to open up new avenues for collaboration between users of multiple simulation platforms, we present yt (available at http://yt.enzotools.org/), an open source, community-developed astrophysical analysis and visualization toolkit. Analysis and visualization with yt are oriented around physically relevant quantities rather than quantities native to astrophysical simulation codes. While originally designed for handling Enzo's structured adaptive mesh refinement (AMR) data, yt has been extended to work with several different simulation methods and simulation codes including Orion, RAMSES, and FLASH. We report on its methods for reading, handling, and visualizing data, including projections, multivariate volume rendering, multi-dimensional histograms, halo finding, light cone generation and topologically-connected isocontour identification. Furthermore, we discuss the underlying algorithms yt uses for processing and visualizing data, and its mechanisms for parallelization of analysis tasks. Comment: 18 pages, 6 figures, emulateapj format. Resubmitted to Astrophysical Journal Supplement Series with revisions from referee. yt can be found at http://yt.enzotools.org

    Seeing more than the graph: evaluation of multivariate graph visualization methods.

    Many real-world networks are multivariate, i.e., they have attributes associated with nodes and/or edges. Examples include social networks whose nodes represent people and whose edges represent relationships. There is usually information about each person (such as name, age, and gender) and each relationship (such as type, duration, and strength). Besides common graph analysis tasks (such as identifying the most influential or structurally important nodes), there are more complex analyses for multivariate networks. One of these is multivariate graph clustering, i.e., identifying clusters formed by nodes that have similar attributes and are close to each other in terms of graph distance. For instance, in social network analysis, sociologists are interested in whether people with similar characteristics (node attributes) are also connected to each other. Currently there are very few visualization methods available for such analysis. Graph and multivariate visualization have been well studied separately in the literature. Herman et al. summarized the recent work on graph visualization [3], and Wong and Bergeron covered the development of multivariate visualization [4]. However, there is relatively little work available on multivariate network visualization. Two types of approaches are commonly used. The first is the mapping approach, which maps attributes to visual elements of a node or edge. A simple example is to map one attribute to node size and another to node color [2]. A more advanced mapping approach uses glyphs to represent node or edge attributes. One such example is to use the length and width of a rectangular node glyph to represent two node attributes [1]. The second is the 2.5D approach: it uses the third dimension to present the multivariate information, while the graph is shown on a 2D plane. Examples include the recently proposed "GraphScape" [5], which adopts a landscape metaphor: each attribute is represented by a two-and-a-half-dimensional surface whose height indicates its value. Each approach has its strengths and weaknesses. The mapping approach is effective at showing numerical values using visual elements such as size, but it can be difficult to compare the values of attributes represented by different elements such as size and color. The problem is alleviated by a carefully designed glyph, but visual complexity increases quickly as the number of attributes that a glyph needs to represent grows. The 2.5D approach is good at showing the distribution of attribute values over the network, but the attribute surface can introduce occlusion and reduce the visibility of the underlying network. In this paper, we present a study evaluating the effectiveness of these two approaches for different analysis tasks. We compare the performance of the mapping and 2.5D approaches in a controlled lab environment. We included both simple tasks (such as identifying nodes with the largest attribute value) and complex tasks (such as multivariate graph clustering). Performance is measured in terms of both accuracy and completion time. The results indicate that the mapping approach performs statistically better for the simple tasks, while the 2.5D approach is favored in the complex task. The outcomes of this study provide some guidelines for the design of effective multivariate graph visualization for different analysis tasks.
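The simple mapping approach described in the abstract above (one attribute to node size, another to node color) can be sketched in a few lines. This is only an illustration under stated assumptions, not the study's software: the function names, the size range, the blue-to-red color ramp, and the tiny `people` network with `age` and `contacts` attributes are all hypothetical.

```python
def scale_linear(value, lo, hi, out_lo, out_hi):
    """Linearly rescale value from [lo, hi] to [out_lo, out_hi]."""
    if hi == lo:
        # All values are equal, so use the middle of the output range.
        return (out_lo + out_hi) / 2.0
    t = (value - lo) / (hi - lo)
    return out_lo + t * (out_hi - out_lo)


def map_node_attributes(nodes, size_attr, color_attr, size_range=(5.0, 25.0)):
    """Map one node attribute to size and another to a blue-to-red color."""
    size_values = [attrs[size_attr] for attrs in nodes.values()]
    color_values = [attrs[color_attr] for attrs in nodes.values()]
    s_lo, s_hi = min(size_values), max(size_values)
    c_lo, c_hi = min(color_values), max(color_values)

    encoding = {}
    for node_id, attrs in nodes.items():
        size = scale_linear(attrs[size_attr], s_lo, s_hi, *size_range)
        t = scale_linear(attrs[color_attr], c_lo, c_hi, 0.0, 1.0)
        red = round(255 * t)
        blue = 255 - red
        encoding[node_id] = {"size": size, "color": f"#{red:02x}00{blue:02x}"}
    return encoding


# Hypothetical social network nodes with two numeric attributes.
people = {
    "ann": {"age": 20, "contacts": 3},
    "bob": {"age": 40, "contacts": 10},
    "cy": {"age": 60, "contacts": 17},
}
encoding = map_node_attributes(people, "age", "contacts")
```

The returned dictionary can be passed to any graph drawing routine that accepts per-node sizes and colors. Because size and color are read on different perceptual scales, the sketch also makes the weakness noted in the abstract concrete: it is hard to compare `age` against `contacts` across the two channels.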