    Interactive tag maps and tag clouds for the multiscale exploration of large spatio-temporal datasets

    'Tag clouds' and 'tag maps' are introduced to represent geographically referenced text. In combination, these aspatial and spatial views are used to explore a large structured spatio-temporal data set by providing overviews and filtering by text and geography. Prototypes are implemented using freely available technologies including Google Earth and Yahoo!'s Tag Map applet. The interactive tag map and tag cloud techniques and the rapid prototyping method used are informally evaluated through successes and limitations encountered. Preliminary evaluation suggests that the techniques may be useful for generating insights when visualizing large data sets containing geo-referenced text strings. The rapid prototyping approach enabled the technique to be developed and evaluated, leading to geovisualization through which a number of ideas were generated. Limitations of this approach are reflected upon. Tag placement, generalisation and prominence at different scales are issues which have come to light in this study that warrant further work.

    Revisiting Guerry's data: Introducing spatial constraints in multivariate analysis

    Standard multivariate analysis methods aim to identify and summarize the main structures in large data sets containing the description of a number of observations by several variables. In many cases, spatial information is also available for each observation, so that a map can be associated with the multivariate data set. Two main objectives are relevant in the analysis of spatial multivariate data: summarizing covariation structures and identifying spatial patterns. In practice, achieving both goals simultaneously is a statistical challenge, and a range of methods have been developed that offer trade-offs between these two objectives. In an applied context, this methodological question has been and remains a major issue in community ecology, where species assemblages (i.e., covariation between species abundances) are often driven by spatial processes (and thus exhibit spatial patterns). In this paper we review a variety of methods developed in community ecology to investigate multivariate spatial patterns. We present different ways of incorporating spatial constraints in multivariate analysis and illustrate these different approaches using the famous data set on moral statistics in France published by André-Michel Guerry in 1833. We discuss and compare the properties of these different approaches from both a practical and a theoretical viewpoint. Comment: Published at http://dx.doi.org/10.1214/10-AOAS356 in the Annals of Applied Statistics (http://www.imstat.org/aoas/) by the Institute of Mathematical Statistics (http://www.imstat.org)

    Approximate Inference in Continuous Determinantal Point Processes

    Determinantal point processes (DPPs) are random point processes well-suited for modeling repulsion. In machine learning, the focus of DPP-based models has been on diverse subset selection from a discrete and finite base set. This discrete setting admits an efficient sampling algorithm based on the eigendecomposition of the defining kernel matrix. Recently, there has been growing interest in using DPPs defined on continuous spaces. While the discrete-DPP sampler extends formally to the continuous case, computationally, the steps required are not tractable in general. In this paper, we present two efficient DPP sampling schemes that apply to a wide range of kernel functions: one based on low-rank approximations via Nyström and random Fourier feature techniques and another based on Gibbs sampling. We demonstrate the utility of continuous DPPs in repulsive mixture modeling and synthesizing human poses spanning activity spaces.
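
The abstract above names random Fourier features as one route to a low-rank kernel approximation. As a minimal sketch of that ingredient only (not the paper's DPP sampler), the following approximates a Gaussian kernel by an inner product of random cosine features. The Gaussian kernel choice, the function names (`gaussian_kernel`, `make_rff_map`, `approx_kernel`), and all parameter values are illustrative assumptions, not taken from the paper.

```python
import math
import random


def gaussian_kernel(x, y, sigma=1.0):
    """Exact Gaussian kernel exp(-||x - y||^2 / (2 sigma^2))."""
    sq_dist = sum((a - b) ** 2 for a, b in zip(x, y))
    return math.exp(-sq_dist / (2.0 * sigma ** 2))


def make_rff_map(dim, num_features, sigma=1.0, seed=0):
    """Build a random Fourier feature map for the Gaussian kernel.

    Hypothetical helper for illustration: frequencies are drawn from
    N(0, 1/sigma^2) per coordinate and phases from U[0, 2*pi].
    """
    rng = random.Random(seed)
    weights = [
        [rng.gauss(0.0, 1.0 / sigma) for _ in range(dim)]
        for _ in range(num_features)
    ]
    offsets = [rng.uniform(0.0, 2.0 * math.pi) for _ in range(num_features)]
    scale = math.sqrt(2.0 / num_features)

    def feature_map(x):
        # z_j(x) = sqrt(2 / D) * cos(w_j . x + b_j)
        return [
            scale * math.cos(sum(w_i * x_i for w_i, x_i in zip(w, x)) + b)
            for w, b in zip(weights, offsets)
        ]

    return feature_map


def approx_kernel(feature_map, x, y):
    """Approximate k(x, y) by the inner product of the two feature vectors."""
    zx, zy = feature_map(x), feature_map(y)
    return sum(a * b for a, b in zip(zx, zy))


if __name__ == "__main__":
    phi = make_rff_map(dim=2, num_features=4000, sigma=1.0, seed=0)
    x, y = [0.3, -0.5], [1.0, 0.2]
    print("exact :", gaussian_kernel(x, y))
    print("approx:", approx_kernel(phi, x, y))
```

The feature map has a fixed, finite dimension, so the kernel it induces has rank at most `num_features`; the approximation error shrinks roughly as one over the square root of the number of features.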

    A review of data visualization: opportunities in manufacturing sequence management.

    Data visualization now benefits from developments in technologies that offer innovative ways of presenting complex data. Potentially, these have widespread application in communicating the complex information domains typical of manufacturing sequence management environments for global enterprises. In this paper the authors review the visualization functionalities, techniques and applications reported in the literature, map these to manufacturing sequence information presentation requirements and identify the opportunities available and likely development paths. Current leading-edge practice in dynamic updating and communication with suppliers is not being exploited in manufacturing sequence management; it could provide significant benefits to manufacturing business. In the context of global manufacturing operations and broad-based user communities with differing needs served by common data sets, tool functionality is generally ahead of user application.

    Curriculum Guidelines for Undergraduate Programs in Data Science

    The Park City Math Institute (PCMI) 2016 Summer Undergraduate Faculty Program met for the purpose of composing guidelines for undergraduate programs in Data Science. The group consisted of 25 undergraduate faculty from a variety of institutions in the U.S., primarily from the disciplines of mathematics, statistics and computer science. These guidelines are meant to provide some structure for institutions planning for or revising a major in Data Science.