17,652 research outputs found

    Progressive Analytics: A Computation Paradigm for Exploratory Data Analysis

    Get PDF
    Exploring data requires a fast feedback loop from the analyst to the system, with a latency below about 10 seconds because of human cognitive limitations. When data becomes large or analysis becomes complex, sequential computations can no longer be completed in a few seconds and data exploration is severely hampered. This article describes a novel computation paradigm called Progressive Computation for Data Analysis or more concisely Progressive Analytics, that brings at the programming language level a low-latency guarantee by performing computations in a progressive fashion. Moving this progressive computation at the language level relieves the programmer of exploratory data analysis systems from implementing the whole analytics pipeline in a progressive way from scratch, streamlining the implementation of scalable exploratory data analysis systems. This article describes the new paradigm through a prototype implementation called ProgressiVis, and explains the requirements it implies through examples.Comment: 10 page

    Approximated and User Steerable tSNE for Progressive Visual Analytics

    Full text link
    Progressive Visual Analytics aims at improving the interactivity in existing analytics techniques by means of visualization as well as interaction with intermediate results. One key method for data analysis is dimensionality reduction, for example, to produce 2D embeddings that can be visualized and analyzed efficiently. t-Distributed Stochastic Neighbor Embedding (tSNE) is a well-suited technique for the visualization of several high-dimensional data. tSNE can create meaningful intermediate results but suffers from a slow initialization that constrains its application in Progressive Visual Analytics. We introduce a controllable tSNE approximation (A-tSNE), which trades off speed and accuracy, to enable interactive data exploration. We offer real-time visualization techniques, including a density-based solution and a Magic Lens to inspect the degree of approximation. With this feedback, the user can decide on local refinements and steer the approximation level during the analysis. We demonstrate our technique with several datasets, in a real-world research scenario and for the real-time analysis of high-dimensional streams to illustrate its effectiveness for interactive data analysis

    Visualisation techniques for users and designers of layout algorithms

    Get PDF
    Visualisation systems consisting of a set of components through which data and interaction commands flow have been explored by a number of researchers. Such hybrid and multistage algorithms can be used to reduce overall computation time, and to provide views of the data that show intermediate results and the outputs of complementary algorithms. In this paper we present work on expanding the range and variety of such components, with two new techniques for analysing and controlling the performance of visualisation processes. While the techniques presented are quite different, they are unified within HIVE: a visualisation system based upon a data-flow model and visual programming. Embodied within this system is a framework for weaving together our visualisation components to better afford insight into data and also deepen understanding of the process of the data's visualisation. We describe the new components and offer short case studies of their application. We demonstrate that both analysts and visualisation designers can benefit from a rich set of components and integrated tools for profiling performance

    Holistic corpus-based dialectology

    Get PDF
    This paper is concerned with sketching future directions for corpus-based dialectology. We advocate a holistic approach to the study of geographically conditioned linguistic variability, and we present a suitable methodology, 'corpusbased dialectometry', in exactly this spirit. Specifically, we argue that in order to live up to the potential of the corpus-based method, practitioners need to (i) abandon their exclusive focus on individual linguistic features in favor of the study of feature aggregates, (ii) draw on computationally advanced multivariate analysis techniques (such as multidimensional scaling, cluster analysis, and principal component analysis), and (iii) aid interpretation of empirical results by marshalling state-of-the-art data visualization techniques. To exemplify this line of analysis, we present a case study which explores joint frequency variability of 57 morphosyntax features in 34 dialects all over Great Britain
    • 

    corecore