79 research outputs found

    Interactive Feature Selection and Visualization for Large Observational Data

    Data can create enormous value in both scientific and industrial fields, especially by providing access to new knowledge and inspiring innovation. With massive increases in computing power, data storage capacity, and the capability to generate and collect data, scientific research communities are confronting a transformation: exploiting large-scale, complex, high-resolution data sets in situation awareness and decision-making projects. Comprehensively analyzing big data problems requires analyses that target various aspects, including the effective selection of static and time-varying feature patterns that meet the interests of domain users. To fully utilize the ever-growing size of data and computing power in real applications, we proposed a general feature analysis pipeline and an integrated system that is general, scalable, and reliable for interactive feature selection and visualization of large observational data for situation awareness. The great challenge tackled in this dissertation was how to effectively identify and select meaningful features in a complex feature space. Our research efforts comprised three aspects: 1. enabling domain users to better define their analysis interests; 2. accelerating the process of feature selection; 3. comprehensively presenting the intermediate and final analysis results visually. For static feature selection, we developed a series of quantitative metrics that relate user interest to the spatio-temporal characteristics of features. For time-varying feature selection, we proposed the concept of a generalized feature set and used a generalized time-varying feature to describe the selection interest. Additionally, we provided a scalable system framework that manages both data processing and interactive visualization and effectively exploits the computation and analysis resources. The methods and the system design together enabled interactive feature selection from two representative large observational data sets, one with large spatial resolution and one with large temporal resolution. The final results support efforts in big data analysis applications that combine statistical methods with high performance computing techniques to visualize real events interactively.

    Visual analytics for relationships in scientific data

    Domain scientists hope to address grand scientific challenges by exploring the abundance of data generated and made available through modern high-throughput techniques. Typical scientific investigations can make use of novel visualization tools that enable dynamic formulation and fine-tuning of hypotheses to aid the process of evaluating the sensitivity of key parameters. These general tools should be applicable to many disciplines, allowing biologists to develop an intuitive understanding of the structure of coexpression networks and discover genes that reside in critical positions of biological pathways, intelligence analysts to decompose social networks, and climate scientists to model and extrapolate future climate conditions. By using a graph as a universal data representation of correlation, our novel visualization tool employs several techniques that, when used in an integrated manner, provide innovative analytical capabilities. Our tool integrates techniques such as graph layout, qualitative subgraph extraction through a novel 2D user interface, quantitative subgraph extraction using graph-theoretic algorithms or by querying an optimized B-tree, dynamic level-of-detail graph abstraction, and template-based fuzzy classification using neural networks. We demonstrate our system using real-world workflows from several large-scale studies. Parallel coordinates has proven to be a scalable visualization and navigation framework for multivariate data. However, when data with thousands of variables are at hand, we do not have a comprehensive solution to select the right set of variables and order them to uncover important or potentially insightful patterns. We present algorithms to rank axes based upon the importance of bivariate relationships among the variables and showcase the efficacy of the proposed system by demonstrating autonomous detection of patterns in a modern large-scale dataset of time-varying climate simulation.

    Faster inference from state space models via GPU computing

    Funding: C.F.-J. is funded via a doctoral scholarship from the University of St Andrews, School of Mathematics and Statistics. Inexpensive Graphics Processing Units (GPUs) offer the potential to greatly speed up computation by employing their massively parallel architecture to perform arithmetic operations more efficiently. Population dynamics models are important tools in ecology and conservation. Modern Bayesian approaches allow biologically realistic models to be constructed and fitted to multiple data sources in an integrated modelling framework based on a class of statistical models called state space models. However, model fitting is often slow, requiring hours to weeks of computation. We demonstrate the benefits of GPU computing using a model for the population dynamics of British grey seals, fitted with a particle Markov chain Monte Carlo algorithm. Speed-ups of two orders of magnitude were obtained for estimations of the log-likelihood, compared to a traditional ‘CPU-only’ implementation, allowing for an accurate method of inference to be used where this was previously too computationally expensive to be viable. GPU computing has enormous potential, but one barrier to further adoption is a steep learning curve, due to GPUs' unique hardware architecture. We provide a detailed description of hardware and software setup, and our case study provides a template for other similar applications. We also provide a detailed tutorial-style description of GPU hardware architectures, and examples of important GPU-specific programming practices. Publisher PDF. Peer reviewed.
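
    The log-likelihood estimate that the abstract above refers to is the inner step of particle Markov chain Monte Carlo. As a rough illustration only, the sketch below runs a bootstrap particle filter on a toy AR(1)-plus-noise state space model in plain Python on the CPU. The toy model, the function name `bootstrap_log_likelihood`, and all parameter names and defaults are assumptions made for this example; they are not the grey seal model or the authors' GPU code.

```python
import math
import random


def bootstrap_log_likelihood(observations, n_particles=500, phi=0.9,
                             sigma_state=1.0, sigma_obs=1.0, seed=0):
    """Estimate log p(y_1..y_T) for a toy model (hypothetical, |phi| < 1):

    state:       x_t = phi * x_{t-1} + Normal(0, sigma_state^2)
    observation: y_t = x_t + Normal(0, sigma_obs^2)
    """
    rng = random.Random(seed)
    # Start the particles from the stationary distribution of the AR(1) state.
    stationary_sd = sigma_state / math.sqrt(1.0 - phi ** 2)
    particles = [rng.gauss(0.0, stationary_sd) for _ in range(n_particles)]
    log_norm = -0.5 * math.log(2.0 * math.pi * sigma_obs ** 2)
    log_lik = 0.0
    for y in observations:
        # Propagate every particle through the state equation.
        particles = [phi * x + rng.gauss(0.0, sigma_state) for x in particles]
        # Weight each particle by the Gaussian observation log-density.
        log_w = [log_norm - 0.5 * ((y - x) / sigma_obs) ** 2 for x in particles]
        # Subtract the maximum before exponentiating to avoid underflow.
        max_lw = max(log_w)
        weights = [math.exp(lw - max_lw) for lw in log_w]
        # The mean weight estimates p(y_t | y_1..y_{t-1}); accumulate its log.
        log_lik += max_lw + math.log(sum(weights) / n_particles)
        # Multinomial resampling in proportion to the weights.
        particles = rng.choices(particles, weights=weights, k=n_particles)
    return log_lik


if __name__ == "__main__":
    print(bootstrap_log_likelihood([0.1, -0.3, 0.5, 0.2]))
```

    The propagate and weight steps treat each particle independently, which is the kind of work that maps naturally onto a massively parallel architecture; the resampling step and the sum over weights need coordination across particles.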

    Big Data Computing for Geospatial Applications

    The convergence of big data and geospatial computing has brought forth challenges and opportunities to Geographic Information Science with regard to geospatial data management, processing, analysis, modeling, and visualization. This book highlights recent advancements in integrating new computing approaches, spatial methods, and data management strategies to tackle geospatial big data challenges and meanwhile demonstrates opportunities for using big data for geospatial applications. Crucial to the advancements highlighted in this book is the integration of computational thinking and spatial thinking and the transformation of abstract ideas and models to concrete data structures and algorithms

    Intelligence artificielle: Les défis actuels et l'action d'Inria - Livre blanc Inria

    Inria White Paper No. 01. International audience. Inria white papers look at major current challenges in informatics and mathematics and show actions conducted by our project-teams to address these challenges. This document is the first produced by the Strategic Technology Monitoring & Prospective Studies Unit. Thanks to a reactive observation system, this unit plays a lead role in supporting Inria to develop its strategic and scientific orientations. It also enables the institute to anticipate the impact of digital sciences on all social and economic domains. This white paper was coordinated by Bertrand Braunschweig with contributions from 45 researchers from Inria and from our partners. Special thanks to Peter Sturm for his precise and complete review. Thanks also to the STIP department of the Saclay – Île-de-France centre for the final proofreading of the French version.

    Analyzing Granger causality in climate data with time series classification methods

    Attribution studies in climate science aim to scientifically ascertain the influence of climatic variations on natural or anthropogenic factors. Many of those studies adopt the concept of Granger causality to infer statistical cause-effect relationships, while utilizing traditional autoregressive models. In this article, we investigate the potential of state-of-the-art time series classification techniques to enhance causal inference in climate science. We conduct a comparative experimental study of different types of algorithms on a large test suite that comprises a unique collection of datasets from the area of climate-vegetation dynamics. The results indicate that specialized time series classification methods are able to improve existing inference procedures. Substantial differences are observed among the methods that were tested.

    Earth Observation Open Science and Innovation

    geospatial analytics; social observatory; big earth data; open data; citizen science; open innovation; earth system science; crowdsourced geospatial data; science in society; data science

    Flood hazard hydrology: interdisciplinary geospatial preparedness and policy

    Thesis (Ph.D.), University of Alaska Fairbanks, 2017. Floods rank as the deadliest and most frequently occurring natural hazard worldwide, and in 2013 floods in the United States ranked second only to wind storms in accounting for loss of life and damage to property. While flood disasters remain difficult to predict accurately, more precise forecasts and a better understanding of the frequency, magnitude and timing of floods can help reduce the loss of life and costs associated with the impact of flood events. There is a common perception that 1) local-to-national-level decision makers do not have the accurate, reliable and actionable data and knowledge they need in order to make informed flood-related decisions, and 2) because of science-policy disconnects, critical flood and scientific analyses and insights are failing to influence policymakers in national water resource and flood-related decisions that have significant local impact. This dissertation explores these perceived information gaps and disconnects, and seeks to answer the question of whether flood data can be accurately generated, transformed into useful actionable knowledge for local flood event decision makers, and then effectively communicated to influence policy. Utilizing an interdisciplinary mixed-methods research design, this thesis develops a methodological framework and interpretative lens for each of three distinct stages of flood-related information interaction: 1) data generation: using machine learning to estimate streamflow flood data for forecasting and response; 2) knowledge development and sharing: creating a geoanalytic visualization decision support system for flood events; and 3) knowledge actualization: using heuristic toolsets for translating scientific knowledge into policy action. Each stage is elaborated on in one of three distinct research papers, incorporated as chapters in this dissertation, that focus on developing practical data and methodologies that are useful to scientists, local flood event decision makers, and policymakers. Data and analytical results of this research indicate that, if certain conditions are met, it is possible to provide local decision makers and policymakers with the useful actionable knowledge they need to make timely and informed decisions.