
    Towards Data-Driven Large Scale Scientific Visualization and Exploration

    Technological advances have enabled us to acquire extremely large datasets, but it remains a challenge to store, process, and extract information from them. This dissertation builds upon recent advances in machine learning, visualization, and user interaction to facilitate exploration of large-scale scientific datasets. First, we use data-driven approaches to computationally identify regions of interest in the datasets. Second, we use visual presentation for effective user comprehension. Third, we provide interactions for human users to integrate domain knowledge and semantic information into this exploration process. Our research shows how to extract, visualize, and explore informative regions on very large 2D landscape images, 3D volumetric datasets, high-dimensional volumetric mouse brain datasets with thousands of spatially-mapped gene expression profiles, and geospatial trajectories that evolve over time. The contributions of this dissertation include: (1) We introduce a sliding-window saliency model that discovers regions of user interest in very large images; (2) We develop visual segmentation of intensity-gradient histograms to identify meaningful components in volumetric datasets; (3) We extract boundary surfaces from a wealth of volumetric gene expression mouse brain profiles to personalize the reference brain atlas; (4) We show how to efficiently cluster geospatial trajectories by mapping each sequence of locations to a high-dimensional point with the kernel distance framework. We aim to discover patterns, relationships, and anomalies that would lead to new scientific, engineering, and medical advances. This work represents one of the first steps toward better visual understanding of large-scale scientific data by combining machine learning and human intelligence.
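The kernel-distance trajectory embedding of contribution (4) can be illustrated with a minimal numpy sketch. This is not the dissertation's implementation: the Gaussian kernel, the random Fourier feature approximation, and all parameter values are assumptions for illustration. Each trajectory is mapped to the mean of its points' approximate kernel feature vectors, so Euclidean distance between embeddings approximates the kernel distance between trajectories.

```python
import numpy as np

def rff_map(points, n_features=256, gamma=1.0, seed=0):
    """Map a trajectory (sequence of 2D locations) to one fixed-length
    vector: the mean of random Fourier features approximating a Gaussian
    kernel. Euclidean distance between two such vectors then approximates
    the kernel distance between the trajectories (same seed -> same features)."""
    rng = np.random.default_rng(seed)
    W = rng.normal(scale=np.sqrt(2.0 * gamma), size=(points.shape[1], n_features))
    b = rng.uniform(0.0, 2.0 * np.pi, n_features)
    feats = np.sqrt(2.0 / n_features) * np.cos(points @ W + b)
    return feats.mean(axis=0)

# Two similar trajectories and one far away.
t1 = np.array([[0.0, 0.0], [1.0, 0.1], [2.0, 0.0]])
t2 = t1 + 0.05
t3 = t1 + 10.0
v1, v2, v3 = (rff_map(t) for t in (t1, t2, t3))
```

Once every trajectory is a fixed-length vector, any standard clustering algorithm (k-means, hierarchical) can be run on the embeddings, which is what makes the kernel distance framework efficient for large trajectory collections.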

    Sensor Data Visualization in Virtual Globe

    Dissertation submitted in partial fulfillment of the requirements for the degree of Master of Science in Geospatial Technologies.

    With recent developments in sensor standardization and accessibility, valuable data covering different geographical subjects have become widely available. The applications that can leverage sensor data are still under development, and there is much for the scientific community to do in this area. Data visualization tools are one of the most immediately relevant needs related to sensor data: such tools help increase the understanding and exploration of the data, from which many other fields can benefit. Virtual globes are becoming increasingly popular in society; the existence of several implementations and millions of users (scientific and non-scientific) around the world is proof of their growing usefulness as tools for representing and sharing geographical content. In this document we present a generic tool for visualizing sensor data retrieved from SOS servers on the NASA World Wind virtual globe. We started by creating a classification of sensor data that helps define possible visualizations for the different types of sensor data. Using this classification as a basis, we implemented a set of visualization types to ease sensor data exploration. We also included analysis capabilities by integrating the SEXTANTE library into the visualization tool; the results of the analysis can be included in the virtual globe as part of the visualizations.

    The Data Big Bang and the Expanding Digital Universe: High-Dimensional, Complex and Massive Data Sets in an Inflationary Epoch

    Recent and forthcoming advances in instrumentation, and giant new surveys, are creating astronomical data sets that are not amenable to the methods of analysis familiar to astronomers. Traditional methods are often inadequate not merely because of the size in bytes of the data sets, but also because of the complexity of modern data sets. Mathematical limitations of familiar algorithms and techniques in dealing with such data sets create a critical need for new paradigms for the representation, analysis and scientific visualization (as opposed to illustrative visualization) of heterogeneous, multiresolution data across application domains. Some of the problems presented by the new data sets have been addressed by other disciplines such as applied mathematics, statistics and machine learning and have been utilized by other sciences such as space-based geosciences. Unfortunately, valuable results pertaining to these problems are mostly to be found only in publications outside of astronomy. Here we offer brief overviews of a number of concepts, techniques and developments, some "old" and some new. These are generally unknown to most of the astronomical community, but are vital to the analysis and visualization of complex datasets and images. In order for astronomers to take advantage of the richness and complexity of the new era of data, and to be able to identify, adopt, and apply new solutions, the astronomical community needs a certain degree of awareness and understanding of the new concepts. One of the goals of this paper is to help bridge the gap between applied mathematics, artificial intelligence and computer science on the one side and astronomy on the other.

    Comment: 24 pages, 8 figures, 1 table. Accepted for publication in Advances in Astronomy, special issue "Robotic Astronomy".

    An overview of kriging and cokriging predictors for functional random fields

    This article presents an overview of methodologies for spatial prediction of functional data, focusing on both stationary and non-stationary conditions. A significant aspect of the analysis of functional random fields is evaluating stationarity to characterize the stability of statistical properties across the spatial domain. The article explores methodologies from the literature, providing insights into the challenges and advancements in functional geostatistics. This work is relevant from theoretical and practical perspectives, offering an integrated view of methodologies tailored to the specific stationarity conditions of the functional processes under study. The practical implications of our work span fields such as environmental monitoring, geosciences, and biomedical research. This overview encourages advancements in functional geostatistics, paving the way for the development of innovative techniques for analyzing and predicting spatially correlated functional data. It lays the groundwork for future research, enhancing our understanding of spatial statistics and its applications.

    This research was partially supported by FONDECYT, grant number 1200525 (V.L.), from the National Agency for Research and Development (ANID) of the Chilean government under the Ministry of Science, Technology, Knowledge, and Innovation; and by Portuguese funds through CMAT (Research Centre of Mathematics, University of Minho) within projects UIDB/00013/2020 and UIDP/00013/2020 (C.C.).
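As background for the kriging predictors surveyed above, here is a minimal scalar ordinary kriging sketch in numpy. The exponential semivariogram and its parameter values are illustrative assumptions; the functional kriging methodologies in the article generalize the scalar case shown here by replacing scalar observations with curves.

```python
import numpy as np

def ordinary_kriging(sites, z, s0, sill=1.0, rng_=1.0, nugget=0.0):
    """Scalar ordinary kriging at location s0 with an exponential
    semivariogram gamma(h) = nugget + sill * (1 - exp(-h / rng_)).
    Solves the ordinary kriging system (variogram matrix augmented with
    the unbiasedness constraint) for the weights."""
    def gamma(h):
        return nugget + sill * (1.0 - np.exp(-h / rng_))
    n = len(z)
    H = np.linalg.norm(sites[:, None, :] - sites[None, :, :], axis=-1)
    A = np.zeros((n + 1, n + 1))
    A[:n, :n] = gamma(H)
    A[:n, n] = 1.0          # Lagrange multiplier column
    A[n, :n] = 1.0          # weights must sum to 1
    b = np.append(gamma(np.linalg.norm(sites - s0, axis=1)), 1.0)
    lam = np.linalg.solve(A, b)[:n]
    return lam @ z

sites = np.array([[0.0, 0.0], [1.0, 0.0], [0.0, 1.0]])
z = np.array([1.0, 2.0, 3.0])
zhat = ordinary_kriging(sites, z, np.array([0.1, 0.1]))
# With nugget = 0, kriging interpolates the data exactly at observed sites.
zhat_exact = ordinary_kriging(sites, z, sites[0])
```

The unbiasedness constraint (weights summing to one) is what distinguishes ordinary kriging from simple kriging, which assumes a known mean.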

    Worldwide Weather Forecasting by Deep Learning

    Weather forecasting has been and still is a challenging task that has been approached from many angles over the years. Since recent state-of-the-art models are often machine learning models, the availability, quantity, and quality of weather data are increasingly important. Moreover, a review of prominent deep learning models for weather time series forecasting suggests that their main limitation is the formulation and structure of their input data, which restricts the scope and complexity of the problems they attempt to solve. To address this, this work provides a solution, the spherical k-nearest neighbors interpolation (SkNNI) algorithm, to transform and structure scattered geospatial data in a way that makes it useful for predictive model training. SkNNI stands out among common geospatial interpolation methods mainly because of its high robustness to noisy observation data and its acute interpolation neighborhood awareness. Furthermore, through the design, training, and evaluation of the DeltaNet deep neural network architecture, this work demonstrates the feasibility and potential of multidimensional worldwide weather forecasting by deep learning. The approach leverages SkNNI to preprocess weather data into multi-channel geospatial weather frames, which are then organized and used as time series elements. Working with such geospatial frames opens new avenues for defining and solving more complex geospatial (e.g., weather) forecasting problems.
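The spherical interpolation idea behind SkNNI can be sketched as follows. This is an illustrative simplification, not the actual SkNNI algorithm: its published weighting scheme differs from the plain inverse-distance weighting of the k nearest neighbors used here, and all station values below are made up.

```python
import numpy as np

def great_circle(lat1, lon1, lat2, lon2):
    """Central angle (radians) between points given in degrees."""
    p1, l1, p2, l2 = map(np.radians, (lat1, lon1, lat2, lon2))
    return np.arccos(np.clip(
        np.sin(p1) * np.sin(p2) + np.cos(p1) * np.cos(p2) * np.cos(l1 - l2),
        -1.0, 1.0))

def sknn_interp(obs_lat, obs_lon, obs_val, lat, lon, k=3):
    """Simplified spherical k-NN interpolation: inverse-distance weighting
    of the k nearest observations on the sphere."""
    d = great_circle(obs_lat, obs_lon, lat, lon)
    idx = np.argsort(d)[:k]
    if d[idx[0]] == 0.0:          # exact hit: return that observation
        return obs_val[idx[0]]
    w = 1.0 / d[idx]
    return np.sum(w * obs_val[idx]) / np.sum(w)

# Hypothetical temperature observations at four stations (deg C).
lats = np.array([0.0, 0.0, 10.0, -10.0])
lons = np.array([0.0, 10.0, 0.0, 0.0])
vals = np.array([15.0, 18.0, 12.0, 20.0])
t = sknn_interp(lats, lons, vals, 1.0, 1.0, k=3)       # interpolated value
t_exact = sknn_interp(lats, lons, vals, 0.0, 0.0)      # coincides with a station
```

Running such an interpolator over a regular latitude/longitude grid, one channel per weather variable, yields the kind of multi-channel geospatial frame the abstract describes as input to the forecasting model.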

    Posterior Probability Modeling and Image Classification for Archaeological Site Prospection: Building a Survey Efficacy Model for Identifying Neolithic Felsite Workshops in the Shetland Islands

    The application of custom classification techniques and posterior probability modeling (PPM) using Worldview-2 multispectral imagery to archaeological field survey is presented in this paper. Research is focused on the identification of Neolithic felsite stone tool workshops in the North Mavine region of the Shetland Islands in Northern Scotland. Sample data from known workshops surveyed using differential GPS are used alongside known non-sites to train a linear discriminant analysis (LDA) classifier based on a combination of datasets including Worldview-2 bands, band difference ratios (BDR) and topographical derivatives. Principal components analysis is further used to test and reduce dimensionality caused by redundant datasets. Probability models were generated by LDA using principal components and tested with sites identified through geological field survey. Testing shows the prospective ability of this technique, with statistical significance between the 0.05 and 0.01 levels and gain statistics between 0.90 and 0.94, higher than those obtained using maximum likelihood and random forest classifiers. Results suggest that this approach is best suited to relatively homogeneous site types and performs better with correlated data sources. Finally, by combining posterior probability models and least-cost analysis, a survey least-cost efficacy model is generated, showing the utility of such approaches to archaeological field survey.
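The core classification step, a linear discriminant trained on known sites versus non-sites, can be sketched with a two-class Fisher discriminant. This is a numpy toy on synthetic two-feature data; the study's actual feature stack (Worldview-2 bands, band difference ratios, topographic derivatives, principal components) is not reproduced here.

```python
import numpy as np

def fisher_lda(X0, X1):
    """Two-class Fisher discriminant: direction w maximizing between-class
    scatter over within-class scatter, plus a midpoint decision threshold."""
    m0, m1 = X0.mean(axis=0), X1.mean(axis=0)
    Sw = np.cov(X0, rowvar=False) + np.cov(X1, rowvar=False)
    w = np.linalg.solve(Sw, m1 - m0)       # points from class 0 toward class 1
    thresh = w @ (m0 + m1) / 2.0
    return w, thresh

# Synthetic training data: two well-separated clusters of feature vectors.
rng = np.random.default_rng(1)
non_sites = rng.normal(loc=[0.0, 0.0], scale=0.5, size=(50, 2))
sites = rng.normal(loc=[2.0, 2.0], scale=0.5, size=(50, 2))
w, thresh = fisher_lda(non_sites, sites)
pred_site = sites @ w > thresh             # classify known sites
pred_non = non_sites @ w > thresh          # classify known non-sites
```

A posterior probability surface, as used in the paper's PPM, can then be obtained by mapping the discriminant score through a monotone link (e.g. a logistic function) at every image pixel.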

    Design and Interpretability of Contour Lines for Visualizing Multivariate Data

    Multivariate geospatial data are commonly visualized using contour plots, where the plots for various attributes are often examined side by side or blended by color. As the number of attributes grows, however, these approaches become less efficient. This limitation motivated the use of glyphs, where different attributes are mapped to different pre-attentive features of the glyphs. Since both contour plot overlays and glyphs clutter the underlying map, in this paper we examine whether contour lines, which are already present in map space, can be leveraged to visualize multivariate geospatial data. We present five designs for stylizing contour lines and investigate their interpretability using three crowdsourced studies, evaluating the designs through a set of common geospatial data analysis tasks on a four-dimensional dataset. Our first two studies examined how contour line width and the number of contour intervals affect interpretability, using synthetic datasets in which we controlled the underlying data distribution. Study 1 revealed that increasing line width improves task performance, especially completion time, for most designs, except in scenarios where visibility of the background is critical. In Study 2, we found that fewer contour intervals lead to less visual clutter and hence improved performance. We then compared the designs in a third study that used both synthetic and real-life meteorological data; the study revealed that the results found using synthetic data generalized to the real-life data, as hypothesized. Moreover, we formulated a design recommendation table tuned to give users task- and category-specific design suggestions under various environmental constraints. Finally, we discuss how the lab and online versions of Study 1 compare with respect to display size (the lab study used a large screen, the online study smaller displays).
    Our studies show the effectiveness of stylizing contour lines to represent multivariate data, reveal trade-offs among design parameters, and provide designers with important insights into the factors that influence multivariate interpretability. We also show some real-life scenarios where our visualization approach may improve decision making.

    Towards quantifying the effects of resource extraction on land cover and topography through remote sensing analysis: Confronting issues of scale and data scarcity

    This dissertation focuses on the mapping and monitoring of mineral mining activity using remotely sensed data. More specifically, it explores the challenges and issues associated with remote sensing-based analysis of land use/land cover (LULC) and topographic changes in the landscape associated with artisanal and industrial-scale mining. It explores broad themes of image analysis, including evaluation of error in digital elevation models (DEMs), integration of multiple scales and data sources, quantification of change, and remote sensing classification in data-scarce environments. The dissertation comprises three case studies.

    The first case study examines the LULC change associated with two scales of mining activity (industrial and artisanal) near Tortiya, Côte d'Ivoire. Industrial mining activity was successfully mapped in a regional LULC classification using Landsat multispectral imagery and support vector machines (SVMs). However, mapping artisanal mining required high-resolution imagery to discriminate the small, complex patterns of associated disturbance.

    The second case study investigates the potential for quantifying topographic change associated with mountaintop removal mining and the associated valley-fill operations for a region in West Virginia, USA, using publicly available DEMs. DEMs from 1:24,000 topographic map data, the Shuttle Radar Topography Mission (SRTM), a state-wide photogrammetric program, and the Advanced Spaceborne Thermal Emission and Reflection Radiometer (ASTER) Global DEM (GDEM) were compared to a lidar bare-earth reference DEM. The observed mean error in both the SRTM DEM and the GDEM was statistically different from zero, and both modeled a surface well above the reference DEM surface. Mean error in the other DEMs was lower and not significantly different from zero. The magnitude of the root mean square error (RMSE) suggests that only topographic change associated with the largest disturbances would be separable from background noise using global DEMs such as the SRTM. Nevertheless, regionally available DEMs from photogrammetric sources allow mapping of mining change and quantification of the total volume of earth removed.

    Monitoring topographic change associated with mining is challenging in regions where publicly available DEMs are limited or unavailable. This challenge is particularly acute for artisanal mining, where the topographic disturbance, though locally important, is unlikely to be detected in global elevation data sets. Therefore, the third and final case study explored the potential for creating fine-spatial-resolution bare-earth DEMs from digital surface models (DSMs) derived from high-spatial-resolution commercial satellite imagery, with subsequent filtering of elevation artifacts using commercial lidar software and other spatial filtering techniques. Leaf-on and leaf-off DSMs were compared to highlight the effect of vegetation on derived bare-earth DEM accuracy. The raw leaf-off DSM was found to have very low error overall, with notably higher error in areas of evergreen vegetation. The raw leaf-on DSM was found to have an RMSE much higher than that of the leaf-off data, and similar to that of the SRTM in dense deciduous forest. However, filtering using commercial techniques developed for lidar notably reduced the error present in the raw DSMs, suggesting that such approaches could help overcome data scarcity in regions where regional or national elevation data sets are not available.

    Collectively, this research addresses data issues and methodological challenges in the analysis of 3D changes caused by resource extraction. Elevation data and optical imagery are key data sets for mapping the disturbance associated with mining; the particular combination of data spatial scale and, for elevation, accuracy that is required is a function of the type and scale of the mining.
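The DEM error metrics central to the second case study (mean error as a measure of bias, RMSE as a measure of total error, and a test of whether the mean error differs from zero) can be sketched as follows. The surfaces are synthetic, and the bias and noise magnitudes are illustrative assumptions, not the study's measured values.

```python
import numpy as np

def dem_error_stats(test_dem, reference_dem):
    """Per-cell error of a test DEM against a bare-earth reference:
    mean error (bias), RMSE, and the one-sample t statistic for
    H0: mean error = 0."""
    err = (test_dem - reference_dem).ravel()
    me = err.mean()
    rmse = np.sqrt(np.mean(err ** 2))
    t = me / (err.std(ddof=1) / np.sqrt(err.size))
    return me, rmse, t

# Synthetic reference terrain plus two hypothetical test DEMs:
# one biased upward (like a surface model over canopy), one unbiased.
rng = np.random.default_rng(0)
reference = rng.uniform(200.0, 400.0, size=(100, 100))
biased = reference + 6.0 + rng.normal(0.0, 3.0, size=reference.shape)
unbiased = reference + rng.normal(0.0, 1.0, size=reference.shape)
me_b, rmse_b, t_b = dem_error_stats(biased, reference)
me_u, rmse_u, t_u = dem_error_stats(unbiased, reference)
```

A large |t| indicates a bias significantly different from zero, as reported for the SRTM and GDEM; the RMSE then bounds the size of topographic change that is separable from background noise in a differencing analysis.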

    Principal Geodesic Analysis of Merge Trees (and Persistence Diagrams)

    This paper presents a computational framework for the Principal Geodesic Analysis of merge trees (MT-PGA), a novel adaptation of the celebrated Principal Component Analysis (PCA) framework [87] to the Wasserstein metric space of merge trees [92]. We formulate MT-PGA computation as a constrained optimization problem, aiming at adjusting a basis of orthogonal geodesic axes while minimizing a fitting energy. We introduce an efficient, iterative algorithm which exploits shared-memory parallelism, as well as an analytic expression of the fitting energy gradient, to ensure fast iterations. Our approach also trivially extends to extremum persistence diagrams. Extensive experiments on public ensembles demonstrate the efficiency of our approach, with MT-PGA computations on the order of minutes for the largest examples. We show the utility of our contributions by extending two typical PCA applications to merge trees. First, we apply MT-PGA to data reduction and reliably compress merge trees by concisely representing them by their first coordinates in the MT-PGA basis. Second, we present a dimensionality reduction framework exploiting the first two directions of the MT-PGA basis to generate two-dimensional layouts of the ensemble. We augment these layouts with persistence correlation views, enabling global and local visual inspection of the feature variability in the ensemble. In both applications, quantitative experiments assess the relevance of our framework. Finally, we provide a lightweight C++ implementation that can be used to reproduce our results.

    Visual exploration of climate variability changes using wavelet analysis

    Due to its nonlinear nature, the climate system shows quite high natural variability on different time scales, including multiyear oscillations such as the El Niño Southern Oscillation phenomenon. Besides a shift of the mean states and of extreme values of climate variables, climate change may also change the frequency or the spatial patterns of these natural climate variations. Wavelet analysis is a well-established tool to investigate variability in the frequency domain. However, due to the size and complexity of the analysis results, only a few time series are commonly analyzed concurrently. In this paper we explore different techniques to visually assist the user in the analysis of variability and variability changes, allowing for a holistic analysis of a global climate model data set consisting of several variables and extending over 250 years. Our new framework and data from the IPCC AR4 simulations with the coupled climate model ECHAM5/MPI-OM are used to explore the temporal evolution of El Niño due to climate change.
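The wavelet analysis described above can be sketched with a numpy-only Morlet continuous wavelet transform; this is an illustrative stand-in for the paper's actual tooling, and the toy time series, scale values, and wavelet parameter w0 are assumptions. The power spectrum peaks at the scale whose equivalent period matches the oscillation in the data, which is how multiyear signals such as El Niño are localized in time and frequency.

```python
import numpy as np

def morlet_power(signal, scales, w0=6.0):
    """Continuous wavelet transform power with a Morlet wavelet,
    computed by direct convolution at each scale (numpy only)."""
    n = len(signal)
    power = np.empty((len(scales), n))
    for i, s in enumerate(scales):
        t = np.arange(-4 * s, 4 * s + 1)           # truncated wavelet support
        wav = np.exp(1j * w0 * t / s) * np.exp(-0.5 * (t / s) ** 2)
        wav /= np.sqrt(s)                          # scale normalization
        coef = np.convolve(signal, np.conj(wav[::-1]), mode='same')
        power[i] = np.abs(coef) ** 2
    return power

# Toy "ENSO-like" monthly series: a 48-step oscillation plus noise.
rng = np.random.default_rng(2)
n = 1600
x = np.sin(2 * np.pi * np.arange(n) / 48) + 0.3 * rng.normal(size=n)
scales = np.array([12.0, 24.0, 48.0, 96.0])
P = morlet_power(x, scales)
```

For w0 = 6, the equivalent Fourier period is roughly 1.03 times the scale, so the mean power here is largest at the scale of 48, matching the oscillation period; changes of this peak over a long simulation are exactly the kind of variability change the paper visualizes.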