8,908 research outputs found

    Low-latency, query-driven analytics over voluminous multidimensional, spatiotemporal datasets

    2017 Summer. Includes bibliographical references. Ubiquitous data collection from sources such as remote sensing equipment, networked observational devices, location-based services, and sales tracking has led to the accumulation of voluminous datasets; IDC projects that by 2020 we will generate 40 zettabytes of data per year, while Gartner and ABI estimate that 20-35 billion new devices will be connected to the Internet in the same time frame. The storage and processing requirements of these datasets far exceed the capabilities of modern computing hardware, which has led to the development of distributed storage frameworks that can scale out by assimilating more computing resources as necessary. While challenging in its own right, storing and managing voluminous datasets is only the precursor to a broader field of study: extracting knowledge, insights, and relationships from the underlying datasets. The basic building blocks of this knowledge discovery process are analytic queries, encompassing both query instrumentation and evaluation. This dissertation is centered on query-driven exploratory and predictive analytics over voluminous, multidimensional datasets. Both types of analysis represent a higher-level abstraction over classical query models; rather than indexing every discrete value for subsequent retrieval, our framework autonomously learns the relationships and interactions between dimensions in the dataset (including time series and geospatial aspects) and makes that information readily available to users. This functionality includes statistical synopses, correlation analysis, hypothesis testing, probabilistic structures, and predictive models that not only enable the discovery of nuanced relationships between dimensions but also allow future events and trends to be predicted. This requires specialized data structures and partitioning algorithms, along with adaptive reductions in the search space and management of the inherent trade-off between timeliness and accuracy. The algorithms presented in this dissertation were evaluated empirically on real-world geospatial time-series datasets in a production environment, and are broadly applicable across other storage frameworks.
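    A minimal sketch of the kind of per-partition statistical synopsis such a framework might maintain over geospatial time-series data, assuming a coarse (grid cell, hour-of-day) partitioning and Welford's running-variance update; the class and key names are illustrative and not the dissertation's actual data structures.

```python
# Illustrative sketch (not the dissertation's implementation): a per-partition
# statistical synopsis that answers exploratory queries without scanning raw records.
from collections import defaultdict
from math import sqrt

class Synopsis:
    """Running count/mean/variance for one feature in one partition (Welford)."""
    def __init__(self):
        self.n, self.mean, self.m2 = 0, 0.0, 0.0

    def update(self, x):
        self.n += 1
        delta = x - self.mean
        self.mean += delta / self.n
        self.m2 += delta * (x - self.mean)

    @property
    def stddev(self):
        return sqrt(self.m2 / (self.n - 1)) if self.n > 1 else 0.0

def partition_key(lat, lon, hour, cell_deg=1.0):
    """Coarse spatiotemporal partition: 1-degree grid cell x hour-of-day (assumed scheme)."""
    return (int(lat // cell_deg), int(lon // cell_deg), hour % 24)

synopses = defaultdict(Synopsis)

def ingest(lat, lon, hour, value):
    synopses[partition_key(lat, lon, hour)].update(value)

def query(lat, lon, hour):
    """Answer 'count/mean/stddev near (lat, lon) at this hour' from the synopsis alone."""
    s = synopses[partition_key(lat, lon, hour)]
    return s.n, s.mean, s.stddev
```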

    Earth Observation Open Science and Innovation

    geospatial analytics; social observatory; big earth data; open data; citizen science; open innovation; earth system science; crowdsourced geospatial data; science in society; data science

    How to integrate geochemistry at affordable costs into reactive transport for large-scale systems: Abstract Book

    This international workshop, entitled “How to integrate geochemistry at affordable costs into reactive transport for large-scale systems”, was organized by the Institute of Resource Ecology of the Helmholtz-Zentrum Dresden-Rossendorf in February 2020. A mechanistic understanding of geochemical processes, and appropriate modelling built on that understanding, is essential for reliably predicting contaminant transport in groundwater systems, but also in many other cases where migration of hazardous substances is expected and consequently has to be assessed and limited. Where contamination is already present, such modelling may help to quantify the threats and to support the development and application of suitable remediation measures. Typical application areas are nuclear waste disposal, environmental remediation, mining and milling, carbon capture & storage, and geothermal energy production. Experts from these fields were brought together to discuss large-scale reactive transport modelling (RTM), because the scales covered by such predictions may reach up to one million years and dozens of kilometers. Full-fledged incorporation of geochemical processes, e.g. sorption, precipitation, or redox reactions (to name just a few important basic processes), would thus create unacceptably long computing times. As an effective way to integrate geochemistry at affordable costs into RTM, different geochemical concepts (e.g. multidimensional look-up tables, surrogate functions, machine learning, utilization of uncertainty and sensitivity analysis, etc.) exist and were extensively discussed throughout the workshop. The three-day program offered keynote and regular lectures from experts in the field, a poster session, and a radio lab tour. In total, 40 scientists from 28 research institutes and 8 countries participated.
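    A minimal sketch of the multidimensional look-up-table concept mentioned above, assuming a toy stand-in for the geochemical solver and SciPy's RegularGridInterpolator; the grid ranges, the expensive_kd() model, and the retardation() helper are illustrative assumptions, not part of the workshop material.

```python
# Offline: tabulate an expensive geochemical calculation on a coarse grid.
# Online: each transport cell/time step interpolates instead of re-solving chemistry.
import numpy as np
from scipy.interpolate import RegularGridInterpolator

def expensive_kd(ph, ionic_strength):
    """Toy stand-in for a full speciation/sorption calculation (e.g. a geochemical solver call)."""
    return 10.0 ** (0.5 * ph - 2.0) / (1.0 + ionic_strength)

ph_axis = np.linspace(4.0, 10.0, 25)       # assumed pH range of the table
is_axis = np.linspace(0.0, 1.0, 11)        # assumed ionic strength range (mol/L)
table = np.array([[expensive_kd(p, i) for i in is_axis] for p in ph_axis])
kd_lookup = RegularGridInterpolator((ph_axis, is_axis), table)

def retardation(ph, ionic_strength, bulk_density=1.6, porosity=0.3):
    """Retardation factor R = 1 + rho_b * Kd / n, with Kd taken from the look-up table."""
    kd = kd_lookup([[ph, ionic_strength]])[0]
    return 1.0 + bulk_density * kd / porosity

print(retardation(ph=7.2, ionic_strength=0.1))
```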

    Numerical Modeling of Flexible Structures in Open Ocean Environment

    The dissertation presents advancements in the numerical modeling of offshore aquaculture and harbor protection structures in the open ocean environment. The advancements were implemented in the finite element software Hydro-FE, which expands the Morison equation approach previously incorporated in the Aqua-FE software developed at the University of New Hampshire. The concept of an equivalent dropper was introduced and validated using a typical mussel longline design as an example. Parametric studies of mussel dropper drag coefficients and bending stiffness contributions were performed for different environmental conditions. A corresponding numerical technique was developed to model kelp aggregates in macroalgae aquaculture. The technique uses a modified Morison-type approach calibrated in full-scale physical tow-tank experiments conducted at the Hydromechanics Laboratory of the United States Naval Academy. In addition to the numerical modeling techniques, an advanced methodology for multidimensional approximation of the current velocity fields around offshore installations was proposed. The methodology was applied to model the response of a kelp farm using tidally driven acoustic Doppler current profiler measurements. Finally, a numerical model of a floating protective barrier was built in the Hydro-FE software to evaluate its seaworthiness. The model was validated by comparison to measurements obtained in scaled physical wave tank tests and field deployments.
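    For context, a minimal sketch of the Morison-type load that the Aqua-FE/Hydro-FE approach builds on: per-unit-length drag and inertia force on a cylindrical element. The coefficient values and the single-segment example are illustrative assumptions, not the dissertation's calibrated model.

```python
# Morison equation for a circular cylinder (normal flow component only).
import math

RHO = 1025.0  # seawater density, kg/m^3

def morison_force_per_length(u, du_dt, diameter, cd=1.2, cm=2.0, rho=RHO):
    """Force per unit length (N/m) on a cylindrical element.

    u     : fluid velocity normal to the element axis (m/s)
    du_dt : fluid acceleration normal to the element axis (m/s^2)
    """
    area = math.pi * diameter ** 2 / 4.0           # displaced area per unit length
    drag = 0.5 * rho * cd * diameter * u * abs(u)  # quadratic drag term
    inertia = rho * cm * area * du_dt              # inertia (added mass + Froude-Krylov)
    return drag + inertia

# Example: 10 cm dropper segment in a 0.5 m/s current with a small wave acceleration.
print(morison_force_per_length(u=0.5, du_dt=0.2, diameter=0.10))
```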

    Contributions to the efficient use of general purpose coprocessors: kernel density estimation as case study

    142 p. The high performance computing landscape is shifting from assemblies of homogeneous nodes towards heterogeneous systems, in which nodes consist of a combination of traditional out-of-order execution cores and accelerator devices. Accelerators provide greater theoretical performance than traditional multi-core CPUs, but exploiting their computing power remains a challenging task. This dissertation discusses the issues that arise when trying to efficiently use general purpose accelerators. As a contribution to aid in this task, we present a thorough survey of performance modeling techniques and tools for general purpose coprocessors. We then use the statistical technique of Kernel Density Estimation (KDE) as a case study. KDE is a memory-bound application that poses several challenges for its adaptation to the accelerator-based model. We present a novel algorithm for the computation of KDE, called S-KDE, that considerably reduces its computational complexity. Furthermore, we have carried out two parallel implementations of S-KDE, one for multi- and many-core processors, and another for accelerators. The latter has been implemented in OpenCL in order to make it portable across a wide range of devices. We have evaluated the performance of each implementation of S-KDE on a variety of architectures, trying to highlight the bottlenecks and the limits that the code reaches on each device. Finally, we present an application of our S-KDE algorithm in the field of climatology: a novel methodology for the evaluation of environmental models.
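    For reference, a naive kernel density estimation baseline of the kind S-KDE improves on (the S-KDE algorithm itself is not reproduced here); the Gaussian kernel, bandwidth, and data sizes below are illustrative assumptions.

```python
# Naive 1-D Gaussian KDE: f_hat(x) = 1/(n*h) * sum_i K((x - x_i)/h).
# O(n*m) work and a large pairwise-distance matrix -- the memory-bound pattern
# that motivates adapting KDE to accelerators.
import numpy as np

def kde_gaussian(samples, eval_points, bandwidth):
    samples = np.asarray(samples, dtype=float)
    x = np.asarray(eval_points, dtype=float)
    n, h = samples.size, float(bandwidth)
    z = (x[:, None] - samples[None, :]) / h           # (m, n) scaled distances
    k = np.exp(-0.5 * z * z) / np.sqrt(2.0 * np.pi)   # standard normal kernel
    return k.sum(axis=1) / (n * h)

rng = np.random.default_rng(0)
data = rng.normal(loc=0.0, scale=1.0, size=10_000)
grid = np.linspace(-4, 4, 200)
density = kde_gaussian(data, grid, bandwidth=0.2)
```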

    Data analytics 2016: proceedings of the fifth international conference on data analytics


    Technology assessment of advanced automation for space missions

    Six general classes of technology requirements derived during the mission definition phase of the study were identified as having maximum importance and urgency: autonomous world-model-based information systems, learning and hypothesis formation, natural language and other man-machine communication, space manufacturing, teleoperators and robot systems, and computer science and technology.