As a consequence of ever more powerful computing hardware and increasingly precise instruments, our capacity to produce scientific data far outpaces our ability to store and analyse it efficiently. Few of today's tools for analysing scientific data can handle the deluge captured by instruments or generated by supercomputers.
In many scenarios, however, it suffices to analyse a small subset of the data in detail. What scientists consequently need are efficient means to explore the full dataset using approximate query results and to identify the subsets of interest. Once found, interesting areas can still be scrutinised with a precise, but more time-consuming, analysis. Data synopses fit the bill, as they provide fast (but approximate) query execution on massive amounts of data. Generating data synopses after the data is stored, however, requires analysing all the data again and is thus inefficient.
What we propose is to generate the synopsis for simulation applications on-the-fly, as the data is captured. Doing so today typically means changing the simulation or data-capturing code, which is tedious and usually yields a one-off solution that is not generally applicable. In contrast, our vision gives scientists a high-level language and the infrastructure needed to generate code that creates data synopses on-the-fly, as the simulation runs. In this paper we discuss the data management challenges associated with our approach.
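To illustrate the core idea only (this is not the paper's high-level language or infrastructure), the following minimal Python sketch maintains a simple synopsis, an equi-width histogram, incrementally while a hypothetical simulation loop produces values, so approximate queries can be answered without a second pass over the stored data. All names and parameters are assumptions for illustration.

```python
import random


class HistogramSynopsis:
    """Equi-width histogram over a known value range, updated one value at a time."""

    def __init__(self, lo: float, hi: float, bins: int = 32):
        self.lo, self.hi, self.bins = lo, hi, bins
        self.counts = [0] * bins
        self.total = 0

    def add(self, value: float) -> None:
        # Clamp to the configured range, then increment the matching bin.
        clamped = min(max(value, self.lo), self.hi)
        idx = min(int((clamped - self.lo) / (self.hi - self.lo) * self.bins),
                  self.bins - 1)
        self.counts[idx] += 1
        self.total += 1

    def approx_fraction_above(self, threshold: float) -> float:
        # Approximate answer to "what fraction of values exceed the threshold?"
        # Bins only partially above the threshold count fully, which is the
        # source of the approximation error.
        start = max(0, int((threshold - self.lo) / (self.hi - self.lo) * self.bins))
        return sum(self.counts[start:]) / max(self.total, 1)


# Hypothetical simulation loop: the synopsis is filled as values are produced,
# rather than by re-reading the data after it has been stored.
synopsis = HistogramSynopsis(lo=0.0, hi=1.0)
for step in range(100_000):
    value = random.random()   # stand-in for one simulated quantity per timestep
    synopsis.add(value)       # on-the-fly synopsis update

print(f"~{synopsis.approx_fraction_above(0.9):.2%} of values exceed 0.9")
```

In this sketch the approximate query runs against the small histogram only; scientists would use such answers to locate regions of interest and then re-run a precise analysis on just those regions.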