Reinforced Approximate Exploratory Data Analysis
Exploratory data analytics (EDA) is a sequential decision-making process
where analysts choose subsequent queries that might lead to some interesting
insights based on the previous queries and corresponding results. Data
processing systems often execute the queries on samples to produce results with
low latency. Different downsampling strategies preserve different statistics of
the data and yield different magnitudes of latency reduction. The optimum choice
of sampling strategy often depends on the particular context of the analysis
flow and the hidden intent of the analyst. In this paper, we are the first to
consider the impact of sampling in interactive data exploration settings, where
sampling introduces approximation errors. We propose a Deep Reinforcement Learning
(DRL) based framework which can optimize the sample selection in order to keep
the analysis and insight generation flow intact. Evaluations with 3 real
datasets show that our technique can preserve the original insight generation
flow while improving the interaction latency, compared to baseline methods.
Comment: Appears in the 37th AAAI Conference on Artificial Intelligence
(AAAI), 202