Deductive Optimization of Relational Data Storage
Optimizing the physical storage and retrieval of data are two key database management problems. In this paper, we propose a language that can express a wide range of physical database layouts, going well beyond the row- and column-based methods that are widely used in database management systems. We use deductive synthesis to turn a high-level relational representation of a database query into a highly optimized low-level implementation which operates on a specialized layout of the dataset. We build a compiler for this language and conduct experiments using a popular database benchmark, which show that the performance of these specialized queries is competitive with a state-of-the-art in-memory compiled database system.
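The layout language itself is not reproduced in this abstract; as a rough, hypothetical illustration of the kind of layout specialization involved, the Python sketch below contrasts a row-oriented and a column-oriented storage of the same relation, with a simple aggregate query specialized to each (all names and data are made up).

    # Hypothetical illustration: the same relation stored in a row layout
    # and a column layout, with a sum-aggregate query specialized to each.
    rows = [
        {"id": 1, "price": 10.0, "qty": 3},
        {"id": 2, "price": 4.5,  "qty": 7},
    ]

    # Column layout: one array per attribute.
    columns = {
        "id":    [1, 2],
        "price": [10.0, 4.5],
        "qty":   [3, 7],
    }

    def total_revenue_rows(rel):
        # Row layout: touch every field of every tuple.
        return sum(t["price"] * t["qty"] for t in rel)

    def total_revenue_columns(rel):
        # Column layout: scan only the two columns the query needs.
        return sum(p * q for p, q in zip(rel["price"], rel["qty"]))

    assert total_revenue_rows(rows) == total_revenue_columns(columns)

A layout-aware compiler would generate the appropriate specialized loop automatically from the same relational query, rather than requiring the user to rewrite it per layout.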
Differentiable Functional Program Interpreters
Programming by Example (PBE) is the task of inducing computer programs from input-output examples. It can be seen as a type of machine learning where the hypothesis space is the set of legal programs in some programming language. Recent work on differentiable interpreters relaxes the discrete space of programs into a continuous space so that search over programs can be performed using gradient-based optimization. While conceptually powerful, so far differentiable interpreter-based program synthesis has only been capable of solving very simple problems. In this work, we study modeling choices that arise when constructing a differentiable programming language and their impact on the success of synthesis. The main motivation for the modeling choices comes from functional programming: we study the effect of memory allocation schemes, immutable data, type systems, and built-in control-flow structures. Empirically, we show that incorporating functional programming ideas into differentiable programming languages allows us to learn much more complex programs than is possible with existing differentiable languages.
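The paper's model is not detailed in the abstract; as a minimal, hypothetical sketch of the underlying idea of relaxing a discrete program space into a continuous one, the Python code below learns a soft mixture over three primitive operations by gradient descent on input-output examples (the operation set, finite-difference gradients, and all names are illustrative, not the paper's interpreter).

    import math

    # Hypothetical sketch: a "program" is a soft mixture over three primitive
    # operations; gradient descent on the mixture weights recovers the op
    # (here, multiplication) that explains the input-output examples.
    ops = [lambda a, b: a + b, lambda a, b: a - b, lambda a, b: a * b]
    examples = [((2, 3), 6), ((4, 5), 20), ((1, 7), 7)]

    def softmax(ws):
        m = max(ws)
        exps = [math.exp(w - m) for w in ws]
        total = sum(exps)
        return [e / total for e in exps]

    def loss(ws):
        probs = softmax(ws)
        err = 0.0
        for (a, b), out in examples:
            pred = sum(p * op(a, b) for p, op in zip(probs, ops))
            err += (pred - out) ** 2
        return err

    weights = [0.0, 0.0, 0.0]
    lr, eps = 0.01, 1e-4
    for _ in range(500):
        # Finite-difference gradient keeps the sketch dependency-free.
        grads = [(loss(weights[:i] + [weights[i] + eps] + weights[i + 1:])
                  - loss(weights)) / eps for i in range(len(weights))]
        weights = [w - lr * g for w, g in zip(weights, grads)]

    print(softmax(weights))  # probability mass should concentrate on a * b

A real differentiable interpreter extends this idea to soft choices over instructions, registers, and control flow, which is where the memory, typing, and immutability decisions studied in the paper become critical.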
Query Optimization for Dynamic Imputation
Missing values are common in data analysis and present a usability challenge. Users are forced to pick between removing tuples with missing values or creating a cleaned version of their data by applying a relatively expensive imputation strategy. Our system, ImputeDB, incorporates imputation into a cost-based query optimizer, performing necessary imputations on the fly for each query. This allows users to immediately explore their data, while the system picks the optimal placement of imputation operations. We evaluate this approach on three real-world survey-based datasets. Our experiments show that our query plans execute between 10 and 140 times faster than first imputing the base tables. Furthermore, we show that the query results from on-the-fly imputation differ from the traditional base-table imputation approach by 0-8%. Finally, we show that while dropping tuples with missing values that fail query constraints discards 6-78% of the data, on-the-fly imputation loses only 0-21%.
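ImputeDB's cost-based planner is not reproduced here; the Python sketch below only illustrates the trade-off it optimizes, comparing up-front base-table imputation with on-the-fly imputation after a selective filter (the table, column names, and mean-imputation strategy are illustrative, not the system's interface).

    # Hypothetical sketch of the trade-off ImputeDB optimizes: impute the
    # whole base table up front vs. impute on the fly, after filtering.
    rows = [
        {"age": 34,   "income": 52000},
        {"age": None, "income": 61000},
        {"age": 29,   "income": None},
        {"age": 71,   "income": 48000},
    ]

    def mean_impute(table, col):
        # Replace missing values in `col` with the column mean.
        vals = [r[col] for r in table if r[col] is not None]
        fill = sum(vals) / len(vals)
        return [dict(r, **{col: fill if r[col] is None else r[col]})
                for r in table]

    # Plan A: impute every column of the base table, then run the query.
    full = mean_impute(mean_impute(rows, "age"), "income")
    plan_a = [r["income"] for r in full if r["age"] < 40]

    # Plan B: impute on the fly, only where the query needs complete values:
    # "age" before the filter, "income" only for tuples that survive it.
    survivors = [r for r in mean_impute(rows, "age") if r["age"] < 40]
    plan_b = [r["income"] for r in mean_impute(survivors, "income")]

    print(plan_a, plan_b)  # nearly identical answers; Plan B imputes fewer cells

The two plans can return slightly different values (the paper reports 0-8% differences), but the on-the-fly plan imputes only the cells the query actually touches, which is where the reported 10-140x speedups come from.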
Sensitivity of the IceCube detector to astrophysical sources of high-energy muon neutrinos
We present the results of a Monte Carlo study of the sensitivity of the planned IceCube detector to predicted fluxes of muon neutrinos at TeV to PeV energies. A complete simulation of the detector and data analysis is used to study the detector's capability to search for muon neutrinos from sources such as active galaxies and gamma-ray bursts. We study the effective area and the angular resolution of the detector as a function of muon energy and angle of incidence. We present detailed calculations of the sensitivity of the detector to both diffuse and point-like neutrino emission, including an assessment of the sensitivity to neutrinos detected in coincidence with gamma-ray burst observations. After three years of data taking, IceCube will be able to detect a point-source flux of E^2 dN/dE = 7*10^-9 cm^-2 s^-1 GeV at 5-sigma significance, or, in the absence of a signal, place a 90% c.l. limit at a level of E^2 dN/dE = 2*10^-9 cm^-2 s^-1 GeV. A diffuse E^-2 flux would be detectable at a minimum strength of E^2 dN/dE = 1*10^-8 cm^-2 s^-1 sr^-1 GeV. A gamma-ray burst model following the formulation of Waxman and Bahcall would result in a 5-sigma effect after the observation of 200 bursts in coincidence with satellite observations of the gamma rays.
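The full detector simulation is beyond the scope of an abstract; as a back-of-the-envelope illustration of how a flux normalization such as E^2 dN/dE = 2*10^-9 cm^-2 s^-1 GeV translates into an expected event count, the Python sketch below integrates an E^-2 spectrum against an assumed effective area over three years of livetime (the effective-area curve is a placeholder, not IceCube's simulated response).

    import math

    # Back-of-the-envelope sketch: expected muon-neutrino events from a
    # point source with an E^-2 spectrum, N = T * integral(A_eff(E) * dN/dE dE).
    # The effective-area curve below is an illustrative placeholder, NOT the
    # simulated IceCube effective area from the paper.
    PHI0 = 2e-9            # normalization: E^2 dN/dE in GeV cm^-2 s^-1
    LIVETIME = 3 * 3.15e7  # three years of livetime, in seconds

    def flux(E):
        # dN/dE for an E^-2 spectrum, in GeV^-1 cm^-2 s^-1
        return PHI0 / E**2

    def a_eff(E):
        # Placeholder effective area in cm^2, rising with energy
        return 1e4 * (E / 1e3) ** 0.8

    # Integrate over log-spaced energy bins from 1 TeV to 1 PeV (GeV units).
    n_bins = 300
    edges = [10 ** (3 + 3 * i / n_bins) for i in range(n_bins + 1)]
    events = sum(a_eff(math.sqrt(lo * hi)) * flux(math.sqrt(lo * hi)) * (hi - lo)
                 for lo, hi in zip(edges, edges[1:]))
    events *= LIVETIME
    print(f"expected signal events over 3 years: {events:.1f}")

Comparing such an expected signal count against the atmospheric-neutrino background in the source's angular search bin is what sets the 5-sigma discovery flux and 90% c.l. limit quoted above.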