Pay One, Get Hundreds for Free: Reducing Cloud Costs through Shared Query Execution

Author: Chen Chung-Min
Graefe Goetz
Harizopoulos Stavros
Lang Christian A.
Manegold Stefan
Transaction Processing Performance Council
Zukowski Marcin
Publication venue: 'Association for Computing Machinery (ACM)'
Publication date: 01/09/2018

Cloud-based data analysis is nowadays common practice because of the lower system management overhead as well as the pay-as-you-go pricing model. The pricing model, however, is not always suitable for query processing, as heavy use results in high costs. For example, in query-as-a-service systems, where users are charged per processed byte, collections of queries that frequently access the same data can become expensive. The problem is compounded by the limited options for the user to optimize query execution when using declarative interfaces such as SQL. In this paper, we show how, without modifying existing systems and without the involvement of the cloud provider, it is possible to significantly reduce the overhead, and hence the cost, of query-as-a-service systems. Our approach is based on query rewriting so that multiple concurrent queries are combined into a single query. Our experiments show that the aggregated amount of work done by the shared execution is smaller than in a query-at-a-time approach. Since queries are charged per byte processed, the cost of executing a group of queries is often the same as executing a single one of them. As an example, we demonstrate how the shared execution of the TPC-H benchmark is up to 100x and 16x cheaper in Amazon Athena and Google BigQuery, respectively, than using a query-at-a-time approach, while achieving a higher throughput.
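The idea of combining several queries into one can be illustrated with a minimal sketch. It is not the paper's rewriting scheme: it assumes the simplest case, where all queries aggregate over the same table and differ only in their filter, so each one can become a conditional aggregate in a single scan. The `orders` table, its rows, and the `billed_bytes` cost model are hypothetical and exist only for illustration; SQLite stands in for a query-as-a-service engine.

```python
import sqlite3

# Hypothetical table and data, for illustration only.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE orders (region TEXT, price INTEGER)")
conn.executemany(
    "INSERT INTO orders VALUES (?, ?)",
    [("EU", 10), ("US", 20), ("EU", 5), ("ASIA", 7), ("US", 1)],
)

regions = ["EU", "US", "ASIA"]

# Query-at-a-time: every query scans the table on its own.
separate = [
    conn.execute(
        "SELECT SUM(price) FROM orders WHERE region = ?", (r,)
    ).fetchone()[0]
    for r in regions
]

# Shared execution: one rewritten query, one conditional aggregate
# per original query, so the table is scanned only once.
select_list = ", ".join(
    "SUM(CASE WHEN region = ? THEN price END)" for _ in regions
)
shared = list(
    conn.execute(f"SELECT {select_list} FROM orders", regions).fetchone()
)


def billed_bytes(table_bytes, scans):
    """Assumed cost model: the bill grows with bytes scanned, once per scan."""
    return table_bytes * scans
```

Under this assumed per-byte model, `billed_bytes(1000, len(regions))` charges three scans for the separate queries, while `billed_bytes(1000, 1)` charges one scan for the shared query that returns the same three results.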

arXiv.org e-Print Archive

Repository for Publications and Research Data

Vectorwise: Beyond Column Stores

Author: Boncz P.A.
Zukowski M.
Publication date: 01/01/2012

This paper tells the story of Vectorwise, a high-performance analytical database system, from multiple perspectives: its history from academic project to commercial product, the evolution of its technical architecture, customer reactions to the product and its future research and development roadmap. One take-away from this story is that the novelty in Vectorwise is much more than just column-storage: it boasts many query processing innovations in its vectorized execution model, and an adaptive mixed row/column data storage model with indexing support tailored to analytical workloads. Another one is that there is a long road from research prototype to commercial product, though database research continues to achieve a strong innovative influence on product development.

Faster Geometric Algorithms via Dynamic Determinant Computation

Author: Abbott
Avis
Avrachenkov
Bareiss
Bartlett
Barvinok
Basu
Berkowitz
Bird
Boehm
Boissonnat
Brönnimann
Bunch
Büeler
CGAL
Chand
Clarkson
Conway
Coppersmith
Cox
Dumas
Edelsbrunner
Emiris
Fisikopoulos
Fukuda
Garling
Gawrilow
Gelfand
Guennebaud
Harville
Hornus
Iliopoulos
Kaltofen
Kettner
Krattenthaler
Le Gall
Luis Peñaranda
Mahajan
Poole
Press
Rambau
Robinson
Rote
Sankowski
Seidel
Sherman
Urbańska
Villard
Vissarion Fisikopoulos
Yap
Ziegler
Publication venue: 'Elsevier BV'
Publication date: 12/01/2016

The computation of determinants or their signs is the core procedure in many important geometric algorithms, such as convex hull, volume and point location. As the dimension of the computation space grows, a higher percentage of the total computation time is consumed by these computations. In this paper we study the sequences of determinants that appear in geometric algorithms. The computation of a single determinant is accelerated by using the information from the previous computations in that sequence. We propose two dynamic determinant algorithms with quadratic arithmetic complexity when employed in convex hull and volume computations, and with linear arithmetic complexity when used in point location problems. We implement the proposed algorithms and perform an extensive experimental analysis. On the one hand, our analysis serves as a performance study of state-of-the-art determinant algorithms and implementations. On the other hand, we demonstrate the supremacy of our methods over state-of-the-art implementations of determinant and geometric algorithms. Our experimental results include a 20 and 78 times speed-up in volume and point location computations in dimension 6 and 11 respectively. Comment: 29 pages, 8 figures, 3 tables
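One standard way to reuse earlier work across a sequence of determinants is the matrix determinant lemma combined with a Sherman-Morrison update of the inverse. The sketch below assumes that consecutive matrices differ in a single column. It is not taken from the paper and is not claimed to be the authors' algorithm. If column j of A is replaced by u and w = A^{-1} u, the new determinant is det(A) * w[j], and the new inverse follows in quadratic time. The function names `det_and_inverse` and `replace_column` are hypothetical.

```python
from fractions import Fraction


def det_and_inverse(a):
    """Gauss-Jordan elimination, O(n^3): exact determinant and inverse."""
    n = len(a)
    # Augment each row of the matrix with the matching identity row.
    m = [
        [Fraction(x) for x in row] + [Fraction(int(i == j)) for j in range(n)]
        for i, row in enumerate(a)
    ]
    det = Fraction(1)
    for c in range(n):
        p = next((r for r in range(c, n) if m[r][c] != 0), None)
        if p is None:
            raise ValueError("singular matrix")
        if p != c:
            m[c], m[p] = m[p], m[c]
            det = -det  # a row swap flips the sign
        piv = m[c][c]
        det *= piv
        m[c] = [x / piv for x in m[c]]
        for r in range(n):
            if r != c and m[r][c] != 0:
                f = m[r][c]
                m[r] = [x - f * y for x, y in zip(m[r], m[c])]
    return det, [row[n:] for row in m]


def replace_column(det, inv, j, u):
    """Update (det, inverse) in O(n^2) after column j becomes u."""
    n = len(inv)
    # w = A^{-1} u; the determinant scales by w[j].
    w = [sum(inv[i][k] * u[k] for k in range(n)) for i in range(n)]
    if w[j] == 0:
        raise ValueError("updated matrix is singular")
    row_j = [x / w[j] for x in inv[j]]
    new_inv = [
        row_j if i == j
        else [inv[i][k] - w[i] * row_j[k] for k in range(n)]
        for i in range(n)
    ]
    return det * w[j], new_inv


# Usage: one full O(n^3) computation, then a cheap column update.
A = [[2, 0, 1], [1, 3, 0], [0, 1, 4]]
det_a, inv_a = det_and_inverse(A)                       # det_a == 25
det_b, inv_b = replace_column(det_a, inv_a, 1, [1, 1, 1])  # det_b == 5
```

Exact `Fraction` arithmetic keeps the sketch free of rounding concerns. A floating-point version would need care with determinant signs near zero, which matter in geometric predicates.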

arXiv.org e-Print Archive

DI-fusion

Estimating Cardinalities with Deep Sketches

Author: Boncz Peter
Kemper Alfons
Kipf Andreas
Kipf Thomas
Leis Viktor
Müller Jonas
Neumann Thomas
Radke Bernhard
Vorona Dimitri
Publication date: 17/04/2019

We introduce Deep Sketches, which are compact models of databases that allow us to estimate the result sizes of SQL queries. Deep Sketches are powered by a new deep learning approach to cardinality estimation that can capture correlations between columns, even across tables. Our demonstration allows users to define such sketches on the TPC-H and IMDb datasets, monitor the training process, and run ad-hoc queries against trained sketches. We also estimate query cardinalities with HyPer and PostgreSQL to visualize the gains over traditional cardinality estimators. Comment: To appear in SIGMOD'1

arXiv.org e-Print Archive

Code Generation for Efficient Query Processing in Managed Runtimes

Author: Bierman Gavin M.
Nagel Fabian
Viglas Stratis D.
Publication date: 01/01/2014

In this paper we examine opportunities arising from the convergence of two trends in data management: in-memory database systems (IMDBs), which have received renewed attention following the availability of affordable, very large main memory systems; and language-integrated query, which transparently integrates database queries with programming languages (thus addressing the famous ‘impedance mismatch’ problem). Language-integrated query not only gives application developers a more convenient way to query external data sources like IMDBs, but also lets them use the same querying language to query an application’s in-memory collections. The latter offers further transparency to developers as the query language and all data are represented in the data model of the host programming language. However, compared to IMDBs, this additional freedom comes at a higher cost for query evaluation. Our vision is to improve in-memory query processing of application objects by introducing database technologies to managed runtimes. We focus on querying and we leverage query compilation to improve query processing on application objects. We explore different query compilation strategies and study how they improve the performance of query processing over application data. We take C# as the host programming language as it supports language-integrated query through the LINQ framework. Our techniques deliver significant performance improvements over the default LINQ implementation. Our work makes important first steps towards a future where data processing applications will commonly run on machines that can store their entire datasets in-memory, and will be written in a single programming language employing language-integrated query and IMDB-inspired runtimes to provide transparent and highly efficient querying.

CiteSeerX