Building Efficient Query Engines in a High-Level Language
Abstraction without regret refers to the vision of using high-level
programming languages for systems development without experiencing a negative
impact on performance. A database system designed according to this vision
offers both increased productivity and high performance, instead of sacrificing
the former for the latter as is the case with existing, monolithic
implementations that are hard to maintain and extend. In this article, we
realize this vision in the domain of analytical query processing. We present
LegoBase, a query engine written in the high-level language Scala. The key
technique to regain efficiency is to apply generative programming: LegoBase
performs source-to-source compilation and optimizes the entire query engine by
converting the high-level Scala code to specialized, low-level C code. We show
how generative programming makes it easy to implement a wide spectrum of
optimizations, such as introducing data partitioning or switching from a row to
a column data layout, which are difficult to achieve with existing low-level
query compilers that handle only queries. We demonstrate that sufficiently
powerful abstractions are essential for dealing with the complexity of the
optimization effort, shielding developers from compiler internals and
decoupling individual optimizations from each other. We evaluate our approach
with the TPC-H benchmark and show that: (a) With all optimizations enabled,
LegoBase significantly outperforms a commercial database and an existing query
compiler. (b) Programmers need to provide just a few hundred lines of
high-level code for implementing the optimizations, instead of complicated
low-level code that is required by existing query compilation approaches. (c)
The compilation overhead is low compared to the overall execution time, thus
making our approach usable in practice for compiling query engines.
Parameterized Compilation Lower Bounds for Restricted CNF-formulas
We show unconditional parameterized lower bounds in the area of knowledge
compilation, more specifically on the size of circuits in decomposable negation
normal form (DNNF) that encode CNF-formulas restricted by several graph width
measures. In particular, we show that
- there are CNF formulas of size and modular incidence treewidth
whose smallest DNNF-encoding has size , and
- there are CNF formulas of size and incidence neighborhood diversity
whose smallest DNNF-encoding has size .
These results complement recent upper bounds for compiling CNF into DNNF and
strengthen, quantitatively and qualitatively, known conditional lower
bounds for cliquewidth. Moreover, they show that, unlike for many graph
problems, the parameters considered here behave significantly differently from
treewidth.
From Query to Usable Code: An Analysis of Stack Overflow Code Snippets
Enriched by natural language texts, Stack Overflow code snippets are an
invaluable code-centric knowledge base of small units of source code. Besides
being useful for software developers, these annotated snippets can potentially
serve as the basis for automated tools that provide working code solutions to
specific natural language queries.
With the goal of developing automated tools with the Stack Overflow snippets
and surrounding text, this paper investigates the following questions: (1) How
usable are the Stack Overflow code snippets? and (2) When using text search
engines for matching on the natural language questions and answers around the
snippets, what percentage of the top results contain usable code snippets?
A total of 3M code snippets are analyzed across four languages: C#, Java,
JavaScript, and Python. Python and JavaScript proved to be the languages for
which the most code snippets are usable. Conversely, Java and C# proved to be
the languages with the lowest usability rate. Further qualitative analysis on
usable Python snippets shows the characteristics of the answers that solve the
original question. Finally, we use Google search to investigate the alignment
of usability and the natural language annotations around code snippets, and
explore how to make snippets in Stack Overflow an adequate base for future
automatic program generation.
Comment: 13th IEEE/ACM International Conference on Mining Software
Repositories, 11 pages
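As a rough illustration of what a usability check on snippets could look like, the sketch below tests whether Python snippets parse and reports the fraction that do. Treating "parses without a syntax error" as a proxy for usability is an assumption made here for illustration; the abstract does not state the criteria the paper uses. The function names and sample snippets are hypothetical.

```python
import ast


def is_parsable(snippet):
    """Return True if the snippet parses as Python source (assumed usability proxy)."""
    try:
        ast.parse(snippet)
        return True
    except (SyntaxError, ValueError):
        # SyntaxError: malformed code; ValueError: e.g. null bytes in the source.
        return False


def parsable_rate(snippets):
    """Fraction of snippets that parse; 0.0 for an empty collection."""
    if not snippets:
        return 0.0
    return sum(is_parsable(s) for s in snippets) / len(snippets)


# Hypothetical snippets, standing in for code blocks extracted from answers.
samples = [
    "print('hello')",              # complete statement
    "for x in range(3) print(x)",  # missing colon
    ">>> x = 1",                   # interpreter prompt pasted with the code
    "def f(a):\n    return a + 1",  # complete function
]

print(parsable_rate(samples))
```

A parse check is the cheapest of the possible tests; stricter notions such as "runs without error" would need a sandboxed interpreter and are not covered by this sketch.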
Grover's search algorithm: An optical approach
The essential operations of a quantum computer can be accomplished using
solely optical elements, with different polarization or spatial modes
representing the individual qubits. We present a simple all-optical
implementation of Grover's algorithm for efficient searching, in which a
database of four elements is searched with a single query. By 'compiling' the
actual setup, we have reduced the required number of optical elements from 24
to only 12. We discuss the extension to larger databases, and the limitations
of these techniques.
Comment: 6 pages, 5 figures. To appear in a special issue of the Journal of
Modern Optics -- "The Physics of Quantum Information"
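The claim that a four-element database can be searched with a single query can be checked with a small classical simulation of the amplitudes. The sketch below is a plain numerical illustration of one Grover iteration, not a model of the optical setup described in the abstract; the function name `grover_four` is hypothetical.

```python
def grover_four(marked):
    """Simulate one Grover iteration on a 4-element database.

    Returns the measurement probability of each element.
    """
    n = 4
    # Start in the uniform superposition: every element has amplitude 1/2.
    amps = [0.5] * n
    # Oracle (the single query): flip the sign of the marked amplitude.
    amps[marked] = -amps[marked]
    # Diffusion step: reflect every amplitude about the mean amplitude.
    mean = sum(amps) / n
    amps = [2 * mean - a for a in amps]
    # Measurement probabilities are the squared amplitudes.
    return [a * a for a in amps]


for target in range(4):
    print(target, grover_four(target))
```

After the oracle the mean amplitude is 1/4, so the reflection sends each unmarked amplitude to 2(1/4) - 1/2 = 0 and the marked one to 2(1/4) + 1/2 = 1. The marked element is therefore found with certainty after one query, which is special to the four-element case; larger databases need more iterations.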