Design of an intermediate representation for query languages
Data-oriented applications, usually written in a high-level, general-purpose
programming language (such as Java), interact with databases through a coarse
interface. Informally, the text of a query is built on the application side
(either via plain string concatenation or through an abstract notion of
statement) and shipped to the database over the wire where it is executed. The
results are then serialized and sent back to the "client-code" where they are
translated into the language's native datatypes. This round trip is detrimental
to performance but, worse, such a programming model prevents one from having
richer queries, namely queries containing user-defined functions (that is,
functions defined by the programmer and used e.g. in the filter condition of a
SQL query). While some databases also possess a "server-side" language (e.g.
PL/SQL in Oracle Database), its integration with the highly optimized query
execution engine is still minimal and queries containing (PL/SQL) user-defined
functions remain notoriously inefficient. In this setting, we reviewed existing
language-integrated query frameworks, highlighting that database query
languages (including SQL) share high-level querying primitives (e.g.,
filtering, joins, aggregation) that can be represented by operators, but differ
widely regarding the semantics of their expression language. In order to
represent queries in an application language- and database-agnostic manner, we
designed a small calculus, dubbed "QIR" for Query Intermediate Representation.
QIR contains expressions, corresponding to a small extension of the pure
lambda-calculus, and operators to represent usual querying primitives. In an
effort to send efficient queries to the database, we abstracted the idea of
"good" query representations as a measure on QIR terms. Then, we designed an
evaluation strategy rewriting QIR query representations into "better" ones
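
To make the idea of a calculus with a measure and a rewriting strategy more concrete, here is a minimal Python sketch. It is not the QIR definition: the constructor names (Var, Const, Lam, App, Scan, Filter), the measure (the number of application nodes left in a term) and the strategy (plain beta-reduction) are all assumptions chosen for illustration, since the abstract does not specify them.

```python
from dataclasses import dataclass

# Hypothetical term constructors; names are illustrative, not QIR's own syntax.
@dataclass(frozen=True)
class Var:
    name: str

@dataclass(frozen=True)
class Const:
    value: object

@dataclass(frozen=True)
class Lam:            # expression: lambda abstraction
    param: str
    body: object

@dataclass(frozen=True)
class App:            # expression: function application
    fn: object
    arg: object

@dataclass(frozen=True)
class Scan:           # operator: read a table
    table: str

@dataclass(frozen=True)
class Filter:         # operator: keep the rows of `source` satisfying `pred`
    pred: object
    source: object

def substitute(term, name, value):
    """Replace free occurrences of `name` by `value`.
    Assumes `value` is a closed term, so no variable capture can occur."""
    if isinstance(term, Var):
        return value if term.name == name else term
    if isinstance(term, Lam):
        if term.param == name:  # `name` is shadowed below this binder
            return term
        return Lam(term.param, substitute(term.body, name, value))
    if isinstance(term, App):
        return App(substitute(term.fn, name, value),
                   substitute(term.arg, name, value))
    if isinstance(term, Filter):
        return Filter(substitute(term.pred, name, value),
                      substitute(term.source, name, value))
    return term  # Const and Scan contain no variables

def measure(term):
    """Assumed measure: number of application nodes in the term (lower is better)."""
    if isinstance(term, App):
        return 1 + measure(term.fn) + measure(term.arg)
    if isinstance(term, Lam):
        return measure(term.body)
    if isinstance(term, Filter):
        return measure(term.pred) + measure(term.source)
    return 0

def rewrite(term):
    """Assumed strategy: beta-reduce wherever an abstraction is applied.
    May not terminate on arbitrary terms; fine for this small example."""
    if isinstance(term, App):
        fn, arg = rewrite(term.fn), rewrite(term.arg)
        if isinstance(fn, Lam):
            return rewrite(substitute(fn.body, fn.param, arg))
        return App(fn, arg)
    if isinstance(term, Lam):
        return Lam(term.param, rewrite(term.body))
    if isinstance(term, Filter):
        return Filter(rewrite(term.pred), rewrite(term.source))
    return term

# An application-side helper that builds a filter, applied to its two arguments.
make_filter = Lam("p", Lam("t", Filter(Var("p"), Var("t"))))
query = App(App(make_filter, Lam("row", Const(True))), Scan("users"))
better = rewrite(query)
```

In this sketch, `query` has measure 2 (two applications introduced by the helper), while `better` is a bare `Filter` over `Scan("users")` with measure 0, that is, a term built only from the operator and a predicate, with no leftover application-side plumbing.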