Building Efficient Query Engines in a High-Level Language
Abstraction without regret refers to the vision of using high-level
programming languages for systems development without experiencing a negative
impact on performance. A database system designed according to this vision
offers both increased productivity and high performance, instead of sacrificing
the former for the latter as is the case with existing, monolithic
implementations that are hard to maintain and extend. In this article, we
realize this vision in the domain of analytical query processing. We present
LegoBase, a query engine written in the high-level language Scala. The key
technique to regain efficiency is to apply generative programming: LegoBase
performs source-to-source compilation and optimizes the entire query engine by
converting the high-level Scala code to specialized, low-level C code. We show
how generative programming makes it easy to implement a wide spectrum of
optimizations, such as introducing data partitioning or switching from a row to
a column data layout, which are difficult to achieve with existing low-level
query compilers that handle only queries. We demonstrate that sufficiently
powerful abstractions are essential for dealing with the complexity of the
optimization effort, shielding developers from compiler internals and
decoupling individual optimizations from each other. We evaluate our approach
with the TPC-H benchmark and show that: (a) With all optimizations enabled,
LegoBase significantly outperforms a commercial database and an existing query
compiler. (b) Programmers need to provide just a few hundred lines of
high-level code for implementing the optimizations, instead of complicated
low-level code that is required by existing query compilation approaches. (c)
The compilation overhead is low compared to the overall execution time, thus
making our approach usable in practice for compiling query engines.
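
As a rough illustration of the row-to-column layout switch mentioned in this abstract, the following is a minimal Python sketch. It is not LegoBase code (LegoBase is written in Scala and emits C); the table, the field names, and the helper functions are all hypothetical. It only shows that the same aggregate can be computed over either layout, while the column layout lets the aggregate touch a single field's values.

```python
from typing import Dict, List, Tuple

# Hypothetical toy table: each row is (order_id, quantity, price).
Row = Tuple[int, int, float]


def to_columns(rows: List[Row]) -> Dict[str, list]:
    """Convert a row layout (list of tuples) into a column layout (one list per field)."""
    ids, quantities, prices = [], [], []
    for order_id, quantity, price in rows:
        ids.append(order_id)
        quantities.append(quantity)
        prices.append(price)
    return {"order_id": ids, "quantity": quantities, "price": prices}


def total_quantity_rows(rows: List[Row]) -> int:
    """Aggregate over the row layout: every tuple is visited to read one field."""
    return sum(row[1] for row in rows)


def total_quantity_columns(cols: Dict[str, list]) -> int:
    """Aggregate over the column layout: only the 'quantity' list is scanned."""
    return sum(cols["quantity"])


rows = [(1, 5, 9.5), (2, 3, 20.0), (3, 7, 1.25)]
cols = to_columns(rows)
```

Both `total_quantity_rows(rows)` and `total_quantity_columns(cols)` return the same total; the difference lies only in which data each function has to read.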
Open Programming Language Interpreters
Context: This paper presents the concept of open programming language
interpreters and the implementation of a framework-level metaobject protocol
(MOP) to support them. Inquiry: We address the problem of dynamic interpreter
adaptation to tailor the interpreter's behavior to the task to be solved and to
introduce new features to fulfill unforeseen requirements. Many languages
provide a MOP that to some degree supports reflection. However, MOPs are
typically language-specific, their reflective functionality is often
restricted, and the adaptation and application logic are often mixed, which
makes the source code harder to understand and maintain. Our system
overcomes these limitations. Approach: We designed and implemented a system to
support open programming language interpreters. The prototype implementation is
integrated in the Neverlang framework. The system exposes the structure,
behavior and the runtime state of any Neverlang-based interpreter with the
ability to modify it. Knowledge: Our system provides complete control over
the interpreter's structure, behavior and its runtime state. The approach is
applicable to every Neverlang-based interpreter. Adaptation code can
potentially be reused across different language implementations. Grounding:
Having a prototype implementation, we focused on feasibility evaluation. The
paper shows that our approach effectively addresses problems commonly found in the
research literature. We have a demonstrative video and examples that illustrate
our approach on dynamic software adaptation, aspect-oriented programming,
debugging and context-aware interpreters. Importance: To our knowledge, our
paper presents the first reflective approach targeting a general framework for
language development. Our system provides full reflective support for free to
any Neverlang-based interpreter. We are not aware of any prior application of
open implementations to programming language interpreters in the sense defined
in this paper. Rather than substituting other approaches, we believe our system
can be used as a complementary technique in situations where other approaches
present serious limitations.
A Monitoring Language for Run Time and Post-Mortem Behavior Analysis and Visualization
UFO is a new implementation of FORMAN, a declarative monitoring language, in
which rules are compiled into execution monitors that run on a virtual machine
supported by the Alamo monitor architecture.
Comment: In M. Ronsse, K. De Bosschere (eds), proceedings of the Fifth
International Workshop on Automated Debugging (AADEBUG 2003), September 2003,
Ghent. cs.SE/030902
Is One Hyperparameter Optimizer Enough?
Hyperparameter tuning is the black art of automatically finding a good
combination of control parameters for a data miner. While widely applied in
empirical Software Engineering, there has not been much discussion on which
hyperparameter tuner is best for software analytics. To address this gap in the
literature, this paper applied a range of hyperparameter optimizers (grid
search, random search, differential evolution, and Bayesian optimization) to
the defect prediction problem. Surprisingly, no hyperparameter optimizer was
observed to be 'best' and, for one of the two evaluation measures studied here
(F-measure), hyperparameter optimization, in 50% of cases, was no better than
using default configurations.
We conclude that hyperparameter optimization is more nuanced than previously
believed. While such optimization can certainly lead to large improvements in
the performance of classifiers used in software analytics, it remains to be
seen which specific optimizers should be applied to a new dataset.
Comment: 7 pages, 2 columns, accepted for SWAN1
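
To make the comparison in this abstract concrete, here is a minimal Python sketch of two of the listed optimizers, grid search and random search, set against a default configuration. It is not the paper's experimental setup: the `score` function is a hypothetical stand-in for a learner's validation score, and the parameter names, ranges, and default values are assumptions chosen only for illustration.

```python
import itertools
import random


def score(depth: int, rate: float) -> float:
    """Hypothetical stand-in for a validation score; it peaks at depth=4, rate=0.3."""
    return -((depth - 4) ** 2) - 10 * (rate - 0.3) ** 2


def grid_search(depths, rates):
    """Evaluate every combination on a fixed grid and keep the best one."""
    return max(itertools.product(depths, rates), key=lambda cfg: score(*cfg))


def random_search(depth_range, rate_range, trials, seed=0):
    """Sample configurations uniformly at random and keep the best one."""
    rng = random.Random(seed)
    candidates = [
        (rng.randint(*depth_range), rng.uniform(*rate_range))
        for _ in range(trials)
    ]
    return max(candidates, key=lambda cfg: score(*cfg))


default = (3, 0.1)  # assumed "default configuration" for the toy learner
best_grid = grid_search([2, 4, 6], [0.1, 0.3, 0.5])
best_random = random_search((1, 8), (0.0, 1.0), trials=20)
```

Comparing `score(*default)`, `score(*best_grid)`, and `score(*best_random)` mirrors, on a toy scale, the kind of comparison the abstract describes; which optimizer comes out ahead depends on the objective and the budget.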
Apache Calcite: A Foundational Framework for Optimized Query Processing Over Heterogeneous Data Sources
Apache Calcite is a foundational software framework that provides query
processing, optimization, and query language support to many popular
open-source data processing systems such as Apache Hive, Apache Storm, Apache
Flink, Druid, and MapD. Calcite's architecture consists of a modular and
extensible query optimizer with hundreds of built-in optimization rules, a
query processor capable of processing a variety of query languages, an adapter
architecture designed for extensibility, and support for heterogeneous data
models and stores (relational, semi-structured, streaming, and geospatial).
This flexible, embeddable, and extensible architecture is what makes Calcite an
attractive choice for adoption in big-data frameworks. It is an active project
that continues to introduce support for new types of data sources, query
languages, and approaches to query processing and optimization.
Comment: SIGMOD'1
Sawja: Static Analysis Workshop for Java
Static analysis is a powerful technique for automatic verification of
programs but raises major engineering challenges when developing a full-fledged
analyzer for a realistic language such as Java. This paper describes the Sawja
library: a static analysis framework fully compliant with Java 6 which provides
OCaml modules for efficiently manipulating Java bytecode programs. We present
the main features of the library, including (i) efficient functional
data-structures for representing programs with implicit sharing and lazy
parsing, (ii) an intermediate stack-less representation, and (iii) fast
computation and manipulation of complete programs.