27,964 research outputs found
Query Optimization to Improve Performance of the Code Execution
Object-Oriented Programming (OOP) is one of the most successful techniques for abstraction. Bundling together objects into collections of objects, and then operating on these collections, is a fundamental part of main stream object-oriented programming languages. Object querying is an abstraction of operations over collections, whereas manual implementations are performed at low level which forces the developers to specify how a task must be done. Some object-oriented languages allow the programmers to express queries explicitly in the code, which are optimized using the query optimization techniques from the database domain. In this regard, we have developed a technique that performs query optimization at compile-time to reduce the burden of optimization at run-time to improve the performance of the code execution. Keywords- Querying; joins; compile time; run-time; histograms; query optimizatio
Exploring run-time reduction in programming codes via query optimization and caching
Object oriented programming languages raised the level of abstraction by supporting the explicit first class query constructs in the programming codes. These query constructs allow programmers to express operations on collections more abstractly than relying on their realization in loops or through provided libraries. Join optimization techniques from the field of database technology support efficient realizations of such language constructs. However, the problem associated with the existing techniques such as query optimization in Java Query Language (JQL) incurs run time overhead. Besides the programming languages supporting first-class query constructs, the usage of annotations has also increased in the software engineering community recently. Annotations are a common means of providing metadata information to the source code. The object oriented programming languages such as C# provides attributes constraints and Java has its own annotation constructs that allow the developers to include the metadata information in the program codes. This work introduces a series of query optimization approaches to reduce the run time of the programs involving explicit queries over collections. The proposed approaches rely on histograms to estimate the selectivity of the predicates and the joins in order to construct the query plans. The annotations in the source code are also utilized to gather the metadata required for the selectivity estimation of the numerical as well as the string valued predicates and joins in the queries. Several cache heuristics are proposed that effectively cache the results of repeated queries in the program codes. The cached query results are incrementally maintained up-to-date after the update operations to the collections --Abstract, page iv
From Nested-Loop to Join Queries in OODB
Most declarative SQL-like query languages for object-oriented database systems are orthogonal languages allowing for arbitrary nesting of expressions in the select-, from-, and where-clause. Expressions in the from-clause may be base tables as well as set-valued attributes. In this paper, we propose a general strategy for the optimization of nested OOSQL queries. As in the relational model, the translation/optimization goal is to move from tuple- to set-oriented query processing. Therefore, OOSQL is translated into the algebraic language ADL, and by means of algebraic rewriting nested queries are transformed into join queries as far as possible. Three different optimization options are described, and a strategy to assign priorities to options is proposed
Flattening an object algebra to provide performance
Algebraic transformation and optimization techniques have been the method of choice in relational query execution, but applying them in object-oriented (OO) DBMSs is difficult due to the complexity of OO query languages. This paper demonstrates that the problem can be simplified by mapping an OO data model to the binary relational model implemented by Monet, a state-of-the-art database kernel. We present a generic mapping scheme to flatten data models and study the case of straightforward OO model. We show how flattening enabled us to implement a query algebra, using only a very limited set of simple operations. The required primitives and query execution strategies are discussed, and their performance is evaluated on the 1-GByte TPC-D (Transaction-processing Performance Council's Benchmark D), showing that our divide-and-conquer approach yields excellent result
Transformation of structure-shy programs with application to XPath queries and strategic functions
Various programming languages allow the construction of structure-shy programs. Such programs are defined generically for many different datatypes and only specify specific behavior for a few relevant subtypes. Typical examples are XML query languages that allow selection of subdocuments without exhaustively specifying intermediate element tags. Other examples are languages and libraries for polytypic or strategic functional programming and for adaptive object-oriented programming.
In this paper, we present an algebraic approach to transformation of declarative structure-shy programs, in particular for strategic functions and XML queries. We formulate a rich set of algebraic laws, not just for transformation of structure-shy programs, but also for their conversion into structure-sensitive programs and vice versa. We show how subsets of these laws can be used to construct effective rewrite systems for specialization, generalization, and optimization of structure-shy programs. We present a type-safe encoding of these rewrite systems in Haskell which itself uses strategic functional programming techniques. We discuss the application of these rewrite systems for XPath query optimization and for query migration in the context of schema evolution
Building Efficient Query Engines in a High-Level Language
Abstraction without regret refers to the vision of using high-level
programming languages for systems development without experiencing a negative
impact on performance. A database system designed according to this vision
offers both increased productivity and high performance, instead of sacrificing
the former for the latter as is the case with existing, monolithic
implementations that are hard to maintain and extend. In this article, we
realize this vision in the domain of analytical query processing. We present
LegoBase, a query engine written in the high-level language Scala. The key
technique to regain efficiency is to apply generative programming: LegoBase
performs source-to-source compilation and optimizes the entire query engine by
converting the high-level Scala code to specialized, low-level C code. We show
how generative programming allows to easily implement a wide spectrum of
optimizations, such as introducing data partitioning or switching from a row to
a column data layout, which are difficult to achieve with existing low-level
query compilers that handle only queries. We demonstrate that sufficiently
powerful abstractions are essential for dealing with the complexity of the
optimization effort, shielding developers from compiler internals and
decoupling individual optimizations from each other. We evaluate our approach
with the TPC-H benchmark and show that: (a) With all optimizations enabled,
LegoBase significantly outperforms a commercial database and an existing query
compiler. (b) Programmers need to provide just a few hundred lines of
high-level code for implementing the optimizations, instead of complicated
low-level code that is required by existing query compilation approaches. (c)
The compilation overhead is low compared to the overall execution time, thus
making our approach usable in practice for compiling query engines
Efficient and Reasonable Object-Oriented Concurrency
Making threaded programs safe and easy to reason about is one of the chief
difficulties in modern programming. This work provides an efficient execution
model for SCOOP, a concurrency approach that provides not only data race
freedom but also pre/postcondition reasoning guarantees between threads. The
extensions we propose influence both the underlying semantics to increase the
amount of concurrent execution that is possible, exclude certain classes of
deadlocks, and enable greater performance. These extensions are used as the
basis an efficient runtime and optimization pass that improve performance 15x
over a baseline implementation. This new implementation of SCOOP is also 2x
faster than other well-known safe concurrent languages. The measurements are
based on both coordination-intensive and data-manipulation-intensive benchmarks
designed to offer a mixture of workloads.Comment: Proceedings of the 10th Joint Meeting of the European Software
Engineering Conference and the ACM SIGSOFT Symposium on the Foundations of
Software Engineering (ESEC/FSE '15). ACM, 201
A Data Transformation System for Biological Data Sources
Scientific data of importance to biologists in the Human Genome Project resides not only in conventional databases, but in structured files maintained in a number of different formats (e.g. ASN.1 and ACE) as well a.s sequence analysis packages (e.g. BLAST and FASTA). These formats and packages contain a number of data types not found in conventional databases, such as lists and variants, and may be deeply nested. We present in this paper techniques for querying and transforming such data, and illustrate their use in a prototype system developed in conjunction with the Human Genome Center for Chromosome 22. We also describe optimizations performed by the system, a crucial issue for bulk data
- âŠ