Intelligent SPARQL Endpoints: Optimizing Execution Performance by Automatic Query Relaxation and Queue Scheduling
The Web of Data is widely considered as one of the major global repositories populated with countless interconnected and structured data, prompting these linked datasets to be continuously and sharply increasing. In this context the so-called SPARQL Protocol and RDF Query Language is commonly used to retrieve and manage stored data by means of SPARQL endpoints, a query processing service especially designed to get access to these databases. Nevertheless, due to the large amount of data tackled by such endpoints and their structural complexity, these services usually suffer from severe performance issues, including inadmissible processing times. This work aims at overcoming this noted inefficiency by designing a distributed parallel system architecture that improves the performance of SPARQL endpoints by incorporating two functionalities: 1) a queuing system to avoid bottlenecks during the execution of SPARQL queries; and 2) an intelligent relaxation of the queries submitted to the endpoint at hand whenever the relaxation itself and the consequently lowered complexity of the query are beneficial for the overall performance of the system. To this end the system relies on a two-fold optimization criterion: the minimization of the query running time, as predicted by a supervised learning model; and the maximization of the quality of the results of the query as quantified by a measure of similarity. These two conflicting optimization criteria are efficiently balanced by two bi-objective heuristic algorithms sequentially executed over groups of SPARQL queries. The approach is validated on a prototype and several experiments that evince the applicability of the proposed scheme.
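The two-fold criterion in this abstract (lower predicted running time, higher result similarity) can be illustrated with a plain Pareto filter over candidate relaxations. This is only a sketch of the bi-objective idea, not the heuristic algorithms the paper uses; the `Candidate` type, its field names, and all numbers are hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Candidate:
    """A query variant with two objective values (hypothetical fields)."""
    name: str
    predicted_seconds: float  # assumed output of some runtime model; lower is better
    similarity: float         # assumed similarity to the original results; higher is better


def dominates(a: Candidate, b: Candidate) -> bool:
    """True if a is at least as good as b on both objectives and strictly better on one."""
    no_worse = a.predicted_seconds <= b.predicted_seconds and a.similarity >= b.similarity
    strictly_better = a.predicted_seconds < b.predicted_seconds or a.similarity > b.similarity
    return no_worse and strictly_better


def pareto_front(candidates: list[Candidate]) -> list[Candidate]:
    """Keep only candidates that no other candidate dominates."""
    return [c for c in candidates if not any(dominates(o, c) for o in candidates)]


candidates = [
    Candidate("original", 10.0, 1.0),
    Candidate("relaxed_a", 2.0, 0.8),
    Candidate("relaxed_b", 3.0, 0.7),  # slower and less similar than relaxed_a
    Candidate("relaxed_c", 1.0, 0.5),
]
front = pareto_front(candidates)
```

Here `relaxed_b` drops out because `relaxed_a` is both faster and more similar; the remaining variants are the trade-offs a scheduler would still have to choose between.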
Deductive Optimization of Relational Data Storage
Optimizing physical data storage and data retrieval are two key
database management problems. In this paper, we propose a language that can
express a wide range of physical database layouts, going well beyond the row-
and column-based methods that are widely used in database management systems.
We use deductive synthesis to turn a high-level relational representation of a
database query into a highly optimized low-level implementation which operates
on a specialized layout of the dataset. We build a compiler for this language
and conduct experiments using a popular database benchmark, which show that
the performance of these specialized queries is competitive with a
state-of-the-art in-memory compiled database system.
Open issues in semantic query optimization in relational DBMS
After two decades of research into Semantic Query Optimization (SQO) there is clear agreement as to the efficacy of SQO. However, although there are some experimental implementations there are still no commercial implementations. We first present a thorough analysis of research into SQO. We identify three problems which inhibit the effective use of SQO in Relational Database Management Systems (RDBMS). We then propose solutions to these problems and describe first steps towards the implementation of an effective semantic query optimizer for relational databases.
A Survey on Array Storage, Query Languages, and Systems
Since scientific investigation is one of the most important providers of
massive amounts of ordered data, there is a renewed interest in array data
processing in the context of Big Data. To the best of our knowledge, a unified
resource that summarizes and analyzes array processing research over its long
existence is currently missing. In this survey, we provide a guide for past,
present, and future research in array processing. The survey is organized along
three main topics. Array storage discusses all the aspects related to array
partitioning into chunks. The identification of a reduced set of array
operators to form the foundation for an array query language is analyzed across
multiple such proposals. Lastly, we survey real systems for array processing.
The result is a thorough survey on array data storage and processing that
should be consulted by anyone interested in this research topic, independent of
experience level. The survey is not complete though. We greatly appreciate
pointers towards any work we might have forgotten to mention.
Comment: 44 pages
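As a small illustration of what partitioning an array into chunks means, the sketch below assigns cells to regular, fixed-shape chunks by integer division of their coordinates. Regular chunking is assumed here for simplicity and is not presented as the survey's own scheme; the function names are hypothetical.

```python
def chunk_of(cell: tuple[int, ...], chunk_shape: tuple[int, ...]) -> tuple[int, ...]:
    """Return the index of the regular chunk that contains the given cell."""
    return tuple(c // s for c, s in zip(cell, chunk_shape))


def cells_to_chunks(cells, chunk_shape):
    """Group cell coordinates by the chunk they fall into."""
    chunks: dict[tuple[int, ...], list[tuple[int, ...]]] = {}
    for cell in cells:
        chunks.setdefault(chunk_of(cell, chunk_shape), []).append(cell)
    return chunks


# A 2-D example with 4x4 chunks.
grouped = cells_to_chunks([(0, 0), (3, 3), (4, 0)], (4, 4))
```

Cells `(0, 0)` and `(3, 3)` land in the same chunk, while `(4, 0)` crosses the boundary along the first dimension and goes to the next one.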
Progressive Processing of Continuous Range Queries in Hierarchical Wireless Sensor Networks
In this paper, we study the problem of processing continuous range queries in
a hierarchical wireless sensor network. Contrasted with the traditional
approach of building networks in a "flat" structure using sensor devices of the
same capability, the hierarchical approach deploys devices of higher capability
in a higher tier, i.e., a tier closer to the server. While query processing in
flat sensor networks has been widely studied, the study on query processing in
hierarchical sensor networks has been inadequate. In wireless sensor networks,
the main costs that should be considered are the energy for sending data and
the storage for storing queries. There is a trade-off between these two costs.
Based on this, we first propose a progressive processing method that
effectively processes a large number of continuous range queries in
hierarchical sensor networks. The proposed method uses the query merging
technique proposed by Xiang et al. as the basis and additionally considers the
trade-off between the two costs. More specifically, it works toward reducing
the storage cost at lower-tier nodes by merging more queries, and toward
reducing the energy cost at higher-tier nodes by merging fewer queries (thereby
reducing "false alarms"). We then present how to build a hierarchical sensor
network that is optimal with respect to the weighted sum of the two costs. It
allows for a cost-based systematic control of the trade-off based on the
relative importance between the storage and energy in a given network
environment and application. Experimental results show that the proposed method
achieves a near-optimal control between the storage and energy and reduces the
cost by 0.989~84.995 times compared with the cost achieved using the flat
(i.e., non-hierarchical) setup as in the work by Xiang et al.
Comment: 41 pages, 20 figures
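The storage/energy trade-off in this abstract can be made concrete with a toy model. The sketch below assumes that merging two range queries means replacing them with their bounding range, which may not match the technique of Xiang et al.; the function names and the weight `w` are hypothetical.

```python
def merge_ranges(a: tuple[float, float], b: tuple[float, float]) -> tuple[float, float]:
    """Replace two range queries with one bounding range (fewer stored queries)."""
    return (min(a[0], b[0]), max(a[1], b[1]))


def is_false_alarm(value: float, merged: tuple[float, float], originals) -> bool:
    """A reading that matches the merged range but none of the original queries."""
    lo, hi = merged
    in_merged = lo <= value <= hi
    in_any_original = any(l <= value <= h for l, h in originals)
    return in_merged and not in_any_original


def weighted_cost(storage: float, energy: float, w: float) -> float:
    """Weighted sum of the two costs; w in [0, 1] is the relative weight of storage."""
    return w * storage + (1 - w) * energy


originals = [(10, 20), (30, 40)]
merged = merge_ranges(*originals)
```

A reading of 25 matches the merged range `(10, 40)` but neither original query, so it would be transmitted needlessly: merging saves storage at the price of energy, which is the tension the weighted sum is meant to balance.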