Open issues in semantic query optimization in relational DBMS
After two decades of research into Semantic Query Optimization (SQO) there is clear agreement as to the efficacy of SQO. However, although there are some experimental implementations, there are still no commercial implementations. We first present a thorough analysis of research into SQO. We identify three problems which inhibit the effective use of SQO in Relational Database Management Systems (RDBMS). We then propose solutions to these problems and describe first steps towards the implementation of an effective semantic query optimizer for relational databases.
Robust query processing for linked data fragments
Linked Data Fragments (LDFs) refer to interfaces that allow for publishing and querying Knowledge Graphs on the Web. These interfaces primarily differ in their expressivity and allow for exploring different trade-offs when balancing the workload between clients and servers in decentralized SPARQL query processing. To devise efficient query plans, clients typically rely on heuristics that leverage the metadata provided by the LDF interface, since obtaining fine-grained statistics from remote sources is a challenging task. However, these heuristics are prone to estimation errors based on the metadata, which can lead to inefficient query executions with a high number of requests, large amounts of data transferred, and, consequently, excessive execution times. In this work, we investigate robust query processing techniques for Linked Data Fragment clients to address these challenges. We first focus on robust plan selection by proposing CROP, a query plan optimizer that explores the cost and robustness of alternative query plans. Then, we address robust query execution by proposing a new class of adaptive operators: Polymorphic Join Operators. These operators adapt their join strategy in response to possible cardinality estimation errors. The results of our first experimental study show that CROP outperforms state-of-the-art clients by exploring alternative plans based on their cost and robustness. In our second experimental study, we investigate how different planning approaches can benefit from polymorphic join operators and find that they enable more efficient query execution in the majority of cases.
Cost-Based Optimization of Integration Flows
Integration flows are increasingly used to specify and execute data-intensive integration tasks between heterogeneous systems and applications. There are many different application areas such as real-time ETL and data synchronization between operational systems. Because of increasing amounts of data, highly distributed IT infrastructures, and high requirements for data consistency and up-to-dateness of query results, many instances of integration flows are executed over time. Due to this high load and blocking synchronous source systems, the performance of the central integration platform is crucial for an IT infrastructure. To tackle these high performance requirements, we introduce the concept of cost-based optimization of imperative integration flows that relies on incremental statistics maintenance and inter-instance plan re-optimization. As a foundation, we introduce the concept of periodical re-optimization including novel cost-based optimization techniques that are tailor-made for integration flows. Furthermore, we refine the periodical re-optimization to on-demand re-optimization in order to overcome the problems of many unnecessary re-optimization steps and of adaptation delays, during which optimization opportunities are missed. This approach ensures low optimization overhead and fast workload adaptation.
Privacy-Preserving Shortest Path Computation
Navigation is one of the most popular cloud computing services. But in
virtually all cloud-based navigation systems, the client must reveal her
location and destination to the cloud service provider in order to learn the
fastest route. In this work, we present a cryptographic protocol for navigation
on city streets that provides privacy for both the client's location and the
service provider's routing data. Our key ingredient is a novel method for
compressing the next-hop routing matrices in networks such as city street maps.
Applying our compression method to the map of Los Angeles, for example, we
achieve over tenfold reduction in the representation size. In conjunction with
other cryptographic techniques, this compressed representation results in an
efficient protocol suitable for fully-private real-time navigation on city
streets. We demonstrate the practicality of our protocol by benchmarking it on
real street map data for major cities such as San Francisco and Washington,
D.C.
Comment: Extended version of NDSS 2016 paper
Don't Treat the Symptom, Find the Cause! Efficient Artificial-Intelligence Methods for (Interactive) Debugging
In the modern world, we are permanently using, leveraging, interacting with,
and relying upon systems of ever higher sophistication, ranging from our cars,
recommender systems in e-commerce, and networks when we go online, to
integrated circuits when using our PCs and smartphones, the power grid to
ensure our energy supply, security-critical software when accessing our bank
accounts, and spreadsheets for financial planning and decision making. The
complexity of these systems coupled with our high dependency on them implies
both a non-negligible likelihood of system failures, and a high potential that
such failures have significant negative effects on our everyday life. For that
reason, it is a vital requirement to keep the harm of emerging failures to a
minimum, which means minimizing the system downtime as well as the cost of
system repair. This is where model-based diagnosis comes into play.
Model-based diagnosis is a principled, domain-independent approach that can
be generally applied to troubleshoot systems of a wide variety of types,
including all the ones mentioned above, and many more. It exploits and
orchestrates i.a. techniques for knowledge representation, automated reasoning,
heuristic problem solving, intelligent search, optimization, stochastics,
statistics, decision making under uncertainty, machine learning, as well as
calculus, combinatorics and set theory to detect, localize, and fix faults in
abnormally behaving systems.
In this thesis, we will give an introduction to the topic of model-based
diagnosis, point out the major challenges in the field, and discuss a selection
of approaches from our research addressing these issues.
Comment: Habilitation Thesis
Rank-aware, Approximate Query Processing on the Semantic Web
Search over the Semantic Web corpus frequently leads to queries having large result sets. So, in order to discover relevant data elements, users must rely on ranking techniques to sort results according to their relevance. At the same time, applications often deal with information needs that do not require complete and exact results. In this thesis, we address the problem of how to process queries over Web data in an approximate and rank-aware fashion.
Responsible Composition and Optimization of Integration Processes under Correctness Preserving Guarantees
Enterprise Application Integration deals with the problem of connecting
heterogeneous applications, and is the centerpiece of current on-premise, cloud
and device integration scenarios. For integration scenarios, structurally
correct composition of patterns into processes and improvements of integration
processes are crucial. In order to achieve this, we formalize compositions of
integration patterns based on their characteristics, and describe optimization
strategies that help to reduce the model complexity, and improve the process
execution efficiency using design time techniques. Using the formalism of timed
DB-nets - a refinement of Petri nets - we model integration logic features such
as control- and data flow, transactional data storage, compensation and
exception handling, and time aspects that are present in recurring solutions
as separate integration patterns. We then propose a realization of optimization
strategies using graph rewriting, and prove that the optimizations we consider
preserve both structural and functional correctness. We evaluate the
improvements on a real-world catalog of pattern compositions, containing over
900 integration processes, and illustrate the correctness properties in case
studies based on two of these processes.
Comment: 37 pages