29 research outputs found
Combining Rewriting and Incremental Materialisation Maintenance for Datalog Programs with Equality
Materialisation precomputes all consequences of a set of facts and a datalog
program so that queries can be evaluated directly (i.e., independently from the
program). Rewriting optimises materialisation for datalog programs with
equality by replacing all equal constants with a single representative; and
incremental maintenance algorithms can efficiently update a materialisation for
small changes in the input facts. Both techniques are critical to practical
applicability of datalog systems; however, we are unaware of an approach that
combines rewriting and incremental maintenance. In this paper we present the
first such combination, and we show empirically that it can speed up updates by
several orders of magnitude compared to using either rewriting or incremental
maintenance in isolation.
Comment: All proofs contained in the appendix. 7 pages + 4 pages appendix. 7
algorithms and one table with evaluation results
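The rewriting idea described in this abstract, replacing all equal constants with a single representative, can be illustrated with a small union-find sketch. This is only a minimal illustration and not the paper's algorithm: the function names, the tuple encoding of facts, and the choice of the smallest constant as representative are all assumptions made for the example, and incremental maintenance is not covered.

```python
# Hypothetical sketch: names and fact encoding are assumptions, not from the paper.

def find(parent, x):
    """Return the representative of x, compressing the path as we go."""
    parent.setdefault(x, x)
    while parent[x] != x:
        parent[x] = parent[parent[x]]  # path halving
        x = parent[x]
    return x

def union(parent, a, b):
    """Merge the equivalence classes of a and b."""
    ra, rb = find(parent, a), find(parent, b)
    if ra != rb:
        # Assumption: pick the smaller constant as representative so the
        # result is deterministic.
        if rb < ra:
            ra, rb = rb, ra
        parent[rb] = ra

def rewrite_facts(facts, equalities):
    """Replace every constant in `facts` by its class representative.

    facts: iterable of tuples (predicate, arg1, arg2, ...)
    equalities: iterable of pairs (a, b) meaning a and b are equal
    """
    parent = {}
    for a, b in equalities:
        union(parent, a, b)
    return {(pred, *(find(parent, c) for c in args)) for pred, *args in facts}

facts = {("knows", "alice", "bob"), ("knows", "al", "bob")}
rewritten = rewrite_facts(facts, [("alice", "al")])
```

In this example the two `knows` facts collapse into one, which is the source of the space and time savings the abstract attributes to rewriting.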
Computing CQ lower-bounds over OWL 2 through approximation to RSA
Conjunctive query (CQ) answering over knowledge bases is an important
reasoning task. However, with expressive ontology languages such as OWL, query
answering is computationally very expensive. The PAGOdA system addresses this
issue by using a tractable reasoner to compute lower and upper-bound
approximations, falling back to a fully-fledged OWL reasoner only when these
bounds don't coincide. The effectiveness of this approach critically depends on
the quality of the approximations, and in this paper we explore a technique for
computing closer approximations via RSA, an ontology language that subsumes all
the OWL 2 profiles while still maintaining tractability. We present a novel
approximation of OWL 2 ontologies into RSA, and an algorithm to compute a
closer (than PAGOdA) lower bound approximation using the RSA combined approach.
We have implemented these algorithms in a prototypical CQ answering system, and
we present a preliminary evaluation of our system that shows significant
performance improvements w.r.t. PAGOdA.
Comment: 26 pages, 1 figure
Inference as a data management problem
Inference over OWL ontologies with large A-Boxes has been researched as a data management problem in recent years. This work adopts the strategy of applying a tableaux-based reasoner for complete T-Box classification, and using a rule-based mechanism for scalable A-Box reasoning. Specifically, we establish for the classified T-Box an inference framework, which can be used to compute and materialise inference results. The inference we focus on is type inference in A-Box reasoning, which we define as the process of deriving for each A-Box instance its memberships of OWL classes and properties. As our approach materialises the inference results, it in general provides faster query processing than non-materialising techniques, at the expense of a larger space requirement and slower update speed. When the A-Box size is suitable for an RDBMS, we compile the inference framework to triggers, which incrementally update the inference materialisation from both data inserts and data deletes, without needing to re-compute the whole inference. More importantly, triggers make inference available as atomic consequences of inserts or deletes, which preserves the ACID properties of transactions; such inference is known as transactional reasoning. When the A-Box size is beyond the capability of an RDBMS, we compile the inference framework to Spark programmes, which provide scalable inference materialisation in a Big Data system, and our evaluation considers reasoning over up to 270 million A-Box facts. Evaluating our work, and comparing with two state-of-the-art reasoners, we empirically verify that our approach is able to perform scalable inference materialisation, and to provide faster query processing with comparable completeness of reasoning.
Open Access
Modular Materialisation of Datalog Programs
The seminaïve algorithm can materialise all consequences of arbitrary
datalog rules, and it also forms the basis for incremental algorithms that
update a materialisation as the input facts change. Certain (combinations of)
rules, however, can be handled much more efficiently using custom algorithms.
To integrate such algorithms into a general reasoning approach that can handle
arbitrary rules, we propose a modular framework for materialisation computation
and its maintenance. We split a datalog program into modules that can be
handled using specialised algorithms, and handle the remaining rules using the
seminaïve algorithm. We also present two algorithms for computing the
transitive and the symmetric-transitive closure of a relation that can be used
within our framework. Finally, we show empirically that our framework can
handle arbitrary datalog programs while outperforming existing approaches,
often by orders of magnitude.
Comment: Accepted at AAAI 201
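As a rough illustration of the seminaïve evaluation this abstract builds on, the sketch below materialises the transitive closure of a relation under the rule path(x, z) :- path(x, y), edge(y, z). It is the generic textbook scheme and not either of the specialised closure algorithms the paper proposes; the function name and the encoding of the relation as a set of pairs are assumptions made for the example.

```python
# Hypothetical sketch of generic seminaive evaluation; not the paper's algorithms.
from collections import defaultdict

def transitive_closure(edges):
    """Materialise path(x, z) :- path(x, y), edge(y, z) seminaively.

    edges: iterable of (source, target) pairs
    """
    edges = set(edges)
    successors = defaultdict(set)  # index edge(y, z) by y for the join
    for y, z in edges:
        successors[y].add(z)

    closure = set(edges)  # all path facts derived so far
    delta = set(edges)    # path facts derived in the previous round only
    while delta:
        # Join only the newly derived facts with edge, so no derivation
        # from older rounds is repeated.
        derived = {(x, z) for (x, y) in delta for z in successors[y]}
        delta = derived - closure
        closure |= delta
    return closure

paths = transitive_closure({(1, 2), (2, 3), (3, 4)})
```

The key point is that each round joins only `delta`, the facts that are new since the previous round, rather than the whole of `closure`.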
Flexible Integration and Efficient Analysis of Multidimensional Datasets from the Web
If numeric data from the Web are brought together, natural scientists can compare climate measurements with estimations, financial analysts can evaluate companies based on balance sheets and daily stock market values, and citizens can explore the GDP per capita from several data sources. However, heterogeneities and size of data remain a problem. This work presents methods to query a uniform view - the Global Cube - of available datasets from the Web and builds on Linked Data query approaches
SUMA: A Partial Materialization-Based Scalable Query Answering in OWL 2 DL
Ontology-mediated querying (OMQ) provides a paradigm for query answering in which users query not only the records in the database but also implicit information inferred from the ontology. A key challenge in OMQ is that the implicit information may be infinite, so it cannot be stored in the database and queried by an off-the-shelf query engine. The commonly adopted technique for dealing with infinite entailments is query rewriting, which, however, incurs the cost of rewriting the query at runtime. In this work, a partial materialization method is proposed that ensures the extension is always finite. Partial materialization does not rewrite the query but instead computes partial consequences entailed by the ontology before online querying. In addition, a query analysis algorithm is designed to ensure the completeness of answering rooted and Boolean conjunctive queries over the partial materialization. We also extend our method, soundly but incompletely, to support the highly expressive ontology language OWL 2 DL. Finally, we further optimize materialization efficiency with a role rewriting algorithm and implement our approach as a prototype system, SUMA, by integrating an efficient off-the-shelf SPARQL query engine. The experiments show that SUMA is complete on each test ontology and each test query, matching Pellet and outperforming PAGOdA. In addition, SUMA is highly scalable on large datasets.
Conjunctive query answering over unrestricted OWL 2 ontologies
Conjunctive Query (CQ) answering is a primary reasoning task over knowledge bases. However, when considering expressive description logics, query answering can be computationally very expensive; reasoners for CQ answering, although heavily optimized, often sacrifice expressive power of the input ontology or completeness of the computed answers in order to achieve tractability and scalability. In this work, we present a hybrid query answering architecture that combines various services to provide a CQ answering service for OWL. Specifically, it combines scalable CQ answering services for tractable languages with a CQ answering service for a more expressive language approaching the full OWL 2. If the query can be fully answered by one of the tractable services, then that service is used, to ensure maximum performance. Otherwise, the tractable services are used to compute lower and upper bound approximations. The union of the lower bounds and the intersection of the upper bounds are then compared. If the bounds do not coincide, then the “gap” answers are checked using the “full” service. These techniques led to the development of two new systems: (i) RSAComb, an efficient implementation of a new tractable answering service for RSA (role safety acyclic); and (ii) ACQuA, a reference implementation of the proposed hybrid architecture combining RSAComb, PAGOdA, and HermiT to provide a CQ answering service for OWL. Our extensive evaluation shows how the additional computational cost introduced by reasoning over a more expressive language like RSA can still provide a significant improvement compared to relying on a fully-fledged reasoner. Additionally, we show how ACQuA can reliably match the performance of PAGOdA, a state-of-the-art CQ answering system that uses a similar approach, and can significantly improve performance when PAGOdA extensively relies on the underlying fully-fledged reasoner.
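The bound-combination step described in this abstract (union of lower bounds, intersection of upper bounds, then checking only the gap with the full service) can be sketched as follows. This is a minimal illustration under stated assumptions, not ACQuA's implementation: the function names are hypothetical, answers are modelled as plain hashable values, the full reasoner is stood in for by a caller-supplied predicate `full_check`, and at least one upper bound is assumed to be given.

```python
# Hypothetical sketch; function names and the `full_check` callback are
# assumptions for illustration, not part of ACQuA, RSAComb, PAGOdA or HermiT.

def combine_bounds(lower_bounds, upper_bounds):
    """Combine per-service bounds into a single lower bound, upper bound and gap.

    lower_bounds: list of sets of answers known to be correct (sound)
    upper_bounds: non-empty list of sets containing all correct answers (complete)
    """
    lower = set().union(*lower_bounds)                 # any sound answer counts
    upper = set.intersection(*map(set, upper_bounds))  # must survive every bound
    gap = upper - lower                                # still undecided
    return lower, upper, gap

def answer_query(lower_bounds, upper_bounds, full_check):
    """Return the final answers, calling `full_check` only on gap answers."""
    lower, _upper, gap = combine_bounds(lower_bounds, upper_bounds)
    return lower | {a for a in gap if full_check(a)}

lowers = [{"a"}, {"b"}]
uppers = [{"a", "b", "c", "d"}, {"a", "b", "c"}]
calls = []

def full_check(candidate):
    calls.append(candidate)   # record which answers reach the expensive check
    return candidate == "c"   # stand-in for a fully-fledged reasoner

answers = answer_query(lowers, uppers, full_check)
```

Here only `"c"` is passed to `full_check`: `"a"` and `"b"` are already in the lower bound, and `"d"` is excluded by the second upper bound. When the bounds coincide, the gap is empty and the expensive service is never invoked.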