
    Closed-World Semantics for Query Answering in Temporal Description Logics

    Ontology-mediated query answering is a popular paradigm for enriching answers to user queries with background knowledge. For querying the absence of information, however, only a few ontology-based approaches exist. Moreover, these proposals conflate the closed-domain and closed-world assumptions, and are therefore not suited to deal with the anonymous objects that are common in ontological reasoning. Many real-world applications, like processing electronic health records (EHRs), also contain a temporal dimension and require efficient reasoning algorithms. Moreover, since medical data is not recorded on a regular basis, reasoners must deal with sparse data with potentially large temporal gaps. Our contribution consists of three main parts: Firstly, we introduce a new closed-world semantics for answering conjunctive queries with negation over ontologies formulated in the description logic ELH⊥, which is based on the minimal universal model. We propose a rewriting strategy for dealing with negated query atoms, which shows that query answering is possible in polynomial time in data complexity. Secondly, we introduce a new temporal variant of ELH⊥ that features a convexity operator. We extend this minimal-world semantics for answering metric temporal conjunctive queries with negation over the logic, and obtain similar rewritability and complexity results. Thirdly, apart from the theoretical results, we evaluate the minimal-world semantics in practice by selecting patients, based on their EHRs, that match given criteria.
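
    As a toy illustration of the minimal-world idea (a hedged sketch, not the thesis's actual algorithm), the following Python snippet materialises the atomic part of a minimal model by forward chaining over concept inclusions, then evaluates a conjunctive query with a negated atom against that closed, materialised model; all TBox, ABox and query names are hypothetical.

```python
# Hypothetical illustration of closed-world (minimal-model) query answering:
# saturate an ABox under atomic concept inclusions, then evaluate a
# conjunctive query whose negated atom is checked against the saturation.

tbox = {("Diabetic", "Patient"), ("Hypertensive", "Patient")}  # A ⊑ B pairs
abox = {("Diabetic", "ann"), ("Hypertensive", "bob")}          # A(a) assertions

def saturate(abox, tbox):
    """Compute the atomic part of the minimal model by forward chaining."""
    model = set(abox)
    changed = True
    while changed:
        changed = False
        for (sub, sup) in tbox:
            for (concept, ind) in list(model):
                if concept == sub and (sup, ind) not in model:
                    model.add((sup, ind))
                    changed = True
    return model

model = saturate(abox, tbox)

# q(x) = Patient(x) AND NOT Diabetic(x): the negated atom is evaluated
# against the minimal model, so facts not derivable there count as false.
answers = {i for (c, i) in model if c == "Patient"} - \
          {i for (c, i) in model if c == "Diabetic"}
print(answers)  # {'bob'}
```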

    Query Answering in Probabilistic Data and Knowledge Bases

    Probabilistic data and knowledge bases are becoming increasingly important in academia and industry. They are continuously extended with new data, powered by modern information extraction tools that associate probabilities with knowledge base facts. The state of the art for storing and processing such data is founded on probabilistic database systems, which are widely and successfully employed. Beyond all the success stories, however, such systems still lack the fundamental machinery to convey some of the valuable knowledge hidden in them to the end user, which limits their potential applications in practice. In particular, in their classical form, such systems are typically based on strong, unrealistic assumptions, such as the closed-world assumption, the closed-domain assumption, the tuple-independence assumption, and the lack of commonsense knowledge. These limitations do not only lead to unwanted consequences, but also put such systems on weak footing in important tasks, query answering being a central one. In this thesis, we enhance probabilistic data and knowledge bases with more realistic data models, thereby allowing for better means of querying them. Building on the long endeavor of unifying logic and probability, we develop different rigorous semantics for probabilistic data and knowledge bases, analyze their computational properties, identify sources of (in)tractability, and design practical, scalable query answering algorithms whenever possible. To achieve this, the current work brings together some recent paradigms from logic, probabilistic inference, and database theory.
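
    To make the tuple-independence assumption concrete, here is a small, hedged Python sketch (illustrative data, not from the thesis) that computes the probability of a Boolean conjunctive query by brute-force enumeration of possible worlds; practical systems avoid this exponential enumeration whenever the query's structure allows.

```python
from itertools import product

# Tuple-independent probabilistic table: each fact holds with its stated
# probability, independently of all other facts. (Hypothetical data.)
facts = {("smokes", "ann"): 0.7, ("smokes", "bob"): 0.4,
         ("friends", ("ann", "bob")): 0.9}

def query_prob(facts, holds):
    """P(q) by brute-force enumeration of all 2^n possible worlds."""
    items = list(facts.items())
    total = 0.0
    for bits in product([0, 1], repeat=len(items)):
        world = {f for (f, _), b in zip(items, bits) if b}
        p = 1.0
        for (_, pf), b in zip(items, bits):
            p *= pf if b else (1.0 - pf)
        if holds(world):
            total += p
    return total

# Boolean conjunctive query: does some smoker have a friend?
q = lambda w: any(("smokes", x) in w and ("friends", (x, y)) in w
                  for x in ["ann", "bob"] for y in ["ann", "bob"])
print(round(query_prob(facts, q), 4))  # 0.63 = 0.7 * 0.9
```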

    Integrating and querying linked datasets through ontological rules

    The Web of Linked Open Data has developed from a few datasets in 2007 into a large data space containing billions of RDF triples published and stored in hundreds of independent datasets, forming the so-called Linked Open Data Cloud. This information cloud, ranging over a wide set of data domains, poses a challenge when it comes to reconciling the heterogeneous schemas or vocabularies adopted by data publishers. Motivated by this challenge, in this thesis we address the problem of integrating and querying multiple heterogeneous Linked Data sets through ontological rules. Firstly, we propose a formalisation of the notion of a peer-to-peer Linked Data integration system, where the mappings between peers comprise schema-level mappings and equality constraints between different IRIs; we call this formalism an RDF Peer System (RPS). We show that the semantics of the mappings preserve tractability of answering Basic Graph Pattern (BGP) SPARQL queries against the data stored in the RDF sources and the set of constraints given by the RPS mappings. Then, we address the problem of SPARQL query rewriting under RPSs and show that it is not possible to rewrite an input BGP SPARQL query into a SPARQL 1.0 query under general RPSs, as the RPS peer mappings are not first-order-rewritable rules; this is a major drawback of general RPSs, since data materialisation is required to exploit their full semantics. With the adoption of the more recent standard SPARQL 1.1 and its property paths, we are able to extend the expressivity of the target language beyond first-order by including regular expressions in the body of the target SPARQL queries, that is, by expressing conjunctive two-way regular path queries (C2RPQs). Following this idea, in the second part of the thesis we step away from the language of RPSs to conduct a study on C2RPQ-rewritability under a broader ontology language. We define [ELHI`inh] (harmless linear ELHI), an ontology language that generalises both the DL-Lite[R] and linear ELH description logics. We prove the rewritability of instance queries (queries with a single atom in their body) under [ELHI`inh] knowledge bases, with C2RPQs as the target language, presenting a query rewriting algorithm that makes use of non-deterministic finite-state automata. Following from that, we propose a query rewriting algorithm for answering conjunctive queries under [ELHI`inh] knowledge bases, again with C2RPQs as the target language. Since C2RPQs can be straightforwardly expressed in SPARQL 1.1 by means of property paths, we believe that our approach is directly applicable to real-world querying settings. Lastly, we undertake a complexity analysis for query answering under [ELHI`inh]. We analyse the computational cost of query answering in terms of both data complexity (where the ontology and the query are fixed and the data alone is a variable input) and combined complexity (where the query, the ontology and the data all constitute the variable input). We show that answering instance queries under [ELHI`inh] is NLogSpace-complete for data complexity and in PTime for combined complexity; we also show that answering CQs under [ELHI`inh] is NLogSpace-complete for data complexity and NP-complete for combined complexity.
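
    As a rough illustration of why regular path queries add the needed expressivity (a hedged sketch with hypothetical data, not the thesis's rewriting algorithm), the following Python snippet evaluates the reflexive-transitive path partOf* over an edge-labelled graph; this is exactly the kind of atom a C2RPQ, or a SPARQL 1.1 property path, can express but a first-order BGP cannot.

```python
from collections import deque

# Edge-labelled graph standing in for an RDF dataset (hypothetical data).
edges = {("engine1", "partOf", "car1"), ("piston1", "partOf", "engine1"),
         ("car1", "type", "Vehicle")}

def reachable(graph, start, label):
    """All nodes reachable from `start` via zero or more `label` edges
    (the SPARQL 1.1 property path  label* )."""
    seen, todo = {start}, deque([start])
    while todo:
        node = todo.popleft()
        for (s, l, o) in graph:
            if s == node and l == label and o not in seen:
                seen.add(o)
                todo.append(o)
    return seen

# The instance query Vehicle(x) under the axiom  ∃partOf.Vehicle ⊑ Vehicle
# rewrites (roughly) to the path query  x partOf*/type Vehicle, i.e.
# "x is connected by a partOf-chain to something typed Vehicle".
def is_vehicle(x):
    return any((n, "type", "Vehicle") in edges
               for n in reachable(edges, x, "partOf"))

print(is_vehicle("piston1"))  # True: piston1 partOf engine1 partOf car1
```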

    Design and Evaluation of Algorithms for Parallel Classification of Ontologies

    Description Logics are a family of knowledge representation formalisms with formal semantics. In recent years, DLs have influenced the design and standardization of the Web Ontology Language OWL. The acceptance of OWL as a web standard has promoted the widespread use of DL ontologies on the web. One of the most frequently used inference services of description logic reasoners classifies all named classes of an OWL ontology into a subsumption hierarchy. With OWL ontologies from the web community growing to hundreds of thousands of named classes, and with multi-processor and multi- or many-core computers increasingly available, parallelizing description logic inference services becomes necessary for better scalability. The contribution of this thesis has two aspects. On a theoretical level, it first presents algorithms to construct a TBox in parallel which are independent of a particular DL but sacrifice completeness. Then, a sound and complete algorithm for parallel TBox classification is presented. In this algorithm, all subsumption relationships between the concepts of a partition assigned to a single thread are found correctly; in other words, correctness of the TBox subsumption hierarchy is guaranteed. Thereafter, we provide an extension of the sound and complete algorithm which handles TBox classification concurrently and more efficiently. This thesis also describes an optimization technique for better partitioning the list of concepts to be inserted into the TBox. On a practical level, a running prototype, the Parallel TBox Classifier, was implemented for each generation of the classifier based on the above theoretical foundations. The Parallel TBox Classifier is used to evaluate the practical merit of the proposed algorithms, as well as the effectiveness of the designed optimizations, against existing state-of-the-art benchmarks. The empirical results show that the Parallel TBox Classifier outperforms the Sequential TBox Classifier on real-world ontologies with a linear or superlinear speedup factor. The Parallel TBox Classifier can form a basis for developing more efficient parallel classification techniques for real-world ontologies of different sizes and DL complexities.
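
    The partitioning idea can be sketched in a few lines of Python (a hedged illustration over a toy told-subsumption TBox, not the Parallel TBox Classifier's actual algorithm): the concept list is split across workers, each worker classifies the concepts of its own partition independently, and the partial results are merged into one hierarchy.

```python
from concurrent.futures import ThreadPoolExecutor

# Told subsumptions A ⊑ B of a tiny, hypothetical TBox.
told = {"Cat": {"Mammal"}, "Dog": {"Mammal"}, "Mammal": {"Animal"},
        "Animal": {"Thing"}, "Thing": set()}

def subsumers(concept):
    """All transitively told subsumers of one named concept."""
    result, todo = set(), [concept]
    while todo:
        for parent in told[todo.pop()]:
            if parent not in result:
                result.add(parent)
                todo.append(parent)
    return concept, result

# Each worker classifies its own share of the concept list; the partial
# results are merged. (Threads are used here only for illustration; a real
# reasoner would use native threads or processes for actual speedup.)
with ThreadPoolExecutor(max_workers=4) as pool:
    hierarchy = dict(pool.map(subsumers, told))

print(hierarchy["Cat"])  # {'Mammal', 'Animal', 'Thing'}
```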

    Pseudo-contractions as Gentle Repairs

    Updating a knowledge base to remove an unwanted consequence is a challenging task. Some of the original sentences must be either deleted or weakened in such a way that the sentence to be removed is no longer entailed by the resulting set. On the other hand, it is desirable to preserve the existing knowledge as much as possible, minimising the loss of information. Several approaches to this problem can be found in the literature. In particular, when the knowledge is represented by an ontology, two different families of frameworks have been developed over the past decades with numerous ideas in common but with little interaction between the communities: applications of AGM-like Belief Change and justification-based Ontology Repair. In this paper, we investigate the relationship between pseudo-contraction operations and gentle repairs. Both aim to avoid the complete deletion of sentences when replacing them with weaker versions is enough to prevent the entailment of the unwanted formula. We show the correspondence between concepts on both sides and investigate under which conditions they are equivalent. Furthermore, we propose a unified notation for the two approaches, which might contribute to the integration of the two areas.
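
    A minimal sketch of the classical-repair side of this picture (hypothetical axiom names, not the paper's formal machinery): given the justifications of an unwanted consequence, any hitting set of them identifies axioms whose removal breaks every proof; a gentle repair or pseudo-contraction would weaken the hit axioms rather than delete them outright.

```python
from itertools import combinations

# Justifications: minimal axiom sets, each entailing the unwanted
# consequence. (Hypothetical axiom names.)
justifications = [{"ax1", "ax2"}, {"ax2", "ax3"}, {"ax4"}]

def minimal_hitting_set(justs):
    """Smallest set of axioms intersecting every justification; removing
    (or, more gently, weakening) them breaks every proof of the
    unwanted consequence."""
    axioms = sorted(set().union(*justs))
    for size in range(1, len(axioms) + 1):
        for cand in combinations(axioms, size):
            if all(set(cand) & j for j in justs):
                return set(cand)

print(minimal_hitting_set(justifications))  # {'ax2', 'ax4'}
```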

    Quantitative Methods for Similarity in Description Logics

    Description Logics (DLs) are a family of logic-based knowledge representation languages used to describe the knowledge of an application domain and reason about it in a formally well-defined way. They allow users to describe the important notions and classes of the knowledge domain as concepts, which formalize the necessary and sufficient conditions for individual objects to belong to that concept. A variety of different DLs exist, differing in the set of properties one can use to express concepts, the so-called concept constructors, as well as the set of axioms available to describe the relations between concepts or individuals. However, all classical DLs have in common that they can only express exact knowledge, and correspondingly only allow exact inferences. Either we can infer that some individual belongs to a concept, or we cannot; there is no in-between. In practice, though, knowledge is rarely exact. Many definitions have their exceptions or are vaguely formulated in the first place, and people might not only be interested in exact answers, but also in alternatives that are "close enough". This thesis is aimed at tackling how to express that something is "close enough", and how to integrate this notion into the formalism of Description Logics. To this end, we will use the notion of similarity and dissimilarity measures as a way to quantify how close exactly two concepts are. We will look at how useful measures can be defined in the context of DLs, and how they can be incorporated into the formal framework in order to generalize it. In particular, we will look closer at two applications of such measures to DLs: relaxed instance queries will incorporate a similarity measure in order to give not just the exact answers to some query, but all answers that are reasonably similar. Prototypical definitions, on the other hand, use a measure of dissimilarity or distance between concepts in order to allow the definition of, and reasoning with, concepts that capture not just those individuals that satisfy exactly the stated properties, but also those that are "close enough".
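
    As a toy illustration of a relaxed instance query (a hedged sketch using a naive Jaccard measure over feature sets, not one of the thesis's DL-aware measures): individuals are returned whenever their description is sufficiently similar to the query concept, rather than only on an exact match.

```python
# Hypothetical feature-set view of concepts: each individual is described
# by the set of atomic properties it satisfies.
individuals = {"rex":  {"Dog", "Pet", "Vaccinated"},
               "tom":  {"Cat", "Pet", "Vaccinated"},
               "wolf": {"Dog", "Wild"}}

query_concept = {"Dog", "Pet"}

def jaccard(a, b):
    """A simple similarity measure: |a ∩ b| / |a ∪ b|."""
    return len(a & b) / len(a | b)

def relaxed_instances(query, threshold):
    """Relaxed instance query: every individual whose description is at
    least `threshold`-similar to the query, not only exact matches."""
    return {name for name, feats in individuals.items()
            if jaccard(query, feats) >= threshold}

print(relaxed_instances(query_concept, 0.5))   # {'rex'}
print(relaxed_instances(query_concept, 0.25))  # {'rex', 'tom', 'wolf'}
```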

    An introduction to description logics and query rewriting

    This chapter gives an overview of the description logics underlying the OWL 2 Web Ontology Language and its three tractable profiles, OWL 2 RL, OWL 2 EL and OWL 2 QL. We consider the syntax and semantics of these description logics as well as the main reasoning tasks and their computational complexity. We also discuss the semantic foundations for first-order and datalog rewritings of conjunctive queries over knowledge bases given in the OWL 2 profiles, and outline the architecture of the ontology-based data access system Ontop.
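
    The flavour of first-order rewriting in the OWL 2 QL style can be sketched as follows (a hedged illustration with hypothetical axioms and data, not the chapter's formal construction): an atomic query is expanded into the union of all concepts entailed to be subsumed by it, and that union is then evaluated directly over the data, e.g. as a UNION SQL query.

```python
# Atomic query A(x) is rewritten into the union of all B(x) with B ⊑* A,
# which can be answered over the data alone, without the ontology.

axioms = {("Professor", "Staff"), ("Lecturer", "Staff"), ("Staff", "Person")}
data = {("Professor", "alice"), ("Lecturer", "bob"), ("Person", "carol")}

def rewrite(query_concept):
    """All concepts whose instances answer the query, via told subsumption."""
    result, changed = {query_concept}, True
    while changed:
        changed = False
        for (sub, sup) in axioms:
            if sup in result and sub not in result:
                result.add(sub)
                changed = True
    return result

disjuncts = rewrite("Person")   # {'Person', 'Staff', 'Professor', 'Lecturer'}
answers = {i for (c, i) in data if c in disjuncts}
print(answers)                  # {'alice', 'bob', 'carol'}
```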

    Inference as a data management problem

    Inference over OWL ontologies with large A-Boxes has been researched as a data management problem in recent years. This work adopts the strategy of applying a tableaux-based reasoner for complete T-Box classification, and using a rule-based mechanism for scalable A-Box reasoning. Specifically, we establish for the classified T-Box an inference framework, which can be used to compute and materialise inference results. The inference we focus on is type inference in A-Box reasoning, which we define as the process of deriving, for each A-Box instance, its memberships of OWL classes and properties. As our approach materialises the inference results, it in general provides faster query processing than non-materialising techniques, at the expense of a larger space requirement and slower updates. When the A-Box size is suitable for an RDBMS, we compile the inference framework to triggers, which incrementally update the inference materialisation on both data inserts and data deletes, without re-computing the whole inference. More importantly, triggers make inferences available as atomic consequences of inserts or deletes, which preserves the ACID properties of transactions; such inference is known as transactional reasoning. When the A-Box size is beyond the capability of an RDBMS, we compile the inference framework to Spark programmes, which provide scalable inference materialisation in a Big Data system; our evaluation considers reasoning over up to 270 million A-Box facts. Evaluating our work against two state-of-the-art reasoners, we empirically verify that our approach is able to perform scalable inference materialisation and to provide faster query processing with comparable completeness of reasoning.
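
    A minimal sketch of the trigger idea using Python's built-in sqlite3 module (a hypothetical one-axiom schema, not this work's actual compilation): a subclass axiom becomes an insert trigger that extends the type materialisation atomically, within the same transaction as the triggering insert.

```python
import sqlite3

# Compile the single (hypothetical) axiom GradStudent ⊑ Student into an
# insert trigger that keeps the type-inference materialisation up to date
# as facts arrive, without recomputing the whole inference.
db = sqlite3.connect(":memory:")
db.executescript("""
CREATE TABLE instance_of (ind TEXT, cls TEXT, UNIQUE(ind, cls));
CREATE TRIGGER grad_is_student AFTER INSERT ON instance_of
WHEN NEW.cls = 'GradStudent'
BEGIN
    INSERT OR IGNORE INTO instance_of VALUES (NEW.ind, 'Student');
END;
""")

db.execute("INSERT INTO instance_of VALUES ('ann', 'GradStudent')")
print(db.execute("SELECT cls FROM instance_of WHERE ind='ann'").fetchall())
# [('GradStudent',), ('Student',)]
```

    A corresponding delete trigger would retract the derived fact when its premise is removed, which is what keeps the materialisation incremental under deletes as well.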

    Polynomial-Time Reasoning Support for Design and Maintenance of Large-Scale Biomedical Ontologies

    Description Logics (DLs) belong to a successful family of knowledge representation formalisms with two key assets: formally well-defined semantics, which allows knowledge to be represented unambiguously, and automated reasoning, which allows implicit knowledge to be inferred from what is stated explicitly. This thesis investigates various reasoning techniques for tractable DLs in the EL family which have been implemented in the CEL system. It suggests that the use of lightweight DLs, in which reasoning is tractable, is beneficial for ontology design and maintenance both in terms of expressivity and scalability. The claim is supported by a case study on the renowned medical ontology SNOMED CT and an extensive empirical evaluation on several large-scale biomedical ontologies.
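
    A miniature version of the polynomial-time, completion-style classification used for EL-family logics (a hedged sketch in the spirit of such algorithms, with hypothetical SNOMED-like axioms, not CEL's actual code): completion rules saturate sets of known subsumers and derived role edges until a fixpoint is reached.

```python
# Normalized axiom forms: ("sub", A, B)         A ⊑ B
#                         ("exists+", A, r, B)  A ⊑ ∃r.B
#                         ("exists-", r, A, B)  ∃r.A ⊑ B
axioms = [("sub", "Pericardium", "Tissue"),
          ("exists+", "Pericarditis", "hasLocation", "Pericardium"),
          ("exists-", "hasLocation", "Tissue", "Disease")]

concepts = {"Pericardium", "Tissue", "Pericarditis", "Disease"}
S = {c: {c} for c in concepts}  # S[A]: known subsumers of A
R = set()                       # derived role edges (A, r, B)

changed = True
while changed:
    changed = False
    for ax in axioms:
        if ax[0] == "sub":              # A in S(C)  and  A ⊑ B   =>  B in S(C)
            _, a, b = ax
            for c in concepts:
                if a in S[c] and b not in S[c]:
                    S[c].add(b); changed = True
        elif ax[0] == "exists+":        # A in S(C)  and  A ⊑ ∃r.B  =>  (C,r,B)
            _, a, r, b = ax
            for c in concepts:
                if a in S[c] and (c, r, b) not in R:
                    R.add((c, r, b)); changed = True
        elif ax[0] == "exists-":        # (C,r,D), A in S(D), ∃r.A ⊑ B => B in S(C)
            _, r, a, b = ax
            for (c, r2, d) in list(R):
                if r2 == r and a in S[d] and b not in S[c]:
                    S[c].add(b); changed = True

print(S["Pericarditis"])  # {'Pericarditis', 'Disease'}
```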