99 research outputs found

    A Machine Learning Approach for Optimizing Heuristic Decision-making in OWL Reasoners

    Description Logics (DLs) are formalisms for representing knowledge bases of application domains. The Web Ontology Language (OWL) is a syntactic variant of a very expressive description logic. OWL reasoners can infer implied information from OWL ontologies. The performance of OWL reasoners can be severely affected by situations that require decision-making over many alternatives. Such non-deterministic behavior is often controlled by heuristics that are based on insufficient information. This thesis proposes a novel OWL reasoning approach that applies machine learning (ML) to implement pragmatic and optimal decision-making strategies in such situations. Disjunctions occurring in ontologies are one source of non-deterministic actions in reasoners. We propose two ML-based approaches to reduce the non-determinism caused by dealing with disjunctions. The first approach is restricted to propositional description logic, while the second can deal with standard description logic. The first approach builds a logistic regression classifier that chooses a proper branching heuristic for an input ontology. Branching heuristics were originally developed to help Propositional Satisfiability (SAT) based solvers decide which branch to pick at each branching level. The second approach is a further development of the first. An SVM (Support Vector Machine) classifier is designed to select an appropriate expansion-ordering heuristic for an input ontology. The built-in heuristics are designed for expansion ordering of satisfiability testing in OWL reasoners. They determine the order of branches in search trees. Both of the above approaches speed up our ML-based reasoner by up to two orders of magnitude in comparison to the non-ML reasoner. Another source of non-deterministic actions is the order in which tableau rules should be applied.
On average, our ML-based approach, an SVM classifier, achieves a speedup of two orders of magnitude when compared to the most expensive rule ordering of the non-ML reasoner

    Metamodeling and metaquerying in OWL 2 QL

    OWL 2 QL is a standard profile of the OWL 2 ontology language, specifically tailored to Ontology-Based Data Management. Inspired by recent work on higher-order Description Logics, in this paper we present a new semantics for OWL 2 QL ontologies, called Metamodeling Semantics (MS), and show that, in contrast to the official Direct Semantics (DS) for OWL 2, it allows exploiting the metamodeling capabilities natively offered by OWL 2 punning. We then extend unions of conjunctive queries with both metavariables and the possibility of using TBox atoms, with the purpose of expressing meaningful metalevel queries. We first show that under MS both satisfiability checking and answering queries that include only ABox atoms have the same complexity as under DS. Second, we investigate the problem of answering general metaqueries, and single out a new source of complexity coming from the combined presence of a specific type of incompleteness in the ontology and of TBox axioms among the query atoms. Then we focus on a specific class of ontologies, called TBox-complete, where there is no incompleteness in the TBox axioms, and show that general metaquery answering in this case again has the same complexity as under DS. Finally, we move to general ontologies and show that answering general metaqueries is coNP-complete with respect to ontology complexity, Π_2^p-complete with respect to combined complexity, and remains in AC^0 with respect to ABox complexity

    Scalable Generation of Type Embeddings Using the ABox

    Structured knowledge bases gain their expressive power from both the ABox and TBox. While the ABox is rich in data, the TBox contains the ontological assertions that are often necessary for logical inference. The crucial links between the ABox and the TBox are provided by is-a statements (formally a part of the ABox) that connect instances to types, also referred to as classes or concepts. Latent space embedding algorithms, such as RDF2Vec and TransE, have been used to great effect to model instances in the ABox. Such algorithms work well on large-scale knowledge bases like DBpedia and Geonames, as they are robust to noise and are low-dimensional and real-valued. In this paper, we investigate a supervised algorithm for deriving type embeddings in the same latent space as a given set of entity embeddings. We show that our algorithm generalizes to hundreds of types, and via incremental execution, achieves near-linear scaling on graphs with millions of instances and facts. We also present a theoretical foundation for our proposed model, and the means of validating the model. The empirical utility of the embeddings is illustrated on five partitions of the English DBpedia ABox. We use visualization and clustering to show that our embeddings are in good agreement with the manually curated TBox. We also use the embeddings to perform a soft clustering on 4 million DBpedia instances in terms of the 415 types explicitly participating in is-a relationships in the DBpedia ABox. Lastly, we present a set of results obtained by using the embeddings to recommend types for untyped instances. Our method is shown to outperform another feature-agnostic baseline while achieving a 15x speedup without any growth in memory usage

    Answering Object Queries over Knowledge Bases with Expressive Underlying Description Logics

    Many information sources can be viewed as collections of objects and descriptions about objects. The relationship between objects is often characterized by a set of constraints that semantically encode background knowledge of some domain. The most straightforward and fundamental way to access information in these repositories is to search for objects that satisfy certain selection criteria. This work considers a description logics (DL) based representation of such information sources and object queries, which allows for automated reasoning over the constraints accompanying objects. Formally, a knowledge base K=(T, A) captures constraints in the terminology (a TBox) T, and objects with their descriptions in the assertions (an ABox) A, using some DL dialect L. In such a setting, object descriptions are L-concepts and object identifiers correspond to individual names occurring in K. Correspondingly, object queries are the well known problem of instance retrieval in the underlying DL knowledge base K, which returns the identifiers of qualifying objects. This work generalizes instance retrieval over knowledge bases to provide users with answers in which both identifiers and descriptions of qualifying objects are given. The proposed query paradigm, called assertion retrieval, is favoured over instance retrieval since it provides more informative answers to users. A more compelling reason is related to performance: assertion retrieval enables a transfer of basic relational database techniques, such as caching and query rewriting, in the context of an assertion retrieval algebra. The main contributions of this work are two-fold: one concerns optimizing the fundamental reasoning task that underlies assertion retrieval, namely, instance checking, and the other establishes a query compilation framework based on the assertion retrieval algebra. 
The former is necessary because an assertion retrieval query can entail a large volume of instance checking requests of the form K |= a:C, where "a" is an individual name and "C" is an L-concept. This work thus proposes a novel absorption technique, ABox absorption, to improve instance checking. ABox absorption handles knowledge bases that have an expressive underlying dialect L, for instance, one that requires disjunctive knowledge. It works particularly well when knowledge bases contain a large number of concrete domain concepts for object descriptions. This work further presents a query compilation framework based on the assertion retrieval algebra to make assertion retrieval more practical. In the framework, a suite of rewriting rules is provided to generate a variety of query plans, with a focus on plans that avoid reasoning w.r.t. the background knowledge bases when sufficient cached results of earlier requests exist. ABox absorption and the query compilation framework have been implemented in a prototypical system, dubbed CARE Assertion Retrieval Engine (CARE). CARE also defines a simple yet effective cost model to search for the best plan generated by query rewriting. Empirical studies of CARE have shown that the proposed techniques in this work make assertion retrieval a practical application over a variety of domains

    Design and Evaluation of Algorithms for Parallel Classification of Ontologies

    Description Logics (DLs) are a family of knowledge representation formalisms with formal semantics. In recent years, DLs have influenced the design and standardization of the Web Ontology Language OWL. The acceptance of OWL as a web standard has promoted the widespread utilization of DL ontologies on the web. One of the most frequently used inference services of description logic reasoners classifies all named classes of OWL ontologies into a subsumption hierarchy. Because OWL ontologies emerging from the web community consist of up to hundreds of thousands of named classes, and because multi-processor and multi- or many-core computers are increasingly available, a need for parallelizing description logic inference services to achieve better scalability is expected. The contribution of this thesis has two aspects. On a theoretical level, it first presents algorithms to construct a TBox in parallel, which are independent of any particular DL; however, they sacrifice completeness. Then, a sound and complete algorithm for TBox classification in parallel is presented. In this algorithm all the subsumption relationships between concepts of a partition assigned to a single thread are found correctly; in other words, correctness of the TBox subsumption hierarchy is guaranteed. Thereafter, we provide an extension of the sound and complete algorithm which is used to handle TBox classification concurrently and more efficiently. This thesis also describes an optimization technique suitable for better partitioning the list of concepts to be inserted into the TBox. On a practical level, a running prototype, Parallel TBox Classifier, was implemented for each generation of the classifier based on the above theoretical foundations. The Parallel TBox Classifier is used to evaluate the practical merit of the proposed algorithms as well as the effectiveness of the designed optimizations against existing state-of-the-art benchmarks.
The empirical results illustrate that Parallel TBox Classifier outperforms the Sequential TBox Classifier on real world ontologies with a linear or superlinear speedup factor. Parallel TBox Classifier can form a basis to develop more efficient parallel classification techniques for real world ontologies with different sizes and DL complexities

    Reasoning Algebraically with Description Logics

    Semantic Web applications based on the Web Ontology Language (OWL) often require the use of numbers in class descriptions for expressing cardinality restrictions on properties or even classes. Some of these cardinalities are specified explicitly, but quite a few are entailed and need to be discovered by reasoning procedures. Due to the Description Logic (DL) foundation of OWL, those reasoning services are offered by DL reasoners. Existing DL reasoners employ reasoning procedures that are arithmetically uninformed and substitute arithmetic reasoning by "don't know" non-determinism in order to cover all possible cases. This lack of information about arithmetic problems dramatically degrades the performance of DL reasoners in many cases, especially with ontologies relying on the use of Nominals and Qualified Cardinality Restrictions. The contribution of this thesis is twofold: on the theoretical level, it presents algebraic reasoning with DL (ReAl DL) using a sound, complete, and terminating reasoning procedure for the DL SHOQ. ReAl DL combines tableau reasoning procedures with algebraic methods, namely Integer Programming, to ensure arithmetically better informed reasoning. SHOQ extends the standard DL ALC with transitive roles, role hierarchies, qualified cardinality restrictions (QCRs), and nominals, and forms an expressive subset of OWL. Although the proposed algebraic tableau is doubly exponential in the worst case, it deals with cardinalities with an additional level of information and properties that make the calculus amenable and well suited for optimizations. In order for ReAl DL to have practical merit, suitable optimizations are proposed towards achieving an efficient reasoning approach that addresses the sources of complexity related to nominals and QCRs. On the practical level, a running prototype reasoner (HARD) is implemented based on the proposed calculus and optimizations.
HARD is used to evaluate the practical merit of ReAl DL, as well as the effectiveness of the proposed optimizations. Experimental results based on real world and synthetic ontologies show that ReAl DL outperforms existing reasoning approaches in handling the interactions between nominals and QCRs. ReAl DL also comes with some interesting features such as the ability to handle ontologies with cyclic descriptions without adopting special blocking strategies. ReAl DL can form a basis to provide more efficient reasoning support for ontologies using nominals or QCRs

    On the Computation of Common Subsumers in Description Logics

    Description logic (DL) knowledge bases are often built by users with expertise in the application domain, but little expertise in logic. To support such users when building their knowledge bases, a number of extension methods have been proposed to provide the user with concept descriptions as a starting point for new concept definitions. The inference service central to several of these approaches is the computation of (least) common subsumers of concept descriptions. If disjunction of concepts can be expressed in the DL under consideration, the least common subsumer (lcs) is just the disjunction of the input concepts. Such a trivial lcs is of little use as a starting point for a new concept definition to be edited by the user. To address this problem, we propose two approaches to obtain "meaningful" common subsumers in the presence of disjunction, tailored to two different methods to extend DL knowledge bases. More precisely, we devise computation methods for the approximation-based approach and the customization of DL knowledge bases, extend these methods to DLs with number restrictions, and discuss their efficient implementation

    Action, Time and Space in Description Logics

    Description Logics (DLs) are a family of logic-based knowledge representation (KR) formalisms designed to represent and reason about static conceptual knowledge in a semantically well-understood way. On the other hand, standard action formalisms are KR formalisms based on classical logic designed to model and reason about dynamic systems. The largest part of the present work is dedicated to integrating DLs with action formalisms, with the main goal of obtaining decidable action formalisms with an expressiveness significantly beyond propositional. To this end, we offer DL-tailored solutions to the frame and ramification problems. One of the main technical results is that standard reasoning problems about actions (executability and projection), as well as the plan existence problem, are decidable if one restricts the logic for describing action pre- and post-conditions and the state of the world to decidable Description Logics. A smaller part of the work is related to decidable extensions of Description Logics with concrete datatypes, most importantly those that allow reference to the notions of space and time

    Parallelizing Description Logic Reasoning

    Description Logic has become one of the primary knowledge representation and reasoning methodologies during the last twenty years. Many areas are benefiting from description logic based technologies. Description logic reasoning algorithms and a number of optimization techniques for them play an important role and have been intensively researched. However, few of them have been systematically investigated in a concurrency context, despite the growing availability of multi-processor computing facilities. Meanwhile, the Semantic Web, an application domain of description logic, is producing vast amounts of knowledge data on the Internet, which need to be handled by scalable solutions. This situation requires description logic reasoners to be endowed with reasoning scalability. This research introduced concurrent computing in two aspects: classification, and tableau-based description logic reasoning. Classification is a core description logic reasoning service. Over more than two decades many research efforts have been devoted to optimizing classification. Those classification optimization algorithms have shown their pragmatic effectiveness for sequential processing. However, as concurrent computing becomes widely available, new classification algorithms that are well suited to parallelization need to be developed. This need is further supported by the observation that most available OWL reasoners, which are usually based on tableau reasoning, can only utilize a single processor. Such an inadequacy often frustrates users working in ontology development, especially if their ontologies are complex and require long processing times. The classification service finds all subsumption relationships between named concepts that are entailed by a knowledge base. Each subsumption test involves two concepts and is independent of the others. At most n^2 subsumption tests are needed for a knowledge base which contains n concepts.
As the first contribution of this research, we developed an algorithm and a corresponding architecture showing that reasoning scalability can be gained by using concurrent computing. Further, this research investigated how concurrent computing can increase the performance of tableau-based description logic reasoning algorithms. Tableau-based description logic reasoning decides a problem by constructing an AND-OR tree. Earlier research has shown the effectiveness of processing the disjunction branches of a tableau expansion tree in parallel. Our research has shown how reasoning scalability can be gained by processing the conjunction branches of a tableau expansion tree. In addition, this research developed an algorithm, merge classification, that uses a divide-and-conquer strategy for parallelizing classification. This method applies concurrent computing to the more efficient classification algorithm, top-search & bottom-search, which has been adopted as a standard procedure for classification. Reasoning scalability can be observed in a number of real world cases by using this algorithm

    A survey of large-scale reasoning on the Web of data

    As more and more data is being generated by sensor networks, social media and organizations, the Web interlinking this wealth of information becomes more complex. This is particularly true for the so-called Web of Data, in which data is semantically enriched and interlinked using ontologies. In this large and uncoordinated environment, reasoning can be used to check the consistency of the data and of associated ontologies, or to infer logical consequences which, in turn, can be used to obtain new insights from the data. However, reasoning approaches need to be scalable in order to enable reasoning over the entire Web of Data. To address this problem, several high-performance reasoning systems, which mainly implement distributed or parallel algorithms, have been proposed in the last few years. These systems differ significantly; for instance in terms of reasoning expressivity, computational properties such as completeness, or reasoning objectives. In order to provide a first complete overview of the field, this paper reports a systematic review of such scalable reasoning approaches over various ontological languages, reporting details about the methods and over the conducted experiments. We highlight the shortcomings of these approaches and discuss some of the open problems related to performing scalable reasoning