255 research outputs found

    Scalable Reasoning for Knowledge Bases Subject to Changes

    ScienceWeb is a semantic web system that collects information about a research community and allows users to ask qualitative and quantitative questions about that information using a reasoning engine. The more complete the knowledge base is, the more helpful the system's answers will be. As the size of the knowledge base increases, scalability becomes a challenge for the reasoning system. As users make changes to the knowledge base and/or new information is collected, providing fast enough response times (ranging from seconds to a few minutes) is one of the core challenges for the reasoning system. There are two basic inference methods commonly used in first-order logic: forward chaining and backward chaining. As a general rule, forward chaining is a good method for a static knowledge base and backward chaining is good for more dynamic cases. The goal of this thesis was to design a hybrid reasoning architecture and develop a scalable reasoning system whose efficiency can meet the interaction requirements of a ScienceWeb system in the face of a large and evolving knowledge base. Interposing a backward-chaining reasoner between an evolving knowledge base and a query manager, with support for trust, yields an architecture that can support reasoning in the face of frequent changes. An optimized query-answering algorithm, an optimized backward-chaining algorithm and a trust-based hybrid reasoning algorithm are the three key algorithms in this architecture. Collectively, these three algorithms are significant contributions to the field of backward-chaining reasoners over ontologies. I explored the idea of trust in the trust-based hybrid reasoning algorithm, where each change to the knowledge base is analyzed to determine which subset of the knowledge base is impacted by the change and could therefore contribute to incorrect inferences. I adopted greedy ordering and join deferral in the optimized query-answering algorithm. I introduced four optimizations in the backward-chaining algorithm: 1) the implementation of the selection function, 2) the upgraded substitute function, 3) the application of OLDT and 4) the handling of the owl:sameAs problem. I evaluated the optimization techniques by comparing results with and without them, the optimized query-answering algorithm by comparing it to a traditional backward-chaining reasoner, and the trust-based hybrid reasoning algorithm by comparing the performance of a forward-chaining algorithm to that of a pure backward-chaining algorithm. The evaluation results show that the hybrid reasoning architecture with the scalable reasoning system is able to support scalable reasoning in ScienceWeb and to answer qualitative questions effectively over both a fixed knowledge base and an evolving knowledge base.
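
    As a concrete illustration of the goal-driven style this architecture builds on, the sketch below implements a tiny backward-chaining prover over Horn-style rules in Python. The facts, the rule and the vocabulary are invented for the example, and the thesis's OLDT tabling and trust machinery are not reproduced; a crude recursion-depth guard stands in for tabling.

```python
# A minimal backward-chaining sketch over Horn-style rules. All data here is
# hypothetical; OLDT tabling and trust handling from the thesis are not shown.

FACTS = {
    ("authorOf", "alice", "paper1"),
    ("cites", "paper2", "paper1"),
}

# Each rule is (head, [body atoms]); terms starting with "?" are variables.
RULES = [
    (("influences", "?a", "?p2"),
     [("authorOf", "?a", "?p1"), ("cites", "?p2", "?p1")]),
]

def unify(pattern, fact, bindings):
    """Extend bindings so that pattern matches fact, or return None."""
    bindings = dict(bindings)
    for p, f in zip(pattern, fact):
        if p.startswith("?"):
            if bindings.get(p, f) != f:
                return None
            bindings[p] = f
        elif p != f:
            return None
    return bindings

def substitute(atom, bindings):
    return tuple(bindings.get(t, t) for t in atom)

def prove(goal, bindings=None, depth=0):
    """Yield bindings under which goal holds; facts first, then rules."""
    bindings = dict(bindings or {})
    if depth > 20:                    # crude cycle guard (OLDT would table instead)
        return
    goal = substitute(goal, bindings)
    for fact in FACTS:
        b = unify(goal, fact, bindings)
        if b is not None:
            yield b
    for head, body in RULES:
        b = unify(head, goal, {})     # bind rule variables to the goal's terms
        if b is None:
            continue
        def prove_body(atoms, b):
            if not atoms:
                yield b
            else:
                for b2 in prove(atoms[0], b, depth + 1):
                    yield from prove_body(atoms[1:], b2)
        for b3 in prove_body(body, b):
            yield {**bindings, **b3}

for answer in prove(("influences", "?x", "?y")):
    print(answer["?x"], "influences", answer["?y"])   # alice influences paper2
```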

    OWL Reasoners still useable in 2023

    In a systematic literature and software review, over 100 OWL reasoners/systems were analyzed to determine whether they are still usable in 2023. No prior study has covered the field at this scale. OWL reasoners still play an important role in knowledge organisation and management, but the last comprehensive surveys/studies are more than eight years old. The result of this work is a comprehensive list of 95 standalone OWL reasoners and systems using an OWL reasoner. For each item, information on project pages, source code repositories and related documentation was gathered. The raw research data is provided in a GitHub repository for anyone to use.

    RDF graph validation using rule-based reasoning

    The correct functioning of Semantic Web applications requires that given RDF graphs adhere to an expected shape. This shape depends on the RDF graph and on the entailments of that graph that the application supports. During validation, RDF graphs are assessed against sets of constraints, and the violations found help refine the RDF graphs. However, existing validation approaches cannot always explain the root causes of violations (inhibiting refinement), and cannot fully match the entailments supported during validation with those supported by the application. As a result, these approaches either cannot accurately validate RDF graphs or must combine multiple systems, which deteriorates the validator's performance. In this paper, we present an alternative validation approach using rule-based reasoning that is capable of fully customizing the inferencing steps used. We compare it to existing approaches, and present a formal grounding and a practical implementation, "Validatrr", based on N3Logic and the EYE reasoner. Our approach, which supports a number of constraint types equivalent to the state of the art, better explains the root cause of violations thanks to the reasoner's generated logical proof, and returns an accurate number of violations thanks to the customizable inferencing rule set. Performance evaluation shows that Validatrr is performant for smaller datasets and scales linearly with the RDF graph size. The detailed root-cause explanations can guide future validation report description specifications, and the fine-grained level of configuration can be employed to support different constraint languages. This foundation allows further research into handling recursion, validating RDF graphs based on their generation description, and providing automatic refinement suggestions.
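
    To make the idea of rule-based validation concrete, here is a small sketch (not Validatrr, which is built on N3Logic and the EYE reasoner): one explicit, customisable inferencing step runs before a constraint check, and provenance from that step is used to explain each violation. All names and data are illustrative.

```python
# A toy rdflib sketch of rule-based validation with explainable violations.
from rdflib import Graph, Literal, Namespace, RDF, RDFS

EX = Namespace("http://example.org/")
g = Graph()
g.add((EX.Student, RDFS.subClassOf, EX.Person))
g.add((EX.bob, RDF.type, EX.Student))            # bob is only typed Student
g.add((EX.alice, RDF.type, EX.Person))
g.add((EX.alice, EX.name, Literal("Alice")))

def infer_subclass_types(graph):
    """One pass of the RDFS subclass rule; a fixpoint loop would be needed
    for deeper hierarchies. Returns provenance for every derived triple."""
    derived = []
    for sub, _, sup in list(graph.triples((None, RDFS.subClassOf, None))):
        for inst in list(graph.subjects(RDF.type, sub)):
            if (inst, RDF.type, sup) not in graph:
                graph.add((inst, RDF.type, sup))
                derived.append((inst, sup, sub))
    return derived

provenance = infer_subclass_types(g)

# Constraint: every ex:Person must have an ex:name.
for person in g.subjects(RDF.type, EX.Person):
    if g.value(person, EX.name) is None:
        cause = [d for d in provenance if d[0] == person]
        print(f"violation: {person} lacks ex:name (derived via {cause})")
```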

    Hierarchical Multi-Label Classification Using Web Reasoning for Large Datasets

    Extracting valuable data from large volumes of data is one of the main challenges in Big Data. In this paper, a Hierarchical Multi-Label Classification process called Semantic HMC is presented. This process aims to extract valuable data from very large data sources by automatically learning a label hierarchy and classifying data items. The Semantic HMC process is composed of five scalable steps, namely Indexation, Vectorization, Hierarchization, Resolution and Realization. The first three steps automatically construct a label hierarchy from a statistical analysis of the data. This paper focuses on the last two steps, which classify items according to the label hierarchy. The process is implemented as a scalable and distributed application and deployed on a Big Data platform. A quality evaluation is described, comparing the approach with state-of-the-art multi-label classification algorithms dedicated to the same goal. The Semantic HMC approach outperforms state-of-the-art approaches in some areas.
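
    A minimal sketch of what the last two steps could look like: Resolution scores candidate labels from an item's terms, and Realization enforces hierarchy consistency by adding the ancestors of every accepted label. The hierarchy, term weights and threshold below are toy values, not the paper's learned model.

```python
# Toy Resolution/Realization steps for hierarchical multi-label classification.
HIERARCHY = {"neural_nets": "machine_learning",      # child -> parent
             "machine_learning": "computer_science"}
TERM_WEIGHTS = {
    "neural_nets": {"neuron": 2.0, "backpropagation": 3.0},
    "machine_learning": {"training": 1.0, "model": 1.0},
    "computer_science": {"algorithm": 0.5},
}

def resolve(tokens):
    """Resolution: score each label from the item's terms."""
    return {label: sum(w for t, w in weights.items() if t in tokens)
            for label, weights in TERM_WEIGHTS.items()}

def realize(scores, threshold=1.5):
    """Realization: accept labels over the threshold, then add all their
    ancestors so the final label set is consistent with the hierarchy."""
    accepted = {l for l, s in scores.items() if s >= threshold}
    for label in list(accepted):
        while label in HIERARCHY:
            label = HIERARCHY[label]
            accepted.add(label)
    return accepted

tokens = {"backpropagation", "neuron", "model"}
print(realize(resolve(tokens)))
# {'neural_nets', 'machine_learning', 'computer_science'}
```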

    Implementation of a knowledge discovery and enhancement module from structured information gained from unstructured sources of information

    Integrated master's thesis in Informatics and Computing Engineering. Faculdade de Engenharia, Universidade do Porto. 201

    Reasoning strategies for semantic Web rule languages

    Thesis (M. Eng.)--Massachusetts Institute of Technology, Dept. of Electrical Engineering and Computer Science, 2008. Includes bibliographical references (p. 101-104). Dealing with data in open, distributed environments is an increasingly important problem today. The processing of heterogeneous data in formats such as RDF is still being researched. Using rules and rule engines is one technique being applied, and in doing so, the problem of handling heterogeneous rules from multiple sources becomes important. Over the course of this thesis, I wrote several kinds of reasoners, including backward, forward, and hybrid reasoners for RDF rule languages. These were used on a variety of problems and data in a wide range of settings for solving real-world problems. During my investigations, I learned several interesting things about RDF. First, simply making the term space large and well namespaced, and keeping the language's expressivity low, did not necessarily make computation easier. Next, checking proofs in an RDF environment proved to be hard, because the basic features of RDF that make it possible to represent heterogeneous data effectively also make proofs difficult. Further work is needed to see whether some of these problems can be mitigated. Though rules are useful, using them correctly and efficiently for processing RDF data proved to be difficult. by Joseph Scharf. M.Eng.
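
    For contrast with the backward-chaining sketch earlier in this list, the following toy forward chainer applies rules to a set of RDF-style triples until a fixpoint is reached, which is the other evaluation strategy the thesis implements. The rule and data are illustrative only.

```python
# A minimal forward-chaining sketch over RDF-style triples: apply every rule
# to the fact set, add the new consequences, and repeat until nothing changes.
FACTS = {
    ("bob", "rdf:type", "Student"),
    ("Student", "rdfs:subClassOf", "Person"),
}

def rule_subclass(facts):
    """If ?x rdf:type ?c and ?c rdfs:subClassOf ?d, then ?x rdf:type ?d."""
    out = set()
    for x, p, c in facts:
        if p != "rdf:type":
            continue
        for c2, p2, d in facts:
            if p2 == "rdfs:subClassOf" and c2 == c:
                out.add((x, "rdf:type", d))
    return out

def forward_chain(facts, rules):
    facts = set(facts)
    while True:
        new = set().union(*(r(facts) for r in rules)) - facts
        if not new:              # fixpoint reached: nothing left to derive
            return facts
        facts |= new

print(forward_chain(FACTS, [rule_subclass]))
# derives ('bob', 'rdf:type', 'Person') in addition to the base facts
```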

    Inference as a data management problem

    Inference over OWL ontologies with large A-Boxes has been researched as a data management problem in recent years. This work adopts the strategy of applying a tableaux-based reasoner for complete T-Box classification and using a rule-based mechanism for scalable A-Box reasoning. Specifically, we establish for the classified T-Box an inference framework, which can be used to compute and materialise inference results. The inference we focus on is type inference in A-Box reasoning, which we define as the process of deriving, for each A-Box instance, its memberships of OWL classes and properties. As our approach materialises the inference results, it generally provides faster query processing than non-materialising techniques, at the expense of a larger space requirement and slower updates. When the A-Box size is suitable for an RDBMS, we compile the inference framework to triggers, which incrementally update the inference materialisation on both data inserts and data deletes, without needing to re-compute the whole inference. More importantly, triggers make inference available as an atomic consequence of inserts or deletes, which preserves the ACID properties of transactions; such inference is known as transactional reasoning. When the A-Box size is beyond the capability of an RDBMS, we instead compile the inference framework to Spark programmes, which provide scalable inference materialisation in a Big Data system; our evaluation considers reasoning over up to 270 million A-Box facts. Evaluating our work against two state-of-the-art reasoners, we empirically verify that our approach is able to perform scalable inference materialisation and to provide faster query processing with comparable completeness of reasoning.
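
    The trigger-based incremental materialisation can be sketched as follows: each insert derives only the new superclass memberships, and each delete retracts derivations by maintaining a support count, so the whole inference never has to be recomputed. This is a hand-rolled Python approximation rather than the paper's actual RDBMS triggers, and all names are made up.

```python
# Trigger-style incremental type-inference materialisation (toy version).
from collections import Counter

SUBCLASS = {"Student": ["Person"], "Person": ["Agent"]}  # toy classified T-Box
support = Counter()   # how many derivations support each (instance, class) fact

def superclasses(cls):
    for sup in SUBCLASS.get(cls, []):
        yield sup
        yield from superclasses(sup)

def on_insert(inst, cls):
    """Trigger body for INSERT: materialise memberships of all superclasses."""
    for c in [cls, *superclasses(cls)]:
        support[(inst, c)] += 1

def on_delete(inst, cls):
    """Trigger body for DELETE: retract only derivations this fact supported."""
    for c in [cls, *superclasses(cls)]:
        support[(inst, c)] -= 1
        if support[(inst, c)] == 0:
            del support[(inst, c)]

on_insert("bob", "Student")
on_insert("bob", "Person")     # independently asserted
on_delete("bob", "Student")    # bob is still a Person, hence still an Agent
print(sorted(support))          # [('bob', 'Agent'), ('bob', 'Person')]
```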

    Ontology-Based Data Access Using Rewriting, OWL 2 RL Systems and Repairing

    Abstract. In previous work it has been shown how an OWL 2 DL ontology O can be 'repaired' for an OWL 2 RL system ans, that is, how we can compute a set of axioms R that is independent from the data and such that ans, which is generally incomplete for O, becomes complete for all SPARQL queries when used with O ∪ R. However, the initial implementation and experiments were very preliminary, and hence it is currently unclear whether the approach can be applied to large and complex ontologies. Moreover, the approach so far can only support instance queries. In the current paper we thoroughly investigate repairing as an approach to scalable (and complete) ontology-based data access. First, we present several non-trivial optimisations to the first prototype. Second, we show how (arbitrary) conjunctive queries can be supported by integrating well-known query rewriting techniques with OWL 2 RL systems via repairing. Third, we perform an extensive experimental evaluation, obtaining encouraging results. In more detail, our results show that we can compute repairs even for very large real-world ontologies in a reasonable amount of time, that the performance overhead introduced by repairing is negligible in small to medium-sized ontologies and noticeable but manageable in large and complex ones, and that the hybrid reasoning approach can very efficiently compute the correct answers for real-world challenging scenarios.
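
    The query-rewriting half of the hybrid approach can be illustrated with a toy example: an instance query over a class is expanded into a union over all subsumed classes, so that even a store that does not complete the class hierarchy answers it completely. The repair computation R itself is far more involved and is not shown; the hierarchy and A-Box below are illustrative only.

```python
# Toy query rewriting over a class hierarchy for complete instance retrieval.
SUBCLASSES = {"Person": ["Student", "Professor"], "Student": ["PhDStudent"]}
ABOX = {("alice", "PhDStudent"), ("carol", "Professor")}

def rewrite(cls):
    """Rewrite the type query Q(x) :- cls(x) into a union over subsumed classes."""
    result = [cls]
    for sub in SUBCLASSES.get(cls, []):
        result.extend(rewrite(sub))
    return result

def answer(cls):
    union = rewrite(cls)   # Person -> [Person, Student, PhDStudent, Professor]
    return {inst for inst, c in ABOX if c in union}

print(answer("Person"))    # {'alice', 'carol'}
```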