6 research outputs found

    Explaining Query Answers under Inconsistency-Tolerant Semantics over Description Logic Knowledge Bases (Extended Abstract)

    Get PDF
    The problem of querying description logic (DL) knowledge bases (KBs) using database-style queries (in particular, conjunctive queries) has been a major focus of recent DL research. Since scalability is a key concern, much of the work has focused on lightweight DLs for which query answering can be performed in polynomial time w.r.t. the size of the ABox. The DL-Lite family of lightweight DLs [10] is especially popular due to the fact that query answering can be reduced, via query rewriting, to the problem of standard database query evaluation. Since the TBox is usually developed by experts and subject to extensive debugging, it is often reasonable to assume that its contents are correct. By contrast, the ABox is typically substantially larger and subject to frequent modifications, making errors almost inevitable. As such errors may render the KB inconsistent, several inconsistency-tolerant semantics have been introduced in order to provide meaningful answers to queries posed over inconsistent KBs. Arguably the most well-known is the AR semantics [17], inspired by work on consistent query answering in databases (cf. [4] for a survey). Query answering under AR semantics amounts to considering those answers (w.r.t. standard semantics) that can be obtained from every repair, the latter being defined as an inclusion-maximal subset of the ABox that is consistent with the TBox. A more cautious semantics, called IAR semantics The need to equip reasoning systems with explanation services is widely acknowledged by the DL community. Indeed, there have been numerous works on axiom pinpointing, in which the objective is to identify (minimal) subsets of a KB that entail a given TBox axiom (or ABox assertion

    Completing and Debugging Ontologies: state of the art and challenges

    Full text link
    As semantically-enabled applications require high-quality ontologies, developing and maintaining ontologies that are as correct and complete as possible is an important although difficult task in ontology engineering. A key step is ontology debugging and completion. In general, there are two steps: detecting defects and repairing defects. In this paper we discuss the state of the art regarding the repairing step. We do this by formalizing the repairing step as an abduction problem and situating the state of the art with respect to this framework. We show that there are still many open research problems and show opportunities for further work and advancing the field.Comment: 56 page

    Reasoning about Explanations for Negative Query Answers in DL-Lite

    Full text link
    In order to meet usability requirements, most logic-based applications provide explanation facilities for reasoning services. This holds also for Description Logics, where research has focused on the explanation of both TBox reasoning and, more recently, query answering. Besides explaining the presence of a tuple in a query answer, it is important to explain also why a given tuple is missing. We address the latter problem for instance and conjunctive query answering over DL-Lite ontologies by adopting abductive reasoning; that is, we look for additions to the ABox that force a given tuple to be in the result. As reasoning tasks we consider existence and recognition of an explanation, and relevance and necessity of a given assertion for an explanation. We characterize the computational complexity of these problems for arbitrary, subset minimal, and cardinality minimal explanations

    Semantically defined Analytics for Industrial Equipment Diagnostics

    Get PDF
    In this age of digitalization, industries everywhere accumulate massive amount of data such that it has become the lifeblood of the global economy. This data may come from various heterogeneous systems, equipment, components, sensors, systems and applications in many varieties (diversity of sources), velocities (high rate of changes) and volumes (sheer data size). Despite significant advances in the ability to collect, store, manage and filter data, the real value lies in the analytics. Raw data is meaningless, unless it is properly processed to actionable (business) insights. Those that know how to harness data effectively, have a decisive competitive advantage, through raising performance by making faster and smart decisions, improving short and long-term strategic planning, offering more user-centric products and services and fostering innovation. Two distinct paradigms in practice can be discerned within the field of analytics: semantic-driven (deductive) and data-driven (inductive). The first emphasizes logic as a way of representing the domain knowledge encoded in rules or ontologies and are often carefully curated and maintained. However, these models are often highly complex, and require intensive knowledge processing capabilities. Data-driven analytics employ machine learning (ML) to directly learn a model from the data with minimal human intervention. However, these models are tuned to trained data and context, making it difficult to adapt. Industries today that want to create value from data must master these paradigms in combination. However, there is great need in data analytics to seamlessly combine semantic-driven and data-driven processing techniques in an efficient and scalable architecture that allows extracting actionable insights from an extreme variety of data. In this thesis, we address these needs by providing: • A unified representation of domain-specific and analytical semantics, in form of ontology models called TechOnto Ontology Stack. It is highly expressive, platform-independent formalism to capture conceptual semantics of industrial systems such as technical system hierarchies, component partonomies etc and its analytical functional semantics. • A new ontology language Semantically defined Analytical Language (SAL) on top of the ontology model that extends existing DatalogMTL (a Horn fragment of Metric Temporal Logic) with analytical functions as first class citizens. • A method to generate semantic workflows using our SAL language. It helps in authoring, reusing and maintaining complex analytical tasks and workflows in an abstract fashion. • A multi-layer architecture that fuses knowledge- and data-driven analytics into a federated and distributed solution. To our knowledge, the work in this thesis is one of the first works to introduce and investigate the use of the semantically defined analytics in an ontology-based data access setting for industrial analytical applications. The reason behind focusing our work and evaluation on industrial data is due to (i) the adoption of semantic technology by the industries in general, and (ii) the common need in literature and in practice to allow domain expertise to drive the data analytics on semantically interoperable sources, while still harnessing the power of analytics to enable real-time data insights. Given the evaluation results of three use-case studies, our approach surpass state-of-the-art approaches for most application scenarios.Im Zeitalter der Digitalisierung sammeln die Industrien überall massive Daten-mengen, die zum Lebenselixier der Weltwirtschaft geworden sind. Diese Daten können aus verschiedenen heterogenen Systemen, Geräten, Komponenten, Sensoren, Systemen und Anwendungen in vielen Varianten (Vielfalt der Quellen), Geschwindigkeiten (hohe Änderungsrate) und Volumina (reine Datengröße) stammen. Trotz erheblicher Fortschritte in der Fähigkeit, Daten zu sammeln, zu speichern, zu verwalten und zu filtern, liegt der eigentliche Wert in der Analytik. Rohdaten sind bedeutungslos, es sei denn, sie werden ordnungsgemäß zu verwertbaren (Geschäfts-)Erkenntnissen verarbeitet. Wer weiß, wie man Daten effektiv nutzt, hat einen entscheidenden Wettbewerbsvorteil, indem er die Leistung steigert, indem er schnellere und intelligentere Entscheidungen trifft, die kurz- und langfristige strategische Planung verbessert, mehr benutzerorientierte Produkte und Dienstleistungen anbietet und Innovationen fördert. In der Praxis lassen sich im Bereich der Analytik zwei unterschiedliche Paradigmen unterscheiden: semantisch (deduktiv) und Daten getrieben (induktiv). Die erste betont die Logik als eine Möglichkeit, das in Regeln oder Ontologien kodierte Domänen-wissen darzustellen, und wird oft sorgfältig kuratiert und gepflegt. Diese Modelle sind jedoch oft sehr komplex und erfordern eine intensive Wissensverarbeitung. Datengesteuerte Analysen verwenden maschinelles Lernen (ML), um mit minimalem menschlichen Eingriff direkt ein Modell aus den Daten zu lernen. Diese Modelle sind jedoch auf trainierte Daten und Kontext abgestimmt, was die Anpassung erschwert. Branchen, die heute Wert aus Daten schaffen wollen, müssen diese Paradigmen in Kombination meistern. Es besteht jedoch ein großer Bedarf in der Daten-analytik, semantisch und datengesteuerte Verarbeitungstechniken nahtlos in einer effizienten und skalierbaren Architektur zu kombinieren, die es ermöglicht, aus einer extremen Datenvielfalt verwertbare Erkenntnisse zu gewinnen. In dieser Arbeit, die wir auf diese Bedürfnisse durch die Bereitstellung: • Eine einheitliche Darstellung der Domänen-spezifischen und analytischen Semantik in Form von Ontologie Modellen, genannt TechOnto Ontology Stack. Es ist ein hoch-expressiver, plattformunabhängiger Formalismus, die konzeptionelle Semantik industrieller Systeme wie technischer Systemhierarchien, Komponenten-partonomien usw. und deren analytische funktionale Semantik zu erfassen. • Eine neue Ontologie-Sprache Semantically defined Analytical Language (SAL) auf Basis des Ontologie-Modells das bestehende DatalogMTL (ein Horn fragment der metrischen temporären Logik) um analytische Funktionen als erstklassige Bürger erweitert. • Eine Methode zur Erzeugung semantischer workflows mit unserer SAL-Sprache. Es hilft bei der Erstellung, Wiederverwendung und Wartung komplexer analytischer Aufgaben und workflows auf abstrakte Weise. • Eine mehrschichtige Architektur, die Wissens- und datengesteuerte Analysen zu einer föderierten und verteilten Lösung verschmilzt. Nach unserem Wissen, die Arbeit in dieser Arbeit ist eines der ersten Werke zur Einführung und Untersuchung der Verwendung der semantisch definierten Analytik in einer Ontologie-basierten Datenzugriff Einstellung für industrielle analytische Anwendungen. Der Grund für die Fokussierung unserer Arbeit und Evaluierung auf industrielle Daten ist auf (i) die Übernahme semantischer Technologien durch die Industrie im Allgemeinen und (ii) den gemeinsamen Bedarf in der Literatur und in der Praxis zurückzuführen, der es der Fachkompetenz ermöglicht, die Datenanalyse auf semantisch inter-operablen Quellen voranzutreiben, und nutzen gleichzeitig die Leistungsfähigkeit der Analytik, um Echtzeit-Daten-einblicke zu ermöglichen. Aufgrund der Evaluierungsergebnisse von drei Anwendungsfällen Übertritt unser Ansatz für die meisten Anwendungsszenarien Modernste Ansätze

    Integration of Logic and Probability in Terminological and Inductive Reasoning

    Get PDF
    This thesis deals with Statistical Relational Learning (SRL), a research area combining principles and ideas from three important subfields of Artificial Intelligence: machine learn- ing, knowledge representation and reasoning on uncertainty. Machine learning is the study of systems that improve their behavior over time with experience; the learning process typi- cally involves a search through various generalizations of the examples, in order to discover regularities or classification rules. A wide variety of machine learning techniques have been developed in the past fifty years, most of which used propositional logic as a (limited) represen- tation language. Recently, more expressive knowledge representations have been considered, to cope with a variable number of entities as well as the relationships that hold amongst them. These representations are mostly based on logic that, however, has limitations when reason- ing on uncertain domains. These limitations have been lifted allowing a multitude of different formalisms combining probabilistic reasoning with logics, databases or logic programming, where probability theory provides a formal basis for reasoning on uncertainty. In this thesis we consider in particular the proposals for integrating probability in Logic Programming, since the resulting probabilistic logic programming languages present very in- teresting computational properties. In Probabilistic Logic Programming, the so-called "dis- tribution semantics" has gained a wide popularity. This semantics was introduced for the PRISM language (1995) but is shared by many other languages: Independent Choice Logic, Stochastic Logic Programs, CP-logic, ProbLog and Logic Programs with Annotated Disjunc- tions (LPADs). A program in one of these languages defines a probability distribution over normal logic programs called worlds. This distribution is then extended to queries and the probability of a query is obtained by marginalizing the joint distribution of the query and the programs. The languages following the distribution semantics differ in the way they define the distribution over logic programs. The first part of this dissertation presents techniques for learning probabilistic logic pro- grams under the distribution semantics. Two problems are considered: parameter learning and structure learning, that is, the problems of inferring values for the parameters or both the structure and the parameters of the program from data. This work contributes an algorithm for parameter learning, EMBLEM, and two algorithms for structure learning (SLIPCASE and SLIPCOVER) of probabilistic logic programs (in particular LPADs). EMBLEM is based on the Expectation Maximization approach and computes the expectations directly on the Binary De- cision Diagrams that are built for inference. SLIPCASE performs a beam search in the space of LPADs while SLIPCOVER performs a beam search in the space of probabilistic clauses and a greedy search in the space of LPADs, improving SLIPCASE performance. All learning approaches have been evaluated in several relational real-world domains. The second part of the thesis concerns the field of Probabilistic Description Logics, where we consider a logical framework suitable for the Semantic Web. Description Logics (DL) are a family of formalisms for representing knowledge. Research in the field of knowledge repre- sentation and reasoning is usually focused on methods for providing high-level descriptions of the world that can be effectively used to build intelligent applications. Description Logics have been especially effective as the representation language for for- mal ontologies. Ontologies model a domain with the definition of concepts and their properties and relations. Ontologies are the structural frameworks for organizing information and are used in artificial intelligence, the Semantic Web, systems engineering, software engineering, biomedical informatics, etc. They should also allow to ask questions about the concepts and in- stances described, through inference procedures. Recently, the issue of representing uncertain information in these domains has led to probabilistic extensions of DLs. The contribution of this dissertation is twofold: (1) a new semantics for the Description Logic SHOIN(D) , based on the distribution semantics for probabilistic logic programs, which embeds probability; (2) a probabilistic reasoner for computing the probability of queries from uncertain knowledge bases following this semantics. The explanations of queries are encoded in Binary Decision Diagrams, with the same technique employed in the learning systems de- veloped for LPADs. This approach has been evaluated on a real-world probabilistic ontology
    corecore