33 research outputs found

    Managing data through the lens of an ontology

    Ontology-based data management aims at managing data through the lens of an ontology, that is, a conceptual representation of the domain of interest in the underlying information system. This paradigm provides several interesting features, many of which have already proven effective in managing complex information systems. This article introduces the notion of ontology-based data management, illustrates the main ideas underlying the paradigm, and points out the importance of knowledge representation and automated reasoning for addressing the technical challenges it introduces.
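The core idea of the paradigm can be sketched in a few lines: user queries are posed against ontology concepts and answered by rewriting them, via the class hierarchy and source-to-ontology mappings, into lookups over the underlying data. The schema, mapping and instance data below are invented for illustration; they are not from the article.

```python
# Minimal OBDA-style sketch (toy example, all names invented):
# a TBox of subclass axioms, a relational source, and mappings from
# tables to ontology concepts. A query over a concept is answered
# through the subclass hierarchy, so Managers count as Employees.

# Ontology TBox: subclass axioms (child concept -> parent concept).
SUBCLASS = {"Manager": "Employee", "Engineer": "Employee"}

# Relational source: table name -> rows.
DATABASE = {
    "managers": [("alice",)],
    "engineers": [("bob",), ("carol",)],
}

# Mappings: table -> the concept its rows instantiate.
MAPPINGS = {"managers": "Manager", "engineers": "Engineer"}

def concepts_of(concept):
    """All concepts whose instances are also instances of `concept`."""
    subs = {concept}
    changed = True
    while changed:
        changed = False
        for child, parent in SUBCLASS.items():
            if parent in subs and child not in subs:
                subs.add(child)
                changed = True
    return subs

def query(concept):
    """Answer 'all instances of concept' by rewriting over the ontology."""
    wanted = concepts_of(concept)
    return sorted(
        row[0]
        for table, rows in DATABASE.items()
        if MAPPINGS[table] in wanted
        for row in rows
    )

print(query("Employee"))  # → ['alice', 'bob', 'carol']
```

Note that the data itself never mentions "Employee"; the answer is obtained purely by reasoning over the subclass axioms, which is the feature the abstract highlights.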

    Semantically defined Analytics for Industrial Equipment Diagnostics

    In this age of digitalization, industries everywhere accumulate massive amounts of data, which has become the lifeblood of the global economy. This data may come from heterogeneous equipment, components, sensors, systems and applications, in many varieties (diversity of sources), velocities (high rate of change) and volumes (sheer data size). Despite significant advances in the ability to collect, store, manage and filter data, the real value lies in the analytics: raw data is meaningless unless it is properly processed into actionable (business) insights. Those who know how to harness data effectively have a decisive competitive advantage: raising performance by making faster and smarter decisions, improving short- and long-term strategic planning, offering more user-centric products and services, and fostering innovation. Two distinct paradigms can be discerned in the practice of analytics: semantic-driven (deductive) and data-driven (inductive). The first emphasizes logic as a way of representing domain knowledge encoded in rules or ontologies, which are often carefully curated and maintained. However, these models are often highly complex and require intensive knowledge-processing capabilities. Data-driven analytics employs machine learning (ML) to learn a model directly from the data with minimal human intervention. However, these models are tuned to the data and context they were trained on, making them difficult to adapt. Industries that want to create value from data today must master these paradigms in combination. However, there is a great need in data analytics to seamlessly combine semantic-driven and data-driven processing techniques in an efficient and scalable architecture that allows extracting actionable insights from an extreme variety of data.
In this thesis, we address these needs by providing:
• A unified representation of domain-specific and analytical semantics, in the form of ontology models called the TechOnto Ontology Stack: a highly expressive, platform-independent formalism that captures the conceptual semantics of industrial systems, such as technical system hierarchies and component partonomies, together with their analytical functional semantics.
• A new ontology language, Semantically defined Analytical Language (SAL), on top of the ontology model, which extends DatalogMTL (a Horn fragment of Metric Temporal Logic) with analytical functions as first-class citizens.
• A method to generate semantic workflows using SAL, which helps in authoring, reusing and maintaining complex analytical tasks and workflows in an abstract fashion.
• A multi-layer architecture that fuses knowledge- and data-driven analytics into a federated and distributed solution.
To our knowledge, this thesis is one of the first works to introduce and investigate the use of semantically defined analytics in an ontology-based data access setting for industrial analytical applications. We focus our work and evaluation on industrial data because of (i) the adoption of semantic technology by industry in general, and (ii) the common need, in the literature and in practice, to let domain expertise drive data analytics over semantically interoperable sources while still harnessing the power of analytics to enable real-time data insights. In the evaluation on three use-case studies, our approach surpasses state-of-the-art approaches for most application scenarios.
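The key idea of extending a DatalogMTL-style rule language with analytical functions as first-class citizens can be illustrated with a minimal sketch. The rule, predicate names, window length and threshold below are invented for illustration; SAL's actual syntax and semantics are defined in the thesis, not here. The sketch evaluates a rule of the shape Overheating(e) :- avg[-3,0] Temp(e) > 90, i.e. an analytical aggregate over a metric time window appearing in a rule body.

```python
# Hedged sketch of a temporal rule with an analytical function in its
# body (toy data; predicate names and thresholds are invented).
from statistics import mean

# Timestamped sensor readings: equipment -> [(time, temperature)].
READINGS = {
    "pump1": [(0, 80), (1, 92), (2, 95), (3, 97)],
    "pump2": [(0, 70), (1, 71), (2, 69), (3, 72)],
}

def avg_window(series, now, width):
    """Analytical function: mean of the values in the window [now-width, now]."""
    vals = [v for t, v in series if now - width <= t <= now]
    return mean(vals) if vals else float("-inf")

def overheating(now=3, width=3, threshold=90):
    """Evaluate the rule head Overheating(e) at time `now`."""
    return sorted(e for e, s in READINGS.items()
                  if avg_window(s, now, width) > threshold)

print(overheating())  # → ['pump1']  (windowed average 91 > 90)
```

Making the aggregate a first-class body atom, rather than a post-processing step, is what lets a workflow engine compose, reuse and reason about such analytical tasks declaratively.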

    How Can You Mend a Broken Inconsistent KBs in Existential Rules Using Argumentation

    Argumentation is a reasoning method in the presence of inconsistencies that is based on constructing and evaluating arguments. In his seminal paper [6], Dung introduced the most abstract argumentation framework, which consists of a set of arguments, a binary relation between arguments (called attack) and an extension-based semantics to extract subsets of arguments, called extensions, that represent consistent viewpoints. Recently, another way of evaluating arguments was proposed: ranking-based semantics, which ranks arguments based on how controversial they are with respect to attacks [3], i.e. arguments that are attacked "more severely" are ranked lower than others. Extension-based semantics and ranking-based semantics are the two main approaches that I plan to focus on in my future work.
Logic-based argumentation [1] consists in instantiating an argumentation framework with an inconsistent knowledge base expressed in a given logic, which can then be used to handle the underlying inconsistencies. It has been extensively studied and many frameworks have been proposed (assumption-based argumentation frameworks, DeLP, deductive argumentation, ASPIC/ASPIC+, etc.). In my current work, I chose to work with a logic that contains existential rules and to instantiate with it a deductive argumentation framework already available in the literature [5]. I chose existential rules because of their expressivity and practical interest for the Semantic Web. Working with existential-rules-instantiated argumentation frameworks is challenging because of the presence of special features (n-ary conflicts or existential variables in rules) and the undecidability of query answering in certain cases.
Reasoning with an inconsistent knowledge base needs special techniques, as everything can be entailed from falsum. Some techniques such as repair semantics [4] are based on the set of all maximal consistent subsets (repairs) of the knowledge base, but usually do not give many answers to queries. We propose to use argumentation in a general workflow for selecting the best repairs (mendings) of the knowledge base. The research question of my thesis is: "How can a non-expert mend an inconsistent knowledge base expressed in existential rules using argumentation?"
In a first work, I addressed the lack of existing tools for handling existential rules with inconsistencies by introducing the first application workflow for reasoning with inconsistencies in the framework of existential rules using argumentation (i.e. instantiating ASPIC+ with existential rules [9]). The significance of the study was demonstrated by the equivalence of the extension-based semantics outputs of the ASPIC+ instantiation and the one in [5].
Then, I focused on the practical generation of arguments from existential knowledge bases, but soon realised that no such generating tool existed and that the argumentation community only possessed randomly generated or very small argumentation graphs for benchmarking purposes [7]. I thus created a tool, called DAGGER, that generates argumentation graphs from existential knowledge bases [12]. DAGGER was a significant contribution because it enabled me to study the theoretical structural properties [11] of the graphs induced by existential-rules-instantiated argumentation frameworks as defined in [5], and also to analyse the behaviour of several solvers from an argumentation competition [16] on the generated graphs; in particular, I studied whether their ranking (with respect to performance) changed in the context of existential knowledge bases. It is worth noticing that the number of arguments in [5] is exponential in the size of the knowledge base.
Thus, I extended the structure of arguments in [5] with minimality and studied notions of core [2] and other efficient optimisations for reducing the size of the produced argumentation frameworks [13]. Surprisingly, applying ranking-based semantics on a core of an argumentation framework gives different rankings than those obtained from the original argumentation framework [10]. The salient point of that paper was the formal characterisation of these changes with respect to the properties defined in [3].
In my first two years of PhD, I analysed the argumentation framework instantiated with existential rules and made several optimisations for managing the size of the argumentation graph. I also introduced a workflow for mending knowledge bases using argumentation [15]. In this workflow, subsets of arguments (viewpoints) are extracted, and the ranking on arguments is "lifted" to these viewpoints to select the best mending. We also provided several desirable principles that the workflow should satisfy.
In the last year, I plan to first study the following question: "In which ways do argumentation methods perform better than classical methods for knowledge base mending?" Indeed, I expect argumentation to work well for mending knowledge bases for the following reasons: (1) ranking-based semantics are generally easy to compute and follow several desirable principles [3]; (2) argumentation represents pieces of consistent knowledge as nodes and inconsistencies as attacks. The ability to use argumentation paths (sequences of attacks) is often neglected or ignored in traditional logic. Lastly, I plan to compare argumentation methods with more logical methods [14] based on inconsistency measures, and to apply all of my results to previously studied real-world use cases obtained in the framework of the agronomy Pack4Fresh project [8].
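Dung's extension-based semantics can be made concrete with a small sketch: given arguments and an attack relation, the grounded extension is the least fixpoint of the characteristic function F(S) = {a | every attacker of a is attacked by some member of S}. The attack graph below is a toy example invented for illustration, not one of the existential-rules-induced graphs discussed above.

```python
# Dung-style abstract argumentation on a toy attack graph (invented):
# a attacks b, b attacks c, c attacks d. The grounded extension is
# computed by iterating the characteristic function to a fixpoint.

ARGUMENTS = {"a", "b", "c", "d"}
ATTACKS = {("a", "b"), ("b", "c"), ("c", "d")}

def attackers(x):
    """Arguments that attack x."""
    return {s for s, t in ATTACKS if t == x}

def defended(s_set, x):
    """x is acceptable w.r.t. s_set if s_set attacks every attacker of x."""
    return all(attackers(y) & s_set for y in attackers(x))

def grounded_extension():
    """Least fixpoint of F(S) = {x | x is defended by S}."""
    s = set()
    while True:
        nxt = {x for x in ARGUMENTS if defended(s, x)}
        if nxt == s:
            return s
        s = nxt

print(sorted(grounded_extension()))  # → ['a', 'c']
```

Here a is unattacked, so it enters the extension; a defeats b, which reinstates c; d stays out because its only defender against c would be b, which is itself defeated. This reinstatement behaviour is exactly what extension-based semantics capture and what ranking-based semantics refine into a graded ordering.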

    The Shapley Value of Inconsistency Measures for Functional Dependencies

    Quantifying the inconsistency of a database is motivated by various goals, including reliability estimation for new datasets and progress indication in data cleaning. Another goal is to attribute to individual tuples a level of responsibility for the overall inconsistency, and thereby prioritize tuples in the explanation or inspection of dirt. Therefore, inconsistency quantification and attribution have been a subject of much research in Knowledge Representation and, more recently, in Databases. As in many other fields, a conventional responsibility-sharing mechanism is the Shapley value from cooperative game theory. In this paper, we carry out a systematic investigation of the complexity of the Shapley value for common inconsistency measures for functional-dependency (FD) violations. For several measures we establish a full classification of the FD sets into tractable and intractable classes with respect to Shapley-value computation. We also study the complexity of approximation in the intractable cases.
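The attribution mechanism can be sketched by brute force on a toy instance: treat the inconsistency measure as a cooperative game over the tuples and average each tuple's marginal contribution over all coalitions with the standard Shapley weights. The relation, the FD (city → zip), and the pair-counting measure below are invented for illustration and are far simpler than the measures classified in the paper; the exponential enumeration is precisely what the paper's tractability analysis aims to avoid.

```python
# Brute-force Shapley attribution for a toy inconsistency measure
# (all data invented): the measure counts pairs of tuples that
# jointly violate the FD  city -> zip.
from itertools import combinations
from math import factorial

TUPLES = [("paris", "75001"), ("paris", "75002"), ("lyon", "69001")]

def inconsistency(subset):
    """Number of FD-violating pairs (same city, different zip) in subset."""
    return sum(1 for (c1, z1), (c2, z2) in combinations(subset, 2)
               if c1 == c2 and z1 != z2)

def shapley(t):
    """Shapley value of tuple t: weighted average marginal contribution."""
    others = [u for u in TUPLES if u != t]
    n = len(TUPLES)
    total = 0.0
    for k in range(len(others) + 1):
        for coalition in combinations(others, k):
            weight = factorial(k) * factorial(n - k - 1) / factorial(n)
            marginal = (inconsistency(list(coalition) + [t])
                        - inconsistency(list(coalition)))
            total += weight * marginal
    return total

for t in TUPLES:
    print(t, shapley(t))
```

On this instance the single violating pair is split evenly between the two Paris tuples (0.5 each), while the Lyon tuple gets 0, matching the intuition that only the conflicting tuples bear responsibility for the dirt.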