9 research outputs found

    Acta Cybernetica : Volume 23. Number 1.

    Get PDF

    Quantitative Variants of Language Equations and their Applications to Description Logics

    Get PDF
    Unification in description logics (DLs) has been introduced as a novel inference service that can be used to detect redundancies in ontologies, by finding different concepts that may potentially stand for the same intuitive notion. Together with the special case of matching, they were first investigated in detail for the DL FL0, where these problems can be reduced to solving certain language equations. In this thesis, we extend this service in two directions. In order to increase the recall of this method for finding redundancies, we introduce and investigate the notion of approximate unification, which basically finds pairs of concepts that “almost” unify, in order to account for potential small modelling errors. The meaning of “almost” is formalized using distance measures between concepts. We show that approximate unification in FL0 can be reduced to approximately solving language equations, and devise algorithms for solving the latter problem for particular distance measures. Furthermore, we make a first step towards integrating background knowledge, formulated in so-called TBoxes, by investigating the special case of matching in the presence of TBoxes of different forms. We acquire a tight complexity bound for the general case, while we prove that the problem becomes easier in a restricted setting. To achieve these bounds, we take advantage of an equivalence characterization of FL0 concepts that is based on formal languages. In addition, we incorporate TBoxes in computing concept distances. Even though our results on the approximate setting cannot deal with TBoxes yet, we prepare the framework that future research can build on. Before we journey to the technical details of the above investigations, we showcase our program in the simpler setting of the equational theory ACUI, where we are able to also combine the two extensions. In the course of studying the above problems, we make heavy use of automata theory, where we also derive novel results that could be of independent interest

    Quantitative Methods for Similarity in Description Logics

    Get PDF
    Description Logics (DLs) are a family of logic-based knowledge representation languages used to describe the knowledge of an application domain and reason about it in formally well-defined way. They allow users to describe the important notions and classes of the knowledge domain as concepts, which formalize the necessary and sufficient conditions for individual objects to belong to that concept. A variety of different DLs exist, differing in the set of properties one can use to express concepts, the so-called concept constructors, as well as the set of axioms available to describe the relations between concepts or individuals. However, all classical DLs have in common that they can only express exact knowledge, and correspondingly only allow exact inferences. Either we can infer that some individual belongs to a concept, or we can't, there is no in-between. In practice though, knowledge is rarely exact. Many definitions have their exceptions or are vaguely formulated in the first place, and people might not only be interested in exact answers, but also in alternatives that are "close enough". This thesis is aimed at tackling how to express that something "close enough", and how to integrate this notion into the formalism of Description Logics. To this end, we will use the notion of similarity and dissimilarity measures as a way to quantify how close exactly two concepts are. We will look at how useful measures can be defined in the context of DLs, and how they can be incorporated into the formal framework in order to generalize it. In particular, we will look closer at two applications of thus measures to DLs: Relaxed instance queries will incorporate a similarity measure in order to not just give the exact answer to some query, but all answers that are reasonably similar. Prototypical definitions on the other hand use a measure of dissimilarity or distance between concepts in order to allow the definitions of and reasoning with concepts that capture not just those individuals that satisfy exactly the stated properties, but also those that are "close enough"

    Statistical and Computational Models for Whole Word Morphology

    Get PDF
    Das Ziel dieser Arbeit ist die Formulierung eines Ansatzes zum maschinellen Lernen von Sprachmorphologie, in dem letztere als Zeichenkettentransformationen auf ganzen Wörtern, und nicht als Zerlegung von Wörtern in kleinere stukturelle Einheiten, modelliert wird. Der Beitrag besteht aus zwei wesentlichen Teilen: zum einen wird ein Rechenmodell formuliert, in dem morphologische Regeln als Funktionen auf Zeichenketten definiert sind. Solche Funktionen lassen sich leicht zu endlichen Transduktoren übersetzen, was eine solide algorithmische Grundlage für den Ansatz liefert. Zum anderen wird ein statistisches Modell für Graphen von Wortab\-leitungen eingeführt. Die Inferenz in diesem Modell erfolgt mithilfe des Monte Carlo Expectation Maximization-Algorithmus und die Erwartungswerte über Graphen werden durch einen Metropolis-Hastings-Sampler approximiert. Das Modell wird auf einer Reihe von praktischen Aufgaben evaluiert: Clustering flektierter Formen, Lernen von Lemmatisierung, Vorhersage von Wortart für unbekannte Wörter, sowie Generierung neuer Wörter

    Safe Programming Over Distributed Streams

    Get PDF
    The sheer scale of today\u27s data processing needs has led to a new paradigm of software systems centered around requirements for high-throughput, distributed, low-latency computation.Despite their widespread adoption, existing solutions have yet to provide a programming model with safe semantics -- and they disagree on basic design choices, in particular with their approach to parallelism. As a result, naive programmers are easily led to introduce correctness and performance bugs. This work proposes a reliable programming model for modern distributed stream processing, founded in a type system for partially ordered data streams. On top of the core type system, we propose language abstractions for working with streams -- mechanisms to build stream operators with (1) type-safe compositionality, (2) deterministic distribution, (3) run-time testing, and (4) static performance bounds. Our thesis is that viewing streams as partially ordered conveniently exposes parallelism without compromising safety or determinism. The ideas contained in this work are implemented in a series of open source software projects, including the Flumina, DiffStream, and Data Transducers libraries

    35th Symposium on Theoretical Aspects of Computer Science: STACS 2018, February 28-March 3, 2018, Caen, France

    Get PDF

    36th International Symposium on Theoretical Aspects of Computer Science: STACS 2019, March 13-16, 2019, Berlin, Germany

    Get PDF
    corecore