
    Exploiting general-purpose background knowledge for automated schema matching

    The schema matching task is an integral part of the data integration process and is usually its first step. Schema matching is typically very complex and time-consuming and is therefore largely carried out by humans. One reason for the low degree of automation is that schemas are often defined with deep background knowledge that is not itself present within the schemas. Overcoming the problem of missing background knowledge is a core challenge in automating the data integration process. In this dissertation, the task of matching semantic models, so-called ontologies, with the help of external background knowledge is investigated in depth in Part I. Throughout this thesis, the focus lies on large, general-purpose resources, since domain-specific resources are rarely available for most domains. Besides new knowledge resources, this thesis also explores new strategies to exploit such resources. A technical base for the development and comparison of matching systems is presented in Part II. The framework introduced there allows for simple, modularized matcher development (with background knowledge sources) and for extensive evaluations of matching systems. Knowledge graphs are among the largest structured sources of general-purpose background knowledge and have grown significantly in size in recent years; exploiting such graphs, however, is not trivial. In Part III, knowledge graph embeddings are explored, analyzed, and compared, and multiple improvements to existing approaches are presented. In Part IV, numerous concrete matching systems that exploit general-purpose background knowledge are presented, and exploitation strategies and resources are analyzed and compared. The dissertation closes with a perspective on real-world applications.
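    To make the embedding-based matching idea from Part III concrete, the following is a minimal sketch of matching schema or ontology labels via embedding similarity. The embedding lookup, the threshold, and the element lists are illustrative assumptions, not the matchers developed in the dissertation.

```python
# Minimal sketch: candidate schema/ontology correspondences via embedding similarity.
# `embed` is an assumed callable mapping a label to a vector (e.g. a knowledge
# graph embedding lookup); the threshold 0.8 is an arbitrary illustrative value.
import numpy as np

def cosine(u: np.ndarray, v: np.ndarray) -> float:
    """Cosine similarity between two embedding vectors."""
    return float(np.dot(u, v) / (np.linalg.norm(u) * np.linalg.norm(v)))

def match_elements(source, target, embed, threshold=0.8):
    """Return candidate correspondences as (source element, target element, score)."""
    correspondences = []
    for s_label in source:
        s_vec = embed(s_label)
        best_label, best_score = max(
            ((t, cosine(s_vec, embed(t))) for t in target),
            key=lambda pair: pair[1],
        )
        if best_score >= threshold:
            correspondences.append((s_label, best_label, best_score))
    return correspondences
```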

    Efficient Maximum A-Posteriori Inference in Markov Logic and Application in Description Logics

    The maximum a-posteriori (MAP) query in statistical relational models computes the most probable world given evidence and further knowledge about the domain. It is arguably one of the most important types of computational problems, since it is also used as a subroutine in weight learning algorithms. In this thesis, we discuss an improved inference algorithm and an application for MAP queries. We focus on Markov logic (ML) as the statistical relational formalism. Markov logic combines Markov networks with first-order logic by attaching weights to first-order formulas. For inference, we improve existing work that translates MAP queries into integer linear programs (ILPs). The motivation is that existing ILP solvers are very stable and fast and are able to precisely estimate the quality of an intermediate solution. In our work, we focus on improving the translation process so that the resulting ILPs have fewer variables and fewer constraints. Our main contribution is the Cutting Plane Aggregation (CPA) approach, which leverages symmetries in ML networks and parallelizes MAP inference. Additionally, we integrate the cutting plane inference algorithm (Riedel 2008), which significantly reduces the number of groundings by solving multiple smaller ILPs instead of one large ILP. We present the new Markov logic engine RockIt, which outperforms state-of-the-art engines on standard Markov logic benchmarks. Afterwards, we apply the MAP query to description logics. Description logics (DL) are knowledge representation formalisms whose expressivity is higher than propositional logic but lower than first-order logic. The most popular DLs have been standardized in the ontology language OWL and are an elementary component of the Semantic Web. We combine Markov logic, which essentially follows the semantics of a log-linear model, with description logics into log-linear description logics, in which weights can be attached to any description logic axiom. Furthermore, we introduce a new query type which computes the most probable 'coherent' world. Possible applications of log-linear description logics lie mainly in the areas of ontology learning and data integration. With our novel log-linear description logic reasoner ELog, we experimentally show that more expressivity increases quality and that the solutions of optimal solving strategies have higher quality than the solutions of approximate solving strategies.
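    For orientation, Markov logic's log-linear semantics and the resulting MAP objective can be written as follows (standard notation, not reproduced from the thesis):

```latex
P(X = x) \;=\; \frac{1}{Z}\,\exp\!\Big(\sum_i w_i\, n_i(x)\Big),
\qquad
x_{\mathrm{MAP}} \;=\; \operatorname*{arg\,max}_{x}\; \sum_i w_i\, n_i(x),
```

    where $n_i(x)$ is the number of true groundings of the weighted first-order formula $F_i$ in world $x$ and $Z$ is the normalization constant, which cancels in the MAP query. A MAP-to-ILP translation in this spirit encodes the truth values of ground atoms and formula groundings as binary variables and maximizes this weighted sum subject to linear constraints derived from the formulas.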

    Semantic Decision Support for Information Fusion Applications

    This thesis is situated in the domain of knowledge representation and the modeling of uncertainty in an information fusion context. The main idea is to use semantic tools, and more specifically ontologies, not only to represent general domain knowledge and observations, but also to represent the uncertainty that sources may introduce in their own observations. We propose to represent these uncertainties and semantic imprecisions through a meta-ontology (called DS-Ontology) based on the theory of belief functions. The contribution of this work focuses first on the definition of semantic inclusion and intersection operators for ontologies, on which the implementation of the theory of belief functions relies, and secondly on the development of a tool called FusionLab for merging semantic information within ontologies based on the preceding theoretical development. This work has been applied within a European maritime surveillance project.
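    As an illustration of the belief-function machinery the DS-Ontology builds on, here is a minimal sketch of Dempster's rule of combination. It is purely illustrative: it uses plain sets as focal elements, whereas FusionLab combines masses over ontology concepts via the semantic inclusion and intersection operators defined in the thesis.

```python
# Minimal sketch of Dempster's rule of combination from the theory of belief functions.
# Mass functions are dicts mapping focal sets (frozensets) to mass values summing to 1.
from itertools import product

def combine(m1: dict, m2: dict) -> dict:
    """Combine two mass functions with Dempster's rule, normalizing out conflict."""
    combined, conflict = {}, 0.0
    for (a, ma), (b, mb) in product(m1.items(), m2.items()):
        inter = a & b
        if inter:
            combined[inter] = combined.get(inter, 0.0) + ma * mb
        else:
            conflict += ma * mb  # mass falling on the empty set
    if conflict >= 1.0:
        raise ValueError("Sources are in total conflict")
    return {s: m / (1.0 - conflict) for s, m in combined.items()}

# Illustrative example: two sources reporting on a vessel's type.
m_a = {frozenset({"cargo"}): 0.6, frozenset({"cargo", "fishing"}): 0.4}
m_b = {frozenset({"fishing"}): 0.3, frozenset({"cargo", "fishing"}): 0.7}
print(combine(m_a, m_b))
```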

    The Semantic Shadow: Combining User Interaction with Context Information for Semantic Web-Site Annotation

    This thesis develops the concept of the Semantic Shadow (SemS), a model for managing content-related and structural annotations on web page elements and their values. The model supports a contextual weighting of the annotated information, allowing annotation values to be specified in relation to the evaluation context. A procedure is presented that allows this context-dependent meta-information on web page elements to be managed and processed using a dedicated programming interface. Two distinct implementations of the model have been developed: one based on Java objects, the other using the Resource Description Framework (RDF) as the modeling backend. The RDF-based storage allows the annotations of the Semantic Shadow to be integrated with other information of the Semantic Web. To demonstrate the application of the Semantic Shadow concept, a procedure to optimize web-based user interfaces based on the structural semantics has been developed: assuming a mobile client, a requested web page is dynamically adapted by a proxy prototype, where the context-awareness of the adaptation can be modeled directly alongside the structural annotations. To overcome the drawback of missing annotations for existing web pages, this thesis introduces a concept to derive context-dependent meta-information on web pages from their usage: from the observation of the users' interaction with a web page, certain context-dependent structural information about the concerned web page elements can be derived and stored in the annotation model of the Semantic Shadow concept.
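    As a rough illustration of the RDF-based backend, the following sketch stores one context-weighted annotation on a web page element using rdflib. The ex: vocabulary and property names are hypothetical and are not the thesis's actual annotation schema.

```python
# Minimal sketch: one context-dependent annotation on a web page element in RDF.
# The ex: namespace and properties (hasAnnotation, property, context, weight)
# are hypothetical placeholders, not the Semantic Shadow's real vocabulary.
from rdflib import Graph, Namespace, Literal, BNode
from rdflib.namespace import RDF, XSD

EX = Namespace("http://example.org/sems#")
g = Graph()
g.bind("ex", EX)

element = EX["page1/nav-menu"]   # the annotated web page element
annotation = BNode()             # one context-dependent annotation of that element

g.add((element, EX.hasAnnotation, annotation))
g.add((annotation, RDF.type, EX.Annotation))
g.add((annotation, EX.property, Literal("importance")))
g.add((annotation, EX.context, Literal("mobile")))                  # evaluation context
g.add((annotation, EX.weight, Literal(0.9, datatype=XSD.double)))   # contextual weight

print(g.serialize(format="turtle"))
```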