16 research outputs found

    Mapping languages: analysis of comparative characteristics

    RDF generation processes are becoming more interoperable, reusable, and maintainable due to the increased usage of mapping languages: languages that describe how to generate an RDF graph from (semi-)structured data. This gives rise to new mapping languages, each with different characteristics. However, it is not clear which mapping language is fit for a given task, so a comparative framework is needed. In this paper, we investigate a set of mapping languages that exhibit complementary characteristics, and present an initial set of comparative characteristics based on the requirements put forward in the reference works of those mapping languages. Our initial investigation found 9 broad characteristics, classified into 3 categories. To further formalize and complete the set of characteristics, further investigation is needed, requiring a joint effort of the community.
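    To make the idea concrete, the following minimal Python sketch plays the role of a mapping language: a declarative description of how rows of (semi-)structured data become RDF triples. The dict-based mapping format here is hypothetical, not RML, R2RML, or any published syntax.

    ```python
    # A toy "mapping language": a declarative description of how CSV rows
    # become RDF triples. The mapping format is invented for illustration.
    import csv, io

    MAPPING = {
        "subject": "http://example.org/person/{id}",       # IRI template
        "predicate_object": [
            ("http://xmlns.com/foaf/0.1/name", "{name}"),  # literal from a column
            ("http://xmlns.com/foaf/0.1/age", "{age}"),
        ],
    }

    DATA = "id,name,age\n1,Alice,30\n2,Bob,25\n"

    def generate_triples(mapping, rows):
        # Interpret the declarative mapping over each input row.
        for row in rows:
            subj = mapping["subject"].format(**row)
            for pred, obj_template in mapping["predicate_object"]:
                yield (subj, pred, obj_template.format(**row))

    for triple in generate_triples(MAPPING, csv.DictReader(io.StringIO(DATA))):
        print(triple)
    ```

    Different mapping languages vary in how expressive this declarative part is (joins, nested sources, conditions), which is exactly what a comparative framework must capture.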

    Why reinvent the wheel: Let's build question answering systems together

    Modern question answering (QA) systems need to flexibly integrate a number of components specialised to fulfil specific tasks in a QA pipeline. Key QA tasks include Named Entity Recognition and Disambiguation, Relation Extraction, and Query Building. Since a number of different software components exist that implement different strategies for each of these tasks, it is a major challenge to select and combine the most suitable components into a QA system, given the characteristics of a question. We study this optimisation problem and train classifiers that take features of a question as input and optimise the selection of QA components based on those features. We then devise a greedy algorithm to identify the pipelines that include the suitable components and can effectively answer the given question. We implement this model within Frankenstein, a QA framework able to select QA components and compose QA pipelines. We evaluate the effectiveness of the pipelines generated by Frankenstein using the QALD and LC-QuAD benchmarks. The results suggest not only that Frankenstein precisely solves the QA optimisation problem but also that it enables the automatic composition of optimised QA pipelines, which outperform the static baseline QA pipeline. Thanks to this flexible and fully automated pipeline generation process, new QA components can be easily included in Frankenstein, thus improving the performance of the generated pipelines.
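    As a rough illustration of the greedy selection step, the sketch below scores each candidate component per task and picks the top-scoring one. The component names echo tools commonly used for these tasks, but the scoring function and features are invented stand-ins for the trained per-component classifiers, not Frankenstein's actual models.

    ```python
    # Greedy QA pipeline composition: for each task, pick the component
    # whose (hypothetical) classifier score on the question features is
    # highest. Scores and features are toy assumptions.

    TASKS = ["NER_NED", "RelationExtraction", "QueryBuilding"]

    COMPONENTS = {
        "NER_NED": ["DBpediaSpotlight", "TagMe"],
        "RelationExtraction": ["ReMatch", "RelMatch"],
        "QueryBuilding": ["SINA", "NLIWOD_QB"],
    }

    def score(component, features):
        # A real system would query a trained classifier here; we fake it
        # with a fixed bias plus one feature, purely for illustration.
        bias = {"DBpediaSpotlight": 0.7, "TagMe": 0.6,
                "ReMatch": 0.5, "RelMatch": 0.4,
                "SINA": 0.6, "NLIWOD_QB": 0.5}[component]
        return bias + 0.1 * features.get("question_length_penalty", 0.0)

    def greedy_pipeline(features):
        """Select the best-scoring component for each task, in pipeline order."""
        return [max(COMPONENTS[task], key=lambda c: score(c, features))
                for task in TASKS]

    print(greedy_pipeline({"question_length_penalty": -0.2}))
    ```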

    Named Entity Resolution in Personal Knowledge Graphs

    Entity Resolution (ER) is the problem of determining when two entities refer to the same underlying entity. The problem has been studied for over 50 years and has recently taken on new importance in an era of large, heterogeneous 'knowledge graphs' published on the Web and used widely in domains as wide-ranging as social media, e-commerce, and search. This chapter discusses the specific problem of named ER in the context of personal knowledge graphs (PKGs). We begin with a formal definition of the problem and the components necessary for doing high-quality and efficient ER. We also discuss some challenges that are expected to arise for Web-scale data. Next, we provide a brief literature review, with a special focus on how existing techniques can potentially apply to PKGs. We conclude the chapter by covering some applications, as well as promising directions for future research. (To appear as a book chapter of the same name in the upcoming (Oct. 2023) book 'Personal Knowledge Graphs (PKGs): Methodology, tools and applications', edited by Tiwari et al.)
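    The sketch below illustrates the standard two-stage ER pipeline such chapters typically build on: cheap blocking to prune the quadratic pair space, then expensive pairwise matching within blocks. The blocking key, similarity measure, and threshold are illustrative assumptions, not the chapter's method.

    ```python
    # Two-stage ER: blocking (cheap key to group plausible pairs), then
    # pairwise matching (expensive similarity on candidates only).
    from itertools import combinations
    from difflib import SequenceMatcher

    records = [
        {"id": 1, "name": "J. Smith",   "city": "Dublin"},
        {"id": 2, "name": "John Smith", "city": "Dublin"},
        {"id": 3, "name": "Jane Doe",   "city": "Cork"},
    ]

    def block_key(r):
        # Blocking: only records sharing this key are ever compared.
        return (r["name"][0].lower(), r["city"].lower())

    def match(a, b, threshold=0.6):
        # Matching: string similarity on names, thresholded.
        return SequenceMatcher(None, a["name"], b["name"]).ratio() >= threshold

    blocks = {}
    for r in records:
        blocks.setdefault(block_key(r), []).append(r)

    matches = [(a["id"], b["id"])
               for block in blocks.values()
               for a, b in combinations(block, 2)
               if match(a, b)]
    print(matches)  # expected: [(1, 2)]
    ```

    For Web-scale data, the design pressure falls almost entirely on the blocking stage, since the pairwise stage is only tractable if blocking keeps candidate sets small.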

    On the centrality of tenure in spatial data systems for coastal/marine management: International exemplars versus emerging practice in Ireland

    Although it is often overlooked, access to data regarding ‘tenure’ is of primary importance in underpinning coastal management and marine spatial planning (MSP). National, regional, and international coastal/marine spatial data management exemplars demonstrate the need for clarity and certainty with respect to legal coastal/marine geographies (i.e. the basis for achieving security of tenure). Good practice in MSP is underpinned by four key pillars: ‘use’, ‘value’, ‘development’, and ‘tenure’ (U, V, D, T). The exemplars demonstrate the importance of currency in the statutory delineation of the coastline, i.e. the High Water Mark (HWM), and of the spatial extent of the ‘coastal zone’ and tenure therein. The National Marine Planning Framework (NMPF), established in Ireland in 2021, provides the foundations for three of the interrelated management pillars (U, V, D), but those relating to ‘tenure’ are largely absent. Coastal/marine management platforms and data gateways have yet to be fully developed to meet emerging marine/offshore obligations, while national data portals remain primarily terrestrial in focus. Early steps to create an MSP ‘one-stop’ web portal (MarinePlan.ie) are rather limited when benchmarked against international exemplars that do include information related to tenure. This is particularly important because the legislation enacting the adoption of the NMPF extends the planning control and marine consent authorisation processes of Irish Coastal Local Authorities (CLAs) to include the nearshore (three nautical miles seaward from the HWM). To achieve MSP targets, information on coastal/marine legal and regulatory interests across the land/sea interface needs to match that currently available in terrestrial settings.

    A data complexity and rewritability tetrachotomy of ontology-mediated queries with a covering axiom

    Aiming to understand the data complexity of answering conjunctive queries mediated by an axiom stating that one class is covered by the union of two other classes, we show that deciding their first-order rewritability is PSPACE-hard, and we obtain a number of sufficient conditions for membership in AC0, L, NL, and P. Our main result is a complete syntactic AC0/NL/P/coNP tetrachotomy of path queries under the assumption that the covering classes are disjoint.
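    For orientation, a covering axiom and a path query mediated by it have the following general shape; this is a generic illustration of the setting, not an example taken from the paper.

    ```latex
    % Covering axiom: class A is covered by the union of classes B and C.
    A \sqsubseteq B \sqcup C
    % A path conjunctive query mediated by this axiom, for instance:
    q(x) \leftarrow B(x) \wedge R(x, y) \wedge C(y)
    % Data complexity: for the fixed ontology-mediated query (axiom, q),
    % how hard is it to decide whether a tuple is a certain answer over
    % an input data instance? The tetrachotomy places each path query q
    % into exactly one of AC0, NL, P, or coNP.
    ```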

    Explainable methods for knowledge graph refinement and exploration via symbolic reasoning

    Knowledge Graphs (KGs) have applications in many domains such as finance, manufacturing, and healthcare. While recent efforts have created large KGs, their content is far from complete and sometimes includes invalid statements. It is therefore crucial to refine constructed KGs, enhancing their coverage and accuracy via KG completion and KG validation. It is also vital to provide human-comprehensible explanations for such refinements, so that humans can trust the KG's quality. Enabling KG exploration, by search and browsing, is likewise essential for users to understand the KG's value and limitations for downstream applications. However, the large size of KGs makes KG exploration very challenging. While the type taxonomy of a KG is a useful asset along these lines, it remains insufficient for deep exploration. In this dissertation, we tackle the aforementioned challenges of KG refinement and KG exploration by combining logical reasoning over the KG with other techniques, such as KG embedding models and text mining; through such combinations, we introduce methods that provide human-understandable output. Concretely, we introduce methods to tackle KG incompleteness by learning exception-aware rules over the existing KG; the learned rules are then used to accurately infer missing links. Furthermore, we propose a framework for constructing human-comprehensible explanations for candidate facts from both KG and text; the extracted explanations are used to ensure the validity of KG facts. Finally, to facilitate KG exploration, we introduce a method that combines KG embeddings with rule mining to compute informative entity clusters with explanations. In detail, the dissertation makes the following contributions:
    ‱ For KG completion, we present ExRuL, a method that revises Horn rules by adding exception conditions to the rule bodies. The revised rules can infer new facts and thereby close gaps in the KG. Experiments on large KGs show that the method substantially reduces errors in inferred facts and delivers user-friendly explanations.
    ‱ With RuLES, we present a rule-learning method based on probabilistic representations of missing facts. It iteratively extends rules induced from a KG by combining neural KG embeddings with information from text corpora, using new rule-quality metrics during rule generation. Experiments show that RuLES substantially improves the quality of the learned rules and of their predictions.
    ‱ To support KG validation, we present ExFaKT, a framework for constructing explanations for candidate facts. The method uses rules to transform a candidate into a set of statements that are easier to find and to validate or refute. The output of ExFaKT is a set of semantic evidences for fact candidates, extracted from text corpora and the KG. Experiments show that the transformations substantially improve the yield and quality of the discovered explanations, and the generated explanations support both manual KG validation by curators and automatic validation.
    ‱ To support KG exploration, we present ExCut, a method that produces informative entity clusters with explanations, using KG embeddings and automatically induced rules. A cluster explanation is a combination of relations between the entities that identifies the cluster. ExCut improves cluster quality and cluster explainability simultaneously by iteratively interleaving the learning of embeddings and rules. Experiments show that ExCut computes high-quality clusters and that the cluster explanations are informative for users.
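    To give a flavour of exception-aware rule learning in the spirit of ExRuL, the sketch below measures a Horn rule's confidence over a toy KG and shows how adding a negated exception atom removes counter-examples. The KG, predicates, and rule are invented for illustration and are not from the dissertation.

    ```python
    # Exception-aware rule revision, toy version: start from the rule
    # livesIn(x, paris) => citizenOf(x, france), then add the exception
    # atom that best removes counter-examples from the rule body.

    KG = {
        ("alice", "livesIn", "paris"), ("alice", "citizenOf", "france"),
        ("bob",   "livesIn", "paris"),  # counter-example: not a citizen...
        ("bob",   "diplomat", "true"),  # ...because he is a diplomat
        ("carol", "livesIn", "paris"), ("carol", "citizenOf", "france"),
    }

    def holds(s, p, o):
        return (s, p, o) in KG

    def confidence(exception=None):
        """Confidence of the rule, optionally with 'not exception(x)' in the body."""
        support = body = 0
        subjects = {s for (s, p, o) in KG if p == "livesIn" and o == "paris"}
        for x in subjects:
            if exception and holds(x, exception, "true"):
                continue  # exception atom filters this binding out
            body += 1
            support += holds(x, "citizenOf", "france")
        return support / body if body else 0.0

    print(confidence())             # 2/3 for the plain Horn rule
    print(confidence("diplomat"))   # 1.0 once the exception is added
    ```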

    Scalable Data Integration for Linked Data

    Linked Data describes an extensive set of structured but heterogeneous data sources where entities are connected by formal semantic descriptions. In the vision of the Semantic Web, these semantic links are extended towards the World Wide Web to provide as much machine-readable data as possible for search queries. The resulting connections allow an automatic evaluation to find new insights into the data. Identifying these semantic connections between two data sources with automatic approaches is called link discovery. We derive common requirements and a generic link discovery workflow based on similarities between entity properties and associated properties of ontology concepts. Most existing link discovery approaches disregard the fact that, in times of Big Data, an increasing volume of data sources poses new demands on link discovery. In particular, the problem of complex and time-consuming link determination escalates with an increasing number of intersecting data sources. To overcome the restriction of pairwise linking of entities, holistic clustering approaches are needed that link equivalent entities of multiple data sources in order to construct integrated knowledge bases. In this context, a focus on efficiency and scalability is essential; for example, reusing existing links or background information can help to avoid redundant calculations. However, when dealing with multiple data sources, additional data quality problems must also be dealt with. This dissertation addresses these comprehensive challenges by designing holistic linking and clustering approaches that enable the reuse of existing links. Unlike previous systems, we execute the complete data integration workflow via a distributed processing system. First, the LinkLion portal is introduced to provide existing links for new applications. These links act as a basis for a physical data integration process that creates a unified representation for equivalent entities from many data sources. We then propose a holistic clustering approach to form consolidated clusters for the same real-world entities from many different sources, exploiting the semantic type of entities to improve the quality of the result. The process identifies errors in existing links and can find numerous additional links. Additionally, the entity clustering has to react to the high dynamics of the data; in particular, this requires scalable approaches for continuously growing data sources with many entities, as well as for additional new sources. Previous entity clustering approaches are mostly static, focusing on the one-time linking and clustering of entities from a few sources. Therefore, we propose and evaluate new approaches for incremental entity clustering that support the continuous addition of new entities and data sources. To cope with the ever-increasing number of Linked Data sources, efficient and scalable methods based on distributed processing systems are required. Thus, we propose distributed holistic approaches that link many data sources based on a clustering of entities representing the same real-world object. The implementation is realized on Apache Flink. In contrast to previous approaches, we utilize efficiency-enhancing optimizations for both distributed static and dynamic clustering. An extensive comparative evaluation of the proposed approaches with various distributed clustering strategies shows high effectiveness for datasets from multiple domains, as well as scalability on a multi-machine Apache Flink cluster.
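    As a small illustration of reusing existing links for holistic clustering, the sketch below treats sameAs-style links as edges and computes connected components with union-find, so equivalent entities from several sources land in one cluster. The link set is a toy assumption (the kind of links a repository such as LinkLion provides), and the dissertation's actual implementation is distributed on Apache Flink rather than single-machine Python.

    ```python
    # Holistic clustering from existing links: union-find over sameAs-style
    # edges groups equivalent entities from many sources into one cluster.

    parent = {}

    def find(x):
        parent.setdefault(x, x)
        while parent[x] != x:
            parent[x] = parent[parent[x]]  # path halving
            x = parent[x]
        return x

    def union(x, y):
        parent[find(x)] = find(y)

    links = [  # toy reused links between three sources
        ("dbpedia:Berlin", "geonames:2950159"),
        ("geonames:2950159", "wikidata:Q64"),
        ("dbpedia:Paris", "wikidata:Q90"),
    ]
    for a, b in links:
        union(a, b)

    clusters = {}
    for entity in parent:
        clusters.setdefault(find(entity), set()).add(entity)
    print(list(clusters.values()))
    # Two clusters: the three Berlin entities, and the two Paris entities.
    ```

    Incremental clustering then amounts to calling union for each newly arriving link instead of recomputing all components from scratch.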