6 research outputs found

StaTIX - statistical type inference on linked data

Author: Cudré-Mauroux Philippe
Khayati Mourad
Lutov Artem
Roshankish Soheil
Publication date: 16/02/2019

Large knowledge bases typically contain data adhering to various schemas with incomplete and/or noisy type information. This seriously complicates further integration and post-processing efforts, as type information is crucial in correctly handling the data. In this paper, we introduce a novel statistical type inference method, called StaTIX, to effectively infer instance types in Linked Data sets in a fully unsupervised manner. Our inference technique leverages a new hierarchical clustering algorithm that is robust, highly effective, and scalable. We introduce a novel approach to reduce the processing complexity of the similarity matrix specifying the relations between various instances in the knowledge base. This approach speeds up the inference process while also improving the correctness of the inferred types due to the noise attenuation in the input data. We further optimize the clustering process by introducing a dedicated hash function that speeds up the inference process by orders of magnitude without negatively affecting its accuracy. Finally, we describe a new technique to identify representative clusters from the multi-scale output of our clustering algorithm to further improve the accuracy of the inferred types. We empirically evaluate our approach on several real-world datasets and compare it to the state of the art. Our results show that StaTIX is more efficient than existing methods (both in terms of speed and memory consumption) as well as more effective. StaTIX reduces the F1-score error of the predicted types by about 40% on average compared to the state of the art and improves the execution time by orders of magnitude.

arXiv.org e-Print Archive

RERO DOC Digital Library

Knowledge Graphs Evolution and Preservation -- A Technical Report from ISWS 2019

Author: Ahmad Sakor
Alba Catalina Morales Tirado
Alessandro Umbrico
Allard Oelen
Amine Dadoun
Aneta Koleva
Anna Nguyen
Ariam Rivas Mendez
Axel Polleres
Bilal Koteich
Chang Sun
Chuangtao Ma
Claudia d'Amato
Eleonora Marzi
Fabio Mariani
Federico Igne
Felix Bensmann
Frances Gillis-Webber
Francesca Alloatti
Francesca Giovannetti
Genet Asefa Gesese
Gianmarco Spinaci
Glenda Amaral
Harald Sack
Harm Delva
Heiko Paulheim
Irene Celino
Ismail Harrando
Ivan Heibi
Jaime Salas
Jan Portisch
John Domingue
Kabul Kurniawan
Kader Pustu-Iren
Kholoud Alghamdi
Laurine Huber
Lientje Maas
Ling Cai
Luigi Asprino
Maheshkumar Mistry
Marc Gallofré Ocaña
Margherita Porena
Marieke van Erp
Martin Beno
Martin Mansfield
Marìa Granados Buey
Meilin Shi
Mengya Liu
Michalis Georgiou
Michel Dumontier
Mohamad Yaser Jaradeh
Molka Tounsi Dhouib
Mortaza Alinam
Nacira Abbas
Neha Keshan
Omaima Fallatah
Paola Espinoza Arias
Riley Capshaw
Russa Biswas
Sebastian Rudolph
Sebastián Ferrada
Sepideh Mesbah
Soheil Roshankish
Stefano De Giorgis
Tabea Tietz
Thomas Schleider
Valentina Anita Carriero
Valentina Pasqual
Valentina Presutti
Viet Bach Nguyen
Vincent Emonet
Vitor Horta
Weiqin Xu
Wouter van den Berg
Publication date: 01/01/2020

One of the grand challenges discussed during the Dagstuhl Seminar "Knowledge Graphs: New Directions for Knowledge Representation on the Semantic Web" and described in its report is that of a: "Public FAIR Knowledge Graph of Everything: We increasingly see the creation of knowledge graphs that capture information about the entirety of a class of entities. [...] This grand challenge extends this further by asking if we can create a knowledge graph of 'everything' ranging from common sense concepts to location based entities. This knowledge graph should be 'open to the public' in a FAIR manner democratizing this mass amount of knowledge." Although linked open data (LOD) is one knowledge graph, it is the closest realisation (and probably the only one) to a public FAIR Knowledge Graph (KG) of everything. Surely, LOD provides a unique testbed for experimenting and evaluating research hypotheses on open and FAIR KG. One of the most neglected FAIR issues about KGs is their ongoing evolution and long term preservation. We want to investigate this problem, that is to understand what preserving and supporting the evolution of KGs means and how these problems can be addressed. Clearly, the problem can be approached from different perspectives and may require the development of different approaches, including new theories, ontologies, metrics, strategies, procedures, etc. This document reports a collaborative effort performed by 9 teams of students, each guided by a senior researcher as their mentor, attending the International Semantic Web Research School (ISWS 2019). Each team provides a different perspective to the problem of knowledge graph evolution substantiated by a set of research questions as the main subject of their investigation. In addition, they provide their working definition for KG preservation and evolution.

Archivio istituzionale della ricerca - Università di Bari