25 research outputs found
Named entity recognition using a knowledge base: state of the art, statistical aspects, and contributions of Cumuleo.be to entity disambiguation in French-language Belgian news agency dispatches -- Information Science and Technology
Linked Open Data Validity: A Technical Report from ISWS 2018
Linked Open Data (LOD) is the publicly available RDF data on the Web. Each LOD entity is identified by a URI and accessible via HTTP. LOD encodes global-scale knowledge potentially available to any human, as well as to any artificial intelligence, that may want to use it as background knowledge to support its tasks. LOD has emerged as the backbone of applications in diverse fields such as Natural Language Processing, Information Retrieval, Computer Vision, Speech Recognition, and many more. Nevertheless, regardless of the specific tasks that LOD-based tools aim to address, reusing such knowledge may be challenging for diverse reasons, e.g. semantic heterogeneity, provenance, and data quality. As aptly stated by Heath et al., "Linked Data might be outdated, imprecise, or simply wrong"; hence the need to investigate the problem of linked data validity. This work reports a collaborative effort by nine teams of students, guided by an equal number of senior researchers, attending the International Semantic Web Research School (ISWS 2018), to investigate this problem from different perspectives and with different approaches.
On the use of external identifiers to improve data quality in knowledge bases: the case of Wikidata
Using external identifiers to improve Wikidata and its related datasets: state of play and future work
Close-reading of Linked Data: a case study in regards to the quality of online authority files
More and more cultural institutions use Linked Data principles to share and connect their collection metadata. In the archival field, initiatives are emerging to exploit the data contained in archival descriptions and to adapt encoding standards to the Semantic Web. In this context, online authority files can be used to enrich metadata. However, relying on a decentralized network of knowledge bases such as Wikidata, DBpedia or even VIAF has its own difficulties. This paper aims to offer a critical view of these linked authority files by adopting a close-reading approach. Through a practical case study, we intend to identify and illustrate the possibilities and limits of RDF triples compared to institutions’ less structured metadata.
Scrambling for Metadata: Using Topic Modeling and Word2Vec to Explore the Archives of the European Commission
Mining User Queries with Information Extraction Methods and Linked Data
Purpose: Advanced use of web analytics tools makes it possible to capture the content of user queries. Although these queries are highly relevant, analysing large volumes of them manually is impractical. The purpose of this paper is to address the problem of named entity recognition in digital library user queries. Design/methodology/approach: The paper presents a large-scale case study conducted at the Royal Library of Belgium on its online historical newspapers platform BelgicaPress. The object of the study is a data set of 83,854 queries resulting from 29,812 visits over a 12-month period. Using information extraction methods, knowledge bases (KBs) and various authority files, this paper presents the possibilities and limits of identifying what percentage of end users are looking for person and place names. Findings: Based on a quantitative assessment, the method can successfully identify the majority of person and place names in user queries. Due to the specific character of user queries and the nature of the KBs used, a limited number of queries remained too ambiguous to be treated in an automated manner. Originality/value: This paper demonstrates empirically how user queries can be extracted from a web analytics tool and how named entities can then be mapped to KBs and authority files in order to facilitate automated analysis of their content. The methods and tools used are generalisable and can be reused by other collection holders.