4 research outputs found

Template-based multilingual football reports generation using Wikidata as a knowledge base

Authors: Chris van der Lee, Lorenzo Gatti, Mariët Theune
Publication date: 01/01/2018

This paper presents a new version of PASS, a system that generates football reports. The original version generated Dutch text and relied on a limited handcrafted knowledge base. We describe how, in a short amount of time, we extended PASS to produce English texts, exploiting machine translation and Wikidata as a large-scale source of multilingual knowledge.

Crossref

University of Twente Research Information

Open Access Repository

Tilburg University Repository

N-ary Relation Extraction for Simultaneous T-Box and A-Box Knowledge Base Augmentation

Authors: Dorigatti Emilio, Fossati Marco, Giuliano Claudio
Publication date: 29/06/2018

The Web has evolved into a huge mine of knowledge carved in different forms, the predominant one still being the free-text document. This motivates the need for intelligent Web-reading agents: hypothetically, they would skim through corpora drawn from disparate Web sources and generate meaningful structured assertions to fuel knowledge bases (KBs). Ultimately, comprehensive KBs, like Wikidata and DBpedia, play a fundamental role in coping with information overload. In line with this vision, this paper presents the Fact Extractor, a complete natural language processing (NLP) pipeline which reads an input textual corpus and produces machine-readable statements. Each statement is supplied with a confidence score and undergoes a disambiguation step via entity linking, thus allowing the assignment of KB-compliant URIs. The system implements four research contributions: it (1) executes n-ary relation extraction by applying the frame semantics linguistic theory, as opposed to binary techniques; it (2) simultaneously populates both the T-Box and the A-Box of the target KB; it (3) relies on a single NLP layer, namely part-of-speech tagging; it (4) enables a completely supervised yet reasonably priced machine learning environment through a crowdsourcing strategy. We assess our approach by setting the target KB to DBpedia and by considering a use case of 52,000 Italian Wikipedia soccer player articles. From those, we obtain a dataset of more than 213,000 triples with an estimated 81.27% F1. We corroborate the evaluation via (i) a performance comparison with a baseline system, as well as (ii) an analysis of the T-Box and A-Box augmentation capabilities. The outcomes are incorporated into the Italian DBpedia chapter, can be queried through its SPARQL endpoint, and/or downloaded as standalone data dumps. The codebase is released as free software and is publicly available in the DBpedia association repository.

Crossref

Archivio della ricerca - Fondazione Bruno Kessler

Improving Relation Extraction with Knowledge-attention

Authors: Li Pengfei, Li Qi, Mao Kezhi, Yang Xuefeng
Publication venue: Association for Computational Linguistics (ACL)
Publication date: 01/01/2019

While attention mechanisms have proven effective in many NLP tasks, the majority of them are data-driven. We propose a novel knowledge-attention encoder which incorporates prior knowledge from external lexical resources into deep neural networks for the relation extraction task. Furthermore, we present three effective ways of integrating knowledge-attention with self-attention to maximize the utilization of both knowledge and data. The proposed relation extraction system is end-to-end and fully attention-based. Experimental results show that the proposed knowledge-attention mechanism has complementary strengths with self-attention, and our integrated models outperform existing CNN, RNN, and self-attention based models. State-of-the-art performance is achieved on TACRED, a complex and large-scale relation extraction dataset. Comment: Paper presented at the 2019 Conference on Empirical Methods in Natural Language Processing (EMNLP 2019).

arXiv.org e-Print Archive

Crossref

N-ary relation extraction for simultaneous T-Box and A-Box knowledge base augmentation

Authors: Baker, Das Desai Chen, Fillmore, Fillmore, Gangemi, Giuliano, Hoffart, Lehmann, Màrquez, Presutti, Punyakanok, Vrandecic
Publication venue: IOS Press

Crossref