Search CORE

15,250 research outputs found

Towards Building a Knowledge Base of Monetary Transactions from a News Collection

Author: Balog Krisztian
Benetka Jan R.
Nørvåg Kjetil
Publication venue: 'Institute of Electrical and Electronics Engineers (IEEE)'
Publication date: 17/09/2017
Field of study

We address the problem of extracting structured representations of economic events from a large corpus of news articles, using a combination of natural language processing and machine learning techniques. The developed techniques allow for semi-automatic population of a financial knowledge base, which, in turn, may be used to support a range of data mining and exploration tasks. The key challenge we face in this domain is that the same event is often reported multiple times, with varying correctness of details. We address this challenge by first collecting all information pertinent to a given event from the entire corpus, then considering all possible representations of the event, and finally, using a supervised learning method, to rank these representations by the associated confidence scores. A main innovative element of our approach is that it jointly extracts and stores all attributes of the event as a single representation (quintuple). Using a purpose-built test set we demonstrate that our supervised learning approach can achieve 25% improvement in F1-score over baseline methods that consider the earliest, the latest or the most frequent reporting of the event.Comment: Proceedings of the 17th ACM/IEEE-CS Joint Conference on Digital Libraries (JCDL '17), 201

arXiv.org e-Print Archive

Crossref

Event-based Access to Historical Italian War Memoirs

Author: Nanni Federico
Ponzetto Simone Paolo
Rovera Marco
Publication venue: 'Association for Computing Machinery (ACM)'
Publication date: 01/01/2021
Field of study

The progressive digitization of historical archives provides new, often domain specific, textual resources that report on facts and events which have happened in the past; among these, memoirs are a very common type of primary source. In this paper, we present an approach for extracting information from Italian historical war memoirs and turning it into structured knowledge. This is based on the semantic notions of events, participants and roles. We evaluate quantitatively each of the key-steps of our approach and provide a graph-based representation of the extracted knowledge, which allows to move between a Close and a Distant Reading of the collection.Comment: 23 pages, 6 figure

arXiv.org e-Print Archive

MAnnheim DOCument Server

Deep Semantic Role Labeling with Self-Attention

Author: Chen Yidong
Shi Xiaodong
Tan Zhixing
Wang Mingxuan
Xie Jun
Publication venue
Publication date: 05/12/2017
Field of study

Semantic Role Labeling (SRL) is believed to be a crucial step towards natural language understanding and has been widely studied. Recent years, end-to-end SRL with recurrent neural networks (RNN) has gained increasing attention. However, it remains a major challenge for RNNs to handle structural information and long range dependencies. In this paper, we present a simple and effective architecture for SRL which aims to address these problems. Our model is based on self-attention which can directly capture the relationships between two tokens regardless of their distance. Our single model achieves F

_1=83.4

on the CoNLL-2005 shared task dataset and F

_1=82.7

on the CoNLL-2012 shared task dataset, which outperforms the previous state-of-the-art results by

1.8

and

1.0

_1

score respectively. Besides, our model is computationally efficient, and the parsing speed is 50K tokens per second on a single Titan X GPU.Comment: Accepted by AAAI-201

arXiv.org e-Print Archive

Association for the Advancement of Artificial Intelligence: AAAI Publications