Towards Building a Knowledge Base of Monetary Transactions from a News Collection
We address the problem of extracting structured representations of economic
events from a large corpus of news articles, using a combination of natural
language processing and machine learning techniques. The developed techniques
allow for semi-automatic population of a financial knowledge base, which, in
turn, may be used to support a range of data mining and exploration tasks. The
key challenge we face in this domain is that the same event is often reported
multiple times, with varying correctness of details. We address this challenge
by first collecting all information pertinent to a given event from the entire
corpus, then considering all possible representations of the event, and
finally using a supervised learning method to rank these representations by
the associated confidence scores. A main innovative element of our approach is
that it jointly extracts and stores all attributes of the event as a single
representation (quintuple). Using a purpose-built test set we demonstrate that
our supervised learning approach can achieve 25% improvement in F1-score over
baseline methods that consider the earliest, the latest or the most frequent
reporting of the event.
Comment: Proceedings of the 17th ACM/IEEE-CS Joint Conference on Digital
Libraries (JCDL '17), 201
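The three baselines named in the abstract above (earliest, latest, and most frequent reporting of an event) can be sketched as follows. This is only an illustrative sketch: the `Report` type, its `day` field, and the contents of the quintuple are hypothetical, since the abstract does not specify the five attributes or how reports are ordered, and the supervised ranking method itself is not reproduced here.

```python
from collections import Counter
from typing import NamedTuple


class Report(NamedTuple):
    """One news report of an event (hypothetical structure)."""
    day: int          # assumed ordering key, e.g. publication day
    quintuple: tuple  # assumed 5-attribute event representation


def earliest(reports):
    """Baseline: trust the first report of the event."""
    return min(reports, key=lambda r: r.day).quintuple


def latest(reports):
    """Baseline: trust the most recent report of the event."""
    return max(reports, key=lambda r: r.day).quintuple


def most_frequent(reports):
    """Baseline: trust the representation reported most often."""
    counts = Counter(r.quintuple for r in reports)
    return counts.most_common(1)[0][0]


# Made-up reports of one event whose amount varies across articles.
reports = [
    Report(1, ("A", "B", "acquires", "10M", "USD")),
    Report(2, ("A", "B", "acquires", "12M", "USD")),
    Report(3, ("A", "B", "acquires", "12M", "USD")),
    Report(4, ("A", "B", "acquires", "11M", "USD")),
]
```

On this toy input the three baselines each pick a different amount, which illustrates why the choice among conflicting reports matters.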
Event-based Access to Historical Italian War Memoirs
The progressive digitization of historical archives provides new, often
domain specific, textual resources that report on facts and events which have
happened in the past; among these, memoirs are a very common type of primary
source. In this paper, we present an approach for extracting information from
Italian historical war memoirs and turning it into structured knowledge. This
is based on the semantic notions of events, participants and roles. We
quantitatively evaluate each of the key steps of our approach and provide a graph-based
representation of the extracted knowledge, which allows readers to move between a Close
and a Distant Reading of the collection.
Comment: 23 pages, 6 figure
Deep Semantic Role Labeling with Self-Attention
Semantic Role Labeling (SRL) is believed to be a crucial step towards natural
language understanding and has been widely studied. In recent years, end-to-end
SRL with recurrent neural networks (RNN) has gained increasing attention.
However, it remains a major challenge for RNNs to handle structural information
and long range dependencies. In this paper, we present a simple and effective
architecture for SRL which aims to address these problems. Our model is based
on self-attention which can directly capture the relationships between two
tokens regardless of their distance. Our single model achieves state-of-the-art
F1 scores on both the CoNLL-2005 and the CoNLL-2012 shared task datasets,
outperforming the previous best results on each. In addition, our model is
computationally efficient, and the parsing speed is 50K tokens per second on a
single Titan X GPU.
Comment: Accepted by AAAI-201
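The claim that self-attention relates two tokens regardless of their distance can be illustrated with a minimal scaled dot-product sketch. It is written under simplifying assumptions and is not the paper's architecture: queries, keys and values are all taken to be the raw token vectors, with none of the learned projections, multiple heads, or position information that a full model would normally add. The function names and the toy input are hypothetical.

```python
import math


def softmax(xs):
    """Numerically stable softmax over a list of floats."""
    m = max(xs)
    exps = [math.exp(x - m) for x in xs]
    total = sum(exps)
    return [e / total for e in exps]


def self_attention(X):
    """Scaled dot-product self-attention with Q = K = V = X.

    X is a list of token vectors (lists of floats of equal length).
    Returns (outputs, weights), where weights[i][j] is how much
    token i attends to token j.
    """
    d = len(X[0])
    outputs, weights = [], []
    for q in X:
        # Score the query against every token in one step, near or far.
        scores = [sum(a * b for a, b in zip(q, k)) / math.sqrt(d) for k in X]
        w = softmax(scores)
        weights.append(w)
        # Each output is the attention-weighted average of all token vectors.
        outputs.append(
            [sum(w[j] * X[j][c] for j in range(len(X))) for c in range(d)]
        )
    return outputs, weights


# Toy sequence: tokens 0 and 5 are identical but five positions apart.
X = [[1.0, 0.0], [0.0, 1.0], [0.0, 1.0], [0.0, 1.0], [0.0, 1.0], [1.0, 0.0]]
outputs, weights = self_attention(X)
```

In this sketch token 0 attends to the distant token 5 exactly as strongly as to itself, because the score depends only on the vectors and not on their positions.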