One for All: Neural Joint Modeling of Entities and Events
Previous work on event extraction has mainly focused on predicting event triggers and argument roles, treating entity mentions as provided by human annotators. This is unrealistic, as entity mentions are usually predicted by existing toolkits whose errors might propagate to event trigger and argument role recognition. A few recent works have addressed this problem by jointly predicting entity mentions, event triggers, and arguments. However, such work is limited to discrete engineered features for representing contextual information for the individual tasks and their interactions. In this work, we propose a novel model that jointly predicts entity mentions, event triggers, and arguments based on shared hidden representations from deep learning. The experiments demonstrate the benefits of the proposed method, which achieves state-of-the-art performance for event extraction.
Comment: Accepted at the Thirty-Third AAAI Conference on Artificial Intelligence (AAAI-19), Honolulu, Hawaii, USA.
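The shared-representation idea can be illustrated with a minimal PyTorch sketch: one encoder feeds three task-specific heads. This is a generic sketch under assumed layer sizes and head designs, not the authors' exact architecture.

```python
import torch
import torch.nn as nn

class JointEventModel(nn.Module):
    """Illustrative sketch: one shared encoder feeding three task heads
    (entity mentions, event triggers, argument roles). Layer sizes and
    head designs are assumptions, not the paper's exact architecture."""

    def __init__(self, vocab_size, embed_dim=100, hidden_dim=200,
                 n_entity_tags=8, n_trigger_tags=34, n_arg_roles=36):
        super().__init__()
        self.embed = nn.Embedding(vocab_size, embed_dim)
        # Shared BiLSTM produces the hidden representations used by all tasks.
        self.encoder = nn.LSTM(embed_dim, hidden_dim, batch_first=True,
                               bidirectional=True)
        self.entity_head = nn.Linear(2 * hidden_dim, n_entity_tags)
        self.trigger_head = nn.Linear(2 * hidden_dim, n_trigger_tags)
        # Argument roles are scored over (trigger, entity) token pairs;
        # here we simply concatenate the two token representations.
        self.arg_head = nn.Linear(4 * hidden_dim, n_arg_roles)

    def forward(self, token_ids):
        h, _ = self.encoder(self.embed(token_ids))  # (batch, seq, 2*hidden)
        entity_logits = self.entity_head(h)
        trigger_logits = self.trigger_head(h)
        # Pairwise combination of every token with every other token.
        b, t, d = h.shape
        pairs = torch.cat([h.unsqueeze(2).expand(b, t, t, d),
                           h.unsqueeze(1).expand(b, t, t, d)], dim=-1)
        arg_logits = self.arg_head(pairs)
        return entity_logits, trigger_logits, arg_logits
```

Because all three heads read the same hidden states, training signals from each task shape a common representation, which is the intuition behind joint modeling.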
Leveraging syntactic parsing to improve event annotation matching
Detecting event mentions is the first step in event extraction from text, and annotating them is a notoriously difficult task. Evaluating annotator consistency is crucial when building datasets for mention detection. When event mentions are allowed to cover many tokens, annotators may disagree on their span, so overlapping annotations may refer to the same event or to different events.
This paper explores different fuzzy matching functions that aim to resolve this ambiguity. The functions extract the sets of syntactic heads present in the annotations, use the Dice coefficient to measure the similarity between the sets, and return a judgment based on a given threshold. The functions are tested against the judgments of a human evaluator, and matching on sets of tokens is compared with matching on sets of syntactic heads. The best-performing function is a head-based function that agrees with the human evaluator in 89% of cases.
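A minimal Python sketch of the head-based matching just described, assuming the sets of syntactic heads have already been extracted by a parser; the threshold value and example head sets are hypothetical.

```python
def dice(a: set, b: set) -> float:
    """Dice coefficient between two sets (1.0 when identical, 0.0 when disjoint)."""
    if not a and not b:
        return 1.0
    return 2 * len(a & b) / (len(a) + len(b))

def same_event(heads_a: set, heads_b: set, threshold: float = 0.8) -> bool:
    """Judge whether two overlapping annotations refer to the same event by
    comparing their sets of syntactic heads. The threshold here is an
    assumption; the paper tunes it against a human evaluator's judgments."""
    return dice(heads_a, heads_b) >= threshold

# Hypothetical example: two annotators marked overlapping spans whose
# extracted head sets share one of two heads.
print(same_event({"attack", "killed"}, {"attack"}))  # dice = 2/3 -> False at 0.8
```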
Abstractive news summarization based on event semantic link network
This paper studies abstractive multi-document summarization for event-oriented news texts through event information extraction and abstract representation. Fine-grained event mentions and the semantic relations between them are extracted to build a unified, connected event semantic link network, an abstract representation of the source texts. A network reduction algorithm is proposed to summarize the most salient and coherent event information. New sentences with good linguistic quality are automatically generated and selected through sentence over-generation and greedy selection. Experimental results on the DUC2006 and DUC2007 datasets show that our system significantly outperforms state-of-the-art extractive and abstractive baselines under both the pyramid and ROUGE evaluation metrics.
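The greedy-selection step can be sketched generically as salience-versus-redundancy selection under a length budget; the scoring functions and parameters below are assumptions, since the paper's actual salience scores come from the event semantic link network, which is not modeled here.

```python
def greedy_select(candidates, salience, redundancy, budget, lam=0.5):
    """Greedily pick sentences maximizing salience minus redundancy with the
    summary so far, under a word budget. A generic sketch, not the paper's
    algorithm: `salience` maps sentence -> score, `redundancy` is a function
    (sentence, selected_list) -> overlap penalty, both assumed given."""
    selected, length = [], 0
    remaining = list(candidates)
    while remaining:
        best = max(remaining,
                   key=lambda s: salience[s] - lam * redundancy(s, selected))
        if length + len(best.split()) > budget:
            remaining.remove(best)  # too long to fit; discard and continue
            continue
        selected.append(best)
        length += len(best.split())
        remaining.remove(best)
    return selected
```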
Optimal Hyperparameters for Deep LSTM-Networks for Sequence Labeling Tasks
Selecting optimal parameters for a neural network architecture can often make the difference between mediocre and state-of-the-art performance. However, little is published about which parameters and design choices should be evaluated or selected, making correct hyperparameter optimization often a "black art that requires expert experiences" (Snoek et al., 2012). In this paper, we evaluate the importance of different network design choices and hyperparameters for five common linguistic sequence tagging tasks (POS, Chunking, NER, Entity Recognition, and Event Detection). We evaluated over 50,000 different setups and found that some parameters, like the pre-trained word embeddings or the last layer of the network, have a large impact on performance, while other parameters, for example the number of LSTM layers or the number of recurrent units, are of minor importance. We give a recommendation on a configuration that performs well across different tasks.
Comment: 34 pages. A 9-page version of this paper was published at EMNLP 2017.