Improving Lexical Choice in Neural Machine Translation
We explore two solutions to the problem of mistranslating rare words in
neural machine translation. First, we argue that the standard output layer,
which computes the inner product of a vector representing the context with all
possible output word embeddings, rewards frequent words disproportionately, and
we propose to fix the norms of both vectors to a constant value. Second, we
integrate a simple lexical module which is jointly trained with the rest of the
model. We evaluate our approaches on eight language pairs with data sizes
ranging from 100k to 8M words, and achieve improvements of up to +4.3 BLEU,
surpassing phrase-based translation in nearly all settings.
Comment: Accepted at NAACL HLT 201
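The first proposal in this abstract, fixing the norms of the context vector and the output word embeddings, can be illustrated with a small sketch. This is only an illustration under stated assumptions, not the authors' implementation: the function names, the toy vectors, and the constant `r = 5.0` are all hypothetical. It shows that a plain inner product favors an embedding with a larger norm, whereas rescaling both vectors to a fixed norm leaves only the angle between them to determine the score.

```python
import math


def norm(v):
    """Euclidean norm of a vector given as a list of floats."""
    return math.sqrt(sum(x * x for x in v))


def dot(u, v):
    """Inner product of two equal-length vectors."""
    return sum(a * b for a, b in zip(u, v))


def rescale(v, r):
    """Rescale v so that its norm equals the constant r."""
    n = norm(v)
    return [r * x / n for x in v]


def standard_logits(context, embeddings):
    """Standard output layer: inner product with every word embedding."""
    return [dot(e, context) for e in embeddings]


def fixnorm_logits(context, embeddings, r=5.0):
    """Sketch of a fixed-norm output layer.

    Assumption: both the context vector and each embedding are rescaled
    to the same constant norm r (a hypothetical value chosen here).
    """
    c = rescale(context, r)
    return [dot(rescale(e, r), c) for e in embeddings]


# Toy data (hypothetical): two embeddings pointing in the same direction,
# one with a large norm (as a frequent word might have) and one small.
context = [1.0, 0.0]
embeddings = [[10.0, 0.0], [0.5, 0.0]]

plain = standard_logits(context, embeddings)
fixed = fixnorm_logits(context, embeddings)
```

With these toy inputs the plain inner product scores the large-norm embedding far higher, while the fixed-norm variant assigns both embeddings the same score because they point in the same direction.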
Dynamic Global Memory for Document-level Argument Extraction
Extracting informative arguments of events from news articles is a
challenging problem in information extraction, which requires a global
contextual understanding of each document. While recent work on document-level
extraction has gone beyond single sentences and increased the cross-sentence
inference capability of end-to-end models, these models are still restricted by
input sequence length constraints and usually ignore the global context between
events. To tackle this issue, we introduce a new global neural generation-based
framework for document-level event argument extraction by constructing a
document memory store to record the contextual event information and leveraging
it to implicitly and explicitly help with decoding of arguments for later
events. Empirical results show that our framework outperforms prior methods
substantially and is more robust to adversarially annotated examples with
our constrained decoding design. (Our code and resources are available at
https://github.com/xinyadu/memory_docie for research purposes.)
Comment: ACL 2022 main conference (12 pages