17,553 research outputs found
Selective Attention for Context-aware Neural Machine Translation
Despite the progress made in sentence-level NMT, current systems still fall
short at achieving fluent, good quality translation for a full document. Recent
works in context-aware NMT consider only a few previous sentences as context
and may not scale to entire documents. To this end, we propose a novel and
scalable top-down approach to hierarchical attention for context-aware NMT
which uses sparse attention to selectively focus on relevant sentences in the
document context and then attends to key words in those sentences. We also
propose single-level attention approaches based on sentence or word-level
information in the context. The document-level context representation, produced
from these attention modules, is integrated into the encoder or decoder of the
Transformer model depending on whether we use monolingual or bilingual context.
Our experiments and evaluation on English-German datasets in different document
MT settings show that our selective attention approach not only significantly
outperforms context-agnostic baselines but also surpasses context-aware
baselines in most cases.Comment: Accepted at NAACL-HLT 201
Skeleton-Based Human Action Recognition with Global Context-Aware Attention LSTM Networks
Human action recognition in 3D skeleton sequences has attracted a lot of
research attention. Recently, Long Short-Term Memory (LSTM) networks have shown
promising performance in this task due to their strengths in modeling the
dependencies and dynamics in sequential data. As not all skeletal joints are
informative for action recognition, and the irrelevant joints often bring noise
which can degrade the performance, we need to pay more attention to the
informative ones. However, the original LSTM network does not have explicit
attention ability. In this paper, we propose a new class of LSTM network,
Global Context-Aware Attention LSTM (GCA-LSTM), for skeleton based action
recognition. This network is capable of selectively focusing on the informative
joints in each frame of each skeleton sequence by using a global context memory
cell. To further improve the attention capability of our network, we also
introduce a recurrent attention mechanism, with which the attention performance
of the network can be enhanced progressively. Moreover, we propose a stepwise
training scheme in order to train our network effectively. Our approach
achieves state-of-the-art performance on five challenging benchmark datasets
for skeleton based action recognition
HanoiT: Enhancing Context-aware Translation via Selective Context
Context-aware neural machine translation aims to use the document-level
context to improve translation quality. However, not all words in the context
are helpful. The irrelevant or trivial words may bring some noise and distract
the model from learning the relationship between the current sentence and the
auxiliary context. To mitigate this problem, we propose a novel end-to-end
encoder-decoder model with a layer-wise selection mechanism to sift and refine
the long document context. To verify the effectiveness of our method, extensive
experiments and extra quantitative analysis are conducted on four
document-level machine translation benchmarks. The experimental results
demonstrate that our model significantly outperforms previous models on all
datasets via the soft selection mechanism
Controlling Risk of Web Question Answering
Web question answering (QA) has become an indispensable component in modern
search systems, which can significantly improve users' search experience by
providing a direct answer to users' information need. This could be achieved by
applying machine reading comprehension (MRC) models over the retrieved passages
to extract answers with respect to the search query. With the development of
deep learning techniques, state-of-the-art MRC performances have been achieved
by recent deep methods. However, existing studies on MRC seldom address the
predictive uncertainty issue, i.e., how likely the prediction of an MRC model
is wrong, leading to uncontrollable risks in real-world Web QA applications. In
this work, we first conduct an in-depth investigation over the risk of Web QA.
We then introduce a novel risk control framework, which consists of a qualify
model for uncertainty estimation using the probe idea, and a decision model for
selectively output. For evaluation, we introduce risk-related metrics, rather
than the traditional EM and F1 in MRC, for the evaluation of risk-aware Web QA.
The empirical results over both the real-world Web QA dataset and the academic
MRC benchmark collection demonstrate the effectiveness of our approach.Comment: 42nd International ACM SIGIR Conference on Research and Development
in Information Retrieva
- …