Selective Attention for Context-aware Neural Machine Translation

Authors: Haffari Gholamreza; Martins André F. T.; Maruf Sameen
Publication date: 01/01/2019

Despite the progress made in sentence-level NMT, current systems still fall short at achieving fluent, good quality translation for a full document. Recent works in context-aware NMT consider only a few previous sentences as context and may not scale to entire documents. To this end, we propose a novel and scalable top-down approach to hierarchical attention for context-aware NMT which uses sparse attention to selectively focus on relevant sentences in the document context and then attends to key words in those sentences. We also propose single-level attention approaches based on sentence or word-level information in the context. The document-level context representation, produced from these attention modules, is integrated into the encoder or decoder of the Transformer model depending on whether we use monolingual or bilingual context. Our experiments and evaluation on English-German datasets in different document MT settings show that our selective attention approach not only significantly outperforms context-agnostic baselines but also surpasses context-aware baselines in most cases. Comment: Accepted at NAACL-HLT 201

arXiv.org e-Print Archive

Crossref

Monash University Research Portal
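
The top-down hierarchical attention described in the abstract above can be illustrated with a small sketch. It assumes sparsemax as the sparse attention transform at the sentence level and an ordinary softmax at the word level; the abstract only says "sparse attention", so both choices, along with the function names `sparsemax`, `softmax`, and `hierarchical_weights`, are assumptions made for illustration and are not taken from the paper.

```python
import math
from typing import List


def sparsemax(scores: List[float]) -> List[float]:
    """Euclidean projection of scores onto the probability simplex.

    Unlike softmax, low-scoring entries receive exactly zero weight.
    """
    z_sorted = sorted(scores, reverse=True)
    cumsum = 0.0
    tau = 0.0
    for k, z in enumerate(z_sorted, start=1):
        cumsum += z
        # The support condition holds for a prefix of the sorted scores;
        # the last k that satisfies it fixes the threshold tau.
        if 1.0 + k * z > cumsum:
            tau = (cumsum - 1.0) / k
    return [max(s - tau, 0.0) for s in scores]


def softmax(scores: List[float]) -> List[float]:
    """Numerically stable softmax."""
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def hierarchical_weights(
    sentence_scores: List[float],
    word_scores: List[List[float]],
) -> List[List[float]]:
    """Combine sentence-level and word-level attention (hypothetical helper).

    Each word weight is the sentence's sparse weight multiplied by the
    word's softmax weight inside that sentence, so words in sentences
    that sparsemax drops get zero weight.
    """
    sent_w = sparsemax(sentence_scores)
    combined = []
    for w, ws in zip(sent_w, word_scores):
        if w == 0.0:
            combined.append([0.0] * len(ws))
        else:
            combined.append([w * a for a in softmax(ws)])
    return combined


if __name__ == "__main__":
    # Three context sentences; the third scores too low to be selected.
    weights = hierarchical_weights(
        [0.5, 0.4, -1.0],
        [[1.0, 0.0], [0.0, 0.0, 0.0], [2.0, 1.0]],
    )
    for row in weights:
        print(row)
```

Because the sentence-level weights are sparse, the word-level step only needs to run over the selected sentences, which is one plausible reading of why such an approach could scale to whole documents.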

Skeleton-Based Human Action Recognition with Global Context-Aware Attention LSTM Networks

Authors: Abdiyeva Kamila; Duan Ling-Yu; Kot Alex C.; Liu Jun; Wang Gang
Publication venue: 'Institute of Electrical and Electronics Engineers (IEEE)'
Publication date: 01/01/2017

Human action recognition in 3D skeleton sequences has attracted a lot of research attention. Recently, Long Short-Term Memory (LSTM) networks have shown promising performance in this task due to their strengths in modeling the dependencies and dynamics in sequential data. As not all skeletal joints are informative for action recognition, and the irrelevant joints often bring noise which can degrade the performance, we need to pay more attention to the informative ones. However, the original LSTM network does not have explicit attention ability. In this paper, we propose a new class of LSTM network, Global Context-Aware Attention LSTM (GCA-LSTM), for skeleton based action recognition. This network is capable of selectively focusing on the informative joints in each frame of each skeleton sequence by using a global context memory cell. To further improve the attention capability of our network, we also introduce a recurrent attention mechanism, with which the attention performance of the network can be enhanced progressively. Moreover, we propose a stepwise training scheme in order to train our network effectively. Our approach achieves state-of-the-art performance on five challenging benchmark datasets for skeleton based action recognition.

arXiv.org e-Print Archive

Crossref

DR-NTU (Digital Repository of NTU)

HanoiT: Enhancing Context-aware Translation via Selective Context

Authors: Guo Hongcheng; Huang Haoyang; Li Zhoujun; Ma Shuming; Wei Furu; Yang Jian; Yang Liqun; Yin Yuwei; Zeng Yutao; Zhang Dongdong
Publication venue: 'Springer Science and Business Media LLC'
Publication date: 17/01/2023

Context-aware neural machine translation aims to use the document-level context to improve translation quality. However, not all words in the context are helpful. The irrelevant or trivial words may bring some noise and distract the model from learning the relationship between the current sentence and the auxiliary context. To mitigate this problem, we propose a novel end-to-end encoder-decoder model with a layer-wise selection mechanism to sift and refine the long document context. To verify the effectiveness of our method, extensive experiments and extra quantitative analysis are conducted on four document-level machine translation benchmarks. The experimental results demonstrate that our model significantly outperforms previous models on all datasets via the soft selection mechanism.

arXiv.org e-Print Archive

Controlling Risk of Web Question Answering

Authors: Devlin Jacob; Dunn Matthew; Ferrucci David; Gal Yarin; Geifman Yonatan; Guo Chuan; Lai Guokun; Levy Omer; Malinin Andrey; Nguyen Tri; Richardson Matthew; Vinyals Oriol; Voorhees Ellen M.; Wang Shuohang
Publication venue: 'Association for Computing Machinery (ACM)'
Publication date: 11/07/2019

Web question answering (QA) has become an indispensable component in modern search systems, which can significantly improve users' search experience by providing a direct answer to users' information need. This could be achieved by applying machine reading comprehension (MRC) models over the retrieved passages to extract answers with respect to the search query. With the development of deep learning techniques, state-of-the-art MRC performance has been achieved by recent deep methods. However, existing studies on MRC seldom address the predictive uncertainty issue, i.e., how likely it is that the prediction of an MRC model is wrong, leading to uncontrollable risks in real-world Web QA applications. In this work, we first conduct an in-depth investigation of the risk of Web QA. We then introduce a novel risk control framework, which consists of a qualify model for uncertainty estimation using the probe idea, and a decision model for selective output. For evaluation, we introduce risk-related metrics, rather than the traditional EM and F1 in MRC, for the evaluation of risk-aware Web QA. The empirical results over both the real-world Web QA dataset and the academic MRC benchmark collection demonstrate the effectiveness of our approach. Comment: 42nd International ACM SIGIR Conference on Research and Development in Information Retrieva

arXiv.org e-Print Archive

Crossref
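
The idea of selective output in the Web QA abstract above can be illustrated with a generic risk/coverage calculation from the selective prediction literature. This is a minimal sketch, not the paper's qualify or decision model and not its risk-related metrics: the confidence scores, the fixed threshold, and the function name `risk_and_coverage` are all assumptions made for illustration.

```python
from typing import List, Tuple


def risk_and_coverage(
    confidences: List[float],
    correct: List[bool],
    threshold: float,
) -> Tuple[float, float]:
    """Return (risk, coverage) when answering only above a threshold.

    coverage: fraction of questions the system chooses to answer.
    risk:     fraction of answered questions that are wrong.
    """
    if not confidences:
        raise ValueError("need at least one prediction")
    answered = [ok for c, ok in zip(confidences, correct) if c >= threshold]
    coverage = len(answered) / len(confidences)
    if not answered:
        # Nothing is answered, so no wrong answer is shown.
        return 0.0, coverage
    risk = sum(1 for ok in answered if not ok) / len(answered)
    return risk, coverage


if __name__ == "__main__":
    # Hypothetical confidence scores and correctness labels.
    conf = [0.9, 0.8, 0.4, 0.3]
    ok = [True, False, True, False]
    for t in (0.0, 0.5, 0.85):
        print(t, risk_and_coverage(conf, ok, t))
```

Raising the threshold lowers coverage and, when the confidence estimates are informative, lowers risk as well; sweeping the threshold traces out a risk-coverage curve.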