724 research outputs found
A convolutional attentional neural network for sentiment classification
Neural network models with attention mechanisms have shown their effectiveness on various tasks. However, there is little research on attention mechanisms for text classification, and existing attention models for text classification lack cognitive intuition and mathematical explanation. In this paper, we propose a new neural network architecture based on the attention model for text classification. In particular, we show that the convolutional neural network (CNN) is a mathematically reasonable model for extracting attention from text sequences. We then propose a novel attention model based on CNN and introduce a new network architecture that combines a recurrent neural network with our CNN-based attention model. Experimental results on five datasets show that our proposed models can accurately capture the salient parts of sentences and thereby improve the performance of text classification.
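To make the idea concrete, here is a minimal PyTorch sketch of a convolution-based attention layer combined with a recurrent encoder: a 1-D convolution over the embedded sequence produces one unnormalized attention score per position, and the softmax-normalized scores weight the RNN states into a single sentence vector for classification. The GRU encoder, the layer sizes, and the kernel width are illustrative assumptions, not the paper's exact configuration.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class ConvAttentionClassifier(nn.Module):
    """Sketch: RNN sentence encoder with convolution-scored attention.
    All hyperparameters are illustrative assumptions."""
    def __init__(self, vocab_size, embed_dim=100, hidden_dim=128,
                 kernel_size=3, num_classes=2):
        super().__init__()
        self.embed = nn.Embedding(vocab_size, embed_dim)
        self.rnn = nn.GRU(embed_dim, hidden_dim, batch_first=True,
                          bidirectional=True)
        # A 1-D convolution over the embeddings yields one attention
        # score per token position.
        self.score_conv = nn.Conv1d(embed_dim, 1, kernel_size,
                                    padding=kernel_size // 2)
        self.out = nn.Linear(2 * hidden_dim, num_classes)

    def forward(self, token_ids):                # (batch, seq_len)
        emb = self.embed(token_ids)              # (batch, seq_len, embed_dim)
        states, _ = self.rnn(emb)                # (batch, seq_len, 2*hidden)
        scores = self.score_conv(emb.transpose(1, 2)).squeeze(1)
        weights = F.softmax(scores, dim=1).unsqueeze(2)
        context = (weights * states).sum(dim=1)  # attention-weighted summary
        return self.out(context)                 # class logits
```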
Attentional Encoder Network for Targeted Sentiment Classification
Targeted sentiment classification aims to determine the sentiment polarity towards specific targets. Most previous approaches model context and target words with RNNs and attention. However, RNNs are difficult to parallelize, and truncated backpropagation through time makes it hard to capture long-term patterns. To address this issue, this paper proposes an Attentional Encoder Network (AEN) which eschews recurrence and employs attention-based encoders to model the interaction between context and target. We raise the label unreliability issue and introduce label smoothing regularization. We also apply pre-trained BERT to this task and obtain new state-of-the-art results. Experiments and analysis demonstrate that our model is effective and lightweight.
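As a concrete illustration of the label smoothing regularization the abstract mentions, the sketch below computes cross-entropy against a softened target distribution that mixes the one-hot label with a uniform prior over classes. The smoothing factor and the uniform prior follow the standard formulation and are assumptions here, not necessarily the paper's exact setting.

```python
import torch
import torch.nn.functional as F

def label_smoothing_loss(logits, targets, epsilon=0.1):
    """Cross-entropy against (1 - eps) * one_hot + eps / num_classes,
    which penalizes over-confident predictions on unreliable labels."""
    num_classes = logits.size(-1)
    log_probs = F.log_softmax(logits, dim=-1)
    one_hot = F.one_hot(targets, num_classes).float()
    smoothed = (1.0 - epsilon) * one_hot + epsilon / num_classes
    return -(smoothed * log_probs).sum(dim=-1).mean()
```

A call such as label_smoothing_loss(model_logits, gold_labels) would then stand in for the usual cross-entropy term during training.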
Self-Attention Networks for Connectionist Temporal Classification in Speech Recognition
The success of self-attention in NLP has led to recent applications in end-to-end encoder-decoder architectures for speech recognition. Separately, connectionist temporal classification (CTC) has matured as an alignment-free, non-autoregressive approach to sequence transduction, either by itself or in various multitask and decoding frameworks. We propose SAN-CTC, a deep, fully self-attentional network for CTC, and show it is tractable and competitive for end-to-end speech recognition. SAN-CTC trains quickly and outperforms existing CTC models and most encoder-decoder models, with character error rates (CERs) of 4.7% in 1 day on WSJ eval92 and 2.8% in 1 week on LibriSpeech test-clean, with a fixed architecture and one GPU. Similar improvements hold for WERs after LM decoding. We motivate the architecture for speech, evaluate position and downsampling approaches, and explore how label alphabets (character, phoneme, subword) affect attention heads and performance.
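The following is a hedged sketch of the general recipe described here: a purely self-attentional encoder over acoustic features whose per-frame outputs are trained with the CTC loss. It uses PyTorch's stock TransformerEncoder and CTCLoss; the layer sizes, the omission of positional encoding and downsampling, and the dummy feature shapes are illustrative assumptions, not the SAN-CTC architecture itself.

```python
import torch
import torch.nn as nn

class SelfAttentionCTCModel(nn.Module):
    """Sketch: self-attention encoder emitting per-frame label posteriors
    for CTC training. Sizes are assumptions; positional encoding and
    frame downsampling are omitted for brevity."""
    def __init__(self, feat_dim=80, d_model=256, nhead=4,
                 num_layers=6, vocab_size=32):   # label index 0 = CTC blank
        super().__init__()
        self.proj = nn.Linear(feat_dim, d_model)
        layer = nn.TransformerEncoderLayer(d_model, nhead, batch_first=True)
        self.encoder = nn.TransformerEncoder(layer, num_layers)
        self.out = nn.Linear(d_model, vocab_size)

    def forward(self, feats):                    # (batch, frames, feat_dim)
        h = self.encoder(self.proj(feats))
        return self.out(h).log_softmax(dim=-1)   # (batch, frames, vocab)

model = SelfAttentionCTCModel()
ctc = nn.CTCLoss(blank=0)
feats = torch.randn(2, 100, 80)                  # dummy filterbank features
log_probs = model(feats).transpose(0, 1)         # CTCLoss expects (T, N, C)
targets = torch.randint(1, 32, (2, 20))          # dummy label sequences
loss = ctc(log_probs, targets,
           input_lengths=torch.full((2,), 100),
           target_lengths=torch.full((2,), 20))
```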
Convolution-based neural attention with applications to sentiment classification
Neural attention mechanisms have achieved many successes in various natural language processing tasks. However, existing neural attention models based on densely connected networks are only loosely related to the attention mechanism found in psychology and neuroscience. Motivated by the finding in neuroscience that humans possess a template-searching attention mechanism, we propose to use the convolution operation to simulate attention and give a mathematical explanation of our neural attention model. We then introduce a new network architecture, which combines a recurrent neural network with our convolution-based attention model, and further stack an attention-based neural model on top to build a hierarchical sentiment classification model. The experimental results show that our proposed models can capture the salient parts of the text and thereby improve the performance of sentiment classification at both the sentence level and the document level.
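To illustrate the hierarchical stacking described above, the sketch below reuses the convolution-scored attention idea: a word-level encoder compresses each sentence into one attention-weighted vector, and a second encoder of the same form attends over the sentence vectors to produce a document representation for classification. The GRU encoders and all dimensions are assumptions for illustration, not the paper's configuration.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class ConvAttentionEncoder(nn.Module):
    """Encode a sequence of vectors into one attention-weighted vector,
    with a 1-D convolution producing the attention scores."""
    def __init__(self, in_dim, hidden_dim, kernel_size=3):
        super().__init__()
        self.rnn = nn.GRU(in_dim, hidden_dim, batch_first=True,
                          bidirectional=True)
        self.score_conv = nn.Conv1d(in_dim, 1, kernel_size,
                                    padding=kernel_size // 2)

    def forward(self, x):                          # (batch, steps, in_dim)
        states, _ = self.rnn(x)                    # (batch, steps, 2*hidden)
        scores = self.score_conv(x.transpose(1, 2)).squeeze(1)
        weights = F.softmax(scores, dim=1).unsqueeze(2)
        return (weights * states).sum(dim=1)       # (batch, 2*hidden)

class HierarchicalSentimentModel(nn.Module):
    """Word-level attention per sentence, then sentence-level attention
    over the resulting sentence vectors."""
    def __init__(self, vocab_size, embed_dim=100, hidden_dim=64,
                 num_classes=2):
        super().__init__()
        self.embed = nn.Embedding(vocab_size, embed_dim)
        self.word_enc = ConvAttentionEncoder(embed_dim, hidden_dim)
        self.sent_enc = ConvAttentionEncoder(2 * hidden_dim, hidden_dim)
        self.out = nn.Linear(2 * hidden_dim, num_classes)

    def forward(self, doc_ids):        # (batch, sentences, words)
        b, s, w = doc_ids.shape
        words = self.embed(doc_ids.reshape(b * s, w))
        sent_vecs = self.word_enc(words).reshape(b, s, -1)
        doc_vec = self.sent_enc(sent_vecs)
        return self.out(doc_vec)       # document-level class logits
```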