Aspect Based Sentiment Analysis with Gated Convolutional Networks
Aspect based sentiment analysis (ABSA) can provide more detailed information
than general sentiment analysis, because it aims to predict the sentiment
polarities of the given aspects or entities in text. We divide previous
approaches into two subtasks: aspect-category sentiment analysis (ACSA) and
aspect-term sentiment analysis (ATSA). Most previous approaches employ long
short-term memory networks and attention mechanisms to predict the sentiment
polarity of the targets in question; these models are often complicated and
require long training times.
We propose a model based on convolutional neural networks and gating
mechanisms, which is more accurate and efficient. First, the novel Gated
Tanh-ReLU Units can selectively output the sentiment features according to the
given aspect or entity. This architecture is much simpler than the attention
layers used in existing models. Second, the computations of our model can be
easily parallelized during training, because convolutional layers have no
time dependencies, unlike LSTM layers, and the gating units also work independently.
Experiments on SemEval datasets demonstrate the efficiency and
effectiveness of our models.
Comment: Accepted in ACL 2018
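As a rough illustration, a Gated Tanh-ReLU Unit over convolutional features might look like the following sketch, assuming the aspect embedding is projected and added into the ReLU gate branch; the class name, shapes, and conditioning scheme are illustrative assumptions rather than the authors' code.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class GTRUConv(nn.Module):
    """Sketch of a Gated Tanh-ReLU Unit over 1-D convolutions (assumed shapes)."""
    def __init__(self, emb_dim, n_filters, kernel_size, aspect_dim):
        super().__init__()
        # one convolution produces sentiment features, the other gate inputs
        self.conv_s = nn.Conv1d(emb_dim, n_filters, kernel_size, padding=kernel_size // 2)
        self.conv_g = nn.Conv1d(emb_dim, n_filters, kernel_size, padding=kernel_size // 2)
        self.aspect_proj = nn.Linear(aspect_dim, n_filters)

    def forward(self, x, aspect):
        # x: (batch, emb_dim, seq_len); aspect: (batch, aspect_dim)
        s = torch.tanh(self.conv_s(x))             # sentiment features
        a = self.aspect_proj(aspect).unsqueeze(2)  # (batch, n_filters, 1)
        g = F.relu(self.conv_g(x) + a)             # aspect-conditioned gate
        return s * g                               # gated sentiment features
```

Because every convolution and gate is position-wise, the whole layer runs in parallel over the sequence, which is the efficiency argument made above.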
Gated Convolutional Neural Networks for Domain Adaptation
Domain adaptation explores how to maximize performance on a target domain
that is distinct from the source domain on which the classifier was trained.
This idea has been explored extensively for the task of sentiment analysis.
Training on reviews from one domain and evaluating on another is widely
studied as a way of building domain-independent models, and it further helps
in understanding the correlation between domains. In this paper, we show that
Gated Convolutional Neural Networks (GCN) perform effectively at learning
sentiment analysis in a manner where domain-dependent knowledge is filtered
out by their gates. We perform our experiments on multiple gate
architectures: Gated Tanh ReLU Unit (GTRU), Gated Tanh Unit (GTU) and Gated
Linear Unit (GLU). Extensive experimentation on two standard datasets
relevant to the task reveals that training with Gated Convolutional Neural
Networks gives significantly better performance on target domains than
regular convolutional and recurrent architectures. While complex
architectures such as attention also filter out domain-specific knowledge,
their computational complexity is remarkably high compared to gated
architectures. GCNs rely on convolutions and hence gain an upper hand through
parallelization.
Comment: Accepted Long Paper at the 24th International Conference on
Applications of Natural Language to Information Systems, June 2019,
MediaCityUK Campus, United Kingdom
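For reference, the three gate variants compared here reduce to different element-wise combinations of two convolutional feature maps; a minimal sketch follows, with the placement of the gates inside the network left as an assumption.

```python
import torch

# a and b are two convolutional feature maps of identical shape (assumed)

def gtu(a, b):   # Gated Tanh Unit
    return torch.tanh(a) * torch.sigmoid(b)

def glu(a, b):   # Gated Linear Unit
    return a * torch.sigmoid(b)

def gtru(a, b):  # Gated Tanh-ReLU Unit
    return torch.tanh(a) * torch.relu(b)
```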
Deep Learning for Sentiment Analysis: A Survey
Deep learning has emerged as a powerful machine learning technique that
learns multiple layers of representations or features of the data and produces
state-of-the-art prediction results. Alongside its success in many other
application domains, deep learning has also become widely used for sentiment
analysis in recent years. This paper first gives an overview of deep learning
and then provides a comprehensive survey of its current applications in
sentiment analysis.
Comment: 34 pages, 9 figures, 2 tables
Multimodal Sentiment Analysis with Word-Level Fusion and Reinforcement Learning
With the increasing popularity of video sharing websites such as YouTube and
Facebook, multimodal sentiment analysis has received increasing attention from
the scientific community. In contrast to previous work in multimodal
sentiment analysis, which focuses on holistic information in speech segments
such as bag-of-words representations and average facial expression intensity,
we develop a
novel deep architecture for multimodal sentiment analysis that performs
modality fusion at the word level. In this paper, we propose the Gated
Multimodal Embedding LSTM with Temporal Attention (GME-LSTM(A)) model that is
composed of 2 modules. The Gated Multimodal Embedding alleviates the
difficulties of fusion when there are noisy modalities. The LSTM with Temporal
Attention performs word level fusion at a finer fusion resolution between input
modalities and attends to the most important time steps. As a result, the
GME-LSTM(A) is able to better model the multimodal structure of speech
through time and achieve stronger sentiment comprehension. We demonstrate the
effectiveness of this approach on the publicly-available Multimodal Corpus of
Sentiment Intensity and Subjectivity Analysis (CMU-MOSI) dataset by achieving
state-of-the-art sentiment classification and regression results. Qualitative
analysis on our model emphasizes the importance of the Temporal Attention Layer
in sentiment prediction because the additional acoustic and visual modalities
are noisy. We also demonstrate the effectiveness of the Gated Multimodal
Embedding in selectively filtering these noisy modalities out. Our results and
analysis open new areas in the study of sentiment analysis in human
communication and provide new models for multimodal fusion.
Comment: ICMI 2017 Oral Presentation, Honorable Mention Award
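A minimal sketch of the word-level gated fusion idea, assuming each non-verbal modality is scaled by a learned sigmoid gate before being concatenated with the word embedding; dimensions, the gate form, and all names are illustrative assumptions, not the paper's exact model.

```python
import torch
import torch.nn as nn

class GatedModalityFusion(nn.Module):
    """Sketch: gate noisy acoustic/visual features before word-level fusion."""
    def __init__(self, text_dim, audio_dim, visual_dim):
        super().__init__()
        self.gate_a = nn.Linear(text_dim + audio_dim, 1)
        self.gate_v = nn.Linear(text_dim + visual_dim, 1)

    def forward(self, t, a, v):
        # t, a, v: per-word features, each (batch, seq_len, dim)
        ga = torch.sigmoid(self.gate_a(torch.cat([t, a], dim=-1)))  # suppress noisy audio
        gv = torch.sigmoid(self.gate_v(torch.cat([t, v], dim=-1)))  # suppress noisy video
        return torch.cat([t, ga * a, gv * v], dim=-1)  # fused word-level input to the LSTM
```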
Left-Center-Right Separated Neural Network for Aspect-based Sentiment Analysis with Rotatory Attention
Deep learning techniques have achieved success in aspect-based sentiment
analysis in recent years. However, there are two important issues that still
remain to be further studied, i.e., 1) how to efficiently represent the
target, especially when it contains multiple words, and 2) how to exploit the
interaction between the target and its left/right contexts to capture the
most important words in each. In this paper, we propose an approach, called
left-center-right separated neural network with rotatory attention (LCR-Rot),
to better address the two problems. Our approach has two characteristics: 1) it
has three separated LSTMs, i.e., left, center and right LSTMs, corresponding to
three parts of a review (left context, target phrase and right context); 2) it
has a rotatory attention mechanism which models the relation between target and
left/right contexts. The target2context attention is used to capture the most
indicative sentiment words in left/right contexts. Subsequently, the
context2target attention is used to capture the most important word in the
target. This leads to a two-side representation of the target: left-aware
target and right-aware target. We compare our approach on three benchmark
datasets against ten recently proposed methods. The results show that our
approach significantly outperforms the state-of-the-art techniques.
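One plausible reading of the rotatory attention is sketched below, under the assumptions of bilinear scoring and a mean-pooled initial target summary; both are illustrative choices, not necessarily the paper's exact equations.

```python
import torch
import torch.nn as nn

class RotatoryAttention(nn.Module):
    """Sketch: target2context then context2target attention for one side."""
    def __init__(self, hidden_dim):
        super().__init__()
        self.W_t2c = nn.Parameter(torch.randn(hidden_dim, hidden_dim) * 0.01)
        self.W_c2t = nn.Parameter(torch.randn(hidden_dim, hidden_dim) * 0.01)

    def attend(self, query, keys, W):
        # query: (batch, hidden); keys: (batch, seq, hidden)
        scores = torch.einsum('bh,hk,bsk->bs', query, W, keys)
        alpha = torch.softmax(scores, dim=1)
        return torch.einsum('bs,bsh->bh', alpha, keys)

    def forward(self, context_h, target_h):
        target_pool = target_h.mean(dim=1)                     # initial target summary
        ctx = self.attend(target_pool, context_h, self.W_t2c)  # target2context
        side_aware_target = self.attend(ctx, target_h, self.W_c2t)  # context2target
        return ctx, side_aware_target
```

Running this once over the left context and once over the right context yields the left-aware and right-aware target representations described above.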
Aspect-Based Relational Sentiment Analysis Using a Stacked Neural Network Architecture
Sentiment analysis can be regarded as a relation extraction problem in which
the sentiment of some opinion holder towards a certain aspect of a product,
theme or event needs to be extracted. We present a novel neural architecture
for sentiment analysis as a relation extraction problem that addresses this
problem by dividing it into three subtasks: i) identification of aspect and
opinion terms, ii) labeling of opinion terms with a sentiment, and iii)
extraction of relations between opinion terms and aspect terms. For each
subtask, we propose a neural network based component and combine all of them
into a complete system for relational sentiment analysis. The component for
aspect and opinion term extraction is a hybrid architecture consisting of a
recurrent neural network stacked on top of a convolutional neural network. This
approach outperforms a standard convolutional deep neural architecture as well
as a recurrent network architecture and performs competitively compared to
other methods on two datasets of annotated customer reviews. To extract
sentiments for individual opinion terms, we propose a recurrent architecture in
combination with word distance features and achieve promising results,
outperforming a majority baseline by 18% accuracy and providing the first
results for the USAGE dataset. Our relation extraction component outperforms
the current state-of-the-art in aspect-opinion relation extraction by 15%
F-Measure.
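A minimal sketch of the stacked component for aspect and opinion term extraction, assuming a bidirectional recurrent layer over convolutional features with per-token tag logits; the GRU choice, tag set, and dimensions are assumptions consistent with the abstract, not the authors' code.

```python
import torch
import torch.nn as nn

class ConvRNNTagger(nn.Module):
    """Sketch: recurrent network stacked on a convolutional layer for tagging."""
    def __init__(self, emb_dim, conv_dim, rnn_dim, n_tags):
        super().__init__()
        self.conv = nn.Conv1d(emb_dim, conv_dim, kernel_size=3, padding=1)
        self.rnn = nn.GRU(conv_dim, rnn_dim, batch_first=True, bidirectional=True)
        self.out = nn.Linear(2 * rnn_dim, n_tags)

    def forward(self, x):
        # x: (batch, seq_len, emb_dim) word embeddings
        c = torch.relu(self.conv(x.transpose(1, 2))).transpose(1, 2)
        h, _ = self.rnn(c)   # recurrent layer stacked on conv features
        return self.out(h)   # (batch, seq_len, n_tags) per-token tag logits
```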
Long Short-Term Attention
Attention is an important cognition process of humans, which helps humans
concentrate on critical information during their perception and learning.
However, although many machine learning models can remember information in
the data, they lack an attention mechanism. For example, the long short-term
memory (LSTM) network is able to remember sequential information, but it
cannot pay special attention to parts of the sequences. In this paper, we
present a novel model called long short-term attention (LSTA), which
seamlessly integrates the attention mechanism into the inner cell of LSTM.
Beyond processing long- and short-term dependencies, LSTA can focus on
important information in the sequences with its attention mechanism.
Extensive experiments demonstrate that LSTA outperforms LSTM and related
models on sequence learning tasks.
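A heavily hedged sketch of what folding attention into the recurrent cell could look like: a standard LSTM cell augmented with an extra attention gate computed from the input and previous hidden state. This is an illustrative guess at the general idea, not the paper's actual LSTA cell equations.

```python
import torch
import torch.nn as nn

class AttentiveLSTMCell(nn.Module):
    """Illustrative guess: an LSTM cell with an added attention gate."""
    def __init__(self, input_dim, hidden_dim):
        super().__init__()
        self.cell = nn.LSTMCell(input_dim, hidden_dim)
        self.attn_gate = nn.Linear(input_dim + hidden_dim, hidden_dim)

    def forward(self, x, state):
        h, c = state
        # attention gate decides which hidden dimensions deserve focus (assumed form)
        a = torch.softmax(self.attn_gate(torch.cat([x, h], dim=-1)), dim=-1)
        h_new, c_new = self.cell(x, (h, c))
        return h_new * a, c_new  # attention-focused hidden state
```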
Reasoning with Sarcasm by Reading In-between
Sarcasm is a sophisticated speech act which commonly manifests on social
communities such as Twitter and Reddit. The prevalence of sarcasm on the
social web is highly disruptive to opinion mining systems, due not only to
its tendency to flip polarity but also to its use of figurative language. Sarcasm commonly
manifests with a contrastive theme either between positive-negative sentiments
or between literal-figurative scenarios. In this paper, we revisit the notion
of modeling contrast in order to reason with sarcasm. More specifically, we
propose an attention-based neural model that looks in-between instead of
across, enabling it to explicitly model contrast and incongruity. We conduct
extensive experiments on six benchmark datasets from Twitter, Reddit and the
Internet Argument Corpus. Our proposed model not only achieves state-of-the-art
performance on all datasets but also enjoys improved interpretability.
Comment: Accepted to ACL 2018
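A minimal sketch of "looking in-between": an intra-attention that scores every word pair within the same sentence, so that contrastive pairs (e.g., a positive word against a negative one) can dominate the representation. The pair-scoring function and max-pooling over pairs are assumptions loosely following the abstract, not the paper's exact model.

```python
import torch
import torch.nn as nn

class IntraAttention(nn.Module):
    """Sketch: attend over word pairs inside one sentence to model contrast."""
    def __init__(self, emb_dim):
        super().__init__()
        self.score = nn.Linear(2 * emb_dim, 1)

    def forward(self, x):
        # x: (batch, seq, dim); build all word pairs (i, j)
        b, n, d = x.shape
        xi = x.unsqueeze(2).expand(b, n, n, d)
        xj = x.unsqueeze(1).expand(b, n, n, d)
        s = self.score(torch.cat([xi, xj], dim=-1)).squeeze(-1)  # (b, n, n)
        # exclude each word's pairing with itself
        eye = torch.eye(n, dtype=torch.bool, device=x.device)
        s = s.masked_fill(eye, float('-inf'))
        # each word's weight is its strongest pairwise (in-between) score
        a = torch.softmax(s.max(dim=2).values, dim=1)            # (b, n)
        return torch.einsum('bn,bnd->bd', a, x)                  # attended sentence vector
```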
Efficient Character-level Document Classification by Combining Convolution and Recurrent Layers
Document classification tasks have primarily been tackled at the word level. Recent
research that works with character-level inputs shows several benefits over
word-level approaches such as natural incorporation of morphemes and better
handling of rare words. We propose a neural network architecture that utilizes
both convolution and recurrent layers to efficiently encode character inputs.
We validate the proposed model on eight large scale document classification
tasks and compare with character-level convolution-only models. It achieves
comparable performance with far fewer parameters.
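A minimal sketch of the conv-then-recurrent character encoder, assuming convolution plus max-pooling shortens the character sequence before a single LSTM reads it; vocabulary size, filter width, and the pooling factor are illustrative assumptions.

```python
import torch
import torch.nn as nn

class ConvRecDocClassifier(nn.Module):
    """Sketch: convolve and pool characters, then run one recurrent layer."""
    def __init__(self, n_chars, emb_dim, n_filters, rnn_dim, n_classes):
        super().__init__()
        self.emb = nn.Embedding(n_chars, emb_dim)
        self.conv = nn.Conv1d(emb_dim, n_filters, kernel_size=5, padding=2)
        self.pool = nn.MaxPool1d(kernel_size=4)  # 4x shorter sequence for the RNN
        self.rnn = nn.LSTM(n_filters, rnn_dim, batch_first=True)
        self.out = nn.Linear(rnn_dim, n_classes)

    def forward(self, chars):
        # chars: (batch, seq_len) character ids
        x = self.emb(chars).transpose(1, 2)      # (batch, emb_dim, seq_len)
        x = self.pool(torch.relu(self.conv(x)))  # convolve, then downsample
        _, (h, _) = self.rnn(x.transpose(1, 2))  # recurrent layer over conv features
        return self.out(h[-1])                   # class logits
```

Pooling before the recurrent layer is what keeps the parameter count and runtime low relative to convolution-only character models of similar depth.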
A Structured Self-attentive Sentence Embedding
This paper proposes a new model for extracting an interpretable sentence
embedding by introducing self-attention. Instead of using a vector, we use a
2-D matrix to represent the embedding, with each row of the matrix attending on
a different part of the sentence. We also propose a self-attention mechanism
and a special regularization term for the model. As a side effect, the
embedding comes with an easy way of visualizing what specific parts of the
sentence are encoded into the embedding. We evaluate our model on 3 different
tasks: author profiling, sentiment classification, and textual entailment.
Results show that our model yields a significant performance gain compared to
other sentence embedding methods in all of the 3 tasks.
Comment: 15 pages with appendix, 7 figures, 4 tables. Conference paper in the
5th International Conference on Learning Representations (ICLR 2017)
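The matrix embedding can be sketched directly from this description: an r-row attention matrix A = softmax(W2 tanh(W1 H^T)) over the LSTM states H, a 2-D embedding M = AH, and a Frobenius-norm penalty on AA^T - I encouraging the rows to attend to different parts of the sentence. Hyperparameter names below are illustrative.

```python
import torch
import torch.nn as nn

class StructuredSelfAttention(nn.Module):
    """Sketch: multi-hop self-attention yielding a matrix sentence embedding."""
    def __init__(self, hidden_dim, attn_dim, n_hops):
        super().__init__()
        self.W1 = nn.Linear(hidden_dim, attn_dim, bias=False)
        self.W2 = nn.Linear(attn_dim, n_hops, bias=False)

    def forward(self, H):
        # H: (batch, seq_len, hidden_dim) LSTM hidden states
        A = torch.softmax(self.W2(torch.tanh(self.W1(H))), dim=1)  # over seq_len
        A = A.transpose(1, 2)                                      # (batch, hops, seq)
        M = A @ H                                                  # 2-D embedding
        I = torch.eye(A.size(1), device=A.device)
        penalty = ((A @ A.transpose(1, 2) - I) ** 2).sum()         # redundancy penalty
        return M, penalty
```

Visualizing the rows of A over the input tokens gives the "easy way of visualizing what specific parts of the sentence are encoded" mentioned in the abstract.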