
3,724 research outputs found

A Variational Approach to Weakly Supervised Document-Level Multi-Aspect Sentiment Classification

Authors: Liu Xin, Song Yangqiu, Zeng Ziqian, Zhou Wenxuan
Publication date: 10/04/2019

In this paper, we propose a variational approach to weakly supervised document-level multi-aspect sentiment classification. Instead of using user-generated ratings or annotations provided by domain experts, we use target-opinion word pairs as "supervision." These word pairs can be extracted using dependency parsers and simple rules. Our objective is to predict an opinion word given a target word, while our ultimate goal is to learn a sentiment polarity classifier that predicts the sentiment polarity of each aspect given a document. By introducing a latent variable, i.e., the sentiment polarity, into the objective function, we can inject the sentiment polarity classifier into the objective via the variational lower bound. We can then learn the sentiment polarity classifier by optimizing the lower bound. We show that our method can outperform weakly supervised baselines on the TripAdvisor and BeerAdvocate datasets and can be comparable to the state-of-the-art supervised method with hundreds of labels per aspect.

Comment: Accepted by NAACL-HLT 201
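
The variational lower bound mentioned in this abstract can be sketched as follows. The notation is an assumption made for illustration, not the paper's own: x is the document, t a target word, o the opinion word to predict, s the latent sentiment polarity, p(s) a prior over polarities, and q(s | x) the sentiment polarity classifier. The sketch also assumes that the prior on s does not depend on t.

```latex
\begin{align}
\log p(o \mid t)
  &= \log \sum_{s} p(o \mid t, s)\, p(s) \\
  &= \log \sum_{s} q(s \mid x)\, \frac{p(o \mid t, s)\, p(s)}{q(s \mid x)} \\
  &\geq \sum_{s} q(s \mid x) \log \frac{p(o \mid t, s)\, p(s)}{q(s \mid x)}
     && \text{(Jensen's inequality)} \\
  &= \mathbb{E}_{q(s \mid x)}\bigl[\log p(o \mid t, s)\bigr]
     - \mathrm{KL}\bigl(q(s \mid x) \,\|\, p(s)\bigr)
\end{align}
```

Under these assumptions, maximizing the last line trains the classifier q(s | x) using only the opinion-word prediction signal, which is consistent with the abstract's statement that the classifier is learned by optimizing the lower bound.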

arXiv.org e-Print Archive

Dataset Construction via Attention for Aspect Term Extraction with Distant Supervision

Authors: Antognini Diego, Baeriswyl Michael, Giannakopoulos Athanasios, Hossmann Andreea, Musat Claudiu
Publication date: 26/09/2017

Aspect Term Extraction (ATE) detects opinionated aspect terms in sentences or text spans, with the end goal of performing aspect-based sentiment analysis. The small number of available datasets for supervised ATE, and the fact that they cover only a few domains, raise the need for exploiting other data sources in new and creative ways. Publicly available review corpora contain a plethora of opinionated aspect terms and cover a larger domain spectrum. In this paper, we first propose a method for using such review corpora to create a new dataset for ATE. Our method relies on an attention mechanism to select sentences that have a high likelihood of containing actual opinionated aspects. We thus improve the quality of the extracted aspects. We then use the constructed dataset to train a model and perform ATE with distant supervision. By evaluating on human-annotated datasets, we prove that our method achieves significantly improved performance over various unsupervised and supervised baselines. Finally, we prove that sentence selection matters when it comes to creating new datasets for ATE. Specifically, we show that using a set of selected sentences leads to higher ATE performance than using the whole sentence set.

arXiv.org e-Print Archive

Japanese Sentiment Classification using a Tree-Structured Long Short-Term Memory with Attention

Authors: Komachi Mamoru, Miyazaki Ryosuke
Publication date: 29/09/2018

Previous approaches to training syntax-based sentiment classification models required phrase-level annotated corpora, which are not readily available in many languages other than English. Thus, we propose the use of a tree-structured Long Short-Term Memory with an attention mechanism that attends to each subtree of the parse tree. Experimental results indicate that our model achieves state-of-the-art performance on a Japanese sentiment classification task.

Comment: 10 pages; PACLIC 201

arXiv.org e-Print Archive

Semi-Supervised Affective Meaning Lexicon Expansion Using Semantic and Distributed Word Representations

Authors: Alhothali Areej, Hoey Jesse
Publication date: 28/03/2017

In this paper, we propose an extension to graph-based sentiment lexicon induction methods that incorporates distributed and semantic word representations when building the similarity graph used to expand a three-dimensional sentiment lexicon. We also implemented and evaluated label propagation using four different word representations and similarity metrics. Our comprehensive evaluation of the four approaches was performed on a single data set, demonstrating that all four methods can generate a significant number of new sentiment assignments with high accuracy. The highest correlations (tau=0.51) and the lowest error (mean absolute error < 1.1%), obtained by combining both the semantic and the distributional features, outperformed the distributional-based and semantic-based label-propagation models and approached a supervised algorithm.
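
Label propagation over a word-similarity graph, as described in this abstract, can be illustrated with a minimal sketch. This is a generic clamped label-propagation loop, not the authors' implementation: the function name `propagate_labels`, the toy similarity matrix, and the seed scores are all hypothetical, and the three score columns simply stand in for the three lexicon dimensions.

```python
import numpy as np


def propagate_labels(similarity, seed_scores, seed_mask, n_iter=50):
    """Spread seed scores over a similarity graph, keeping seed words fixed.

    similarity  : (n, n) non-negative word-similarity matrix (hypothetical data)
    seed_scores : (n, d) scores; only rows where seed_mask is True are used
    seed_mask   : (n,) boolean array marking words that already have scores
    """
    # Row-normalise so each update is a weighted mean of the neighbours' scores.
    row_sums = similarity.sum(axis=1, keepdims=True)
    transition = similarity / np.where(row_sums == 0, 1.0, row_sums)

    # Unlabelled words start at zero; seed words start at their known scores.
    scores = np.where(seed_mask[:, None], seed_scores, 0.0)
    for _ in range(n_iter):
        scores = transition @ scores
        scores[seed_mask] = seed_scores[seed_mask]  # clamp the seeds
    return scores


# Toy graph: word 1 is unlabelled and equally similar to seed words 0 and 2.
similarity = np.array([[0.0, 1.0, 0.0],
                       [1.0, 0.0, 1.0],
                       [0.0, 1.0, 0.0]])
seed_scores = np.array([[1.0, 0.5, -0.5],
                        [0.0, 0.0, 0.0],
                        [-1.0, 0.5, 0.5]])
seed_mask = np.array([True, False, True])

scores = propagate_labels(similarity, seed_scores, seed_mask)
print(scores[1])  # the unlabelled word receives the mean of its two neighbours
```

In this toy setup the similarity matrix is given directly; in a setting like the one the abstract describes, it would be computed from word representations with a chosen similarity metric.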

arXiv.org e-Print Archive

Deep Learning for Sentiment Analysis : A Survey

Authors: Liu Bing, Wang Shuai, Zhang Lei
Publication date: 30/01/2018

Deep learning has emerged as a powerful machine learning technique that learns multiple layers of representations or features of the data and produces state-of-the-art prediction results. Along with its success in many other application domains, deep learning has also become popular in sentiment analysis in recent years. This paper first gives an overview of deep learning and then provides a comprehensive survey of its current applications in sentiment analysis.

Comment: 34 pages, 9 figures, 2 table

arXiv.org e-Print Archive

Effective LSTMs for Target-Dependent Sentiment Classification

Authors: Feng Xiaocheng, Liu Ting, Qin Bing, Tang Duyu
Publication date: 29/09/2016

Target-dependent sentiment classification remains a challenge: modeling the semantic relatedness of a target with its context words in a sentence. Different context words have different influences on determining the sentiment polarity of a sentence towards the target. Therefore, it is desirable to integrate the connections between the target word and context words when building a learning system. In this paper, we develop two target-dependent long short-term memory (LSTM) models, where target information is automatically taken into account. We evaluate our methods on a benchmark dataset from Twitter. Empirical results show that modeling sentence representation with a standard LSTM does not perform well. Incorporating target information into the LSTM can significantly boost the classification accuracy. The target-dependent LSTM models achieve state-of-the-art performance without using a syntactic parser or external sentiment lexicons.

Comment: 7 pages, 3 figures published in COLING 201

arXiv.org e-Print Archive

Emotion Detection in Text: a Review

Authors: Seyeditabari Armin, Tabari Narges, Zadrozny Wlodek
Publication date: 02/06/2018

In recent years, emotion detection in text has become more popular due to its vast potential applications in marketing, political science, psychology, human-computer interaction, artificial intelligence, etc. Access to a huge amount of textual data, especially opinionated and self-expression text, has also played a special role in bringing attention to this field. In this paper, we review the work that has been done in identifying emotion expressions in text and argue that, although many techniques, methodologies, and models have been created to detect emotion in text, there are various reasons that make these methods insufficient. Although there is an essential need to improve the design and architecture of current systems, factors such as the complexity of human emotions and the use of implicit and metaphorical language in expressing them lead us to think that just re-purposing standard methodologies will not be enough to capture these complexities, and that it is important to pay attention to the linguistic intricacies of emotion expression.

arXiv.org e-Print Archive

Investigating the Working of Text Classifiers

Authors: Sachan Devendra Singh, Salakhutdinov Ruslan, Zaheer Manzil
Publication date: 05/08/2018

Text classification is one of the most widely studied tasks in natural language processing. Motivated by the principle of compositionality, large multilayer neural network models have been employed for this task in an attempt to effectively utilize the constituent expressions. Almost all of the reported work trains large networks using discriminative approaches, which come with the caveat of no proper capacity control, as they tend to latch on to any signal, including signals that may not generalize. Using various recent state-of-the-art approaches for text classification, we explore whether these models actually learn to compose the meaning of the sentences or still just focus on some keywords or lexicons for classifying the document. To test our hypothesis, we carefully construct datasets where the training and test splits have no direct overlap of such lexicons, but the overall language structure is similar. We study various text classifiers and observe that there is a big performance drop on these datasets. Finally, we show that even simple models with our proposed regularization techniques, which disincentivize focusing on key lexicons, can substantially improve classification accuracy.

Comment: Proceedings of COLING 2018, the 27th International Conference on Computational Linguistics: Technical Papers (COLING 2018), NIPS 2017 Workshop on Deep Learning: Bridging Theory and Practic
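
The idea of training and test splits with no lexicon overlap can be illustrated with a small sketch. It is one possible construction under stated assumptions, not the procedure used in the paper: the function `split_by_lexicon`, the whitespace tokenization, the alphabetical partition of the lexicon, and the toy documents are all hypothetical.

```python
def split_by_lexicon(docs, lexicon, train_fraction=0.5):
    """Split (text, label) pairs so that train and test share no lexicon word."""
    # Partition the lexicon into two disjoint halves (alphabetical, for determinism).
    ordered = sorted(lexicon)
    cut = int(len(ordered) * train_fraction)
    train_words, test_words = set(ordered[:cut]), set(ordered[cut:])

    train, test = [], []
    for text, label in docs:
        tokens = set(text.lower().split())
        has_train = bool(tokens & train_words)
        has_test = bool(tokens & test_words)
        if has_train and not has_test:
            train.append((text, label))
        elif has_test and not has_train:
            test.append((text, label))
        # Documents with words from both halves, or from neither, are dropped.
    return train, test


# Hypothetical sentiment lexicon and documents (label 1 = positive, 0 = negative).
lexicon = {"awful", "great", "superb", "terrible"}
docs = [
    ("the food was great", 1),
    ("an awful stay", 0),
    ("a superb view", 1),
    ("terrible service", 0),
    ("great room but terrible staff", 0),
]

train, test = split_by_lexicon(docs, lexicon)
print(len(train), len(test))  # the mixed document is excluded from both splits
```

A classifier that relies only on memorized keywords would have nothing to match in the test half of such a split, which is the kind of behavior the abstract sets out to expose.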

arXiv.org e-Print Archive

EvoMSA: A Multilingual Evolutionary Approach for Sentiment Analysis

Authors: Graff Mario, Miranda-Jiménez Sabino, Moctezuma Daniela, Tellez Eric S.
Publication venue: Institute of Electrical and Electronics Engineers (IEEE)
Publication date: 30/09/2019

Sentiment analysis (SA) is a task related to understanding people's feelings in written text; the starting point is to identify the polarity level (positive, neutral or negative) of a given text, moving on to identifying emotions or whether a text is humorous or not. This task has been the subject of several research competitions in a number of languages, e.g., English, Spanish, and Arabic, among others. In this contribution, we propose an SA system, namely EvoMSA, that unifies our participating systems in various SA competitions, making it domain independent and multilingual by processing text using only language-independent techniques. EvoMSA is a classifier, based on Genetic Programming, that works by combining the output of different text classifiers and text models to produce the final prediction. We analyze EvoMSA on different SA competitions to provide a global overview of its performance, and as the results show, EvoMSA is competitive, obtaining top rankings in several SA competitions. Furthermore, we performed an analysis of EvoMSA's components to measure their contribution to the performance; the idea is to help a practitioner or newcomer implement a competitive SA classifier. Finally, it is worth mentioning that EvoMSA is available as open-source software.

arXiv.org e-Print Archive

Developing a concept-level knowledge base for sentiment analysis in Singlish

Authors: Bajpai Rajiv, Cambria Erik, Ho Danyun, Poria Soujanya
Publication date: 14/07/2017

In this paper, we present the Singlish sentiment lexicon, a concept-level knowledge base for sentiment analysis that associates multiword expressions with a set of emotion labels and a polarity value. Unlike many other sentiment analysis resources, this lexicon is not built by manually labeling pieces of knowledge coming from general NLP resources such as WordNet or DBPedia. Instead, it is automatically constructed by applying graph-mining and multi-dimensional scaling techniques to the affective common-sense knowledge collected from three different sources. This knowledge is represented redundantly at three levels: semantic network, matrix, and vector space. Subsequently, the concepts are labeled with emotions and polarity through the ensemble application of spreading activation, neural networks and an emotion categorization model.

Comment: CICLing 201

arXiv.org e-Print Archive