192 research outputs found
Dynamic Classification of Sentiments from Restaurant Reviews Using Novel Fuzzy-Encoded LSTM
User reviews on social media have sparked a surge of interest in applying sentiment analysis to provide feedback to the government, public and commercial sectors. Sentiment analysis, spam identification, sarcasm detection and news classification are just a few of the uses of text mining. For many firms, classifying reviews based on user feelings is a significant and collaborative effort. In recent years, machine learning models and handcrafted features have been used to study text classification; however, they have failed to produce encouraging results for short text categorization. This paper proposes a deep neural network model that combines Long Short-Term Memory (LSTM) with fuzzy logic and incremental learning. The proposed model was tested on a large dataset of hotel reviews and evaluated on F1-score, accuracy, precision and recall. This study is a categorization analysis of the sentiments expressed by hotel customers in their reviews. When word embedding is paired with LSTM, the findings show that the proposed model outperforms current best-practice methods, with an accuracy of 81.04%, precision of 77.81%, recall of 80.63% and F1-score of 75.44%. These encouraging findings demonstrate the efficiency of the proposed model on any sort of review categorization task.
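As a point of reference for the four evaluation measures named in this abstract, the following minimal sketch computes them for a binary (positive/negative) classifier from confusion-matrix counts. The function name and the binary setting are assumptions made for illustration; the abstract does not state how the authors averaged the metrics across classes.

```python
def classification_metrics(tp, fp, fn, tn):
    """Return accuracy, precision, recall and F1 for a binary classifier.

    tp, fp, fn, tn are counts of true positives, false positives,
    false negatives and true negatives. Hypothetical helper for illustration.
    """
    total = tp + fp + fn + tn
    accuracy = (tp + tn) / total if total else 0.0
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    # F1 is the harmonic mean of precision and recall.
    denom = precision + recall
    f1 = 2 * precision * recall / denom if denom else 0.0
    return {"accuracy": accuracy, "precision": precision,
            "recall": recall, "f1": f1}


# Toy counts: 8 correct positives, 8 correct negatives, 2 errors of each kind.
metrics = classification_metrics(tp=8, fp=2, fn=2, tn=8)
print(metrics)
```

For multi-class sentiment labels, these quantities are typically computed per class and then averaged, and the choice of averaging changes the reported numbers.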
Sentiment Analysis Using Averaged Weighted Word Vector Features
People use the world wide web heavily to share their experience with entities
such as products, services, or travel destinations. Texts that provide online
feedback in the form of reviews and comments are essential to make consumer
decisions. These comments create a valuable source that may be used to measure
satisfaction related to products or services. Sentiment analysis is the task of
identifying opinions expressed in such text fragments. In this work, we develop
two methods that combine different types of word vectors to learn and estimate
polarity of reviews. We develop average review vectors from word vectors and
add weights to these review vectors using word frequencies in positive and
negative sensitivity-tagged reviews. We applied the methods to several datasets
from different domains that are used as standard benchmarks for sentiment
analysis. We ensemble the techniques with each other and existing methods, and
we compare them with the approaches in the literature. The results show
that our approaches outperform the state-of-the-art success
rates.
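The idea of building a review vector by averaging word vectors, then weighting by class-specific word frequencies, can be sketched as follows. This is an illustration under stated assumptions, not the authors' method: the weighting formula (absolute log-ratio of smoothed frequencies), the function names and the toy embeddings are all hypothetical, since the abstract does not specify them.

```python
from collections import Counter

import numpy as np


def class_weights(pos_reviews, neg_reviews, smoothing=1.0):
    """Weight words by how unevenly they occur in positive vs. negative reviews.

    Assumed weighting (not from the paper): |log((pos + s) / (neg + s))|.
    """
    pos = Counter(w for review in pos_reviews for w in review)
    neg = Counter(w for review in neg_reviews for w in review)
    vocab = set(pos) | set(neg)
    return {w: abs(np.log((pos[w] + smoothing) / (neg[w] + smoothing)))
            for w in vocab}


def review_vector(tokens, embeddings, weights=None):
    """Average the word vectors of a review, optionally weighted per word."""
    known = [t for t in tokens if t in embeddings]
    if not known:
        dim = len(next(iter(embeddings.values())))
        return np.zeros(dim)
    vecs = np.array([embeddings[t] for t in known])
    if weights is None:
        return vecs.mean(axis=0)
    w = np.array([weights.get(t, 0.0) for t in known])
    if w.sum() == 0:
        # Fall back to the plain average when no word carries weight.
        return vecs.mean(axis=0)
    return (w[:, None] * vecs).sum(axis=0) / w.sum()


# Toy 2-dimensional embeddings, invented for this example.
embeddings = {
    "good": np.array([1.0, 0.0]),
    "bad": np.array([-1.0, 0.0]),
    "food": np.array([0.0, 1.0]),
}
weights = class_weights([["good", "food"], ["good"]],
                        [["bad", "food"], ["bad"]])
print(review_vector(["good", "food"], embeddings))           # plain average
print(review_vector(["good", "food"], embeddings, weights))  # weighted average
```

In the toy data, "food" occurs equally often in both classes and so receives zero weight, which pulls the weighted review vector toward the polarity-bearing word "good". The resulting vectors would then be passed to any standard classifier.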
Attribute Sentiment Scoring With Online Text Reviews: Accounting for Language Structure and Attribute Self-Selection
The authors address two novel and significant challenges in using online text reviews to obtain attribute-level ratings. First, they introduce the problem of inferring attribute-level sentiment from text data to the marketing literature and develop a deep learning model to address it. While extant bag-of-words-based topic models are fairly good at attribute discovery based on the frequency of word or phrase occurrences, associating sentiments with attributes requires exploiting the spatial and sequential structure of language. Second, they illustrate how to correct for attribute self-selection (reviewers choose the subset of attributes to write about) in metrics of attribute-level restaurant performance. Using Yelp.com reviews for empirical illustration, they find that a hybrid deep learning (CNN-LSTM) model, in which CNN and LSTM exploit the spatial and sequential structure of language respectively, provides the best performance in accuracy, training speed and training data size requirements. The model does particularly well on the “hard” sentiment classification problems. Further, accounting for attribute self-selection significantly impacts sentiment scores, especially on attributes that are frequently missing.
Three Essays on the Role of Unstructured Data in Marketing Research
This thesis studies the use of firm- and user-generated unstructured data (e.g., text and videos) for improving market research, combining advances in text, audio and video processing with traditional economic modeling. The first chapter is joint work with K. Sudhir and Minkyung Kim. It addresses two significant challenges in using online text reviews to obtain fine-grained attribute-level sentiment ratings. First, we develop a deep learning convolutional-LSTM hybrid model to account for language structure, in contrast to methods that rely on word frequency. The convolutional layer accounts for the spatial structure (adjacent word groups or phrases) and the LSTM accounts for the sequential structure of language (sentiment distributed and modified across non-adjacent phrases). Second, we address the problem of missing attributes in text in constructing attribute sentiment scores, as reviewers write only about a subset of attributes and remain silent on others. We develop a model-based imputation strategy using a structural model of heterogeneous rating behavior. Using Yelp restaurant review data, we show superior accuracy in converting text to numerical attribute sentiment scores with our model. The structural model finds three reviewer segments with different motivations: status seeking, altruism/want voice, and need to vent/praise. Interestingly, our results show that reviewers write to inform and vent/praise, but not based on attribute importance. Our heterogeneous model-based imputation performs better than other common imputations and, importantly, leads to managerially significant corrections in restaurant attribute ratings. The second essay, which is joint work with Aniko Oery and Joyee Deb, is an information-theoretic model to study what causes selection in valence in user-generated reviews.
The propensity of consumers to engage in word-of-mouth (WOM) differs after good versus bad experiences, which can result in positive or negative selection of user-generated reviews. We show how the strength of brand image (dispersion of consumer beliefs about quality) and the informativeness of good and bad experiences impact the selection of WOM in equilibrium. WOM is costly: early adopters talk only if they can affect the receiver’s purchase. If the brand image is strong (consumer beliefs are homogeneous), only negative WOM can arise. With a weak brand image or heterogeneous beliefs, positive WOM can occur if positive experiences are sufficiently informative. Using data from Yelp.com, we show how strong brands (chain restaurants) systematically receive lower evaluations, controlling for several restaurant and reviewer characteristics. The third essay, which is joint work with K. Sudhir and Khai Chiong, studies success factors of persuasive sales pitches using a multi-modal video dataset of buyer-seller interactions. A successful sales pitch is an outcome of both the content of the message and the style of delivery. Moreover, unlike one-way interactions such as speeches, sales pitches are a two-way process, and hence interactivity and matching the wavelength of the buyer are also critical to the success of the pitch. We extract four groups of features (content-related, style-related, interactivity and similarity) to build a predictive model of sales pitch effectiveness.
Re-Engineered Word Embeddings for Improved Document-Level Sentiment Analysis
In this paper, a novel re-engineering mechanism for the generation of word embeddings is proposed for document-level sentiment analysis. Current approaches to sentiment analysis often integrate feature engineering with classification, without optimizing the feature vectors explicitly. Engineering feature vectors to match the data between the training set and query sample, as proposed in this paper, could be a promising way of boosting classification performance in machine learning applications. The proposed mechanism is designed to re-engineer the feature components from a set of embedding vectors for greatly increased between-class separation, hence better leveraging the informative content of the documents. The proposed mechanism was evaluated using four public benchmarking datasets for both two-way and five-way semantic classifications. The resulting embeddings have demonstrated substantially improved performance for a range of sentiment analysis tasks. Tests using all four datasets achieved by far the best classification results compared with the state of the art.
Aspect-Based Sentiment Analysis using Machine Learning and Deep Learning Approaches
Sentiment analysis (SA), also known as opinion mining, is the process of gathering and analyzing people's opinions about a particular service, good, or company on websites like Twitter, Facebook, Instagram, LinkedIn, and blogs, among other places. This article presents a thorough analysis of SA and its levels. Its main focus is on aspect-based SA, which helps manufacturing organizations make better decisions by examining consumers' viewpoints and opinions of their products. The many approaches and methods used in aspect-based sentiment analysis (ABSA) are covered in this review study. In traditional methods, the features associated with the aspects were extracted manually, which made it a time-consuming and error-prone operation. Nevertheless, these restrictions may be overcome as artificial intelligence develops. Therefore, to increase the effectiveness of ABSA, researchers are increasingly using AI-based machine learning (ML) and deep learning (DL) techniques. Additionally, certain recently released ABSA approaches based on ML and DL are examined and contrasted, and gaps in both methodologies are identified on the basis of this comparison. At the conclusion of this study, the difficulties that current ABSA models encounter are also emphasized, along with suggestions that can be made to improve the efficacy and precision of ABSA systems.
Interpretable and Steerable Sequence Learning via Prototypes
One of the major challenges in machine learning nowadays is to provide
predictions with not only high accuracy but also user-friendly explanations.
Although in recent years we have witnessed increasingly popular use of deep
neural networks for sequence modeling, it is still challenging to explain the
rationales behind the model outputs, which is essential for building trust and
supporting the domain experts to validate, critique and refine the model. We
propose ProSeNet, an interpretable and steerable deep sequence model with
natural explanations derived from case-based reasoning. The prediction is
obtained by comparing the inputs to a few prototypes, which are exemplar cases
in the problem domain. For better interpretability, we define several criteria
for constructing the prototypes, including simplicity, diversity, and sparsity
and propose the learning objective and the optimization procedure. ProSeNet
also provides a user-friendly approach to model steering: domain experts
without any knowledge of the underlying model or parameters can easily
incorporate their intuition and experience by manually refining the prototypes.
We conduct experiments on a wide range of real-world applications, including
predictive diagnostics for automobiles, ECG, and protein sequence
classification, and sentiment analysis on texts. The results show that ProSeNet
can achieve accuracy on par with state-of-the-art deep learning models. We also
evaluate the interpretability of the results with concrete case studies.
Finally, through a user study on Amazon Mechanical Turk (MTurk), we demonstrate
that the model selects high-quality prototypes which align well with human
knowledge and can be interactively refined for better interpretability without
loss of performance. Comment: Accepted as a full paper at KDD 2019 on May 8, 201
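The prototype-comparison step described in this abstract can be illustrated with a small sketch. It assumes the input sequence has already been encoded into a fixed-length vector (the sequence encoder is omitted), and the specific similarity function (exponential of negative squared distance), the linear-plus-softmax output and all names here are assumptions for illustration rather than details taken from the abstract.

```python
import numpy as np


def prototype_predict(embedding, prototypes, class_weights):
    """Classify an encoded sequence by its similarity to prototype vectors.

    embedding:     shape (d,), output of some sequence encoder (not shown).
    prototypes:    shape (k, d), one row per prototype.
    class_weights: shape (k, c), maps prototype similarities to class scores.
    """
    # Squared Euclidean distance from the input to each prototype.
    sq_dist = ((prototypes - embedding) ** 2).sum(axis=1)
    # Similarity in (0, 1]; equals 1 when the input coincides with a prototype.
    similarity = np.exp(-sq_dist)
    # Linear layer over the similarities, followed by a stable softmax.
    logits = similarity @ class_weights
    exp_logits = np.exp(logits - logits.max())
    probs = exp_logits / exp_logits.sum()
    return similarity, probs


# Two toy prototypes; prototype 0 votes for class 0, prototype 1 for class 1.
prototypes = np.array([[1.0, 0.0], [0.0, 1.0]])
class_weights = np.array([[2.0, 0.0], [0.0, 2.0]])
similarity, probs = prototype_predict(np.array([0.9, 0.1]),
                                      prototypes, class_weights)
print(similarity, probs)
```

Because each prediction is a weighted vote over similarities to a few exemplars, the similarity vector itself serves as the explanation: the input is classified the way it is because it resembles a particular prototype. Manually editing a row of `prototypes` corresponds to the kind of steering by domain experts that the abstract mentions.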