574 research outputs found

    Dynamic Classification of Sentiments from Restaurant Reviews Using Novel Fuzzy-Encoded LSTM

    User reviews on social media have sparked a surge of interest in applying sentiment analysis to provide feedback to the government, public and commercial sectors. Sentiment analysis, spam identification, sarcasm detection and news classification are just a few of the uses of text mining. For many firms, classifying reviews based on user feelings is a significant and collaborative effort. In recent years, machine learning models and handcrafted features have been used to study text classification; however, they have failed to produce encouraging results for short text categorization. A deep neural network based Long Short-Term Memory (LSTM) and fuzzy logic model with incremental learning is suggested in this paper. The suggested model was tested on a large dataset of hotel reviews on the basis of F1-score, accuracy, precision and recall. This study is a categorization analysis of the sentiments expressed by hotel customers in their reviews. When word embedding is paired with LSTM, findings show that the suggested model outperforms current best-practice methods, with an accuracy of 81.04%, precision of 77.81%, recall of 80.63% and F1-score of 75.44%. These encouraging findings demonstrate the efficiency of the proposed model on any sort of review categorization job.
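
For readers unfamiliar with the four metrics named in this abstract, the following is a minimal sketch of how accuracy, precision, recall and F1-score are conventionally computed for a binary classifier. The function name and the confusion counts are hypothetical and are not taken from the paper, which may use a different (for example, averaged multi-class) definition.

```python
def classification_metrics(tp, fp, fn, tn):
    """Compute standard binary metrics from confusion-matrix counts.

    tp/fp/fn/tn are true positives, false positives, false negatives
    and true negatives. All inputs here are illustrative assumptions.
    """
    total = tp + fp + fn + tn
    accuracy = (tp + tn) / total if total else 0.0
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    # F1 is the harmonic mean of precision and recall.
    denom = precision + recall
    f1 = 2 * precision * recall / denom if denom else 0.0
    return {
        "accuracy": accuracy,
        "precision": precision,
        "recall": recall,
        "f1": f1,
    }


if __name__ == "__main__":
    # Invented counts, purely for illustration.
    print(classification_metrics(tp=8, fp=2, fn=2, tn=8))
```

The guards against zero denominators matter in practice: a classifier that never predicts the positive class would otherwise raise a division error when precision is computed.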

    Sentiment Classification of Online Customer Reviews and Blogs Using Sentence-level Lexical Based Semantic Orientation Method

    Sentiment analysis is the process of extracting knowledge from people's opinions, appraisals and emotions toward entities, events and their attributes. These opinions strongly influence customers and ease their choices regarding online shopping, events, products and entities. With the rapid growth of online resources, a vast amount of new data in the form of customer reviews and opinions is being generated continuously. Hence, sentiment analysis methods are desirable for efficient and effective analysis and classification of customer reviews, blogs and comments. The main motivation for this thesis is to develop a high-performance, domain-independent sentiment classification method. This study focuses on sentiment analysis at the sentence level using a lexical-based method for different types of data, such as reviews and blogs. The proposed method is based on general lexicons, i.e. WordNet, SentiWordNet and user-defined lexical dictionaries, for sentiment orientation. The relations and glosses of these dictionaries provide a solution to the domain portability problem. The experiments are performed on various data sets, such as customer reviews and blog comments. The results show that the proposed method with sentence contextual information is effective for sentiment classification. The proposed method performs better than word-level and text-level corpus-based machine learning methods for semantic orientation. The results highlight that the proposed method achieves an average accuracy of 86% at sentence level and 97% at feedback level for customer reviews. Similarly, it achieves an average accuracy of 83% at sentence level and 86% at feedback level for blog comments.
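
To illustrate the general idea of sentence-level, lexicon-based classification described in this abstract, here is a minimal sketch. It is not the thesis's method: the tiny lexicon, the negator list and the simple negation rule are all hypothetical stand-ins for resources such as WordNet and SentiWordNet, and the feedback-level label is assumed to come from summing sentence scores.

```python
import re

# Hypothetical toy lexicon; a real system would draw polarity scores
# from resources such as SentiWordNet.
TOY_LEXICON = {
    "good": 1.0, "great": 1.5, "excellent": 2.0,
    "bad": -1.0, "poor": -1.0, "terrible": -2.0,
}
NEGATORS = {"not", "never", "no"}


def sentence_score(sentence):
    """Sum lexicon polarities; a negator flips the next opinion word."""
    tokens = re.findall(r"[a-z']+", sentence.lower())
    score = 0.0
    negate = False
    for tok in tokens:
        if tok in NEGATORS:
            negate = True
            continue
        if tok in TOY_LEXICON:
            value = TOY_LEXICON[tok]
            score += -value if negate else value
            negate = False
    return score


def label(score):
    """Map a numeric score to a polarity label."""
    if score > 0:
        return "positive"
    if score < 0:
        return "negative"
    return "neutral"


def classify_feedback(text):
    """Return per-sentence labels and one feedback-level label."""
    sentences = [s for s in re.split(r"[.!?]+", text) if s.strip()]
    scores = [sentence_score(s) for s in sentences]
    return [label(s) for s in scores], label(sum(scores))


if __name__ == "__main__":
    print(classify_feedback("The food was great. The service was not good."))
```

Scoring each sentence separately before aggregating keeps negation scoped to the sentence in which it occurs, which is one reason sentence-level context can help compared with scoring a whole document as a bag of words.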

    Assessment, Implication, and Analysis of Online Consumer Reviews: A Literature Review

    The rise of e-marketplaces, virtual communities and social networking has increased the influence of online consumer reviews (OCR) and therefore necessitates a consolidation of the body of knowledge. This article attempts to conceptually cluster academic literature in both the management and technical domains. The study follows a framework which broadly clusters management research under two heads: OCR Assessment and OCR Implication (business implication). Parallel technical literature has been reviewed to reconcile methodologies adopted in the analysis of text content on the web, mainly reviews. Text mining through automated tools, algorithmic contributions (dominant mainly in technical-stream literature) and manual assessment (derived from the stream of content analysis) have been studied in this review article. The literature of both domains is analyzed to propose possible areas for further research. Using text analysis methods along with statistical and data mining techniques to analyze review text, and applying the resulting knowledge to managerial issues, could constitute further work. Available at: https://aisel.aisnet.org/pajais/vol9/iss2/4

    A comparative analysis of recommender systems based on item aspect opinions extracted from user reviews

    In popular applications such as e-commerce sites and social media, users provide online reviews giving personal opinions about a wide array of items, such as products, services and people. These reviews are usually in the form of free text, and represent a rich source of information about the users’ preferences. Among the information elements that can be extracted from reviews, opinions about particular item aspects (i.e., characteristics, attributes or components) have been shown to be effective for user modeling and personalized recommendation. In this paper, we investigate the aspect-based recommendation problem by separately addressing three tasks, namely identifying references to item aspects in user reviews, classifying the sentiment orientation of the opinions about such aspects in the reviews, and exploiting the extracted aspect opinion information to provide enhanced recommendations. Unlike previous work, we integrate and empirically evaluate several state-of-the-art and novel methods for each of the above tasks. We conduct extensive experiments on standard datasets and several domains, analyzing distinct recommendation quality metrics and characteristics of the datasets, domains and extracted aspects. As a result of our investigation, we not only derive conclusions about which combination of methods is most appropriate according to the above issues, but also provide a number of valuable resources for opinion mining and recommendation purposes, such as domain aspect vocabularies and domain-dependent, aspect-level lexicons. This work was supported by the Spanish Ministry of Economy, Industry and Competitiveness (TIN2016-80630-P).

    Sentiment classification with deep neural networks

    Sentiment classification is an important task in the Natural Language Processing (NLP) area. Deep neural networks have become the mainstream method for text sentiment classification. In this thesis two datasets are used. The first is a hotel review dataset (the TripAdvisor dataset), collected from the TripAdvisor website using the Python Scrapy framework. Preprocessing steps are then applied to clean the dataset. A record in the TripAdvisor dataset consists of the review text and the corresponding sentiment score. There are 5 sentiment labels: very negative, negative, neutral, positive, and very positive. The second dataset is the Stanford Sentiment Treebank (SST) dataset, a public and commonly used dataset for sentiment classification. Text Convolutional Neural Network (Text-CNN), Very Deep Convolutional Neural Network (VD-CNN), and Bidirectional Long Short Term Memory neural network (BiLSTM) were chosen as the methods evaluated in the experiments. The Text-CNN was the first work to apply a convolutional neural network architecture to text classification. The VD-CNN applied deep convolutional layers, with up to 29 layers, to perform text classification. The BiLSTM exploited a bidirectional recurrent neural network with the long short term memory cell mechanism. Word embedding techniques are also considered an important factor in sentiment classification. Thus, in this thesis, GloVe and FastText were used to investigate the effect of word embedding initialization on the dataset. GloVe is an unsupervised word embedding learning algorithm. FastText uses a shallow neural network to generate word vectors, and it converges quickly in training and is fast at inference. The experiment was implemented using the PyTorch framework. 
The results show that the BiLSTM with GloVe as the word vector initialization achieved the highest accuracy, 73.73%, while the VD-CNN with FastText had the lowest accuracy, 71.95%, on the TripAdvisor dataset. The BiLSTM model achieved a 0.68 F1-score while the VD-CNN model obtained a 0.67 F1-score on the TripAdvisor dataset. On the SST dataset, BiLSTM with GloVe again achieved the highest accuracy, 36.35%, with a 0.35 F1-score. The VD-CNN model with GloVe had the worst evaluation result in terms of accuracy and F1-score. The Text-CNN model performed better than the VD-CNN model in most cases, even though the VD-CNN model has more layers. Analysis of the reviews in the TripAdvisor dataset that were misclassified by the three deep neural networks shows that hotel reviews with more contradictory sentiment words were more prone to misclassification than other hotel reviews.

    What Airbnb Reviews can Tell us? An Advanced Latent Aspect Rating Analysis Approach

    There is no doubt that the rapid growth of Airbnb has changed the lodging industry and tourists’ behaviors dramatically since the advent of the sharing economy. Airbnb welcomes customers and engages them by creating and providing unique travel experiences to “live like a local” through the delivery of lodging services. With the special experiences that Airbnb customers pursue, more investigation is needed to systematically examine the Airbnb customer lodging experience. Online reviews offer a representative look at individual customers’ personal and unique lodging experiences. Moreover, the overall ratings given by customers are reflections of their experiences with a product or service. Since customers take overall ratings into account in their purchase decisions, a study that bridges the customer lodging experience and the overall rating is needed. In contrast to traditional research methods, mining customer reviews has become a useful method to study customers’ opinions about products and services. User-generated reviews are a form of evaluation generated by peers that users post on business or other (e.g., third-party) websites (Mudambi & Schuff, 2010). The main purpose of this study is to identify the weights of latent lodging experience aspects that customers consider in order to form their overall ratings based on the eight basic emotions. This study applied both aspect-based sentiment analysis and the latent aspect rating analysis (LARA) model to predict the aspect ratings and determine the latent aspect weights. Specifically, this study extracted the innovative lodging experience aspects that Airbnb customers care about most by mining a total of 248,693 customer reviews from 6,946 Airbnb accommodations. Then, the NRC Emotion Lexicon with eight emotions was employed to assess the sentiments associated with each lodging aspect. By applying latent rating regression, the predicted aspect ratings were generated. 
With the aspect ratings, the aspect weights and the predicted overall ratings were calculated. It was suggested that the overall rating be assessed based on the sentiment words of five lodging aspects: communication, experience, location, product/service, and value. It was found that, compared with the aspects of location, product/service, and value, customers expressed less joy and more surprise than they did over the aspects of communication and experience. The latent rating regression (LRR) results demonstrate that Airbnb customers care most about a listing's location, followed by experience, value, communication, and product/service. The results also revealed that even listings with the same overall rating may have different predicted aspect ratings based on the different aspect weights. Finally, the LARA model demonstrated the different preferences between customers seeking expensive versus cheap accommodations. Understanding customer experience and its role in forming customer rating behavior is important. This study empirically confirms and expands the usefulness of LARA as the prediction model in deconstructing overall ratings into aspect ratings, and then further predicting aspect-level weights. This study makes meaningful academic contributions to the evolving customer behavior and customer experience research. It also benefits the shared-lodging industry through its development of pragmatic methods to establish effective marketing strategies for improving customer perceptions and to create personalized review filter systems.
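
The observation that listings with the same overall rating can differ in their aspect ratings can be illustrated with a small sketch. It assumes the common LARA-style formulation in which the overall rating is modeled as a weighted sum of aspect ratings with weights that sum to 1; the function name and all ratings and weights below are invented for illustration and are not results from the study.

```python
ASPECTS = ["communication", "experience", "location", "product/service", "value"]


def overall_rating(aspect_ratings, aspect_weights):
    """Weighted sum of aspect ratings (weights are assumed to sum to 1)."""
    if len(aspect_ratings) != len(aspect_weights):
        raise ValueError("ratings and weights must have the same length")
    if abs(sum(aspect_weights) - 1.0) > 1e-9:
        raise ValueError("weights must sum to 1")
    return sum(r * w for r, w in zip(aspect_ratings, aspect_weights))


if __name__ == "__main__":
    # Hypothetical listing A: strong location, weaker product/service.
    ratings_a = [4, 4, 5, 3, 4]
    weights_a = [0.1, 0.2, 0.4, 0.1, 0.2]
    # Hypothetical listing B: weaker location, strong everything else.
    ratings_b = [5, 5, 3, 5, 4]
    weights_b = [0.2, 0.3, 0.3, 0.1, 0.1]
    # Both come out at about 4.3 despite different aspect profiles.
    print(overall_rating(ratings_a, weights_a))
    print(overall_rating(ratings_b, weights_b))
```

In the actual model the weights are latent and inferred from review text rather than supplied by hand; the sketch only shows why a single overall score cannot identify the underlying aspect ratings.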

    Management Responses to Online Reviews: Big Data From Social Media Platforms

    User-generated content from virtual communities helps businesses develop and sustain competitive advantages, which raises the question of how firms can strategically manage that content. This research, which consists of two studies, discusses management response strategies for hotel firms to gain a competitive advantage and improve customer relationship management by leveraging big data, social media analytics, and deep learning techniques. Since negative reviews' harmful effects are greater than positive comments' contribution, firms must strategise their responses to intervene in and minimise those damages. Although the current literature includes a large amount of research presenting effective response strategies to negative reviews, it mostly overlooks an extensive classification of response strategies. The first study consists of two phases and focuses on comprehensive response strategies to negative reviews only. The first phase is explorative and presents a correlation analysis between response strategies and overall ratings of hotels. It also reveals the differences in those strategies based on hotel class, average customer rating, and region. The second phase investigates effective response strategies for increasing the subsequent ratings of returning customers using logistic regression analysis. It shows that responses involving admission of mistake(s), specific action, and direct contact requests help increase the subsequent ratings of previously dissatisfied returning customers. In addition, personalising the response for better customer relationship management is particularly difficult due to the significant variability of textual reviews covering various topics. The second study examines the impact of personalised management responses to positive and negative reviews on rating growth, integrating a novel multi-topic matching approach with a panel data analysis. 
It demonstrates that (a) personalised responses improve future ratings of hotels; (b) the effect of personalised responses on future ratings is stronger for luxury hotels. Lastly, practical insights are provided.

    Three essays on applications of machine learning in problems with high dimensional data

    The amount of data businesses collect from the internet is massive. Researchers and analysts can now track various data features generated from log files, such as customers’ behavior history, product descriptions and aggregate-level data. In an ideal scenario, such data could be represented in a spreadsheet, with columns representing each dimension. In practice, the number of data dimensions can be staggering, making data processing difficult. With high dimensional data, the number of features can exceed the number of observations, which is very challenging for traditional econometric methods to handle. My dissertation addresses this data issue by applying machine learning techniques, including LASSO (least absolute shrinkage and selection operator), decision trees, and neural networks, to help decision makers perform descriptive, predictive, and prescriptive analytics based on high dimensional data. My dissertation comprises three essays. The first essay applies tree-based machine learning models (random forest and gradient boosting decision tree) and free text information to predict house prices and understand how certain factors could affect the prices. In the second essay, I propose a LASSO method for high dimensional data and use daily hotel prices to understand hotels’ competition patterns in a certain area. In the third essay, a word embedding and neural network model is applied to real estate data to extract free text information more efficiently, which leads to more accurate prediction of house prices. In these essays, I apply and extend a variety of analytic tools including supervised learning, unsupervised learning, statistics, and econometric methods. These essays contribute to the applied econometric and business analytics literature and can help researchers and analysts appreciate both traditional econometrics and predictive analytics tools, and make data-driven business decisions.