2,154 research outputs found

    A study on text-score disagreement in online reviews

    In this paper, we focus on online reviews and employ artificial intelligence tools, taken from the cognitive computing field, to help understand the relationships between the textual part of the review and the assigned numerical score. We start from the intuitions that 1) a set of textual reviews expressing different sentiments may feature the same score (and vice versa); and 2) detecting and analyzing the mismatches between the review content and the actual score may benefit both service providers and consumers, by highlighting specific factors of satisfaction (and dissatisfaction) in texts. To prove these intuitions, we adopt sentiment analysis techniques and concentrate on hotel reviews, to find polarity mismatches therein. In particular, we first train a text classifier with a set of annotated hotel reviews, taken from the Booking website. Then, we analyze a large dataset of around 160k hotel reviews collected from Tripadvisor, with the aim of detecting polarity mismatches, that is, whether the textual content of a review is in line with its associated score. Using well-established artificial intelligence techniques and analyzing in depth the reviews featuring a mismatch between the text polarity and the score, we find that, on a scale of five stars, reviews with middle scores include a mixture of positive and negative aspects. The approach proposed here, besides acting as a polarity detector, provides an effective selection of reviews from an initially very large dataset. This may allow both consumers and providers to focus directly on the review subset featuring a text/score disagreement, which conveniently conveys to the user a summary of the positive and negative features of the review target. Comment: This is the accepted version of the paper. The final version will be published in the Journal of Cognitive Computation, available at Springer via http://dx.doi.org/10.1007/s12559-017-9496-
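The text/score comparison described in this abstract can be illustrated with a minimal sketch. It is not the authors' method: the paper trains a classifier on annotated Booking reviews, whereas the toy word lists, the `Review` type, and the star thresholds below are all assumptions made purely for illustration.

```python
from dataclasses import dataclass

# Hypothetical toy lexicon; a stand-in for the trained text classifier
# mentioned in the abstract.
POSITIVE = {"clean", "friendly", "great", "comfortable", "excellent"}
NEGATIVE = {"dirty", "rude", "noisy", "broken", "terrible"}


@dataclass
class Review:
    text: str
    stars: int  # assumed scale: 1 to 5


def text_polarity(text: str) -> str:
    """Label a text by counting lexicon hits (illustrative only)."""
    tokens = [t.strip(".,!?;:").lower() for t in text.split()]
    pos = sum(t in POSITIVE for t in tokens)
    neg = sum(t in NEGATIVE for t in tokens)
    if pos > neg:
        return "positive"
    if neg > pos:
        return "negative"
    return "mixed"


def score_polarity(stars: int) -> str:
    """Map a star score to a polarity; the thresholds are an assumption."""
    if stars <= 2:
        return "negative"
    if stars >= 4:
        return "positive"
    return "mixed"


def has_mismatch(review: Review) -> bool:
    """True when the text polarity disagrees with the score polarity."""
    return text_polarity(review.text) != score_polarity(review.stars)
```

Filtering a review collection with `has_mismatch` would yield the kind of text/score disagreement subset the abstract refers to, under these simplified assumptions.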

    Sentiment classification from reviews for tourism analytics

    User-generated content is critical for tourism destination management, as it could help organizations identify their customers' opinions and come up with solutions to upgrade their tourism offerings. There are many reviews on social media, and it is difficult for these organizations to analyse the reviews manually. By applying sentiment classification, reviews can be classified into several classes and help ease decision-making. The reviews contain noisy content, such as typos and emoticons, which could affect the accuracy of the classifiers. This study evaluates the reviews using Support Vector Machine and Random Forest models to identify a suitable classifier. The main phases in this study are data collection, data preparation, data labelling and modelling. The reviews are labelled with three sentiments: positive, neutral, and negative. During pre-processing, steps such as removing missing values, tokenization, case folding, stop word removal, stemming, and applying n-grams are performed. The models are evaluated on accuracy, and the model with the highest accuracy is chosen as the solution. In this study, data is collected from TripAdvisor and Google reviews using web scraping tools. The findings show that the Support Vector Machine model with 5-fold cross-validation is the most suitable classifier, with an accuracy of 67.97%, compared to Naive Bayes with 61.33% accuracy and the Random Forest classifier with 63.55% accuracy. In conclusion, the results of this paper could provide important information for tourism, besides determining a suitable algorithm for sentiment analysis in the tourism domain
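The pre-processing steps listed in this abstract can be sketched roughly as follows. This is an illustrative sketch, not the study's pipeline: the stop-word list is a tiny hypothetical sample, `stem` is a crude suffix stripper standing in for a real stemmer, and the choice of bigrams (`n=2`) is an assumption.

```python
import re

# Tiny hypothetical stop-word list; a real pipeline would use a fuller one.
STOP_WORDS = {"the", "a", "an", "and", "was", "is", "were", "to", "of", "in"}


def stem(token: str) -> str:
    """Crude suffix stripping; a stand-in for a real stemmer."""
    for suffix in ("ing", "ed", "s"):
        if token.endswith(suffix) and len(token) - len(suffix) >= 3:
            return token[: -len(suffix)]
    return token


def ngrams(tokens: list, n: int) -> list:
    """Return all contiguous n-grams as space-joined strings."""
    return [" ".join(tokens[i : i + n]) for i in range(len(tokens) - n + 1)]


def preprocess(text, n: int = 2) -> list:
    """Missing-value check, case folding, tokenization, stop-word
    removal, stemming, then unigrams plus n-grams as features."""
    if not text or not text.strip():
        return []  # treat missing or empty reviews as having no features
    # Lowercasing is the case folding; the regex keeps alphabetic runs only,
    # so digits, punctuation and emoticons are dropped in this sketch.
    tokens = re.findall(r"[a-z]+", text.lower())
    tokens = [stem(t) for t in tokens if t not in STOP_WORDS]
    return tokens + ngrams(tokens, n)
```

The resulting feature lists could then be vectorized and passed to a classifier evaluated with k-fold cross-validation, as the abstract describes.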

    Using sentiment analysis in tourism research: A systematic, bibliometric, and integrative review

    Purpose: Sentiment analysis is built from the information provided through text (reviews) to help organizations understand the social sentiment toward their brand, product, or service. The main purpose of this paper is to draw an overview of the topics and the use of the sentiment analysis approach in tourism research. Methods: The study is a bibliometric analysis (VOSviewer), with a systematic and integrative review. The search was carried out in March 2021 (Scopus), applying the search terms "sentiment analysis" and "tourism" in the title, abstract, or keywords, resulting in a final sample of 111 papers. Results: This analysis pointed out that China (35) and the United States (24) are the leading countries studying sentiment analysis with tourism. The first paper using sentiment analysis was published in 2012; there is a growing interest in this topic, presenting qualitative and quantitative approaches. The main results present four clusters to understand this subject. Cluster 1 discusses sentiment analysis and its application in tourism research, examining how online reviews can impact decision-making. Cluster 2 examines the resources used to perform sentiment analysis, such as social media. Cluster 3 discusses methodological approaches in sentiment analysis and tourism, such as deep learning and sentiment classification, to understand user-generated content. Cluster 4 highlights questions relating to the internet and tourism. Implications: The use of sentiment analysis in tourism research shows that governments and entrepreneurs can design and enhance communication strategies, reduce cost and time, and above all contribute to the decision-making process and understand consumer behavior

    What drives the helpfulness of online reviews? A deep learning study of sentiment analysis, pictorial content and reviewer expertise for mature destinations.

    User-generated content (UGC) is a growing driver of destination choice. Drawing on dual-process theories on how individuals process information, this study focuses on the role of central and peripheral information processing routes in the formation of consumers’ perceptions of the helpfulness of online reviews. We carried out a two-step process to address the perceived helpfulness of user-generated content: a sentiment analysis using advanced machine-learning techniques (deep learning), followed by a regression analysis. We used a database of 2,023 comments posted on TripAdvisor about two iconic Venetian cultural attractions, St. Mark’s Square (an open, free attraction) and the Doge’s Palace (a museum which charges an entry fee). Following the application of deep-learning techniques, we first identified which factors influenced whether a review received a “helpful” vote by means of logistic regression. Second, we selected those reviews which received at least one helpful vote to identify, through linear regression, the significant determinants of TripAdvisor users’ voting behaviour. The results showed that reviewer expertise is an influential factor in both free and paid-for attractions, although the impact of central cues (sentiment polarity, subjectivity and pictorial content) differs between the two attractions. Our study suggests that managers should look beyond individual ratings and focus on the sentiment analysis of online reviews, which are shown to be based on the nature of the attraction (free vs. paid-for)

    Data analytics 2016: proceedings of the fifth international conference on data analytics


    Destination image online analyzed through user generated content: a systematic literature review

    Destination Image is a concept that has been studied for a long time in tourism research. How a destination is perceived by tourists and potential new guests is an important insight, especially for local tourism managers, who use it to evaluate implemented strategies and to plan further tactics. Over the last two decades, owing to drastic digitalization, tourism research has increasingly examined the Destination Image online. This creates new challenges in the selection of sources and methods and in data collection. The aim of the present study was to systematically capture the approaches used to analyze the online Destination Image through User Generated Content, using studies from the last ten years. To this end, a Systematic Literature Review of primary research from academic databases was conducted. As a summary of the findings, a conceptual model was developed, based on the insights of the studies in the dataset, to provide guidance for the preparation phase of future online Destination Image research. In short, the main findings are: TripAdvisor.com is the main source for online Destination Image analysis. Researchers recommend using software and programming languages to collect and analyze the data. As in earlier Destination Image studies, the main methods applied in online Destination Image analysis are quantitative content analysis, qualitative content analysis and sentiment analysis, in combination with the examination of cognitive and affective factors, co-occurrence analysis, and correlation analysis. The present study has several limitations: the loss of detailed information due to reducing the studies to comparable key parameters, the absence of Anglo-American studies due to the database selection, and the lack of quality testing of the studies included. Destination Image is a concept that has long been studied in tourism research.
The question of how a destination is seen by tourists and potential new guests is an important perspective, especially for the region's tourism managers, in order to evaluate implemented strategies and plan new tactics. Over the last two decades a drastic digitalization has taken place; tourism research has adapted to this phenomenon and is now increasingly studying the destination image online. This change has created new challenges in the selection of sources and methods and in data collection. The aim of the present work was to capture systematically the approaches used to analyze the online destination image, using studies from the last ten years. To this end, primary studies from 2010-2020 in the academic databases Web of Science, ProQuest and b-on were collected using predefined search keywords. The resulting group of articles was subsequently subjected to an eligibility assessment, as recommended by Moher et al. (2009). This means that studies that did not meet the predefined criteria were excluded. The inclusion criteria were: the academic work had to be a primary reference from a scientific journal, written in English, and the sample analyzed had to originate from communication on online social media. The remaining 35 articles were then transferred to a database using a coding matrix. The coding matrix was designed to capture the key parameters of each primary study in a standardized and therefore comparable way. General information was recorded, such as the year, location and publishing journal, as well as topic-specific information, such as the field of tourism researched and the media analyzed, together with categories covering the methodology adopted, the tools used and the results obtained.
The resulting database was then used to derive statements about the methodological approach used in the analysis of online destination image. As a summary of the results, a conceptual model was developed, based on the insights obtained from the group of articles that formed the dataset for analysis, to provide a guide for the preparation phase of future research on online destination image. In summary, the main conclusions are: TripAdvisor.com is the main source for the analysis of online destination image. Researchers recommend using software and programming languages to collect and analyze the data. As in earlier Destination Image studies, the main methods applied in the analysis of online destination image are quantitative content analysis, qualitative content analysis and sentiment analysis, in combination with the analysis of cognitive and affective factors, co-occurrence analysis, and correlation analysis. The present study has several limitations: the loss of detailed information due to reducing the studies to comparable key parameters, the absence of Anglo-American studies due to the database selection, and the lack of quality testing of the studies included. (TurExperience - Tourist experiences' impacts on the destination image: searching for new opportunities to the Algarve)