
The expression of sentiment in user reviews of hotels

Abstract

The linguistic expression of sentiment, understood as the polarity of an opinion, is known to be domain-specific to a certain extent (Aue & Gamon, 2005; Choi et al., 2009). Even though many words and expressions convey the same evaluation across domains (e.g., “excellent”, “terrible”), many others acquire a more precise semantic orientation within a specific domain. For example, features such as size or location (and the lexical expressions used to refer to them) may or may not convey semantic orientation depending on the topic. In Sentiment Analysis (SA), it is critical that domain-specific expressions of sentiment be accounted for (Tan et al., 2007) if the system is to be useful to those who wish to explore the polarity of texts belonging to that domain. We use the software tool Lingmotif (Moreno-Ortiz, 2016) to explore a corpus of hotel reviews in the English language. Lingmotif is a lexicon-based, linguistically motivated, user-friendly, GUI-enabled, multi-platform Sentiment Analysis desktop application. It can perform SA on any type of input text, regardless of size and topic. The analysis is based on the identification of sentiment-laden words and phrases contained in the application's rich core lexicons, and employs context rules to account for sentiment shifters. It offers easy-to-interpret visual representations of quantitative data (text polarity, sentiment intensity, sentiment profile), as well as a detailed, qualitative analysis of the text in terms of its sentiment. Lingmotif can also take user-provided plugin lexicons in order to account for domain-specific sentiment expression. In this paper, we describe our procedure to identify domain-specific lexical cues for the domain of user reviews of Spanish hotels. We made use of a recently compiled corpus of reviews from the online travel booking site booking.com.
We first analyzed the entire corpus with Lingmotif using only its core (i.e., general-language) lexicon, and then manually reviewed the results to find errors and omissions caused by the lack of specialized language cues. We then encoded the identified lexical cues as a Lingmotif plugin lexicon and reran the analysis with it. This methodology allowed us, first, to obtain a very concrete description of the expression of sentiment in this domain and, from a practical perspective, to measure precisely to what extent this expression is domain-dependent.

Universidad de Málaga. Campus de Excelencia Internacional Andalucía Tech
