867 research outputs found

    What Airbnb Reviews can Tell us? An Advanced Latent Aspect Rating Analysis Approach

    There is no doubt that the rapid growth of Airbnb has changed the lodging industry and tourists’ behaviors dramatically since the advent of the sharing economy. Airbnb welcomes customers and engages them by creating and providing unique travel experiences to “live like a local” through the delivery of lodging services. With the special experiences that Airbnb customers pursue, more investigation is needed to systematically examine the Airbnb customer lodging experience. Online reviews offer a representative look at individual customers’ personal and unique lodging experiences. Moreover, the overall ratings given by customers are reflections of their experiences with a product or service. Since customers take overall ratings into account in their purchase decisions, a study that bridges the customer lodging experience and the overall rating is needed. In contrast to traditional research methods, mining customer reviews has become a useful method to study customers’ opinions about products and services. User-generated reviews are a form of evaluation generated by peers that users post on business or other (e.g., third-party) websites (Mudambi & Schuff, 2010). The main purpose of this study is to identify the weights of latent lodging experience aspects that customers consider in order to form their overall ratings based on the eight basic emotions. This study applied both aspect-based sentiment analysis and the latent aspect rating analysis (LARA) model to predict the aspect ratings and determine the latent aspect weights. Specifically, this study extracted the innovative lodging experience aspects that Airbnb customers care about most by mining a total of 248,693 customer reviews from 6,946 Airbnb accommodations. Then, the NRC Emotion Lexicon with eight emotions was employed to assess the sentiments associated with each lodging aspect. By applying latent rating regression, the predicted aspect ratings were generated. 
With the aspect ratings, the aspect weights and the predicted overall ratings were calculated. It was suggested that the overall rating be assessed based on the sentiment words of five lodging aspects: communication, experience, location, product/service, and value. It was found that, compared with the aspects of location, product/service, and value, customers expressed less joy and more surprise than they did over the aspects of communication and experience. The latent rating regression (LRR) results demonstrate that Airbnb customers care most about a listing's location, followed by experience, value, communication, and product/service. The results also revealed that even listings with the same overall rating may have different predicted aspect ratings based on the different aspect weights. Finally, the LARA model demonstrated the different preferences between customers seeking expensive versus cheap accommodations. Understanding customer experience and its role in forming customer rating behavior is important. This study empirically confirms and expands the usefulness of LARA as the prediction model in deconstructing overall ratings into aspect ratings, and then further predicting aspect-level weights. This study makes meaningful academic contributions to the evolving customer behavior and customer experience research. It also benefits the shared-lodging industry through its development of pragmatic methods to establish effective marketing strategies for improving customer perceptions and to create personalized review filter systems.
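The relationship between aspect ratings, aspect weights, and the overall rating described in this abstract can be illustrated with a minimal sketch. It only shows the final combination step (a weighted sum of aspect ratings) and is not the study's LARA implementation, which infers ratings and weights from review text. The weights and ratings below are hypothetical values chosen for illustration; only the five aspect names and their order of importance come from the abstract.

```python
# Hypothetical aspect weights (sum to 1), ordered as the abstract reports:
# location > experience > value > communication > product/service.
ASPECT_WEIGHTS = {
    "location": 0.30,
    "experience": 0.25,
    "value": 0.20,
    "communication": 0.15,
    "product/service": 0.10,
}


def overall_rating(aspect_ratings, aspect_weights=ASPECT_WEIGHTS):
    """Combine per-aspect ratings into one overall rating as a weighted sum."""
    return sum(aspect_weights[a] * aspect_ratings[a] for a in aspect_weights)


# Two hypothetical listings with different aspect ratings.
listing_a = {a: 4.0 for a in ASPECT_WEIGHTS}
listing_b = {
    "location": 5.0,
    "experience": 3.0,
    "value": 4.0,
    "communication": 4.0,
    "product/service": 3.5,
}

rating_a = overall_rating(listing_a)
rating_b = overall_rating(listing_b)
print(round(rating_a, 2), round(rating_b, 2))
```

Both hypothetical listings end up with the same overall rating even though their aspect ratings differ, which mirrors the abstract's observation about listings with identical overall ratings.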

    Automated curation of brand-related social media images with deep learning

    This paper presents work that uses deep convolutional neural networks (CNNs) to facilitate the curation of brand-related social media images. The final goal is to facilitate searching and discovering user-generated content (UGC) with potential value for digital marketing tasks. The images are captured in real time and automatically annotated with multiple CNNs. Some of the CNNs perform generic object recognition tasks while others perform what we call visual brand identity recognition. When appropriate, we also apply object detection, usually to discover images containing logos. We report experiments with 5 real brands in which more than 1 million real images were analyzed. In order to speed up the training of custom CNNs we applied a transfer learning strategy. We examine the impact of different configurations and derive conclusions aiming to pave the way towards systematic and optimized methodologies for automatic UGC curation. Peer reviewed. Postprint (author's final draft).

    Analysis of Twitter Data Using a Multiple-level Clustering Strategy

    Twitter, currently the leading microblogging social network, has attracted a great body of research works. This paper proposes a data analysis framework to discover groups of similar Twitter messages posted on a given event. By analyzing these groups, user emotions or thoughts that seem to be associated with specific events can be extracted, as well as aspects characterizing events according to user perception. To deal with the inherent sparseness of micro-messages, the proposed approach relies on a multiple-level strategy that allows clustering text data with a variable distribution. Clusters are then characterized through the most representative words appearing in their messages, and association rules are used to highlight correlations among these words. To measure the relevance of specific words for a given event, text data has been represented in the Vector Space Model using the TF-IDF weighting score. As a case study, two real Twitter datasets have been analysed.
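To make the TF-IDF representation mentioned in this abstract concrete, here is a minimal sketch using only the Python standard library. The abstract does not state which TF-IDF variant the authors used, so this assumes a common one (relative term frequency times the natural log of inverse document frequency), and the example messages are invented.

```python
import math
from collections import Counter


def tf_idf(docs):
    """Return one {term: weight} dict per document (Vector Space Model)."""
    tokenised = [d.lower().split() for d in docs]
    n_docs = len(tokenised)
    # Document frequency: in how many documents each term occurs.
    doc_freq = Counter(term for toks in tokenised for term in set(toks))
    vectors = []
    for toks in tokenised:
        counts = Counter(toks)
        vectors.append({
            term: (count / len(toks)) * math.log(n_docs / doc_freq[term])
            for term, count in counts.items()
        })
    return vectors


# Invented micro-messages about one event.
messages = [
    "goal scored in final",
    "final match tonight",
    "rain delays final",
]
vectors = tf_idf(messages)
print(vectors[0])
```

In this variant a word that occurs in every message (here "final") gets weight zero, while a word specific to one message (here "goal") gets a positive weight, which is why TF-IDF helps surface the words most representative of a cluster.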

    How Posting Purchases on Social Media Influences Happiness: The Role of Self-Esteem

    The purpose of this article is to investigate the influences of posting one’s purchases on the content creator’s happiness attained from the purchases. A survey (n=207) was conducted on Amazon Mechanical Turk. Multiple regression and floodlight analysis were utilized to examine the data, which show that posting purchases on social media as a new way of self-presentation interplays with self-esteem in influencing consumers’ happiness obtained from the posted purchases. Specifically, posting behavior increases the happiness among consumers with higher self-esteem, but has no effects on consumers with lower self-esteem. This article fills the gap in the literature about the influences of the different self-presentation styles caused by self-esteem, and advances our understanding of how social media usage differently influences consumers with higher and lower self-esteem. This research also provides novel insights into the role of self-presentation in consumers’ happiness from purchases and the affective benefits of creating user-generated content. This article is pioneering in investigating the behavior of posting purchases on social media. It is the first research revealing the complicated interaction between the behavior and the content creators’ self-esteem in influencing happiness obtained from the purchases.

    HeBERT & HebEMO: a Hebrew BERT Model and a Tool for Polarity Analysis and Emotion Recognition

    This paper introduces HeBERT and HebEMO. HeBERT is a Transformer-based model for modern Hebrew text, which relies on a BERT (Bidirectional Encoder Representations from Transformers) architecture. BERT has been shown to outperform alternative architectures in sentiment analysis, and is suggested to be particularly appropriate for morphologically rich languages (MRLs). Analyzing multiple BERT specifications, we find that while model complexity correlates with high performance on language tasks that aim to understand terms in a sentence, a more parsimonious model better captures the sentiment of an entire sentence. Either way, our BERT-based language model outperforms all existing Hebrew alternatives on all common language tasks. HebEMO is a tool that uses HeBERT to detect polarity and extract emotions from Hebrew UGC. HebEMO is trained on a unique Covid-19-related UGC dataset that we collected and annotated for this study. Data collection and annotation followed an active learning procedure that aimed to maximize predictability. We show that HebEMO yields a high F1-score of 0.96 for polarity classification. Emotion detection reaches F1-scores of 0.78-0.97 for various target emotions, with the exception of surprise, which the model failed to capture (F1 = 0.41). These results are better than the best-reported performance, even among English-language models of emotion detection.

    Noise or music? Investigating the usefulness of normalisation for robust sentiment analysis on social media data

    In the past decade, sentiment analysis research has thrived, especially on social media. While this data genre is suitable to extract opinions and sentiment, it is known to be noisy. Complex normalisation methods have been developed to transform noisy text into its standard form, but their effect on tasks like sentiment analysis remains underinvestigated. Sentiment analysis approaches mostly include spell checking or rule-based normalisation as preprocessing and rarely investigate its impact on the task performance. We present an optimised sentiment classifier and investigate to what extent its performance can be enhanced by integrating SMT-based normalisation as preprocessing. Experiments on a test set comprising a variety of user-generated content genres revealed that normalisation improves sentiment classification performance on tweets and blog posts, showing the model’s ability to generalise to other data genres.
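As an illustration of the simple rule-based normalisation that the abstract says is commonly used as preprocessing, the sketch below lowercases text, squeezes elongated character runs, and expands a few abbreviations. It is not the SMT-based normaliser the paper evaluates; the abbreviation table and function name are hypothetical.

```python
import re

# Hypothetical abbreviation table; a real system would use a larger resource.
ABBREVIATIONS = {"u": "you", "r": "are", "gr8": "great"}


def normalise(text, abbreviations=ABBREVIATIONS):
    """Apply simple rule-based normalisation to noisy user-generated text."""
    text = text.lower()
    # Squeeze any character repeated three or more times down to two,
    # e.g. "goooood" -> "good".
    text = re.sub(r"(.)\1{2,}", r"\1\1", text)
    # Replace known abbreviations token by token.
    tokens = [abbreviations.get(tok, tok) for tok in text.split()]
    return " ".join(tokens)


print(normalise("U r gr8"))
print(normalise("soooo goooood"))
```

Rules like these are cheap but brittle: the second example yields "soo good" rather than "so good", the kind of residual noise that motivates studying more complex normalisation methods.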

    Detection, Modelling and Visualisation of Georeferenced Emotions from User-Generated Content

    In recent years emotion-related applications like smartphone apps that document and analyse the emotions of the user, have become very popular. But research also can deal with human emotions in a very technology-driven approach. Thus space-related emotions are of interest as well which can be visualised cartographically and can be captured in different ways. The research project of this dissertation deals with the extraction of georeferenced emotions from the written language in the metadata of Flickr and Panoramio photos, thus from user-generated content, as well as with their modelling and visualisation. Motivation is the integration of an emotional component into location-based services for tourism since only factual information is considered thus far although places have an emotional impact. The metadata of those user-generated photos contain descriptions of the place that is depicted within the respective picture. The words used have affective connotations which are determined with the help of emotional word lists. The emotion that is associated with the particular word in the word list is described on the basis of the two dimensions ‘valence’ and ‘arousal’. Together with the coordinates of the respective photo, the extracted emotion forms a georeferenced emotion. The algorithm that was developed for the extraction of these emotions applies different approaches from the field of computer linguistics and considers grammatical special cases like the amplification or negation of words. The algorithm was applied to a dataset of Flickr and Panoramio photos of Dresden (Germany). The results are an emotional characterisation of space which makes it possible to assess and investigate specific features of georeferenced emotions. These features are especially related to the temporal dependence and the temporal reference of emotions on one hand; on the other hand collectively and individually perceived emotions have to be distinguished. 
As a consequence, a place does not necessarily have to be connected with merely one emotion but possibly also with several. The analysis was carried out with the help of different cartographic visualisations. The temporal occurrence of georeferenced emotions was examined in detail. Hence the dissertation focuses on fundamental research into the extraction of space-related emotions from georeferenced user-generated content as well as their visualisation. However, as an outlook, further research questions and core themes are identified which arose during the investigations. This shows that this subject is far from being exhausted.
Table of contents:
Statement of Authorship; Acknowledgements; Abstract; Zusammenfassung; Table of Contents; List of Figures; List of Tables; List of Abbreviations
1 Introduction; 1.1 Motivation; 1.2 Research Questions; 1.3 Thesis Structure; 1.4 Underlying Publications
2 State of the Art; 2.1 Emotions; 2.1.1 Definitions and Terms; 2.1.2 Emotion Theories; 2.1.2.1 James-Lange Theory; 2.1.2.2 Two-Factor Theory; 2.1.3 Structuring Emotions; 2.1.3.1 Dimensional Approaches; 2.1.3.2 Basic Emotions; 2.1.3.3 Empirical Similarity Categories; 2.1.4 Acquisition of Emotions; 2.1.4.1 Verbal Procedures; 2.1.4.2 Non-Verbal Procedures; 2.1.5 Relation between Emotions and Places; 2.1.6 Emotions in Language; 2.1.7 Affect Analysis and Sentiment Analysis; 2.2 User-Generated Content; 2.2.1 Definition and Characterisation; 2.2.2 Advantages and Disadvantages; 2.2.3 Tagging; 2.2.4 Inaccuracies; 2.2.5 Flickr and Panoramio; 2.2.5.1 Flickr; 2.2.5.2 Panoramio; 2.3 Related Work on Georeferenced Emotions; 2.3.1 Emotional Data Resulting from Biometric Measurements; 2.3.1.1 Bio Mapping; 2.3.1.2 EmBaGIS; 2.3.1.3 Ein emotionales Kiezportrait; 2.3.2 Emotional Data Resulting from Empirical Surveys; 2.3.2.1 EmoMap; 2.3.2.2 WiMo; 2.3.2.3 ECDESUP; 2.3.2.4 Map of World Happiness; 2.3.2.5 Emotional Study of Yeongsan River Basin; 2.3.3 Emotional Data Resulting from User-Generated Content; 2.3.3.1 Emography; 2.3.3.2 Twittermood; 2.3.3.3 Tweetbeat; 2.3.3.4 Beautiful picture of an ugly place; 2.3.4 Visualisation in the Related Work
3 Methods; 3.1 Approach for Extracting Georeferenced Emotions from the Metadata of Flickr and Panoramio Photos; 3.2 Implemented Algorithm; 3.3 Grammatical Special Cases; 3.3.1 Degree Words; 3.3.2 Negation; 3.3.2.1 Syntactic Negation in English Language; 3.3.2.2 Syntactic Negation in German Language; 3.3.3 Modification of Words Affected by Grammatical Special Cases
4 Visualisation and Analysis of Extracted Georeferenced Emotions; 4.1 Data Basis; 4.2 Density Maps; 4.3 Inverse Distance Weight; 4.4 3D Visualisation; 4.5 Choropleth Mapping; 4.6 Point Symbols; 4.7 Impact of Considering Grammatical Special Cases
5 Investigation in Temporal Aspects; 5.1 Annually Occurrence of Emotions; 5.2 Periodic Events; 5.3 Single Events; 5.4 Dependence of Georeferenced Emotions on Different Periods of Time; 5.4.1 Seasons; 5.4.2 Months; 5.4.3 Weekdays; 5.4.4 Times of Day; 5.5 Potentials and Limits of Temporal Analyses
6 Discussion; 6.1 Evaluation; 6.2 Weaknesses and Problems
7 Conclusions and Outlook; 7.1 Answers to the Research Questions; 7.2 Outlook and Future Work
8 Bibliography; Appendices
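The extraction idea described in this abstract (look up words in an emotional word list, describe them by valence and arousal, adjust for amplification and negation, and attach the photo's coordinates) can be sketched as follows. This is a simplified illustration and not the dissertation's algorithm: the word list, degree factors, negation words, value ranges, and the one-token look-back rule are all assumptions made for the example.

```python
# Hypothetical word list: word -> (valence in [-1, 1], arousal in [0, 1]).
LEXICON = {"beautiful": (0.8, 0.5), "ugly": (-0.7, 0.4), "calm": (0.6, 0.1)}
# Hypothetical degree words and their amplification factors.
DEGREE_WORDS = {"very": 1.5, "slightly": 0.5}
NEGATIONS = {"not", "no", "never"}


def extract_emotion(text, lat, lon):
    """Return a georeferenced emotion dict, or None if no word matches."""
    tokens = text.lower().split()
    scores = []
    for i, token in enumerate(tokens):
        if token not in LEXICON:
            continue
        valence, arousal = LEXICON[token]
        j = i - 1
        # Amplification: a degree word directly before the emotion word.
        if j >= 0 and tokens[j] in DEGREE_WORDS:
            valence *= DEGREE_WORDS[tokens[j]]
            j -= 1
        # Negation: a negation word before the (possibly amplified) word.
        if j >= 0 and tokens[j] in NEGATIONS:
            valence = -valence
        # Keep valence inside the assumed [-1, 1] range.
        valence = max(-1.0, min(1.0, valence))
        scores.append((valence, arousal))
    if not scores:
        return None
    return {
        "lat": lat,
        "lon": lon,
        "valence": sum(v for v, _ in scores) / len(scores),
        "arousal": sum(a for _, a in scores) / len(scores),
    }


# Approximate coordinates of Dresden, used only as example input.
plain = extract_emotion("beautiful old bridge", 51.05, 13.74)
negated = extract_emotion("not very beautiful today", 51.05, 13.74)
print(plain, negated)
```

The contrast between the two example descriptions shows why the abstract stresses grammatical special cases: without the negation rule, both would be scored as positive.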

    Applying Deep Learning Techniques for Sentiment Analysis to Assess Sustainable Transport

    Users voluntarily generate large amounts of textual content by expressing their opinions, in social media and specialized portals, on every possible issue, including transport and sustainability. In this work we have leveraged such User Generated Content to obtain a high accuracy sentiment analysis model which automatically analyses the negative and positive opinions expressed in the transport domain. In order to develop such a model, we have semiautomatically generated an annotated corpus of opinions about transport, which has then been used to fine-tune a large pretrained language model based on recent deep learning techniques. Our empirical results demonstrate the robustness of our approach, which can be applied to automatically process massive amounts of opinions about transport. We believe that our method can help to complement data from official statistics and traditional surveys about transport sustainability. Finally, apart from the model and annotated dataset, we also provide a transport classification score with respect to the sustainability of the transport types found in the use case dataset. This work has been partially funded by the Spanish Ministry of Science, Innovation and Universities (DeepReading RTI2018-096846-B-C21, MCIU/AEI/FEDER, UE), Ayudas Fundación BBVA a Equipos de Investigación Científica 2018 (BigKnowledge), DeepText (KK-2020/00088), funded by the Basque Government and the COLAB19/19 project funded by the UPV/EHU. Rodrigo Agerri is also funded by the RYC-2017-23647 fellowship and acknowledges the donation of a Titan V GPU by the NVIDIA Corporation.

    Analyzing user reviews of messaging Apps for competitive analysis

    Dissertation presented as the partial requirement for obtaining a Master's degree in Data Science and Advanced Analytics, specialization in Data Science. The rise of various messaging apps has resulted in fierce competition, and the era of Web 2.0 enables business managers to gain competitive intelligence from user-generated content (UGC). Text-mining UGC for competitive intelligence has been drawing great interest from researchers. However, relevant studies mostly focus on industries such as hospitality and products, and few studies have applied such techniques to perform competitive analysis for messaging apps. Here, we conducted a competitive analysis based on topic modeling and sentiment analysis by text-mining 27,479 user reviews of four iOS messaging apps, namely Messenger, WhatsApp, Signal and Telegram. The results show that the performance of topic modeling and sentiment analysis is encouraging, and that a combination of the extracted app aspect-based topics and the adjusted sentiment scores can effectively reveal meaningful competitive insights into user concerns, competitive strengths and weaknesses as well as changes of user sentiments over time. We anticipate that this study will not only advance the existing literature on competitive analysis using text mining techniques for messaging apps but also help existing players and new entrants in the market to sharpen their competitive edge by better understanding their user needs and the industry trends.

    Predictive Analytics on Emotional Data Mined from Digital Social Networks with a Focus on Financial Markets

    This dissertation is a cumulative dissertation and is comprised of five articles. User-Generated Content (UGC) comprises a substantial part of communication via social media. In this dissertation, UGC that carries and facilitates the exchange of emotions is referred to as “emotional data.” People “produce” emotional data, that is, they express their emotions via tweets, forum posts, blogs, and so on, or they “consume” it by being influenced by expressed sentiments, feelings, opinions, and the like. Decisions often depend on shared emotions and data, which again lead to new data because decisions may change behaviors or results. “Emotional Data Intelligence” ultimately seeks an answer to the question of how all the different emotions expressed in public online sources influence decision-making processes. The overarching research topic of this dissertation is the question of whether network structures and emotional sentiment data extracted from digital social networks contain predictive information or are just noise. Underlying data was collected from different social media sources, such as Twitter, blogs, message boards, or online news and social networking sites, such as Xing. By means of methodologies of social network analysis (SNA), sentiment analysis, and predictive analysis, the individual contributions of this dissertation study whether sentiment data from social media or online social networking structures can predict real-world behaviors. The focus lies on the analysis of emotional data and network structures and their predictive power for financial markets. With the formal construction of the data analysis methodologies introduced in the individual contributions, this dissertation contributes to the theories of social network analysis, sentiment analysis, and predictive analytics.