840 research outputs found

    The Trending Customer Needs (TCN) Dataset: A Benchmarking and Automated Evaluation Approach for New Product Development

    In recent years, there have been many studies which summarize User Generated Content as lists of ranked keyphrases representing customer needs for the purposes of New Product Development. However, methods for the evaluation of keyphrase lists do not robustly assess solutions for these purposes. Therefore, in this paper we present the “Trending Customer Needs” (TCN) dataset of over 9000 top trending customer need keyphrases organized by month from 2007-2021 which spans 37 product categories in the area of Consumer Packaged Goods (e.g. toothpaste, eyeliner, beer etc.). TCN is a curated dataset for the benchmarking of supervised machine learning approaches in the prediction of customer needs using User Generated Content. We describe the process of curating TCN while ensuring its quality. Finally, we demonstrate its utility via a case study of Reddit discourse as a potential predictor for future customer needs in Consumer Packaged Goods

    TOWARDS MINING BRAND ASSOCIATIONS FROM USER-GENERATED CONTENT (UGC): EVIDENCE FROM LINGUISTIC CHARACTERISTICS

    Consumers’ brand associations offer qualitative explanations on a brand’s success or failure and are typically elicited using survey-based instruments. Marketers are interested in time- and cost-efficient, automated brand association elicitation approaches. To enable an automated brand association elicitation, we show that brand associations can be formalized and described by patterns of linguistic part-of-speech sequences that differ from ordinary speech, which is required for an automated extraction via text mining. Furthermore, we provide evidence that UGC is an adequate data-source for an automated brand association elicitation. We do that by comparing survey-based and UGC data-sources using linguistic part-of-speech sequence- and n-gram analysis as well as sequential pattern mining. We contribute to existing research by establishing prerequisites for the construction of novel information systems that use text mining to extract brand associations automatically from UGC

    Abstraction and cartographic generalization of geographic user-generated content: use-case motivated investigations for mobile users

    On a daily basis, a conventional internet user queries different internet services (available on different platforms) to gather information and make decisions. In most cases, knowingly or not, this user consumes data that has been generated by other internet users about his/her topic of interest (e.g. an ideal holiday destination for a family traveling by van for 10 days). Commercial service providers, such as search engines, travel booking websites, video-on-demand providers, food takeaway mobile apps and the like, have found it useful to rely on the data provided by other users who have commonalities with the querying user. Examples of commonalities are demography, location, interests, internet address, etc. This process has been in practice for more than a decade and helps the service providers to tailor their results based on the collective experience of the contributors. There has also been interest in the different research communities (including GIScience) in analyzing and understanding the data generated by internet users. The research focus of this thesis is on finding answers for real-world problems in which a user interacts with geographic information. The interactions can be in the form of exploration, querying, zooming and panning, to name but a few. We have aimed our research at investigating the potential of using geographic user-generated content to provide new ways of preparing and visualizing these data. Based on different scenarios that fulfill user needs, we have investigated the potential of finding new visual methods relevant to each scenario. The methods proposed are mainly based on pre-processing and analyzing data that has been offered by data providers (both commercial and non-profit organizations). But in all cases, the data was contributed actively by ordinary internet users (as opposed to passive data collection by sensors).
The main contributions of this thesis are the proposals for new ways of abstracting geographic information based on user-generated content contributions. Addressing different use-case scenarios and based on different input parameters, data granularities and evidently geographic scales, we have provided proposals for contemporary users (with a focus on the users of location-based services, or LBS). The findings are based on different methods such as semantic analysis, density analysis and data enrichment. If the findings of this dissertation are put into practice, LBS users will benefit by being able to explore large amounts of geographic information in more abstract and aggregated ways and get their results based on the contributions of other users. The research outcomes can be classified in the intersection between cartography, LBS and GIScience. Based on our first use case we have proposed the inclusion of an extended semantic measure directly in the classic map generalization process. In our second use case we have focused on simplifying geographic data depiction by reducing the amount of information using a density-triggered method. And finally, the third use case was focused on summarizing and visually representing relatively large amounts of information by depicting geographic objects matched to the salient topics that emerged from the data

    Predicting online product sales via online reviews, sentiments, and promotion strategies

    Purpose – The purpose of this paper is to investigate if online reviews (e.g. valence and volume), online promotional strategies (e.g. free delivery and discounts) and sentiments from user reviews can help predict product sales. Design/methodology/approach – The authors designed a big data architecture and deployed Node.js agents for scraping the Amazon.com pages using asynchronous input/output calls. The completed web crawling and scraping data sets were then preprocessed for sentimental and neural network analysis. The neural network was employed to examine which variables in the study are important predictors of product sales. Findings – This study found that although online reviews, online promotional strategies and online sentiments can all predict product sales, some variables are more important predictors than others. The authors found that the interplay effects of these variables become more important variables than the individual variables themselves. For example, online volume interactions with sentiments and discounts are more important than the individual predictors of discounts, sentiments or online volume. Originality/value – This study designed big data architecture, in combination with sentimental and neural network analysis that can facilitate future business research for predicting product sales in an online environment. This study also employed a predictive analytic approach (e.g. neural network) to examine the variables, and this approach is useful for future data analysis in a big data environment where prediction can have more practical implications than significance testing. This study also examined the interplay between online reviews, sentiments and promotional strategies, which up to now have mostly been examined individually in previous studies

    What Airbnb Reviews can Tell us? An Advanced Latent Aspect Rating Analysis Approach

    There is no doubt that the rapid growth of Airbnb has changed the lodging industry and tourists’ behaviors dramatically since the advent of the sharing economy. Airbnb welcomes customers and engages them by creating and providing unique travel experiences to “live like a local” through the delivery of lodging services. With the special experiences that Airbnb customers pursue, more investigation is needed to systematically examine the Airbnb customer lodging experience. Online reviews offer a representative look at individual customers’ personal and unique lodging experiences. Moreover, the overall ratings given by customers are reflections of their experiences with a product or service. Since customers take overall ratings into account in their purchase decisions, a study that bridges the customer lodging experience and the overall rating is needed. In contrast to traditional research methods, mining customer reviews has become a useful method to study customers’ opinions about products and services. User-generated reviews are a form of evaluation generated by peers that users post on business or other (e.g., third-party) websites (Mudambi & Schuff, 2010). The main purpose of this study is to identify the weights of latent lodging experience aspects that customers consider in order to form their overall ratings based on the eight basic emotions. This study applied both aspect-based sentiment analysis and the latent aspect rating analysis (LARA) model to predict the aspect ratings and determine the latent aspect weights. Specifically, this study extracted the innovative lodging experience aspects that Airbnb customers care about most by mining a total of 248,693 customer reviews from 6,946 Airbnb accommodations. Then, the NRC Emotion Lexicon with eight emotions was employed to assess the sentiments associated with each lodging aspect. By applying latent rating regression, the predicted aspect ratings were generated.
With the aspect ratings, the aspect weights and the predicted overall ratings were calculated. It was suggested that the overall rating be assessed based on the sentiment words of five lodging aspects: communication, experience, location, product/service, and value. It was found that, compared with the aspects of location, product/service, and value, customers expressed less joy and more surprise than they did over the aspects of communication and experience. The LRR results demonstrate that Airbnb customers care most about a listing's location, followed by experience, value, communication, and product/service. The results also revealed that even listings with the same overall rating may have different predicted aspect ratings based on the different aspect weights. Finally, the LARA model demonstrated the different preferences between customers seeking expensive versus cheap accommodations. Understanding customer experience and its role in forming customer rating behavior is important. This study empirically confirms and expands the usefulness of LARA as the prediction model in deconstructing overall ratings into aspect ratings, and then further predicting aspect level weights. This study makes meaningful academic contributions to the evolving customer behavior and customer experience research. It also benefits the shared-lodging industry through its development of pragmatic methods to establish effective marketing strategies for improving customer perceptions and to create personalized review filter systems

    Developing Knowledge Models of Social Media: A Case Study on LinkedIn

    User Generated Content (UGC) exchanged via large social networks is considered a very important knowledge source about all aspects of social engagement (e.g. interests, events, personal information, personal preferences, social experience, skills etc.). However, this data is inherently unstructured or semi-structured. In this paper, we describe the results of a case study on LinkedIn Ireland public profiles. The study investigated how the available knowledge could be harvested from LinkedIn in a novel way by developing and applying a reusable knowledge model using linked open data vocabularies and semantic web. In addition, the paper discusses the crawling and data normalisation strategies that we developed, so that high quality metadata could be extracted from the LinkedIn public profiles. Apart from the search engine in LinkedIn.com itself, there are no well known publicly available endpoints that allow users to query knowledge concerning the interests of individuals on LinkedIn. In particular, we present a system that extracts and converts information from raw web pages of LinkedIn public profiles into a machine-readable, interoperable format using data mining and Semantic Web technologies. The outcomes of our research can be summarized as follows: (1) A reusable knowledge model which can represent LinkedIn public users and company profiles using linked data vocabularies and structured data, (2) a public SPARQL endpoint to access structured data about Irish industry and public profiles, (3) a scalable data crawling strategy and mashup based data normalisation approach. The data mining and knowledge representation approaches proposed in this paper are evaluated in four ways: (1) We evaluate metadata quality using automated techniques, such as data completeness and data linkage. (2) Data accuracy is evaluated via user studies.
In particular, accuracy is evaluated by comparison of manually entered metadata fields and the metadata which was automatically extracted. (3) User perceived metadata quality is measured by asking users to rate the automatically extracted metadata in user studies. (4) Finally, the paper discusses how well the extracted metadata suits a user interface design. Overall, the evaluations show that the extracted metadata is of high quality and meets the requirements of a data visualisation user interface

    Detecting Pain Points from User-Generated Social Media Posts Using Machine Learning

    Artificial intelligence, particularly machine learning, carries high potential to automatically detect customers’ pain points, which are particular concerns that customers express and that the company can address. However, unstructured data scattered across social media make detection a nontrivial task. Thus, to help firms gain deeper insights into customers’ pain points, the authors experiment with and evaluate the performance of various machine learning models to automatically detect pain points and pain point types for enhanced customer insights. The data consist of 4.2 million user-generated tweets targeting 20 global brands from five separate industries. Among the models they train, neural networks show the best performance at overall pain point detection, with an accuracy of 85% (F1 score = .80). The best model for detecting five specific pain points was RoBERTa 100 samples using SYNONYM augmentation. This study adds another foundational building block of machine learning research in marketing academia through the application and comparative evaluation of machine learning models for natural language–based content identification and classification. In addition, the authors suggest that firms use pain point profiling, a technique for applying subclasses to the identified pain point messages to gain a deeper understanding of their customers’ concerns. ©2022 SAGE Publications. The article is protected by copyright and reuse is restricted to non-commercial and no derivative uses. Users may also download and save a local copy of an article accessed in an institutional repository for the user's personal reference.