11,981 research outputs found

Data properties and the performance of sentiment classification for electronic commerce applications

Author: Choi Y
Lee H
Publication venue: 'Springer Science and Business Media LLC'
Publication date: 01/01/2017

Sentiment classification plays an important role in various research areas, including e-commerce applications, and a number of advanced computational intelligence techniques, including machine learning and computational linguistics, have been proposed in the literature to improve sentiment classification results. While such studies focus on improving performance with new techniques or on extending existing algorithms using previously used datasets, few give practitioners insight into which techniques work better for datasets with different properties. This paper applies four different sentiment classification techniques, drawn from machine learning (Naïve Bayes, SVM and Decision Tree) and sentiment orientation approaches, to datasets obtained from various sources (IMDB, Twitter, hotel review, and Amazon review datasets) to learn how data properties, including dataset size, length of target documents, and subjectivity of data, affect the performance of those techniques. The results of computational experiments confirm that the performance of the techniques is sensitive to data properties, including training data size, document length, and the subjectivity of the training/test data. The theoretical and practical implications of the findings are discussed. This study was partially funded by Korea National Research Foundation through Global Research Network Program (Project no. 2016S1A2A2912265) and EU funded project Policy Compass (Project no. 283700).

Southampton (e-Prints Soton)

Loughborough University Institutional Repository

Crossref

Springer - Publisher Connector

Coventry University Pure Portal

Brunel University Research Archive
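
As an illustration of one technique named in the abstract of the preceding record, the following is a minimal sketch of a multinomial Naïve Bayes sentiment classifier with add-one smoothing. It is not the authors' implementation: the function names, the whitespace tokenizer, and the four toy reviews are all assumptions made for this example.

```python
import math
from collections import Counter, defaultdict


def tokenize(text):
    # Assumption: lowercase whitespace tokenization is enough for a toy example.
    return text.lower().split()


def train_naive_bayes(docs):
    """Count documents per label and words per label from (text, label) pairs."""
    label_counts = Counter()
    word_counts = defaultdict(Counter)
    vocab = set()
    for text, label in docs:
        label_counts[label] += 1
        for tok in tokenize(text):
            word_counts[label][tok] += 1
            vocab.add(tok)
    return {"labels": label_counts, "words": word_counts, "vocab": vocab}


def predict(model, text):
    """Return the label with the highest log-posterior under add-one smoothing."""
    total_docs = sum(model["labels"].values())
    vocab_size = len(model["vocab"])
    best_label, best_score = None, -math.inf
    for label, n_docs in model["labels"].items():
        score = math.log(n_docs / total_docs)  # log prior
        total_words = sum(model["words"][label].values())
        for tok in tokenize(text):
            if tok not in model["vocab"]:
                continue  # words never seen in training are ignored
            count = model["words"][label][tok]
            score += math.log((count + 1) / (total_words + vocab_size))
        if score > best_score:
            best_label, best_score = label, score
    return best_label


# Hypothetical toy training set, not taken from any dataset in the paper.
TOY_REVIEWS = [
    ("great product works well", "pos"),
    ("love it excellent quality", "pos"),
    ("terrible broke after a day", "neg"),
    ("awful quality waste of money", "neg"),
]
model = train_naive_bayes(TOY_REVIEWS)
```

Retraining on progressively larger slices of a labeled corpus and scoring a held-out set would be one simple way to observe the training-size sensitivity the abstract reports.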

A study on text-score disagreement in online reviews

Author: A Flanagin
A Ghose
A Hotho
A Muhammad
Angelo Spognardi
B Agarwal
BA Sparks
C Cortes
E Cambria
F Bravo-Marquez
HA Schwartz
IE Vermeulen
J Hipp
JR Quinlan
M-T Martín-Valdivia
Marinella Petrocchi
Michela Fazzolari
O Netzer
P Green
Q Zhou
R Pandarachalil
S Poria
SL Lo
T Wilson
TM Mitchell
Vittoria Cozza
W Medhat
X Fang
Y Xia
Z Bu
Publication venue: 'Springer Science and Business Media LLC'
Publication date: 01/01/2017

In this paper, we focus on online reviews and employ artificial intelligence tools, taken from the cognitive computing field, to help understand the relationship between the textual part of a review and its assigned numerical score. We start from the intuitions that 1) a set of textual reviews expressing different sentiments may feature the same score (and vice versa); and 2) detecting and analyzing the mismatches between the review content and the actual score may benefit both service providers and consumers, by highlighting specific factors of satisfaction (and dissatisfaction) in texts. To test these intuitions, we adopt sentiment analysis techniques and concentrate on hotel reviews, looking for polarity mismatches therein. In particular, we first train a text classifier with a set of annotated hotel reviews taken from the Booking website. Then, we analyze a large dataset of around 160k hotel reviews collected from Tripadvisor, with the aim of detecting polarity mismatches, that is, of indicating whether the textual content of a review is in line with the associated score. Using well-established artificial intelligence techniques and analyzing in depth the reviews featuring a mismatch between the text polarity and the score, we find that, on a scale of five stars, reviews with middle scores include a mixture of positive and negative aspects. The approach proposed here, besides acting as a polarity detector, provides an effective selection of reviews from an initially very large dataset. This selection may allow both consumers and providers to focus directly on the review subset featuring a text/score disagreement, which conveniently conveys to the user a summary of positive and negative features of the review target. Comment: This is the accepted version of the paper. The final version will be published in the Journal of Cognitive Computation, available at Springer via http://dx.doi.org/10.1007/s12559-017-9496-

arXiv.org e-Print Archive

Crossref

Catalogo dei prodotti della ricerca

Archivio della ricerca- Università di Roma La Sapienza

Online Research Database In Technology

Archivio istituzionale della ricerca - Università di Padova
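
The text/score mismatch check described in the abstract of the preceding record can be sketched roughly as follows. The paper trains a text classifier on annotated reviews. This sketch swaps in a tiny hand-made word lexicon so that it stays self-contained. The lexicon, the star-to-polarity thresholds, the function names, and the sample reviews are all assumptions for illustration.

```python
# Hypothetical mini-lexicon standing in for a trained text classifier.
POSITIVE = {"clean", "friendly", "great", "comfortable", "excellent"}
NEGATIVE = {"dirty", "rude", "noisy", "terrible", "broken"}


def text_polarity(text):
    """Classify text as 'pos', 'neg' or 'mixed' by counting lexicon hits."""
    tokens = [t.strip(".,!?;:") for t in text.lower().split()]
    balance = sum(t in POSITIVE for t in tokens) - sum(t in NEGATIVE for t in tokens)
    if balance > 0:
        return "pos"
    if balance < 0:
        return "neg"
    return "mixed"


def score_polarity(stars):
    """Map a 1-5 star score to a polarity (assumed thresholds)."""
    if stars >= 4:
        return "pos"
    if stars <= 2:
        return "neg"
    return "mixed"


def find_mismatches(reviews):
    """Return the (text, stars) pairs whose text polarity disagrees with the score."""
    return [
        (text, stars)
        for text, stars in reviews
        if text_polarity(text) != score_polarity(stars)
    ]


# Hypothetical sample reviews, not drawn from the Booking or Tripadvisor data.
SAMPLE_REVIEWS = [
    ("Clean room and friendly staff.", 5),
    ("Dirty bathroom, rude staff.", 4),
    ("Great location but noisy at night.", 3),
]
mismatches = find_mismatches(SAMPLE_REVIEWS)
```

Only the second sample review is flagged here, since its negative text carries a four-star score. The third review mixes positive and negative words and carries a middle score, so the two signals agree.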

Assessment, Implication, and Analysis of Online Consumer Reviews: A Literature Review

Author: Anand Oshin
Rakshit Atanu
Srivastava Praveen Ranjan
Publication venue: AIS Electronic Library (AISeL)
Publication date: 30/06/2017

The rise of e-marketplaces, virtual communities and social networking has heightened the influence of online consumer reviews (OCR) and therefore necessitates a consolidation of the body of knowledge. This article attempts to conceptually cluster academic literature in both the management and technical domains. The study follows a framework which broadly clusters management research under two heads: OCR Assessment and OCR Implication (business implication). Parallel technical literature has been reviewed to reconcile the methodologies adopted in the analysis of text content on the web, mainly reviews. Text mining through automated tools, algorithmic contributions (dominant mainly in the technical stream of literature) and manual assessment (derived from the stream of content analysis) have been studied in this review article. The literature survey of both domains is analyzed to propose possible areas for further research. Using text analysis methods along with statistical and data mining techniques to analyze review text, and applying the resulting knowledge to managerial issues, could constitute further work. Available at: https://aisel.aisnet.org/pajais/vol9/iss2/4

AIS Electronic Library (AISeL)

Big data and Sentiment Analysis considering reviews from e-commerce platforms to predict consumer behavior

Author: Pons-Muñoz de Morales Sergi
Publication date: 01/01/2020

Final Master's Theses of the Master in Business Research (Màster de Recerca en Empresa), Faculty of Economics and Business, Universitat de Barcelona, Academic year: 2019-2020, Supervisors: Javier Manuel Romaní Fernández; Jaime Gil Lafuente. Over the last two decades, digital data has been generated on a massive scale, a phenomenon known as Big Data (BD). This phenomenon entails a change in the way data is managed and conclusions are drawn from it. Moreover, techniques and methods used in artificial intelligence shape new ways of analysis that take BD into account. Sentiment Analysis (SA), or Opinion Mining (OM), has been widely studied in the last few years because of its potential for extracting value from data. However, the topic has been explored more in engineering and linguistics than in business and marketing. For this reason, the aim of this study is to provide an accessible guide to the main BD concepts and technologies for those who do not come from a technical field, such as marketing directors. This essay is organized in two parts. First, the BD ecosystem and the technologies involved are described. Second, a systematic literature review is conducted in which articles related to the field of SA are analysed. The contribution of this study is a summary and brief description of the main technologies behind BD, as well as the techniques and procedures currently involved in SA.

Diposit Digital de la Universitat de Barcelona

BDGS: A Scalable Big Data Generator Suite in Big Data Benchmarking

Author: C Luo
DM Blei
J Leskovec
J-M Fourneau
LA Barroso
T Rabl
Z Jia
Publication date: 26/02/2014

Data generation is a key issue in big data benchmarking that aims to generate application-specific data sets to meet the 4V requirements of big data. Specifically, big data generators need to generate scalable data (Volume) of different types (Variety) under controllable generation rates (Velocity) while keeping the important characteristics of raw data (Veracity). This gives rise to various new challenges in designing generators efficiently and successfully. To date, most existing techniques can only generate limited types of data and support specific big data systems such as Hadoop. Hence we develop a tool, called Big Data Generator Suite (BDGS), to efficiently generate scalable big data while employing data models derived from real data to preserve data veracity. The effectiveness of BDGS is demonstrated by developing six data generators covering three representative data types (structured, semi-structured and unstructured) and three data sources (text, graph, and table data).

arXiv.org e-Print Archive

Crossref
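
The idea in the abstract of the preceding record, generating arbitrarily large synthetic data from a model derived from real data, can be illustrated with a deliberately simple stand-in. The sketch below fits a unigram word-frequency model to a seed text and samples new text from it. This is not BDGS's actual data model, and the function names and seed text are assumptions for this example.

```python
import random
from collections import Counter


def fit_unigram_model(real_text):
    """Derive a word-frequency model (words and their counts) from real text."""
    counts = Counter(real_text.lower().split())
    words = sorted(counts)  # fixed order keeps sampling reproducible
    weights = [counts[w] for w in words]
    return words, weights


def generate_text(model, n_words, seed=0):
    """Sample n_words words in proportion to their frequency in the seed text."""
    words, weights = model
    rng = random.Random(seed)  # local generator, so output depends only on seed
    return " ".join(rng.choices(words, weights=weights, k=n_words))


# Hypothetical seed text standing in for a real corpus.
SEED_TEXT = "big data systems need big data sets and big benchmarks"
unigram_model = fit_unigram_model(SEED_TEXT)
synthetic = generate_text(unigram_model, n_words=1000)
```

Volume is controlled by `n_words`, while the sampled word frequencies approximate those of the seed text, which is the sense in which a characteristic of the raw data is preserved in this toy version.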