111,492 research outputs found
Automatic domain ontology extraction for context-sensitive opinion mining
Automated analysis of the sentiments presented in online consumer feedback can facilitate both organizations' business strategy development and individual consumers' comparison shopping. Nevertheless, existing opinion mining methods either adopt a context-free sentiment classification approach or rely on a large number of manually annotated training examples to perform context-sensitive sentiment classification. Guided by the design science research methodology, we illustrate the design, development, and evaluation of a novel fuzzy domain ontology based context-sensitive opinion mining system. Our novel ontology extraction mechanism, underpinned by a variant of Kullback-Leibler divergence, can automatically acquire contextual sentiment knowledge across various product domains to improve the sentiment analysis processes. Evaluated based on a benchmark dataset and real consumer reviews collected from Amazon.com, our system shows remarkable performance improvement over the context-free baseline.
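As a rough illustration of the divergence measure named in this abstract, the sketch below computes plain Kullback-Leibler divergence between the term distributions of two small review sets. It is not the authors' variant, which the abstract does not specify. The function names, the add-one smoothing, and the toy review texts are all assumptions made for this example.

```python
import math
from collections import Counter


def smoothed_dist(tokens, vocab, alpha=1.0):
    """Term distribution over vocab with add-alpha smoothing (assumed choice)."""
    counts = Counter(tokens)
    total = len(tokens) + alpha * len(vocab)
    return {w: (counts[w] + alpha) / total for w in vocab}


def kl_terms(p, q):
    """Per-term contributions p(w) * log(p(w) / q(w)) to KL(P || Q)."""
    return {w: p[w] * math.log(p[w] / q[w]) for w in p}


def kl_divergence(p, q):
    """Standard KL(P || Q) for two distributions over the same vocabulary."""
    return sum(kl_terms(p, q).values())


# Toy, made-up review tokens for one hypothetical product domain.
pos = "battery lasts long long battery great".split()
neg = "battery died fast short battery poor".split()
vocab = sorted(set(pos) | set(neg))

p = smoothed_dist(pos, vocab)
q = smoothed_dist(neg, vocab)

divergence = kl_divergence(p, q)
contributions = kl_terms(p, q)
# The term that most separates positive from negative reviews in this toy data.
top_term = max(contributions, key=contributions.get)
```

Terms with large positive contributions are more typical of the first review set than the second. That is one plausible way to surface domain-specific sentiment vocabulary, though the paper's actual extraction mechanism may differ.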
Inferring Networks of Substitutable and Complementary Products
In a modern recommender system, it is important to understand how products
relate to each other. For example, while a user is looking for mobile phones,
it might make sense to recommend other phones, but once they buy a phone, we
might instead want to recommend batteries, cases, or chargers. These two types
of recommendations are referred to as substitutes and complements: substitutes
are products that can be purchased instead of each other, while complements are
products that can be purchased in addition to each other.
Here we develop a method to infer networks of substitutable and complementary
products. We formulate this as a supervised link prediction task, where we
learn the semantics of substitutes and complements from data associated with
products. The primary source of data we use is the text of product reviews,
though our method also makes use of features such as ratings, specifications,
prices, and brands. Methodologically, we build topic models that are trained to
automatically discover topics from text that are successful at predicting and
explaining such relationships. Experimentally, we evaluate our system on the
Amazon product catalog, a large dataset consisting of 9 million products, 237
million links, and 144 million reviews.
Comment: 12 pages, 6 figures
A study on text-score disagreement in online reviews
In this paper, we focus on online reviews and employ artificial intelligence
tools, taken from the cognitive computing field, to help understand the
relationships between the textual part of the review and the assigned numerical
score. We move from the intuitions that 1) a set of textual reviews expressing
different sentiments may feature the same score (and vice-versa); and 2)
detecting and analyzing the mismatches between the review content and the
actual score may benefit both service providers and consumers, by highlighting
specific factors of satisfaction (and dissatisfaction) in texts.
To prove the intuitions, we adopt sentiment analysis techniques and we
concentrate on hotel reviews, to find polarity mismatches therein. In
particular, we first train a text classifier with a set of annotated hotel
reviews, taken from the Booking website. Then, we analyze a large dataset, with
around 160k hotel reviews collected from Tripadvisor, with the aim of detecting
a polarity mismatch, indicating if the textual content of the review is in
line, or not, with the associated score.
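The mismatch check described above can be sketched as follows. The abstract states that the authors train a text classifier on annotated Booking reviews. This sketch swaps in a tiny hand-written word lexicon as a stand-in for that classifier, so it only illustrates the comparison step. The lexicon, the star thresholds, and the sample reviews are assumptions made for this example.

```python
import re

# Hypothetical mini-lexicon standing in for a trained polarity classifier.
POSITIVE = {"clean", "friendly", "great", "comfortable"}
NEGATIVE = {"dirty", "noisy", "rude", "broken"}


def text_polarity(text):
    """Label text by the balance of positive and negative lexicon words."""
    words = re.findall(r"[a-z]+", text.lower())
    balance = sum(w in POSITIVE for w in words) - sum(w in NEGATIVE for w in words)
    if balance > 0:
        return "positive"
    if balance < 0:
        return "negative"
    return "neutral"


def score_polarity(stars):
    """Map a 1-5 star score to a polarity label (thresholds are assumed)."""
    if stars >= 4:
        return "positive"
    if stars <= 2:
        return "negative"
    return "neutral"


def find_mismatches(reviews):
    """Return the (text, stars) pairs whose text and score polarities differ."""
    return [(text, stars) for text, stars in reviews
            if text_polarity(text) != score_polarity(stars)]


# Made-up sample reviews.
reviews = [
    ("Clean room and friendly staff", 5),
    ("Dirty bathroom, noisy street, rude staff", 4),
    ("Great location but broken shower", 3),
]
mismatches = find_mismatches(reviews)
```

In this toy data only the second review is flagged, because its text reads as negative while its score is positive.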
Using well established artificial intelligence techniques and analyzing in
depth the reviews featuring a mismatch between the text polarity and the score,
we find that, on a scale of five stars, those reviews ranked with middle scores
include a mixture of positive and negative aspects.
The approach proposed here, besides acting as a polarity detector, provides an
effective selection of reviews from an initially very large dataset that may
allow both consumers and providers to focus directly on the review subset
featuring a text/score disagreement, which conveniently conveys to the user a
summary of positive and negative features of the review target.
Comment: This is the accepted version of the paper. The final version will be
published in the Journal of Cognitive Computation, available at Springer via
http://dx.doi.org/10.1007/s12559-017-9496-
A Domain Oriented LDA Model for Mining Product Defects from Online Customer Reviews
Online reviews provide important demand-side knowledge for product manufacturers to improve product quality. However, discovering and quantifying potential products' defects from large amounts of online reviews is a nontrivial task. In this paper, we propose a Latent Product Defect Mining model that identifies critical product defects. We define domain-oriented key attributes, such as components and keywords used to describe a defect, and build a novel LDA model to identify and acquire integral information about product defects. We conduct comprehensive evaluations including quantitative and qualitative evaluations to ensure the quality of discovered information. Experimental results show that the proposed model outperforms the standard LDA model, and could find more valuable information. Our research contributes to the extant product quality analytics literature and has significant managerial implications for researchers, policy makers, customers, and practitioners.
Fashion Conversation Data on Instagram
The fashion industry is establishing its presence on a number of
visual-centric social media like Instagram. This creates an interesting clash
as fashion brands that have traditionally practiced highly creative and
editorialized image marketing now have to engage with people on the platform
that epitomizes impromptu, realtime conversation. What kinds of fashion images
do brands and individuals share and what are the types of visual features that
attract likes and comments? In this research, we take both quantitative and
qualitative approaches to answer these questions. We analyze visual features of
fashion posts first via manual tagging and then via training on convolutional
neural networks. The classified images were examined across four types of
fashion brands: mega couture, small couture, designers, and high street. We
find that while product-only images make up the majority of fashion
conversation in terms of volume, body snaps and face images that portray
fashion items more naturally tend to receive a larger number of likes and
comments by the audience. Our findings bring insights into building an
automated tool for classifying or generating influential fashion information.
We make our novel dataset of 24,752 labeled images on fashion conversations,
containing visual and textual cues, available for the research community.
Comment: 10 pages, 6 figures, This paper will be presented at ICWSM'1