Aspect-based Sentiment Analysis on Car Reviews Using SpaCy Dependency Parsing and VADER
All businesses, including car manufacturers, need to understand which aspects of their products users perceive as positive or negative in their reviews, so that they can improve the negative aspects and maintain the positive ones. One of the available tools for this task is Sentiment Analysis. Traditional document-level and sentence-level sentiment analysis only classifies each document or sentence into a single class. This approach cannot capture the more fine-grained sentiment toward a specific aspect of interest, for example comfort, price, engine, or paint. Therefore, Aspect-based Sentiment Analysis is used in this case. A total of 22,702 rows of car review data were scraped from the Edmunds website (www.edmunds.com) for a specific car manufacturer. Dependency Parsing and noun phrase extraction were carried out using the SpaCy module in Python, and VADER sentiment analysis was used to determine the polarity of the sentiment for each noun phrase. Results showed that the vast majority of the sentiments concern positive aspects: comfortable to drive, good fuel economy / mileage, reliability, spaciousness, value for money, helpful rear camera, quiet ride, good acceleration, well-designed, good sound system, and solid build. The negative class shares some aspects with the positive class, but these occur at a very low frequency. This finding suggests that the vast majority of users are satisfied with multiple aspects of the produced cars. The limitations of this research and future research directions are discussed.
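The idea of scoring sentiment per aspect rather than per document can be illustrated with a minimal, standard-library-only sketch. It is not the pipeline described in the abstract: a crude two-token window stands in for SpaCy's dependency parsing and noun phrase extraction, and a hand-written toy lexicon stands in for VADER. The names `TOY_LEXICON`, `ASPECTS`, `aspect_sentiments`, and `summarize` are hypothetical and exist only for this illustration.

```python
import re
from collections import Counter, defaultdict

# Hypothetical toy lexicon; a stand-in for VADER's valence scores.
TOY_LEXICON = {
    "comfortable": 1.0, "good": 0.8, "quiet": 0.6, "solid": 0.6,
    "poor": -0.8, "noisy": -0.6, "bad": -0.8,
}

# Hypothetical aspect terms of interest.
ASPECTS = {"ride", "engine", "mileage", "seats", "price", "camera"}


def aspect_sentiments(review):
    """Return (aspect, score) pairs for every aspect term found in a review.

    The score sums the lexicon values of up to two tokens preceding the
    aspect term. This window is a crude stand-in for a real noun phrase.
    """
    tokens = re.findall(r"[a-z']+", review.lower())
    pairs = []
    for i, token in enumerate(tokens):
        if token in ASPECTS:
            window = tokens[max(0, i - 2):i]
            score = sum(TOY_LEXICON.get(word, 0.0) for word in window)
            pairs.append((token, score))
    return pairs


def summarize(reviews):
    """Count positive, negative, and neutral mentions per aspect."""
    counts = defaultdict(Counter)
    for review in reviews:
        for aspect, score in aspect_sentiments(review):
            if score > 0:
                label = "positive"
            elif score < 0:
                label = "negative"
            else:
                label = "neutral"
            counts[aspect][label] += 1
    return {aspect: dict(counter) for aspect, counter in counts.items()}


if __name__ == "__main__":
    sample_reviews = [  # made-up sentences, not taken from the Edmunds data
        "Very quiet ride and good mileage.",
        "The engine is noisy but comfortable seats.",
        "Poor mileage.",
    ]
    for aspect, labels in summarize(sample_reviews).items():
        print(aspect, labels)
```

In the second sample sentence the window approach labels "engine" as neutral, because "noisy" follows the aspect term instead of preceding it. Linking an opinion word to its aspect across that kind of construction is what a dependency parse is typically used for.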
Firsthand Opiates Abuse on Social Media: Monitoring Geospatial Patterns of Interest Through a Digital Cohort
In the last decade, drug overdose deaths reached staggering proportions in the
US. Besides the raw yearly death count, which is worrisome per se, an alarming
picture comes from the steep acceleration of the death rate, which increased by 21%
from 2015 to 2016. While traditional public health surveillance suffers from
its own biases and limitations, digital epidemiology offers a new lens to
extract signals from Web and Social Media that might be complementary to
official statistics. In this paper we present a computational approach to
identify a digital cohort that might provide an updated and complementary view
on the opioid crisis. We introduce an information retrieval algorithm suitable
to identify relevant subspaces of discussion on social media, for mining data
from users showing explicit interest in discussions about opioid consumption in
Reddit. Moreover, despite the pseudonymous nature of the user base, almost 1.5
million users were geolocated at the US state level, with a distribution in good
agreement with the census population. A measure of prevalence of
interest in opiate consumption has been estimated at the state level, producing
a novel indicator with information that is not entirely encoded in the standard
surveillance. Finally, we provide a domain-specific vocabulary
containing informal lexicon and street nomenclature extracted from user-generated
content that can be used by researchers and practitioners to implement novel
digital public health surveillance methodologies for supporting policy makers
in fighting the opioid epidemic.
Comment: Proceedings of the 2019 World Wide Web Conference (WWW '19
- …
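A state-level indicator of the kind the abstract mentions can be sketched as follows. The abstract does not give the formula, so this sketch assumes prevalence is the share of geolocated users in a state who show interest in the topic, and it assumes agreement with the census is checked with a Pearson correlation. The function names and all numbers are hypothetical placeholders, not results from the paper.

```python
from math import sqrt


def prevalence_by_state(geolocated, interested):
    """Share of geolocated users per state who show interest in the topic.

    Assumed definition: interested users divided by geolocated users.
    States with no geolocated users are skipped.
    """
    return {
        state: interested.get(state, 0) / total
        for state, total in geolocated.items()
        if total > 0
    }


def pearson(xs, ys):
    """Pearson correlation coefficient between two equal-length sequences."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / sqrt(var_x * var_y)


if __name__ == "__main__":
    # Made-up placeholder counts for three fictional states.
    geolocated_users = {"State A": 1000, "State B": 500, "State C": 2000}
    interested_users = {"State A": 10, "State B": 20, "State C": 30}
    census_population = {"State A": 110_000, "State B": 48_000, "State C": 205_000}

    print(prevalence_by_state(geolocated_users, interested_users))

    # Compare the distribution of geolocated users with census population.
    states = sorted(geolocated_users)
    r = pearson(
        [geolocated_users[s] for s in states],
        [census_population[s] for s in states],
    )
    print(f"correlation with census population: {r:.3f}")
```

Normalizing by geolocated users instead of raw counts keeps populous states from dominating the indicator simply because they have more users.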