
    Aspect-based Sentiment Analysis on Car Reviews Using SpaCy Dependency Parsing and VADER

    All businesses, including car manufacturers, need to understand which aspects of their products users perceive as positive or negative in reviews, so that they can improve the negative aspects and maintain the positive ones. One of the available tools for this task is Sentiment Analysis. Traditional document-level and sentence-level sentiment analysis only classifies each document or sentence into a single class. This approach cannot capture the more fine-grained sentiment toward a specific aspect of interest, for example comfort, price, engine, or paint. Aspect-based Sentiment Analysis is therefore used instead. A total of 22,702 rows of car review data were scraped from the Edmunds website (www.edmunds.com) for a specific car manufacturer. Dependency parsing and noun phrase extraction were carried out using the SpaCy module in Python, and VADER sentiment analysis was used to determine the sentiment polarity of each noun phrase. Results showed that the vast majority of the sentiments concern positive aspects: comfortable to drive, good fuel economy / mileage, reliability, spaciousness, value for money, helpful rear camera, quiet ride, good acceleration, well-designed, good sound system, and solid build. The negative class shares some aspects with the positive class, but at a very low frequency. This finding means that the vast majority of users are satisfied with multiple aspects of the produced cars. The limitations of this research and future research directions are discussed.
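    The idea of scoring sentiment per aspect rather than per sentence can be illustrated with a minimal, standard-library-only sketch. It is not the authors' pipeline: a naive scan for opinion words directly preceding an aspect noun stands in for SpaCy's noun phrase extraction, and a tiny hand-written lexicon (`TOY_LEXICON`) stands in for VADER's valence scores. The aspect list, lexicon values, and function name are all hypothetical.

```python
from collections import defaultdict

# Hypothetical toy lexicon; a stand-in for VADER's valence scores.
TOY_LEXICON = {"good": 1.0, "comfortable": 1.0, "quiet": 0.5,
               "poor": -1.0, "noisy": -0.5}

# Hypothetical aspect nouns of interest.
ASPECTS = {"ride", "engine", "mileage", "price"}


def aspect_sentiment(review):
    """Score each aspect noun by the opinion words immediately before it."""
    tokens = [t.strip(".,!?").lower() for t in review.split()]
    scores = defaultdict(float)
    modifiers = []  # contiguous opinion words seen just before an aspect
    for tok in tokens:
        if tok in ASPECTS:
            # Treat "modifiers + aspect" as one crude noun phrase.
            scores[tok] += sum(TOY_LEXICON[m] for m in modifiers)
            modifiers = []
        elif tok in TOY_LEXICON:
            modifiers.append(tok)
        else:
            # Any other word breaks the phrase.
            modifiers = []
    return dict(scores)


print(aspect_sentiment("A quiet ride but a noisy engine."))
```

    A single sentence here yields opposite polarities for two aspects, which is exactly what a sentence-level classifier would collapse into one label.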

    Firsthand Opiates Abuse on Social Media: Monitoring Geospatial Patterns of Interest Through a Digital Cohort

    In the last decade, drug overdose deaths reached staggering proportions in the US. Besides the raw yearly death count, which is worrisome per se, an alarming picture comes from the steep acceleration of this rate, which increased by 21% from 2015 to 2016. While traditional public health surveillance suffers from its own biases and limitations, digital epidemiology offers a new lens to extract signals from the Web and social media that might be complementary to official statistics. In this paper we present a computational approach to identify a digital cohort that might provide an updated and complementary view on the opioid crisis. We introduce an information retrieval algorithm suitable for identifying relevant subspaces of discussion on social media, and use it to mine data from users showing explicit interest in discussions about opioid consumption on Reddit. Moreover, despite the pseudonymous nature of the user base, almost 1.5 million users were geolocated at the US state level, and their distribution is in good agreement with the census population distribution. A measure of the prevalence of interest in opiate consumption has been estimated at the state level, producing a novel indicator with information that is not entirely encoded in standard surveillance. Finally, we provide a domain-specific vocabulary containing informal lexicon and street nomenclature extracted from user-generated content, which researchers and practitioners can use to implement novel digital public health surveillance methodologies for supporting policy makers in fighting the opioid epidemic. Comment: Proceedings of the 2019 World Wide Web Conference (WWW '19)
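    The abstract does not define its state-level prevalence measure. One simple way to picture such an indicator, assumed here purely for illustration, is the number of users showing interest in the topic per 1,000 geolocated users in each state. The function name and the counts below are hypothetical.

```python
def interest_prevalence(interested, geolocated):
    """Return interested users per 1,000 geolocated users for each state.

    Assumed definition for illustration; not the measure used in the paper.
    """
    return {
        state: 1000 * interested.get(state, 0) / total
        for state, total in geolocated.items()
        if total > 0  # skip states with no geolocated users
    }


# Made-up counts, only to show the shape of the inputs.
print(interest_prevalence({"OH": 30}, {"OH": 10000, "TX": 20000}))
```

    Normalizing by the geolocated population rather than using raw counts keeps large states from dominating the indicator.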