1,805 research outputs found

    Multi-class twitter data categorization and geocoding with a novel computing framework

    This study details progress in transportation data analysis with a novel computing framework, in keeping with the continuous evolution of computing technology. The framework combines a Labeled Latent Dirichlet Allocation (L-LDA)-incorporated Support Vector Machine (SVM) classifier with a supporting computing strategy, applied to publicly available Twitter data, to determine transportation-related events and provide reliable information to travelers. The analytical approach includes analyzing tweets using text classification and geocoding locations based on string similarity. A case study conducted for New York City and its surrounding areas demonstrates the feasibility of the analytical approach. Approximately 700,010 tweets from one week are analyzed to extract relevant transportation-related information. The SVM classifier achieves > 85% accuracy in identifying transportation-related tweets from structured data. To further categorize the transportation-related tweets into sub-classes (incident, congestion, construction, special events, and other events), three supervised classifiers are used: L-LDA, SVM, and L-LDA-incorporated SVM. Findings from this study demonstrate that the analytical framework, which uses the L-LDA-incorporated SVM, can classify roadway transportation-related data from Twitter with over 98.3% accuracy, which is significantly higher than the accuracies achieved by standalone L-LDA and SVM.
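    The abstract mentions geocoding locations based on string similarity but does not say which similarity measure or gazetteer was used. As a rough illustration only, the sketch below matches a place mention against a small hypothetical gazetteer using Python's standard-library `difflib.SequenceMatcher`. The gazetteer entries, the approximate coordinates, the `geocode` helper, and the 0.8 threshold are all assumptions made for this example, not details taken from the study.

```python
from difflib import SequenceMatcher

# Hypothetical gazetteer: place name -> (latitude, longitude).
# Coordinates are approximate and for illustration only.
GAZETTEER = {
    "Brooklyn Bridge": (40.7061, -73.9969),
    "Lincoln Tunnel": (40.7628, -74.0110),
    "Holland Tunnel": (40.7274, -74.0211),
}


def similarity(a: str, b: str) -> float:
    """Case-insensitive similarity ratio in [0, 1]."""
    return SequenceMatcher(None, a.lower(), b.lower()).ratio()


def geocode(mention: str, gazetteer=GAZETTEER, threshold: float = 0.8):
    """Return (name, (lat, lon)) for the closest gazetteer entry,
    or None if no entry reaches the (assumed) similarity threshold."""
    best_name = max(gazetteer, key=lambda name: similarity(mention, name))
    if similarity(mention, best_name) < threshold:
        return None
    return best_name, gazetteer[best_name]


# A misspelled mention still resolves to the intended entry,
# while an unrelated place name is rejected.
print(geocode("brooklyn brige"))
print(geocode("Times Square"))
```

    A threshold of this kind trades recall for precision: lowering it resolves more misspelled mentions but risks matching a tweet to the wrong location.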

    Incident Streams 2019: Actionable Insights and How to Find Them

    The ubiquity of mobile internet-enabled devices, combined with widespread social media use during emergencies, is posing new challenges for response personnel. In particular, service operators are now expected to monitor these online channels to extract actionable insights and answer questions from the public. A lack of adequate tools makes this monitoring impractical at the scale of many emergencies. The TREC Incident Streams (TREC-IS) track drives research into closing this technology gap by bringing together academia and industry to develop techniques for extracting actionable insights from social media streams during emergencies. This paper covers the second year of TREC-IS, hosted in 2019 with two editions, 2019-A and 2019-B, contributing 12 new events and approximately 20,000 new tweets across 25 information categories, with 15 research groups participating from across the world. This paper provides an overview of these new editions, actionable insights from data labelling, and the automated techniques employed by participant systems that appear most effective.

    Classifying #MeToo Hash-tagged Tweets by Semantics to Understand the Extent of Sexual Harassment

    This thesis contains a program that processes tweets from Twitter that use the hashtag "#MeToo" and categorizes them by their relevance to the movement, their stance on the movement, and the type of sexual harassment expressed (if applicable). Being able to work with a narrowed set of tweets belonging to a specific category creates the capacity to do more in-depth research and analysis, exploring Twitter as a special platform for discussing these sensitive topics and showing that this online space for expressing personal experiences has opened unprecedented avenues of study. This thesis also contains research into additional solutions for addressing sexual harassment online, exploring the needs of society through the results of a questionnaire administered to university students, which asked for opinions on how sexual harassment is addressed on social media, as well as through a literature review of current obstacles for victims.