Analysis of Detection Models for Disaster-Related Tweets

Abstract

Social media is perceived as a rich resource for disaster management and relief efforts, but the high class imbalance between disaster-related and non-disaster-related messages makes reliable detection challenging. We analyze and compare the effectiveness of three state-of-the-art machine learning models for detecting disaster-related tweets. To this end, we introduce the Disaster Tweet Corpus 2020, an extended compilation of existing resources, which comprises a total of 123,166 tweets from 46 disasters covering 9 disaster types. Our findings from a large series of experiments include: detection models work equally well over a broad range of disaster types when trained for the respective type; domain transfer across disaster types leads to unacceptable performance drops; and, relatedly, type-agnostic classification models behave more robustly, albeit at a lower effectiveness level. Altogether, the average misclassification rate of 3.8% for performance-optimized detection models indicates effective classification knowledge, but it comes at the price of insufficient generalizability.
