    Finding Eyewitness Tweets During Crises

    Disaster response agencies have started to incorporate social media as a source of fast-breaking information to understand the needs of people affected by the many crises that occur around the world. These agencies look for tweets from within the region affected by the crisis to get the latest updates on the status of the affected region. However, only 1% of all tweets are geotagged with explicit location information. First responders lose valuable information because they cannot assess the origin of many of the tweets they collect. In this work we seek to identify non-geotagged tweets that originate from within the crisis region. Towards this, we address three questions: (1) is there a difference between the language of tweets originating within a crisis region and tweets originating outside the region, (2) what are the linguistic patterns that can be used to differentiate within-region and outside-region tweets, and (3) for non-geotagged tweets, can we automatically identify those originating within the crisis region in real time?

    Expanding Awareness: Comparing Location, Keyword, and Network Filtering Methods to Collect Hyperlocal Social Media Data

    Opportunities to collect real-time social media data during a crisis remain limited to location and keyword filtering despite the sparsity of geographic metadata and the tendency of keyword-based methods to capture information posted by remote rather than local users. Here we introduce a third method, network filtering, that uses social network ties to infer the location of social media users in a geographic community and collect data from networks of these users during a crisis. In this paper we compare all three methods by analyzing the distribution of situational reports of infrastructure damage and service disruption across location, keyword, and network-filtered social media data during a weather emergency. We find that network filtering doubles the number of situational reports collected in real time compared to location and keyword filtering alone, but that all three methods collect unique reports that can support situational awareness of incidents occurring across a community.

    Early Detection of Mass Disaster Events Using Social Media Data

    During a mass disaster, social media are a major source of information providing first-hand accounts of the unfolding situation. Automated ways to discover and collate this information in real time can be of critical value for humanitarian operations. Prior work on this task largely focused on developing message classifiers restricted to particular types of disasters, such as storms or wildfires. In this paper we investigate machine-learning methods to detect crisis-related messages where the type of the crisis is not known in advance. These methods are potentially of much greater practical value, as they can provide the means to deal with a wide range of crisis situations, including those that involve combinations of disaster types and types that were unknown at the training stage. The key challenge with this task is that events of potential relevance are extremely diverse, and correspondingly both training and test data are highly heterogeneous. This data heterogeneity makes it significantly harder for machine learning algorithms to generalize and accurately label incoming data. Our main contributions are an investigation of the scope of this problem in the context of disaster management, and novel message classification methods to overcome data heterogeneity based on ensemble methods, semi-supervised learning and feature selection. We evaluate the proposed methods on an academic benchmark dataset comprising twenty-six different disaster events, as well as in a case study where we assess the performance of the methods on real-world data. The experimental evaluation shows that the methods achieve quality of classification superior to methods previously used for this task.

    Between a Rock and a Cell Phone: Social Media Use during Mass Protests in Iran, Tunisia and Egypt

    In this paper we examine the use of social media, and especially Twitter, in Iran, Tunisia and Egypt during the mass political demonstrations and protests in June 2009, December 2010 - January 2011, and February 2011, respectively. We compare this usage with methods and findings from other studies on the use of Twitter in emergency situations, such as natural and man-made disasters. We draw on our own experiences and participant-observations as an eyewitness in Iran (first author), and on Twitter data from Iran, Tunisia and Egypt. In these three cases, Twitter filled a unique technology and communication gap at least partially. We summarize suggested directions for future research with a view to placing this work in the larger context of social media use in conditions of crisis and social convergence.

    On Identifying Disaster-Related Tweets: Matching-based or Learning-based?

    Social media such as tweets are emerging as platforms contributing to situational awareness during disasters. Information shared on Twitter by both the affected population (e.g., requesting assistance, warning) and those outside the impact zone (e.g., providing assistance) would help first responders, decision makers, and the public to understand the situation first-hand. Effective use of such information requires timely selection and analysis of tweets that are relevant to a particular disaster. Even though abundant tweets are promising as a data source, it is challenging to automatically identify relevant messages since tweets are short and unstructured, resulting in unsatisfactory classification performance of conventional learning-based approaches. Thus, we propose a simple yet effective algorithm to identify relevant messages based on matching keywords and hashtags, and provide a comparison between matching-based and learning-based approaches. To evaluate the two approaches, we put them into a framework specifically proposed for analyzing disaster-related tweets. Analysis results on eleven datasets with various disaster types show that our technique provides relevant tweets of higher quality and more interpretable results of sentiment analysis tasks when compared to the learning-based approach.
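
To make the matching-based idea concrete, the following is a minimal sketch of keyword and hashtag matching, not the authors' algorithm. The keyword set, the hashtag set, and the function name `is_relevant` are hypothetical and chosen only for illustration; the abstract does not say how the real lists are built.

```python
import re

# Hypothetical term lists for one disaster; a real deployment would
# curate these per event. They are assumptions, not from the paper.
KEYWORDS = {"earthquake", "aftershock", "collapsed"}
HASHTAGS = {"#nepalearthquake", "#quake"}

# A token is a run of word characters, optionally prefixed by '#'.
TOKEN_RE = re.compile(r"#?\w+")


def is_relevant(tweet, keywords=KEYWORDS, hashtags=HASHTAGS):
    """Return True if the tweet contains any listed keyword or hashtag."""
    tokens = {t.lower() for t in TOKEN_RE.findall(tweet)}
    # Strip the leading '#' so that '#earthquake' also matches the
    # plain keyword 'earthquake'.
    words = {t.lstrip("#") for t in tokens}
    return bool(words & keywords) or bool(tokens & hashtags)


if __name__ == "__main__":
    for text in ["Building collapsed near the market",
                 "#Quake felt downtown",
                 "Lovely weather today"]:
        print(is_relevant(text), text)
```

Exact token matching like this is easy to inspect, which fits the interpretability the abstract emphasizes, but it misses misspellings and inflected forms unless they are listed explicitly.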

    Cross-language tweet classification using Bing translator

    Master of Science. Department of Computing and Information Sciences. Doina Caragea.
    Social media affects our daily lives. It is one of the first sources for finding breaking news. In particular, Twitter is one of the popular social media platforms, with around 330 million monthly users. From local events such as Fake Patty's Day to happenings across the world, Twitter gets there first. During a disaster, tweets can be used to post warnings, the status of available medical and food supplies, emergency personnel, and updates. Users kept tweeting about Hurricane Sandy despite the lack of network coverage during the storm. Analysis of these tweets can help monitor the disaster, plan and manage the crisis, and aid in research. In this research, we use the publicly available tweets posted during several disasters and identify the relevant tweets. As the languages in the datasets are different, the Bing translation API is used to detect the language of the tweets and translate them. The translations are then used as training datasets for supervised machine learning algorithms. Supervised learning is the process of learning from a labeled training dataset. The learned classifier can then be used to predict the correct output for any valid input. When trained on more observations, the algorithm improves its predictive performance.
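
As an illustration of the supervised-learning step, here is a small sketch that trains a classifier on labeled (already translated) tweets and predicts labels for new ones. The abstract does not name a specific algorithm here, so the multinomial Naive Bayes with add-one smoothing used below is an assumption, as are the function names `train` and `predict` and the toy training data.

```python
import math
from collections import Counter, defaultdict


def tokenize(text):
    """Lowercase and split on whitespace (a deliberately simple tokenizer)."""
    return text.lower().split()


def train(examples):
    """Fit a Naive Bayes model from a list of (text, label) pairs."""
    label_counts = Counter()
    word_counts = defaultdict(Counter)
    vocab = set()
    for text, label in examples:
        label_counts[label] += 1
        for word in tokenize(text):
            word_counts[label][word] += 1
            vocab.add(word)
    return label_counts, word_counts, vocab


def predict(model, text):
    """Return the label with the highest log-posterior for the text."""
    label_counts, word_counts, vocab = model
    total = sum(label_counts.values())
    best_label, best_score = None, -math.inf
    for label, count in label_counts.items():
        score = math.log(count / total)  # log prior
        denom = sum(word_counts[label].values()) + len(vocab)
        for word in tokenize(text):
            if word in vocab:  # ignore words never seen in training
                # Add-one (Laplace) smoothing avoids log(0).
                score += math.log((word_counts[label][word] + 1) / denom)
        if score > best_score:
            best_label, best_score = label, score
    return best_label


# Toy labeled data standing in for translated tweets (hypothetical).
TRAINING = [
    ("flood water rising in the city", "relevant"),
    ("bridge collapsed after flood", "relevant"),
    ("great pizza for lunch", "irrelevant"),
    ("watching a movie tonight", "irrelevant"),
]

if __name__ == "__main__":
    model = train(TRAINING)
    print(predict(model, "flood near the bridge"))
    print(predict(model, "pizza and a movie"))
```

The translation step is omitted on purpose: the sketch assumes every tweet has already been translated into one common language before training.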