2,597 research outputs found

    Multilingual Text Classification from Twitter during Emergencies

    Get PDF
    Social media such as Twitter are a valuable source of information due to their diffusion among citizens and to their speed in sharing data worldwide. However, it is challenging to automatically extract information from such data, given the huge amount of useless content. We propose a multilingual tool that automatically categorizes tweets according to their information content. To achieve real-time classification while supporting any language, we apply a deep learning classifier, using multilingual word embeddings. This allows our solution to be trained on one language and to apply it to any other language via zero-shot inference achieving acceptable performance loss

    Community Segmentation and Inclusive Social Media Listening

    Get PDF
    Social media analytics provide a generalized picture of situational awareness from the conversations happening among communities present in social media channels that are that are, or risk being affected by crises. The generalized nature of results from these analytics leaves underrepresented communities in the background. When considering social media analytics, concerns, sentiment, and needs are perceived as homogenous. However, offline, the community is diverse, often segmented by age group, occupation, or language, to name a few. Through our analysis of interviews from professionals using social media as a source of information in public service organizations, we argue that practitioners might not be perceiving this segmentation from the social media conversation. In addition, practitioners who are aware of this limitation, agree that there is room for improvement and resort to alternative mechanisms to understand, reach, and provide services to these communities in need. Thus, we analyze current perceptions and activities around segmentation and provide suggestions that could inform the design of social media analytics tools that support inclusive public services for all, including persons with disabilities and from other disadvantaged groups.publishedVersionPaid open acces

    Classifying Crises-Information Relevancy with Semantics

    Get PDF
    Social media platforms have become key portals for sharing and consuming information during crisis situations. However, humanitarian organisations and affected communities often struggle to sieve through the large volumes of data that are typically shared on such platforms during crises to determine which posts are truly relevant to the crisis, and which are not. Previous work on automatically classifying crisis information was mostly focused on using statistical features. However, such approaches tend to be inappropriate when processing data on a type of crisis that the model was not trained on, such as processing information about a train crash, whereas the classifier was trained on floods, earthquakes, and typhoons. In such cases, the model will need to be retrained, which is costly and time-consuming. In this paper, we explore the impact of semantics in classifying Twitter posts across same, and different, types of crises. We experiment with 26 crisis events, using a hybrid system that combines statistical features with various semantic features extracted from external knowledge bases. We show that adding semantic features has no noticeable benefit over statistical features when classifying same-type crises, whereas it enhances the classifier performance by up to 7.2% when classifying information about a new type of crisis

    IMEXT: a method and system to extract geolocated images from Tweets - Analysis of a case study

    Get PDF
    open5noopenFrancalanci, Chiara; Guglielmino, Paolo; Montalcini, Matteo; Scalia, Gabriele; Pernici, BarbaraFrancalanci, Chiara; Guglielmino, Paolo; Montalcini, Matteo; Scalia, Gabriele; Pernici, Barbar

    TriggerCit: Early Flood Alerting using Twitter and Geolocation - A Comparison with Alternative Sources

    Get PDF
    Rapid impact assessment in the immediate aftermath of a natural disaster is essential to provide adequate information to international organisations, local authorities, and first responders. Social media can support emergency response with evidence-based content posted by citizens and organisations during ongoing events. In the paper, we propose TriggerCit: an early flood alerting tool with a multilanguage approach focused on timeliness and geolocation. The paper focuses on assessing the reliability of the approach as a triggering system, comparing it with alternative sources for alerts, and evaluating the quality and amount of complementary information gathered. Geolocated visual evidence extracted from Twitter by TriggerCit was analysed in two case studies on floods in Thailand and Nepal in 2021.Comment: 12 pages Keywords Social Media, Disaster management, Early Alertin

    Terminology Extraction for and from Communications in Multi-disciplinary Domains

    Get PDF
    Terminology extraction generally refers to methods and systems for identifying term candidates in a uni-disciplinary and uni-lingual environment such as engineering, medical, physical and geological sciences, or administration, business and leisure. However, as human enterprises get more and more complex, it has become increasingly important for teams in one discipline to collaborate with others from not only a non-cognate discipline but also speaking a different language. Disaster mitigation and recovery, and conflict resolution are amongst the areas where there is a requirement to use standardised multilingual terminology for communication. This paper presents a feasibility study conducted to build terminology (and ontology) in the domain of disaster management and is part of the broader work conducted for the EU project Sland \ub4 ail (FP7 607691). We have evaluated CiCui (for Chinese name \ub4 \u8bcd\u8403, which translates to words gathered), a corpus-based text analytic system that combine frequency, collocation and linguistic analyses to extract candidates terminologies from corpora comprised of domain texts from diverse sources. CiCui was assessed against four terminology extraction systems and the initial results show that it has an above average precision in extracting terms
    • …
    corecore