2,597 research outputs found

Multilingual Text Classification from Twitter during Emergencies

Author: Arnaudo Edoardo
Piscitelli Sara
Rossi Claudio
Publication venue: 'Institute of Electrical and Electronics Engineers (IEEE)'
Publication date: 01/01/2021

Social media such as Twitter are a valuable source of information due to their diffusion among citizens and to their speed in sharing data worldwide. However, it is challenging to automatically extract information from such data, given the huge amount of useless content. We propose a multilingual tool that automatically categorizes tweets according to their information content. To achieve real-time classification while supporting any language, we apply a deep learning classifier using multilingual word embeddings. This allows our solution to be trained on one language and applied to any other language via zero-shot inference, with an acceptable performance loss.

PORTO@iris (Publications Open Repository TOrino - Politecnico di Torino)

Community Segmentation and Inclusive Social Media Listening

Author: Gjøsæter Terje
Herrera Lucia Castro
Publication venue: International Association for Information Systems for Crisis Response and Management
Publication date: 01/01/2022

Social media analytics provide a generalized picture of situational awareness from the conversations happening among communities present in social media channels that are, or risk being, affected by crises. The generalized nature of results from these analytics leaves underrepresented communities in the background. When considering social media analytics, concerns, sentiment, and needs are perceived as homogeneous. However, offline, the community is diverse, often segmented by age group, occupation, or language, to name a few. Through our analysis of interviews with professionals using social media as a source of information in public service organizations, we argue that practitioners might not be perceiving this segmentation from the social media conversation. In addition, practitioners who are aware of this limitation agree that there is room for improvement and resort to alternative mechanisms to understand, reach, and provide services to these communities in need. Thus, we analyze current perceptions and activities around segmentation and provide suggestions that could inform the design of social media analytics tools that support inclusive public services for all, including persons with disabilities and from other disadvantaged groups.

Agder University Research Archive

Classifying Crises-Information Relevancy with Semantics

Author: A Tonon
F Abel
G Burel
H Gao
J Rogstadius
J Yin
N Cristianini
R Navigli
R Power
Publication venue: 'Springer Science and Business Media LLC'
Publication date: 01/01/2018

Social media platforms have become key portals for sharing and consuming information during crisis situations. However, humanitarian organisations and affected communities often struggle to sieve through the large volumes of data that are typically shared on such platforms during crises to determine which posts are truly relevant to the crisis, and which are not. Previous work on automatically classifying crisis information was mostly focused on using statistical features. However, such approaches tend to be inappropriate when processing data on a type of crisis that the model was not trained on, such as processing information about a train crash when the classifier was trained on floods, earthquakes, and typhoons. In such cases, the model will need to be retrained, which is costly and time-consuming. In this paper, we explore the impact of semantics in classifying Twitter posts across the same, and different, types of crises. We experiment with 26 crisis events, using a hybrid system that combines statistical features with various semantic features extracted from external knowledge bases. We show that adding semantic features has no noticeable benefit over statistical features when classifying same-type crises, whereas it enhances the classifier performance by up to 7.2% when classifying information about a new type of crisis.

Crossref

Open Research Online (The Open University)

IMEXT: a method and system to extract geolocated images from Tweets - Analysis of a case study

Author: Francalanci Chiara
Guglielmino Paolo
Montalcini Matteo
Scalia Gabriele
Pernici Barbara
Publication venue: 'Institute of Electrical and Electronics Engineers (IEEE)'
Publication date: 13/05/2017

Archivio istituzionale della ricerca - Politecnico di Milano

Crossref

TriggerCit: Early Flood Alerting using Twitter and Geolocation - A Comparison with Alternative Sources

Author: Amudha Ravi Shankar
Barbara Pernici
Carlo Bono
Edoardo Nemni
Jose Luis Fernandez-Marquez
Mehmet Oguz Mülâyim
Publication venue: ISCRAM Digital Library
Publication date: 01/01/2022

Rapid impact assessment in the immediate aftermath of a natural disaster is essential to provide adequate information to international organisations, local authorities, and first responders. Social media can support emergency response with evidence-based content posted by citizens and organisations during ongoing events. In the paper, we propose TriggerCit: an early flood alerting tool with a multilanguage approach focused on timeliness and geolocation. The paper focuses on assessing the reliability of the approach as a triggering system, comparing it with alternative sources for alerts, and evaluating the quality and amount of complementary information gathered. Geolocated visual evidence extracted from Twitter by TriggerCit was analysed in two case studies on floods in Thailand and Nepal in 2021.

arXiv.org e-Print Archive

Archivio istituzionale della ricerca - Politecnico di Milano

Terminology Extraction for and from Communications in Multi-disciplinary Domains

Author: Ahmad Khurshid
Musacchio MARIA TERESA
Panizzon Raffaella
Zhang Xiubo
Publication venue
Publication date: 01/01/2016

Terminology extraction generally refers to methods and systems for identifying term candidates in a uni-disciplinary and uni-lingual environment such as engineering, medical, physical and geological sciences, or administration, business and leisure. However, as human enterprises get more and more complex, it has become increasingly important for teams in one discipline to collaborate with others who not only come from a non-cognate discipline but also speak a different language. Disaster mitigation and recovery, and conflict resolution, are amongst the areas where there is a requirement to use standardised multilingual terminology for communication. This paper presents a feasibility study conducted to build terminology (and ontology) in the domain of disaster management and is part of the broader work conducted for the EU project Slándáil (FP7 607691). We have evaluated CiCui (from the Chinese name 词萃, which translates to "words gathered"), a corpus-based text analytic system that combines frequency, collocation and linguistic analyses to extract candidate terminologies from corpora comprising domain texts from diverse sources. CiCui was assessed against four terminology extraction systems, and the initial results show that it has above-average precision in extracting terms.

Archivio istituzionale della ricerca - Università di Padova

Identifying and Processing Crisis Information from Social Media

Author: Khare Prashant
Publication venue
Publication date: 26/02/2020

Social media platforms play a crucial role in how people communicate, particularly during crisis situations such as natural disasters. People share and disseminate information on social media platforms that relates to updates, alerts, and rescue and relief requests, among other crisis-relevant information. Hurricane Harvey and Hurricane Sandy saw tens of millions of posts generated on Twitter in a short span of time. Such posts span a wide range, including personal and official communications and citizen sensing, to mention a few. This makes social media platforms a source of vital information to different stakeholders in crisis situations, such as impacted communities, relief agencies, and civic authorities. However, the overwhelming volume of data generated during such times makes it impossible to manually identify information relevant to a crisis. Additionally, a large portion of posts in voluminous streams is not relevant, or bears minimal relevance, to crisis situations. This has steered much research towards exploring methods that can automatically identify crisis-relevant information from voluminous streams of data during such scenarios. However, the problem of identifying crisis-relevant information from social media platforms, such as Twitter, is not trivial given the nature of unstructured text, such as short text length and syntactic variations, among other challenges. A key objective, while creating automatic crisis relevancy classification systems, is to make them adaptable to a wide range of crisis types and languages. Many related approaches rely on statistical features, which are quantifiable and linguistic properties of the text. A general approach is to train the classification model on labelled data acquired from crisis events and evaluate it on other crisis events.
A key aspect missing from the explored literature is the validity of crisis relevancy classification models when applied to data from unseen types of crisis events and languages. For instance, how would the accuracy of a crisis relevancy classification model trained on earthquake events change when applied to flood events? Or, how would a model perform when trained on crisis data in English but applied to data in Italian? This thesis investigates these problems from a semantics perspective, where the challenges posed by diverse types of crisis and language variations are seen as problems that can be tackled by enriching the data semantically. The use of knowledge bases such as DBpedia, BabelNet, and Wikipedia for semantic enrichment of data in text classification problems has often been studied. Semantic enrichment of data through entity linking and expansion of context via knowledge bases can take advantage of connections between different concepts and thus enhance contextual coherency across crisis types and languages. Several previous works have focused on similar problems and proposed approaches using statistical features and/or non-semantic features. The use of semantics extracted through knowledge graphs has remained unexplored in building crisis relevancy classifiers that are adaptive to varying crisis types and multilingual data. Experiments conducted in this thesis consider data from Twitter, a micro-blogging social media platform, and analyse multiple aspects of crisis data classification. The results obtained through various analyses in this thesis demonstrate the value of semantic enrichment of text through knowledge graphs in improving the adaptability of crisis relevancy classifiers across crisis types and languages, in comparison to statistical features as often used in much of the related work.

Open Research Online (The Open University)