
366 research outputs found

Large-scale, Language-agnostic Discourse Classification of Tweets During COVID-19

Author: Gencoglu Oguzhan
Publication venue: 'MDPI AG'
Publication date: 01/01/2020

Quantifying the characteristics of public attention is an essential prerequisite for appropriate crisis management during severe events such as pandemics. For this purpose, we propose language-agnostic tweet representations to perform large-scale Twitter discourse classification with machine learning. Our analysis on more than 26 million COVID-19 tweets shows that large-scale surveillance of public discourse is feasible with computationally lightweight classifiers by out-of-the-box utilization of these representations. Comment: 14 pages, 4 figures

arXiv.org e-Print Archive

Multidisciplinary Digital Publishing Institute

Trepo - Institutional Repository of Tampere University

Impact Estimation of Emergency Events Using Social Media Streams

Author: Arnaudo Edoardo
Blanco Giacomo
Rossi Claudio
Salza Dario
Publication venue: 'Institute of Electrical and Electronics Engineers (IEEE)'
Publication date: 25/08/2022

In recent years, Social Media platforms have attracted millions of users, becoming a primary communication channel. They offer the possibility to massively ingest and instantly share big volumes of user-generated content before, during, and after emergency events. Being able to accurately quantify the impact of such hazardous events could greatly help all organizations involved in the emergency management cycle to adequately plan the required recovery operations. In this work, we propose a novel Natural Language Processing approach built on rule-based algorithms able to estimate, from tweets posted during natural hazards, the impact of emergency events in terms of affected population and infrastructures. We implement our approach in an operational environment and present its validation on a publicly released dataset of more than 1.4K manually annotated tweets, showing an overall weighted F1 score of 0.77.
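
For reference, a weighted F1 score such as the one reported in this abstract averages per-class F1 values, weighting each class by its share of the true labels. The following is a minimal sketch of that computation, not the authors' evaluation code; the function name `weighted_f1` and the toy labels are hypothetical.

```python
from collections import Counter


def weighted_f1(y_true, y_pred):
    """Support-weighted average of per-class F1 scores."""
    support = Counter(y_true)  # number of true instances per class
    total = len(y_true)
    score = 0.0
    for label, n in support.items():
        pairs = list(zip(y_true, y_pred))
        tp = sum(1 for t, p in pairs if t == label and p == label)
        fp = sum(1 for t, p in pairs if t != label and p == label)
        fn = n - tp
        precision = tp / (tp + fp) if tp + fp else 0.0
        recall = tp / (tp + fn) if tp + fn else 0.0
        denom = precision + recall
        f1 = 2 * precision * recall / denom if denom else 0.0
        # Each class contributes in proportion to its support.
        score += (n / total) * f1
    return score


# Hypothetical toy labels: three tweets of class "a", one of class "b".
y_true = ["a", "a", "a", "b"]
y_pred = ["a", "a", "b", "b"]
print(round(weighted_f1(y_true, y_pred), 4))  # 0.7667
```

Here class "a" has F1 = 0.8 with weight 3/4 and class "b" has F1 of about 0.667 with weight 1/4, which gives roughly 0.767. Weighting by support means frequent classes dominate the overall figure.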

PORTO@iris (Publications Open Repository TOrino - Politecnico di Torino)

NEUROSURGERY ENTHUSIASTIC WOMEN SOCIETY

Geo-Information Harvesting from Social Media Data

Author: Abdulahhad Karam
Hoffmann Eike Jens
Häberle Matthias
Jacobs Nathan
Kochupillai Mrinalini
Kruspe Anna
Levering Alex
Taubenböck Hannes
Tuia Devis
Wang Yuanyuan
Werner Martin
Zhu Xiao Xiang
Publication date: 01/01/2022

As unconventional sources of geo-information, massive imagery and text messages from open platforms and social media form a temporally quasi-seamless, spatially multi-perspective stream, but with unknown and diverse quality. Due to its complementarity to remote sensing data, geo-information from these sources offers promising perspectives, but harvesting is not trivial due to its data characteristics. In this article, we address key aspects in the field, including data availability, analysis-ready data preparation and data management, geo-information extraction from social media text messages and images, and the fusion of social media and remote sensing data. We then showcase some exemplary geographic applications. In addition, we present the first extensive discussion of ethical considerations of social media data in the context of geo-information harvesting and geographic applications. With this effort, we wish to stimulate curiosity and lay the groundwork for researchers who intend to explore social media data for geo-applications. We encourage the community to join forces by sharing their code and data. Comment: Accepted for publication IEEE Geoscience and Remote Sensing Magazine

arXiv.org e-Print Archive

Institute of Transport Research:Publications

Proceedings of the EACL Hackashop on News Media Content Analysis and Automated Report Generation

Publication venue: The Association for Computational Linguistics
Publication date: 19/04/2021

Peer reviewed

Helsingin yliopiston digitaalinen arkisto

Geo-Information Harvesting from Social Media Data

Author: Abdulahhad Karam
Hoffmann Eike Jens
Häberle Matthias
Jacobs Nathan
Kochupillai Mrinalini
Kruspe Anna
Levering Alex
Taubenböck Hannes
Tuia Devis
Wang Yuanyuan
Werner Martin
Zhu Xiao Xiang
Publication venue: IEEE - Institute of Electrical and Electronics Engineers
Publication date: 01/01/2023

As unconventional sources of geo-information, massive imagery and text messages from open platforms and social media form a temporally quasi-seamless, spatially multi-perspective stream, but with unknown and diverse quality. Due to its complementarity to remote sensing data, geo-information from these sources offers promising perspectives, but harvesting is not trivial due to its data characteristics. In this article, we address key aspects in the field, including data availability, analysis-ready data preparation and data management, geo-information extraction from social media text messages and images, and the fusion of social media and remote sensing data. We then showcase some exemplary geographic applications. In addition, we present the first extensive discussion of ethical considerations of social media data in the context of geo-information harvesting and geographic applications. With this effort, we wish to stimulate curiosity and lay the groundwork for researchers who intend to explore social media data for geo-applications. We encourage the community to join forces by sharing their code and data.

Institute of Transport Research:Publications

Unifying context with labeled property graph: A pipeline-based system for comprehensive text representation in NLP

Author: Ahmed Mohiuddin
Hur Ali
Janjua Naeem
Publication venue: Edith Cowan University, Research Online, Perth, Western Australia
Publication date: 01/01/2024

Extracting valuable insights from vast amounts of unstructured digital text presents significant challenges across diverse domains. This research addresses this challenge by proposing a novel pipeline-based system that generates domain-agnostic and task-agnostic text representations. The proposed approach leverages labeled property graphs (LPG) to encode contextual information, facilitating the integration of diverse linguistic elements into a unified representation. The proposed system enables efficient graph-based querying and manipulation by addressing the crucial aspect of comprehensive context modeling and fine-grained semantics. The effectiveness of the proposed system is demonstrated through the implementation of NLP components that operate on LPG-based representations. Additionally, the proposed approach introduces specialized patterns and algorithms to enhance specific NLP tasks, including nominal mention detection, named entity disambiguation, event enrichments, event participant detection, and temporal link detection. The evaluation of the proposed approach, using the MEANTIME corpus comprising manually annotated documents, provides encouraging results and valuable insights into the system's strengths. The proposed pipeline-based framework serves as a solid foundation for future research, aiming to refine and optimize LPG-based graph structures to generate comprehensive and semantically rich text representations, addressing the challenges associated with efficient information extraction and analysis in NLP.

Research Online @ ECU

Exploring Sentiment Analysis on Twitter: Investigating Public Opinion on Migration in Brazil from 2015 to 2020

Author: BASELICE PELICIONI ANA BEATRIZ
Publication date: 17/10/2023

Technology has reshaped societal interaction and the expression of opinions. Migration is a prominent trend, and analysing social media discussions provides insights into societal perspectives. This thesis explores how events between 2015 and 2020 impacted Brazilian sentiment on Twitter about migrants and refugees. Its aim was to uncover the influence of key sociopolitical events on public sentiment, clarifying how these echoed in the digital realm. Four key objectives guided this research: (a) understanding public opinions on migrants and refugees, (b) investigating how events influenced Twitter sentiment, (c) identifying terms used in migration-related tweets, and (d) tracking sentiment shifts, especially concerning changes in government. Sentiment analysis using VADER (Valence Aware Dictionary and sEntiment Reasoner) was employed to analyse tweet data. The use of computational methods in social sciences is gaining traction, yet no analysis has been conducted before to understand the sentiments of the Brazilian population regarding migration. The analysis underscored Twitter's role in reflecting and shaping public discourse, offering insights into how major events influenced discussions on migration. In conclusion, this study illuminated the landscape of Brazilian sentiment on migration, emphasizing the significance of innovative social media analysis methodologies for policymaking and societal inclusivity in the digital age.

Padua Thesis and Dissertation Archive

Knowledge extraction from unstructured data

Author: Sakor Ahmad
Publication venue: Hannover : Institutionelles Repositorium der Leibniz Universität Hannover
Publication date: 01/01/2023

Data availability is becoming more essential, considering the current growth of web-based data. The data available on the web are represented as unstructured, semi-structured, or structured data. In order to make the web-based data available for several Natural Language Processing or Data Mining tasks, the data needs to be presented as machine-readable data in a structured format. Thus, techniques for addressing the problem of capturing knowledge from unstructured data sources are needed. Knowledge extraction methods are used by the research communities to address this problem; methods that are able to capture knowledge in a natural language text and map the extracted knowledge to existing knowledge presented in knowledge graphs (KGs). These knowledge extraction methods include Named-entity recognition, Named-entity Disambiguation, Relation Recognition, and Relation Linking. This thesis addresses the problem of extracting knowledge over unstructured data and discovering patterns in the extracted knowledge. We devise a rule-based approach for entity and relation recognition and linking. The defined approach effectively maps entities and relations within a text to their resources in a target KG. Additionally, it overcomes the challenges of recognizing and linking entities and relations to a specific KG by employing devised catalogs of linguistic and domain-specific rules that state the criteria to recognize entities in a sentence of a particular language, and a deductive database that encodes knowledge in community-maintained KGs. Moreover, we define a Neuro-symbolic approach for the tasks of knowledge extraction in encyclopedic and domain-specific domains; it combines symbolic and sub-symbolic components to overcome the challenges of entity recognition and linking and the limitation of the availability of training data while maintaining the accuracy of recognizing and linking entities. 
Additionally, we present a context-aware framework for unveiling semantically related posts in a corpus; it is a knowledge-driven framework that retrieves associated posts effectively. We cast the problem of unveiling semantically related posts in a corpus into the Vertex Coloring Problem. We evaluate the performance of our techniques on several benchmarks related to various domains for knowledge extraction tasks. Furthermore, we apply these methods in real-world scenarios from national and international projects. The outcomes show that our techniques are able to effectively extract knowledge encoded in unstructured data and discover patterns over the extracted knowledge presented as machine-readable data. More importantly, the evaluation results provide evidence of the effectiveness of combining the reasoning capacity of the symbolic frameworks with the power of pattern recognition and classification of sub-symbolic models.
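
The abstract does not say how the graph over posts is built, so the sketch below only illustrates the Vertex Coloring Problem itself using a standard greedy heuristic; it is not the thesis's method. The post identifiers, the function name `greedy_coloring`, and the meaning of an edge (assumed here to link two posts that must not end up in the same group) are hypothetical.

```python
def greedy_coloring(adjacency):
    """Assign each vertex the smallest color not used by its colored neighbors.

    `adjacency` maps a vertex to the set of its neighbors (undirected graph).
    Vertices are visited in order of decreasing degree, a common heuristic;
    the result is a valid coloring but not necessarily a minimal one.
    """
    colors = {}
    order = sorted(adjacency, key=lambda v: len(adjacency[v]), reverse=True)
    for vertex in order:
        used = {colors[nb] for nb in adjacency[vertex] if nb in colors}
        color = 0
        while color in used:
            color += 1
        colors[vertex] = color
    return colors


# Hypothetical graph: an edge means "these two posts must be kept apart".
posts = {
    "p1": {"p2", "p3"},
    "p2": {"p1", "p3"},
    "p3": {"p1", "p2", "p4"},
    "p4": {"p3"},
}
groups = greedy_coloring(posts)
print(groups)
```

Under this assumed encoding, posts that receive the same color form one candidate group, since no edge separates them.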

Institutionelles Repositorium der Leibniz Universität Hannover