5 research outputs found
Sentiment Analysis on Naija-Tweets
Examining sentiments in social media
poses a challenge to natural language
processing because of the intricacy and
variability in the dialect articulation, noisy
terms in form of slang, abbreviation,
acronym, emoticon, and spelling error
coupled with the availability of real-time
content. Moreover, most of the knowledgebased approaches for resolving slang,
abbreviation, and acronym do not consider
the issue of ambiguity that evolves in the
usage of these noisy terms. This research
work proposes an improved framework for
social media feed pre-processing that
leverages on the combination of integrated
local knowledge bases and adapted Lesk
algorithm to facilitate pre-processing of
social media feeds. The results from the
experimental evaluation revealed an
improvement over existing methods when
applied to supervised learning algorithms
in the task of extracting sentiments from
Nigeria-origin tweets with an accuracy of
99.17%
An enhanced binary bat and Markov clustering algorithms to improve event detection for heterogeneous news text documents
Event Detection (ED) works on identifying events from various types of data. Building an ED model for news text documents greatly helps decision-makers in various disciplines in improving their strategies. However, identifying and summarizing events from such data is a non-trivial task due to the large volume of published heterogeneous news text documents. Such documents create a high-dimensional feature space that influences the overall performance of the baseline methods in ED model. To address such a problem, this research presents an enhanced ED model that includes improved methods for the crucial phases of the ED model such as Feature Selection (FS), ED, and summarization. This work focuses on the FS problem by automatically detecting events through a novel wrapper FS method based on Adapted Binary Bat Algorithm (ABBA) and Adapted Markov Clustering Algorithm (AMCL), termed ABBA-AMCL. These adaptive techniques were developed to overcome the premature convergence in BBA and fast convergence rate in MCL. Furthermore, this study proposes four summarizing methods to generate informative summaries. The enhanced ED model was tested on 10 benchmark datasets and 2 Facebook news datasets. The effectiveness of ABBA-AMCL was compared to 8 FS methods based on meta-heuristic algorithms and 6 graph-based ED methods. The empirical and statistical results proved that ABBAAMCL surpassed other methods on most datasets. The key representative features demonstrated that ABBA-AMCL method successfully detects real-world events from Facebook news datasets with 0.96 Precision and 1 Recall for dataset 11, while for dataset 12, the Precision is 1 and Recall is 0.76. To conclude, the novel ABBA-AMCL presented in this research has successfully bridged the research gap and resolved the curse of high dimensionality feature space for heterogeneous news text documents. Hence, the enhanced ED model can organize news documents into distinct events and provide policymakers with valuable information for decision making
Understanding the Socio-infrastructure Systems During Disaster from Social Media Data
Our socio-infrastructure systems are becoming more and more vulnerable due to the increased severity and frequency of extreme events every year. Effective disaster management can minimize the damaging impacts of a disaster to a large extent. The ubiquitous use of social media platforms in GPS enabled smartphones offers a unique opportunity to observe, model, and predict human behavior during a disaster. This dissertation explores the opportunity of using social media data and different modeling techniques towards understanding and managing disaster more dynamically. In this dissertation, we focus on four objectives. First, we develop a method to infer individual evacuation behaviors (e.g., evacuation decision, timing, destination) from social media data. We develop an input output hidden Markov model to infer evacuation decisions from user tweets. Our findings show that using geo-tagged posts and text data, a hidden Markov model can be developed to capture the dynamics of hurricane evacuation decision. Second, we develop evacuation demand prediction model using social media and traffic data. We find that trained from social media and traffic data, a deep learning model can predict well evacuation traffic demand up to 24 hours ahead. Third, we present a multi-label classification approach to identify the co-occurrence of multiple types of infrastructure disruptions considering the sentiment towards a disruption—whether a post is reporting an actual disruption (negative), or a disruption in general (neutral), or not affected by a disruption (positive). We validate our approach for data collected during multiple hurricanes. Fourth, finally we develop an agent-based model to understand the influence of multiple information sources on risk perception dynamics and evacuation decisions. In this study, we explore the effects of socio-demographic factors and information sources such as social connectivity, neighborhood observation, and weather information and its credibility in forming risk perception dynamics and evacuation decisions