109 research outputs found
Incorporating Emotions into Health Mention Classification Task on Social Media
The health mention classification (HMC) task is the process of identifying
and classifying mentions of health-related concepts in text. This can be useful
for identifying and tracking the spread of diseases through social media posts.
However, this is a non-trivial task. Here we build on recent studies suggesting
that using emotional information may improve upon this task. Our study results
in a framework for health mention classification that incorporates affective
features. We present two methods, an intermediate task fine-tuning approach
(implicit) and a multi-feature fusion approach (explicit) to incorporate
emotions into our target task of HMC. We evaluated our approach on 5
HMC-related datasets from different social media platforms including three from
Twitter, one from Reddit and another from a combination of social media
sources. Extensive experiments demonstrate that our approach results in
statistically significant performance gains on HMC tasks. By using the
multi-feature fusion approach, we achieve at least a 3% improvement in F1 score
over BERT baselines across all datasets. We also show that considering only
negative emotions does not significantly affect performance on the HMC task.
Additionally, our results indicate that HMC models infused with emotional
knowledge are an effective alternative, especially when other HMC datasets are
unavailable for domain-specific fine-tuning. The source code for our models is
freely available at https://github.com/tahirlanre/Emotion_PHM
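The explicit multi-feature fusion idea mentioned in the abstract above can be illustrated with a minimal sketch. It assumes fusion means concatenating a text representation with an emotion feature vector before a linear scoring layer; the abstract does not specify the fusion operator, and the function names, vectors and weights below are hypothetical, not taken from the authors' repository.

```python
from typing import List


def fuse_features(text_vec: List[float], emotion_vec: List[float]) -> List[float]:
    """Concatenate a text representation with an emotion feature vector.

    Concatenation is an assumed fusion operator; the paper may use another.
    """
    return list(text_vec) + list(emotion_vec)


def linear_score(features: List[float], weights: List[float], bias: float = 0.0) -> float:
    """Score the fused vector with a single linear layer (dot product plus bias)."""
    if len(features) != len(weights):
        raise ValueError("features and weights must have the same length")
    return sum(f * w for f, w in zip(features, weights)) + bias


# Toy, hand-written values standing in for encoder outputs (hypothetical).
text_vec = [0.2, -0.1, 0.4]      # e.g. a pooled sentence embedding
emotion_vec = [0.9, 0.0]         # e.g. scores for two emotion classes
fused = fuse_features(text_vec, emotion_vec)
weights = [1.0, 1.0, 1.0, 0.5, 0.5]
score = linear_score(fused, weights)
```

In a real system the two vectors would come from trained encoders and the weights would be learned; the sketch only shows where the emotion features enter the classifier input.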
Social Media Analysis for Social Good
Data on social media is abundant and offers valuable information that can be utilised for a range of purposes. Users share their experiences and opinions on various topics, ranging from their personal life to the community and the world, in real-time. In comparison to conventional data sources, social media data is cost-effective to obtain, is up to date and reaches a larger audience. Analysing this rich data source can contribute to solving societal issues and promote social impact in an equitable manner. In this thesis, I present my research in exploring innovative applications using natural language processing (NLP) and machine learning to identify patterns and extract actionable insights from social media data to ultimately make a positive impact on society.
First, I evaluate the impact of an intervention program aimed at promoting inclusive and equitable learning opportunities for underrepresented communities using social media data. Second, I develop EmoBERT, an emotion-based variant of the BERT model, for detecting fine-grained emotions to gauge the well-being of a population during significant disease outbreaks. Third, to improve public health surveillance on social media, I demonstrate how emotions expressed in social media posts can be incorporated into health mention classification using intermediate task fine-tuning and multi-feature fusion approaches. I also propose a multi-task learning framework to model the literal meanings of disease and symptom words to enhance the classification of health mentions. Fourth, I create a new health mention dataset to address the imbalance in health data availability between developing and developed countries, providing a benchmark alternative to the traditional standards used in digital health research. Finally, I leverage the power of pretrained language models to analyse religious activities, recognised as social determinants of health, during disease outbreaks.
Improving Health Mention Classification Through Emphasising Literal Meanings: A Study Towards Diversity and Generalisation for Public Health Surveillance
People often use disease or symptom terms on social media and online forums in ways other than to describe their health. Thus, the NLP health mention classification (HMC) task aims to identify posts where users are discussing health conditions literally, not figuratively. Existing computational research typically only studies health mentions within well-represented groups in developed nations. Developing countries with limited health surveillance abilities fail to benefit from such data to manage public health crises. To advance HMC research and benefit more diverse populations, we present the Nairaland health mention dataset (NHMD), a new dataset collected from a dedicated web forum for Nigerians. NHMD consists of 7,763 manually labelled posts extracted based on four prevalent diseases (HIV/AIDS, Malaria, Stroke and Tuberculosis) in Nigeria. With NHMD, we conduct extensive experiments using current state-of-the-art models for HMC and identify that, compared to existing public datasets, NHMD contains out-of-distribution examples. Hence, it is well suited for domain adaptation studies. The introduction of NHMD provides better diversity coverage of vulnerable populations and generalisation for HMC tasks in a global public health surveillance setting. Additionally, we present a novel multi-task learning approach for HMC tasks that adds literal word meaning prediction as an auxiliary task. Experimental results demonstrate that the proposed approach outperforms state-of-the-art methods in terms of F1 score, with statistical significance (p < 0.01, Wilcoxon test), and show that our new dataset poses a strong challenge to existing HMC methods.
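As a rough illustration of training with an auxiliary task, the sketch below combines a main HMC loss with a literal-meaning prediction loss. It assumes both tasks are binary and that the losses are combined as a weighted sum, which is a common multi-task formulation; the abstract does not state the actual objective, and `aux_weight` and the function names are hypothetical.

```python
import math


def binary_cross_entropy(prob: float, label: int, eps: float = 1e-12) -> float:
    """Binary cross-entropy for one example; prob is the predicted P(label = 1)."""
    prob = min(max(prob, eps), 1.0 - eps)  # clamp to avoid log(0)
    return -(label * math.log(prob) + (1 - label) * math.log(1.0 - prob))


def multitask_loss(
    hmc_prob: float,
    hmc_label: int,
    literal_prob: float,
    literal_label: int,
    aux_weight: float = 0.5,  # assumed weighting of the auxiliary task
) -> float:
    """Main HMC loss plus a weighted auxiliary literal-meaning loss."""
    main = binary_cross_entropy(hmc_prob, hmc_label)
    aux = binary_cross_entropy(literal_prob, literal_label)
    return main + aux_weight * aux


# Example: both heads are maximally uncertain about a positive example.
loss = multitask_loss(0.5, 1, 0.5, 1, aux_weight=0.5)
```

Setting `aux_weight` to zero recovers plain single-task training, which makes the contribution of the auxiliary signal easy to ablate.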
Curriculum CycleGAN for Textual Sentiment Domain Adaptation with Multiple Sources
Sentiment analysis of user-generated reviews or comments on products and
services in social networks can help enterprises to analyze the feedback from
customers and take corresponding actions for improvement. To mitigate
large-scale annotations on the target domain, domain adaptation (DA) provides
an alternate solution by learning a transferable model from other labeled
source domains. Existing multi-source domain adaptation (MDA) methods either
fail to extract some discriminative features in the target domain that are
related to sentiment, neglect the correlations of different sources and the
distribution difference among different sub-domains even in the same source, or
cannot reflect the varying optimal weighting during different training stages.
In this paper, we propose a novel instance-level MDA framework, named
curriculum cycle-consistent generative adversarial network (C-CycleGAN), to
address the above issues. Specifically, C-CycleGAN consists of three
components: (1) pre-trained text encoder which encodes textual input from
different domains into a continuous representation space, (2) intermediate
domain generator with curriculum instance-level adaptation which bridges the
gap across source and target domains, and (3) task classifier trained on the
intermediate domain for final sentiment classification. C-CycleGAN transfers
source samples at instance-level to an intermediate domain that is closer to
the target domain with sentiment semantics preserved and without losing
discriminative features. Further, our dynamic instance-level weighting
mechanisms can assign the optimal weights to different source samples in each
training stage. We conduct extensive experiments on three benchmark datasets
and achieve substantial gains over state-of-the-art DA approaches. Our source
code is released at: https://github.com/WArushrush/Curriculum-CycleGAN.
Comment: Accepted by WWW 202
Where are you talking about? Advances and Challenges of Geographic Analysis of Text with Application to Disease Monitoring
The Natural Language Processing task we focus on in this thesis is Geoparsing. Geoparsing is the process of extraction and grounding of toponyms (place names). Consider this sentence: "The victims of the Spanish earthquake off the coast of Malaga were of American and Mexican origin." Four toponyms will be extracted (called Geotagging) and grounded to their geographic coordinates (called Toponym Resolution). However, our research goes further than any previous work by showing how to distinguish the literal place(s) of the event (Spain, Malaga) from other linguistic types/uses such as nationalities (Mexican, American), improving downstream task accuracy. We consolidate and extend the Standard Evaluation Framework, discuss key research problems, then present concrete solutions in order to advance each stage of geoparsing. For geotagging, as well as training a SOTA neural Location-NER tagger, we simplify Metonymy Resolution with a novel minimalist feature extraction combined with an LSTM-based classifier, matching SOTA results. For toponym resolution, we deploy the latest deep learning methods to achieve SOTA performance by augmenting neural models with hitherto unused geographic features called Map Vectors. With each research project, we provide high-quality datasets and system prototypes, further building resources in this field. We then show how these geoparsing advances coupled with our proposed Intra-Document Analysis can be used to associate news articles with locations in order to monitor the spread of public health threats. To this end, we evaluate our research contributions with production data from a real-time downstream application to improve geolocation of news events for disease monitoring. 
The data was made available to us by the Joint Research Centre (JRC), which operates one such system called MediSys that processes incoming news articles in order to monitor threats to public health and make these available to a variety of governmental, business and non-profit organisations. We also discuss steps towards an end-to-end, automated news monitoring system and make actionable recommendations for future work. In summary, the thesis aims are twofold: (1) Generate original geoparsing research aimed at advancing each stage of the pipeline by addressing pertinent challenges with concrete solutions and actionable proposals. (2) Demonstrate how this research can be applied to news event monitoring to increase the efficacy of existing biosurveillance systems, e.g. European Commission’s MediSys. I was generously funded by DREAM CDT, which was funded by NERC of UKRI.
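The two geoparsing stages named in the abstract above, geotagging and toponym resolution, can be sketched on its own example sentence. This is a toy lookup against a hand-written gazetteer, not the neural taggers or Map Vectors the thesis describes; the gazetteer entries and their coordinates are approximate, illustrative assumptions.

```python
import re
from typing import Dict, List, Tuple

# Hypothetical mini-gazetteer: surface form -> (place name, approx. lat, approx. lon).
GAZETTEER: Dict[str, Tuple[str, float, float]] = {
    "spanish": ("Spain", 40.0, -4.0),
    "malaga": ("Malaga", 36.7, -4.4),
    "american": ("United States", 39.8, -98.6),
    "mexican": ("Mexico", 23.6, -102.5),
}


def geotag(text: str) -> List[str]:
    """Geotagging: extract tokens that match a known toponym surface form."""
    tokens = re.findall(r"[A-Za-z]+", text)
    return [tok for tok in tokens if tok.lower() in GAZETTEER]


def resolve(toponyms: List[str]) -> List[Tuple[str, float, float]]:
    """Toponym resolution: ground each extracted toponym to coordinates."""
    return [GAZETTEER[t.lower()] for t in toponyms]


sentence = (
    "The victims of the Spanish earthquake off the coast of Malaga "
    "were of American and Mexican origin."
)
toponyms = geotag(sentence)
places = resolve(toponyms)
```

A dictionary lookup like this cannot tell the literal event location from a nationality, which is exactly the distinction the thesis sets out to model.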
Unlocking the Pragmatics of Emoji: Evaluation of the Integration of Pragmatic Markers for Sarcasm Detection
Emojis have become an integral element of online communications, serving as a powerful, under-utilised resource for enhancing pragmatic understanding in NLP. Previous works have highlighted their potential to improve more complex tasks, such as the identification of figurative literary devices including sarcasm, due to their role in conveying tone within text. However, the present state of the art does not consider emoji or adequately address sarcastic markers such as sentiment incongruence. This work aims to integrate these concepts to generate more robust solutions for sarcasm detection, leveraging enhanced pragmatic features from both emoji and text tokens. This was achieved by establishing methodologies for sentiment feature extraction from emojis and an in-depth statistical evaluation of the features which characterise sarcastic text on Twitter. The current convention for generating training data, which implements weak labelling using hashtags or keywords, was evaluated against a human-annotated baseline; postulated validity concerns were verified, as statistical evaluation found that the content features deviated significantly from the baseline, highlighting potential validity concerns for many prominent works on the topic to date. Organically labelled sarcastic tweets containing emojis were crowdsourced by means of a survey to ensure valid outcomes for the sarcasm detection model. Given the established importance of both semantic and sentiment information, a novel sentiment-aware attention mechanism was constructed to enhance pattern recognition, balancing core features of sarcastic text: sentiment incongruence and context. This work establishes a framework for emoji feature extraction, a key roadblock cited in the literature for their use in NLP tasks.
The proposed sarcasm detection pipeline successfully facilitates the task using a GRU neural network with sentiment-aware attention, achieving an accuracy of 73% and showing promising indications of model robustness as part of a framework which is easily scalable to include any emojis released in future. Both enhanced sentiment information to supplement context and consideration of the emoji were found to improve outcomes for the task.
Enriching Affect Analysis Through Emotion and Sarcasm Detection
Affect detection from text is the task of detecting affective states such as sentiment, mood and emotions from natural language text including news comments, product reviews, discussion posts, tweets and so on. Broadly speaking, affect detection includes the related tasks of sentiment analysis, emotion detection and sarcasm detection, amongst others.
In this dissertation, we seek to enrich textual affect analysis from two perspectives: emotion and sarcasm. Emotion detection entails classifying the text into fine-grained categories of emotions such as happiness, sadness, surprise, and so on, whereas sarcasm detection seeks to identify the presence or absence of sarcasm in text. The task of emotion detection is particularly challenging due to the limited number of resources and because it involves a greater number of emotion categories for classification, with no fixed number or types of emotions. Similarly, the recently proposed task of sarcasm detection is complicated by the inherently sophisticated nature of sarcasm, where one typically says or writes the opposite of what they mean.
This dissertation consists of five contributions. First, we address word-emotion association, a fundamental building block of most, if not all, emotion detection systems. Current approaches to emotion detection rely on a handful of manually annotated resources such as lexicons and datasets for deriving word-emotion association. Instead, we propose novel models for augmenting word-emotion association to support unsupervised learning which does not require labeled training data and can be extended to flexible taxonomies of emotions.
Second, we study the problem of affective word representations, where affectively similar words are projected into neighboring regions of an n-dimensional embedding space. While existing techniques usually consider the lexical semantics and syntax of co-occurring words, thus rating emotionally dissimilar words occurring in similar contexts as highly similar, we integrate a rich spectrum of emotions into representation learning in order to cluster emotionally similar words closer, and emotionally dissimilar words farther from each other. The generated emotion-enriched word representations are found to be better at capturing relevant features useful for sentence-level emotion classification and emotion similarity tasks.
Third, we investigate the problem of computational sarcasm detection. Generally, sarcasm detection is treated as a linguistic and lexical phenomenon with limited emphasis on the emotional aspects of sarcasm. In order to address this gap, we propose novel models for enriching sarcasm detection by incorporating affective knowledge. In particular, document-level features obtained from affective word representations are utilized in designing classification systems. Through extensive evaluation on six datasets from three diverse domains of text, we demonstrate the potential of exploiting automatically induced features without the need for considerable manual feature engineering.
Motivated by the importance of affective knowledge in detecting sarcasm, the fourth contribution of this thesis seeks to dig deeper and study the role of transitions and relationships between different emotions in order to discover which emotions serve as more informative and discriminative features for distinguishing sarcastic utterances in text.
Lastly, we show the usefulness of our proposed affective models by applying them in a non-affective framework of predicting the helpfulness of online reviews.
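The goal of the affective word representations described in the second contribution above, placing emotionally similar words closer together than emotionally dissimilar ones, can be checked with cosine similarity. The sketch below uses hand-written two-dimensional toy vectors as stand-ins for learned embeddings; the values are hypothetical and do not come from the dissertation.

```python
import math
from typing import Dict, List


def cosine_similarity(a: List[float], b: List[float]) -> float:
    """Cosine of the angle between two vectors of equal length."""
    if len(a) != len(b):
        raise ValueError("vectors must have the same length")
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        raise ValueError("cosine similarity is undefined for zero vectors")
    return dot / (norm_a * norm_b)


# Toy emotion-enriched embeddings (hypothetical values for illustration only).
EMBEDDINGS: Dict[str, List[float]] = {
    "happy": [0.9, 0.1],
    "joyful": [0.8, 0.2],
    "sad": [-0.8, 0.1],
}

sim_same_emotion = cosine_similarity(EMBEDDINGS["happy"], EMBEDDINGS["joyful"])
sim_opposite_emotion = cosine_similarity(EMBEDDINGS["happy"], EMBEDDINGS["sad"])
```

With embeddings trained only on co-occurrence, "happy" and "sad" often end up close because they appear in similar contexts; an emotion-enriched space should instead satisfy `sim_same_emotion > sim_opposite_emotion`, as the toy vectors do.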
EVALITA Evaluation of NLP and Speech Tools for Italian Proceedings of the Final Workshop
Editor of the proceedings of EVALITA 2016
Survey of the State of the Art in Natural Language Generation: Core tasks, applications and evaluation
This paper surveys the current state of the art in Natural Language
Generation (NLG), defined as the task of generating text or speech from
non-linguistic input. A survey of NLG is timely in view of the changes that the
field has undergone over the past decade or so, especially in relation to new
(usually data-driven) methods, as well as new applications of NLG technology.
This survey therefore aims to (a) give an up-to-date synthesis of research on
the core tasks in NLG and the architectures adopted in which such tasks are
organised; (b) highlight a number of relatively recent research topics that
have arisen partly as a result of growing synergies between NLG and other areas
of artificial intelligence; (c) draw attention to the challenges in NLG
evaluation, relating them to similar challenges faced in other areas of Natural
Language Processing, with an emphasis on different evaluation methods and the
relationships between them.
Comment: Published in Journal of AI Research (JAIR), volume 61, pp 75-170. 118
pages, 8 figures, 1 table