Latent-Optimized Adversarial Neural Transfer for Sarcasm Detection
The existence of multiple datasets for sarcasm detection prompts us to apply
transfer learning to exploit their commonality. The adversarial neural transfer
(ANT) framework utilizes multiple loss terms that encourage the source-domain
and the target-domain feature distributions to be similar while optimizing for
domain-specific performance. However, these objectives may be in conflict,
which can lead to optimization difficulties and sometimes diminished transfer.
We propose a generalized latent optimization strategy that allows different
losses to accommodate each other and improves training dynamics. The proposed
method outperforms transfer learning and meta-learning baselines. In
particular, we achieve 10.02% absolute performance gain over the previous state
of the art on the iSarcasm dataset.
Comment: 14 pages, 5 figures, published at NAACL-HLT 2021 conference, see
https://www.aclweb.org/anthology/2021.naacl-main.425
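The multi-loss setup this abstract describes, task losses on both domains plus a term pulling the two feature distributions together, can be illustrated with a toy objective. All names, the weighting `lam`, and the mean-distance divergence proxy below are illustrative assumptions, not the paper's actual formulation:

```python
import numpy as np

def feature_divergence(src_feats, tgt_feats):
    """Toy divergence proxy: squared distance between the mean feature
    vectors of the source and target batches (illustrative only)."""
    return float(np.sum((src_feats.mean(axis=0) - tgt_feats.mean(axis=0)) ** 2))

def ant_objective(task_loss_src, task_loss_tgt, domain_div, lam=0.1):
    """Toy ANT-style objective: per-domain task losses plus a weighted
    penalty for source/target feature mismatch. The conflict the abstract
    mentions arises because lowering domain_div can raise the task losses."""
    return task_loss_src + task_loss_tgt + lam * domain_div

src = np.array([[1.0, 0.0], [0.8, 0.2]])  # fabricated source features
tgt = np.array([[0.5, 0.5], [0.3, 0.7]])  # fabricated target features
total = ant_objective(0.4, 0.6, feature_divergence(src, tgt), lam=0.1)
```

In the real framework the similarity term is adversarial rather than a fixed distance, but the tension between objectives is the same: the gradients of the similarity and task terms can point in opposing directions, which is what the proposed latent optimization strategy mitigates.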
Deep Learning for User Comment Moderation
Experimenting with a new dataset of 1.6M user comments from a Greek news
portal and existing datasets of English Wikipedia comments, we show that an RNN
outperforms the previous state of the art in moderation. A deep,
classification-specific attention mechanism further improves the overall
performance of the RNN. We also compare against a CNN and a word-list baseline,
considering both fully automatic and semi-automatic moderation.
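The attention mechanism mentioned here pools per-token states into one comment representation before classification. The sketch below shows generic attention pooling; the array values and function names are made up, and the paper's actual layer is deeper and classification-specific:

```python
import numpy as np

def softmax(x):
    # Numerically stable softmax over a 1-D score vector.
    e = np.exp(x - np.max(x))
    return e / e.sum()

def attend(hidden_states, scores):
    """Toy attention pooling: convert per-token scores into weights that
    sum to 1, then build the comment representation as a weighted sum of
    the RNN's hidden states (illustrative, not the paper's exact layer)."""
    weights = softmax(scores)
    return weights @ hidden_states, weights

# Three fabricated 2-D hidden states; the third token gets the high score.
H = np.array([[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]])
rep, w = attend(H, np.array([0.1, 0.1, 2.0]))
```

Tokens with high scores (e.g. an insult) dominate the pooled representation, which is what lets attention both help accuracy and highlight the offending words for semi-automatic moderation.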
Extractive Adversarial Networks: High-Recall Explanations for Identifying Personal Attacks in Social Media Posts
We introduce an adversarial method for producing high-recall explanations of
neural text classifier decisions. Building on an existing architecture for
extractive explanations via hard attention, we add an adversarial layer which
scans the residual of the attention for remaining predictive signal. Motivated
by the important domain of detecting personal attacks in social media comments,
we additionally demonstrate the importance of manually setting a semantically
appropriate `default' behavior for the model by explicitly manipulating its
bias term. We develop a validation set of human-annotated personal attacks to
evaluate the impact of these changes.
Comment: Accepted to EMNLP 2018. Code and data available at
https://github.com/shcarton/rcn
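The core idea, an adversary that inspects whatever the hard-attention extractor left behind, can be sketched as a mask split. The token list, mask, and lexicon-based "adversary" below are fabricated stand-ins; the paper's adversary is a learned layer, not a word list:

```python
def split_by_mask(tokens, mask):
    """Hard-attention split: masked-in tokens form the explanation the
    classifier sees; the adversary only sees the residual (masked-out
    tokens). A toy sketch of the setup, not the trained model."""
    explanation = [t for t, m in zip(tokens, mask) if m]
    residual = [t for t, m in zip(tokens, mask) if not m]
    return explanation, residual

tokens = ["you", "are", "an", "idiot", "honestly"]
mask = [0, 0, 0, 1, 0]  # extractor selects only "idiot"
expl, resid = split_by_mask(tokens, mask)

# Stand-in adversary: if attack cues remain in the residual, the mask has
# missed predictive signal, and training would push the extractor to
# capture it too -- this is what drives high recall.
attack_lexicon = {"idiot", "moron"}
leaked = [t for t in resid if t in attack_lexicon]
```

Here the residual leaks nothing, so a well-trained adversary would find no remaining signal; when it does, that gradient expands the explanation.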
Assessing the Efficacy of the ELECTRA Pre-Trained Language Model for Multi-Class Sarcasm Subcategory Classification
Sarcasm detection remains a challenging task in the discipline of natural language processing, primarily due to the high levels of nuance, subjectivity, and context-sensitivity in the expression of the sentiment. Pre-trained large language models have been employed in a variety of sarcasm detection tasks, including binary sarcasm detection and the classification of sarcastic speech subcategories. However, such models remain compute-hungry solutions, and there has been a recent trend towards mitigating this through the creation of more lightweight models, including ELECTRA. This dissertation seeks to assess the efficacy of the ELECTRA pre-trained large language model, known for its computational efficiency and performant results in various natural language processing tasks, for multi-class sarcasm subcategory classification. This research proposes a partial fine-tuning approach to generalise on sarcastic data before the model is applied in several manners to the task, while employing feature engineering techniques to remove overlap between hierarchical data categories. Preliminary results yield a macro F1 score of 0.0787 for 6-class classification and 0.2363 for 3-class classification, indicating potential for further improvement and application within the field.
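Macro F1, the metric reported above, is the unweighted mean of per-class F1 scores, so rare sarcasm subcategories count as much as common ones. A minimal computation, using made-up labels rather than the dissertation's data:

```python
def macro_f1(y_true, y_pred, classes):
    """Macro-averaged F1: compute precision/recall/F1 per class, then
    take the unweighted mean across classes (toy data below)."""
    f1s = []
    for c in classes:
        tp = sum(1 for t, p in zip(y_true, y_pred) if t == c and p == c)
        fp = sum(1 for t, p in zip(y_true, y_pred) if t != c and p == c)
        fn = sum(1 for t, p in zip(y_true, y_pred) if t == c and p != c)
        prec = tp / (tp + fp) if tp + fp else 0.0
        rec = tp / (tp + fn) if tp + fn else 0.0
        f1s.append(2 * prec * rec / (prec + rec) if prec + rec else 0.0)
    return sum(f1s) / len(f1s)

# Hypothetical subcategory labels (not from the dissertation's dataset).
y_true = ["irony", "satire", "irony", "overstatement"]
y_pred = ["irony", "irony", "irony", "overstatement"]
score = macro_f1(y_true, y_pred, ["irony", "satire", "overstatement"])
```

Because every class contributes equally, one subcategory the model never predicts correctly (here "satire") drags the macro average down sharply, which helps explain how multi-class sarcasm scores can sit far below binary-detection figures.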