684 research outputs found

    Latent-Optimized Adversarial Neural Transfer for Sarcasm Detection

    The existence of multiple datasets for sarcasm detection prompts us to apply transfer learning to exploit their commonality. The adversarial neural transfer (ANT) framework utilizes multiple loss terms that encourage the source-domain and the target-domain feature distributions to be similar while optimizing for domain-specific performance. However, these objectives may be in conflict, which can lead to optimization difficulties and sometimes diminished transfer. We propose a generalized latent optimization strategy that allows different losses to accommodate each other and improves training dynamics. The proposed method outperforms transfer learning and meta-learning baselines. In particular, we achieve a 10.02% absolute performance gain over the previous state of the art on the iSarcasm dataset. (Comment: 14 pages, 5 figures, published at the NAACL-HLT 2021 conference, see https://www.aclweb.org/anthology/2021.naacl-main.425)
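    As a rough illustration of the kind of objective this abstract describes (not the paper's actual formulation), the sketch below combines per-domain task losses with a gradient-reversed domain discriminator, the standard way such adversarial transfer terms end up competing over a shared encoder. All module names, sizes, and the toy data are assumptions.

```python
# Hedged sketch of an adversarial-neural-transfer-style objective: a shared
# encoder feeds a source head, a target head, and a domain discriminator whose
# gradient is reversed so shared features become domain-invariant.
import torch
import torch.nn as nn

class GradReverse(torch.autograd.Function):
    @staticmethod
    def forward(ctx, x, lamb):
        ctx.lamb = lamb
        return x.clone()

    @staticmethod
    def backward(ctx, grad_output):
        # Reverse the gradient flowing into the encoder (adversarial term).
        return -ctx.lamb * grad_output, None

class ANTModel(nn.Module):
    def __init__(self, input_dim=768, hidden_dim=256, n_classes=2):
        super().__init__()
        self.encoder = nn.Sequential(nn.Linear(input_dim, hidden_dim), nn.ReLU())
        self.source_head = nn.Linear(hidden_dim, n_classes)  # source-domain classifier
        self.target_head = nn.Linear(hidden_dim, n_classes)  # target-domain classifier
        self.domain_head = nn.Linear(hidden_dim, 2)          # source-vs-target discriminator

    def forward(self, x, lamb=1.0):
        h = self.encoder(x)
        return (self.source_head(h), self.target_head(h),
                self.domain_head(GradReverse.apply(h, lamb)))

# Toy batch: pretend these are pre-computed sentence embeddings.
xs, ys = torch.randn(8, 768), torch.randint(0, 2, (8,))
xt, yt = torch.randn(8, 768), torch.randint(0, 2, (8,))
model, ce = ANTModel(), nn.CrossEntropyLoss()

src_logits, _, src_dom = model(xs)
_, tgt_logits, tgt_dom = model(xt)
dom_labels = torch.cat([torch.zeros(8, dtype=torch.long), torch.ones(8, dtype=torch.long)])
# The potentially conflicting terms: two task losses plus an adversarial domain
# loss pulling the source and target feature distributions together.
loss = ce(src_logits, ys) + ce(tgt_logits, yt) + ce(torch.cat([src_dom, tgt_dom]), dom_labels)
loss.backward()
```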

    Deep Learning for User Comment Moderation

    Experimenting with a new dataset of 1.6M user comments from a Greek news portal and existing datasets of English Wikipedia comments, we show that an RNN outperforms the previous state of the art in moderation. A deep, classification-specific attention mechanism further improves the overall performance of the RNN. We also compare against a CNN and a word-list baseline, considering both fully automatic and semi-automatic moderation.
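    A minimal sketch of the general architecture the abstract mentions (an RNN classifier with an attention layer over its hidden states), not the authors' exact model; the vocabulary size, dimensions, and two-way accept/reject labels are assumptions.

```python
# Bidirectional GRU over token embeddings; a learned attention layer scores each
# time step, and the attention-weighted summary feeds a moderation classifier.
import torch
import torch.nn as nn

class AttentionRNNModerator(nn.Module):
    def __init__(self, vocab_size=30000, emb_dim=128, hidden_dim=128):
        super().__init__()
        self.emb = nn.Embedding(vocab_size, emb_dim)
        self.rnn = nn.GRU(emb_dim, hidden_dim, bidirectional=True, batch_first=True)
        self.attn = nn.Linear(2 * hidden_dim, 1)   # score per time step
        self.clf = nn.Linear(2 * hidden_dim, 2)    # accept vs. reject

    def forward(self, token_ids):
        h, _ = self.rnn(self.emb(token_ids))           # (batch, seq, 2*hidden)
        weights = torch.softmax(self.attn(h), dim=1)   # classification-specific attention
        context = (weights * h).sum(dim=1)             # weighted sum of hidden states
        return self.clf(context), weights.squeeze(-1)

logits, attention = AttentionRNNModerator()(torch.randint(0, 30000, (4, 50)))
```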

    Extractive Adversarial Networks: High-Recall Explanations for Identifying Personal Attacks in Social Media Posts

    We introduce an adversarial method for producing high-recall explanations of neural text classifier decisions. Building on an existing architecture for extractive explanations via hard attention, we add an adversarial layer which scans the residual of the attention for remaining predictive signal. Motivated by the important domain of detecting personal attacks in social media comments, we additionally demonstrate the importance of manually setting a semantically appropriate `default' behavior for the model by explicitly manipulating its bias term. We develop a validation set of human-annotated personal attacks to evaluate the impact of these changes. (Comment: Accepted to EMNLP 2018. Code and data available at https://github.com/shcarton/rcn)
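    A hedged sketch of the general idea, not the authors' implementation (which is linked above): an extractor selects tokens for the main classifier, while an adversarial classifier reads the complementary "residual" tokens; if the adversary can still predict the label, the extractor has missed predictive signal, so penalizing adversary success pushes the explanation toward high recall. The soft mask, dimensions, and toy data here are assumptions.

```python
# Extractor -> main classifier on selected tokens; adversary on the residual.
import torch
import torch.nn as nn

emb_dim, n_classes = 64, 2
extractor = nn.Linear(emb_dim, 1)           # per-token selection score
classifier = nn.Linear(emb_dim, n_classes)  # reads the selected tokens
adversary = nn.Linear(emb_dim, n_classes)   # scans the residual for leftover signal
ce = nn.CrossEntropyLoss()

tokens = torch.randn(4, 30, emb_dim)                 # toy token embeddings
labels = torch.randint(0, n_classes, (4,))

mask = torch.sigmoid(extractor(tokens))              # soft stand-in for hard attention
selected = (mask * tokens).mean(dim=1)
residual = ((1 - mask) * tokens).mean(dim=1)

main_loss = ce(classifier(selected), labels)
adv_loss = ce(adversary(residual), labels)
# The extractor wants the main classifier to succeed and the adversary to fail,
# i.e. the selected tokens should capture all of the predictive signal.
extractor_objective = main_loss - adv_loss
```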

    Assessing the Efficacy of the ELECTRA Pre-Trained Language Model for Multi-Class Sarcasm Subcategory Classification

    Sarcasm detection remains a challenging task in the discipline of natural language processing, primarily due to the high levels of nuance, subjectivity, and context-sensitivity in how the sentiment is expressed. Pre-trained large language models have been employed in a variety of sarcasm detection tasks, including binary sarcasm detection and the classification of sarcastic speech subcategories. However, such models remain compute-hungry solutions, and there has been a recent trend towards mitigating this through the creation of more lightweight models, including ELECTRA. This dissertation seeks to assess the efficacy of the ELECTRA pre-trained language model, known for its computational efficiency and performant results in various natural language processing tasks, for multi-class sarcasm subcategory classification. This research proposes a partial fine-tuning approach that first generalises the model on sarcastic data before applying it to the task in several configurations, while employing feature engineering techniques to remove overlap between hierarchical data categories. Preliminary results yield a macro F1 score of 0.0787 for 6-class classification and 0.2363 for 3-class classification, indicating potential for further improvement and application within the field.
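    A minimal sketch of one plausible "partial fine-tuning" setup with ELECTRA: freezing the embeddings and lower encoder layers so only the upper layers and the classification head are updated. The checkpoint name, the number of frozen layers, and the six-label head are assumptions; the dissertation's exact recipe is not reproduced here.

```python
# Partial fine-tuning of ELECTRA for 6-class sarcasm subcategory classification,
# using the Hugging Face transformers library.
from transformers import ElectraForSequenceClassification, ElectraTokenizerFast

model = ElectraForSequenceClassification.from_pretrained(
    "google/electra-small-discriminator", num_labels=6)  # 6 sarcasm subcategories (assumed)
tokenizer = ElectraTokenizerFast.from_pretrained("google/electra-small-discriminator")

# Freeze the embeddings and the lower 8 of 12 encoder layers; only the top
# layers and the new classification head receive gradient updates.
for param in model.electra.embeddings.parameters():
    param.requires_grad = False
for layer in model.electra.encoder.layer[:8]:
    for param in layer.parameters():
        param.requires_grad = False

batch = tokenizer(["oh great, another meeting"], return_tensors="pt")
logits = model(**batch).logits  # one score per subcategory
```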