
    Transformer based contextualization of pre-trained word embeddings for irony detection in Twitter

    [EN] Human communication using natural language, especially in social media, is influenced by the use of figurative language such as irony. Several recent workshops have explored the task of irony detection in Twitter using computational approaches. This paper describes a model for irony detection based on the contextualization of pre-trained Twitter word embeddings by means of the Transformer architecture. The approach builds on the same powerful architecture as BERT but, unlike BERT, it allows the use of in-domain embeddings. We performed an extensive evaluation on two corpora, one for English and one for Spanish. Our system ranked first on the Spanish corpus and, to our knowledge, achieved the second-best result on the English corpus. These results support the correctness and adequacy of our proposal. We also studied and interpreted how the multi-head self-attention mechanisms specialize in detecting irony by considering the polarity and relevance of individual words and even the relationships among words. This analysis is a first step towards understanding how the multi-head self-attention mechanisms of the Transformer architecture address the irony detection problem.

    This work has been partially supported by the Spanish Ministerio de Ciencia, Innovación y Universidades and FEDER funds under project AMIC (TIN2017-85854-C4-2-R) and the GiSPRO project (PROMETEU/2018/176). The work of José-Ángel González is financed by Universitat Politècnica de València under grant PAID-01-17.

    González-Barba, JÁ.; Hurtado Oliver, LF.; Pla Santamaría, F. (2020). Transformer based contextualization of pre-trained word embeddings for irony detection in Twitter. Information Processing & Management, 57(4), 1-15. https://doi.org/10.1016/j.ipm.2020.102262
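The contextualization step the abstract describes can be illustrated with a minimal sketch of one scaled dot-product self-attention head over pretrained token embeddings, the core operation of the Transformer encoder. This is a generic NumPy illustration with random stand-in weights and embeddings, not the authors' implementation; all names and dimensions here are hypothetical.

```python
import numpy as np

def softmax(x, axis=-1):
    e = np.exp(x - x.max(axis=axis, keepdims=True))
    return e / e.sum(axis=axis, keepdims=True)

def self_attention(E, Wq, Wk, Wv):
    """Contextualize token embeddings E (n_tokens x d) with one
    scaled dot-product self-attention head, as in the Transformer."""
    Q, K, V = E @ Wq, E @ Wk, E @ Wv
    d_k = Q.shape[-1]
    scores = Q @ K.T / np.sqrt(d_k)       # token-to-token relevance
    weights = softmax(scores, axis=-1)    # each row sums to 1
    return weights @ V, weights           # contextualized embeddings, attention map

rng = np.random.default_rng(0)
n, d = 5, 8                        # 5 tokens, 8-dim embeddings (illustrative)
E = rng.normal(size=(n, d))        # stand-in for pretrained Twitter embeddings
Wq, Wk, Wv = (rng.normal(size=(d, d)) for _ in range(3))
ctx, attn = self_attention(E, Wq, Wk, Wv)
print(ctx.shape, attn.shape)       # (5, 8) (5, 5)
```

The attention map `attn` is the quantity the paper inspects when interpreting which words (and word relationships) a head attends to when detecting irony.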

    Incorporating Emoji Descriptions Improves Tweet Classification

    Article presenting a simple strategy for processing emojis in tweets: replace them with their natural-language description and use pretrained word embeddings as normally done for standard words. Results show that this strategy is more effective than using pretrained emoji embeddings for tweet classification.
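The replacement strategy described above can be sketched in a few lines. The mapping below is a tiny illustrative subset (a real system would use a full emoji name table, e.g. the Unicode CLDR annotations); the function name and dictionary are hypothetical, not from the article.

```python
# Illustrative subset of emoji -> natural-language description.
EMOJI_DESCRIPTIONS = {
    "😂": "face with tears of joy",
    "❤️": "red heart",
    "🔥": "fire",
}

def describe_emojis(tweet: str) -> str:
    """Replace each known emoji with its textual description so that
    standard pretrained word embeddings can be looked up afterwards."""
    for emoji, description in EMOJI_DESCRIPTIONS.items():
        tweet = tweet.replace(emoji, f" {description} ")
    return " ".join(tweet.split())  # normalize whitespace

print(describe_emojis("great game 🔥🔥"))  # great game fire fire
```

After this preprocessing step, the tweet contains only ordinary words, so the downstream classifier needs no special handling for emojis.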

    Emotion Embeddings – Learning Stable and Homogeneous Abstractions from Heterogeneous Affective Datasets

    Human emotion is expressed in many communication modalities and media formats, and so its computational study is equally diversified into natural language processing, audio signal analysis, computer vision, etc. Similarly, the large variety of representation formats used in previous research to describe emotions (polarity scales, basic emotion categories, dimensional approaches, appraisal theory, etc.) has led to an ever-proliferating diversity of datasets, predictive models, and software tools for emotion analysis. Because of these two distinct types of heterogeneity, at the expressional and representational level, there is a dire need to unify previous work on increasingly diverging data and label types. This article presents such a unifying computational model. We propose a training procedure that learns a shared latent representation for emotions, so-called emotion embeddings, independent of different natural languages, communication modalities, media or representation label formats, and even disparate model architectures. Experiments on a wide range of heterogeneous affective datasets indicate that this approach yields the desired interoperability for the sake of reusability, interpretability and flexibility, without penalizing prediction quality. Code and data are archived under https://doi.org/10.5281/zenodo.7405327. Comment: 18 pages, 6 figures
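The core idea of a shared latent space can be sketched as follows: each heterogeneous label format gets its own mapping head that projects into one common embedding space. The sketch below uses untrained random linear heads purely for illustration; in the paper these mappings are learned jointly, and all names and dimensions here are hypothetical.

```python
import numpy as np

rng = np.random.default_rng(1)
d = 4  # dimensionality of the shared emotion-embedding space (illustrative)

# One mapping head per label format; in the paper these are trained so that
# semantically similar labels land close together in the shared space.
W_polarity = rng.normal(size=(1, d))   # scalar polarity -> shared space
W_basic    = rng.normal(size=(6, d))   # 6 basic-emotion probabilities -> shared space

def embed_polarity(p):
    """Map a polarity score in [-1, 1] into the shared space."""
    return np.array([p]) @ W_polarity

def embed_basic(probs):
    """Map a distribution over 6 basic emotions into the shared space."""
    return np.asarray(probs) @ W_basic

z1 = embed_polarity(0.8)
z2 = embed_basic([0.7, 0.1, 0.05, 0.05, 0.05, 0.05])
print(z1.shape, z2.shape)  # both vectors live in the same d-dimensional space
```

Once every label format maps into the same space, datasets annotated under different schemes become directly comparable, which is the interoperability the abstract refers to.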