121 research outputs found

Towards Robust Word Embeddings for Noisy Texts

Author: Doval Yerai
Gómez-Rodríguez Carlos
Vilares Jesús
Publication venue: 'MDPI AG'
Publication date: 01/01/2020

[Abstract] Research on word embeddings has mainly focused on improving their performance on standard corpora, disregarding the difficulties posed by noisy texts in the form of tweets and other types of non-standard writing from social media. In this work, we propose a simple extension to the skipgram model in which we introduce the concept of bridge-words, which are artificial words added to the model to strengthen the similarity between standard words and their noisy variants. Our new embeddings outperform baseline models on noisy texts on a wide range of evaluation tasks, both intrinsic and extrinsic, while retaining a good performance on standard texts. To the best of our knowledge, this is the first explicit approach at dealing with these types of noisy texts at the word embedding level that goes beyond the support for out-of-vocabulary words.

Funding:
Ministerio de Economía, Industria y Competitividad. MINECO; TIN2017-85160-C2-2-R
Ministerio de Economía, Industria y Competitividad. MINECO; TIN2017-85160-C2-1-R
European Social Fund. ESF; BES-2015-073768
Xunta de Galicia; ED431D 2017/12
Xunta de Galicia; ED431B 2017/01
Xunta de Galicia; ED431C 2020/11
Xunta de Galicia; ED431G/0

arXiv.org e-Print Archive

Modeling Empathy and Distress in Reaction to News Stories

Author: Buechel Sven
Buffone Anneke
Sedoc João
Slaff Barry
Ungar Lyle
Publication date: 01/01/2018

Computational detection and understanding of empathy is an important factor in advancing human-computer interaction. Yet to date, text-based empathy prediction has the following major limitations: It underestimates the psychological complexity of the phenomenon, adheres to a weak notion of ground truth where empathic states are ascribed by third parties, and lacks a shared corpus. In contrast, this contribution presents the first publicly available gold standard for empathy prediction. It is constructed using a novel annotation methodology which reliably captures empathy assessments by the writer of a statement using multi-item scales. This is also the first computational work distinguishing between multiple forms of empathy, empathic concern, and personal distress, as recognized throughout psychology. Finally, we present experimental results for three different predictive models, of which a CNN performs the best.

Comment: To appear at EMNLP 201

arXiv.org e-Print Archive