7 research outputs found

    Cross Lingual Sentiment Analysis: A Clustering-Based Bee Colony Instance Selection and Target-Based Feature Weighting Approach

    The lack of sentiment resources in resource-poor languages poses challenges for sentiment analysis that relies on machine learning. Cross-lingual and semi-supervised learning approaches are the most common ways of overcoming this issue. However, the performance of existing methods degrades due to the poor quality of translated resources, data sparseness and, more specifically, language divergence. An integrated learning model that uses a semi-supervised model and an ensemble model, while utilizing the available sentiment resources to tackle issues related to language divergence, is proposed. Additionally, to reduce the impact of translation errors and handle the instance selection problem, we propose a clustering-based bee-colony sample selection method for the optimal selection of the most distinguishing features representing the target data. To evaluate the proposed model, various experiments are conducted on an English-Arabic cross-lingual data set. Simulation results demonstrate that the proposed model outperforms the baseline approaches in terms of classification performance. Furthermore, the statistical outcomes indicate the advantages of the proposed training data sampling and target-based feature selection in reducing the negative effect of translation errors. These results highlight that the proposed approach achieves performance close to that of in-language supervised models.

    Multi-view informed attention-based model for Irony and Satire detection in Spanish variants

    [EN] Making machines understand language and reason about it has been one of the most challenging problems addressed by Artificial Intelligence researchers. This challenge increases when figurative language is used to communicate complex meanings, intentions, emotions and attitudes in creative and funny ways. In fact, sentiment analysis approaches struggle when facing irony, satire and other figurative language, particularly in cases where the explanation of a prediction might arguably be as necessary as the prediction itself. This paper describes MvAttLSTM, a new deep learning model for irony and satire detection in tweets written in distinct Spanish variants. The proposed model is based on an attentive LSTM informed with three additional views learned from distinct perspectives. We investigate two strategies for passing these views into MvAttLSTM. We perform an extensive evaluation on three corpora, one for irony detection and two for satire detection. Moreover, in order to study the robustness of our proposed model, we investigate its performance on humor recognition. Experiments confirm that the proposed views help our model improve its performance. Moreover, they show that affective information helps our model detect irony and satire. In particular, a first analysis of the results highlights the discriminating power of emotional features obtained from SenticNet and the SEL lexicon. Overall, our system achieves state-of-the-art performance in irony and satire detection in Spanish variants and competitive results in humor recognition.
    The work of the first two authors was carried out in the framework of the research project MISMIS-FAKEnHATE on MISinformation and MIScommunication in social media: FAKE news and HATE speech (PGC2018-096212-B-C31), funded by the Spanish Ministry of Science and Innovation, and DeepPattern (PROMETEO/2019/121), funded by the Generalitat Valenciana, Spain.
    Ortega-Bueno, R.; Rosso, P.; Medina-Pagola, JE. (2022). Multi-view informed attention-based model for Irony and Satire detection in Spanish variants. Knowledge-Based Systems. 235:1-24. https://doi.org/10.1016/j.knosys.2021.107597

    A study of the translation of sentiment in user-generated text

    A thesis submitted in partial fulfilment of the requirements of the University of Wolverhampton for the degree of Doctor of Philosophy.
    Emotions are biological states of feeling that humans may verbally express to communicate their negative or positive mood, influence others, or even inflict harm. Although emotions such as anger, happiness, affection, or fear are supposedly universal experiences, the linguistic realisation of the emotional experience may vary in subtle ways across different languages. For this reason, preserving the original sentiment of the source text has always been a challenging task that draws on a translator's competence and finesse. In the professional translation industry, an incorrect translation of the sentiment-carrying lexicon is considered a critical error, as it can be either misleading or in some cases harmful since it misses the fundamental aspect of the source text, i.e. the author's sentiment. Since the advent of Neural Machine Translation (NMT), there has been a tremendous improvement in the quality of automatic translation. This has led to extensive use of online NMT tools to translate User-Generated Text (UGT) such as reviews, tweets, and social media posts, where the main message is often the author's positive or negative attitude towards an entity. In such scenarios, the process of translating the user's sentiment is entirely automatic with no human intervention, neither for post-editing nor for accuracy checking. However, NMT output still lacks accuracy in some low-resource languages and sometimes contains critical translation errors that may not only distort the sentiment but at times flip the polarity of the source text to its exact opposite. In this thesis, we tackle the translation of sentiment in UGT by NMT systems from two perspectives: analytical and experimental.
    First, the analytical approach introduces a list of linguistic features that can lead to a mistranslation of fine-grained emotions between different language pairs in the UGT domain. It also presents an error typology specific to Arabic UGT, illustrating the main linguistic phenomena that can cause mistranslation of sentiment polarity when translating Arabic UGT into English by NMT systems. Second, the experimental approach attempts to improve the translation of sentiment by addressing some of the linguistic challenges identified in the analysis as causing mistranslation of sentiment at both the word level and the sentence level. At the word level, we propose a Transformer NMT model trained on a sentiment-oriented vector space model (VSM) of UGT data that is capable of translating the correct sentiment polarity of challenging contronyms. At the sentence level, we propose a semi-supervised approach to overcome the problem of translating sentiment expressed by dialectical language in UGT data. We take the translation of dialectical Arabic UGT into English as a case study. Our semi-supervised AR-EN NMT model shows improved performance over the online MT Twitter tool in translating dialectical Arabic UGT, not only in terms of translation quality but also in the preservation of the sentiment polarity of the source text. The experimental section also presents an empirical method to quantify the notion of sentiment transfer by an MT system and, more concretely, to modify automatic metrics such that their MT ranking comes closer to a human judgement of a poor or good translation of sentiment.