
    Detection of Sarcasm and Nastiness: New Resources for Spanish Language

    The main goal of this work is to provide the cognitive computing community with valuable resources to analyze and simulate the intentionality and/or emotions embedded in the language employed in social media. Specifically, it focuses on the Spanish language and online dialogues, leading to the creation of SOFOCO (Spanish Online Forums Corpus). It is the first Spanish corpus consisting of dialogic debates extracted from social media, and it is annotated by means of crowdsourcing in order to support automatic analysis of subjective language forms such as sarcasm or nastiness. Furthermore, the annotators were also asked whether they needed the context to make their decision. In this way, users' intentions and their behavior inside social networks can be better understood, and more accurate text analysis becomes possible. The annotation results are analyzed and the reliability of the annotations is explored. Additionally, sarcasm and nastiness detection results (around 0.76 F-Measure in both cases) are reported. The results show that the presented corpus is a valuable resource that might be used in very diverse future work. This study was partially funded by the Spanish Government (TIN2014-54288-C4-4-R and TIN2017-85854-C4-3-R), by the European Union's H2020 program under grant 769872, and by the National Science Foundation of USA (NSF CISE R1 #1202668

    Generación de recursos para Análisis de Opiniones en español

    [Translated from Spanish] Sentiment Analysis (SA) refers to the treatment of subjective information in texts, especially comments or personal opinions. One of the basic tasks of SA is classifying the polarity of a given text in a document or sentence, that is, whether the opinion expressed is positive, negative, or neutral. Much research has addressed polarity classification in documents written in English. However, more and more people now express comments or opinions in their own language. Carrying out this task requires linguistic resources (lexicons and corpora), which are scarce, when not nonexistent, in languages other than English. For these reasons, this thesis aims to generate new resources for SA in Spanish, the third most relevant language on the web 2.0.

    [EN] Sentiment Analysis (SA) refers to the treatment of the subjective information in texts, product reviews, comments on blogs or personal opinions. One of the basic tasks in SA is classifying the polarity of a given text in a document, i.e., whether the opinion expressed is positive, negative, or neutral. Many studies have investigated polarity classification in documents written in English. However, nowadays more and more people express their comments, opinions or points of view in their own language. For this reason, it is necessary to develop systems that can extract and analyze all this information in different languages. In this work we focus on polarity detection for Spanish reviews. We are mainly concerned with linguistic resources for Spanish sentiment analysis because, in addition to the lack of resources for this language in this area, it is currently the third most used language on the web 2.0.

    Thesis, Univ. Jaén, Departamento de Informática. Defended on 28 November 201