52 research outputs found

    Building layered, multilingual sentiment lexicons at synset and lemma levels

    Many tasks related to sentiment analysis rely on sentiment lexicons, lexical resources containing information about the emotional implications of words (e.g., the sentiment orientation of words, positive or negative). In this work, we present an automatic method for building lemma-level sentiment lexicons, which has been applied to obtain lexicons for English, Spanish and three other official languages of Spain. Our lexicons are multi-layered, allowing applications to trade off between the number of available words and the accuracy of the estimations. Our evaluations show high accuracy values in all cases. As a step prior to the lemma-level lexicons, we have built a synset-level lexicon for English similar to SENTIWORDNET 3.0, one of the most widely used sentiment lexicons today. We have made several improvements to the original SENTIWORDNET 3.0 building method, yielding significantly better estimations of positivity and negativity, according to our evaluations. The resource containing all the lexicons, ML-SENTICON, is publicly available. Ministerio de Economía y Competitividad TIN2012-38536-C03-0

    A Systematic Literature Review of Sentiment Lexicon Development

    A sentiment lexicon is a dictionary of terms that have been classified by sentiment polarity as positive, negative or neutral. The terms in a sentiment lexicon are usually also accompanied by their polarity weights. Sentiment lexicons play an important role in sentiment analysis, the process of classifying the emotional polarity contained in data. A sentiment lexicon is therefore the foundation that strengthens the accuracy of the classifications produced by a sentiment analysis engine. Given this importance, many studies have developed sentiment lexicons. This study is a literature review of sentiment lexicon development. Its results identify the methods used to develop sentiment lexicons, the problems that arise during development, the solutions for overcoming those problems, the amount and sources of data required, and the effect of sentiment lexicons on the performance of sentiment analysis systems. The study follows a systematic literature review approach. Its results can serve as a survey for sentiment lexicon development projects

    Improving Spanish Polarity Classification Combining Different Linguistic Resources

    Sentiment analysis is a challenging task that is attracting the attention of researchers. However, most work focuses only on English documents, perhaps due to the lack of linguistic resources for other languages. In this paper, we present several Spanish opinion mining resources in order to develop a polarity classification system. In addition, we propose combining different features extracted from each resource in order to train a classifier over two different opinion corpora. We show that integrating knowledge from several resources can improve the final Spanish polarity classification system. The good results encourage us to continue developing sentiment resources for Spanish and studying the combination of features extracted from different resources. Ministerio de Economía y Competitividad TIN2012-38536-C03-0; Junta de Andalucía P11-TIC-7684; Universidad de Jaén CEATIC-2013-0

    Multilingual sentiment analysis in social media.

    252 p. This thesis addresses the task of analysing sentiment in messages coming from social media. The ultimate goal was to develop a Sentiment Analysis system for Basque. However, because of the socio-linguistic reality of the Basque language, a tool providing only analysis for Basque would not be enough for a real-world application. Thus, we set out to develop a multilingual system, including Basque, English, French and Spanish. The thesis addresses the following challenges to build such a system:
    - Analysing methods for creating sentiment lexicons suitable for less-resourced languages.
    - Analysing social media (specifically Twitter): tweets pose several challenges for understanding and extracting opinions from such messages. Language identification and microtext normalization are addressed.
    - Researching the state of the art in polarity classification, and developing a supervised classifier that is tested against well-known social media benchmarks.
    - Developing a social media monitor capable of analysing sentiment with respect to specific events, products or organizations

    Cross-domain polarity classification using a knowledge-enhanced meta-classifier

    Current approaches to single and cross-domain polarity classification usually use bag of words, n-grams or lexical resource-based classifiers. In this paper, we propose the use of meta-learning to combine and enrich those approaches by also adding other knowledge-based features. In addition to the aforementioned classical approaches, our system uses the BabelNet multilingual semantic network to generate features derived from word sense disambiguation and vocabulary expansion. Experimental results show state-of-the-art performance on single and cross-domain polarity classification. Contrary to other approaches, ours is generic. These results were obtained without any domain adaptation technique. Moreover, the use of meta-learning allows our approach to obtain the most stable results across domains. Finally, our empirical analysis provides interesting insights on the use of semantic network-based features. European Commission WIQ-EI IRSES (No. 269180); Ministerio de Economía y Competitividad TIN2012-38603-C02-01; Ministerio de Economía y Competitividad TIN2012-38536-C03-02; Junta de Andalucía P11-TIC-7684 M

    Sentiment Analysis in Social Streams

    In this chapter we review and discuss the state of the art on sentiment analysis in social streams (such as web forums, micro-blogging systems, and social networks), aiming to clarify how user opinions, affective states, and intended emotional effects are extracted from user generated content, how they are modeled, and how they could be finally exploited. We explain why sentiment analysis tasks are more difficult for social streams than for other textual sources, and entail going beyond classic text-based opinion mining techniques. We show, for example, that social streams may use vocabularies and expressions that exist outside the mainstream of standard, formal languages, and may reflect complex dynamics in the opinions and sentiments expressed by individuals and communities

    Sentiment Analysis in Social Streams

    In this chapter, we review and discuss the state of the art on sentiment analysis in social streams (such as web forums, microblogging systems, and social networks), aiming to clarify how user opinions, affective states, and intended emotional effects are extracted from user generated content, how they are modeled, and how they could be finally exploited. We explain why sentiment analysis tasks are more difficult for social streams than for other textual sources, and entail going beyond classic text-based opinion mining techniques. We show, for example, that social streams may use vocabularies and expressions that exist outside the mainstream of standard, formal languages, and may reflect complex dynamics in the opinions and sentiments expressed by individuals and communities

    Generation of Resources for Opinion Analysis in Spanish

    [ES, translated] Sentiment Analysis (SA) refers to the treatment of subjective information in texts, above all comments or personal opinions. One of the basic tasks of SA is classifying the polarity of a given text, in a document or sentence, that is, whether the opinion expressed is positive, negative or neutral. Much research has addressed polarity classification in documents written in English. However, more and more people now express comments or opinions in their own language. Carrying out this task requires linguistic resources (lexicons and corpora), which are scarce, when not nonexistent, for languages other than English. Given these circumstances, this thesis aims to generate new resources for SA in Spanish, the third most relevant language on the web 2.0.
    [EN] Sentiment Analysis (SA) refers to the treatment of the subjective information in texts, product reviews, comments on blogs or personal opinions. One of the basic tasks in SA is classifying the polarity of a given text in a document, i.e., whether the opinion expressed is positive, negative, or neutral. Many studies have investigated polarity classification in documents written in English. However, nowadays more and more people express their comments, opinions or points of view in their own language. For this reason, it is necessary to develop systems that can extract and analyze all this information in different languages. In this work we focus on polarity detection for Spanish reviews. We are mainly concerned with linguistic resources for Spanish sentiment analysis because, in addition to the lack of resources for this language in this area, it is currently the third most used language on the web 2.0.
    Thesis, Univ. Jaén, Departamento de Informática. Defended on 28 November 201

    AORESCU: Opinion Analysis in Social Networks and User-Generated Contents

    The AORESCU project aims to collect and process the information that users generate about an entity, in order to derive indicators for evaluating the image users have of that entity. The information retrieved from web 2.0 sources can be structured (e.g. numerical ratings) or unstructured (mainly texts in natural language). The techniques and tools used in the project are adaptable to any domain. Nevertheless, the tourism sector was chosen as the application domain because it is a sector with important economic activity and one for which user-generated content is easy to find. The project has four main parts: the retrieval of information from different sources about the entities in the application domain (for the tourism sector: hotels, restaurants, natural spaces, monuments, ...), the definition of a data model to represent this information, the development of text analysis tools to process user comments, and the development of a web application to query and analyze the processed data. The AORESCU project (P11-TIC-7684 MO) is funded by the Consejería de Innovación, Ciencia y Empresas de la Junta de Andalucía