52 research outputs found
Building layered, multilingual sentiment lexicons at synset and lemma levels
Many tasks related to sentiment analysis rely on sentiment lexicons, lexical resources containing
information about the emotional implications of words (e.g., sentiment orientation of words, positive
or negative). In this work, we present an automatic method for building lemma-level sentiment lexicons,
which has been applied to obtain lexicons for English, Spanish and three other official languages of Spain.
Our lexicons are multi-layered, allowing applications to trade off between the number of available words
and the accuracy of the estimations. Our evaluations show high accuracy values in all cases. As a preliminary
step to the lemma-level lexicons, we have built a synset-level lexicon for English similar to SENTIWORDNET
3.0, one of the most widely used sentiment lexicons today. We have made several improvements to the
original SENTIWORDNET 3.0 building method, which yield significantly better estimations of positivity and
negativity, according to our evaluations. The resource containing all the lexicons, ML-SENTICON, is publicly
available.
Funding: Ministerio de Economía y Competitividad TIN2012-38536-C03-0
A Systematic Literature Review of Sentiment Lexicon Development (Studi Literatur Sistematis Terhadap Pengembangan Leksikon Sentiment)
A sentiment lexicon is a dictionary of terms that have been classified by sentiment polarity as positive, negative or neutral. The terms in a sentiment lexicon are usually also accompanied by polarity weights. Sentiment lexicons play an important role in sentiment analysis, the process of classifying the polarity of the emotion contained in data. A sentiment lexicon is therefore the foundation that strengthens the accuracy of the classifications produced by a sentiment analysis engine. Given this importance, many studies have developed sentiment lexicons. This study is a literature review of sentiment lexicon development. Its results identify the methods used to develop sentiment lexicons, the problems that arise during development, the solutions to those problems, the amount and sources of data required, and the effect of sentiment lexicons on the performance of sentiment analysis systems. The study follows a systematic literature review approach. Its results can serve as an overview for sentiment lexicon development projects.
Improving Spanish Polarity Classification Combining Different Linguistic Resources
Sentiment analysis is a challenging task which is attracting
the attention of researchers. However, most work focuses only on
English documents, perhaps due to the lack of linguistic resources for
other languages. In this paper, we present several Spanish opinion mining
resources in order to develop a polarity classification system. In addition,
we propose the combination of different features extracted from each
resource in order to train a classifier over two different opinion corpora.
We show that the integration of knowledge from several resources can
improve the final Spanish polarity classification system. The good results
encourage us to continue developing sentiment resources for Spanish and
studying the combination of features extracted from different resources.
Funding: Ministerio de Economía y Competitividad TIN2012-38536-C03-0; Junta de Andalucía P11-TIC-7684; Universidad de Jaén CEATIC-2013-0
Multilingual sentiment analysis in social media.
252 p. This thesis addresses the task of analysing sentiment in messages from social media. The ultimate goal was to develop a Sentiment Analysis system for Basque. However, because of the sociolinguistic reality of the Basque language, a tool providing analysis only for Basque would not be enough for a real-world application. Thus, we set out to develop a multilingual system covering Basque, English, French and Spanish. The thesis addresses the following challenges in building such a system:
- Analysing methods for creating sentiment lexicons suitable for less-resourced languages.
- Analysing social media (specifically Twitter): tweets pose several challenges for understanding and extracting opinions. Language identification and microtext normalization are addressed.
- Researching the state of the art in polarity classification, and developing a supervised classifier that is tested against well-known social media benchmarks.
- Developing a social media monitor capable of analysing sentiment with respect to specific events, products or organizations.
Cross-domain polarity classification using a knowledge-enhanced meta-classifier
Current approaches to single and cross-domain polarity classification usually use bag of words, n-grams
or lexical resource-based classifiers. In this paper, we propose the use of meta-learning to combine and
enrich those approaches by also adding other knowledge-based features. In addition to the aforementioned
classical approaches, our system uses the BabelNet multilingual semantic network to generate features
derived from word sense disambiguation and vocabulary expansion. Experimental results show
state-of-the-art performance on single and cross-domain polarity classification. Contrary to other
approaches, ours is generic. These results were obtained without any domain adaptation technique.
Moreover, the use of meta-learning allows our approach to obtain the most stable results across domains.
Finally, our empirical analysis provides interesting insights on the use of semantic network-based
features.
Funding: European Commission WIQ-EI IRSES (No. 269180); Ministerio de Economía y Competitividad TIN2012-38603-C02-01; Ministerio de Economía y Competitividad TIN2012-38536-C03-02; Junta de Andalucía P11-TIC-7684 M
Sentiment Analysis in Social Streams
In this chapter we review and discuss the state of the art on sentiment analysis in social streams, such as web forums, micro-blogging systems, and social networks, aiming to clarify how user opinions, affective states, and intended emotional effects are extracted from user generated content, how they are modeled, and how they could be finally exploited. We explain why sentiment analysis tasks are more difficult for social streams than for other textual sources, and entail going beyond classic text-based opinion mining techniques. We show, for example, that social streams may use vocabularies and expressions that exist outside the mainstream of standard, formal languages, and may reflect complex dynamics in the opinions and sentiments expressed by individuals and communities
Sentiment Analysis in Social Streams
In this chapter, we review and discuss the state of the art on sentiment
analysis in social streams, such as web forums, microblogging systems, and social
networks, aiming to clarify how user opinions, affective states, and intended emotional
effects are extracted from user generated content, how they are modeled, and
how they could be finally exploited. We explain why sentiment analysis tasks are more
difficult for social streams than for other textual sources, and entail going beyond
classic text-based opinion mining techniques. We show, for example, that social
streams may use vocabularies and expressions that exist outside the mainstream of
standard, formal languages, and may reflect complex dynamics in the opinions and
sentiments expressed by individuals and communities
Generation of resources for Opinion Analysis in Spanish (Generación de recursos para Análisis de Opiniones en español)
[Translated from ES] Sentiment Analysis (SA) refers to the treatment of subjective information in texts, above all personal comments or opinions. One of the basic tasks in SA is classifying the polarity of a given text, at document or sentence level, that is, whether the opinion expressed is positive, negative or neutral. Polarity classification in documents written in English has been studied extensively. However, more and more people now express comments or opinions in their own language. Carrying out this task requires linguistic resources (lexicons and corpora), which are scarce, if not nonexistent, for languages other than English. For these reasons, this thesis aims to generate new resources for SA in Spanish, the third most relevant language on the web 2.0.
[EN] Sentiment Analysis (SA) refers to the treatment of the subjective information in texts, product reviews, comments on blogs or personal opinions. One of the basic tasks in SA is classifying the polarity of a given text in a document, i.e., whether the opinion expressed is positive, negative, or neutral. Many studies have investigated polarity classification in documents written in English. However, nowadays more and more people express their comments, opinions or points of view in their own language. For this reason, it is necessary to develop systems that can extract and analyze all this information in different languages. In this work we focus on polarity detection for Spanish reviews. We are mainly concerned with linguistic resources for Spanish sentiment analysis because, in addition to the lack of resources for this language in this area, it is currently the third most used language on the web 2.0.
Thesis, Univ. Jaén, Departamento de Informática. Defended on 28 November 201
AORESCU: Opinion Analysis in Social Networks and User-Generated Contents
The main goals of the AORESCU project are the retrieval and processing of information generated by users about an entity, in order to derive a set of indicators that help evaluate the image users have of it. Two types of information can be retrieved from web 2.0 sources: structured (e.g. numerical ratings) and unstructured (mainly texts in natural language). The techniques and tools used in the project are adaptable to any domain. We chose the tourism sector as the application domain because it is a sector with significant economic activity and because it is easy to find user-generated content about tourism resources.
The project has four main phases: the retrieval of information from different sources about the entities (for the tourism sector, these entities are hotels, restaurants, natural spaces, monuments, ...), the definition of a data model to represent this information, the development of text analysis tools to process user comments, and the development of a web application to query and analyze the processed data.
The AORESCU project (P11-TIC-7684 MO) is funded by the Consejería de Innovación, Ciencia y Empresas of the Junta de Andalucía.