1,094 research outputs found

    A lexicon based method to search for extreme opinions

    Studies in sentiment analysis and opinion mining have focused on many aspects related to opinions, namely polarity classification by making use of positive, negative or neutral values. However, most studies have overlooked the identification of extreme opinions (most negative and most positive opinions) in spite of their vast significance in many applications. We use an unsupervised approach to search for extreme opinions, which is based on the automatic construction of a new lexicon containing the most negative and most positive words

    Learning domain-specific sentiment lexicons with applications to recommender systems

    Search is now going beyond looking for factual information, and people wish to search for the opinions of others to help them in their own decision-making. Sentiment expressions or opinion expressions are used by users to express their opinion and embody important pieces of information, particularly in online commerce. The main problem that the present dissertation addresses is how to model text to find meaningful words that express a sentiment. In this context, I investigate the viability of automatically generating a sentiment lexicon for opinion retrieval and sentiment classification applications. For this research objective we propose to capture sentiment words that are derived from online users’ reviews. In this approach, we tackle a major challenge in sentiment analysis, which is the detection of words that express subjective preference and domain-specific sentiment words such as jargon. To this aim we present a fully generative method that automatically learns a domain-specific lexicon and is fully independent of external sources. Sentiment lexicons can be applied in a broad set of applications; however, popular recommendation algorithms have somehow been disconnected from sentiment analysis. Therefore, we present a study that explores the viability of applying sentiment analysis techniques to infer ratings in a recommendation algorithm. Furthermore, entities’ reputation is intrinsically associated with sentiment words that have a positive or negative relation with those entities. Hence, a study is provided that examines the viability of using a domain-specific lexicon to compute entities’ reputation. Finally, a recommendation system algorithm is improved with the use of sentiment-based ratings and entities’ reputation

    Multilingual opinion mining

    170 p. A large amount of text is generated every day in different online media. Much of that text contains opinions about a multitude of entities, products, services, etc. Given the growing need for automated means to analyze, process and exploit that information, sentiment analysis techniques have received a great deal of attention from industry and the scientific community over the last decade and a half. However, many of the techniques employed usually require supervised training on manually annotated examples, or other linguistic resources tied to a specific language or application domain. This limits the application of such techniques, since those resources and annotated examples are not easy to obtain. This thesis explores a series of methods for performing various automatic text analyses within the framework of sentiment analysis, including the automatic extraction of domain terms, of opinion-expressing words, of the sentiment polarity of those words (positive or negative), etc. Finally, a method is proposed and evaluated that combines continuous word embeddings and topic modelling inspired by Latent Dirichlet Allocation (LDA) to obtain an aspect-based sentiment analysis (ABSA) system that needs only a few seed words to process texts in a given language or domain. In this way, adaptation to another language or domain reduces to translating the corresponding seed words

    A sentiment analysis model to evaluate people’s opinion about artificial intelligence

    Dissertation presented as the partial requirement for obtaining a Master's degree in Data Science and Advanced Analytics. With the use of the internet, people are much more able to express and share what they think about a certain topic, their ideas and so on. The social networks Facebook and Twitter, YouTube, online review sites like Zomato, online news sites or personal blogs are platforms that are usually used for this purpose. Every business wants to know what people think about their products; many people and politicians want to know the prediction for political elections; sometimes it can be useful to understand how opinions are distributed in some controversial themes. Thus, the analysis of textual data is also a need to stay competitive. In this work, through Sentiment Analysis techniques, different opinions from different online sources regarding artificial intelligence are analyzed - a controversial field that has been a target of some debate in recent years. First, a careful review is made of the concept of Sentiment Analysis and all the involved techniques and processes such as data preprocessing, feature extraction and selection, sentiment classification approaches and machine learning algorithms – Naïve Bayes, Neural Networks, Random Forest, Support Vector Machine, Logistic Regression, Stochastic Gradient Descent. Based on previous works, the main conclusions regarding which techniques work better in which situations are highlighted. Then, the methodology followed in applying Sentiment Analysis to artificial intelligence as a controversial field is described. The auxiliary tool used for this work is Python. In the end, results are presented and discussed

    NLP methods to inform Marketing Strategy

    This thesis explores the intersection between the marketing literature and Natural Language Processing. From a methodological perspective, NLP has provided marketing scholars with new metrics and tools, like sentiment analysis and topic modelling, that enabled them to analyze textual data more in-depth and on a larger scale. One of the research areas most affected by these methodological advancements is research on Online Word Of Mouth and social media interactions in general. Within this field, we generally observe two types of research that use NLP methods: studies on the effect of social media interaction on external phenomena, like company performance, and studies about the interaction among the online actors. The three essays of this thesis follow this threefold classification (methodology, performance and online dynamics). In the first one, "The Telephone Game: the effect of online communication similarity on market performance", we study the effect of semantic similarity on market performance. To our knowledge, semantic similarity has so far been neglected in marketing. However, we argue that it is an important dimension of online communication dynamics since it can measure how much of the original brand message gets retained in consumers’ communication. This information helps consumers evaluate their fit with the brand, and hence it contributes to the effect of online communication on market performance. We find that semantic similarity positively affects market performance. The second essay, "Culture of Innovation: A Comprehensive Literature Review Using Natural Language Processing", is intended to provide a methodological contribution. We argue that, although there is no ready-to-use algorithm for literature reviews, different NLP methods can be used to assist researchers during the literature review process. Hence we try to apply them to review the literature about the culture of innovation.
Finally, the third essay, "Disentangling the "echoverse" for brand communication", is a research proposal about social media dynamics between brands and consumers. The tendency of consumers to diverge from brand content creates a constant tension for brands between the need to keep up with its consumers to keep them engaged and the need to keep control of its own narrative. To assess how the conversational content between brands and consumers changes over time and who drives the change, we will observe the shifts in topics discussed by brands and consumers across ten years of Twitter data

    Stock market sentiment lexicon acquisition using microblogging data and statistical measures

    Lexicon acquisition is a key issue for sentiment analysis. This paper presents a novel and fast approach for creating stock market lexicons. The approach is based on statistical measures applied over a vast set of labeled messages from StockTwits, which is a specialized stock market microblog. We compare three adaptations of statistical measures, such as pointwise mutual information (PMI), two new complementary statistics and the use of sentiment scores for affirmative and negated contexts. Using StockTwits, we show that the new lexicons are competitive for measuring investor sentiment when compared with six popular lexicons. We also applied a lexicon to easily produce Twitter investor sentiment indicators and analyzed their correlation with survey sentiment indexes. The new microblogging indicators have a moderate correlation with popular Investors Intelligence (II) and American Association of Individual Investors (AAII) indicators. Thus, the new microblogging approach can be used alternatively to traditional survey indicators with advantages (e.g., cheaper creation, higher frequencies). This work was supported by FCT - Fundação para a Ciência e Tecnologia within the Project Scope UID/CEC/00319/201
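As a rough illustration of how a PMI-based lexicon can be derived from labeled messages, the sketch below scores each word as PMI(word, bullish) minus PMI(word, bearish). It is a generic textbook formulation, not the paper's adapted measures: the function names `pmi` and `build_lexicon`, the label strings "bullish" and "bearish", the whitespace tokenizer, and the add-one smoothing are all assumptions made for the example.

```python
import math
from collections import Counter


def pmi(joint, word_count, label_count, total):
    """Pointwise mutual information log2(p(w, c) / (p(w) * p(c))) from counts."""
    return math.log2((joint / total) / ((word_count / total) * (label_count / total)))


def build_lexicon(messages, smoothing=1.0):
    """Score words from (text, label) pairs; label is 'bullish' or 'bearish'.

    Positive scores lean bullish, negative scores lean bearish.
    The labels and the smoothing constant are illustrative assumptions.
    """
    word_label = Counter()   # messages containing the word, per label
    word_count = Counter()   # messages containing the word, any label
    label_count = Counter()  # messages per label
    for text, label in messages:
        label_count[label] += 1
        # Count each word once per message (document-level counts).
        for token in set(text.lower().split()):
            word_count[token] += 1
            word_label[token, label] += 1

    if not label_count["bullish"] or not label_count["bearish"]:
        raise ValueError("need at least one message for each label")

    total = sum(label_count.values())
    lexicon = {}
    for token, count in word_count.items():
        # Smoothing keeps the logarithm finite when a word never
        # co-occurs with one of the labels.
        pos = pmi(word_label[token, "bullish"] + smoothing, count,
                  label_count["bullish"], total)
        neg = pmi(word_label[token, "bearish"] + smoothing, count,
                  label_count["bearish"], total)
        lexicon[token] = pos - neg
    return lexicon


if __name__ == "__main__":
    sample = [
        ("buy rally", "bullish"),
        ("strong rally", "bullish"),
        ("sell crash", "bearish"),
        ("weak crash", "bearish"),
    ]
    for word, score in sorted(build_lexicon(sample).items()):
        print(f"{word:8s} {score:+.3f}")
```

A message-level sentiment indicator could then be approximated by summing the scores of the words in a message. Handling negated contexts, which the abstract mentions, would need separate counts for negated occurrences and is left out of this sketch.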

    Computational Sociolinguistics: A Survey

    Language is a social phenomenon and variation is inherent to its social nature. Recently, there has been a surge of interest within the computational linguistics (CL) community in the social dimension of language. In this article we present a survey of the emerging field of "Computational Sociolinguistics" that reflects this increased interest. We aim to provide a comprehensive overview of CL research on sociolinguistic themes, featuring topics such as the relation between language and social identity, language use in social interaction and multilingual communication. Moreover, we demonstrate the potential for synergy between the research communities involved, by showing how the large-scale data-driven methods that are widely used in CL can complement existing sociolinguistic studies, and how sociolinguistics can inform and challenge the methods and assumptions employed in CL studies. We hope to convey the possible benefits of a closer collaboration between the two communities and conclude with a discussion of open challenges. Comment: To appear in Computational Linguistics. Accepted for publication: 18th February, 201