1,094 research outputs found
Recommended from our members
Semantic Sentiment Analysis of Microblogs
Microblogs and social media platforms are now considered among the most popular forms of online communication. Through a platform like Twitter, much information reflecting people's opinions and attitudes is published and shared among users on a daily basis. This has recently brought great opportunities to companies interested in tracking and monitoring the reputation of their brands and businesses, and to policy makers and politicians to support their assessment of public opinions about their policies or political issues.
A wide range of approaches to sentiment analysis on Twitter, and other similar microblogging platforms, have been recently built. Most of these approaches rely mainly on the presence of affect words or syntactic structures that explicitly and unambiguously reflect sentiment (e.g., "great'', "terrible''). However, these approaches are semantically weak, that is, they do not account for the semantics of words when detecting their sentiment in text. This is problematic since the sentiment of words, in many cases, is associated with their semantics, either along the context they occur within (e.g., "great'' is negative in the context "pain'') or the conceptual meaning associated with the words (e.g., "Ebola" is negative when its associated semantic concept is "Virus").
This thesis investigates the role of words' semantics in sentiment analysis of microblogs, aiming mainly at addressing the above problem. In particular, Twitter is used as a case study of microblogging platforms to investigate whether capturing the sentiment of words with respect to their semantics leads to more accurate sentiment analysis models on Twitter. To this end, several approaches are proposed in this thesis for extracting and incorporating two types of word semantics for sentiment analysis: contextual semantics (i.e., semantics captured from words' co-occurrences) and conceptual semantics (i.e., semantics extracted from external knowledge sources).
Experiments are conducted with both types of semantics by assessing their impact in three popular sentiment analysis tasks on Twitter; entity-level sentiment analysis, tweet-level sentiment analysis and context-sensitive sentiment lexicon adaptation. Evaluation under each sentiment analysis task includes several sentiment lexicons, and up to 9 Twitter datasets of different characteristics, as well as comparing against several state-of-the-art sentiment analysis approaches widely used in the literature.
The findings from this body of work demonstrate the value of using semantics in sentiment analysis on Twitter. The proposed approaches, which consider words' semantics for sentiment analysis at both, entity and tweet levels, surpass non-semantic approaches in most datasets
A lexicon based method to search for extreme opinions
Studies in sentiment analysis and opinion mining have been focused on many aspects related to opinions, namely polarity classification by making use of positive, negative or neutral values. However, most studies have overlooked the identification of extreme opinions (most negative and most positive opinions) in spite of their vast significance in many applications. We use an unsupervised approach to search for extreme opinions, which is based on the automatic construction of a new lexicon containing the most negative and most positive wordsS
Learning domain-specific sentiment lexicons with applications to recommender systems
Search is now going beyond looking for factual information, and people wish to search for the opinions of others to help them in their own decision-making. Sentiment expressions or opinion expressions are used by users to express their opinion and embody important pieces of information, particularly in online commerce. The main problem that the present dissertation addresses is how to model text to find meaningful words that express a sentiment. In this context, I investigate the viability of automatically generating a sentiment lexicon for opinion retrieval and sentiment classification applications. For this research objective we propose to capture sentiment words that are derived from online users’ reviews. In this approach, we tackle a major challenge in sentiment analysis which is the detection of words that express subjective preference and domain-specific sentiment words such as jargon. To this aim we present a fully generative method that automatically learns a domain-specific lexicon and is fully independent of external sources.
Sentiment lexicons can be applied in a broad set of applications, however popular recommendation algorithms have somehow been disconnected from sentiment analysis. Therefore, we present a study that explores the viability of applying sentiment analysis techniques to infer ratings in a recommendation algorithm. Furthermore, entities’ reputation is intrinsically associated with sentiment words that have a positive or negative relation with those entities. Hence, is provided a study that observes the viability of using a domain-specific lexicon to compute entities reputation. Finally, a recommendation system algorithm is improved with the use of sentiment-based ratings and entities reputation
Multilingual opinion mining
170 p.Cada dÃa se genera gran cantidad de texto en diferentes medios online. Gran parte de ese texto contiene opiniones acerca de multitud de entidades, productos, servicios, etc. Dada la creciente necesidad de disponer de medios automatizados para analizar, procesar y explotar esa información, las técnicas de análisis de sentimiento han recibido gran cantidad de atención por parte de la industria y la comunidad cientÃfica durante la última década y media. No obstante, muchas de las técnicas empleadas suelen requerir de entrenamiento supervisado utilizando para ello ejemplos anotados manualmente, u otros recursos lingüÃsticos relacionados con un idioma o dominio de aplicación especÃficos. Esto limita la aplicación de este tipo de técnicas, ya que dicho recursos y ejemplos anotados no son sencillos de obtener. En esta tesis se explora una serie de métodos para realizar diversos análisis automáticos de texto en el marco del análisis de sentimiento, incluyendo la obtención automática de términos de un dominio, palabras que expresan opinión, polaridad del sentimiento de dichas palabras (positivas o negativas), etc. Finalmente se propone y se evalúa un método que combina representación continua de palabras (continuous word embeddings) y topic-modelling inspirado en la técnica de Latent Dirichlet Allocation (LDA), para obtener un sistema de análisis de sentimiento basado en aspectos (ABSA), que sólo necesita unas pocas palabras semilla para procesar textos de un idioma o dominio determinados. De este modo, la adaptación a otro idioma o dominio se reduce a la traducción de las palabras semilla correspondientes
A sentiment analysis model to evaluate people’s opinion about artificial intelligence
Dissertation presented as the partial requirement for obtaining a Master's degree in Data Science and Advanced AnalyticsWith the use of internet, people are much more able to express and share what they think about a certain topic, their ideas and so on. Facebook and Twitter social networks, YouTube, online review sites like Zomato, online news sites or personal blogs are platforms that are usually used for this purpose. Every business wants to know what people think about their products; many people and politicians want to know the prediction for political elections; sometimes it can be useful to understand how opinions are distributed in some controversial themes. Thus, the analysis of textual data is also a need to stay competitive.
In this work, through Sentiment Analysis techniques, different opinions from different online sources regarding to artificial intelligence are analyzed - a controversial field that have been a target of some debate in recent years.
First, it is done a careful revision of the concept of Sentiment Analysis and all the involved techniques and processes such as data preprocessing, feature extraction and selection, sentiment classification approaches and machine learning algorithms – Naïve Bayes, Neural Networks, Random Forest, Support Vector Machine, Logistic Regression, Stochastic Gradient Descent. Based on previous works, the main conclusions, regarding to which techniques work better in which situations, are highlighted. Then, it is described the followed methodology in the application of Sentiment Analysis to artificial intelligence as a controversial field. The auxiliary tool used for this work is Python. In the end, results are presented and discussed
NLP methods to inform Marketing Strategy
This thesis explores the intersection between the marketing literature and Natural Language Processing. From a methodological perspective, NLP has provided marketing scholars with new metrics and tools, like sentiment analysis and topic modelling, that enabled
them to analyze textual data more in-depth and on a larger scale. One of the research
areas most affected by these methodological advancements is research on Online Word
Of Mouth and social media interactions in general. Within this field, we generally observe two types of research that use NLP methods: studies on the effect of social media
interaction on external phenomena, like company performance, and studies about the interaction among the online actors. The three essays of this thesis follow this three folded
classification (methodology, performance and online dynamics).
In the first one, "The Telephone Game: the effect of online communication similarity on
market performance", we study the effect of semantic similarity on market performance.
Semantic similarity has been so far neglected in marketing to our knowledge. However,
we argue that it is an important dimension of online communication dynamics since it
can measure how much of the original brand message gets retained in consumers’ communication. This information helps consumers evaluate their fit with the brand, and hence
it contributes to the effect of online communication on market performance. We find that
semantic similarity positively affects market performance.
The second essay, "Culture of Innovation: A Comprehensive Literature Review Using
Natural Language Processing", is intended to provide a methodological contribution. We argue that, although there is no ready-to-use algorithm for literature reviews, different
NLP methods can be used to assist researchers during the literature review process. Hence
we try to apply them to review the literature about the culture of innovation.
Finally, the third essay, "Disentangling the "echoverse" for brand communication", is a
research proposal about social media dynamics between brands and consumers. The tendency of consumers to diverge from brand content creates a constant tension for brands
between the need to keep up with its consumers to keep them engaged and the need
to keep control of its own narrative. To assess how the conversational content between
brands and consumers changes over time and who drives the change, we will observe the
shifts in topics discussed by brands and consumers across ten years of Twitter data
Stock market sentiment lexicon acquisition using microblogging data and statistical measures
Lexicon acquisition is a key issue for sentiment analysis. This paper presents a novel and fast approach for creating stock market lexicons. The approach is based on statistical measures applied over a vast set of labeled messages from StockTwits, which is a specialized stock market microblog. We compare three adaptations of statistical measures, such as pointwise mutual information (PMI), two new complementary statistics and the use of sentiment scores for affirmative and negated con- texts. Using StockTwits, we show that the new lexicons are competitive for measuring investor sentiment when compared with six popular lexicons. We also applied a lexicon to easily produce Twitter investor sentiment indicators and analyzed their correlation with survey sentiment indexes. The new microblogging indicators have a moderate correlation with popular Investors Intelligence (II) and American Association of Individual Investors (AAII) indicators. Thus, the new microblogging approach can be used alternatively to traditional survey indicators with advantages (e.g., cheaper creation, higher frequencies).This work was supported by FCT - Funda ção para a Ciência e Tecnologia within the Project Scope UID/CEC/00319/201
Computational Sociolinguistics: A Survey
Language is a social phenomenon and variation is inherent to its social
nature. Recently, there has been a surge of interest within the computational
linguistics (CL) community in the social dimension of language. In this article
we present a survey of the emerging field of "Computational Sociolinguistics"
that reflects this increased interest. We aim to provide a comprehensive
overview of CL research on sociolinguistic themes, featuring topics such as the
relation between language and social identity, language use in social
interaction and multilingual communication. Moreover, we demonstrate the
potential for synergy between the research communities involved, by showing how
the large-scale data-driven methods that are widely used in CL can complement
existing sociolinguistic studies, and how sociolinguistics can inform and
challenge the methods and assumptions employed in CL studies. We hope to convey
the possible benefits of a closer collaboration between the two communities and
conclude with a discussion of open challenges.Comment: To appear in Computational Linguistics. Accepted for publication:
18th February, 201
- …