114 research outputs found
SPECIFIC WORD EMBEDDINGS FOR SENTIMENT ANALYSIS IN TWITTER DATA ANALYSIS
Social media sites are one of the platforms where a lot of people interact in the present, expanding world. Twitter is one popular social media platform. Tweets are shared with the general public through Twitter. Currently, significant amounts of continuous data word representation learning algorithms do not take into consider the sentimental relationships between words and instead focus only on the text's syntactic information. Specific Word Embeddings for Sentiment Analysis in Twitter Data Analysis is presented in this analysis. Weighted average word embeddings is a method that uses an adapted version of the delta Term Frequency-Inverse Document Frequency (TFIDF) a method to integrate sentiment data into continuous word representations. Sentiment Analysis experiment model makes use of a classifier described as Tailored Random Forest, which was trained the training sample. An analysis utilizes tokenization sentiment, stemming, and the removal of stop words. In this study, the development of multiplication polarity-based sentiment analysis is the main focus. In comparison to un-weighted embeddings, experiments have shown promising results. Experimental results demonstrate that described classifier gives very high predictive Accuracy, macro average Recall and Precision. Finally, they can enhance the sentiment analysis model's performance
Contextual lexicon-based sentiment analysis for social media.
Sentiment analysis concerns the computational study of opinions expressed in text. Social media domains provide a wealth of opinionated data, thus, creating a greater need for sentiment analysis. Typically, sentiment lexicons that capture term-sentiment association knowledge are commonly used to develop sentiment analysis systems. However, the nature of social media content calls for analysis methods and knowledge sources that are better able to adapt to changing vocabulary. Invariably existing sentiment lexicon knowledge cannot usefully handle social media vocabulary which is typically informal and changeable yet rich in sentiment. This, in turn, has implications on the analyser's ability to effectively capture the context therein and to interpret the sentiment polarity from the lexicons. In this thesis we use SentiWordNet, a popular sentiment-rich lexicon with a substantial vocabulary coverage and explore how to adapt it for social media sentiment analysis. Firstly, the thesis identifies a set of strategies to incorporate the effect of modifiers on sentiment-bearing terms (local context). These modifiers include: contextual valence shifters, non-lexical sentiment modifiers typical in social media and discourse structures. Secondly, the thesis introduces an approach in which a domain-specific lexicon is generated using a distant supervision method and integrated with a general-purpose lexicon, using a weighted strategy, to form a hybrid (domain-adapted) lexicon. This has the dual purpose of enriching term coverage of the general purpose lexicon with non-standard but sentiment-rich terms as well as adjusting sentiment semantics of terms. Here, we identified two term-sentiment association metrics based on Term Frequency and Inverse Document Frequency that are able to outperform the state-of-the-art Point-wise Mutual Information on social media data. As distant supervision may not be readily applicable on some social media domains, we explore the cross-domain transferability of a hybrid lexicon. Thirdly, we introduce an approach for improving distant-supervised sentiment classification with knowledge from local context analysis, domain-adapted (hybrid) and emotion lexicons. Finally, we conduct a comprehensive evaluation of all identified approaches using six sentiment-rich social media datasets
Recommended from our members
Semantic Sentiment Analysis of Microblogs
Microblogs and social media platforms are now considered among the most popular forms of online communication. Through a platform like Twitter, much information reflecting people's opinions and attitudes is published and shared among users on a daily basis. This has recently brought great opportunities to companies interested in tracking and monitoring the reputation of their brands and businesses, and to policy makers and politicians to support their assessment of public opinions about their policies or political issues.
A wide range of approaches to sentiment analysis on Twitter, and other similar microblogging platforms, have been recently built. Most of these approaches rely mainly on the presence of affect words or syntactic structures that explicitly and unambiguously reflect sentiment (e.g., "great'', "terrible''). However, these approaches are semantically weak, that is, they do not account for the semantics of words when detecting their sentiment in text. This is problematic since the sentiment of words, in many cases, is associated with their semantics, either along the context they occur within (e.g., "great'' is negative in the context "pain'') or the conceptual meaning associated with the words (e.g., "Ebola" is negative when its associated semantic concept is "Virus").
This thesis investigates the role of words' semantics in sentiment analysis of microblogs, aiming mainly at addressing the above problem. In particular, Twitter is used as a case study of microblogging platforms to investigate whether capturing the sentiment of words with respect to their semantics leads to more accurate sentiment analysis models on Twitter. To this end, several approaches are proposed in this thesis for extracting and incorporating two types of word semantics for sentiment analysis: contextual semantics (i.e., semantics captured from words' co-occurrences) and conceptual semantics (i.e., semantics extracted from external knowledge sources).
Experiments are conducted with both types of semantics by assessing their impact in three popular sentiment analysis tasks on Twitter; entity-level sentiment analysis, tweet-level sentiment analysis and context-sensitive sentiment lexicon adaptation. Evaluation under each sentiment analysis task includes several sentiment lexicons, and up to 9 Twitter datasets of different characteristics, as well as comparing against several state-of-the-art sentiment analysis approaches widely used in the literature.
The findings from this body of work demonstrate the value of using semantics in sentiment analysis on Twitter. The proposed approaches, which consider words' semantics for sentiment analysis at both, entity and tweet levels, surpass non-semantic approaches in most datasets
Learning domain-specific sentiment lexicons with applications to recommender systems
Search is now going beyond looking for factual information, and people wish to search for the opinions of others to help them in their own decision-making. Sentiment expressions or opinion expressions are used by users to express their opinion and embody important pieces of information, particularly in online commerce. The main problem that the present dissertation addresses is how to model text to find meaningful words that express a sentiment. In this context, I investigate the viability of automatically generating a sentiment lexicon for opinion retrieval and sentiment classification applications. For this research objective we propose to capture sentiment words that are derived from online users’ reviews. In this approach, we tackle a major challenge in sentiment analysis which is the detection of words that express subjective preference and domain-specific sentiment words such as jargon. To this aim we present a fully generative method that automatically learns a domain-specific lexicon and is fully independent of external sources.
Sentiment lexicons can be applied in a broad set of applications, however popular recommendation algorithms have somehow been disconnected from sentiment analysis. Therefore, we present a study that explores the viability of applying sentiment analysis techniques to infer ratings in a recommendation algorithm. Furthermore, entities’ reputation is intrinsically associated with sentiment words that have a positive or negative relation with those entities. Hence, is provided a study that observes the viability of using a domain-specific lexicon to compute entities reputation. Finally, a recommendation system algorithm is improved with the use of sentiment-based ratings and entities reputation
Twitter financial community sentiment and its predictive relationship to stock market movement
Twitter, one of the several major social media platforms, has been identified as an influential factor for financial markets by multiple academic and professional publications in recent years. The motivation of this study hinges on the growing popularity of the use of Twitter and the increasing prevalence of its influence among the financial investment community. This paper presents empirical evidence of the existence of a financial community on Twitter in which users’ interests align with financial market-related topics. We establish a methodology to identify relevant Twitter users who form the financial community, and we also present the empirical findings of network characteristics of the financial community. We observe that this financial community behaves similarly to a small-world network, and we further identify groups of critical nodes and analyse their influence within the financial community based on several network centrality measures. Using a novel sentiment analysis algorithm, we construct a weighted sentiment measure using tweet messages from these critical nodes, and we discover that it is significantly correlated with the returns of the major financial market indices. By forming a financial community within the Twitter universe, we argue that the influential Twitter users within the financial community provide a proxy for the relationship between social sentiment and financial market movement. Hence, we conclude that the weighted sentiment constructed from these critical nodes within the financial community provides a more robust predictor of financial markets than the general social sentiment
Real-Time Topic and Sentiment Analysis in Human-Robot Conversation
Socially interactive robots, especially those designed for entertainment and companionship, must be able to hold conversations with users that feel natural and engaging for humans. Two important components of such conversations include adherence to the topic of conversation and inclusion of affective expressions. Most previous approaches have concentrated on topic detection or sentiment analysis alone, and approaches that attempt to address both are limited by domain and by type of reply. This thesis presents a new approach, implemented on a humanoid robot interface, that detects the topic and sentiment of a user’s utterances from text-transcribed speech. It also generates domain-independent, topically relevant verbal replies and appropriate positive and negative emotional expressions in real time. The front end of the system is a smartphone app that functions as the robot’s face. It displays emotionally expressive eyes, transcribes verbal input as text, and synthesizes spoken replies. The back end of the system is implemented on the robot’s onboard computer. It connects with the app via Bluetooth, receives and processes the transcribed input, and returns verbal replies and sentiment scores. The back end consists of a topic-detection subsystem and a sentiment-analysis subsystem. The topic-detection subsystem uses a Latent Semantic Indexing model of a conversation corpus, followed by a search in the online database ConceptNet 5, in order to generate a topically relevant reply. The sentiment-analysis subsystem disambiguates the input words, obtains their sentiment scores from SentiWordNet, and returns the averaged sum of the scores as the overall sentiment score. The system was hypothesized to engage users more with both subsystems working together than either subsystem alone, and each subsystem alone was hypothesized to engage users more than a random control. In computational evaluations, each subsystem performed weakly but positively. In user evaluations, users reported a higher level of topical relevance and emotional appropriateness in conversations in which the subsystems were working together, and they reported higher engagement especially in conversations in which the topic-detection system was working. It is concluded that the system partially fulfills its goals, and suggestions for future work are presented
- …