114 research outputs found


    Social media sites are among the platforms where large numbers of people interact today. Twitter is one popular social media platform, through which tweets are shared with the general public. Many current continuous word representation learning algorithms do not take into consideration the sentiment relationships between words and instead focus only on the text's syntactic information. This analysis presents sentiment-specific word embeddings for sentiment analysis of Twitter data. Weighted average word embeddings is a method that uses an adapted version of delta Term Frequency-Inverse Document Frequency (TF-IDF) to integrate sentiment information into continuous word representations. The sentiment analysis model uses a classifier described as a Tailored Random Forest, which was trained on the training sample. The analysis uses tokenization, stemming, and the removal of stop words. The main focus of this study is the development of multiplication polarity-based sentiment analysis. Compared with unweighted embeddings, the weighted embeddings show promising results in experiments. Experimental results demonstrate that the described classifier achieves very high predictive accuracy, macro-average recall, and precision. Finally, these embeddings can enhance the sentiment analysis model's performance.
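
The weighting idea in the abstract above can be illustrated with a minimal sketch. The abstract does not give the adapted delta TF-IDF formula, so the smoothing, the sign convention (positive weight means the term appears in more positive documents), the use of the absolute weight in the average, and the names `delta_idf` and `weighted_average_embedding` are all assumptions made for illustration, not the authors' method.

```python
import math
from collections import Counter


def delta_idf(pos_docs, neg_docs):
    """Smoothed delta-IDF weight per term (assumed formulation).

    Each document is a list of tokens. The weight is the log2 ratio of the
    smoothed document-frequency rates in the positive and negative sets.
    """
    pos_df = Counter(t for doc in pos_docs for t in set(doc))
    neg_df = Counter(t for doc in neg_docs for t in set(doc))
    n_pos, n_neg = len(pos_docs), len(neg_docs)
    weights = {}
    for term in set(pos_df) | set(neg_df):
        pos_rate = (pos_df[term] + 1) / (n_pos + 1)  # add-one smoothing
        neg_rate = (neg_df[term] + 1) / (n_neg + 1)
        weights[term] = math.log2(pos_rate / neg_rate)
    return weights


def weighted_average_embedding(tokens, embeddings, weights):
    """Average the word vectors of `tokens`, weighted by count * |delta-IDF|.

    Terms without an embedding are skipped; a zero vector is returned when
    no term carries any weight.
    """
    dim = len(next(iter(embeddings.values())))
    total = [0.0] * dim
    norm = 0.0
    for term, count in Counter(tokens).items():
        if term not in embeddings:
            continue
        w = count * abs(weights.get(term, 0.0))
        norm += w
        for i, x in enumerate(embeddings[term]):
            total[i] += w * x
    if norm == 0.0:
        return total
    return [x / norm for x in total]
```

With this weighting, terms that occur equally often in both classes get a weight of zero and drop out of the tweet vector, so sentiment-bearing terms dominate the representation passed to the classifier.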

    Contextual lexicon-based sentiment analysis for social media.

    Sentiment analysis concerns the computational study of opinions expressed in text. Social media domains provide a wealth of opinionated data, creating a greater need for sentiment analysis. Sentiment lexicons that capture term-sentiment association knowledge are commonly used to develop sentiment analysis systems. However, the nature of social media content calls for analysis methods and knowledge sources that are better able to adapt to changing vocabulary. Invariably, existing sentiment lexicon knowledge cannot usefully handle social media vocabulary, which is typically informal and changeable yet rich in sentiment. This, in turn, has implications for the analyser's ability to capture the context therein and to interpret sentiment polarity from the lexicons. In this thesis we use SentiWordNet, a popular sentiment-rich lexicon with substantial vocabulary coverage, and explore how to adapt it for social media sentiment analysis. Firstly, the thesis identifies a set of strategies to incorporate the effect of modifiers on sentiment-bearing terms (local context). These modifiers include contextual valence shifters, non-lexical sentiment modifiers typical in social media, and discourse structures. Secondly, the thesis introduces an approach in which a domain-specific lexicon is generated using a distant supervision method and integrated with a general-purpose lexicon, using a weighted strategy, to form a hybrid (domain-adapted) lexicon. This has the dual purpose of enriching the term coverage of the general-purpose lexicon with non-standard but sentiment-rich terms and of adjusting the sentiment semantics of terms. Here, we identify two term-sentiment association metrics based on Term Frequency and Inverse Document Frequency that outperform the state-of-the-art Point-wise Mutual Information on social media data. As distant supervision may not be readily applicable in some social media domains, we explore the cross-domain transferability of a hybrid lexicon. Thirdly, we introduce an approach for improving distant-supervised sentiment classification with knowledge from local context analysis, domain-adapted (hybrid) lexicons, and emotion lexicons. Finally, we conduct a comprehensive evaluation of all identified approaches using six sentiment-rich social media datasets.
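
The two ideas in the abstract above, merging a domain-specific lexicon into a general-purpose one and adjusting term scores for local context, might be sketched as follows. The thesis's actual weighting scheme and modifier rules are not given in the abstract; the mixing weight `alpha`, the word lists, the multipliers, and the function names are hypothetical choices for illustration only.

```python
# Hypothetical modifier lists; a real system would use far richer resources.
NEGATORS = {"not", "never", "no"}
INTENSIFIERS = {"very": 1.5, "really": 1.5, "slightly": 0.5}


def build_hybrid_lexicon(general, domain, alpha=0.5):
    """Combine two {term: score} lexicons into one.

    Terms in both lexicons get alpha * domain + (1 - alpha) * general;
    terms in only one lexicon keep their score, which extends coverage.
    """
    hybrid = {}
    for term in set(general) | set(domain):
        if term in general and term in domain:
            hybrid[term] = alpha * domain[term] + (1 - alpha) * general[term]
        else:
            hybrid[term] = domain.get(term, general.get(term))
    return hybrid


def score_text(tokens, lexicon):
    """Sum lexicon scores, applying simple valence shifters.

    An intensifier directly before a term scales its score; a negator
    directly before the term (or before its intensifier) flips the sign.
    """
    total = 0.0
    for i, token in enumerate(tokens):
        if token not in lexicon:
            continue
        score = lexicon[token]
        prev = tokens[i - 1] if i > 0 else None
        if prev in INTENSIFIERS:
            score *= INTENSIFIERS[prev]
            prev = tokens[i - 2] if i > 1 else None
        if prev in NEGATORS:
            score = -score
        total += score
    return total
```

In this sketch a slang term such as "sick", negative in a general lexicon but positive in a domain lexicon, ends up with a blended score, while a domain-only term is simply added to the vocabulary.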

    Learning domain-specific sentiment lexicons with applications to recommender systems

    Search is now going beyond looking for factual information, and people wish to search for the opinions of others to help them in their own decision-making. Users express their opinions through sentiment expressions, or opinion expressions, which embody important pieces of information, particularly in online commerce. The main problem that the present dissertation addresses is how to model text to find meaningful words that express a sentiment. In this context, I investigate the viability of automatically generating a sentiment lexicon for opinion retrieval and sentiment classification applications. For this research objective we propose to capture sentiment words that are derived from online users’ reviews. In this approach, we tackle a major challenge in sentiment analysis, which is the detection of words that express subjective preference and of domain-specific sentiment words such as jargon. To this end we present a fully generative method that automatically learns a domain-specific lexicon and is fully independent of external sources. Sentiment lexicons can be applied in a broad set of applications; however, popular recommendation algorithms have been largely disconnected from sentiment analysis. Therefore, we present a study that explores the viability of applying sentiment analysis techniques to infer ratings in a recommendation algorithm. Furthermore, entities’ reputation is intrinsically associated with sentiment words that have a positive or negative relation with those entities. Hence, a study is provided that examines the viability of using a domain-specific lexicon to compute entities’ reputation. Finally, a recommendation system algorithm is improved with the use of sentiment-based ratings and entities’ reputation.

    Twitter financial community sentiment and its predictive relationship to stock market movement

    Twitter, one of several major social media platforms, has been identified as an influential factor for financial markets by multiple academic and professional publications in recent years. The motivation of this study hinges on the growing popularity of Twitter and the increasing prevalence of its influence among the financial investment community. This paper presents empirical evidence of the existence of a financial community on Twitter in which users’ interests align with financial market-related topics. We establish a methodology to identify relevant Twitter users who form the financial community, and we also present the empirical findings of network characteristics of the financial community. We observe that this financial community behaves similarly to a small-world network, and we further identify groups of critical nodes and analyse their influence within the financial community based on several network centrality measures. Using a novel sentiment analysis algorithm, we construct a weighted sentiment measure using tweet messages from these critical nodes, and we discover that it is significantly correlated with the returns of the major financial market indices. Having identified a financial community within the Twitter universe, we argue that the influential Twitter users within it provide a proxy for the relationship between social sentiment and financial market movement. Hence, we conclude that the weighted sentiment constructed from these critical nodes within the financial community provides a more robust predictor of financial markets than the general social sentiment.
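
The pipeline described in the abstract above (centrality of users, a centrality-weighted sentiment measure, correlation with index returns) can be sketched in a few lines. The paper's sentiment algorithm and its choice of centrality measures are not specified in the abstract; degree centrality, the weighting by centrality, and all function names below are assumptions used only to make the idea concrete.

```python
import math
from collections import defaultdict


def degree_centrality(edges):
    """Degree centrality from an undirected edge list: degree / (n - 1)."""
    neighbours = defaultdict(set)
    for u, v in edges:
        if u != v:  # ignore self-loops
            neighbours[u].add(v)
            neighbours[v].add(u)
    n = len(neighbours)
    if n < 2:
        return {node: 0.0 for node in neighbours}
    return {node: len(adj) / (n - 1) for node, adj in neighbours.items()}


def weighted_sentiment(user_sentiment, centrality):
    """Centrality-weighted mean of per-user sentiment scores."""
    num = den = 0.0
    for user, score in user_sentiment.items():
        w = centrality.get(user, 0.0)
        num += w * score
        den += w
    return num / den if den else 0.0


def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sx = math.sqrt(sum((x - mx) ** 2 for x in xs))
    sy = math.sqrt(sum((y - my) ** 2 for y in ys))
    if sx == 0 or sy == 0:
        raise ValueError("correlation is undefined for a constant series")
    return cov / (sx * sy)
```

In use, `weighted_sentiment` would be computed once per day from the critical nodes' tweets, and the resulting daily series passed to `pearson` together with the daily index returns.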

    Real-Time Topic and Sentiment Analysis in Human-Robot Conversation

    Socially interactive robots, especially those designed for entertainment and companionship, must be able to hold conversations with users that feel natural and engaging for humans. Two important components of such conversations are adherence to the topic of conversation and inclusion of affective expressions. Most previous approaches have concentrated on topic detection or sentiment analysis alone, and approaches that attempt to address both are limited by domain and by type of reply. This thesis presents a new approach, implemented on a humanoid robot interface, that detects the topic and sentiment of a user’s utterances from text-transcribed speech. It also generates domain-independent, topically relevant verbal replies and appropriate positive and negative emotional expressions in real time. The front end of the system is a smartphone app that functions as the robot’s face. It displays emotionally expressive eyes, transcribes verbal input as text, and synthesizes spoken replies. The back end of the system is implemented on the robot’s onboard computer. It connects with the app via Bluetooth, receives and processes the transcribed input, and returns verbal replies and sentiment scores. The back end consists of a topic-detection subsystem and a sentiment-analysis subsystem. The topic-detection subsystem uses a Latent Semantic Indexing model of a conversation corpus, followed by a search in the online database ConceptNet 5, in order to generate a topically relevant reply. The sentiment-analysis subsystem disambiguates the input words, obtains their sentiment scores from SentiWordNet, and returns the averaged sum of the scores as the overall sentiment score. The system was hypothesized to engage users more with both subsystems working together than either subsystem alone, and each subsystem alone was hypothesized to engage users more than a random control. In computational evaluations, each subsystem performed weakly but positively. In user evaluations, users reported a higher level of topical relevance and emotional appropriateness in conversations in which the subsystems were working together, and they reported higher engagement especially in conversations in which the topic-detection system was working. It is concluded that the system partially fulfills its goals, and suggestions for future work are presented.
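
The sentiment-analysis subsystem described above (disambiguate each word, look up its score, average the scores) can be sketched as follows. This is not the thesis's implementation: the tiny lexicon stands in for SentiWordNet, its scores are made-up illustrative values, and `disambiguate` is a hypothetical placeholder for a real word-sense disambiguation step.

```python
# Toy stand-in for SentiWordNet: sense key -> (positive, negative) scores.
# The values are invented for illustration, not taken from SentiWordNet.
TOY_SENTI_LEXICON = {
    "happy.a.01": (0.875, 0.0),
    "sad.a.01": (0.125, 0.75),
    "robot.n.01": (0.0, 0.0),
}

# Hypothetical one-sense-per-word table used by the placeholder below.
TOY_SENSES = {"happy": "happy.a.01", "sad": "sad.a.01", "robot": "robot.n.01"}


def disambiguate(word):
    """Placeholder word-sense disambiguation: return a sense key or None."""
    return TOY_SENSES.get(word)


def utterance_sentiment(tokens):
    """Average of (positive - negative) over the tokens found in the lexicon.

    Returns 0.0 when no token can be scored.
    """
    scores = []
    for token in tokens:
        sense = disambiguate(token.lower())
        if sense is None:
            continue
        pos, neg = TOY_SENTI_LEXICON[sense]
        scores.append(pos - neg)
    return sum(scores) / len(scores) if scores else 0.0
```

Averaging only over tokens that have a lexicon entry is a design choice of this sketch; dividing by the full utterance length instead would pull scores for long, mostly neutral utterances toward zero.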