    A new ANEW: Evaluation of a word list for sentiment analysis in microblogs

    Sentiment analysis of microblogs such as Twitter has recently gained a fair amount of attention. One of the simplest sentiment analysis approaches compares the words of a posting against a labeled word list in which each word has been scored for valence: a 'sentiment lexicon' or 'affective word list'. Several affective word lists exist, e.g., ANEW (Affective Norms for English Words), which was developed before the advent of microblogging and sentiment analysis. I wanted to examine how well ANEW and other word lists perform for the detection of sentiment strength in microblog posts in comparison with a new word list specifically constructed for microblogs. I used manually labeled postings from Twitter scored for sentiment. Using simple word matching, I show that the new word list may perform better than ANEW, though not as well as the more elaborate approach found in SentiStrength. Comment: 6 pages, 4 figures, 1 table. Submitted to "Making Sense of Microposts (#MSM2011)".
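
The word-matching approach described in this abstract can be sketched in a few lines. This is a minimal illustration, not the author's implementation: the lexicon entries, their scores, and the names `TOY_LEXICON`, `tokenize`, and `score_post` are all assumptions made up for the example, and the tokenizer is deliberately crude.

```python
import re

# Hypothetical toy lexicon: the words and integer valence scores are invented
# for illustration and are not taken from ANEW or from the paper's word list.
TOY_LEXICON = {"good": 3, "great": 3, "bad": -3, "awful": -3, "meh": -1}


def tokenize(text):
    """Lowercase the text and split it into alphabetic tokens."""
    return re.findall(r"[a-z']+", text.lower())


def score_post(text, lexicon=TOY_LEXICON):
    """Sum the valence of every token found in the lexicon.

    Tokens that are not in the lexicon contribute 0.
    """
    return sum(lexicon.get(token, 0) for token in tokenize(text))
```

A post with one positive and one negative match cancels out, which is one reason plain summation is only a baseline: `score_post("Great day, not bad at all")` adds `great` and `bad` and ignores the negation entirely.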

    Sarcasm Detection and User Behaviour Analysis

    Sarcasm is a form of sentiment in which people express negative emotions using positive words. It is difficult even for humans to recognize. This motivates our interest in detecting sarcasm in social media text, particularly in tweets. In this paper we propose a new pattern-based approach for sarcasm detection, and we also use a behavioral modeling approach that analyzes the content of tweets while jointly exploiting the behavioral traits of users derived from their past activities. We propose several feature sets for detecting sarcasm in tweets: sentiment-related features, punctuation-related features, syntactic and semantic features, and pattern-related features. We also develop the behavioral modeling approach to assess user emotion and sentiment. We evaluate accuracy and performance using several classifiers: TREE, Support Vector Machine (SVM), BOOST, and Maximum Entropy. Our proposed approach reaches an accuracy of 94%.

    Latent Dirichlet Markov allocation for sentiment analysis

    In recent years probabilistic topic models have gained tremendous attention in data mining and natural language processing research. In information retrieval and text mining, a variety of probabilistic topic models have been used to analyse the content of documents. A topic model is a generative model for documents; it specifies a probabilistic procedure by which documents can be generated. All topic models share the idea that documents are mixtures of topics, where a topic is a probability distribution over words. In this paper we describe the Latent Dirichlet Markov Allocation Model (LDMA), a new generative probabilistic topic model based on Latent Dirichlet Allocation (LDA) and the Hidden Markov Model (HMM), which emphasizes extracting multi-word topics from text data. LDMA is a four-level hierarchical Bayesian model in which topics are associated with documents, words are associated with topics, and topics in the model can be presented with single- or multi-word terms. To evaluate the performance of LDMA, we report results on aspect detection in sentiment analysis, compared with the basic LDA model.
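
The idea that a document is a mixture of topics, and a topic a distribution over words, can be made concrete with a small generative sketch. This follows the basic LDA-style procedure only; it does not implement LDMA or its HMM component. The topics, their word probabilities, and the function names are assumptions chosen for illustration.

```python
import random

# Hypothetical topics: each maps words to probabilities that sum to 1.
TOPICS = [
    {"battery": 0.5, "screen": 0.3, "price": 0.2},
    {"service": 0.6, "staff": 0.3, "price": 0.1},
]


def sample_dirichlet(alpha, rng):
    """Draw from a Dirichlet distribution by normalizing Gamma draws."""
    draws = [rng.gammavariate(a, 1.0) for a in alpha]
    total = sum(draws)
    return [d / total for d in draws]


def generate_document(topics, alpha, n_words, rng):
    """Generate one document as a mixture of topics.

    1. Draw the document's topic proportions theta from Dirichlet(alpha).
    2. For each word position, draw a topic from theta,
       then draw a word from that topic's word distribution.
    """
    theta = sample_dirichlet(alpha, rng)
    words = []
    for _ in range(n_words):
        topic_index = rng.choices(range(len(topics)), weights=theta)[0]
        vocab, probs = zip(*topics[topic_index].items())
        words.append(rng.choices(vocab, weights=probs)[0])
    return theta, words
```

Calling `generate_document(TOPICS, [0.5, 0.5], 8, random.Random(0))` returns the sampled topic proportions together with eight words. Fitting a topic model runs this procedure in reverse, inferring the topics and proportions from observed words.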

    The role of approximate negators in modeling the automatic detection of negation in tweets

    Although improvements have been made in the performance of sentiment analysis tools, the automatic detection of negated text (which affects negative sentiment prediction) still presents challenges. More research is needed on new forms of negation beyond prototypical negation cues such as “not” or “never.” The present research reports findings on the role of a set of words called “approximate negators,” namely “barely,” “hardly,” “rarely,” “scarcely,” and “seldom,” which, on specific occasions (such as when attached to a word from the non-affirmative adverb “any” family), can operationalize negation styles not yet explored. Using a corpus of 6,500 tweets, human annotation allowed for the identification of 17 recurrent usages of these words as negatives (such as “very seldom”) which, along with findings from the literature, helped engineer specific features that guided a machine learning classifier in predicting negated tweets. The machine learning experiments also modeled negation scope (i.e., which specific words are negated in the text) by employing lexical and dependency graph information. Promising results included F1 values for negation detection ranging from 0.71 to 0.89 and for scope detection from 0.79 to 0.88. Future work will be directed to the application of these findings in automatic sentiment classification, further exploration of patterns in data (such as part-of-speech recurrences for these new types of negation), and the investigation of sarcasm, formal language, and exaggeration as themes that emerged from observations during corpus annotation.
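
As a rough illustration of the first step such a system needs, the five approximate negators named in this abstract can be located in a tweet by simple cue matching. This sketch only finds candidate cues; it does not decide whether a cue acts as a negation, nor does it model scope, both of which the paper handles with a trained classifier. The function name and tokenizer are assumptions.

```python
import re

# The five approximate negators listed in the abstract.
APPROXIMATE_NEGATORS = {"barely", "hardly", "rarely", "scarcely", "seldom"}


def find_approximate_negators(text):
    """Return (token_index, token) pairs for each approximate negator found."""
    tokens = re.findall(r"[a-z']+", text.lower())
    return [
        (index, token)
        for index, token in enumerate(tokens)
        if token in APPROXIMATE_NEGATORS
    ]
```

The token positions returned here could feed later features, for example whether the next token belongs to the “any” family.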

    Joint Aspect and Polarity Classification for Aspect-based Sentiment Analysis with End-to-End Neural Networks

    In this work, we propose a new model for aspect-based sentiment analysis. In contrast to previous approaches, we jointly model the detection of aspects and the classification of their polarity in an end-to-end trainable neural network. We conduct experiments with different neural architectures and word representations on the recent GermEval 2017 dataset. We were able to show considerable performance gains by using the joint modeling approach in all settings compared to pipeline approaches. The combination of a convolutional neural network and fasttext embeddings outperformed the best submission of the shared task in 2017, establishing a new state of the art. Comment: EMNLP 201

    Opinion Mining on Non-English Short Text

    As the type and the number of such venues increase, automated analysis of sentiment on textual resources has become an essential data mining task. In this paper, we investigate the problem of mining opinions on a collection of informal short texts. Both the positive and the negative sentiment strength of texts are detected. We focus on a non-English language that has few resources for text mining. This approach would help enhance sentiment analysis in languages where a list of opinionated words does not exist. We propose a new method that projects the text into dense, low-dimensional feature vectors according to the sentiment strength of the words. We detect the mixture of positive and negative sentiments on a multi-variant scale. Empirical evaluation of the proposed framework on Turkish tweets shows that our approach gets good results for opinion mining.