
    The role of idioms in sentiment analysis

    In this paper we investigate the role of idioms in automated approaches to sentiment analysis. To estimate the degree to which the inclusion of idioms as features may improve the results of traditional sentiment analysis, we compared our results to two such methods. First, to support idioms as features, we collected a set of 580 idioms that are relevant to sentiment analysis, i.e. the ones that can be mapped to an emotion. These mappings were obtained using a web-based crowdsourcing approach. The quality of the crowdsourced information is demonstrated by the agreement among five independent annotators, calculated using Krippendorff's alpha coefficient (α = 0.662). Second, to evaluate the results of sentiment analysis, we assembled a corpus of sentences in which idioms are used in context. Each sentence was annotated with an emotion, which formed the basis for the gold standard used for the comparison against two baseline methods. The performance was evaluated in terms of three measures: precision, recall and F-measure. Overall, our approach achieved 64% and 61% on these three measures in the two experiments, improving on the baseline results by 20 and 15 percentage points, respectively. F-measure was significantly improved over all three sentiment polarity classes: Positive, Negative and Other. The most notable improvement was recorded in the classification of positive sentiments, where recall was improved by 45 percentage points in both experiments without compromising precision. The statistical significance of these improvements was confirmed by McNemar's test.
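As a rough illustration of how McNemar's test compares two classifiers against the same gold standard, the sketch below counts the sentences on which exactly one system is correct and computes an exact two-sided p-value from those discordant pairs. The abstract does not say which variant of the test was used, so the exact binomial form, the function name `mcnemar_exact` and the toy labels are all assumptions made for illustration.

```python
from math import comb


def mcnemar_exact(gold, pred_a, pred_b):
    """Exact two-sided McNemar test on paired classifier outputs.

    Returns (b, c, p_value), where b counts items only system A gets
    right and c counts items only system B gets right.
    """
    b = sum(1 for g, x, y in zip(gold, pred_a, pred_b) if x == g and y != g)
    c = sum(1 for g, x, y in zip(gold, pred_a, pred_b) if x != g and y == g)
    n = b + c
    if n == 0:
        # No discordant pairs: the two systems are indistinguishable here.
        return b, c, 1.0
    k = min(b, c)
    # Under the null hypothesis, discordant pairs split 50/50 (Binomial(n, 0.5)).
    tail = sum(comb(n, i) for i in range(k + 1)) / 2 ** n
    return b, c, min(1.0, 2 * tail)


# Toy data (hypothetical): the baseline misses six positives that the
# idiom-aware system recovers; both agree on the remaining four.
gold = ["Positive"] * 10
baseline = ["Negative"] * 6 + ["Positive"] * 4
with_idioms = ["Positive"] * 10
print(mcnemar_exact(gold, baseline, with_idioms))
```

Only the discordant pairs enter the statistic, which is why the test suits paired comparisons on a shared test set.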

    A Bayesian-Based Approach for Public Sentiment Modeling

    Public sentiment is a direct, public-centric indicator of the success of action planning. Despite its importance, systematic modeling of public sentiment has not been addressed in previous studies. This research aims to develop a Bayesian-based approach for quantitative public sentiment modeling, which is capable of incorporating uncertainty and guiding the selection of public sentiment measures. This study comprises three steps: (1) quantifying prior sentiment information and new sentiment observations with a Dirichlet distribution and a multinomial distribution, respectively; (2) deriving the posterior distribution of sentiment probabilities by combining the Dirichlet distribution and the multinomial distribution via Bayesian inference; and (3) measuring public sentiment by aggregating sampled sets of sentiment probabilities with an application-based measure. A case study on Hurricane Harvey is provided to demonstrate the feasibility and applicability of the proposed approach. The developed approach also has the potential to be generalized to model various types of probability-based measures.
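The three steps can be sketched with the standard Dirichlet-multinomial conjugate update, in which the posterior Dirichlet parameters are the prior parameters plus the observed counts. This is a minimal sketch, not the authors' implementation: the class order (positive, neutral, negative), the uniform prior, the counts and the "net sentiment" measure (positive minus negative probability) are all assumptions chosen for illustration.

```python
import random


def posterior_alpha(prior_alpha, counts):
    """Steps 1-2: a Dirichlet prior plus multinomial counts gives a Dirichlet posterior."""
    return [a + c for a, c in zip(prior_alpha, counts)]


def sample_dirichlet(alpha, rng):
    """Draw one probability vector from Dirichlet(alpha) via normalized gamma draws."""
    draws = [rng.gammavariate(a, 1.0) for a in alpha]
    total = sum(draws)
    return [d / total for d in draws]


def net_sentiment_summary(prior_alpha, counts, n_samples=5000, seed=0):
    """Step 3: aggregate sampled probabilities with a (hypothetical) measure.

    The measure is P(positive) - P(negative); returns the posterior
    parameters, the mean of the measure and a 95% credible interval.
    """
    rng = random.Random(seed)
    alpha = posterior_alpha(prior_alpha, counts)
    scores = sorted(
        p[0] - p[2]
        for p in (sample_dirichlet(alpha, rng) for _ in range(n_samples))
    )
    mean = sum(scores) / n_samples
    interval = (scores[int(0.025 * n_samples)], scores[int(0.975 * n_samples) - 1])
    return alpha, mean, interval


# Hypothetical inputs: uniform prior, then 30 positive, 50 neutral, 20 negative posts.
alpha, mean, interval = net_sentiment_summary([1.0, 1.0, 1.0], [30, 50, 20])
print(alpha, round(mean, 3), interval)
```

Sampling from the posterior, rather than reporting a single point estimate, is what lets the uncertainty carry through to whichever measure the application calls for.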

    Idiom–based features in sentiment analysis: cutting the Gordian knot

    In this paper we describe an automated approach to enriching sentiment analysis with idiom–based features. Specifically, we automated the development of the supporting lexico–semantic resources, which include (1) a set of rules used to identify idioms in text and (2) their sentiment polarity classifications. Our method demonstrates how idiom dictionaries, which are readily available general pedagogical resources, can be adapted into purpose–specific computational resources automatically. These resources were then used to replace the manually engineered counterparts in an existing system, which originally outperformed the baseline sentiment analysis approaches by 17 percentage points on average, taking the F–measure from the 40s into the 60s. The new fully automated approach outperformed the baselines by 8 percentage points on average, taking the F–measure from the 40s into the 50s. Although the latter improvement is not as high as the one achieved with the manually engineered features, it has the advantage of being more general in the sense that it can readily utilize an arbitrary list of idioms without the knowledge acquisition overhead previously associated with this task, thereby fully automating the original approach.
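To make the two resources concrete, the sketch below pairs a tiny idiom lexicon with automatically generated matching rules and turns matches into polarity counts that a classifier could consume as features. It is only a sketch under stated assumptions: the two-entry lexicon, the regular-expression rules and the names `compile_rule` and `idiom_features` are hypothetical, and the rules here do not handle inflection or inserted words, which a real idiom recognizer would need to.

```python
import re

# Hypothetical mini-lexicon mapping idioms to sentiment polarity.
IDIOM_POLARITY = {
    "over the moon": "positive",
    "down in the dumps": "negative",
}


def compile_rule(idiom):
    """Build a case-insensitive pattern that tolerates variable whitespace."""
    words = [re.escape(word) for word in idiom.split()]
    return re.compile(r"\b" + r"\s+".join(words) + r"\b", re.IGNORECASE)


# One matching rule per idiom, generated from the lexicon rather than hand-written.
RULES = {idiom: compile_rule(idiom) for idiom in IDIOM_POLARITY}


def idiom_features(sentence):
    """Count idiom matches per polarity class for one sentence."""
    counts = {"positive": 0, "negative": 0}
    for idiom, rule in RULES.items():
        counts[IDIOM_POLARITY[idiom]] += len(rule.findall(sentence))
    return counts


print(idiom_features("She was over the moon about the result."))
```

Because the rules are derived from the lexicon, swapping in a different idiom list requires no manual rule writing, which mirrors the generality the abstract emphasizes.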

    Construction and Expansion of Dictionary of Idiomatic Emotional Expressions and Idiomatic Emotional Expression Corpus

    Objective: In the study of sentiment estimation from language, methods focusing on words, phrases, sentence patterns, and sentence-final expressions have been proposed. However, it is difficult to deal with a wide variety of emotional expressions by only assigning emotions to words and phrases. In particular, it is difficult to analyze metaphorical and idiomatic expressions on a word-by-word basis, and it is impossible to register all expressions in a dictionary because new expressions can be created by flexibly replacing words. Moreover, it is difficult to determine the constraints on the words to be replaced, so not all expressions can be registered in the dictionary as sentence patterns. Methods: In this paper, we construct a dictionary of idiomatic sentiment expressions, which contains idioms expressing emotions. We also construct a pseudo-emotional corpus by collecting utterances containing emotional idioms from social media and automatically assigning the emotions expressed by the idioms. Results: This corpus includes expressions other than idioms, and can be an effective resource for estimating emotions in sentences that do not contain idioms. In this study, we create an emotion estimation model for utterances based on the constructed corpus, and conduct evaluation experiments to explore the problems of the idiomatic emotion corpus. In addition, using the constructed sentiment corpus, we investigate how to expand the dictionary of idiomatic sentiment expressions by using deep learning methods. Conclusion: Using the corpus of idiomatic sentiments constructed by the proposed method as training data, we trained machine learning models with and without idioms. The results show that, with idioms, the F-values of all emotions exceed 0.8. When idioms were not included, the F-values tended to decrease overall. However, the F-values of emotions such as "shame" and "excitement" remained around 0.7, indicating that the characteristics of emotional expressions other than idioms were captured.