22 research outputs found

    Multi-theme sentiment analysis with sentiment shifting

    Business reviews contain rich sentiment on multiple themes, disclosing more interesting information than the overall polarity of a document. In fine-grained sentiment analysis, given any segment of text, we are interested not only in the overall polarity of the segment but also in the sentiment words that contribute most to it. However, sentiment analysis at the word level poses significant challenges due to the complexity of reviews, the inconsistency of sentiment across themes, and the sentiment shifting caused by linguistic patterns---contextual valence shifters. To resolve the multi-theme and sentiment-shifting dilemmas simultaneously, this paper proposes a unified explainable sentiment analysis model, MTSA, which enables both classification of sentiment polarity and discovery of quantified sentiment-shifting patterns. MTSA formulates multi-theme sentiment by learning embeddings (i.e., vector representations) for both themes and words, and derives the shifter-effect learning algorithm by modeling the shifted sentiment in a logistic regression model. Extensive experiments have been conducted on Yelp business reviews and IMDB movie reviews. The improvement in sentiment polarity classification demonstrates the effectiveness of MTSA at rectifying word feature representations of reviews, and human evaluation shows its successful discovery of multi-theme sentiment words and automatic quantification of the effects of contextual valence shifters.
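The shifter-effect idea in the abstract above can be sketched in a few lines. The toy below is not the MTSA implementation: the prior polarities, shifter values, and function names are invented for illustration. It treats a contextual valence shifter as a multiplicative effect on the next word's prior polarity inside a logistic-regression-style sentence score.

```python
import math

def sigmoid(x):
    """Logistic squashing of a raw score into P(positive)."""
    return 1.0 / (1.0 + math.exp(-x))

# Hypothetical learned quantities (invented values, not MTSA's):
prior_polarity = {"good": 1.2, "slow": -0.8}
shifter_effect = {"not": -0.9, "very": 1.5}  # sign flip vs. intensifier

def sentence_score(tokens):
    """Sum shifted word polarities, then squash to P(positive)."""
    score, pending = 0.0, 1.0
    for tok in tokens:
        if tok in shifter_effect:
            pending *= shifter_effect[tok]   # shifter scales the next polar word
        elif tok in prior_polarity:
            score += pending * prior_polarity[tok]
            pending = 1.0                    # shifter effect is consumed
    return sigmoid(score)
```

With these toy values, "good" scores above 0.5 while "not good" flips below it, which is the kind of quantified shifter effect the model is said to discover.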

    Embedding Predications

    Written communication is rarely a sequence of simple assertions. More often, in addition to simple assertions, authors express subjectivity, such as beliefs, speculations, opinions, intentions, and desires. Furthermore, they link statements of various kinds to form a coherent discourse that reflects their pragmatic intent. In computational semantics, extraction of simple assertions (propositional meaning) has attracted the greatest attention, while research that focuses on extra-propositional aspects of meaning has remained sparse overall and has been largely limited to narrowly defined categories, such as hedging or sentiment analysis, treated in isolation. In this thesis, we contribute to the understanding of extra-propositional meaning in natural language understanding, by providing a comprehensive account of the semantic phenomena that occur beyond simple assertions and examining how a coherent discourse is formed from lower level semantic elements. Our approach is linguistically based, and we propose a general, unified treatment of the semantic phenomena involved, within a computationally viable framework. We identify semantic embedding as the core notion involved in expressing extra-propositional meaning. The embedding framework is based on the structural distinction between embedding and atomic predications, the former corresponding to extra-propositional aspects of meaning. It incorporates the notions of predication source, modality scale, and scope. We develop an embedding categorization scheme and a dictionary based on it, which provide the necessary means to interpret extra-propositional meaning with a compositional semantic interpretation methodology. Our syntax-driven methodology exploits syntactic dependencies to construct a semantic embedding graph of a document. 
Traversing the graph in a bottom-up manner guided by compositional operations, we construct predications corresponding to extra-propositional semantic content, which form the basis for addressing practical tasks. We focus on text from two distinct domains: news articles from the Wall Street Journal, and scientific articles focusing on molecular biology. Adopting a task-based evaluation strategy, we consider the ease with which the core framework adapts to practical tasks that involve some extra-propositional aspect as a measure of its success. The computational tasks we consider include hedge/uncertainty detection, scope resolution, negation detection, biological event extraction, and attribution resolution. Our competitive results in these tasks demonstrate the viability of our proposal.
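The bottom-up compositional traversal described above can be illustrated with a minimal sketch. The data structures, predicate names, and hedge strengths below are assumptions for illustration, not the thesis implementation: an embedding predication such as "suggest" rescales the modality of the atomic predication it scopes over.

```python
class Predication:
    """Node of a toy embedding graph: a head word, embedded arguments,
    and a modality value in (0, 1] expressing certainty."""
    def __init__(self, head, args=(), modality=1.0):
        self.head, self.args, self.modality = head, list(args), modality

# Hypothetical modality scale for a few embedding predicates.
HEDGES = {"suggest": 0.5, "may": 0.4, "demonstrate": 0.95}

def compose(node):
    """Post-order (bottom-up) traversal: interpret embedded predications
    first, then apply the embedding predicate's compositional operation."""
    children = [compose(c) for c in node.args]
    if node.head in HEDGES and children:
        for c in children:
            c.modality *= HEDGES[node.head]  # hedge down-weights certainty
        return children[0]  # surface the embedded (atomic) predication
    return node

# "The results suggest that X inhibits Y": the atomic predication
# "inhibits" ends up hedged by the embedding predicate "suggest".
atomic = Predication("inhibits")
result = compose(Predication("suggest", args=[atomic]))
```

The design point mirrors the framework's structural distinction: atomic predications carry propositional content, while embedding predications contribute only a compositional operation on them.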

    Sentiment analysis and resources for informal Arabic text on social media

    Online content posted by Arab users on social networks does not generally abide by grammatical and spelling rules. These posts, or comments, are valuable because they contain users’ opinions towards different objects such as products, policies, institutions, and people. These opinions constitute important material for commercial and governmental institutions. Commercial institutions can use these opinions to steer marketing campaigns, optimize their products, and identify the weaknesses and/or strengths of their products. Governmental institutions can draw on social network posts to gauge public opinion before or after legislating a new policy or law and to learn about the main issues that concern citizens. However, the huge volume of online data and its noisy nature can hinder the manual extraction and classification of the opinions present in online comments. Given the irregularity of dialectal Arabic (or informal Arabic), tools developed for formally correct Arabic are of limited use. This is specifically the case in sentiment analysis (SA) where the target of the analysis is social media content. This research implemented a system that addresses this challenge. The work can be roughly divided into three blocks: building a corpus for SA and manually tagging it to check the performance of the constructed lexicon-based (LB) classifier; building a sentiment lexicon that consists of three different sets of patterns (negative, positive, and spam); and finally implementing a classifier that employs the lexicon to classify Facebook comments. In addition to providing resources for dialectal Arabic SA and classifying Facebook comments, this work categorises the reasons behind incorrect classification, provides preliminary solutions for some of them with a focus on negation, and uses regular expressions to detect the presence of lexemes. This work also illustrates how the constructed classifier works, along with its different levels of reporting.
Moreover, it compares the performance of the LB classifier against a Naïve Bayes classifier and addresses how NLP tools such as POS tagging and Named Entity Recognition can be employed in SA. In addition, the work studies the performance of the implemented LB classifier and the developed sentiment lexicon when used to classify other corpora from the literature, and the performance of lexicons from the literature when used to classify the corpora constructed in this research. With minor changes, the classifier can be used for domain classification of documents (sports, science, news, etc.). The work ends with a discussion of research questions arising from the reported research.
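A lexicon-based classifier of the kind described, with regular expressions detecting lexemes and a simple rule for negation, might be sketched as follows. The patterns are toy English placeholders standing in for the Arabic lexicon; the pattern sets, negation particles, and scoring scheme are all assumptions, not the thesis resources.

```python
import re

# Toy positive/negative lexeme patterns (placeholders, not the real lexicon).
POS = [re.compile(r"\bexcellent\w*"), re.compile(r"\blove\w*")]
NEG = [re.compile(r"\bterribl\w*"), re.compile(r"\bhate\w*")]
# A negation word immediately preceding a matched lexeme flips its polarity.
NEGATION = re.compile(r"\b(not|never)\s+$")

def classify(comment):
    """Score a comment by summing lexeme polarities, flipping on negation."""
    text = comment.lower()
    score = 0
    for patterns, polarity in ((POS, 1), (NEG, -1)):
        for pat in patterns:
            for m in pat.finditer(text):
                negated = NEGATION.search(text[:m.start()])
                score += -polarity if negated else polarity
    if score > 0:
        return "positive"
    if score < 0:
        return "negative"
    return "neutral"
```

Handling negation this way, as a prefix test on the matched span, is one of the preliminary solutions the abstract alludes to; a fuller treatment would also bound the negation scope.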

    Detecting subjectivity through lexicon-grammar: strategies, databases, rules and apps for the Italian language

    2014 - 2015. The present research addresses the detection of linguistic phenomena connected to subjectivity, emotions and opinions from a computational point of view. The need to quickly monitor huge quantities of semi-structured and unstructured data from the web poses several challenges to Natural Language Processing, which must provide strategies and tools to analyze their structures from lexical, syntactic and semantic points of view. The general aim of Sentiment Analysis, shared with the broader fields of NLP, Data Mining, Information Extraction, etc., is the automatic extraction of value from chaos; its specific focus, instead, is on opinions rather than on factual information. This is the aspect that differentiates it from other computational linguistics subfields. The majority of sentiment lexicons have been manually or automatically created for the English language; therefore, existing Italian lexicons are mostly built through the translation and adaptation of English lexical databases, e.g. SentiWordNet and WordNet-Affect. Unlike many other Italian and English sentiment lexicons, our database SentIta, built on the interaction of electronic dictionaries and lexicon-dependent local grammars, is able to manage simple and multiword structures that can take the shape of distributionally free structures, distributionally restricted structures and frozen structures. Moreover, differently from other lexicon-based Sentiment Analysis methods, our approach is grounded in the solidity of the Lexicon-Grammar resources and classifications, which provide fine-grained semantic but also syntactic descriptions of the lexical entries. In accordance with the major contributions in the Sentiment Analysis literature, we did not consider polar words in isolation.
We computed their elementary sentence contexts, with the allowed transformations, and then their interaction with contextual valence shifters, the linguistic devices that are able to modify the prior polarity of the words from SentIta when occurring with them in the same sentences. In order to do so, we took advantage of the computational power of finite-state technology. We formalized a set of rules that model intensification, downtoning and negation, detect modality, and analyze comparative forms. With regard to the applicative part of the research, we conducted, with satisfactory results, three experiments on three Sentiment Analysis subtasks: the sentiment classification of documents and sentences, feature-based Sentiment Analysis, and Semantic Role Labeling based on sentiments. [edited by author]
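The finite-state shifter rules (intensification, downtoning, negation) can be caricatured as a small token-level transducer. The lexicon entries, rule names, and weights below are invented for illustration and are not SentIta's: each shifter transitions the machine's state (a running modifier), and each polar word emits its prior polarity scaled by that state.

```python
# Toy prior polarities (Italian words used only as examples).
LEXICON = {"buono": 1.0, "cattivo": -1.0}
# Toy shifter rules: (rule type, multiplicative effect on the next polar word).
RULES = {"molto": ("intensify", 2.0),   # intensifier
         "poco": ("downtone", 0.5),     # downtoner
         "non": ("negate", -1.0)}       # negation

def shift(tokens):
    """Finite-state-style pass: shifters update the state, polar words
    emit their shifted polarity and reset the state to neutral."""
    state, out = 1.0, []
    for tok in tokens:
        if tok in RULES:
            state *= RULES[tok][1]             # transition on a shifter
        elif tok in LEXICON:
            out.append(state * LEXICON[tok])   # emit shifted polarity
            state = 1.0                        # back to the neutral state
    return out
```

Composing shifters falls out of the state update: "non molto buono" multiplies the negation and intensification effects before "buono" is emitted, which is the behaviour the rule cascade is meant to capture.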

    Twitter Analysis to Predict the Satisfaction of Saudi Telecommunication Companies’ Customers

    The flexibility in mobile communications allows customers to quickly switch from one service provider to another, making customer churn one of the most critical challenges for the data and voice telecommunication service industry. In 2019, the percentage of post-paid telecommunication customers in Saudi Arabia decreased; this represents a great deal of customer dissatisfaction and subsequent corporate fiscal losses. Many studies correlate customer satisfaction with customer churn. Telecom companies have depended on historical customer data to measure customer churn. However, historical data does not reveal current customer satisfaction or the future likelihood of switching between telecom companies. Current methods of analysing churn rates are inadequate and face several issues, particularly in the Saudi market. This research was conducted to examine the relationship between customer satisfaction and customer churn and how social media mining can be used to measure customer satisfaction and predict customer churn. The research conducted a systematic review to address the problems of churn prediction models and their relation to Arabic Sentiment Analysis (ASA). The findings show that current churn models lack the integration of structured data frameworks with real-time analytics to target customers in real time. In addition, the findings show that the specific issues in the existing churn prediction models in Saudi Arabia relate to the Arabic language itself, its complexity, and a lack of resources. As a result, I have constructed the first gold-standard corpus of Saudi tweets related to telecom companies, comprising 20,000 manually annotated tweets. A dialect sentiment lexicon was also generated from a larger Twitter dataset that I collected, to capture the characteristics of social media text. I developed a new ASA prediction model for telecommunication that fills the gaps detected in the ASA literature and fits the telecommunication field.
The proposed model proved its effectiveness for Arabic sentiment analysis and churn prediction. This is the first work to use Twitter mining to predict potential customer loss (churn) in Saudi telecom companies. Because the model is based on text mining, applying it to fields with different features, such as education, is an interesting direction.
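One way tweet-level sentiment could feed churn prediction is by aggregating per-customer sentiment over time and flagging persistently negative customers. This is an assumed aggregation for illustration only, not the thesis model; the threshold value is invented.

```python
def churn_risk(tweet_sentiments, threshold=-0.2):
    """Flag a customer as a churn risk when their mean tweet sentiment
    (scores in [-1, 1], most recent period) falls below a threshold."""
    if not tweet_sentiments:
        return False          # no evidence, no flag
    avg = sum(tweet_sentiments) / len(tweet_sentiments)
    return avg < threshold    # persistently negative => likely to churn
```

In a real pipeline the sentiment scores would come from the ASA model and the threshold would be tuned against labelled churn outcomes rather than fixed by hand.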

    Mobilizing Empathy: From Einfühlung to Homo Empathicus

    This dissertation traces the movements of empathy across and within diverse contexts. Empathy is shown to be conceptually amorphous, with significant degrees of variation in its applications. With an analytic lens focused on use (conceived of as the mobilization of empathy), heterogeneous conceptions of empathy are examined, illuminating the different psychological and social realities that are created when empathy functions in different ways. This systematic reconstruction is facilitated through an analysis of empathy's moral, relational, epistemic, natural, and aesthetic conceptual foundations, and its quantitative, gendered, pathological, political, educational, commodified, and professional uses. It is argued that at the core of empathy is a moral valence; specifically, that empathy is irreducibly connected to ethical questions and, thus, there is always a moral dimension inherent in its applications. Based on this reconstruction, an ontology of empathy is derived that includes the individual, the other, and its moral valence. The dissertation concludes with considerations of the consequences of this ontology. Challenging empathy exclusively construed as a matter of individual intentionality, it is argued that socio-political, economic, and societal structures create, shape, and maintain much of what individuals have access to and experience empathically. For this critical understanding, the notions of empathy avoidance, armchair empathy, and regulated empathy are introduced.