1,660 research outputs found

    Opinion Mining on Non-English Short Text

    As the type and the number of such venues increase, automated analysis of sentiment on textual resources has become an essential data mining task. In this paper, we investigate the problem of mining opinions on the collection of informal short texts. Both positive and negative sentiment strength of texts are detected. We focus on a non-English language that has few resources for text mining. This approach would help enhance the sentiment analysis in languages where a list of opinionated words does not exist. We propose a new method that projects the text into dense and low-dimensional feature vectors according to the sentiment strength of the words. We detect the mixture of positive and negative sentiments on a multi-variant scale. Empirical evaluation of the proposed framework on Turkish tweets shows that our approach achieves good results for opinion mining.

    Doctor of Philosophy in Computer Science

    Over the last decade, social media has emerged as a revolutionary platform for informal communication and social interactions among people. Publicly expressing thoughts, opinions, and feelings is one of the key characteristics of social media. In this dissertation, I present research on automatically acquiring knowledge from social media that can be used to recognize people's affective state (i.e., what someone feels at a given time) in text. This research addresses two types of affective knowledge: 1) hashtag indicators of emotion consisting of emotion hashtags and emotion hashtag patterns, and 2) affective understanding of similes (a form of figurative comparison). My research introduces a bootstrapped learning algorithm for learning hashtag indicators of emotions from tweets with respect to five emotion categories: Affection, Anger/Rage, Fear/Anxiety, Joy, and Sadness/Disappointment. With a few seed emotion hashtags per emotion category, the bootstrapping algorithm iteratively learns new hashtags and more generalized hashtag patterns by analyzing emotion in tweets that contain these indicators. Emotion phrases are also harvested from the learned indicators to train additional classifiers that use the surrounding word context of the phrases as features. This is the first work to learn hashtag indicators of emotions. My research also presents a supervised classification method for classifying affective polarity of similes in Twitter. Using lexical, semantic, and sentiment properties of different simile components as features, supervised classifiers are trained to classify a simile into a positive or negative affective polarity class. The property of comparison is also fundamental to the affective understanding of similes.
My research introduces a novel framework for inferring implicit properties that 1) uses syntactic constructions, statistical association, dictionary definitions and word embedding vector similarity to generate and rank candidate properties, 2) re-ranks the top properties using influence from multiple simile components, and 3) aggregates the ranks of each property from different methods to create a final ranked list of properties. The inferred properties are used to derive additional features for the supervised classifiers to further improve affective polarity recognition. Experimental results show substantial improvements in affective understanding of similes over the use of existing sentiment resources.
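
The rank-aggregation step (step 3 above) can be illustrated with a minimal sketch. The abstract does not say how ranks are combined, so this example assumes a simple mean-rank (Borda-style) scheme. The function name `aggregate_ranks`, the penalty rank for missing candidates, and the example property lists are all hypothetical and are not taken from the dissertation.

```python
from collections import defaultdict


def aggregate_ranks(ranked_lists):
    """Merge several ranked candidate lists into one list ordered by mean rank.

    Assumption: a candidate missing from a method's list receives the rank
    just past the end of that list. Ties are broken alphabetically.
    """
    candidates = set()
    for ranking in ranked_lists:
        candidates.update(ranking)

    totals = defaultdict(float)
    for ranking in ranked_lists:
        position = {prop: i + 1 for i, prop in enumerate(ranking)}
        penalty = len(ranking) + 1  # rank for candidates this method did not propose
        for prop in candidates:
            totals[prop] += position.get(prop, penalty)

    n_methods = len(ranked_lists)
    return sorted(candidates, key=lambda p: (totals[p] / n_methods, p))


# Hypothetical candidate properties for a simile such as "my room is like a dungeon",
# as three different ranking methods might order them.
syntactic = ["dark", "cold", "damp"]
association = ["cold", "dark", "scary"]
embedding = ["dark", "scary", "cold"]

final_ranking = aggregate_ranks([syntactic, association, embedding])
print(final_ranking)
```

Mean rank is only one possible choice. Weighted or reciprocal-rank schemes would fit the same interface.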

    Using Twitter to learn about the autism community

    Considering the rising socio-economic burden of autism spectrum disorder (ASD), timely and evidence-driven public policy decision making and communication of the latest guidelines pertaining to the treatment and management of the disorder are crucial. Yet evidence suggests that policy makers and medical practitioners do not always have a good understanding of the practices and relevant beliefs of ASD-afflicted individuals' carers, who often follow questionable recommendations and adopt advice poorly supported by scientific data. The key goal of the present work is to explore the idea that Twitter, as a highly popular platform for information exchange, could be used as a data-mining source to learn about the population affected by ASD -- their behaviour, concerns, needs etc. To this end, using a large data set of over 11 million harvested tweets as the basis for our investigation, we describe a series of experiments which examine a range of linguistic and semantic aspects of messages posted by individuals interested in ASD. Our findings, the first of their nature in the published scientific literature, strongly motivate additional research on this topic and present a methodological basis for further work. Comment: Social Network Analysis and Mining, 201

    Persuasion in the digital age: a theoretical model of persuasion in terse text

    This thesis explores how the increasingly prevalent terse text format of Social Media communication has affected the way we seek to persuade one another and whether it has impacted the applicability of existing models of persuasion, influence and attitude change. Over the past few decades, communication behaviour has evolved dramatically. As a society we increasingly consume information in the format of short messages, rather than lengthy text and verbose speech. Meanwhile our understanding of persuasion has hardly moved on from the 1980s and continues to be spread across a variety of academic disciplines, such as Behavioural Science/Psychology, Philosophy/Rhetoric, and various sub-fields of linguistics. Existing models of persuasion are to date lacking interdisciplinarity and applicability to the terse text format found in Social Media. The data used in this research is in the format of Twitter microblogs gathered throughout a number of recent political campaigns, such as the 2016 UK Brexit referendum and the 2016 US General Election. The research purpose is fundamental, rather than applied, meaning that it seeks to expand knowledge by increasing the understanding of fundamental principles, rather than answering specific questions and offering a precise solution to a practical problem. The research philosophy that has been adopted for this project is interpretivism. The research approach is idiographic, and the methodology is predominantly qualitative, with occasional use of descriptive statistics. The research was conducted in several distinct phases, starting with the construction of the theoretical model, followed by two validation exercises and further experimental exploration by means of a recall test and computational linguistic analysis, culminating in a revised model of terse text persuasion.
This research draws upon and collates existing knowledge from behavioural science, rhetoric, linguistics, and cognitive science and develops a comprehensive understanding of how we seek to persuade through terse text media, based on data collected around a number of recent political campaigns and topics of debate. The research demonstrates that existing models of persuasion, such as the Elaboration Likelihood Model (Petty and Cacioppo, 1986) and the Heuristic Systematic Model (Chaiken et al., 1989) cannot be applied to the terse text context without significant modification. A new theoretical model of persuasion in terse text is proposed and evaluated. The findings also show that there is a distinct preference for heuristic over systematic cues in terse text messages with persuasive intent, and – in terms of Aristotelian rhetorical appeals – a preference for appeals to credibility (ethos) and emotion (pathos) over appeals to reason (logos). Additionally, the research explores, by means of a recall test, the most memorable subcategories of terse text microblogs, as well as examining message structure and features through computational linguistic tools. Although this research focusses on political persuasion in terse text Social Media, the findings have implications that reach far beyond the political sphere into activism, marketing, social engineering, strategic communication and (human centred) information warfare.

    FINE-GRAINED EMOTION DETECTION IN MICROBLOG TEXT

    Automatic emotion detection in text is concerned with using natural language processing techniques to recognize emotions expressed in written discourse. Endowing computers with the ability to recognize emotions in a particular kind of text, microblogs, has important applications in sentiment analysis and affective computing. In order to build computational models that can recognize the emotions represented in tweets, we need to identify a set of suitable emotion categories. Prior work has mainly focused on building computational models for only a small set of six basic emotions (happiness, sadness, fear, anger, disgust, and surprise). This thesis describes a taxonomy of 28 emotion categories, an expansion of these six basic emotions, developed inductively from data. This set of 28 emotion categories represents a set of fine-grained emotion categories that are representative of the range of emotions expressed in tweets, microblog posts on Twitter. The ability of humans to recognize these fine-grained emotion categories is characterized using inter-annotator reliability measures based on annotations provided by expert and novice annotators. A set of 15,553 human-annotated tweets forms a gold standard corpus, EmoTweet-28. For each emotion category, we have extracted a set of linguistic cues (i.e., punctuation marks, emoticons, emojis, abbreviated forms, interjections, lemmas, hashtags and collocations) that can serve as salient indicators for that emotion category. We evaluated the performance of automatic classification techniques on the set of 28 emotion categories through a series of experiments using several classifier and feature combinations. Our results show that it is feasible to extend machine learning classification to fine-grained emotion detection in tweets (i.e., as many as 28 emotion categories) with results that are comparable to state-of-the-art classifiers that detect six to eight basic emotions in text.
Classifiers using features extracted from the linguistic cues associated with each category equal or better the performance of conventional corpus-based and lexicon-based features for fine-grained emotion classification. This thesis makes an important theoretical contribution in the development of a taxonomy of emotion in text. In addition, this research also makes several practical contributions, particularly in the creation of language resources (i.e., corpus and lexicon) and machine learning models for fine-grained emotion detection in text.
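
The abstract mentions inter-annotator reliability measures without naming one. As an illustration only, the sketch below assumes Cohen's kappa for two annotators who each assign a single emotion label per tweet. The function name `cohens_kappa` and the toy labels are hypothetical, and the thesis may well use a different measure or multi-label annotation.

```python
from collections import Counter


def cohens_kappa(labels_a, labels_b):
    """Chance-corrected agreement between two annotators (Cohen's kappa)."""
    if len(labels_a) != len(labels_b) or not labels_a:
        raise ValueError("both annotators must label the same non-empty set of items")

    n = len(labels_a)
    # Observed agreement: fraction of items given the same label by both annotators.
    observed = sum(a == b for a, b in zip(labels_a, labels_b)) / n

    # Expected agreement if each annotator labelled at random with their own label frequencies.
    freq_a = Counter(labels_a)
    freq_b = Counter(labels_b)
    expected = sum(freq_a[label] * freq_b[label] for label in freq_a) / (n * n)

    if expected == 1.0:
        return 1.0  # both annotators used one identical label throughout
    return (observed - expected) / (1.0 - expected)


# Toy annotations for four tweets (hypothetical, not drawn from EmoTweet-28).
expert = ["joy", "joy", "anger", "fear"]
novice = ["joy", "anger", "anger", "fear"]
kappa = cohens_kappa(expert, novice)
print(round(kappa, 3))
```

With many fine-grained categories, chance agreement is low, so kappa stays close to raw agreement. Measures such as Krippendorff's alpha are a common alternative when more than two annotators are involved.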