Bots increase exposure to negative and inflammatory content in online social systems
Societies are complex systems which tend to polarize into sub-groups of
individuals with dramatically opposite perspectives. This phenomenon is
reflected -- and often amplified -- in online social networks where, however,
humans are no longer the only players and coexist with social bots,
i.e., software-controlled accounts. Analyzing large-scale social data collected
during the Catalan referendum for independence on October 1, 2017, consisting
of nearly 4 million Twitter posts generated by almost 1 million users, we
identify the two polarized groups of Independentists and Constitutionalists and
quantify the structural and emotional roles played by social bots. We show that
bots act from peripheral areas of the social system to target influential
humans of both groups, bombarding Independentists with violent content,
increasing their exposure to negative and inflammatory narratives and
exacerbating social conflict online. Our findings stress the importance of
developing countermeasures to unmask these forms of automated social
manipulation.
Comment: 8 pages, 5 figures
Sentiment analysis of health care tweets: review of the methods used.
BACKGROUND: Twitter is a microblogging service where users can send and read short 140-character messages called "tweets." There are several unstructured, free-text tweets relating to health care being shared on Twitter, which is becoming a popular area for health care research. Sentiment is a metric commonly used to investigate the positive or negative opinion within these messages. Exploring the methods used for sentiment analysis in Twitter health care research may allow us to better understand the options available for future research in this growing field. OBJECTIVE: The first objective of this study was to understand which tools would be available for sentiment analysis of Twitter health care research, by reviewing existing studies in this area and the methods they used. The second objective was to determine which method would work best in the health care settings, by analyzing how the methods were used to answer specific health care questions, their production, and how their accuracy was analyzed. METHODS: A review of the literature was conducted pertaining to Twitter and health care research, which used a quantitative method of sentiment analysis for the free-text messages (tweets). The study compared the types of tools used in each case and examined methods for tool production, tool training, and analysis of accuracy. RESULTS: A total of 12 papers studying the quantitative measurement of sentiment in the health care setting were found. More than half of these studies produced tools specifically for their research, 4 used open source tools available freely, and 2 used commercially available software. Moreover, 4 out of the 12 tools were trained using a smaller sample of the study's final data. The sentiment method was trained against, on average, 0.45% (2816/627,024) of the total sample data. One of the 12 papers commented on the analysis of accuracy of the tool used.
CONCLUSIONS: Multiple methods are used for sentiment analysis of tweets in the health care setting. These range from self-produced basic categorizations to more complex and expensive commercial software. The open source and commercial methods are developed on product reviews and generic social media messages. None of these methods have been extensively tested against a corpus of health care messages to check their accuracy. This study suggests that there is a need for an accurate and tested tool for sentiment analysis of tweets trained using a health care setting-specific corpus of manually annotated tweets first
Data-driven Social Mood Analysis through the Conceptualization of Emotional Fingerprints
There is emerging evidence that a better account of the emotional spectrum is achievable by employing a complete selection of emotion keywords. Basic emotions, such as Ekman's, cannot be considered universal, but are related to implicit thematic concerns within the corpus under analysis. The paper reports some preliminary experiments that employ a data-driven methodology to capture emotions, relying on data from the domain to be modeled. The experimentation consists of investigating the corresponding conceptual space based on a set of terms (i.e., keywords) that are representative of the domain and the determination. Furthermore, the conceptual space is exploited as a bridge between the textual content and its sub-symbolic mapping as an "emotional fingerprint" into a six-dimensional hyperspace
On stopwords, filtering and data sparsity for sentiment analysis of Twitter
Sentiment classification over Twitter is usually affected by the noisy nature (abbreviations, irregular forms) of tweet data. A popular procedure to reduce the noise of textual data is to remove stopwords by using pre-compiled stopword lists or more sophisticated methods for dynamic stopword identification. However, the effectiveness of removing stopwords in the context of Twitter sentiment classification has been debated in the last few years. In this paper we investigate whether removing stopwords helps or hampers the effectiveness of Twitter sentiment classification methods. To this end, we apply six different stopword identification methods to Twitter data from six different datasets and observe how removing stopwords affects two well-known supervised sentiment classification methods. We assess the impact of removing stopwords by observing fluctuations in the level of data sparsity, the size of the classifier’s feature space and its classification performance. Our results show that using pre-compiled lists of stopwords negatively impacts the performance of Twitter sentiment classification approaches. On the other hand, the dynamic generation of stopword lists, by removing those infrequent terms appearing only once in the corpus, appears to be the optimal method for maintaining a high classification performance while reducing the data sparsity and substantially shrinking the feature space
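The singleton-removal idea described in this abstract can be illustrated with a minimal sketch. It assumes tweets are already tokenized and uses an invented toy corpus. The function names (`singleton_terms`, `remove_terms`, `vocabulary_size`, `sparsity`) are hypothetical and are not taken from the paper, and sparsity is measured here simply as the fraction of zero cells in a binary tweet-term matrix, which may differ from the authors' definition.

```python
from collections import Counter

# Toy, invented corpus of already-tokenized tweets (an assumption for illustration).
TWEETS = [
    ["love", "this", "phone", "lol"],
    ["hate", "this", "phone"],
    ["love", "the", "battery", "omg"],
    ["hate", "the", "screen"],
]


def singleton_terms(tokenized_tweets):
    """Return the terms that occur exactly once in the whole corpus."""
    counts = Counter(tok for tweet in tokenized_tweets for tok in tweet)
    return {term for term, n in counts.items() if n == 1}


def remove_terms(tokenized_tweets, stoplist):
    """Drop every token that appears in the given stoplist."""
    return [[tok for tok in tweet if tok not in stoplist]
            for tweet in tokenized_tweets]


def vocabulary_size(tokenized_tweets):
    """Number of distinct terms, i.e. the size of a bag-of-words feature space."""
    return len({tok for tweet in tokenized_tweets for tok in tweet})


def sparsity(tokenized_tweets):
    """Fraction of zero cells in a binary tweet-term matrix."""
    cells = len(tokenized_tweets) * vocabulary_size(tokenized_tweets)
    if cells == 0:
        return 0.0
    nonzero = sum(len(set(tweet)) for tweet in tokenized_tweets)
    return 1.0 - nonzero / cells


dynamic_stoplist = singleton_terms(TWEETS)
filtered = remove_terms(TWEETS, dynamic_stoplist)

print(vocabulary_size(TWEETS), "->", vocabulary_size(filtered))  # 9 -> 5
print(round(sparsity(TWEETS), 3), "->", round(sparsity(filtered), 3))
```

On this toy corpus the sentiment-bearing words ("love", "hate") survive because they recur, while one-off terms are dropped, shrinking the feature space. Whether the same holds on real data is exactly what the paper evaluates.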
Emotion Expression Extraction Method for Chinese Microblog Sentences
With the rapid spread of Chinese microblogs, a large number of microblog topics are being generated in real time. More and more users pay attention to the emotion expressions of opinionated sentences in different topics. It is challenging to label the emotion expressions of opinionated sentences manually. To this end, this paper proposes an emotion expression extraction method to process millions of user-generated opinionated sentences automatically. Specifically, the proposed method mainly contains two tasks: emotion classification and opinion target extraction. We first use a lexicon-based emotion classification method to compute different emotion values in emotion label vectors of opinionated sentences. Then emotion label vectors of opinionated sentences are revised by an unsupervised emotion label propagation algorithm. After extracting candidate opinion targets of opinionated sentences, the opinion target extraction task is performed by a random walk-based ranking algorithm, which considers the connection between candidate opinion targets and the textual similarity between opinionated sentences to rank the candidate opinion targets. Experimental results demonstrate the effectiveness of algorithms in the proposed method
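As a rough illustration of the first step only (computing an emotion label vector from a lexicon), the following sketch counts lexicon hits per emotion and normalizes them. The emotion categories, the tiny English `LEXICON`, and the function names are all hypothetical stand-ins. The paper works on Chinese microblog text with its own lexicon, and its label propagation and random-walk ranking steps are not reproduced here.

```python
# Hypothetical emotion categories and lexicon, for illustration only.
EMOTIONS = ("joy", "anger", "sadness")
LEXICON = {
    "happy": "joy",
    "great": "joy",
    "furious": "anger",
    "awful": "anger",
    "lonely": "sadness",
}


def emotion_vector(tokens):
    """Return one normalized score per emotion, in the order of EMOTIONS.

    Each token found in the lexicon adds one count to its emotion.
    The counts are divided by the total number of hits, so the scores
    sum to 1, or are all 0.0 when no token matches the lexicon.
    """
    counts = {emotion: 0 for emotion in EMOTIONS}
    for tok in tokens:
        emotion = LEXICON.get(tok)
        if emotion is not None:
            counts[emotion] += 1
    total = sum(counts.values())
    if total == 0:
        return [0.0] * len(EMOTIONS)
    return [counts[emotion] / total for emotion in EMOTIONS]


def dominant_emotion(tokens):
    """Return the highest-scoring emotion, or None if nothing matched."""
    vec = emotion_vector(tokens)
    best = max(vec)
    if best == 0.0:
        return None
    return EMOTIONS[vec.index(best)]


sentence = ["so", "happy", "great", "but", "lonely"]
print(emotion_vector(sentence))
print(dominant_emotion(sentence))  # joy
```

A vector of this kind is what a later propagation step could revise using information from similar sentences. How that revision is defined is specific to the paper.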
SENTIMENT ANALYSIS ON TWITTER BY USING MAXIMUM ENTROPY AND SUPPORT VECTOR MACHINE METHOD
With the advancement and growth of social media, a large amount of data is available for research in social mining. Twitter is one microblogging service that can be used for this purpose. Many companies use Twitter data to analyze their customers' satisfaction with product quality. On the other hand, many users use social media to express their daily emotions. This can be developed into a research study that can be used both to improve product quality and to analyze opinions on certain events. Such research is often called sentiment analysis or opinion mining. Previous research has proposed particularly useful features for sentiment analysis, but performance is still lacking. Furthermore, it used Support Vector Machine as the classification method. On the other hand, most researchers have found other classification methods that are considered more efficient, such as Maximum Entropy. This research therefore uses two datasets: general opinion data and airline opinion data. For feature extraction, we employ four types of features: pragmatic, lexical-grams, pos-grams, and sentiment lexical. For classification, we use both Support Vector Machine and Maximum Entropy to find the best result. In the end, the best result is achieved by Maximum Entropy, with 85.8% accuracy on the general opinion data and 92.6% accuracy on the airline opinion data
Detecting and Tracking the Spread of Astroturf Memes in Microblog Streams
Online social media are complementing and in some cases replacing
person-to-person social interaction and redefining the diffusion of
information. In particular, microblogs have become crucial grounds on which
public relations, marketing, and political battles are fought. We introduce an
extensible framework that will enable the real-time analysis of meme diffusion
in social media by mining, visualizing, mapping, classifying, and modeling
massive streams of public microblogging events. We describe a Web service that
leverages this framework to track political memes in Twitter and help detect
astroturfing, smear campaigns, and other misinformation in the context of U.S.
political elections. We present some cases of abusive behaviors uncovered by
our service. Finally, we discuss promising preliminary results on the detection
of suspicious memes via supervised learning based on features extracted from
the topology of the diffusion networks, sentiment analysis, and crowdsourced
annotations