5,466 research outputs found
Organized Behavior Classification of Tweet Sets using Supervised Learning Methods
During the 2016 US elections Twitter experienced unprecedented levels of
propaganda and fake news through the collaboration of bots and hired persons,
the ramifications of which are still being debated. This work proposes an
approach to identify the presence of organized behavior in tweets. The Random
Forest, Support Vector Machine, and Logistic Regression algorithms are each
used to train a model with a data set of 850 records consisting of 299 features
extracted from tweets gathered during the 2016 US presidential election. The
features represent user and temporal synchronization characteristics to capture
coordinated behavior. These models are trained to classify tweet sets among the
categories: organic vs organized, political vs non-political, and pro-Trump vs
pro-Hillary vs neither. The random forest algorithm performs better with
greater than 95% average accuracy and f-measure scores for each category. The
most valuable features for classification are identified as user based
features, with media use and marking tweets as favorite to be the most
dominant.Comment: 51 pages, 5 figure
Semi-Supervised Approach to Monitoring Clinical Depressive Symptoms in Social Media
With the rise of social media, millions of people are routinely expressing
their moods, feelings, and daily struggles with mental health issues on social
media platforms like Twitter. Unlike traditional observational cohort studies
conducted through questionnaires and self-reported surveys, we explore the
reliable detection of clinical depression from tweets obtained unobtrusively.
Based on the analysis of tweets crawled from users with self-reported
depressive symptoms in their Twitter profiles, we demonstrate the potential for
detecting clinical depression symptoms which emulate the PHQ-9 questionnaire
clinicians use today. Our study uses a semi-supervised statistical model to
evaluate how the duration of these symptoms and their expression on Twitter (in
terms of word usage patterns and topical preferences) align with the medical
findings reported via the PHQ-9. Our proactive and automatic screening tool is
able to identify clinical depressive symptoms with an accuracy of 68% and
precision of 72%.Comment: 8 pages, Advances in Social Networks Analysis and Mining (ASONAM),
2017 IEEE/ACM International Conferenc
Recommended from our members
Search engine For Twitter sentiment analysis
textThe purpose of sentiment analysis is to determine the attitude of a writer or a speaker with respect to some topic or his feeling in a document. Thanks to the rise of social media, nowadays there are numerous data generated by users. Mining and categorizing these data will not only bring profits for companies, but also benefit the nation. Sentiment analysis not only enables business decision makers to better understand customers' behaviors, but also allows customers to know how the public feel about a product before purchasing. On the other hand, the aggregation of emotions will effectively measure the public response toward an event or news. For example, the level of distress and sadness will increase significantly after terror attacks or natural disaster. In our project, we are going to build a search engine that allows users to check the sentiment of his query. Some of previous researches on classifying sentiment of messages on micro-blogging services like Twitter have tried to solve this problem but they have ignored neutral tweets, which will result in problematic results (12). Our sentiment analysis will also be based on tweets collected from twitter, since twitter can offer sufficient and real-time corpora for analysis. We will preprocess each tweet in the training set and label it as positive, negative or neutral. As we use words in the tweet as the feature for our model, different features will be used. We will show that accuracy achieved by different machine learning algorithms (Naïve Bayes, Maximum Entropy) can be improved with a feature vector obtained by using bigrams (5). In our practice, we find that Naive Bayes has better performance than Maximum Entropy.Statistic
- …