Search engine For Twitter sentiment analysis
The purpose of sentiment analysis is to determine the attitude of a writer or speaker toward a topic, or the feeling expressed in a document. Thanks to the rise of social media, users now generate large volumes of data. Mining and categorizing these data will not only bring profits for companies, but also benefit the nation. Sentiment analysis not only enables business decision makers to better understand customers' behaviors, but also allows customers to learn how the public feels about a product before purchasing it. In addition, aggregating emotions can effectively measure the public response to an event or news item. For example, the level of distress and sadness increases significantly after terror attacks or natural disasters. In our project, we build a search engine that allows users to check the sentiment associated with their query. Some previous research on classifying the sentiment of messages on micro-blogging services like Twitter has tried to solve this problem but ignored neutral tweets, which leads to problematic results (12). Our sentiment analysis is also based on tweets collected from Twitter, since Twitter offers sufficient, real-time corpora for analysis. We preprocess each tweet in the training set and label it as positive, negative or neutral. Because we use the words in a tweet as features for our model, we try different feature sets. We show that the accuracy achieved by different machine learning algorithms (Naïve Bayes, Maximum Entropy) can be improved with a feature vector obtained by using bigrams (5). In our experiments, we find that Naïve Bayes performs better than Maximum Entropy.
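A minimal sketch of the kind of classifier the abstract describes: a three-class Naive Bayes model whose features are the words of a tweet plus adjacent-word bigrams. This is not the authors' implementation. The whitespace tokenizer, the add-one smoothing, the class and function names, and the toy training tweets are all assumptions made for illustration.

```python
import math
from collections import Counter, defaultdict


def features(text, use_bigrams=True):
    """Lowercased whitespace tokens, optionally extended with adjacent-word bigrams."""
    tokens = text.lower().split()
    feats = list(tokens)
    if use_bigrams:
        feats += [f"{a}_{b}" for a, b in zip(tokens, tokens[1:])]
    return feats


class NaiveBayes:
    """Multinomial Naive Bayes with add-one (Laplace) smoothing (an assumed variant)."""

    def fit(self, texts, labels, use_bigrams=True):
        self.use_bigrams = use_bigrams
        self.class_counts = Counter(labels)          # documents per class
        self.feature_counts = defaultdict(Counter)   # feature counts per class
        self.vocab = set()
        for text, label in zip(texts, labels):
            feats = features(text, use_bigrams)
            self.feature_counts[label].update(feats)
            self.vocab.update(feats)
        return self

    def predict(self, text):
        total_docs = sum(self.class_counts.values())
        best_label, best_score = None, -math.inf
        for label, n_docs in self.class_counts.items():
            counts = self.feature_counts[label]
            denom = sum(counts.values()) + len(self.vocab)
            score = math.log(n_docs / total_docs)    # log prior
            for feat in features(text, self.use_bigrams):
                if feat in self.vocab:               # skip features never seen in training
                    score += math.log((counts[feat] + 1) / denom)
            if score > best_score:
                best_label, best_score = label, score
        return best_label


# Hypothetical, hand-written training tweets (not data from the project).
train_texts = [
    "i love this phone", "what a great day",
    "i hate this phone", "what a terrible day",
    "the phone is on the table", "the meeting is on monday",
]
train_labels = ["positive", "positive", "negative", "negative", "neutral", "neutral"]
model = NaiveBayes().fit(train_texts, train_labels)
```

Calling `model.predict("i love this great day")` returns one of the three labels. Passing `use_bigrams=False` to `fit` gives the unigram-only baseline against which a bigram feature vector could be compared on real labeled data.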
Investigating Rumor Propagation with TwitterTrails
Social media have become part of modern news reporting, used by journalists
to spread information and find sources, or as a news source by individuals. The
quest for prominence and recognition on social media sites like Twitter can
sometimes eclipse accuracy and lead to the spread of false information. As a
way to study and react to this trend, we introduce TwitterTrails, an
interactive, web-based tool (twittertrails.com) that allows users to
investigate the origin and propagation characteristics of a rumor and its
refutation, if any, on Twitter. Visualizations of burst activity, propagation
timeline, retweet and co-retweeted networks help its users trace the spread of
a story. Within minutes TwitterTrails will collect relevant tweets and
automatically answer several important questions regarding a rumor: its
originator, burst characteristics, propagators and main actors according to the
audience. In addition, it will compute and report the rumor's level of
visibility and, as an example of the power of crowdsourcing, the audience's
skepticism towards it, which correlates with the rumor's credibility. We
envision TwitterTrails as a valuable tool for individual use, but
especially for amateur and professional journalists investigating recent and
breaking stories. Further, its expanding collection of investigated rumors can
be used to answer questions regarding the amount and success of misinformation
on Twitter.
Comment: 10 pages, 8 figures, under review
Cascades: A view from Audience
Cascades on online networks have been a popular subject of study in the past
decade, and there is a considerable literature on phenomena such as diffusion
mechanisms, virality, cascade prediction, and peer network effects. However, a
basic question has received comparatively little attention: how desirable are
cascades on a social media platform from the point of view of users? While
versions of this question have been considered from the perspective of the
producers of cascades, any answer to this question must also take into account
the effect of cascades on their audience. In this work, we seek to fill this
gap by providing a consumer perspective on cascades.
Users on online networks play the dual role of producers and consumers.
First, we perform an empirical study of the interaction of Twitter users with
retweet cascades. We measure how often users observe retweets in their home
timeline, and observe a phenomenon that we term the "Impressions Paradox": the
share of impressions for cascades of size k decays much more slowly than the
frequency of cascades of size k. Thus, the audience for cascades can be quite large even
for rare large cascades. We also measure audience engagement with retweet
cascades in comparison to non-retweeted content. Our results show that cascades
often rival or exceed organic content in engagement received per impression.
This result is perhaps surprising in that consumers didn't opt in to see tweets
from these authors. Furthermore, although cascading content is widely popular,
one would expect it to eventually reach parts of the audience that may not be
interested in the content. Motivated by our findings, we posit a theoretical
model that focuses on the effect of cascades on the audience. Our results on
this model highlight the balance between retweeting as a high-quality content
selection mechanism and the role of network users in filtering irrelevant
content.
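The two quantities compared in the "Impressions Paradox" can be sketched as follows: for each cascade size k, the fraction of cascades that have size k, and the fraction of all impressions that come from cascades of size k. The function name, the input format, and the toy numbers are assumptions for illustration. They are not the paper's data or measurement pipeline.

```python
from collections import Counter


def frequency_and_impression_share(cascades):
    """cascades: iterable of (size, impressions) pairs, one pair per cascade.

    Returns two dicts keyed by cascade size k:
      freq[k]  = fraction of cascades that have size k
      share[k] = fraction of all impressions coming from cascades of size k
    """
    cascades = list(cascades)
    n_cascades = len(cascades)
    total_impressions = sum(imp for _, imp in cascades)

    size_counts = Counter(size for size, _ in cascades)
    impressions_by_size = Counter()
    for size, imp in cascades:
        impressions_by_size[size] += imp

    freq = {k: c / n_cascades for k, c in size_counts.items()}
    share = {k: v / total_impressions for k, v in impressions_by_size.items()}
    return freq, share


# Hypothetical data: 90 cascades of size 1, 9 of size 10, 1 of size 100.
# Impressions are assumed, purely for illustration, to be 50 per retweet.
toy = [(1, 50)] * 90 + [(10, 500)] * 9 + [(100, 5000)] * 1
freq, share = frequency_and_impression_share(toy)
```

In this made-up example the single size-100 cascade is 1% of all cascades but accounts for 5000 of 14000 impressions, so rare large cascades can still reach a large audience.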
Privacy-Aware Recommender Systems Challenge on Twitter's Home Timeline
Recommender systems constitute the core engine of most social network
platforms nowadays, aiming to maximize user satisfaction along with other key
business objectives. Twitter is no exception. Despite the fact that Twitter
data has been extensively used to understand socioeconomic and political
phenomena and user behaviour, the implicit feedback provided by users on Tweets
through their engagements on the Home Timeline has only been explored to a
limited extent. At the same time, there is a lack of large-scale public social
network datasets that would enable the scientific community to both benchmark
and build more powerful and comprehensive models that tailor content to user
interests. By releasing an original dataset of 160 million Tweets along with
engagement information, Twitter aims to address exactly that. During this
release, special attention is paid to maintaining compliance with existing
privacy laws. Apart from user privacy, this paper touches on the key challenges
faced by researchers and professionals striving to predict user engagements. It
further describes the key aspects of the RecSys 2020 Challenge that was
organized by ACM RecSys in partnership with Twitter using this dataset.
Comment: 16 pages, 2 tables
Using Text Similarity to Detect Social Interactions not Captured by Formal Reply Mechanisms
In modeling social interaction online, it is important to understand when
people are reacting to each other. Many systems have explicit indicators of
replies, such as threading in discussion forums or replies and retweets in
Twitter. However, it is likely these explicit indicators capture only part of
people's reactions to each other; thus, computational social science approaches
that use them to infer relationships or influence are likely to miss the mark.
This paper explores the problem of detecting non-explicit responses, presenting
a new approach that uses tf-idf similarity between a user's own tweets and
recent tweets by people they follow. Based on a month's worth of posting data
from 449 ego networks in Twitter, this method demonstrates that it is likely
that at least 11% of reactions are not captured by the explicit reply and
retweet mechanisms. Further, these uncaptured reactions are not evenly
distributed between users: some users, who create replies and retweets without
using the official interface mechanisms, are much more responsive to followees
than they appear. This suggests that detecting non-explicit responses is an
important consideration in mitigating biases and building more accurate models
when using these markers to study social interaction and information diffusion.
Comment: A final version of this work was published in the 2015 IEEE 11th
International Conference on e-Science (e-Science
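The tf-idf similarity idea in the abstract above can be sketched roughly as follows: vectorize a user's tweet together with recent tweets from the accounts they follow, then flag followee tweets whose cosine similarity to the user's tweet exceeds a cutoff. The whitespace tokenizer, the smoothed idf formula, the 0.3 threshold, and all function names are assumptions for illustration. The paper's actual preprocessing and decision rule are not given in the abstract.

```python
import math
from collections import Counter


def tfidf_vectors(docs):
    """Return one sparse tf-idf vector (dict: term -> weight) per document."""
    tokenized = [doc.lower().split() for doc in docs]
    n_docs = len(tokenized)
    # Number of documents containing each term.
    doc_freq = Counter(term for tokens in tokenized for term in set(tokens))
    vectors = []
    for tokens in tokenized:
        tf = Counter(tokens)
        vectors.append({
            # Smoothed idf (an assumed variant): ln((1 + N) / (1 + df)) + 1
            term: count * (math.log((1 + n_docs) / (1 + doc_freq[term])) + 1)
            for term, count in tf.items()
        })
    return vectors


def cosine(u, v):
    """Cosine similarity between two sparse vectors stored as dicts."""
    dot = sum(w * v.get(term, 0.0) for term, w in u.items())
    norm_u = math.sqrt(sum(w * w for w in u.values()))
    norm_v = math.sqrt(sum(w * w for w in v.values()))
    return dot / (norm_u * norm_v) if norm_u and norm_v else 0.0


def likely_responses(user_tweet, followee_tweets, threshold=0.3):
    """Followee tweets whose similarity to user_tweet is at least threshold.

    The threshold value is hypothetical, not taken from the paper.
    """
    vectors = tfidf_vectors([user_tweet] + list(followee_tweets))
    user_vec, others = vectors[0], vectors[1:]
    scored = [(cosine(user_vec, vec), tweet)
              for vec, tweet in zip(others, followee_tweets)]
    return [(score, tweet) for score, tweet in scored if score >= threshold]


# Hypothetical tweets for illustration.
matches = likely_responses(
    "budget vote is a disaster",
    ["the budget vote is a disaster for schools", "lovely weather today"],
)
```

Here `matches` holds only the first followee tweet, since the second shares no terms with the user's tweet and scores zero. On real data the cutoff would need to be calibrated, for example against tweets known to be explicit replies.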
- …