Twitter-based analysis of the dynamics of collective attention to political parties
Large-scale data from social media have significant potential to describe
complex real-world phenomena and to anticipate collective behaviors such as
information spreading and social trends. One specific case study is
collective attention to the actions of political parties. Not surprisingly,
researchers and stakeholders have tried to correlate parties' presence on
social media with their performance in elections. Despite many efforts,
results remain inconclusive, since this kind of data is often very noisy and
significant signals can be masked by (largely unknown) statistical
fluctuations. In this paper we consider the number of tweets (tweet volume)
mentioning a party as a proxy for collective attention to that party, identify
the dynamics of the volume, and show that this quantity carries information
about election outcomes. We find that the tweet volume of each party follows a
log-normal distribution with positive autocorrelation over short periods,
indicating that the volume exhibits the large fluctuations of a log-normal
distribution yet with a short-term tendency. Furthermore, by measuring the
ratio of two consecutive daily tweet volumes, we find that the evolution of a
party's daily volume can be described by a geometric Brownian motion (i.e.,
the logarithm of the volume performs a random walk with drift). Finally, we
determine the optimal period over which to average the tweet volume so as to
reduce fluctuations and extract short-term tendencies. We conclude that the
tweet volume is a good indicator of a party's success in elections when
considered over an optimal time window. Our study identifies the statistical
nature of collective attention to political issues and sheds light on how to
model the dynamics of collective attention in social media.

Comment: 16 pages, 7 figures, 3 tables. Published in PLoS ONE.
Comparison of six classification models in terms of <i>Alpha</i>.
<p>Results are the average of 10-fold cross-validations.</p>
The self- and inter-annotator agreement measures.
<p>The 95% confidence intervals for <i>Alpha</i> are computed by bootstrapping. Albanian and Spanish (in bold) have very low agreement values.</p>
The Slovenian (left) and Bulgarian (right) datasets.
<p>The Slovenian classifier peak is at 70,000 tweets (<i>Alpha</i> = 0.459). The Bulgarian classifier peak is at 40,000 tweets (<i>Alpha</i> = 0.378).</p>
The number of annotators, and the number and fraction of posts annotated twice.
<p>The self-agreement column gives the number of posts annotated twice by the same annotator, and the inter-agreement column the number of posts annotated twice by two different annotators.</p>
The Portuguese dataset.
<p>There are two peaks (at 50,000 tweets, <i>Alpha</i> = 0.394, and at 160,000 tweets, <i>Alpha</i> = 0.391), with a large drop in between due to a topic shift.</p>
Results of the Friedman-Nemenyi test of classifiers ranking.
<p>The six classifiers are compared in terms of their ranking using two evaluation measures, <i>Alpha</i> (left) and (right). The ranks of classifiers within the critical distance (2.09) are not statistically significantly different.</p>
Sentiment distributions of the application datasets as predicted by the sentiment classifiers.
<p>The rightmost column shows the sentiment score (the mean) of the application and training datasets (the latter from <a href="http://www.plosone.org/article/info:doi/10.1371/journal.pone.0155036#pone.0155036.t001" target="_blank">Table 1</a>), respectively.</p>