65 research outputs found

    Twitter-based analysis of the dynamics of collective attention to political parties

    Large-scale data from social media have significant potential to describe complex phenomena in the real world and to anticipate collective behaviors such as information spreading and social trends. One specific case study is the collective attention to the actions of political parties. Not surprisingly, researchers and stakeholders have tried to correlate parties' presence on social media with their performance in elections. Despite many efforts, results remain inconclusive, since this kind of data is often very noisy and significant signals can be covered by (largely unknown) statistical fluctuations. In this paper we consider the number of tweets (tweet volume) mentioning a party as a proxy of collective attention to the party, identify the dynamics of the volume, and show that this quantity carries information about the election outcome. We find that the tweet volume for each party follows a log-normal distribution with a positive autocorrelation over short terms, which indicates that the volume exhibits the large fluctuations of a log-normal distribution yet with a short-term tendency. Furthermore, by measuring the ratio of two consecutive daily tweet volumes, we find that the evolution of the daily volume of a party can be described by a geometric Brownian motion (i.e., the logarithm of the volume moves randomly with a trend). Finally, we determine the optimal period for averaging the tweet volume so as to reduce fluctuations and extract short-term tendencies. We conclude that the tweet volume is a good indicator of a party's success in the elections when considered over an optimal time window. Our study identifies the statistical nature of collective attention to political issues and sheds light on how to model the dynamics of collective attention in social media.
    Comment: 16 pages, 7 figures, 3 tables. Published in PLoS ONE
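The abstract's model — the log of the daily tweet volume performing a random walk with a trend (geometric Brownian motion) — can be sketched as follows. All parameter values (drift, volatility, starting volume) are illustrative assumptions, not figures from the paper:

```python
import math
import random

random.seed(0)

# Hypothetical parameters (not from the paper): drift and volatility
# of the log of the daily tweet volume.
mu, sigma = 0.01, 0.25
days = 120
v0 = 10_000.0  # illustrative starting daily volume

# Geometric Brownian motion: the log-volume performs a random walk with
# a trend, so V_{t+1} = V_t * exp(mu + sigma * N(0, 1)).
volumes = [v0]
for _ in range(days):
    volumes.append(volumes[-1] * math.exp(random.gauss(mu, sigma)))

# Under GBM the log-ratios of consecutive daily volumes are i.i.d.
# normal, so averaging them over a window estimates the trend mu.
log_ratios = [math.log(b / a) for a, b in zip(volumes, volumes[1:])]
trend = sum(log_ratios) / len(log_ratios)
print(f"estimated trend: {trend:.3f}")
```

The averaging step mirrors the paper's idea of an optimal time window: longer windows suppress the sigma-driven fluctuations in the log-ratios and expose the underlying tendency.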

    Comparison of six classification models in terms of <i>Alpha</i>.

    <p>Results are the average over 10-fold cross-validation.</p>

    The self- and inter-annotator agreement measures.

    <p>The 95% confidence intervals for <i>Alpha</i> are computed by bootstrapping. Albanian and Spanish (in bold) have very low agreement values.</p>
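The caption's bootstrapped confidence intervals can be sketched with a percentile bootstrap over the doubly-annotated posts. The data here are toy random labels, and simple percent agreement stands in for Krippendorff's <i>Alpha</i> (which additionally corrects for chance agreement):

```python
import random

random.seed(0)

# Toy doubly-annotated sentiment labels; each pair is one post labeled
# by two annotators. Percent agreement is a stand-in for Alpha here.
pairs = [(random.choice("+-0"), random.choice("+-0")) for _ in range(500)]

def agreement(sample):
    return sum(a == b for a, b in sample) / len(sample)

# Percentile bootstrap: resample annotation pairs with replacement,
# recompute the statistic, and take the 2.5th and 97.5th percentiles.
stats = sorted(
    agreement([random.choice(pairs) for _ in pairs]) for _ in range(1000)
)
low, high = stats[24], stats[974]
print(f"95% CI: [{low:.3f}, {high:.3f}]")
```

Resampling whole posts (annotation pairs), rather than individual labels, preserves the pairing that the agreement statistic depends on.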

    The Slovenian (left) and Bulgarian (right) datasets.

    <p>The Slovenian classifier peak is at 70,000 tweets (<i>Alpha</i> = 0.459). The Bulgarian classifier peak is at 40,000 tweets (<i>Alpha</i> = 0.378).</p>

    The number of annotators, and the number and fraction of posts annotated twice.

    <p>The self-agreement column gives the number of posts annotated twice by the same annotator, and the inter-agreement column gives the number of posts annotated twice by two different annotators.</p>

    The Portuguese dataset.

    <p>There are two peaks (at 50,000 tweets, <i>Alpha</i> = 0.394, and at 160,000 tweets, <i>Alpha</i> = 0.391), and a large drop in between, due to a topic shift.</p>

    Results of the Friedman-Nemenyi test of classifiers ranking.

    <p>The six classifiers are compared in terms of their ranking using two evaluation measures, <i>Alpha</i> (left) and (right). The ranks of classifiers within the critical distance (2.09) are not statistically significantly different.</p>
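The critical distance in the Nemenyi post-hoc test follows the standard formula CD = q<sub>α</sub> · sqrt(k(k+1)/(6N)), where k is the number of classifiers and N the number of datasets. A minimal sketch; the value N = 13 is an illustrative assumption (not stated in the caption) chosen because it reproduces a CD close to the caption's 2.09 with the standard q<sub>0.05</sub> = 2.850 for k = 6:

```python
import math

# Nemenyi critical distance: CD = q_alpha * sqrt(k * (k + 1) / (6 * N)),
# where k is the number of classifiers and N the number of datasets.
def critical_distance(q_alpha, k, n_datasets):
    return q_alpha * math.sqrt(k * (k + 1) / (6 * n_datasets))

# q_0.05 = 2.850 is the standard critical value for k = 6 classifiers.
# N = 13 is a hypothetical dataset count used only for illustration.
cd = critical_distance(2.850, 6, 13)
print(f"CD = {cd:.2f}")
```

Classifiers whose average ranks differ by less than CD cannot be declared significantly different, which is what the caption's statement about ranks "within the critical distance" expresses.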

    Sentiment distributions of the application datasets as predicted by the sentiment classifiers.

    <p>The rightmost column shows the sentiment score (the mean) of the application and training datasets (the latter from <a href="http://www.plosone.org/article/info:doi/10.1371/journal.pone.0155036#pone.0155036.t001" target="_blank">Table 1</a>), respectively.</p>