9 research outputs found

    Scalable Privacy-Compliant Virality Prediction on Twitter

    The digital town hall of Twitter has become a preferred medium of communication for individuals and organizations across the globe. Some of them reach audiences of millions, while others struggle to get noticed. Given the impact of social media, the question remains more relevant than ever: how to model the dynamics of attention on Twitter. Researchers around the world turn to machine learning to predict the most influential tweets and authors, navigating the volume, velocity, and variety of social big data, with many compromises. In this paper, we revisit content popularity prediction on Twitter. We argue that strict alignment of data acquisition, storage and analysis algorithms is necessary to avoid the common trade-offs between scalability, accuracy and privacy compliance. We propose a new framework for the rapid acquisition of large-scale datasets, high accuracy supervisory signal and multilanguage sentiment prediction while respecting every privacy request applicable. We then apply a novel gradient boosting framework to achieve state-of-the-art results in virality ranking, even before including a tweet's visual or propagation features. Our Gradient Boosted Regression Tree is the first to offer explainable, strong ranking performance on benchmark datasets. Since the analysis focused on features available early, the model is immediately applicable to incoming tweets in 18 languages.
    Comment: AffCon@AAAI-19 Best Paper Award; Presented at AAAI-19 W1: Affective Content Analysi

    Assessing the reTweet proneness of tweets: predictive models for retweeting

    Predicting Influencer Virality on Twitter

    The ability to successfully predict virality on Twitter holds great potential as a resource for Twitter influencers, enabling the development of more sophisticated strategies for audience engagement, audience monetization, and information sharing. To our knowledge, focusing exclusively on tweets posted by influencers is a novel context for studying Twitter virality. We find, among feature categories traditionally considered in the literature, that models combining categories covering a range of information perform better than models incorporating only individual feature categories. Moreover, our general predictive model, encompassing a range of feature categories, achieves a prediction accuracy of 68% for influencer virality. We also investigate the role of influencer audiences in predicting virality, a topic we believe to be understudied in the literature. We suspect that incorporating audience information will allow us to better discriminate between virality classes, thus leading to better predictions. We pursue two different approaches, resulting in 10 different predictive models that leverage influencer audience information in addition to traditional feature categories. Both of our attempts to incorporate audience information plateau at an accuracy of approximately 61%, a decrease of roughly 7 percentage points compared to our general predictive model. We conclude that we are unable to find experimental evidence to support our claim that incorporating influencer audience information will improve virality predictions. Nonetheless, the performance of our general model holds promise for the deployment of a tool that allows influencers to reap the benefits of virality prediction. As stronger performance from the underlying model would make this tool more useful in practice to influencers, improving the predictive performance of our general model is a cornerstone of future work.

    Analysis of the Structure of Social Networks for Information Diffusion

    The vast proliferation of Online Social Networks (OSN) is generating many new ways to interact and create social relationships with others. In OSN, information spreads among users following existing social relationships. This spread is influenced by the local properties and structures of the social relationships at the individual level. Being able to understand these properties can be fundamental for the design of new communication systems able to predict the creation and sharing of content based on social properties of the users. While substantial results have been obtained in the anthropology literature describing the properties of human social networks, a clear understanding of the properties of social networks built using OSN is still to be achieved. In this thesis, the structure of Ego networks formed online is compared with the properties of offline social relationships, showing interesting similarities. These properties are exploited to provide a meaningful way to study the mechanisms controlling the formation of information diffusion chains in social networks (typically referred to as information cascades). Through the analysis of synthetically generated diffusion cascades executed on a large Facebook communication dataset, it is shown that knowledge of the tie strength of the social links is fundamental to infer which nodes will give rise to large information cascades and which links will be used most in the information diffusion process. We analysed the trade-off between information spread and trustworthiness of information. Specifically, we have investigated the spread of information when only links of a certain trust value are used. Assuming, based on results from sociology, that trust can be quantised, we show that too strict limits on the minimum trust between users significantly limit information spread. In the thesis we investigate the effect of different strategies to significantly increase the spread of information by minimally relaxing constraints on the minimum allowed trust level.
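
    The trade-off between a minimum trust level and information spread can be illustrated with a minimal sketch. It is not the thesis's method: it assumes an undirected graph with a numeric trust value per link and measures spread as plain reachability from a seed node, whereas the abstract describes synthetically generated cascades. The function name `reachable`, the toy edge list `EDGES`, and the trust values are all hypothetical.

```python
from collections import deque


def reachable(edges, seed, min_trust):
    """Return the nodes reachable from `seed` using only links whose
    trust value is at least `min_trust` (hypothetical helper)."""
    # Build an undirected adjacency map from the links that pass the filter.
    adjacency = {}
    for u, v, trust in edges:
        if trust >= min_trust:
            adjacency.setdefault(u, set()).add(v)
            adjacency.setdefault(v, set()).add(u)

    # Breadth-first search over the filtered graph.
    seen = {seed}
    queue = deque([seed])
    while queue:
        node = queue.popleft()
        for neighbour in adjacency.get(node, ()):
            if neighbour not in seen:
                seen.add(neighbour)
                queue.append(neighbour)
    return seen


# Toy network with made-up trust values: (node, node, trust).
EDGES = [
    ("a", "b", 0.9),
    ("b", "c", 0.4),
    ("c", "d", 0.8),
    ("a", "e", 0.2),
]

if __name__ == "__main__":
    # Lowering the threshold step by step shows how spread grows.
    for threshold in (0.5, 0.3, 0.0):
        print(threshold, sorted(reachable(EDGES, "a", threshold)))
```

    In this toy network a strict threshold of 0.5 confines the information to two nodes, while relaxing it to 0.3 admits one weaker link and doubles the number of nodes reached.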

    Working the News: Preserving Professional Identity Through Networked Journalism at Elite News Media

    The concept of journalism as a profession has arguably been fraught and contested throughout its existence. Ideologically, it is founded on a claim to norms and a code of ethics, but in the past, news media also held material control over mass communication through broadcast and print, which were largely inaccessible to most citizens. The Internet and social media have created a news environment where professional journalists and their work exist side-by-side with non-journalists. In this space, acts of journalism also can be and are carried out by non-journalists. Through the new news distribution channels offered by social media, non-journalists are potentially able to disseminate their texts to wide audiences. In practice this means that journalism is no longer exclusively the domain of the journalist, and has led to the adoption of collaboration as a journalistic convention that presents opportunities but also serious challenges and risks for the professional community. My research aims to contribute to the news discourse concerning emerging professional practices in networked journalism with a focus on how journalistic authority is reasserted within a collaborative news environment. Rather than looking at networked journalism as primarily participatory, this research explores collaborative newswork as a means to carry out professional boundary work and to articulate this to audiences. I argue that the act of collaboration in newswork at times becomes a quasi-ideological project to protect journalism as a profession that lays claim to ethics, norms and routines. The research comprises three case studies of news stories covered by the BBC World Service and the English-language services of France 24 and Al Jazeera. Using quantitative and qualitative analysis methods, they explore how social media was mobilised in the newswork. The aim was to explore how sourcing practices affected the power relationships between primary and secondary definers, and how journalists create and articulate professional boundaries in collaborative newswork. These research findings were triangulated with interviews with social media editors at the three news organisations.