Scalable Privacy-Compliant Virality Prediction on Twitter
The digital town hall of Twitter has become a preferred medium of communication
for individuals and organizations across the globe. Some of them reach
audiences of millions, while others struggle to get noticed. Given the impact
of social media, the question remains more relevant than ever: how to model the
dynamics of attention in Twitter. Researchers around the world turn to machine
learning to predict the most influential tweets and authors, navigating the
volume, velocity, and variety of social big data, with many compromises. In
this paper, we revisit content popularity prediction on Twitter. We argue that
strict alignment of data acquisition, storage and analysis algorithms is
necessary to avoid the common trade-offs between scalability, accuracy and
privacy compliance. We propose a new framework for the rapid acquisition of
large-scale datasets, a high-accuracy supervisory signal, and multilingual
sentiment prediction while respecting every applicable privacy request. We then
apply a novel gradient boosting framework to achieve state-of-the-art results
in virality ranking, even before including a tweet's visual or propagation
features. Our Gradient Boosted Regression Tree is the first to offer
explainable, strong ranking performance on benchmark datasets. Since the
analysis focused on features available early, the model is immediately
applicable to incoming tweets in 18 languages.
Comment: AffCon@AAAI-19 Best Paper Award; Presented at AAAI-19 W1: Affective
Content Analysi
Predicting Influencer Virality on Twitter
The ability to successfully predict virality on Twitter holds great potential as a resource for Twitter influencers, enabling the development of more sophisticated strategies for audience engagement, audience monetization, and information sharing. To our knowledge, focusing exclusively on tweets posted by influencers is a novel context for studying Twitter virality. We find, among feature categories traditionally considered in the literature, that models combining categories covering a range of information perform better than models incorporating only individual feature categories. Moreover, our general predictive model, encompassing a range of feature categories, achieves a prediction accuracy of 68% for influencer virality. We also investigate the role of influencer audiences in predicting virality, a topic we believe to be understudied in the literature. We suspect that incorporating audience information will allow us to better discriminate between virality classes, thus leading to better predictions. We pursue two different approaches, resulting in 10 different predictive models that leverage influencer audience information in addition to traditional feature categories. Both of our attempts to incorporate audience information plateau at an accuracy of approximately 61%, roughly 7 percentage points below our general predictive model. We conclude that we are unable to find experimental evidence to support our claim that incorporating influencer audience information will improve virality predictions. Nonetheless, the performance of our general model holds promise for the deployment of a tool that allows influencers to reap the benefits of virality prediction. As stronger performance from the underlying model would make this tool more useful to influencers in practice, improving the predictive performance of our general model is a cornerstone of future work.
Analysis of the Structure of Social Networks for Information Diffusion
The vast proliferation of Online Social Networks (OSN) is generating many new ways to interact and create social relationships with others.
In OSN, information spreads among users following existing social relationships. This spread is influenced by the local properties and structures of the social relationships at the individual level. Understanding these properties can be fundamental for the design of new communication systems able to predict the creation and sharing of content based on the social properties of the users.
While substantial results have been obtained in the anthropology literature describing the properties of human social networks, a clear understanding of the properties of social networks built using OSN is still to be achieved.
In this thesis, the structure of Ego networks formed online is compared with the properties of offline social relationships, showing interesting similarities. These properties are exploited to provide a meaningful way to study the mechanisms controlling the formation of information diffusion chains in social networks (typically referred to as information cascades). Through the analysis of synthetically generated diffusion cascades executed on a large Facebook communication dataset, it is shown that knowledge of the tie strength of the social links is fundamental to inferring which nodes will give rise to large information cascades and which links will be used most in the information diffusion process. We analysed the trade-off between information spread and trustworthiness of information. Specifically, we investigated the spread of information when only links of a certain trust value are used. Assuming, based on results from sociology, that trust can be quantised, we show that overly strict limits on the minimum trust between users significantly limit information spread. In the thesis we investigate the effect of different strategies to significantly increase the spread of information by minimally relaxing constraints on the minimum allowed trust level.
Decision Modelling Driven by Twitter Data: A Case Study of the 2017 Presidential Election in Ecuador
Working the News: Preserving Professional Identity Through Networked Journalism at Elite News Media
The concept of journalism as a profession has arguably been fraught and contested throughout its existence. Ideologically, it is founded on a claim to norms and a code of ethics, but in the past, news media also held material control over mass communication through broadcast and print, which were largely inaccessible to most citizens. The Internet and social media have created a news environment where professional journalists and their work exist side-by-side with non-journalists. In this space, acts of journalism can also be, and are, carried out by non-journalists. Through the new news distribution channels offered by social media, non-journalists are potentially able to disseminate their texts to wide audiences. In practice this means that journalism is no longer exclusively the domain of the journalist, and has led to the adoption of collaboration as a journalistic convention that presents opportunities but also serious challenges and risks for the professional community. My research aims to contribute to the news discourse concerning emerging professional practices in networked journalism with a focus on how journalistic authority is reasserted within a collaborative news environment. Rather than looking at networked journalism as primarily participatory, this research explores collaborative newswork as a means to carry out professional boundary work and to articulate this to audiences. I argue that the act of collaboration in newswork at times becomes a quasi-ideological project to protect journalism as a profession that lays claim to ethics, norms and routines. The research comprises three case studies of news stories covered by the BBC World Service and the English-language services of France 24 and Al Jazeera. Using quantitative and qualitative analysis methods, they explore how social media was mobilised in the newswork.
The aim was to explore how sourcing practices affected the power relationships between primary and secondary definers, and how journalists create and articulate professional boundaries in collaborative newswork. These research findings were triangulated with interviews with social media editors at the three news organisations.