Predicting Information Diffusion on Twitter - Analysis of predictive features

Abstract

Information propagation on online social networks attracts considerable attention in domains as varied as politics, fact checking, and marketing. Modeling information diffusion in such growing communication media is crucial both to understand information propagation and to better control it. Our research aims at predicting whether a post will be forwarded and, moreover, how widely it will be diffused. Our model is based on three types of features: user-based, time-based, and content-based. Using three collections totaling about 16 million tweets, we show that our model improves F-measure by about 5% compared to the state of the art, both when predicting whether a tweet will be retweeted and when predicting how popular it will be. The F-measure of our model is between 70% and 82%, depending on the collection. We also show that some of the features we introduce are very important for predicting retweetability, such as the number of followers and the number of communities that a user belongs to. Our contribution in this paper is twofold: first, we define new features to represent tweets in order to predict their possible propagation; second, we evaluate the model we built on top of both features from the literature and our new features on three collections, and show the usefulness of our features for the prediction.
