24 research outputs found

    What Stops Social Epidemics?

    Full text link
    Theoretical progress in understanding the dynamics of spreading processes on graphs suggests the existence of an epidemic threshold below which no epidemics form and above which epidemics spread to a significant fraction of the graph. We have observed information cascades on the social media site Digg that spread fast enough for one initial spreader to infect hundreds of people, yet end up affecting only 0.1% of the entire network. We find that two effects, previously studied in isolation, combine cooperatively to drastically limit the final size of cascades on Digg. First, because of the highly clustered structure of the Digg network, most people who are aware of a story have been exposed to it via multiple friends. This structure lowers the epidemic threshold while moderately slowing the overall growth of cascades. In addition, we find that the mechanism for social contagion on Digg points to a fundamental difference between information spread and other contagion processes: despite multiple opportunities for infection within a social group, people are less likely to become spreaders of information with repeated exposure. The consequences of this mechanism become more pronounced for more clustered graphs. Ultimately, this effect severely curtails the size of social epidemics on Digg.Comment: 8 pages, 10 figures, accepted in ICWSM1

    Phantom cascades: The effect of hidden nodes on information diffusion

    Full text link
    Research on information diffusion generally assumes complete knowledge of the underlying network. However, in the presence of factors such as increasing privacy awareness, restrictions on application programming interfaces (APIs) and sampling strategies, this assumption rarely holds in the real world which in turn leads to an underestimation of the size of information cascades. In this work we study the effect of hidden network structure on information diffusion processes. We characterise information cascades through activation paths traversing visible and hidden parts of the network. We quantify diffusion estimation error while varying the amount of hidden structure in five empirical and synthetic network datasets and demonstrate the effect of topological properties on this error. Finally, we suggest practical recommendations for practitioners and propose a model to predict the cascade size with minimal information regarding the underlying network.Comment: Preprint submitted to Elsevier Computer Communication

    Information is not a Virus, and Other Consequences of Human Cognitive Limits

    Full text link
    The many decisions people make about what to pay attention to online shape the spread of information in online social networks. Due to the constraints of available time and cognitive resources, the ease of discovery strongly impacts how people allocate their attention to social media content. As a consequence, the position of information in an individual's social feed, as well as explicit social signals about its popularity, determine whether it will be seen, and the likelihood that it will be shared with followers. Accounting for these cognitive limits simplifies mechanics of information diffusion in online social networks and explains puzzling empirical observations: (i) information generally fails to spread in social media and (ii) highly connected people are less likely to re-share information. Studies of information diffusion on different social media platforms reviewed here suggest that the interplay between human cognitive limits and network structure differentiates the spread of information from other social contagions, such as the spread of a virus through a population.Comment: accepted for publication in Future Interne

    Why Do Cascade Sizes Follow a Power-Law?

    Full text link
    We introduce random directed acyclic graph and use it to model the information diffusion network. Subsequently, we analyze the cascade generation model (CGM) introduced by Leskovec et al. [19]. Until now only empirical studies of this model were done. In this paper, we present the first theoretical proof that the sizes of cascades generated by the CGM follow the power-law distribution, which is consistent with multiple empirical analysis of the large social networks. We compared the assumptions of our model with the Twitter social network and tested the goodness of approximation.Comment: 8 pages, 7 figures, accepted to WWW 201

    Non-Conservative Diffusion and its Application to Social Network Analysis

    Full text link
    The random walk is fundamental to modeling dynamic processes on networks. Metrics based on the random walk have been used in many applications from image processing to Web page ranking. However, how appropriate are random walks to modeling and analyzing social networks? We argue that unlike a random walk, which conserves the quantity diffusing on a network, many interesting social phenomena, such as the spread of information or disease on a social network, are fundamentally non-conservative. When an individual infects her neighbor with a virus, the total amount of infection increases. We classify diffusion processes as conservative and non-conservative and show how these differences impact the choice of metrics used for network analysis, as well as our understanding of network structure and behavior. We show that Alpha-Centrality, which mathematically describes non-conservative diffusion, leads to new insights into the behavior of spreading processes on networks. We give a scalable approximate algorithm for computing the Alpha-Centrality in a massive graph. We validate our approach on real-world online social networks of Digg. We show that a non-conservative metric, such as Alpha-Centrality, produces better agreement with empirical measure of influence than conservative metrics, such as PageRank. We hope that our investigation will inspire further exploration into the realms of conservative and non-conservative metrics in social network analysis

    Quantifying Information Overload in Social Media and its Impact on Social Contagions

    Full text link
    Information overload has become an ubiquitous problem in modern society. Social media users and microbloggers receive an endless flow of information, often at a rate far higher than their cognitive abilities to process the information. In this paper, we conduct a large scale quantitative study of information overload and evaluate its impact on information dissemination in the Twitter social media site. We model social media users as information processing systems that queue incoming information according to some policies, process information from the queue at some unknown rates and decide to forward some of the incoming information to other users. We show how timestamped data about tweets received and forwarded by users can be used to uncover key properties of their queueing policies and estimate their information processing rates and limits. Such an understanding of users' information processing behaviors allows us to infer whether and to what extent users suffer from information overload. Our analysis provides empirical evidence of information processing limits for social media users and the prevalence of information overloading. The most active and popular social media users are often the ones that are overloaded. Moreover, we find that the rate at which users receive information impacts their processing behavior, including how they prioritize information from different sources, how much information they process, and how quickly they process information. Finally, the susceptibility of a social media user to social contagions depends crucially on the rate at which she receives information. An exposure to a piece of information, be it an idea, a convention or a product, is much less effective for users that receive information at higher rates, meaning they need more exposures to adopt a particular contagion.Comment: To appear at ICSWM '1

    Computational Social Scientist Beware: Simpson's Paradox in Behavioral Data

    Full text link
    Observational data about human behavior is often heterogeneous, i.e., generated by subgroups within the population under study that vary in size and behavior. Heterogeneity predisposes analysis to Simpson's paradox, whereby the trends observed in data that has been aggregated over the entire population may be substantially different from those of the underlying subgroups. I illustrate Simpson's paradox with several examples coming from studies of online behavior and show that aggregate response leads to wrong conclusions about the underlying individual behavior. I then present a simple method to test whether Simpson's paradox is affecting results of analysis. The presence of Simpson's paradox in social data suggests that important behavioral differences exist within the population, and failure to take these differences into account can distort the studies' findings.Comment: to appear in Journal of Computational Social Science
    corecore