13,993 research outputs found

    Toward Order-of-Magnitude Cascade Prediction

    Full text link
    When a piece of information (microblog, photograph, video, link, etc.) starts to spread in a social network, an important question arises: will it spread to "viral" proportions -- where "viral" is defined as an order-of-magnitude increase. However, several previous studies have established that cascade size and frequency are related through a power-law - which leads to a severe imbalance in this classification problem. In this paper, we devise a suite of measurements based on "structural diversity" -- the variety of social contexts (communities) in which individuals partaking in a given cascade engage. We demonstrate these measures are able to distinguish viral from non-viral cascades, despite the severe imbalance of the data for this problem. Further, we leverage these measurements as features in a classification approach, successfully predicting microblogs that grow from 50 to 500 reposts with precision of 0.69 and recall of 0.52 for the viral class - despite this class comprising under 2\% of samples. This significantly outperforms our baseline approach as well as the current state-of-the-art. Our work also demonstrates how we can tradeoff between precision and recall.Comment: 4 pages, 15 figures, ASONAM 2015 poster pape

    Early Warning Analysis for Social Diffusion Events

    Get PDF
    There is considerable interest in developing predictive capabilities for social diffusion processes, for instance to permit early identification of emerging contentious situations, rapid detection of disease outbreaks, or accurate forecasting of the ultimate reach of potentially viral ideas or behaviors. This paper proposes a new approach to this predictive analytics problem, in which analysis of meso-scale network dynamics is leveraged to generate useful predictions for complex social phenomena. We begin by deriving a stochastic hybrid dynamical systems (S-HDS) model for diffusion processes taking place over social networks with realistic topologies; this modeling approach is inspired by recent work in biology demonstrating that S-HDS offer a useful mathematical formalism with which to represent complex, multi-scale biological network dynamics. We then perform formal stochastic reachability analysis with this S-HDS model and conclude that the outcomes of social diffusion processes may depend crucially upon the way the early dynamics of the process interacts with the underlying network's community structure and core-periphery structure. This theoretical finding provides the foundations for developing a machine learning algorithm that enables accurate early warning analysis for social diffusion events. The utility of the warning algorithm, and the power of network-based predictive metrics, are demonstrated through an empirical investigation of the propagation of political memes over social media networks. Additionally, we illustrate the potential of the approach for security informatics applications through case studies involving early warning analysis of large-scale protests events and politically-motivated cyber attacks

    Governing Cascade Failures in Complex Social-Ecological-Technological Systems: Framing Context, Strategies, and Challenges

    Get PDF
    Cascade failures are events in networked systems with interconnected components in which failure of one or a few parts triggers the failure of other parts, which triggers the failure of more parts, and so on. Cascade failures occur in a wide variety of familiar systems, such as electric power distribution grids, transportation systems, financial systems, and ecosystems. Cascade failures have plagued society for centuries. However, modern social-ecological-technological systems (SETS) have become vast, fast moving, and highly interconnected, exposing these systems to cascade failures of potentially global proportions, spreading at breathtaking speed, and imposing catastrophic harms. The increasing potential for cascade failures of the magnitude of the 2008 financial system collapse, which had a truly global reach and affected systems well beyond finance, screams out for clear thinking about governing vulnerability to cascade failures in SETS. Yet, legal scholarship on the theme is essentially nil, and a more comprehensive, generalizable governance theory leveraging knowledge from scientific research on cascade failures has not emerged. Research initiatives are needed to forge ground on three fronts: (1) system modeling and monitoring; (2) event prediction; and (3) event prevention, response, and recovery. This Article is a first step in that direction. Part I outlines the cascade failures problem. Part II summarizes the scientific research on cascade failures. Part III identifies strategies for controlling cascade failures. Part IV explores the governance challenges of deploying those various strategies in large-scale SETS. Part Vextends the analysis to the special case of cascade failures within ecological systems and the difficulties of managing them through the strategies coming out of cascade failures science. Lastly, Part VI suggests directions of future research on governance of cascade failures

    Spatial evolution of human dialects

    Get PDF
    The geographical pattern of human dialects is a result of history. Here, we formulate a simple spatial model of language change which shows that the final result of this historical evolution may, to some extent, be predictable. The model shows that the boundaries of language dialect regions are controlled by a length minimizing effect analogous to surface tension, mediated by variations in population density which can induce curvature, and by the shape of coastline or similar borders. The predictability of dialect regions arises because these effects will drive many complex, randomized early states toward one of a smaller number of stable final configurations. The model is able to reproduce observations and predictions of dialectologists. These include dialect continua, isogloss bundling, fanning, the wave-like spread of dialect features from cities, and the impact of human movement on the number of dialects that an area can support. The model also provides an analytical form for S\'{e}guy's Curve giving the relationship between geographical and linguistic distance, and a generalisation of the curve to account for the presence of a population centre. A simple modification allows us to analytically characterize the variation of language use by age in an area undergoing linguistic change

    Will This Paper Increase Your h-index? Scientific Impact Prediction

    Full text link
    Scientific impact plays a central role in the evaluation of the output of scholars, departments, and institutions. A widely used measure of scientific impact is citations, with a growing body of literature focused on predicting the number of citations obtained by any given publication. The effectiveness of such predictions, however, is fundamentally limited by the power-law distribution of citations, whereby publications with few citations are extremely common and publications with many citations are relatively rare. Given this limitation, in this work we instead address a related question asked by many academic researchers in the course of writing a paper, namely: "Will this paper increase my h-index?" Using a real academic dataset with over 1.7 million authors, 2 million papers, and 8 million citation relationships from the premier online academic service ArnetMiner, we formalize a novel scientific impact prediction problem to examine several factors that can drive a paper to increase the primary author's h-index. We find that the researcher's authority on the publication topic and the venue in which the paper is published are crucial factors to the increase of the primary author's h-index, while the topic popularity and the co-authors' h-indices are of surprisingly little relevance. By leveraging relevant factors, we find a greater than 87.5% potential predictability for whether a paper will contribute to an author's h-index within five years. As a further experiment, we generate a self-prediction for this paper, estimating that there is a 76% probability that it will contribute to the h-index of the co-author with the highest current h-index in five years. We conclude that our findings on the quantification of scientific impact can help researchers to expand their influence and more effectively leverage their position of "standing on the shoulders of giants."Comment: Proc. of the 8th ACM International Conference on Web Search and Data Mining (WSDM'15
    • …
    corecore