    A framework for information dissemination in social networks using Hawkes processes

    We define in this paper a general Hawkes-based framework to model information diffusion in social networks. The proposed framework takes into consideration the hidden interactions between users as well as the interactions between contents and social networks, and can also accommodate dynamic social networks and various temporal effects of the diffusion, which provides a complete analysis of the hidden influences in social networks. This framework can be combined with topic modeling, for which modified collapsed Gibbs sampling and variational Bayes techniques are derived. We provide an estimation algorithm based on nonnegative tensor factorization techniques, which, together with a dimensionality reduction argument, can also discover the latent community structure of the social network. Finally, we provide numerical examples from real-life networks: a Game of Thrones dataset and a MemeTracker dataset.
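    For readers unfamiliar with Hawkes processes, the sketch below illustrates the basic idea that each past post temporarily raises the posting rate of the users it influences. It is a minimal illustration only, not the paper's framework: the exponential kernel, the decay rate `beta`, and the names `mu` and `alpha` are assumptions of this sketch, and the abstract does not specify them. It evaluates lambda_i(t) = mu_i + sum over past events (s, j) of alpha[j][i] * exp(-beta * (t - s)).

```python
import math


def hawkes_intensity(t, events, mu, alpha, beta):
    """Conditional intensity of every user at time t (illustrative sketch).

    events: list of (time, user) pairs, the posts observed so far.
    mu[i]: constant baseline posting rate of user i (assumed).
    alpha[j][i]: influence of user j's posts on user i (hypothetical name).
    beta: decay rate of the exponential kernel (assumed kernel shape).
    """
    lam = list(mu)  # start from the baseline rates
    for s, j in events:
        if s < t:  # only strictly earlier events contribute
            decay = math.exp(-beta * (t - s))
            for i in range(len(mu)):
                lam[i] += alpha[j][i] * decay
    return lam


# Two users: a post by user 0 at time 0 excites user 1 but not user 0.
rates = hawkes_intensity(
    t=1.0,
    events=[(0.0, 0)],
    mu=[0.1, 0.2],
    alpha=[[0.0, 0.5], [0.0, 0.0]],
    beta=1.0,
)
```

    In this toy setting the matrix `alpha` plays the role of the hidden user-to-user influences that the abstract says the framework estimates.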

    Sampling and Characterizing Evolving Communities in Social Networks

    One of the most important structures in social networks is communities. Understanding communities is useful in many applications, such as suggesting a friend for a user in an online friendship network, recommending a product for a user in an e-commerce network, etc. However, before studying anything about communities, researchers first need to collect appropriate data. Getting complete access to the data for community studies is unrealistic in most cases. In this work, we address the problem of crawling networks to identify community structure. Firstly, we present a network sampling technique to crawl the community structure of dynamic networks when there is a limitation on the number of nodes that can be queried. The process begins by obtaining a sample for the first time step. In subsequent time steps, the crawling process is guided by community structure discoveries made in the past. Experiments comparing the proposed approach with certain baseline techniques show that it achieves at least a 35% performance increase when the total query budget is fixed over the entire period and at least an 8% increase when the query budget is fixed per time step. Secondly, we propose a technique to sample communities in node-attributed edge streams when there is a limit on the maximum number of nodes that can be stored. The process learns whether the nodal information can characterize communities. If it can, the nodal information is leveraged together with the structural information to generate representative communities. If the nodal information does not characterize communities, only structural information is considered in assigning nodes to communities. The proposed approach provides a performance improvement of up to about 5 times that of baselines. Finally, we investigate factors that characterize the evolution of communities with respect to the number of active users. We perform this investigation on the Reddit social media platform.
We begin by analyzing individual conversations in one community and see how that generalizes to other communities. The first community studied is Reddit’s changemyview. The changemyview community, in addition to being a rich data source, has an interesting property: members whose views are changed award points to the users who successfully changed their minds. From the changemyview community, we observe that the linguistic style and interactions of members of the community can significantly differentiate susceptible and non-susceptible users. Next, we examine other communities (subreddits) and investigate how the user behaviors observed in changemyview relate to patterns of community evolution. We learn that the linguistic style and interactions of members in a community can also significantly differentiate the different parts of the evolution of the community with respect to the number of active users.

    A survey of statistical network models

    Networks are ubiquitous in science and have become a focal point for discussion in everyday life. Formal statistical models for the analysis of network data have emerged as a major topic of interest in diverse areas of study, and most of these involve a form of graphical representation. Probability models on graphs date back to 1959. Along with empirical studies in social psychology and sociology from the 1960s, these early works generated an active network community and a substantial literature in the 1970s. This effort moved into the statistical literature in the late 1970s and 1980s, and the past decade has seen a burgeoning network literature in statistical physics and computer science. The growth of the World Wide Web and the emergence of online networking communities such as Facebook, MySpace, and LinkedIn, and a host of more specialized professional network communities has intensified interest in the study of networks and network data. Our goal in this review is to provide the reader with an entry point to this burgeoning literature. We begin with an overview of the historical development of statistical network modeling and then we introduce a number of examples that have been studied in the network literature. Our subsequent discussion focuses on a number of prominent static and dynamic network models and their interconnections. We emphasize formal model descriptions, and pay special attention to the interpretation of parameters and their estimation. We end with a description of some open problems and challenges for machine learning and statistics. Comment: 96 pages, 14 figures, 333 references

    Hierarchical Stochastic Block Model for Community Detection in Multiplex Networks

    Multiplex networks have become increasingly prevalent in many fields, and have emerged as a powerful tool for modeling the complexity of real networks. There is a critical need for developing inference models for multiplex networks that can take into account potential dependencies across different layers, particularly when the aim is community detection. We add to a limited literature by proposing a novel and efficient Bayesian model for community detection in multiplex networks. A key feature of our approach is the ability to model varying communities at different network layers. In contrast, many existing models assume the same communities for all layers. Moreover, our model automatically picks up the necessary number of communities at each layer (as validated by real data examples). This is appealing, since deciding the number of communities is a challenging aspect of community detection, and especially so in the multiplex setting, if one allows the communities to change across layers. Borrowing ideas from hierarchical Bayesian modeling, we use a hierarchical Dirichlet prior to model community labels across layers, allowing dependency in their structure. Given the community labels, a stochastic block model (SBM) is assumed for each layer. We develop an efficient slice sampler for sampling the posterior distribution of the community labels as well as the link probabilities between communities. In doing so, we address some unique challenges posed by coupling the complex likelihood of SBM with the hierarchical nature of the prior on the labels. An extensive empirical validation is performed on simulated and real data, demonstrating the superior performance of the model over single-layer alternatives, as well as the ability to uncover interesting structures in real networks.
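    To make the per-layer SBM assumption concrete, the following sketch draws one undirected network layer given fixed community labels: each pair of nodes is linked independently with a probability that depends only on the two nodes' communities. It covers only this generative step. The hierarchical Dirichlet prior and the slice sampler described in the abstract are not implemented, and the names `sample_sbm_layer`, `labels`, and `block_probs` are hypothetical.

```python
import random


def sample_sbm_layer(labels, block_probs, rng):
    """Draw one undirected SBM layer as an adjacency matrix (sketch).

    labels[u]: community index of node u in this layer.
    block_probs[a][b]: link probability between communities a and b
                       (assumed symmetric).
    rng: a random.Random instance, passed in for reproducibility.
    """
    n = len(labels)
    adj = [[0] * n for _ in range(n)]
    for u in range(n):
        for v in range(u + 1, n):  # each unordered pair once, no self-loops
            p = block_probs[labels[u]][labels[v]]
            if rng.random() < p:
                adj[u][v] = adj[v][u] = 1
    return adj


# Degenerate example: links exist only within communities.
layer = sample_sbm_layer(
    labels=[0, 0, 1, 1],
    block_probs=[[1.0, 0.0], [0.0, 1.0]],
    rng=random.Random(0),
)
```

    In a multiplex setting, one such draw would be made per layer, each with its own label vector; this is where the abstract's model differs from approaches that share a single set of labels across all layers.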

    Information dynamics shape the networks of Internet-mediated prostitution

    Like many other social phenomena, prostitution is increasingly coordinated over the Internet. The online behavior affects the offline activity; the reverse is also true. We investigated the reported sexual contacts between 6,624 anonymous escorts and 10,106 sex buyers extracted from an online community from its beginning and six years on. These sexual encounters were also graded and categorized (in terms of the type of sexual activities performed) by the buyers. From the temporal, bipartite network of posts, we found a full feedback loop in which high grades on previous posts affect the future commercial success of the sex worker, and vice versa. We also found a peculiar growth pattern in which the turnover of community members and sex workers causes a sublinear preferential attachment. There is, moreover, a strong geographic influence on network structure: the network is geographically clustered but still close to connected, with the contacts consistent with the inverse-square law observed in trading patterns. We also found that the number of sellers scales sublinearly with city size, so this type of prostitution does not, comparatively speaking, benefit much from an increasing concentration of people.