A framework for information dissemination in social networks using Hawkes processes
We define in this paper a general Hawkes-based framework to model information diffusion in social networks. The proposed framework takes into consideration the hidden interactions between users as well as the interactions between contents and social networks, and can also accommodate dynamic social networks and various temporal effects of the diffusion, which provides a complete analysis of the hidden influences in social networks. This framework can be combined with topic modeling, for which modified collapsed Gibbs sampling and variational Bayes techniques are derived. We provide an estimation algorithm based on nonnegative tensor factorization techniques, which, together with a dimensionality reduction argument, is also able to discover the latent community structure of the social network. Finally, we provide numerical examples from real-life networks: a Game of Thrones dataset and a MemeTracker dataset.
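To make the idea of "hidden interactions between users" concrete, here is a minimal sketch of the conditional intensity of a multivariate Hawkes process. The exponential kernel, the function name `hawkes_intensity`, and the parameters `mu`, `alpha`, and `beta` are illustrative assumptions; the abstract does not specify the kernel or the parameterization used in the paper.

```python
import math

def hawkes_intensity(t, events, mu, alpha, beta):
    """Intensity of each user at time t for a multivariate Hawkes process.

    Assumed form (not taken from the paper):
        lambda_i(t) = mu[i] + sum over past events (s, j) of
                      alpha[i][j] * exp(-beta * (t - s))

    events: list of (time, user_index) pairs
    mu:     baseline rate per user
    alpha:  alpha[i][j] is the influence of user j on user i
    beta:   decay rate of the exponential kernel
    """
    n = len(mu)
    lam = list(mu)  # start from the baseline rates
    for s, j in events:
        if s < t:  # only strictly earlier events excite the process
            decay = math.exp(-beta * (t - s))
            for i in range(n):
                lam[i] += alpha[i][j] * decay
    return lam

# Hypothetical two-user example: user 0 posts at time 0.0.
mu = [0.1, 0.2]
alpha = [[0.0, 0.5],
         [0.3, 0.0]]
lam = hawkes_intensity(1.0, [(0.0, 0)], mu, alpha, beta=1.0)
```

In this toy setup the post by user 0 raises only the intensity of user 1, because `alpha[1][0]` is the sole nonzero entry in column 0. Estimating a matrix like `alpha` from observed event times is the kind of inference problem the abstract refers to.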
SAMPLING AND CHARACTERIZING EVOLVING COMMUNITIES IN SOCIAL NETWORKS
One of the most important structures in social networks is communities. Understanding communities is useful in many applications, such as suggesting a friend for a user in an online friendship network, recommending a product for a user in an e-commerce network, etc. However, before studying anything about communities, researchers first need to collect appropriate data. Getting complete access to the data for community studies is unrealistic in most cases. In this work, we address the problem of crawling networks to identify community structure. Firstly, we present a network sampling technique to crawl the community structure of dynamic networks when there is a limitation on the number of nodes that can be queried. The process begins by obtaining a sample for the first time step. In subsequent time steps, the crawling process is guided by community structure discoveries made in the past. Experiments conducted on the proposed approach and certain baseline techniques reveal the proposed approach has at least a 35% performance increase in cases when the total query budget is fixed over the entire period and at least an 8% increase in cases when the query budget is fixed per time step. Secondly, we propose a sampling technique to sample communities in node-attributed edge streams when there is a limit on the maximum number of nodes that can be stored. The process learns if the nodal information can characterize communities. The nodal information is leveraged with the structural information to generate representative communities. If the nodal information does not characterize communities, only structural information is considered in assigning nodes to communities. The proposed approach provides a performance improvement of up to about 5 times that of baselines. Finally, we investigate factors that characterize the evolution of communities with respect to the number of active users. We perform this investigation on the Reddit social media platform.
We begin by analyzing individual conversations in one community and then examine how the findings generalize to other communities. The first community studied is Reddit’s changemyview. The changemyview community, in addition to its rich data source, has an interesting property where members whose views are changed award points to users that successfully changed their minds. From the changemyview community, we observe that the linguistic style and interactions of members of the community can significantly differentiate susceptible and non-susceptible users. Next, we examine other communities (subreddits), and investigate how the user behaviors observed from changemyview relate to patterns of community evolution. We learn that the linguistic style and interactions of members in a community can also significantly differentiate the different parts of the evolution of the community with respect to number of active users.
A survey of statistical network models
Networks are ubiquitous in science and have become a focal point for
discussion in everyday life. Formal statistical models for the analysis of
network data have emerged as a major topic of interest in diverse areas of
study, and most of these involve a form of graphical representation.
Probability models on graphs date back to 1959. Along with empirical studies in
social psychology and sociology from the 1960s, these early works generated an
active network community and a substantial literature in the 1970s. This effort
moved into the statistical literature in the late 1970s and 1980s, and the past
decade has seen a burgeoning network literature in statistical physics and
computer science. The growth of the World Wide Web and the emergence of online
networking communities such as Facebook, MySpace, and LinkedIn, and a host of
more specialized professional network communities has intensified interest in
the study of networks and network data. Our goal in this review is to provide
the reader with an entry point to this burgeoning literature. We begin with an
overview of the historical development of statistical network modeling and then
we introduce a number of examples that have been studied in the network
literature. Our subsequent discussion focuses on a number of prominent static
and dynamic network models and their interconnections. We emphasize formal
model descriptions, and pay special attention to the interpretation of
parameters and their estimation. We end with a description of some open
problems and challenges for machine learning and statistics. Comment: 96 pages, 14 figures, 333 references
Hierarchical Stochastic Block Model for Community Detection in Multiplex Networks
Multiplex networks have become increasingly prevalent in many fields,
and have emerged as a powerful tool for modeling the complexity of real
networks. There is a critical need for developing inference models for
multiplex networks that can take into account potential dependencies across
different layers, particularly when the aim is community detection. We add to a
limited literature by proposing a novel and efficient Bayesian model for
community detection in multiplex networks. A key feature of our approach is the
ability to model varying communities at different network layers. In contrast,
many existing models assume the same communities for all layers. Moreover, our
model automatically picks up the necessary number of communities at each layer
(as validated by real data examples). This is appealing, since deciding the
number of communities is a challenging aspect of community detection, and
especially so in the multiplex setting, if one allows the communities to change
across layers. Borrowing ideas from hierarchical Bayesian modeling, we use a
hierarchical Dirichlet prior to model community labels across layers, allowing
dependency in their structure. Given the community labels, a stochastic block
model (SBM) is assumed for each layer. We develop an efficient slice sampler
for sampling the posterior distribution of the community labels as well as the
link probabilities between communities. In doing so, we address some unique
challenges posed by coupling the complex likelihood of SBM with the
hierarchical nature of the prior on the labels. An extensive empirical
validation is performed on simulated and real data, demonstrating the superior
performance of the model over single-layer alternatives, as well as the ability
to uncover interesting structures in real networks.
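As a rough illustration of the generative step "given the community labels, a stochastic block model (SBM) is assumed for each layer", the sketch below samples the edges of one undirected layer from fixed labels and a block probability matrix. The function name `sample_sbm` and its parameters are hypothetical; the hierarchical Dirichlet prior and the slice sampler described in the abstract are not implemented here.

```python
import random

def sample_sbm(labels, block_probs, seed=0):
    """Sample an undirected, single-layer SBM graph.

    labels:      labels[u] is the community index of node u
    block_probs: block_probs[a][b] is the probability of an edge between
                 a node in community a and a node in community b
    Returns a list of edges (u, v) with u < v.
    """
    rng = random.Random(seed)  # local generator for reproducibility
    n = len(labels)
    edges = []
    for u in range(n):
        for v in range(u + 1, n):
            p = block_probs[labels[u]][labels[v]]
            # Each pair is connected independently with probability p.
            if rng.random() < p:
                edges.append((u, v))
    return edges

# Two fully connected communities with no edges between them.
edges = sample_sbm([0, 0, 1, 1], [[1.0, 0.0], [0.0, 1.0]])
```

For a multiplex network, one would call such a function once per layer, each with that layer's own labels and block matrix. The abstract's model differs from this sketch in that the labels are latent and coupled across layers through the prior.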
Information dynamics shape the networks of Internet-mediated prostitution
Like many other social phenomena, prostitution is increasingly coordinated
over the Internet. The online behavior affects the offline activity; the
reverse is also true. We investigated the reported sexual contacts between
6,624 anonymous escorts and 10,106 sex-buyers extracted from an online
community from its beginning and six years on. These sexual encounters were
also graded and categorized (in terms of the type of sexual activities
performed) by the buyers. From the temporal, bipartite network of posts, we
found a full feedback loop in which high grades on previous posts affect the
future commercial success of the sex-worker, and vice versa. We also found a
peculiar growth pattern in which the turnover of community members and sex
workers causes a sublinear preferential attachment. There is, moreover, a
strong geographic influence on network structure: the network is geographically
clustered but still close to connected, the contacts consistent with the
inverse-square law observed in trading patterns. We also found that the number
of sellers scales sublinearly with city size, so this type of prostitution does
not, comparatively speaking, benefit much from an increasing concentration of
people.