Intrinsically Dynamic Network Communities
Community finding algorithms for networks have recently been extended to
dynamic data. Most of these recent methods aim at exhibiting community
partitions from successive graph snapshots and thereafter connecting or
smoothing these partitions using clever time-dependent features and sampling
techniques. These approaches are nonetheless achieving longitudinal rather than
dynamic community detection. We assume that communities are fundamentally
defined by the repetition of interactions among a set of nodes over time.
According to this definition, analyzing the data by considering successive
snapshots induces a significant loss of information: we suggest that it blurs
essentially dynamic phenomena - such as communities based on repeated
inter-temporal interactions, nodes switching from one community to another across
time, or the possibility that a community survives while its members are being
entirely replaced over a longer time period. We propose a formalism which
aims at tackling this issue in the context of time-directed datasets (such as
citation networks), and present several illustrations on both empirical and
synthetic dynamic networks. Finally, we introduce intrinsically dynamic
metrics to qualify temporal community structure and emphasize their possible
role as an estimator of the quality of the community detection - taking into
account the fact that various empirical contexts may call for distinct
`community' definitions and detection criteria.
Comment: 27 pages, 11 figures
Generating constrained random graphs using multiple edge switches
The generation of random graphs using edge swaps provides a reliable method
to draw uniformly random samples of sets of graphs respecting some simple
constraints, e.g. degree distributions. However, in general, it is not
necessarily possible to access all graphs obeying some given constraints
through a classical switching procedure calling on pairs of edges. We therefore
propose to get round this issue by generalizing this classical approach through
the use of higher-order edge switches. This method, which we denote by "k-edge
switching", makes it possible to progressively improve the covered portion of
a set of constrained graphs, thereby providing an increasing, asymptotically
certain confidence in the statistical representativeness of the obtained
sample.
Comment: 15 pages
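The classical pairwise switch that the abstract generalizes can be sketched as follows. This is only the standard 2-edge switch on a simple undirected graph, not the authors' k-edge switching; the function names `edge_switch` and `degrees` are hypothetical and chosen for illustration.

```python
import random

def degrees(edges):
    """Return a dict mapping each node to its degree."""
    deg = {}
    for u, v in edges:
        deg[u] = deg.get(u, 0) + 1
        deg[v] = deg.get(v, 0) + 1
    return deg

def edge_switch(edges, n_attempts, seed=0):
    """Randomize a simple undirected graph with pairwise edge switches.

    Each attempt picks two edges (a, b) and (c, d) and proposes replacing
    them with (a, d) and (c, b). The proposal is rejected if it would
    create a self-loop or a duplicate edge, so degrees are preserved and
    the graph stays simple.
    """
    rng = random.Random(seed)
    edge_list = [tuple(sorted(e)) for e in edges]
    edge_set = set(edge_list)
    for _ in range(n_attempts):
        i, j = rng.sample(range(len(edge_list)), 2)
        a, b = edge_list[i]
        c, d = edge_list[j]
        if rng.random() < 0.5:  # consider both orientations of the second edge
            c, d = d, c
        if a == d or c == b:  # would create a self-loop
            continue
        new1 = tuple(sorted((a, d)))
        new2 = tuple(sorted((c, b)))
        if new1 in edge_set or new2 in edge_set:  # would create a multi-edge
            continue
        edge_set -= {edge_list[i], edge_list[j]}
        edge_set |= {new1, new2}
        edge_list[i], edge_list[j] = new1, new2
    return edge_list
```

Rejected attempts leave the graph unchanged, which is what makes some constrained graph sets unreachable with pairs of edges alone and motivates higher-order switches.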
A data-driven analysis to question epidemic models for citation cascades on the blogosphere
Citation cascades in blog networks are often considered as traces of
information spreading on this social medium. In this work, we question this
point of view using both a structural and a semantic analysis of five months
of activity of the most representative blogs of the French-speaking
community. Statistical measures reveal that our dataset shares many features
with those that can be found in the literature, suggesting the existence of an
identical underlying process. However, a closer analysis of the post content
indicates that the popular epidemic-like descriptions of cascades are
misleading in this context. A basic model, taking into account only the behavior
of bloggers and their restricted social network, accounts for several important
statistical features of the data. These arguments support the idea that
the primary goal of citations may not be information spreading on the blogosphere.
Comment: 18 pages, 9 figures, to be published in ICWSM-13 proceedings
Internal links and pairs as a new tool for the analysis of bipartite complex networks
Many real-world complex networks are best modeled as bipartite (or 2-mode)
graphs, where nodes are divided into two sets with links connecting one side to
the other. However, there is currently a lack of methods to properly analyze
such graphs, as most existing measures and methods are suited to classical
graphs. A usual but limited approach consists in deriving 1-mode graphs (called
projections) from the underlying bipartite structure, though this causes
a significant loss of information and data storage issues. We introduce here
internal links and pairs as a new notion useful for such analysis: it gives
insights into the information lost by projecting the bipartite graph. We
illustrate the relevance of these concepts on several real-world instances,
showing how they make it possible to discriminate behaviors among various cases when
we compare them to a benchmark of random networks. Then, we show that we can
benefit from these concepts for both modeling complex networks and storing
them in a compact format.
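A minimal sketch of projection and of one possible reading of "internal link" follows. The abstract does not define the notion, so the definition used here is an assumption: a bipartite link is treated as internal when removing it leaves the projection unchanged. The names `project` and `internal_links` are hypothetical.

```python
from itertools import combinations

def project(links):
    """Project bipartite (top, bottom) links onto the bottom nodes.

    Two bottom nodes are linked in the projection if they share at
    least one top neighbor.
    """
    by_top = {}
    for top, bottom in links:
        by_top.setdefault(top, set()).add(bottom)
    projection = set()
    for bottoms in by_top.values():
        for u, v in combinations(sorted(bottoms), 2):
            projection.add((u, v))
    return projection

def internal_links(links):
    """Return the links whose removal does not change the projection.

    ASSUMPTION: this is an illustrative definition of an internal link,
    not one stated in the abstract above.
    """
    links = set(links)
    full = project(links)
    return {link for link in links if project(links - {link}) == full}

# Top nodes "A" and "B", bottom nodes 1, 2, 3.
example = [("A", 1), ("A", 2), ("A", 3), ("B", 1), ("B", 2)]
```

In `example`, both links of "B" are redundant for the projection because "A" already connects nodes 1 and 2, which illustrates the kind of information a projection discards.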
Predicting interactions between individuals with structural and dynamical information
Capturing both the structural and temporal aspects of interactions is crucial
for many real-world datasets, such as contacts between individuals. Using the link
stream formalism to capture the dynamics of the system, we tackle the issue of
activity prediction in link streams, that is to say, predicting the number of
links occurring during a given period of time. We present a protocol that
takes advantage of the temporal and structural information contained in the
link stream. Using a supervised learning method, we are able to model the
dynamics of our system to improve the prediction. We investigate the behavior of
our algorithm and the crucial elements affecting the prediction. By introducing
different categories of node pairs, we are able to improve the quality as
well as increase the diversity of our predictions.
Interaction Prediction Problems in Link Streams
The problems of link prediction and recovery have been the focus of much work during the last 10 years. This is because these questions have a large number of practical implications, ranging from detecting spam emails to predicting which item is selected by which user in a recommendation system. However, considering the highly dynamic nature of complex networks, there is rising interest in knowing not only who will interact with whom, but also when. For example, when trying to control the spreading of a virus in a population, it is important to know whether an individual is bound to have a lot of new contacts before or after being infected. In that sense, this question is located at the crossroads of link prediction and another family of problems which has been widely dealt with in the literature, that is, time-series prediction. We name it the interaction prediction problem in link streams. It calls for the definition of specific features, strategies, and evaluation methods to capture both the structural and temporal aspects of the interactions. In this chapter, we propose a general formulation of the problem, consistent with the link stream formalism, which formally represents the streaming sequence of interactions between the elements of the system. Using this framework, we discuss the formulation of the interaction prediction problem and propose possible strategies to address it.
Combining structural and dynamic information to predict activity in link streams
A link stream is a sequence of triplets (t, u, v) meaning that nodes u and v have interacted at time t. Capturing both the structural and temporal aspects of interactions is crucial for many real-world datasets, such as contacts between individuals. We tackle the issue of activity prediction in link streams, that is to say, predicting the number of links occurring during a given period of time, and we present a protocol that takes advantage of the temporal and structural information contained in the link stream. We introduce a way to represent the information captured using different features and combine them in a prediction function which is used to evaluate the future activity of links.
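The triplet representation and the notion of activity can be sketched as follows. The predictor shown is a deliberately naive baseline (extrapolating each pair's observed rate), introduced here only for illustration; it is not the protocol or prediction function described in the abstract, and the names `activity` and `predict_activity` are hypothetical.

```python
from collections import Counter

def activity(stream, start, end):
    """Count links per unordered node pair with start <= t < end.

    `stream` is an iterable of (t, u, v) triplets.
    """
    counts = Counter()
    for t, u, v in stream:
        if start <= t < end:
            counts[tuple(sorted((u, v)))] += 1
    return counts

def predict_activity(stream, obs_start, obs_end, horizon):
    """Naive baseline: assume each pair keeps its observed link rate.

    Returns the expected number of links per pair over a prediction
    period of length `horizon`.
    """
    observed = activity(stream, obs_start, obs_end)
    scale = horizon / (obs_end - obs_start)
    return {pair: n * scale for pair, n in observed.items()}

stream = [(1, "a", "b"), (2, "b", "a"), (3, "a", "c"), (12, "a", "b")]
```

A baseline of this kind uses only temporal information about each pair in isolation; the abstract's point is that structural features of the stream can be combined with it.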
LSCPM: communities in massive real-world Link Streams by Clique Percolation Method
Community detection is a popular approach to understand the organization of
interactions in static networks. For that purpose, the Clique Percolation
Method (CPM), which involves the percolation of k-cliques, is a well-studied
technique that offers several advantages. In addition, studying interactions that
occur over time, which can be modeled by the link stream formalism, is useful
in various contexts. The Dynamic Clique Percolation Method (DCPM) has been
proposed for extending CPM to temporal networks.
However, existing implementations are unable to handle massive datasets. We
present a novel algorithm that adapts CPM to link streams and speeds up
computation relative to
the existing DCPM method. We evaluate it experimentally on real datasets and
show that it scales to massive link streams. For example, it obtains a
complete set of communities in under twenty-five minutes for a dataset with
thirty million links, which the state of the art fails to achieve even after a
week of computation. We further show that our method provides communities
similar to DCPM, but slightly more aggregated. We exhibit the relevance of the
obtained communities in real world cases, and show that they provide
information on the importance of vertices in the link streams.
Comment: 18 pages, 7 figures, to be published in 30th International Symposium
on Temporal Representation and Reasoning (TIME 2023)
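For reference, the static Clique Percolation Method mentioned in the abstract can be sketched as below: a community is the union of k-cliques that can be reached from one another through k-cliques sharing k-1 nodes. This brute-force version handles small static graphs only and is neither LSCPM nor DCPM; the function name `k_clique_communities` is hypothetical.

```python
from itertools import combinations

def k_clique_communities(edges, k):
    """Static CPM by brute force on a small undirected graph.

    Enumerates every k-clique, merges cliques that share k-1 nodes
    with a union-find structure, and returns the communities as a
    sorted list of sorted node lists.
    """
    adj = {}
    for u, v in edges:
        adj.setdefault(u, set()).add(v)
        adj.setdefault(v, set()).add(u)

    # Every k-subset of nodes whose pairs are all adjacent is a k-clique.
    cliques = [
        frozenset(c)
        for c in combinations(sorted(adj), k)
        if all(b in adj[a] for a, b in combinations(c, 2))
    ]

    parent = list(range(len(cliques)))

    def find(i):
        while parent[i] != i:
            parent[i] = parent[parent[i]]  # path halving
            i = parent[i]
        return i

    # Two k-cliques are adjacent when they share exactly k-1 nodes.
    for i, j in combinations(range(len(cliques)), 2):
        if len(cliques[i] & cliques[j]) == k - 1:
            parent[find(i)] = find(j)

    groups = {}
    for i, clique in enumerate(cliques):
        groups.setdefault(find(i), set()).update(clique)
    return sorted(sorted(g) for g in groups.values())

edges = [(1, 2), (1, 3), (2, 3), (2, 4), (3, 4),
         (4, 5), (5, 6), (5, 7), (6, 7)]
```

Enumerating all k-subsets is exponential in practice, which is one way to see why scaling clique percolation to massive temporal data requires a dedicated algorithm.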
RankMerging: A supervised learning-to-rank framework to predict links in large social networks
Uncovering unknown or missing links in social networks is a difficult task
because of their sparsity and because links may represent different types of
relationships, characterized by different structural patterns. In this paper,
we define a simple yet efficient supervised learning-to-rank framework, called
RankMerging, which aims at combining information provided by various
unsupervised rankings. We illustrate our method on three different kinds of
social networks and show that it substantially improves the performance of
unsupervised ranking metrics. We also compare it to other combination
strategies based on standard methods. Finally, we explore various aspects of
RankMerging, such as feature selection and parameter estimation, and discuss its
area of relevance: the prediction of an adjustable number of links on large
networks.
Comment: 43 pages, published in Machine Learning Journal