411 research outputs found
Efficient Truss Maintenance in Evolving Networks
Truss was proposed to study social network data represented by graphs. A
k-truss of a graph is a cohesive subgraph, in which each edge is contained in
at least k-2 triangles within the subgraph. While truss has been shown to be
a superior model of the close relationships in social networks, and efficient
algorithms for finding trusses have been extensively studied, very little
attention has been paid to truss maintenance. However, most social networks are
evolving networks. It may be infeasible to recompute trusses from scratch from
time to time in order to find the up-to-date k-trusses in the evolving
networks. In this paper, we discuss how to maintain trusses in a graph with
dynamic updates. We first discuss a set of properties for maintaining trusses,
then propose algorithms for maintaining trusses under edge deletions and
insertions, and finally discuss truss index maintenance. We test the proposed
techniques on real datasets. The experimental results show the promise of our
work.
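To make the k-truss definition above concrete, the following minimal Python sketch computes a k-truss from scratch by repeatedly deleting edges that lie in fewer than k-2 triangles. It is the recompute-from-scratch baseline that the abstract argues against repeating, not the paper's maintenance algorithm; the function name k_truss and the edge-list input format are assumptions made for illustration.

```python
def k_truss(edges, k):
    """Return the edges of the k-truss as a set of frozensets.

    Illustrative baseline only: peel edges whose support (number of
    triangles containing the edge) is below k - 2 until none remain.
    """
    adj = {}
    for u, v in edges:
        if u != v:  # ignore self-loops
            adj.setdefault(u, set()).add(v)
            adj.setdefault(v, set()).add(u)

    changed = True
    while changed:
        changed = False
        for u in list(adj):
            for v in list(adj[u]):
                # Common neighbours of u and v = triangles through edge (u, v).
                if len(adj[u] & adj[v]) < k - 2:
                    adj[u].discard(v)
                    adj[v].discard(u)
                    changed = True

    return {frozenset((u, v)) for u in adj for v in adj[u]}


# A 4-clique on nodes 1..4 with one pendant edge (4, 5).
example_edges = [(1, 2), (1, 3), (1, 4), (2, 3), (2, 4), (3, 4), (4, 5)]
truss_4 = k_truss(example_edges, 4)
```

In this example the pendant edge lies in no triangle, so it is peeled away for any k above 2, while each clique edge lies in two triangles and survives for k up to 4.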
Egomunities, Exploring Socially Cohesive Person-based Communities
In the last few years, there has been a great interest in detecting
overlapping communities in complex networks, which are understood as dense
groups of nodes featuring a low outbound density. To date, most methods used to
compute such communities stem from the field of disjoint community detection by
either extending the concept of modularity to an overlapping context or by
attempting to decompose the whole set of nodes into several possibly
overlapping subsets. In this report we take an orthogonal approach by
introducing a metric, the cohesion, rooted in sociological considerations. The
cohesion quantifies the community-ness of one given set of nodes, based on the
notions of triangles - triplets of connected nodes - and weak ties, instead of
the classical view using only edge density. A set of nodes has a high cohesion
if it features a high density of triangles and intersects few triangles with
the rest of the network. As such, we introduce a numerical characterization of
communities: sets of nodes featuring a high cohesion. We then present a new
approach to the problem of overlapping communities by introducing the concept
of ego-munities, which are subjective communities centered around a given node,
specifically inside its neighborhood. We build upon the cohesion to construct a
heuristic algorithm which outputs a node's ego-munities by attempting to
maximize their cohesion. We illustrate the pertinence of our method with a
detailed description of one person's ego-munities among Facebook friends. We
finally conclude by describing promising applications of ego-munities such as
information inference and interest recommendations, and present a possible
extension to cohesion in the case of weighted networks.
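The abstract describes cohesion only in words: a high density of triangles inside the set, and few triangles shared with the rest of the network. The Python sketch below is one plausible way to turn that description into a number. The exact formula is not given in the abstract, so the product used here (triangle density inside the set, times the fraction of triangles touching the set that are fully inside it) is an assumption, as are the function name cohesion and the adjacency-dict input.

```python
from itertools import combinations
from math import comb


def cohesion(adj, nodes):
    """Assumed cohesion score of a node set, in [0, 1].

    adj maps each node to the set of its neighbours (undirected graph).
    """
    s = set(nodes)
    if len(s) < 3:
        return 0.0

    # Triangles with all three nodes inside the set.
    t_in = sum(
        1
        for a, b, c in combinations(list(s), 3)
        if b in adj[a] and c in adj[a] and c in adj[b]
    )
    if t_in == 0:
        return 0.0

    # Outbound triangles: an edge inside the set closed by a node outside it.
    t_out = 0
    for a, b in combinations(list(s), 2):
        if b in adj[a]:
            t_out += len((adj[a] & adj[b]) - s)

    density = t_in / comb(len(s), 3)
    isolation = t_in / (t_in + t_out)
    return density * isolation


# Triangle 1-2-3, plus node 4 attached to both 1 and 2.
example_adj = {1: {2, 3, 4}, 2: {1, 3, 4}, 3: {1, 2}, 4: {1, 2}}
score = cohesion(example_adj, {1, 2, 3})
```

Under this assumed formula an isolated triangle scores 1.0, and the score drops as more triangles straddle the boundary of the set.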
Intrinsically Dynamic Network Communities
Community finding algorithms for networks have recently been extended to
dynamic data. Most of these recent methods aim at exhibiting community
partitions from successive graph snapshots and thereafter connecting or
smoothing these partitions using clever time-dependent features and sampling
techniques. These approaches are nonetheless achieving longitudinal rather than
dynamic community detection. We assume that communities are fundamentally
defined by the repetition of interactions among a set of nodes over time.
According to this definition, analyzing the data by considering successive
snapshots induces a significant loss of information: we suggest that it blurs
essentially dynamic phenomena - such as communities based on repeated
inter-temporal interactions, nodes switching from a community to another across
time, or the possibility that a community survives while its members are being
integrally replaced over a longer time period. We propose a formalism which
aims at tackling this issue in the context of time-directed datasets (such as
citation networks), and present several illustrations on both empirical and
synthetic dynamic networks. We finally introduce intrinsically dynamic
metrics to qualify temporal community structure and emphasize their possible
role as an estimator of the quality of the community detection - taking into
account the fact that various empirical contexts may call for distinct
`community' definitions and detection criteria. Comment: 27 pages, 11 figures
Analyzing the Facebook Friendship Graph
Online Social Networks (OSNs) have in recent years gained
huge and increasing popularity as one of the most important emerging Web phenomena, deeply modifying the behavior of users and contributing to build a solid substrate of connections and relationships among people using the Web. In this preliminary work, our purpose is to analyze Facebook, considering a significant sample of data reflecting relationships among subscribed users. Our goal is to extract, from this platform, relevant information about the distribution of these relations and exploit tools and algorithms provided by Social Network Analysis (SNA) to discover and, possibly, understand underlying similarities
between the development of OSNs and real-life social networks.
Analysis of category co-occurrence in Wikipedia networks
Wikipedia has seen a huge expansion of content since its inception. Pages within this online
encyclopedia are organised by assigning them to one or more categories, where Wikipedia
maintains a manually constructed taxonomy graph that encodes the semantic relationship
between these categories. An alternative, called the category co-occurrence graph, can be
produced automatically by linking together categories that have pages in common. Properties
of the latter graph and its relationship to the former are the concern of this thesis.
The analytic framework, called t-component, is introduced to formalise the graphs and
discover category clusters connecting relevant categories together. The m-core, a cohesive
subgroup concept as a clustering model, is used to construct a subgraph depending on the
number of shared pages between the categories exceeding a given threshold t. The significance
of the clustering result of the m-core is validated using a permutation test. This is compared
to the k-core, another clustering model.
The Wikipedia category co-occurrence graphs are scale-free with a few category hubs and
the majority of clusters are of size 2. All observed properties for the distribution of the largest
clusters of the category graphs obey power-laws with decay exponent averages around 1.
As the threshold t of the number of shared pages is increased, eventually a critical threshold
is reached when the largest cluster shrinks significantly in size. This phenomenon is only
exhibited for the m-core but not the k-core. Lastly, the clustering in the category graph
is shown to be consistent with the distance between categories in the taxonomy graph.
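The construction described above (link categories that share pages, keep only links whose shared-page count exceeds a threshold t, then read off clusters) can be sketched in Python as follows. This covers only the thresholding step followed by plain connected components; it is not the thesis's m-core or t-component definition, and the function name cooccurrence_clusters and the page-to-categories dict are hypothetical.

```python
from collections import Counter
from itertools import combinations


def cooccurrence_clusters(page_categories, t):
    """Cluster categories whose shared-page count exceeds t.

    page_categories maps a page title to an iterable of its categories.
    Returns a list of sets, one per connected component.
    """
    # Count pages shared by each unordered pair of categories.
    shared = Counter()
    for cats in page_categories.values():
        for a, b in combinations(sorted(set(cats)), 2):
            shared[(a, b)] += 1

    # Keep only links with strictly more than t shared pages.
    adj = {}
    for (a, b), count in shared.items():
        if count > t:
            adj.setdefault(a, set()).add(b)
            adj.setdefault(b, set()).add(a)

    # Connected components by depth-first search.
    clusters, seen = [], set()
    for start in adj:
        if start in seen:
            continue
        component, stack = set(), [start]
        while stack:
            node = stack.pop()
            if node not in component:
                component.add(node)
                stack.extend(adj[node] - component)
        seen |= component
        clusters.append(component)
    return clusters


example_pages = {
    "p1": ["A", "B"], "p2": ["A", "B"], "p3": ["B", "C"],
    "p4": ["C", "D"], "p5": ["C", "D"],
}
clusters_t1 = cooccurrence_clusters(example_pages, 1)
```

Raising t from 0 to 1 in this toy input drops the single-page link between B and C, so one cluster of four categories splits into two clusters of size 2.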
Finding the Hierarchy of Dense Subgraphs using Nucleus Decompositions
Finding dense substructures in a graph is a fundamental graph mining
operation, with applications in bioinformatics, social networks, and
visualization, to name a few. Yet most standard formulations of this problem
(like clique, quasiclique, k-densest subgraph) are NP-hard. Furthermore, the
goal is rarely to find the "true optimum", but to identify many (if not all)
dense substructures, understand their distribution in the graph, and ideally
determine relationships among them. Current dense subgraph finding algorithms
usually optimize some objective, and only find a few such subgraphs without
providing any structural relations. We define the nucleus decomposition of a
graph, which represents the graph as a forest of nuclei. Each nucleus is a
subgraph where smaller cliques are present in many larger cliques. The forest
of nuclei is a hierarchy by containment, where the edge density increases as we
proceed towards leaf nuclei. Sibling nuclei can have limited intersections,
which enables discovering overlapping dense subgraphs. With the right
parameters, the nucleus decomposition generalizes the classic notions of
k-cores and k-truss decompositions. We give provably efficient algorithms for
nucleus decompositions, and empirically evaluate their behavior in a variety of
real graphs. The tree of nuclei consistently gives a global, hierarchical
snapshot of dense substructures, and outputs dense subgraphs of higher quality
than other state-of-the-art solutions. Our algorithm can process graphs with
tens of millions of edges in less than an hour.
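As a point of reference for the k-core notion that the abstract says nucleus decomposition generalizes, here is a minimal Python sketch of the classic k-core peeling procedure. It is not the nucleus decomposition algorithm from the paper; the function name core_numbers and the adjacency-dict input are assumptions, and the simple minimum search makes it quadratic rather than efficient.

```python
def core_numbers(adj):
    """Return the core number of every node.

    adj maps each node to the set of its neighbours (undirected graph).
    Repeatedly remove a node of minimum remaining degree; the largest
    minimum degree seen so far is that node's core number.
    """
    degree = {v: len(neighbours) for v, neighbours in adj.items()}
    remaining = set(adj)
    core = {}
    k = 0
    while remaining:
        v = min(remaining, key=degree.get)
        k = max(k, degree[v])
        core[v] = k
        remaining.remove(v)
        for w in adj[v]:
            if w in remaining:
                degree[w] -= 1
    return core


# A 4-clique on nodes 1..4 with a pendant node 5 attached to node 4.
example_graph = {
    1: {2, 3, 4}, 2: {1, 3, 4}, 3: {1, 2, 4}, 4: {1, 2, 3, 5}, 5: {4},
}
cores = core_numbers(example_graph)
```

The pendant node is peeled first with core number 1, after which every clique node has remaining degree 3 and therefore core number 3.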