Enumerating Top-k Quasi-Cliques
Quasi-cliques are dense incomplete subgraphs of a graph that generalize the
notion of cliques. Enumerating quasi-cliques from a graph is a robust way to
detect densely connected structures with applications to bio-informatics and
social network analysis. However, enumerating quasi-cliques in a graph is a
challenging problem, even harder than the problem of enumerating cliques. We
consider the enumeration of top-k degree-based quasi-cliques, and make the
following contributions: (1) We show that even the problem of detecting whether a
given quasi-clique is maximal (i.e., not contained within another quasi-clique)
is NP-hard. (2) We present a novel heuristic algorithm, KernelQC, to enumerate the
k largest quasi-cliques in a graph. Our method is based on identifying kernels
of extremely dense subgraphs within a graph, followed by growing subgraphs
around these kernels, to arrive at quasi-cliques with the required densities.
(3) Experimental results show that our algorithm accurately enumerates
quasi-cliques from a graph, is much faster than current state-of-the-art
methods for quasi-clique enumeration (often more than three orders of magnitude
faster), and can scale to larger graphs than current methods.
Comment: 10 page
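The abstract above does not spell out what "degree-based" means. A minimal sketch, assuming the common definition in which a vertex set S is a γ-quasi-clique when every vertex of S is adjacent to at least γ·(|S| − 1) other vertices of S, might look as follows. The function name `is_degree_quasi_clique`, the adjacency-dict graph format, and the parameter `gamma` are illustrative assumptions; this is not the KernelQC algorithm.

```python
from typing import Dict, Hashable, Iterable, Set

# Assumed graph format: undirected graph as a dict mapping each vertex
# to the set of its neighbours (hypothetical, chosen for brevity).
Graph = Dict[Hashable, Set[Hashable]]


def is_degree_quasi_clique(adj: Graph, nodes: Iterable[Hashable], gamma: float) -> bool:
    """Check the assumed degree-based condition: every vertex in `nodes`
    has at least gamma * (|nodes| - 1) neighbours inside `nodes`."""
    if not 0.0 < gamma <= 1.0:
        raise ValueError("gamma must lie in (0, 1]")
    subset = set(nodes)
    if not subset:
        return False
    required = gamma * (len(subset) - 1)
    # Count only neighbours that are themselves inside the candidate set.
    return all(len(adj[v] & subset) >= required for v in subset)
```

With gamma = 1 the check reduces to an ordinary clique test. Unlike cliques, a subset of a quasi-clique need not itself be a quasi-clique: in a 4-cycle with gamma = 0.6 the whole cycle passes the check, while two opposite vertices do not.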
Mining subjectively interesting attributed subgraphs
Community detection in graphs, data clustering, and local pattern mining
are three mature fields of data mining and machine learning.
In recent years, attributed subgraph mining has been emerging as a new
powerful data mining task in the intersection of these areas.
Given a graph and a set of attributes for each vertex,
attributed subgraph mining aims to find cohesive subgraphs
for which (a subset of) the attributes have exceptional values in some sense.
While research on this task can borrow from the three abovementioned fields,
the principled integration of graph and attribute data poses two challenges:
the definition of a pattern language that is intuitive and lends itself to efficient search strategies,
and the formalization of the interestingness of such patterns.
We propose an integrated solution to both of these challenges.
The proposed pattern language improves upon prior work in being both highly flexible and intuitive.
We show how an effective and principled algorithm can enumerate patterns of this language.
The proposed approach for quantifying interestingness of patterns of this language
is rooted in information theory, and is able to account for prior knowledge on the data.
Prior work typically quantifies interestingness based on the cohesion of the subgraph
and the exceptionality of its attributes separately,
combining these in a parameterized trade-off.
Instead, in our proposal this trade-off is implicitly handled in a principled, parameter-free manner.
Extensive empirical results confirm the proposed pattern syntax is intuitive,
and the interestingness measure aligns well with actual subjective interestingness.
A Method for Characterizing Communities in Dynamic Attributed Complex Networks
Many methods have been proposed to detect communities, not only in plain, but
also in attributed, directed or even dynamic complex networks. In its simplest
form, a community structure takes the form of a partition of the node set. From
the modeling point of view, to be of some utility, this partition must then be
characterized relative to the properties of the studied system. However, while
most of the existing works focus on defining methods for the detection of
communities, only very few try to tackle this interpretation problem. Moreover,
the existing approaches are limited either in the type of data they handle, or
by the nature of the results they output. In this work, we propose a method to
efficiently support such a characterization task. We first define a
sequence-based representation of networks, combining temporal information,
topological measures, and nodal attributes. We then describe how to identify
the most emerging sequential patterns of this dataset, and use them to
characterize the communities. We also show how to detect unusual behavior in a
community, and highlight outliers. Finally, as an illustration, we apply our
method to a network of scientific collaborations.
Comment: IEEE/ACM International Conference on Advances in Social Network
Analysis and Mining (ASONAM), Pékin, China (2014)
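The sequence-based representation mentioned in the abstract above can be illustrated with a small sketch. It assumes that each node is described, at every time slice, by a set of "descriptor=value" items (topological measures and attributes alike), and that a sequential pattern is an ordered list of itemsets. The names `node_sequences`, `supports`, and `support`, and the input format, are hypothetical and are not taken from the paper.

```python
from typing import Dict, FrozenSet, Hashable, List, Mapping, Sequence

Itemset = FrozenSet[str]


def node_sequences(
    snapshots: Sequence[Mapping[Hashable, Mapping[str, object]]]
) -> Dict[Hashable, List[Itemset]]:
    """Build one sequence per node from time-ordered snapshots.

    Each snapshot maps a node to its descriptors at that time slice.
    Assumption: a node absent from a slice simply contributes no itemset.
    """
    sequences: Dict[Hashable, List[Itemset]] = {}
    for snapshot in snapshots:
        for node, descriptors in snapshot.items():
            items = frozenset(f"{name}={value}" for name, value in descriptors.items())
            sequences.setdefault(node, []).append(items)
    return sequences


def supports(sequence: Sequence[Itemset], pattern: Sequence[Itemset]) -> bool:
    """True if the pattern's itemsets occur, in order, as subsets of
    itemsets of the sequence (greedy earliest-match scan)."""
    matched = 0
    for itemset in sequence:
        if matched < len(pattern) and pattern[matched] <= itemset:
            matched += 1
    return matched == len(pattern)


def support(sequences: Mapping[Hashable, Sequence[Itemset]], pattern: Sequence[Itemset]) -> float:
    """Fraction of node sequences that contain the pattern."""
    if not sequences:
        return 0.0
    hits = sum(1 for seq in sequences.values() if supports(seq, pattern))
    return hits / len(sequences)
```

Comparing the support of a pattern inside one community with its support in the rest of the network would be one way to judge how characteristic the pattern is; that comparison is left out of this sketch.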