Unifying Sparsest Cut, Cluster Deletion, and Modularity Clustering Objectives with Correlation Clustering
Graph clustering, or community detection, is the task of identifying groups
of closely related objects in a large network. In this paper we introduce a new
community-detection framework called LambdaCC that is based on a specially
weighted version of correlation clustering. A key component in our methodology
is a clustering resolution parameter, λ, which implicitly controls the
size and structure of clusters formed by our framework. We show that, by
increasing this parameter, our objective effectively interpolates between two
different strategies in graph clustering: finding a sparse cut and forming
dense subgraphs. Our methodology unifies and generalizes a number of other
important clustering quality functions including modularity, sparsest cut, and
cluster deletion, and places them all within the context of an optimization
problem that has been well studied from the perspective of approximation
algorithms. Our approach is particularly relevant in the regime of finding
dense clusters, as it leads to a 2-approximation for the cluster deletion
problem. We use our approach to cluster several graphs, including large
collaboration networks and social networks.
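
As a rough illustration of how a resolution parameter can trade off sparse cuts against dense clusters, the sketch below scores a clustering under one common unweighted form of a LambdaCC-style objective: a cut edge costs 1 - λ, and a non-adjacent pair placed in the same cluster costs λ. This weighting is an assumption that the abstract above does not spell out, and the function name `lambda_cc_cost` and the toy graph are purely illustrative.

```python
from itertools import combinations


def lambda_cc_cost(nodes, edges, labels, lam):
    """Disagreement cost of a clustering (assumed unweighted LambdaCC-style form).

    A cut edge costs (1 - lam); a non-edge inside a cluster costs lam.
    """
    edge_set = {frozenset(e) for e in edges}
    cost = 0.0
    for u, v in combinations(nodes, 2):
        same_cluster = labels[u] == labels[v]
        if frozenset((u, v)) in edge_set:
            if not same_cluster:
                cost += 1.0 - lam  # edge whose endpoints were separated
        elif same_cluster:
            cost += lam  # non-adjacent pair that was grouped together
    return cost


# Toy graph (hypothetical): two triangles joined by the single edge (2, 3).
nodes = range(6)
edges = [(0, 1), (1, 2), (0, 2), (3, 4), (4, 5), (3, 5), (2, 3)]
two_clusters = {0: 0, 1: 0, 2: 0, 3: 1, 4: 1, 5: 1}
one_cluster = {n: 0 for n in nodes}

for lam in (0.05, 0.5):
    print(lam,
          lambda_cc_cost(nodes, edges, one_cluster, lam),
          lambda_cc_cost(nodes, edges, two_clusters, lam))
```

Of the two candidate clusterings compared here, keeping everything in one cluster is cheaper for λ below 1/9, and splitting into the two triangles is cheaper above it. This matches the qualitative behavior the abstract describes: small λ favors a sparse cut, larger λ favors dense groups.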
Clustering based on Random Graph Model embedding Vertex Features
Large datasets with interactions between objects are common to numerous
scientific fields (e.g. social science, the internet, biology). The interactions
naturally define a graph, and a common way to explore or summarize such a dataset
is graph clustering. Most techniques for clustering graph vertices use only the
topology of connections, ignoring the information in the vertex features. In this
paper, we provide a clustering algorithm exploiting both types of data based on
a statistical model with latent structure characterizing each vertex both by a
vector of features and by its connectivity. We perform simulations to
compare our algorithm with existing approaches, and also evaluate our method
with real datasets based on hyper-textual documents. We find that our algorithm
successfully exploits whatever information is found both in the connectivity
pattern and in the features.
Identification of network modules by optimization of ratio association
We introduce a novel method for identifying the modular structures of a
network based on the maximization of an objective function: the ratio
association. This cost function arises when the community detection problem
is described in the probabilistic autoencoder framework. An analogy with kernel
k-means methods allows us to develop an efficient optimization algorithm, based on
the deterministic annealing scheme. The performance of the proposed method is
shown on a real data set and on simulated networks.
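
To make the objective concrete, here is a minimal sketch that evaluates ratio association for a given partition, assuming the usual kernel k-means convention: for each cluster, the sum of adjacency entries over all ordered pairs of members, divided by the cluster size. It only scores a partition and does not attempt the deterministic annealing optimization mentioned above; the function name `ratio_association` and the toy graph are illustrative.

```python
def ratio_association(adj, clusters):
    """Sum over clusters of (internal adjacency weight) / (cluster size).

    Internal weight sums adj[u][v] over ordered pairs, so each undirected
    edge inside a cluster is counted twice (assumed convention).
    """
    total = 0.0
    for members in clusters:
        members = list(members)
        internal = sum(adj[u][v] for u in members for v in members)
        total += internal / len(members)
    return total


# Toy graph (hypothetical): two triangles joined by the single edge (2, 3).
edges = [(0, 1), (1, 2), (0, 2), (3, 4), (4, 5), (3, 5), (2, 3)]
adj = [[0] * 6 for _ in range(6)]
for u, v in edges:
    adj[u][v] = adj[v][u] = 1  # symmetric, unweighted adjacency matrix

split = [[0, 1, 2], [3, 4, 5]]
merged = [[0, 1, 2, 3, 4, 5]]
print(ratio_association(adj, split), ratio_association(adj, merged))
```

On this example the split into two triangles scores higher than the single merged cluster, which is the behavior one would expect from an objective that rewards dense modules relative to their size.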
A unified approach to mapping and clustering of bibliometric networks
In the analysis of bibliometric networks, researchers often use mapping and
clustering techniques in a combined fashion. Typically, however, mapping and
clustering techniques that are used together rely on very different ideas and
assumptions. We propose a unified approach to mapping and clustering of
bibliometric networks. We show that the VOS mapping technique and a weighted
and parameterized variant of modularity-based clustering can both be derived
from the same underlying principle. We illustrate our proposed approach by
producing a combined mapping and clustering of the most frequently cited
publications that appeared in the field of information science in the period
1999-2008.