Search CORE

276 research outputs found

Unifying Sparsest Cut, Cluster Deletion, and Modularity Clustering Objectives with Correlation Clustering

Author: Gleich David
Veldt Nate
Wirth Anthony
Publication venue: 'Association for Computing Machinery (ACM)'
Publication date: 01/01/2018
Field of study

Graph clustering, or community detection, is the task of identifying groups of closely related objects in a large network. In this paper we introduce a new community-detection framework called LambdaCC that is based on a specially weighted version of correlation clustering. A key component in our methodology is a clustering resolution parameter,

\lambda

, which implicitly controls the size and structure of clusters formed by our framework. We show that, by increasing this parameter, our objective effectively interpolates between two different strategies in graph clustering: finding a sparse cut and forming dense subgraphs. Our methodology unifies and generalizes a number of other important clustering quality functions including modularity, sparsest cut, and cluster deletion, and places them all within the context of an optimization problem that has been well studied from the perspective of approximation algorithms. Our approach is particularly relevant in the regime of finding dense clusters, as it leads to a 2-approximation for the cluster deletion problem. We use our approach to cluster several graphs, including large collaboration networks and social networks

arXiv.org e-Print Archive

University of Melbourne Institutional Repository

The Feedback Arc Set Problem with Triangle Inequality is a Vertex Cover Problem

Author: Mastrolilli Monaldo
Publication venue
Publication date: 14/12/2011
Field of study

We consider the (precedence constrained) Minimum Feedback Arc Set problem with triangle inequalities on the weights, which finds important applications in problems of ranking with inconsistent information. We present a surprising structural insight showing that the problem is a special case of the minimum vertex cover in hypergraphs with edges of size at most 3. This result leads to combinatorial approximation algorithms for the problem and opens the road to studying the problem as a vertex cover problem

arXiv.org e-Print Archive

CiteSeerX

Correlation Clustering with Adaptive Similarity Queries

Author: Bressan Marco
Cesa-Bianchi Nicolò
Paudice Andrea
Vitale Fabio
Publication venue
Publication date: 01/01/2019
Field of study

In correlation clustering, we are given

n

objects together with a binary similarity score between each pair of them. The goal is to partition the objects into clusters so to minimise the disagreements with the scores. In this work we investigate correlation clustering as an active learning problem: each similarity score can be learned by making a query, and the goal is to minimise both the disagreements and the total number of queries. On the one hand, we describe simple active learning algorithms, which provably achieve an almost optimal trade-off while giving cluster recovery guarantees, and we test them on different datasets. On the other hand, we prove information-theoretical bounds on the number of queries necessary to guarantee a prescribed disagreement bound. These results give a rich characterization of the trade-off between queries and clustering error

arXiv.org e-Print Archive

AIR Universita degli studi di Milano

INRIA a CCSD electronic archive server

HAL Descartes

Archivio della ricerca- Università di Roma La Sapienza

Hal-Diderot

Correlation Clustering Generalized

Author: Gleich David F.
Veldt Nate
Wirth Anthony
Publication venue: LIPIcs - Leibniz International Proceedings in Informatics. 29th International Symposium on Algorithms and Computation (ISAAC 2018)
Publication date: 01/01/2018
Field of study

We present new results for LambdaCC and MotifCC, two recently introduced variants of the well-studied correlation clustering problem. Both variants are motivated by applications to network analysis and community detection, and have non-trivial approximation algorithms. We first show that the standard linear programming relaxation of LambdaCC has a Theta(log n) integrality gap for a certain choice of the parameter lambda. This sheds light on previous challenges encountered in obtaining parameter-independent approximation results for LambdaCC. We generalize a previous constant-factor algorithm to provide the best results, from the LP-rounding approach, for an extended range of lambda. MotifCC generalizes correlation clustering to the hypergraph setting. In the case of hyperedges of degree 3 with weights satisfying probability constraints, we improve the best approximation factor from 9 to 8. We show that in general our algorithm gives a 4(k-1) approximation when hyperedges have maximum degree k and probability weights. We additionally present approximation results for LambdaCC and MotifCC where we restrict to forming only two clusters

arXiv.org e-Print Archive

Dagstuhl Research Online Publication Server

Cluster Editing: Kernelization based on Edge Cuts

Author: A. Zuylen van
J. Chen
J. Dean
J. Flum
J. Gramm
J. Guo
M. Charikar
M. Tedder
M.R. Fellows
N. Ailon
N. Alon
N. Bansal
P. Berkhin
R. Niedermeier
R. Shamir
R.G. Downey
R.H. Möhring
S. Böcker
W.H. Cunningham
Z.-Z. Chen
Publication venue: 'Springer Science and Business Media LLC'
Publication date: 01/01/2010
Field of study

Kernelization algorithms for the {\sc cluster editing} problem have been a popular topic in the recent research in parameterized computation. Thus far most kernelization algorithms for this problem are based on the concept of {\it critical cliques}. In this paper, we present new observations and new techniques for the study of kernelization algorithms for the {\sc cluster editing} problem. Our techniques are based on the study of the relationship between {\sc cluster editing} and graph edge-cuts. As an application, we present an

{\cal O}(n^2)

-time algorithm that constructs a

2k

kernel for the {\it weighted} version of the {\sc cluster editing} problem. Our result meets the best kernel size for the unweighted version for the {\sc cluster editing} problem, and significantly improves the previous best kernel of quadratic size for the weighted version of the problem

arXiv.org e-Print Archive

Crossref

Correlation Clustering with Adaptive Similarity Queries

Author: A. Paudice
F. Vitale
M. Bressan
N. Cesa-Bianchi
Publication venue: Curran Associates, Inc.
Publication date: 01/01/2019
Field of study

In correlation clustering, we are givennobjects together with a binary similarityscore between each pair of them. The goal is to partition the objects into clustersso to minimise the disagreements with the scores. In this work we investigatecorrelation clustering as an active learning problem: each similarity score can belearned by making a query, and the goal is to minimise both the disagreementsand the total number of queries. On the one hand, we describe simple activelearning algorithms, which provably achieve an almost optimal trade-off whilegiving cluster recovery guarantees, and we test them on different datasets. On theother hand, we prove information-theoretical bounds on the number of queriesnecessary to guarantee a prescribed disagreement bound. These results give a richcharacterization of the trade-off between queries and clustering error

AIR Universita degli studi di Milano