Robust Correlation Clustering

Author: Devvrit
Krishnaswamy Ravishankar
Rajaraman Nived
Publication venue: LIPIcs - Leibniz International Proceedings in Informatics. Approximation, Randomization, and Combinatorial Optimization. Algorithms and Techniques (APPROX/RANDOM 2019)
Publication date: 01/01/2019

In this paper, we introduce and study the Robust-Correlation-Clustering problem: given a graph G = (V,E) where every edge is labeled either + or - (denoting similar or dissimilar pairs of vertices), and a parameter m, the goal is to delete a set D of m vertices and partition the remaining vertices V \ D into clusters so as to minimize the cost of the clustering, which is the number of + edges with end-points in different clusters plus the number of - edges with end-points in the same cluster. This generalizes the classical Correlation-Clustering problem, which is the special case m = 0. Correlation clustering is useful when we have (only) qualitative information about the similarity or dissimilarity of pairs of points, and Robust-Correlation-Clustering equips this model with the capability to handle noise in datasets. In this work, we present a constant-factor bi-criteria algorithm for Robust-Correlation-Clustering on complete graphs (our solution is O(1)-approximate with respect to the cost while discarding O(1) · m points as outliers), and complement this by showing that no finite approximation is possible if we do not violate the outlier budget. Our algorithm is very simple: it first performs an LP-based pre-processing step to delete O(m) vertices, and subsequently runs a particular Correlation-Clustering algorithm, ACNAlg [Ailon et al., 2005], on the residual instance. We then consider general graphs, and show (O(log n), O(log^2 n)) bi-criteria algorithms while also showing a hardness of alpha_MC on both the cost and the outlier violation, where alpha_MC is the lower bound for the Minimum-Multicut problem.
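The cost being minimized can be made concrete with a short sketch. The Python below is illustrative only: the function name `robust_cc_cost`, the dictionary-based edge encoding, and the toy four-vertex instance are assumptions made for this example and do not come from the paper.

```python
def robust_cc_cost(labels, clusters, deleted=frozenset()):
    """Count disagreements among the vertices that were not deleted.

    labels:   dict mapping an edge (u, v) to '+' or '-'
    clusters: dict mapping each surviving vertex to a cluster id
    deleted:  set of vertices discarded as outliers (the set D)
    """
    cost = 0
    for (u, v), sign in labels.items():
        # Edges touching a deleted vertex do not contribute to the cost.
        if u in deleted or v in deleted:
            continue
        same = clusters[u] == clusters[v]
        if sign == '+' and not same:
            cost += 1  # similar pair split across clusters
        elif sign == '-' and same:
            cost += 1  # dissimilar pair placed together
    return cost


# Hypothetical instance: a '+' triangle a, b, c plus a noisy vertex x.
labels = {
    ('a', 'b'): '+', ('a', 'c'): '+', ('b', 'c'): '+',
    ('x', 'a'): '+', ('x', 'b'): '-', ('x', 'c'): '-',
}
one_cluster = {'a': 0, 'b': 0, 'c': 0, 'x': 0}
x_alone = {'a': 0, 'b': 0, 'c': 0, 'x': 1}
without_x = {'a': 0, 'b': 0, 'c': 0}

print(robust_cc_cost(labels, one_cluster))                # 2
print(robust_cc_cost(labels, x_alone))                    # 1
print(robust_cc_cost(labels, without_x, deleted={'x'}))   # 0
```

On this toy instance no clustering of all four vertices reaches cost 0, whereas deleting the single vertex x (m = 1) does, which is the kind of noise the robust variant is meant to absorb.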

Dagstuhl Research Online Publication Server

Overlapping and Robust Edge-Colored Clustering in Hypergraphs

Author: Crane Alex
Lavallee Brian
Sullivan Blair D.
Veldt Nate
Publication date: 27/05/2023

A recent trend in data mining has explored (hyper)graph clustering algorithms for data with categorical relationship types. Such algorithms have applications in the analysis of social, co-authorship, and protein interaction networks, to name a few. Many such applications naturally have some overlap between clusters, a nuance which is missing from current combinatorial models. Additionally, existing models lack a mechanism for handling noise in datasets. We address these concerns by generalizing Edge-Colored Clustering, a recent framework for categorical clustering of hypergraphs. Our generalizations allow for a budgeted number of either (a) overlapping cluster assignments or (b) node deletions. For each new model we present a greedy algorithm which approximately minimizes an edge mistake objective, as well as bicriteria approximations where the second approximation factor is on the budget. Additionally, we address the parameterized complexity of each problem, providing FPT algorithms and hardness results.
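As a rough illustration of an edge mistake objective with node deletions, the sketch below counts a colored hyperedge as a mistake when some non-deleted node in it is assigned a different color. That reading of the objective, the function name `ecc_mistakes`, and the toy hypergraph are assumptions for this example; the abstract does not spell out the exact definitions, and this is not the paper's greedy algorithm.

```python
def ecc_mistakes(hyperedges, node_color, deleted=frozenset()):
    """Count hyperedges that disagree with the node coloring.

    hyperedges: list of (set_of_nodes, color) pairs
    node_color: dict mapping each node to its assigned color
    deleted:    nodes removed under the deletion budget; they are
                ignored when checking an edge (assumed semantics)
    """
    mistakes = 0
    for nodes, color in hyperedges:
        if any(node_color[v] != color for v in nodes if v not in deleted):
            mistakes += 1
    return mistakes


# Hypothetical hypergraph with two edge colors.
hyperedges = [
    ({1, 2, 3}, 'red'),
    ({3, 4}, 'blue'),
    ({1, 4}, 'red'),
]
coloring = {1: 'red', 2: 'red', 3: 'red', 4: 'blue'}

print(ecc_mistakes(hyperedges, coloring))               # 2
print(ecc_mistakes(hyperedges, coloring, deleted={4}))  # 1
```

Deleting node 4 repairs the third hyperedge but not the second, so the budget trades off against the remaining mistakes rather than removing them outright.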

arXiv.org e-Print Archive

Local Guarantees in Graph Cuts and Clustering

Author: A Ben-Dor
A Wirth
AA Schäffer
D Monderer
DS Johnson
ED Demaine
G Christodoulou
HP Kriegel
N Ailon
N Bansal
P Symeonidis
V Filkov
Z Svitkina
Publication date: 02/04/2017

Correlation Clustering is an elegant model that captures fundamental graph cut problems such as Min s-t Cut, Multiway Cut, and Multicut, extensively studied in combinatorial optimization. Here, we are given a graph with edges labeled + or - and the goal is to produce a clustering that agrees with the labels as much as possible: + edges within clusters and - edges across clusters. The classical approach towards Correlation Clustering (and other graph cut problems) is to optimize a global objective. We depart from this and study local objectives: minimizing the maximum number of disagreements for edges incident on a single node, and the analogous max min agreements objective. This naturally gives rise to a family of basic min-max graph cut problems. A prototypical representative is Min Max s-t Cut: find an s-t cut minimizing the largest number of cut edges incident on any node. We present the following results: (1) an $O(\sqrt{n})$-approximation for the problem of minimizing the maximum total weight of disagreement edges incident on any node (thus providing the first known approximation for the above family of min-max graph cut problems), (2) a remarkably simple 7-approximation for minimizing local disagreements in complete graphs (improving upon the previous best known approximation of 48), and (3) a $1/(2+\varepsilon)$-approximation for maximizing the minimum total weight of agreement edges incident on any node, hence improving upon the $1/(4+\varepsilon)$-approximation that follows from the study of approximate pure Nash equilibria in cut and party affiliation games.
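The difference between the global and the local objective can be seen on a small example. The following Python sketch is an illustration under assumed names (`disagreement_stats`, the tuple-keyed edge dictionary, and the unweighted four-vertex instance are hypothetical); it only evaluates the two objectives for a given clustering and does not implement any of the approximation algorithms.

```python
from collections import defaultdict


def disagreement_stats(labels, clusters):
    """Return (total disagreements, max disagreements at any single node).

    labels:   dict mapping an edge (u, v) to '+' or '-'
    clusters: dict mapping each vertex to a cluster id
    """
    per_node = defaultdict(int)
    total = 0
    for (u, v), sign in labels.items():
        same = clusters[u] == clusters[v]
        # A '+' edge disagrees when cut; a '-' edge disagrees when uncut.
        if (sign == '+') != same:
            total += 1
            per_node[u] += 1
            per_node[v] += 1
    return total, max(per_node.values(), default=0)


# Hypothetical complete graph: c is similar to a, b, d,
# which are pairwise dissimilar.
labels = {
    ('c', 'a'): '+', ('c', 'b'): '+', ('c', 'd'): '+',
    ('a', 'b'): '-', ('a', 'd'): '-', ('b', 'd'): '-',
}
singletons = {'a': 0, 'b': 1, 'c': 2, 'd': 3}
together = {'a': 0, 'b': 0, 'c': 0, 'd': 0}

print(disagreement_stats(labels, singletons))  # (3, 3)
print(disagreement_stats(labels, together))    # (3, 2)
```

Both clusterings have the same global cost of 3, but the all-singletons clustering concentrates every disagreement on node c, so the local (min-max) objective prefers the single cluster.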

arXiv.org e-Print Archive

Crossref

Centrality of Trees for Capacitated k-Center

Author: An Hyung-Chan
Bhaskara Aditya
Svensson Ola
Publication date: 10/04/2013

There is a large discrepancy in our understanding of uncapacitated and capacitated versions of network location problems. This is perhaps best illustrated by the classical k-center problem: there is a simple tight 2-approximation algorithm for the uncapacitated version whereas the first constant factor approximation algorithm for the general version with capacities was only recently obtained by using an intricate rounding algorithm that achieves an approximation guarantee in the hundreds. Our paper aims to bridge this discrepancy. For the capacitated k-center problem, we give a simple algorithm with a clean analysis that allows us to prove an approximation guarantee of 9. It uses the standard LP relaxation and comes close to settling the integrality gap (after necessary preprocessing), which is narrowed down to either 7, 8 or 9. The algorithm proceeds by first reducing to special tree instances, and then solves such instances optimally. Our concept of tree instances is quite versatile, and applies to natural variants of the capacitated k-center problem for which we also obtain improved algorithms. Finally, we give evidence to show that more powerful preprocessing could lead to better algorithms, by giving an approximation algorithm that beats the integrality gap for instances where all non-zero capacities are uniform.

Comment: 21 pages, 2 figures
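For context on the uncapacitated case, one well-known simple 2-approximation for k-center is the farthest-first traversal. The abstract does not say which algorithm it has in mind, so the sketch below is an assumption chosen for illustration; the function name `farthest_first_centers` and the use of Euclidean points are likewise hypothetical, and the capacitated algorithm from the paper is not attempted here.

```python
import math


def farthest_first_centers(points, k):
    """Pick k centers greedily: start anywhere, then repeatedly add the
    point farthest from the centers chosen so far.

    points: non-empty list of coordinate tuples (Euclidean metric assumed)
    Returns (centers, radius), where radius is the largest distance from
    any point to its nearest chosen center.
    """
    centers = [points[0]]
    # dist[i] = distance from points[i] to its nearest chosen center.
    dist = [math.dist(p, centers[0]) for p in points]
    while len(centers) < k:
        i = max(range(len(points)), key=dist.__getitem__)
        centers.append(points[i])
        dist = [min(d, math.dist(p, points[i])) for d, p in zip(dist, points)]
    return centers, max(dist)


points = [(0, 0), (1, 0), (10, 0), (11, 0)]
centers, radius = farthest_first_centers(points, 2)
print(centers, radius)  # [(0, 0), (11, 0)] 1.0
```

No capacity constraint appears anywhere in this greedy rule, which is part of why it stays so simple; once each center can serve only a bounded number of points, the choice of centers and the assignment of points interact and the problem becomes considerably harder.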

arXiv.org e-Print Archive

CiteSeerX