2,736 research outputs found

Spectral Clustering with Imbalanced Data

Author: Qian Jing
Saligrama Venkatesh
Publication date: 09/09/2013

Spectral clustering is sensitive to how graphs are constructed from data, particularly when proximal and imbalanced clusters are present. We show that ratio cut (RCut) or normalized cut (NCut) objectives are not tailored to imbalanced data since they tend to emphasize cut sizes over cut values. We propose a graph partitioning problem that seeks minimum cut partitions under minimum size constraints on partitions to deal with imbalanced data. Our approach parameterizes a family of graphs, by adaptively modulating node degrees on a fixed node set, to yield a set of parameter-dependent cuts reflecting varying levels of imbalance. The solution to our problem is then obtained by optimizing over these parameters. We present rigorous limit cut analysis results to justify our approach. We demonstrate the superiority of our method through unsupervised and semi-supervised experiments on synthetic and real data sets. Comment: 24 pages, 7 figures. arXiv admin note: substantial text overlap with arXiv:1302.513
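The claim that RCut and NCut favor balanced partitions over small cut values can be illustrated with a minimal sketch. It uses the standard two-way definitions, RCut = cut/|S| + cut/|S̄| and NCut = cut/vol(S) + cut/vol(S̄), on a toy path graph. The graph, the weights, and the function names `cut_value`, `rcut`, and `ncut` are assumptions made for illustration; this is not the authors' method or code.

```python
import numpy as np

def cut_value(W, mask):
    """Total weight of edges crossing between S (mask True) and its complement."""
    return W[np.ix_(mask, ~mask)].sum()

def rcut(W, mask):
    """Ratio cut: cut value divided by each side's node count, summed."""
    c = cut_value(W, mask)
    return c / mask.sum() + c / (~mask).sum()

def ncut(W, mask):
    """Normalized cut: cut value divided by each side's volume (degree sum), summed."""
    c = cut_value(W, mask)
    deg = W.sum(axis=1)
    return c / deg[mask].sum() + c / deg[~mask].sum()

# Hypothetical toy graph: a 10-node path with unit weights, except that the
# edge between nodes 1 and 2 is slightly weaker (0.9). The weakest cut
# therefore isolates the small 2-node group {0, 1}.
n = 10
W = np.zeros((n, n))
for i in range(n - 1):
    W[i, i + 1] = W[i + 1, i] = 1.0
W[1, 2] = W[2, 1] = 0.9

small = np.arange(n) < 2      # imbalanced partition: {0, 1} vs. the rest
balanced = np.arange(n) < 5   # balanced partition: {0..4} vs. {5..9}

for name, mask in [("imbalanced", small), ("balanced", balanced)]:
    print(name, cut_value(W, mask), rcut(W, mask), ncut(W, mask))
```

On this toy graph the imbalanced partition has the smaller cut value (0.9 versus 1.0), yet both RCut and NCut score the balanced partition lower, which is the behavior the abstract describes.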

arXiv.org e-Print Archive

Crossref

Clustering and Community Detection with Imbalanced Clusters

Author: Aksoylar Cem
Qian Jing
Saligrama Venkatesh
Publication venue: 'Institute of Electrical and Electronics Engineers (IEEE)'
Publication date: 26/08/2016

Spectral clustering methods, which are frequently used in clustering and community detection applications, are sensitive to the specific graph constructions, particularly when imbalanced clusters are present. We show that ratio cut (RCut) or normalized cut (NCut) objectives are not tailored to imbalanced cluster sizes since they tend to emphasize cut sizes over cut values. We propose a graph partitioning problem that seeks minimum cut partitions under minimum size constraints on partitions to deal with imbalanced cluster sizes. Our approach parameterizes a family of graphs by adaptively modulating node degrees on a fixed node set, yielding a set of parameter-dependent cuts reflecting varying levels of imbalance. The solution to our problem is then obtained by optimizing over these parameters. We present rigorous limit cut analysis results to justify our approach and demonstrate the superiority of our method through experiments on synthetic and real datasets for data clustering, semi-supervised learning, and community detection. Comment: Extended version of arXiv:1309.2303 with new applications. Accepted to IEEE TSIP

arXiv.org e-Print Archive

Crossref

Boston University Institutional Repository (OpenBU)

Parameterized Compilation Lower Bounds for Restricted CNF-formulas

Author: A Darwiche
C Muise
I Razgon
J Flum
M Lampis
M Thurley
P Duris
R Diestel
R Haan de
S Bova
S Ordyniak
SH Sæther
Publication date: 01/01/2016

We show unconditional parameterized lower bounds in the area of knowledge compilation, more specifically on the size of circuits in decomposable negation normal form (DNNF) that encode CNF-formulas restricted by several graph width measures. In particular, we show that there are CNF formulas of size $n$ and modular incidence treewidth $k$ whose smallest DNNF-encoding has size $n^{\Omega(k)}$, and that there are CNF formulas of size $n$ and incidence neighborhood diversity $k$ whose smallest DNNF-encoding has size $n^{\Omega(\sqrt{k})}$. These results complement recent upper bounds for compiling CNF into DNNF and strengthen, quantitatively and qualitatively, known conditional lower bounds for cliquewidth. Moreover, they show that, unlike for many graph problems, the parameters considered here behave significantly differently from treewidth.

arXiv.org e-Print Archive

Crossref

HAL Descartes

HAL-Artois

Hyperbolic intersection graphs and (quasi)-polynomial time

Author: Kisfaludi-Bak Sándor
Publication date: 30/09/2019

We study unit ball graphs (and, more generally, so-called noisy uniform ball graphs) in $d$-dimensional hyperbolic space, which we denote by $\mathbb{H}^d$. Using a new separator theorem, we show that unit ball graphs in $\mathbb{H}^d$ enjoy similar properties as their Euclidean counterparts, but in one dimension lower: many standard graph problems, such as Independent Set, Dominating Set, Steiner Tree, and Hamiltonian Cycle can be solved in $2^{O(n^{1-1/(d-1)})}$ time for any fixed $d\geq 3$, while the same problems need $2^{O(n^{1-1/d})}$ time in $\mathbb{R}^d$. We also show that these algorithms in $\mathbb{H}^d$ are optimal up to constant factors in the exponent under ETH. This drop in dimension has the largest impact in $\mathbb{H}^2$, where we introduce a new technique to bound the treewidth of noisy uniform disk graphs. The bounds yield quasi-polynomial ($n^{O(\log n)}$) algorithms for all of the studied problems, while in the case of Hamiltonian Cycle and $3$-Coloring we even get polynomial time algorithms. Furthermore, if the underlying noisy disks in $\mathbb{H}^2$ have constant maximum degree, then all studied problems can be solved in polynomial time. This contrasts with the fact that these problems require $2^{\Omega(\sqrt{n})}$ time under ETH in constant maximum degree Euclidean unit disk graphs. Finally, we complement our quasi-polynomial algorithm for Independent Set in noisy uniform disk graphs with a matching $n^{\Omega(\log n)}$ lower bound under ETH. This shows that the hyperbolic plane is a potential source of NP-intermediate problems. Comment: Short version appears in SODA 202
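As a worked instance of the "one dimension lower" statement, the two running-time exponents quoted in the abstract can be evaluated for small fixed dimensions. This is plain arithmetic on the stated formulas, not an additional result from the paper.

```latex
% Hyperbolic exponent: 1 - 1/(d-1);  Euclidean exponent: 1 - 1/d
\begin{align*}
d = 3:\quad & \mathbb{H}^3:\ 2^{O(n^{1-1/2})} = 2^{O(\sqrt{n})},
            & \mathbb{R}^3:\ 2^{O(n^{1-1/3})} = 2^{O(n^{2/3})},\\
d = 4:\quad & \mathbb{H}^4:\ 2^{O(n^{1-1/3})} = 2^{O(n^{2/3})},
            & \mathbb{R}^4:\ 2^{O(n^{1-1/4})} = 2^{O(n^{3/4})}.
\end{align*}
```

In each row the hyperbolic exponent for dimension $d$ coincides with the Euclidean exponent for dimension $d-1$.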

arXiv.org e-Print Archive

Crossref

MPG.PuRe

Network-Based Vertex Dissolution

Author: Bredereck Robert
Chen Jiehua
Froese Vincent
Niedermeier Rolf
van Bevern René
Woeginger Gerhard J.
Publication venue: 'Society for Industrial & Applied Mathematics (SIAM)'
Publication date: 01/01/2015

We introduce a graph-theoretic vertex dissolution model that applies to a number of redistribution scenarios such as gerrymandering in political districting or work balancing in an online situation. The central aspect of our model is the deletion of certain vertices and the redistribution of their load to neighboring vertices in a completely balanced way. We investigate how the underlying graph structure, the knowledge of which vertices should be deleted, and the relation between old and new vertex loads influence the computational complexity of the underlying graph problems. Our results establish a clear borderline between tractable and intractable cases. Comment: Version accepted at SIAM Journal on Discrete Mathematic

arXiv.org e-Print Archive

CiteSeerX

Repository TU/e

Pure OAI Repository