Unifying Sparsest Cut, Cluster Deletion, and Modularity Clustering Objectives with Correlation Clustering
Graph clustering, or community detection, is the task of identifying groups
of closely related objects in a large network. In this paper we introduce a new
community-detection framework called LambdaCC that is based on a specially
weighted version of correlation clustering. A key component in our methodology
is a clustering resolution parameter, λ, which implicitly controls the
size and structure of clusters formed by our framework. We show that, by
increasing this parameter, our objective effectively interpolates between two
different strategies in graph clustering: finding a sparse cut and forming
dense subgraphs. Our methodology unifies and generalizes a number of other
important clustering quality functions including modularity, sparsest cut, and
cluster deletion, and places them all within the context of an optimization
problem that has been well studied from the perspective of approximation
algorithms. Our approach is particularly relevant in the regime of finding
dense clusters, as it leads to a 2-approximation for the cluster deletion
problem. We use our approach to cluster several graphs, including large
collaboration networks and social networks.
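To make the role of the resolution parameter concrete, here is a minimal sketch of a LambdaCC-style cost function. The abstract does not spell out the weighting, so the form used here is an assumption: each edge cut between clusters costs 1 - λ and each non-adjacent pair placed in the same cluster costs λ. The function name `lambda_cc_cost` and its arguments are hypothetical.

```python
from itertools import combinations


def lambda_cc_cost(nodes, edges, labels, lam):
    """Cost of a clustering under an assumed unweighted LambdaCC-style objective.

    Assumption (not stated in the abstract): a cut edge costs (1 - lam) and a
    non-adjacent pair placed in the same cluster costs lam.
    """
    edge_set = {frozenset(e) for e in edges}
    cost = 0.0
    for u, v in combinations(nodes, 2):
        same_cluster = labels[u] == labels[v]
        is_edge = frozenset((u, v)) in edge_set
        if is_edge and not same_cluster:
            cost += 1.0 - lam  # penalty for separating adjacent nodes
        elif not is_edge and same_cluster:
            cost += lam        # penalty for grouping non-adjacent nodes
    return cost


# Path graph 0 - 1 - 2 with two candidate clusterings.
nodes = [0, 1, 2]
edges = [(0, 1), (1, 2)]
one_cluster = {0: 0, 1: 0, 2: 0}
two_clusters = {0: 0, 1: 0, 2: 1}
cost_one = lambda_cc_cost(nodes, edges, one_cluster, 0.25)
cost_two = lambda_cc_cost(nodes, edges, two_clusters, 0.25)
```

Under this assumed form, a small λ makes it cheap to keep non-adjacent nodes together, which favors large clusters separated by sparse cuts, while a large λ makes it expensive, which favors dense clusters.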
NeuroCUT: A Neural Approach for Robust Graph Partitioning
Graph partitioning aims to divide a graph into disjoint subsets while
optimizing a specific partitioning objective. The majority of formulations
related to graph partitioning exhibit NP-hardness due to their combinatorial
nature. As a result, conventional approximation algorithms rely on heuristic
methods, sometimes with approximation guarantees and sometimes without.
Unfortunately, traditional approaches are tailored for specific partitioning
objectives and do not generalize well across other known partitioning
objectives from the literature. To overcome this limitation, and learn
heuristics from the data directly, neural approaches have emerged,
demonstrating promising outcomes. In this study, we extend this line of work
through a novel framework, NeuroCut. NeuroCut introduces two key innovations
over prevailing methodologies. First, it is inductive to both graph topology
and the partition count, which is provided at query time. Second, by leveraging
a reinforcement learning based framework over node representations derived from
a graph neural network, NeuroCut can accommodate any optimization objective,
even those encompassing non-differentiable functions. Through empirical
evaluation, we demonstrate that NeuroCut excels in identifying high-quality
partitions, showcases strong generalization across a wide spectrum of
partitioning objectives, and exhibits resilience to topological modifications.
Heuristics for Sparsest Cut Approximations in Network Flow Applications
The Maximum Concurrent Flow Problem (MCFP) is a polynomially bounded problem that has been used over the years in a variety of applications. Sometimes it is used to attempt to find the Sparsest Cut, an NP-hard problem, and other times to find communities in Social Network Analysis (SNA) in its hierarchical formulation, the HMCFP. Though it is polynomially bounded, the MCFP quickly grows in space utilization, rendering it useful only on small problems. When it was defined, only instances of a few hundred nodes could be solved; a few decades later, graphs of one to two thousand nodes can still be too much for modern commodity hardware to handle.
This dissertation covers three heuristic approaches to the MCFP that run significantly faster in practice than the LP formulation and use far less memory. The first two approaches are based on the Maximum Adjacency Search (MAS) and apply to both the MCFP and the HMCFP used for community detection. We compare the three approaches to the LP formulation in terms of accuracy, runtime, and memory utilization on several classes of synthetic graphs representing potential real-world applications. We find that the heuristics are often correct, and run using orders of magnitude less memory and time.
Average Sensitivity of Graph Algorithms
In modern applications of graph algorithms, where the graphs of interest are
large and dynamic, it is unrealistic to assume that an input representation
contains the full information of a graph being studied. Hence, it is desirable
to use algorithms that, even when only a (large) subgraph is available, output
solutions that are close to the solutions output when the whole graph is
available. We formalize this idea by introducing the notion of average
sensitivity of graph algorithms, which is the average earth mover's distance
between the output distributions of an algorithm on a graph and its subgraph
obtained by removing an edge, where the average is over the edges removed and
the distance between two outputs is the Hamming distance.
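The definition above can be sketched for the special case of a deterministic algorithm, where each output distribution is a single point and the earth mover's distance reduces to the Hamming distance between the two outputs. The sketch below measures that distance as the size of the symmetric difference of two edge sets. The greedy spanning forest used as the example algorithm is an illustrative choice, not one of the paper's algorithms, and the function names are hypothetical.

```python
def spanning_forest(edges):
    """Deterministic greedy spanning forest: scan edges in sorted order."""
    parent = {}

    def find(x):
        parent.setdefault(x, x)
        while parent[x] != x:
            parent[x] = parent[parent[x]]  # path halving
            x = parent[x]
        return x

    forest = set()
    for u, v in sorted(edges):
        root_u, root_v = find(u), find(v)
        if root_u != root_v:
            parent[root_u] = root_v
            forest.add((u, v))
    return forest


def average_sensitivity(algorithm, edges):
    """Average Hamming distance between the output on the full graph and the
    output after deleting one edge, averaged over all edges.

    Only valid as written for deterministic algorithms returning a set.
    """
    edges = list(edges)
    if not edges:
        return 0.0
    base = algorithm(edges)
    total = 0
    for removed in edges:
        reduced = [e for e in edges if e != removed]
        total += len(base ^ algorithm(reduced))  # symmetric difference size
    return total / len(edges)


triangle = [(0, 1), (0, 2), (1, 2)]
path = [(0, 1), (1, 2)]
sens_triangle = average_sensitivity(spanning_forest, triangle)
sens_path = average_sensitivity(spanning_forest, path)
```

For a randomized algorithm, the inner distance would instead have to be an earth mover's distance between two output distributions, which this sketch does not attempt.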
In this work, we initiate a systematic study of average sensitivity. After
deriving basic properties of average sensitivity such as composition, we
provide efficient approximation algorithms with low average sensitivities for
concrete graph problems, including the minimum spanning forest problem, the
global minimum cut problem, the minimum s-t cut problem, and the maximum
matching problem. In addition, we prove that the average sensitivity of our
global minimum cut algorithm is almost optimal, by showing a nearly matching
lower bound. We also show that every algorithm for the 2-coloring problem has
average sensitivity linear in the number of vertices. One of the main ideas
involved in designing our algorithms with low average sensitivity is the
following fact: if the presence of a vertex or an edge in the solution output
by an algorithm can be decided locally, then the algorithm has a low average
sensitivity, allowing us to reuse the analyses of known sublinear-time
algorithms and local computation algorithms (LCAs). Using this connection, we
show that every LCA for 2-coloring has linear query complexity, thereby
answering an open question.
Comment: 39 pages, 1 figure
Fast Generation of Random Spanning Trees and the Effective Resistance Metric
We present a new algorithm for generating a uniformly random spanning tree in
an undirected graph. Our algorithm samples such a tree in expected
Õ(m^(4/3)) time. This improves over the best previously known bound
of min(Õ(m√n), O(n^ω)) (which follows from the work of
Kelner and Mądry [FOCS'09] and of Colbourn et al. [J. Algorithms'96])
whenever the input graph is sufficiently sparse.
At a high level, our result stems from carefully exploiting the interplay of
random spanning trees, random walks, and the notion of effective resistance, as
well as from devising a way to algorithmically relate these concepts to the
combinatorial structure of the graph. This involves, in particular,
establishing a new connection between the effective resistance metric and the
cut structure of the underlying graph.
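As background for the link between random walks and random spanning trees mentioned above, here is a minimal sketch of the classical Aldous-Broder sampler. It is not the algorithm of this paper and does not achieve its running time. It walks randomly over a connected undirected graph and keeps the edge used to enter each vertex for the first time. The adjacency-list input format and the function name are assumptions made for the example.

```python
import random


def aldous_broder(adj, rng=None):
    """Sample a uniformly random spanning tree of a connected undirected graph.

    adj maps each vertex to a list of its neighbors (assumed input format).
    Classical Aldous-Broder random walk, not the paper's algorithm.
    """
    rng = rng or random.Random()
    nodes = list(adj)
    current = rng.choice(nodes)
    visited = {current}
    tree = set()
    while len(visited) < len(nodes):
        nxt = rng.choice(adj[current])
        if nxt not in visited:
            # First entrance to nxt: keep the edge the walk arrived on.
            visited.add(nxt)
            tree.add(frozenset((current, nxt)))
        current = nxt
    return tree


# 4-cycle: every spanning tree drops exactly one of the four edges.
cycle = {0: [1, 3], 1: [0, 2], 2: [1, 3], 3: [2, 0]}
tree = aldous_broder(cycle, random.Random(0))
```

The expected running time of this walk is governed by the cover time of the graph, which is why faster samplers need additional ideas such as the effective resistance arguments the abstract refers to.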
DSA-aware multiple patterning for the manufacturing of vias: Connections to graph coloring problems, IP formulations, and numerical experiments
In this paper, we investigate the manufacturing of vias in integrated
circuits with a new technology combining lithography and Directed Self Assembly
(DSA). Optimizing the production time and costs in this new process entails
minimizing the number of lithography steps, which constitutes a generalization
of graph coloring. We develop integer programming formulations for several
variants of interest in the industry, and then study the computational
performance of our formulations on true industrial instances. We show that the
best integer programming formulation achieves good computational performance,
and indicate potential directions to further speed-up computational time and
develop exact approaches feasible for production.
Parameterized Inapproximability for Steiner Orientation by Gap Amplification
In the k-Steiner Orientation problem, we are given a mixed graph, that is, with both directed and undirected edges, and a set of k terminal pairs. The goal is to find an orientation of the undirected edges that maximizes the number of terminal pairs for which there is a path from the source to the sink. The problem is known to be W[1]-hard when parameterized by k and hard to approximate up to some constant for FPT algorithms assuming Gap-ETH. On the other hand, no approximation factor better than O(k) is known.
We show that k-Steiner Orientation is unlikely to admit an approximation algorithm with any constant factor, even within FPT running time. To obtain this result, we construct a self-reduction via a hashing-based gap amplification technique, which turns out to be useful even outside of the FPT paradigm. Precisely, we rule out any approximation factor of the form (log k)^o(1) for FPT algorithms (assuming FPT ≠ W[1]) and (log n)^o(1) for purely polynomial-time algorithms (assuming that the class W[1] does not admit randomized FPT algorithms). This constitutes a novel inapproximability result for polynomial-time algorithms obtained via tools from the FPT theory. Moreover, we prove k-Steiner Orientation to belong to W[1], which entails W[1]-completeness of (log k)^o(1)-approximation for k-Steiner Orientation. This provides an example of a natural approximation task that is complete in a parameterized complexity class.
Finally, we apply our technique to the maximization version of directed multicut, Max (k,p)-Directed Multicut, where we are given a directed graph, k terminal pairs, and a budget p. The goal is to maximize the number of separated terminal pairs by removing p edges. We present a simple proof that the problem admits no FPT approximation with factor O(k^(1/2 - ε)) (assuming FPT ≠ W[1]) and no polynomial-time approximation with ratio O(|E(G)|^(1/2 - ε)) (assuming NP ⊈ co-RP).
Most Balanced Minimum Cuts and Partially Ordered Knapsack
We consider the problem of finding most balanced cuts among minimum st-edge cuts and minimum st-vertex cuts, for given vertices s and t, according to different balance criteria. For edge cuts [S, S̄] we seek to maximize min{|S|, |S̄|}. For vertex cuts C of G we consider the objectives of (i) maximizing min{|S|, |T|}, where {S, T} is a partition of V(G) \ C with s ∈ S, t ∈ T and [S, T] = ∅, (ii) minimizing the order of the largest component of G − C, and (iii) maximizing the order of the smallest component of G − C. All of these problems are shown to be NP-hard. We give a PTAS for the edge cut variant and for (i). We give a 2-approximation for (ii), and show that no non-trivial approximation exists for (iii) unless P=NP. To prove these results we show that we can partition the vertices of G, and define a partial order on the subsets of the partition, such that ideals of the partial order correspond bijectively to minimum st-cuts of G. This shows that the problems are closely related to Uniform Partially Ordered Knapsack (UPOK), a variant of POK where element utilities are equal to element weights. Our PTAS is also a PTAS for special types of UPOK instances.