258 research outputs found

    Unifying Sparsest Cut, Cluster Deletion, and Modularity Clustering Objectives with Correlation Clustering

    Get PDF
    Graph clustering, or community detection, is the task of identifying groups of closely related objects in a large network. In this paper we introduce a new community-detection framework called LambdaCC that is based on a specially weighted version of correlation clustering. A key component in our methodology is a clustering resolution parameter, λ\lambda, which implicitly controls the size and structure of clusters formed by our framework. We show that, by increasing this parameter, our objective effectively interpolates between two different strategies in graph clustering: finding a sparse cut and forming dense subgraphs. Our methodology unifies and generalizes a number of other important clustering quality functions including modularity, sparsest cut, and cluster deletion, and places them all within the context of an optimization problem that has been well studied from the perspective of approximation algorithms. Our approach is particularly relevant in the regime of finding dense clusters, as it leads to a 2-approximation for the cluster deletion problem. We use our approach to cluster several graphs, including large collaboration networks and social networks

    NeuroCUT: A Neural Approach for Robust Graph Partitioning

    Full text link
    Graph partitioning aims to divide a graph into kk disjoint subsets while optimizing a specific partitioning objective. The majority of formulations related to graph partitioning exhibit NP-hardness due to their combinatorial nature. As a result, conventional approximation algorithms rely on heuristic methods, sometimes with approximation guarantees and sometimes without. Unfortunately, traditional approaches are tailored for specific partitioning objectives and do not generalize well across other known partitioning objectives from the literature. To overcome this limitation, and learn heuristics from the data directly, neural approaches have emerged, demonstrating promising outcomes. In this study, we extend this line of work through a novel framework, NeuroCut. NeuroCut introduces two key innovations over prevailing methodologies. First, it is inductive to both graph topology and the partition count, which is provided at query time. Second, by leveraging a reinforcement learning based framework over node representations derived from a graph neural network, NeuroCut can accommodate any optimization objective, even those encompassing non-differentiable functions. Through empirical evaluation, we demonstrate that NeuroCut excels in identifying high-quality partitions, showcases strong generalization across a wide spectrum of partitioning objectives, and exhibits resilience to topological modifications

    Heuristics for Sparsest Cut Approximations in Network Flow Applications

    Get PDF
    The Maximum Concurrent Flow Problem (MCFP) is a polynomially bounded problem that has been used over the years in a variety of applications. Sometimes it is used to attempt to find the Sparsest Cut, an NP-hard problem, and other times to find communities in Social Network Analysis (SNA) in its hierarchical formulation, the HMCFP. Though it is polynomially bounded, the MCFP quickly grows in space utilization, rendering it useful on only small problems. When it was defined, only a few hundred nodes could be solved, where a few decades later, graphs of one to two thousand nodes can still be too much for modern commodity hardware to handle. This dissertation covers three approaches to heuristics to the MCFP that run significantly faster in practice than the LP formulation with far less memory utilization. The first two approaches are based on the Maximum Adjacency Search (MAS) and apply to both the MCFP and the HMCFP used for community detection. We compare the three approaches to the LP performance in terms of accuracy, runtime, and memory utilization on several classes of synthetic graphs representing potential real-world applications. We find that the heuristics are often correct, and run using orders of magnitude less memory and time

    Combinatorial algorithms

    Get PDF

    Average Sensitivity of Graph Algorithms

    Full text link
    In modern applications of graphs algorithms, where the graphs of interest are large and dynamic, it is unrealistic to assume that an input representation contains the full information of a graph being studied. Hence, it is desirable to use algorithms that, even when only a (large) subgraph is available, output solutions that are close to the solutions output when the whole graph is available. We formalize this idea by introducing the notion of average sensitivity of graph algorithms, which is the average earth mover's distance between the output distributions of an algorithm on a graph and its subgraph obtained by removing an edge, where the average is over the edges removed and the distance between two outputs is the Hamming distance. In this work, we initiate a systematic study of average sensitivity. After deriving basic properties of average sensitivity such as composition, we provide efficient approximation algorithms with low average sensitivities for concrete graph problems, including the minimum spanning forest problem, the global minimum cut problem, the minimum ss-tt cut problem, and the maximum matching problem. In addition, we prove that the average sensitivity of our global minimum cut algorithm is almost optimal, by showing a nearly matching lower bound. We also show that every algorithm for the 2-coloring problem has average sensitivity linear in the number of vertices. One of the main ideas involved in designing our algorithms with low average sensitivity is the following fact; if the presence of a vertex or an edge in the solution output by an algorithm can be decided locally, then the algorithm has a low average sensitivity, allowing us to reuse the analyses of known sublinear-time algorithms and local computation algorithms (LCAs). Using this connection, we show that every LCA for 2-coloring has linear query complexity, thereby answering an open question.Comment: 39 pages, 1 figur

    Fast Generation of Random Spanning Trees and the Effective Resistance Metric

    Full text link
    We present a new algorithm for generating a uniformly random spanning tree in an undirected graph. Our algorithm samples such a tree in expected O~(m4/3)\tilde{O}(m^{4/3}) time. This improves over the best previously known bound of min(O~(mn),O(nω))\min(\tilde{O}(m\sqrt{n}),O(n^{\omega})) -- that follows from the work of Kelner and M\k{a}dry [FOCS'09] and of Colbourn et al. [J. Algorithms'96] -- whenever the input graph is sufficiently sparse. At a high level, our result stems from carefully exploiting the interplay of random spanning trees, random walks, and the notion of effective resistance, as well as from devising a way to algorithmically relate these concepts to the combinatorial structure of the graph. This involves, in particular, establishing a new connection between the effective resistance metric and the cut structure of the underlying graph

    DSA-aware multiple patterning for the manufacturing of vias: Connections to graph coloring problems, IP formulations, and numerical experiments

    Full text link
    In this paper, we investigate the manufacturing of vias in integrated circuits with a new technology combining lithography and Directed Self Assembly (DSA). Optimizing the production time and costs in this new process entails minimizing the number of lithography steps, which constitutes a generalization of graph coloring. We develop integer programming formulations for several variants of interest in the industry, and then study the computational performance of our formulations on true industrial instances. We show that the best integer programming formulation achieves good computational performance, and indicate potential directions to further speed-up computational time and develop exact approaches feasible for production

    Parameterized Inapproximability for Steiner Orientation by Gap Amplification

    Get PDF
    In the k-Steiner Orientation problem, we are given a mixed graph, that is, with both directed and undirected edges, and a set of k terminal pairs. The goal is to find an orientation of the undirected edges that maximizes the number of terminal pairs for which there is a path from the source to the sink. The problem is known to be W[1]-hard when parameterized by k and hard to approximate up to some constant for FPT algorithms assuming Gap-ETH. On the other hand, no approximation factor better than ?(k) is known. We show that k-Steiner Orientation is unlikely to admit an approximation algorithm with any constant factor, even within FPT running time. To obtain this result, we construct a self-reduction via a hashing-based gap amplification technique, which turns out useful even outside of the FPT paradigm. Precisely, we rule out any approximation factor of the form (log k)^o(1) for FPT algorithms (assuming FPT ? W[1]) and (log n)^o(1) for purely polynomial-time algorithms (assuming that the class W[1] does not admit randomized FPT algorithms). This constitutes a novel inapproximability result for polynomial-time algorithms obtained via tools from the FPT theory. Moreover, we prove k-Steiner Orientation to belong to W[1], which entails W[1]-completeness of (log k)^o(1)-approximation for k-Steiner Orientation. This provides an example of a natural approximation task that is complete in a parameterized complexity class. Finally, we apply our technique to the maximization version of directed multicut - Max (k,p)-Directed Multicut - where we are given a directed graph, k terminals pairs, and a budget p. The goal is to maximize the number of separated terminal pairs by removing p edges. We present a simple proof that the problem admits no FPT approximation with factor ?(k^(1/2 - ?)) (assuming FPT ? W[1]) and no polynomial-time approximation with ratio ?(|E(G)|^(1/2 - ?)) (assuming NP ? co-RP)

    Most Balanced Minimum Cuts and Partially Ordered Knapsack

    Get PDF
    Abstract We consider the problem of finding most balanced cuts among minimum st-edge cuts and minimum st-vertex cuts, for given vertices s and t, according to different balance criteria. For edge cuts [S, S] we seek to maximize min{|S|, |S|}. For vertex cuts C of G we consider the objectives of (i) maximizing min{|S|, |T |}, where {S, T } is a partition of V (G)\C with s ∈ S, t ∈ T and [S, T ] = ∅, (ii) minimizing the order of the largest component of G − C, and (iii) maximizing the order of the smallest component of G − C. All of these problems are shown to be NP-hard. We give a PTAS for the edge cut variant and for (i). We give a 2-approximation for (ii), and show that no non-trivial approximation exists for (iii) unless P=NP. To prove these results we show that we can partition the vertices of G, and define a partial order on the subsets of the partition, such that ideals of the partial order correspond bijectively to minimum st-cuts of G. This shows that the problems are closely related to Uniform Partially Ordered Knapsack (UPOK), a variant of POK where element utilities are equal to element weights. Our PTAS is also a PTAS for special types of UPOK instances
    corecore