1,102 research outputs found

    Axioms for graph clustering quality functions

    Get PDF
    We investigate properties that intuitively ought to be satisfied by graph clustering quality functions, that is, functions that assign a score to a clustering of a graph. Graph clustering, also known as network community detection, is often performed by optimizing such a function. Two axioms tailored for graph clustering quality functions are introduced, and the four axioms introduced in previous work on distance based clustering are reformulated and generalized for the graph setting. We show that modularity, a standard quality function for graph clustering, does not satisfy all of these six properties. This motivates the derivation of a new family of quality functions, adaptive scale modularity, which does satisfy the proposed axioms. Adaptive scale modularity has two parameters, which give greater flexibility in the kinds of clusterings that can be found. Standard graph clustering quality functions, such as normalized cut and unnormalized cut, are obtained as special cases of adaptive scale modularity. In general, the results of our investigation indicate that the considered axiomatic framework covers existing `good' quality functions for graph clustering, and can be used to derive an interesting new family of quality functions.Comment: 23 pages. Full text and sources available on: http://www.cs.ru.nl/~T.vanLaarhoven/graph-clustering-axioms-2014

    Violating the Shannon capacity of metric graphs with entanglement

    Full text link
    The Shannon capacity of a graph G is the maximum asymptotic rate at which messages can be sent with zero probability of error through a noisy channel with confusability graph G. This extensively studied graph parameter disregards the fact that on atomic scales, Nature behaves in line with quantum mechanics. Entanglement, arguably the most counterintuitive feature of the theory, turns out to be a useful resource for communication across noisy channels. Recently, Leung, Mancinska, Matthews, Ozols and Roy [Comm. Math. Phys. 311, 2012] presented two examples of graphs whose Shannon capacity is strictly less than the capacity attainable if the sender and receiver have entangled quantum systems. Here we give new, possibly infinite, families of graphs for which the entangled capacity exceeds the Shannon capacity.Comment: 15 pages, 2 figure

    Assessing the Computational Complexity of Multi-Layer Subgraph Detection

    Get PDF
    Multi-layer graphs consist of several graphs (layers) over the same vertex set. They are motivated by real-world problems where entities (vertices) are associated via multiple types of relationships (edges in different layers). We chart the border of computational (in)tractability for the class of subgraph detection problems on multi-layer graphs, including fundamental problems such as maximum matching, finding certain clique relaxations (motivated by community detection), or path problems. Mostly encountering hardness results, sometimes even for two or three layers, we can also spot some islands of tractability

    A Tutorial on Clique Problems in Communications and Signal Processing

    Full text link
    Since its first use by Euler on the problem of the seven bridges of K\"onigsberg, graph theory has shown excellent abilities in solving and unveiling the properties of multiple discrete optimization problems. The study of the structure of some integer programs reveals equivalence with graph theory problems making a large body of the literature readily available for solving and characterizing the complexity of these problems. This tutorial presents a framework for utilizing a particular graph theory problem, known as the clique problem, for solving communications and signal processing problems. In particular, the paper aims to illustrate the structural properties of integer programs that can be formulated as clique problems through multiple examples in communications and signal processing. To that end, the first part of the tutorial provides various optimal and heuristic solutions for the maximum clique, maximum weight clique, and kk-clique problems. The tutorial, further, illustrates the use of the clique formulation through numerous contemporary examples in communications and signal processing, mainly in maximum access for non-orthogonal multiple access networks, throughput maximization using index and instantly decodable network coding, collision-free radio frequency identification networks, and resource allocation in cloud-radio access networks. Finally, the tutorial sheds light on the recent advances of such applications, and provides technical insights on ways of dealing with mixed discrete-continuous optimization problems

    Optimality Clue for Graph Coloring Problem

    Full text link
    In this paper, we present a new approach which qualifies or not a solution found by a heuristic as a potential optimal solution. Our approach is based on the following observation: for a minimization problem, the number of admissible solutions decreases with the value of the objective function. For the Graph Coloring Problem (GCP), we confirm this observation and present a new way to prove optimality. This proof is based on the counting of the number of different k-colorings and the number of independent sets of a given graph G. Exact solutions counting problems are difficult problems (\#P-complete). However, we show that, using only randomized heuristics, it is possible to define an estimation of the upper bound of the number of k-colorings. This estimate has been calibrated on a large benchmark of graph instances for which the exact number of optimal k-colorings is known. Our approach, called optimality clue, build a sample of k-colorings of a given graph by running many times one randomized heuristic on the same graph instance. We use the evolutionary algorithm HEAD [Moalic et Gondran, 2018], which is one of the most efficient heuristic for GCP. Optimality clue matches with the standard definition of optimality on a wide number of instances of DIMACS and RBCII benchmarks where the optimality is known. Then, we show the clue of optimality for another set of graph instances. Optimality Metaheuristics Near-optimal

    Algorithmic and enumerative aspects of the Moser-Tardos distribution

    Full text link
    Moser & Tardos have developed a powerful algorithmic approach (henceforth "MT") to the Lovasz Local Lemma (LLL); the basic operation done in MT and its variants is a search for "bad" events in a current configuration. In the initial stage of MT, the variables are set independently. We examine the distributions on these variables which arise during intermediate stages of MT. We show that these configurations have a more or less "random" form, building further on the "MT-distribution" concept of Haeupler et al. in understanding the (intermediate and) output distribution of MT. This has a variety of algorithmic applications; the most important is that bad events can be found relatively quickly, improving upon MT across the complexity spectrum: it makes some polynomial-time algorithms sub-linear (e.g., for Latin transversals, which are of basic combinatorial interest), gives lower-degree polynomial run-times in some settings, transforms certain super-polynomial-time algorithms into polynomial-time ones, and leads to Las Vegas algorithms for some coloring problems for which only Monte Carlo algorithms were known. We show that in certain conditions when the LLL condition is violated, a variant of the MT algorithm can still produce a distribution which avoids most of the bad events. We show in some cases this MT variant can run faster than the original MT algorithm itself, and develop the first-known criterion for the case of the asymmetric LLL. This can be used to find partial Latin transversals -- improving upon earlier bounds of Stein (1975) -- among other applications. We furthermore give applications in enumeration, showing that most applications (where we aim for all or most of the bad events to be avoided) have many more solutions than known before by proving that the MT-distribution has "large" min-entropy and hence that its support-size is large

    Two extensions of Ramsey's theorem

    Get PDF
    Ramsey's theorem, in the version of Erd\H{o}s and Szekeres, states that every 2-coloring of the edges of the complete graph on {1, 2,...,n} contains a monochromatic clique of order 1/2\log n. In this paper, we consider two well-studied extensions of Ramsey's theorem. Improving a result of R\"odl, we show that there is a constant c>0c>0 such that every 2-coloring of the edges of the complete graph on \{2, 3,...,n\} contains a monochromatic clique S for which the sum of 1/\log i over all vertices i \in S is at least c\log\log\log n. This is tight up to the constant factor c and answers a question of Erd\H{o}s from 1981. Motivated by a problem in model theory, V\"a\"an\"anen asked whether for every k there is an n such that the following holds. For every permutation \pi of 1,...,k-1, every 2-coloring of the edges of the complete graph on {1, 2, ..., n} contains a monochromatic clique a_1<...<a_k with a_{\pi(1)+1}-a_{\pi(1)}>a_{\pi(2)+1}-a_{\pi(2)}>...>a_{\pi(k-1)+1}-a_{\pi(k-1)}. That is, not only do we want a monochromatic clique, but the differences between consecutive vertices must satisfy a prescribed order. Alon and, independently, Erd\H{o}s, Hajnal and Pach answered this question affirmatively. Alon further conjectured that the true growth rate should be exponential in k. We make progress towards this conjecture, obtaining an upper bound on n which is exponential in a power of k. This improves a result of Shelah, who showed that n is at most double-exponential in k.Comment: 21 pages, accepted for publication in Duke Math.

    Unifying Sparsest Cut, Cluster Deletion, and Modularity Clustering Objectives with Correlation Clustering

    Get PDF
    Graph clustering, or community detection, is the task of identifying groups of closely related objects in a large network. In this paper we introduce a new community-detection framework called LambdaCC that is based on a specially weighted version of correlation clustering. A key component in our methodology is a clustering resolution parameter, λ\lambda, which implicitly controls the size and structure of clusters formed by our framework. We show that, by increasing this parameter, our objective effectively interpolates between two different strategies in graph clustering: finding a sparse cut and forming dense subgraphs. Our methodology unifies and generalizes a number of other important clustering quality functions including modularity, sparsest cut, and cluster deletion, and places them all within the context of an optimization problem that has been well studied from the perspective of approximation algorithms. Our approach is particularly relevant in the regime of finding dense clusters, as it leads to a 2-approximation for the cluster deletion problem. We use our approach to cluster several graphs, including large collaboration networks and social networks

    High-dimensional learning of linear causal networks via inverse covariance estimation

    Get PDF
    We establish a new framework for statistical estimation of directed acyclic graphs (DAGs) when data are generated from a linear, possibly non-Gaussian structural equation model. Our framework consists of two parts: (1) inferring the moralized graph from the support of the inverse covariance matrix; and (2) selecting the best-scoring graph amongst DAGs that are consistent with the moralized graph. We show that when the error variances are known or estimated to close enough precision, the true DAG is the unique minimizer of the score computed using the reweighted squared l_2-loss. Our population-level results have implications for the identifiability of linear SEMs when the error covariances are specified up to a constant multiple. On the statistical side, we establish rigorous conditions for high-dimensional consistency of our two-part algorithm, defined in terms of a "gap" between the true DAG and the next best candidate. Finally, we demonstrate that dynamic programming may be used to select the optimal DAG in linear time when the treewidth of the moralized graph is bounded.Comment: 41 pages, 7 figure
    • …
    corecore