    Distributed local approximation algorithms for maximum matching in graphs and hypergraphs

    We describe approximation algorithms in Linial's classic LOCAL model of distributed computing to find maximum-weight matchings in a hypergraph of rank $r$. Our main result is a deterministic algorithm to generate a matching which is an $O(r)$-approximation to the maximum weight matching, running in $\tilde O(r \log \Delta + \log^2 \Delta + \log^* n)$ rounds. (Here, the $\tilde O()$ notation hides $\text{polyloglog } \Delta$ and $\text{polylog } r$ factors.) This is based on a number of new derandomization techniques extending methods of Ghaffari, Harris & Kuhn (2017). As a main application, we obtain nearly-optimal algorithms for the long-studied problem of maximum-weight graph matching. Specifically, we get a $(1+\epsilon)$-approximation algorithm using $\tilde O(\log \Delta / \epsilon^3 + \text{polylog}(1/\epsilon, \log \log n))$ randomized time and $\tilde O(\log^2 \Delta / \epsilon^4 + \log^* n / \epsilon)$ deterministic time. The second application is a faster algorithm for hypergraph maximal matching, a versatile subroutine introduced in Ghaffari et al. (2017) for a variety of local graph algorithms. This gives an algorithm for $(2 \Delta - 1)$-edge-list coloring in $\tilde O(\log^2 \Delta \log n)$ rounds deterministically or $\tilde O((\log \log n)^3)$ rounds randomly. Another consequence (with additional optimizations) is an algorithm which generates an edge-orientation with out-degree at most $\lceil (1+\epsilon) \lambda \rceil$ for a graph of arboricity $\lambda$; for fixed $\epsilon$ this runs in $\tilde O(\log^6 n)$ rounds deterministically or $\tilde O(\log^3 n)$ rounds randomly.

    Some results on chromatic number as a function of triangle count

    A variety of powerful extremal results have been shown for the chromatic number of triangle-free graphs. Three noteworthy bounds are in terms of the number of vertices, edges, and maximum degree given by Poljak & Tuza (1994), and Johansson. There have been comparatively fewer works extending these types of bounds to graphs with a small number of triangles. One noteworthy exception is a result of Alon et al. (1999) bounding the chromatic number for graphs with low degree and few triangles per vertex; this bound is nearly the same as for triangle-free graphs. This type of parametrization is much less rigid, and has appeared in dozens of combinatorial constructions. In this paper, we show a similar type of result for $\chi(G)$ as a function of the number of vertices $n$, the number of edges $m$, as well as the triangle count (both local and global measures). Our results smoothly interpolate between the generic bounds true for all graphs and bounds for triangle-free graphs. Our results are tight for most of these cases; we show how an open problem regarding fractional chromatic number and degeneracy in triangle-free graphs can resolve the small remaining gap in our bounds.

    Deterministic parallel algorithms for bilinear objective functions

    Many randomized algorithms can be derandomized efficiently using either the method of conditional expectations or probability spaces with low independence. A series of papers, beginning with work by Luby (1988), showed that in many cases these techniques can be combined to give deterministic parallel (NC) algorithms for a variety of combinatorial optimization problems, with low time- and processor-complexity. We extend and generalize a technique of Luby for efficiently handling bilinear objective functions. One noteworthy application is an NC algorithm for maximal independent set. On a graph $G$ with $m$ edges and $n$ vertices, this takes $\tilde O(\log^2 n)$ time and $(m + n) n^{o(1)}$ processors, nearly matching the best randomized parallel algorithms. Other applications include reduced processor counts for algorithms of Berger (1997) for maximum acyclic subgraph and Gale-Berlekamp switching games. This bilinear factorization also gives better algorithms for problems involving discrepancy. An important application of this is to automata-fooling probability spaces, which are the basis of a notable derandomization technique of Sivakumar (2002). Our method leads to large reductions in processor complexity for a number of derandomization algorithms based on automata-fooling, including set discrepancy and the Johnson-Lindenstrauss Lemma.

    Parallel algorithms and concentration bounds for the Lovasz Local Lemma via witness DAGs

    The Lovász Local Lemma (LLL) is a cornerstone principle in the probabilistic method of combinatorics, and a seminal algorithm of Moser & Tardos (2010) provides an efficient randomized algorithm to implement it. This can be parallelized to give an algorithm that uses polynomially many processors and runs in $O(\log^3 n)$ time on an EREW PRAM, stemming from $O(\log n)$ adaptive computations of a maximal independent set (MIS). Chung et al. (2014) developed faster local and parallel algorithms, potentially running in time $O(\log^2 n)$, but these algorithms require more stringent conditions than the LLL. We give a new parallel algorithm that works under essentially the same conditions as the original algorithm of Moser & Tardos but uses only a single MIS computation, thus running in $O(\log^2 n)$ time on an EREW PRAM. This can be derandomized to give an NC algorithm running in time $O(\log^2 n)$ as well, speeding up a previous NC LLL algorithm of Chandrasekaran et al. (2013). We also provide improved and tighter bounds on the run-times of the sequential and parallel resampling-based algorithms originally developed by Moser & Tardos. These apply to any problem instance in which the tighter Shearer LLL criterion is satisfied.
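
    For orientation, the sequential resampling loop of Moser & Tardos that the abstract builds on can be sketched as follows. This is only a minimal illustration for CNF-SAT, not the parallel or derandomized algorithm of the paper; the function name `moser_tardos_sat`, the clause encoding, and the `max_resamples` budget are assumptions made for this example.

```python
import random


def moser_tardos_sat(num_vars, clauses, seed=0, max_resamples=100_000):
    """Sequential Moser-Tardos resampling for CNF-SAT (illustrative sketch).

    clauses: list of clauses, each a list of nonzero ints. Literal v means
    variable v is True, -v means variable v is False (variables are 1-indexed).
    Returns (assignment dict, number of resampling steps).
    """
    rng = random.Random(seed)
    # Initial stage: every variable is drawn independently.
    assign = {v: rng.random() < 0.5 for v in range(1, num_vars + 1)}

    def violated(clause):
        # A clause is a "bad event" when every one of its literals is false.
        return all(assign[abs(lit)] != (lit > 0) for lit in clause)

    resamples = 0
    while True:
        bad = [c for c in clauses if violated(c)]
        if not bad:
            return assign, resamples
        if resamples >= max_resamples:
            raise RuntimeError("resampling budget exhausted")
        # Resample all variables of one violated clause (here: the first found).
        for lit in bad[0]:
            assign[abs(lit)] = rng.random() < 0.5
        resamples += 1
```

    The parallel variants discussed above differ in how they select bad events to resample in each round (via MIS computations); this sketch resamples one event at a time.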

    Tight bounds and conjectures for the isolation lemma

    Given a hypergraph $H$ and a weight function $w: V \rightarrow \{1, \dots, M\}$ on its vertices, we say that $w$ is isolating if there is exactly one edge of minimum weight $w(e) = \sum_{i \in e} w(i)$. The Isolation Lemma is a combinatorial principle introduced in Mulmuley et al. (1987) which gives a lower bound on the number of isolating weight functions. Mulmuley used this as the basis of a parallel algorithm for finding perfect graph matchings. It has a number of other applications to parallel algorithms and to reductions of general search problems to unique search problems (in which there are one or zero solutions). The original bound given by Mulmuley et al. was recently improved by Ta-Shma (2015). In this paper, we show improved lower bounds on the number of isolating weight functions, and we conjecture that the extremal case is when $H$ consists of $n$ singleton edges. When $M \gg n$ our improved bound matches this extremal case asymptotically. We are able to show that this conjecture holds in a number of special cases: when $H$ is a linear hypergraph or is 1-degenerate, or when $M = 2$. We also show that it holds asymptotically when $M \gg n \gg 1$.
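
    The definition of an isolating weight function can be checked directly on small instances. The sketch below is a brute-force illustration only, not a method from the paper; the names `is_isolating` and `count_isolating` and the 0-indexed vertex encoding are assumptions made for this example.

```python
from itertools import product


def is_isolating(edges, w):
    """True if exactly one edge attains the minimum of w(e) = sum of w[i] for i in e."""
    totals = [sum(w[i] for i in e) for e in edges]
    return totals.count(min(totals)) == 1


def count_isolating(num_vertices, edges, M):
    """Brute-force count of isolating weight functions w: V -> {1, ..., M}.

    Vertices are 0..num_vertices-1 and each edge is a tuple of vertices.
    Enumerates all M**num_vertices weight functions, so only for tiny inputs.
    """
    return sum(
        is_isolating(edges, w)
        for w in product(range(1, M + 1), repeat=num_vertices)
    )


# The conjectured extremal case: n singleton edges (here n = 3, M = 4).
singletons = [(0,), (1,), (2,)]
print(count_isolating(3, singletons, 4))  # 42 of the 64 weight functions
```

    For singleton edges, a weight function is isolating exactly when its minimum value is attained once, which gives $3 \cdot (3^2 + 2^2 + 1^2) = 42$ in this instance.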

    Improved bounds and algorithms for graph cuts and network reliability

    Karger (SIAM Journal on Computing, 1999) developed the first fully-polynomial approximation scheme to estimate the probability that a graph $G$ becomes disconnected, given that its edges are removed independently with probability $p$. This algorithm runs in $n^{5+o(1)} \epsilon^{-3}$ time to obtain an estimate within relative error $\epsilon$. We improve this run-time through algorithmic and graph-theoretic advances. First, there is a certain key sub-problem encountered by Karger, for which a generic estimation procedure is employed; we show that this has a special structure for which a much more efficient algorithm can be used. Second, we show better bounds on the number of edge cuts which are likely to fail. Here, Karger's analysis uses a variety of bounds for various graph parameters; we show that these bounds cannot be simultaneously tight. We describe a new graph parameter, which simultaneously influences all the bounds used by Karger, and obtain much tighter estimates of the cut structure of $G$. These techniques allow us to improve the runtime to $n^{3+o(1)} \epsilon^{-2}$. Our results also rigorously prove certain experimental observations of Karger & Tai (Proc. ACM-SIAM Symposium on Discrete Algorithms, 1997). Our rigorous proofs are motivated by certain non-rigorous differential-equation approximations which, however, provably track the worst-case trajectories of the relevant parameters. A key driver of Karger's approach (and other cut-related results) is a bound on the number of small cuts: we improve these estimates when the min-cut size is "small" and odd, augmenting, in part, a result of Bixby (Bulletin of the AMS, 1974).
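
    To make the estimated quantity concrete, the following sketch computes a naive Monte Carlo estimate of the disconnection probability. It is a baseline illustration only, not Karger's scheme or the improved algorithm of the paper; the names `is_connected` and `estimate_disconnect_probability` and the `trials` and `seed` parameters are assumptions made for this example.

```python
import random


def is_connected(n, edges):
    """Check connectivity of a graph on vertices 0..n-1 using union-find."""
    parent = list(range(n))

    def find(x):
        while parent[x] != x:
            parent[x] = parent[parent[x]]  # path halving
            x = parent[x]
        return x

    components = n
    for u, v in edges:
        ru, rv = find(u), find(v)
        if ru != rv:
            parent[ru] = rv
            components -= 1
    return components == 1


def estimate_disconnect_probability(n, edges, p, trials=10_000, seed=0):
    """Fraction of trials in which the graph is disconnected after each
    edge is removed independently with probability p."""
    rng = random.Random(seed)
    failures = 0
    for _ in range(trials):
        # An edge survives when its draw is at least p (removed with probability p).
        surviving = [e for e in edges if rng.random() >= p]
        if not is_connected(n, surviving):
            failures += 1
    return failures / trials
```

    Plain sampling of this kind needs a number of trials that grows inversely with the failure probability to reach a fixed relative error, which is why dedicated approximation schemes are needed when disconnection is rare.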

    Parameter estimation for integer-valued Gibbs distributions

    We consider Gibbs distributions, which are families of probability distributions over a discrete space $\Omega$ with probability mass function given by $\mu^\Omega_\beta(x) = \frac{e^{\beta H(x)}}{Z(\beta)}$. Here $H: \Omega \rightarrow \{0, \dots, n\}$ is a fixed function (called a Hamiltonian), $\beta$ is the parameter of the distribution, and the normalization factor $Z(\beta) = \sum_{x \in \Omega} e^{\beta H(x)} = \sum_{k=0}^n c_k e^{\beta k}$ is called the partition function. We study how the function $Z$ can be estimated using an oracle that produces samples $x \sim \mu^\Omega_\beta(\cdot)$ for a value $\beta$ in a given interval $[\beta_{\min}, \beta_{\max}]$. We consider the problem of estimating the normalized coefficients $c_k$ for indices $k \in \mathcal{K}$ satisfying $\max_\beta \mu^\Omega_\beta(\{x \mid H(x) = k\}) \ge \mu_*$, where $\mu_* \in (0,1)$ is a given parameter and $\mathcal{K}$ is a given subset of $\mathcal{H}$. We solve this using $\tilde O\left(\frac{\min\{q, n^2\} + \frac{\min\{\sqrt q, |\mathcal{K}|\}}{\mu_*}}{\epsilon^2}\right)$ samples where $q = \log \frac{Z(\beta_{\max})}{Z(\beta_{\min})}$, and we show this is optimal up to logarithmic factors. We also improve the sample complexity to roughly $\tilde O\left(\frac{1/\mu_* + \min\{q + n, n^2\}}{\epsilon^2}\right)$ for applications where the coefficients are log-concave (e.g. counting connected subgraphs of a given graph). As a key subroutine, we show how to estimate $q$ using $\tilde O\left(\frac{\min\{q, n^2\}}{\epsilon^2}\right)$ samples. This improves over a prior algorithm of Kolmogorov (2018) that uses $\tilde O\left(\frac{q}{\epsilon^2}\right)$ samples. We also show a "batched" version of this algorithm which simultaneously estimates $\frac{Z(\beta)}{Z(\beta_{\min})}$ for many values of $\beta$, at essentially the same cost as for estimating just $\frac{Z(\beta_{\max})}{Z(\beta_{\min})}$ alone.
We show matching lower bounds, demonstrating that this complexity is optimal as a function of $n, q$ up to logarithmic terms. Comment: Superseded by arXiv:2007.1082
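
    The quantities defined in this abstract can be computed exactly when the coefficients $c_k$ are already known, which may help fix the notation. The sketch below is illustrative only: the paper's setting estimates these quantities from samples, whereas this code assumes the coefficient list `c` is given; the function names are assumptions made for this example.

```python
import math


def partition_function(c, beta):
    """Z(beta) = sum over k of c[k] * exp(beta * k), for coefficients c[0..n]."""
    return sum(ck * math.exp(beta * k) for k, ck in enumerate(c))


def level_probabilities(c, beta):
    """Probability that H(x) = k under the Gibbs distribution, for k = 0..n."""
    z = partition_function(c, beta)
    return [ck * math.exp(beta * k) / z for k, ck in enumerate(c)]


def log_ratio_q(c, beta_min, beta_max):
    """q = log( Z(beta_max) / Z(beta_min) )."""
    return math.log(partition_function(c, beta_max) / partition_function(c, beta_min))


# Toy example: Omega = {0,1}^3 with H(x) = number of ones, so c = [1, 3, 3, 1]
# and Z(beta) = (1 + e^beta)^3.
c = [1, 3, 3, 1]
print(partition_function(c, 0.0))   # 8.0
print(level_probabilities(c, 0.0))  # [0.125, 0.375, 0.375, 0.125]
```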

    The Moser-Tardos Framework with Partial Resampling

    The resampling algorithm of Moser & Tardos is a powerful approach to develop constructive versions of the Lovász Local Lemma (LLL). We generalize this to partial resampling: when a bad event holds, we resample an appropriately-random subset of the variables that define this event, rather than the entire set as in Moser & Tardos. This is particularly useful when the bad events are determined by sums of random variables. This leads to several improved algorithmic applications in scheduling, graph transversals, packet routing, etc. For instance, we settle a conjecture of Szabó & Tardos (2006) on graph transversals asymptotically, and obtain improved approximation ratios for a packet routing problem of Leighton, Maggs, & Rao (1994).

    Algorithmic and enumerative aspects of the Moser-Tardos distribution

    Moser & Tardos have developed a powerful algorithmic approach (henceforth "MT") to the Lovász Local Lemma (LLL); the basic operation done in MT and its variants is a search for "bad" events in a current configuration. In the initial stage of MT, the variables are set independently. We examine the distributions on these variables which arise during intermediate stages of MT. We show that these configurations have a more or less "random" form, building further on the "MT-distribution" concept of Haeupler et al. in understanding the (intermediate and) output distribution of MT. This has a variety of algorithmic applications; the most important is that bad events can be found relatively quickly, improving upon MT across the complexity spectrum: it makes some polynomial-time algorithms sub-linear (e.g., for Latin transversals, which are of basic combinatorial interest), gives lower-degree polynomial run-times in some settings, transforms certain super-polynomial-time algorithms into polynomial-time ones, and leads to Las Vegas algorithms for some coloring problems for which only Monte Carlo algorithms were known. We show that under certain conditions when the LLL condition is violated, a variant of the MT algorithm can still produce a distribution which avoids most of the bad events. We show in some cases this MT variant can run faster than the original MT algorithm itself, and develop the first-known criterion for the case of the asymmetric LLL. This can be used to find partial Latin transversals -- improving upon earlier bounds of Stein (1975) -- among other applications. We furthermore give applications in enumeration, showing that most applications (where we aim for all or most of the bad events to be avoided) have many more solutions than known before by proving that the MT-distribution has "large" min-entropy and hence that its support-size is large.