
    An exponential lower bound for Individualization-Refinement algorithms for Graph Isomorphism

    The individualization-refinement paradigm provides a strong toolbox for testing isomorphism of two graphs, and indeed the currently fastest implementations of isomorphism solvers all follow this approach. While these solvers are fast in practice, from a theoretical point of view no general lower bounds on the worst-case complexity of these tools are known. In fact, it is an open question whether individualization-refinement algorithms can achieve upper bounds on the running time similar to the more theoretical techniques based on a group-theoretic approach. In this work we give a negative answer to this question and construct a family of graphs on which algorithms based on the individualization-refinement paradigm require exponential time. In contrast to a previous construction of Miyazaki, which applies only to a specific implementation within the individualization-refinement framework, our construction is immune to changing the cell selector or adding various heuristic invariants to the algorithm. Furthermore, our graphs also provide exponential lower bounds when the $k$-dimensional Weisfeiler-Leman algorithm is used to replace the standard color refinement operator, and the arguments even work when the entire automorphism group of the inputs is initially provided to the algorithm.
    Comment: 21 pages

    On the Optimality of Pseudo-polynomial Algorithms for Integer Programming

    In the classic Integer Programming (IP) problem, the objective is to decide whether, for a given $m \times n$ matrix $A$ and an $m$-vector $b=(b_1,\dots,b_m)$, there is a non-negative integer $n$-vector $x$ such that $Ax=b$. Solving (IP) is an important step in numerous algorithms, and it is important to understand the precise complexity of this problem as a function of natural parameters of the input. The classic pseudo-polynomial time algorithm of Papadimitriou [J. ACM 1981] for instances of (IP) with a constant number of constraints was only recently improved upon by Eisenbrand and Weismantel [SODA 2018] and Jansen and Rohwedder [ArXiv 2018]. We continue this line of work and show that under the Exponential Time Hypothesis (ETH), the algorithm of Jansen and Rohwedder is nearly optimal. We also show that when the matrix $A$ is assumed to be non-negative, a component of Papadimitriou's original algorithm is already nearly optimal under ETH. This motivates us to pick up the line of research initiated by Cunningham and Geelen [IPCO 2007], who studied the complexity of solving (IP) with non-negative matrices in which the number of constraints may be unbounded, but the branch-width of the column-matroid corresponding to the constraint matrix is a constant. We prove a lower bound on the complexity of solving (IP) for such instances and obtain optimal results with respect to a closely related parameter, path-width. Specifically, we prove matching upper and lower bounds for (IP) when the path-width of the corresponding column-matroid is a constant.
    Comment: 29 pages, To appear in ESA 201
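
    To make the decision problem $Ax=b$ concrete for the non-negative case, the following is a minimal sketch of a pseudo-polynomial feasibility check. It is not the algorithm of Papadimitriou, Eisenbrand and Weismantel, or Jansen and Rohwedder; it is a plain breadth-first search over partial right-hand sides, and the function name `ip_feasible_nonneg` is hypothetical. It assumes all entries of $A$ and $b$ are non-negative integers.

```python
from collections import deque


def ip_feasible_nonneg(A, b):
    """Decide whether Ax = b has a non-negative integer solution x.

    Assumption (not from the abstract): A and b contain only non-negative
    integers, so every partial sum of columns stays between 0 and b.
    The search visits at most prod(b_i + 1) states, which is
    pseudo-polynomial when the number of rows m is constant.
    """
    m = len(b)
    n = len(A[0])
    # Columns of A as tuples; all-zero columns never change the state.
    columns = [tuple(A[i][j] for i in range(m)) for j in range(n)]
    columns = [c for c in columns if any(c)]

    target = tuple(b)
    start = (0,) * m
    seen = {start}
    queue = deque([start])
    while queue:
        state = queue.popleft()
        if state == target:
            return True
        for col in columns:
            nxt = tuple(s + c for s, c in zip(state, col))
            # Prune states that overshoot b in any coordinate.
            if nxt not in seen and all(v <= t for v, t in zip(nxt, target)):
                seen.add(nxt)
                queue.append(nxt)
    return False
```

    For example, with the single constraint $2x_1 + 3x_2 = 7$ the check succeeds (take $x = (2, 1)$), while $2x_1 + 4x_2 = 7$ has no solution because the left-hand side is always even.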

    Supporting User-Defined Functions on Uncertain Data

    Uncertain data management has become crucial in many sensing and scientific applications. As user-defined functions (UDFs) become widely used in these applications, an important task is to capture result uncertainty for queries that evaluate UDFs on uncertain data. In this work, we provide a general framework for supporting UDFs on uncertain data. Specifically, we propose a learning approach based on Gaussian processes (GPs) to compute approximate output distributions of a UDF when evaluated on uncertain input, with guaranteed error bounds. We also devise an online algorithm to compute such output distributions, which employs a suite of optimizations to improve accuracy and performance. Our evaluation using both real-world and synthetic functions shows that our proposed GP approach can outperform the state-of-the-art sampling approach with up to two orders of magnitude improvement for a variety of UDFs.

    Dispersion for Data-Driven Algorithm Design, Online Learning, and Private Optimization

    Data-driven algorithm design, that is, choosing the best algorithm for a specific application, is a crucial problem in modern data science. Practitioners often optimize over a parameterized algorithm family, tuning parameters based on problems from their domain. These procedures have historically come with no guarantees, though a recent line of work studies algorithm selection from a theoretical perspective. We advance the foundations of this field in several directions: we analyze online algorithm selection, where problems arrive one-by-one and the goal is to minimize regret, and private algorithm selection, where the goal is to find good parameters over a set of problems without revealing sensitive information contained therein. We study important algorithm families, including SDP-rounding schemes for problems formulated as integer quadratic programs, and greedy techniques for canonical subset selection problems. In these cases, the algorithm's performance is a volatile and piecewise Lipschitz function of its parameters, since tweaking the parameters can completely change the algorithm's behavior. We give a sufficient and general condition, dispersion, defining a family of piecewise Lipschitz functions that can be optimized online and privately, which includes the functions measuring the performance of the algorithms we study. Intuitively, a set of piecewise Lipschitz functions is dispersed if no small region contains many of the functions' discontinuities. We present general techniques for online and private optimization of the sum of dispersed piecewise Lipschitz functions. We improve over the best-known regret bounds for a variety of problems, prove regret bounds for problems not previously studied, and give matching lower bounds. We also give matching upper and lower bounds on the utility loss due to privacy. Moreover, we uncover dispersion in auction design and pricing problems.

    Structurally Parameterized d-Scattered Set

    In $d$-Scattered Set we are given an (edge-weighted) graph and are asked to select at least $k$ vertices, so that the distance between any pair is at least $d$, thus generalizing Independent Set. We provide upper and lower bounds on the complexity of this problem with respect to various standard graph parameters. In particular, we show the following:
    - For any $d \ge 2$, an $O^*(d^{\textrm{tw}})$-time algorithm, where $\textrm{tw}$ is the treewidth of the input graph.
    - A tight SETH-based lower bound matching this algorithm's performance. These generalize known results for Independent Set.
    - $d$-Scattered Set is W[1]-hard parameterized by vertex cover (for edge-weighted graphs), or feedback vertex set (for unweighted graphs), even if $k$ is an additional parameter.
    - A single-exponential algorithm parameterized by vertex cover for unweighted graphs, complementing the above-mentioned hardness.
    - A $2^{O(\textrm{td}^2)}$-time algorithm parameterized by tree-depth ($\textrm{td}$), as well as a matching ETH-based lower bound, both for unweighted graphs.
    We complement these mostly negative results by providing an FPT approximation scheme parameterized by treewidth. In particular, we give an algorithm which, for any error parameter $\epsilon > 0$, runs in time $O^*((\textrm{tw}/\epsilon)^{O(\textrm{tw})})$ and returns a $d/(1+\epsilon)$-scattered set of size $k$, if a $d$-scattered set of the same size exists.
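
    The definition of a $d$-scattered set can be illustrated with a small verifier. This is only a sketch of the feasibility condition for unweighted graphs (all pairwise distances at least $d$), not any of the algorithms from the abstract; the adjacency-dictionary representation and the names `bfs_distances` and `is_d_scattered` are assumptions made for illustration.

```python
from collections import deque
from itertools import combinations


def bfs_distances(adj, source):
    """Return hop distances from source in an unweighted graph.

    adj maps each vertex to an iterable of its neighbours.
    Vertices unreachable from source are absent from the result.
    """
    dist = {source: 0}
    queue = deque([source])
    while queue:
        u = queue.popleft()
        for v in adj[u]:
            if v not in dist:
                dist[v] = dist[u] + 1
                queue.append(v)
    return dist


def is_d_scattered(adj, vertices, d):
    """Check that every pair of chosen vertices is at distance >= d.

    Pairs in different connected components count as infinitely far apart.
    """
    dist_from = {v: bfs_distances(adj, v) for v in vertices}
    for u, v in combinations(vertices, 2):
        if dist_from[u].get(v, float("inf")) < d:
            return False
    return True
```

    On the path 0-1-2-3-4, the set {0, 2, 4} is 2-scattered (an independent set) but not 3-scattered, while {0, 3} is 3-scattered. With $d = 2$ the check reduces to testing independence, which matches the remark that the problem generalizes Independent Set.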

    Analysis of Noisy Evolutionary Optimization When Sampling Fails

    In noisy evolutionary optimization, sampling is a common strategy to deal with noise. Under the sampling strategy, the fitness of a solution is evaluated multiple times independently (the number of evaluations is called the sample size), and its true fitness is then approximated by the average of these evaluations. Previous studies on sampling are mainly empirical. In this paper, we first investigate the effect of sample size from a theoretical perspective. By analyzing the (1+1)-EA on the noisy LeadingOnes problem, we show that as the sample size increases, the running time can reduce from exponential to polynomial, but then return to exponential. This suggests that a proper sample size is crucial in practice. Then, we investigate what strategies can work when sampling with any fixed sample size fails. By two illustrative examples, we prove that using parent or offspring populations can be better. Finally, we construct an artificial noisy example to show that when neither sampling nor populations is effective, adaptive sampling (i.e., sampling with an adaptive sample size) can work. This, for the first time, provides theoretical support for the use of adaptive sampling.
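
    The sampling strategy described above (average several independent noisy evaluations) can be sketched inside a (1+1)-EA on LeadingOnes. The noise model is an assumption: the abstract does not specify it, so additive Gaussian noise with standard deviation `sigma` is used here purely for illustration, and the function names are hypothetical.

```python
import random


def leading_ones(x):
    """Number of consecutive 1-bits at the start of the bit list x."""
    count = 0
    for bit in x:
        if bit != 1:
            break
        count += 1
    return count


def noisy_fitness(x, sigma, rng):
    # Assumed noise model (not from the abstract): additive Gaussian noise.
    return leading_ones(x) + rng.gauss(0.0, sigma)


def sampled_fitness(x, sample_size, sigma, rng):
    """Average of sample_size independent noisy evaluations of x."""
    total = sum(noisy_fitness(x, sigma, rng) for _ in range(sample_size))
    return total / sample_size


def one_plus_one_ea(n, sample_size, sigma, max_iters, seed=0):
    """(1+1)-EA with bit-wise mutation rate 1/n and sampled fitness.

    Both parent and offspring are re-evaluated in every iteration, and the
    offspring replaces the parent if its sampled fitness is not worse.
    """
    rng = random.Random(seed)
    parent = [rng.randint(0, 1) for _ in range(n)]
    for _ in range(max_iters):
        child = [1 - b if rng.random() < 1.0 / n else b for b in parent]
        child_fit = sampled_fitness(child, sample_size, sigma, rng)
        parent_fit = sampled_fitness(parent, sample_size, sigma, rng)
        if child_fit >= parent_fit:
            parent = child
    return parent
```

    Calling `one_plus_one_ea(n=20, sample_size=10, sigma=1.0, max_iters=5000)` and varying `sample_size` gives a quick way to experiment with how averaging affects progress under this assumed noise model; it does not reproduce the running-time bounds stated in the abstract.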