
    DNF Sparsification and a Faster Deterministic Counting Algorithm

    Given a DNF formula on n variables, the two natural size measures are the number of terms, or size s(f), and the maximum width of a term, w(f). It is folklore that short DNF formulas can be made narrow. We prove a converse, showing that narrow formulas can be sparsified. More precisely, any width $w$ DNF, irrespective of its size, can be $\epsilon$-approximated by a width $w$ DNF with at most $(w\log(1/\epsilon))^{O(w)}$ terms. We combine our sparsification result with the work of Luby and Velickovic to give a faster deterministic algorithm for approximately counting the number of satisfying solutions to a DNF. Given a formula on n variables with poly(n) terms, we give a deterministic $n^{\tilde{O}(\log \log(n))}$ time algorithm that computes an additive $\epsilon$ approximation to the fraction of satisfying assignments of f for $\epsilon = 1/\mathrm{poly}(\log n)$. The previous best result, due to Luby and Velickovic from nearly two decades ago, had a run-time of $n^{\exp(O(\sqrt{\log \log n}))}$. Comment: To appear in the IEEE Conference on Computational Complexity, 201
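
    To make the quantities in this abstract concrete, the sketch below computes the size s(f), the width w(f), and the exact fraction of satisfying assignments of a small DNF by brute-force enumeration. It is only an exponential-time baseline for the quantity being approximated, not the sparsification or the deterministic counting algorithm of the paper; the term representation and the example formula are assumptions made for illustration.

```python
from itertools import product

# Assumed representation: a DNF is a list of terms; each term maps a
# variable index to the value it requires (True for x_i, False for NOT x_i).

def size(dnf):
    """Number of terms, s(f)."""
    return len(dnf)

def width(dnf):
    """Maximum number of literals in any term, w(f)."""
    return max((len(term) for term in dnf), default=0)

def satisfied(dnf, assignment):
    """True if at least one term has all of its literals satisfied."""
    return any(all(assignment[v] == val for v, val in term.items())
               for term in dnf)

def satisfying_fraction(dnf, n):
    """Exact fraction of the 2^n assignments that satisfy the DNF.

    Enumerates every assignment, so it takes time exponential in n.
    """
    hits = sum(satisfied(dnf, a) for a in product([False, True], repeat=n))
    return hits / 2 ** n

# Hypothetical example: f = (x0 AND x1) OR (NOT x2) on 3 variables.
f = [{0: True, 1: True}, {2: False}]
print(size(f), width(f), satisfying_fraction(f, 3))
```

    An additive $\epsilon$ approximation, in the sense of the abstract, is any value within $\epsilon$ of the number returned by satisfying_fraction.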

    NP-hardness of circuit minimization for multi-output functions

    Can we design efficient algorithms for finding fast algorithms? This question is captured by various circuit minimization problems, and algorithms for the corresponding tasks have significant practical applications. Following the work of Cook and Levin in the early 1970s, a central question is whether minimizing the circuit size of an explicitly given function is NP-complete. While this is known to hold in restricted models such as DNFs, making progress with respect to more expressive classes of circuits has been elusive. In this work, we establish the first NP-hardness result for circuit minimization of total functions in the setting of general (unrestricted) Boolean circuits. More precisely, we show that computing the minimum circuit size of a given multi-output Boolean function $f : \{0,1\}^n \to \{0,1\}^m$ is NP-hard under many-one polynomial-time randomized reductions. Our argument builds on a simpler NP-hardness proof for the circuit minimization problem for (single-output) Boolean functions under an extended set of generators. Complementing these results, we investigate the computational hardness of minimizing communication. We establish that several variants of this problem are NP-hard under deterministic reductions. In particular, unless P = NP, no polynomial-time computable function can approximate the deterministic two-party communication complexity of a partial Boolean function up to a polynomial. This has consequences for the class of structural results that one might hope to show about the communication complexity of partial functions.

    Oblivious Bounds on the Probability of Boolean Functions

    This paper develops upper and lower bounds for the probability of Boolean functions by treating multiple occurrences of variables as independent and assigning them new individual probabilities. We call this approach dissociation and give an exact characterization of optimal oblivious bounds, i.e., when the new probabilities are chosen independently of the probabilities of all other variables. Our motivation comes from the weighted model counting problem (or, equivalently, the problem of computing the probability of a Boolean function), which is #P-hard in general. By performing several dissociations, one can transform a Boolean formula whose probability is difficult to compute into one whose probability is easy to compute, and which is guaranteed to provide an upper or lower bound on the probability of the original formula by choosing appropriate probabilities for the dissociated variables. Our new bounds shed light on the connection between previous relaxation-based and model-based approximations and unify them as concrete choices in a larger design space. We also show how our theory allows a standard relational database management system (DBMS) to both upper and lower bound hard probabilistic queries in guaranteed polynomial time. Comment: 34 pages, 14 figures, supersedes: http://arxiv.org/abs/1105.281
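
    The idea of dissociation can be illustrated on a toy formula in which one variable occurs twice. The sketch below computes exact probabilities by brute force for a hypothetical formula and for its dissociated version, where each copy of the repeated variable keeps the original probability. In this particular example that choice yields an upper bound; which probability assignments are optimal in general is the subject of the paper and is not reproduced here. The formula, the variable names, and the probabilities are all assumptions made for illustration.

```python
from itertools import product

def probability(formula, probs):
    """Exact probability that formula(assignment) is true when each variable v
    is independently true with probability probs[v] (brute-force enumeration)."""
    names = sorted(probs)
    total = 0.0
    for values in product([False, True], repeat=len(names)):
        assignment = dict(zip(names, values))
        if formula(assignment):
            weight = 1.0
            for v in names:
                weight *= probs[v] if assignment[v] else 1.0 - probs[v]
            total += weight
    return total

# Hypothetical formula in which x occurs twice: (x AND a) OR (x AND b).
def original(v):
    return (v["x"] and v["a"]) or (v["x"] and v["b"])

# Dissociated version: the two occurrences of x become independent copies.
def dissociated(v):
    return (v["x1"] and v["a"]) or (v["x2"] and v["b"])

p_orig = probability(original, {"x": 0.5, "a": 0.5, "b": 0.5})
# Assumption for this example: each copy keeps the probability of x.
p_upper = probability(dissociated, {"x1": 0.5, "x2": 0.5, "a": 0.5, "b": 0.5})
print(p_orig, p_upper)
```

    The dissociated formula is a disjunction of two terms over disjoint variables, so its probability could also be computed in closed form as 1 - (1 - 0.25)(1 - 0.25); the brute-force routine is used only to keep the comparison uniform.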

    Approximation Algorithms for Stochastic Boolean Function Evaluation and Stochastic Submodular Set Cover

    Stochastic Boolean Function Evaluation is the problem of determining the value of a given Boolean function f on an unknown input x, when each bit x_i of x can only be determined by paying an associated cost c_i. The assumption is that x is drawn from a given product distribution, and the goal is to minimize the expected cost. This problem has been studied in Operations Research, where it is known as "sequential testing" of Boolean functions. It has also been studied in learning theory in the context of learning with attribute costs. We consider the general problem of developing approximation algorithms for Stochastic Boolean Function Evaluation. We give a 3-approximation algorithm for evaluating Boolean linear threshold formulas. We also present an approximation algorithm for evaluating CDNF formulas (and decision trees) achieving a factor of O(log kd), where k is the number of terms in the DNF formula, and d is the number of clauses in the CNF formula. In addition, we present approximation algorithms for simultaneous evaluation of linear threshold functions, and for ranking of linear functions. Our function evaluation algorithms are based on reductions to the Stochastic Submodular Set Cover (SSSC) problem. This problem was introduced by Golovin and Krause. They presented an approximation algorithm for the problem, called Adaptive Greedy. Our main technical contribution is a new approximation algorithm for the SSSC problem, which we call Adaptive Dual Greedy. It is an extension of the Dual Greedy algorithm for Submodular Set Cover due to Fujito, which is a generalization of Hochbaum's algorithm for the classical Set Cover Problem. We also give a new bound on the approximation achieved by the Adaptive Greedy algorithm of Golovin and Krause.
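
    The problem setup can be made concrete on the simplest case, f = OR(x_0, ..., x_{n-1}), where testing stops as soon as some bit is found to be 1. The sketch below computes the expected cost of a fixed testing order under a product distribution and compares a cost-to-probability ratio ordering (a classical rule for this special case, not taken from the abstract) against brute force over all orders. It does not implement Adaptive Dual Greedy or any other algorithm of the paper; the costs and probabilities are hypothetical.

```python
from itertools import permutations

def expected_cost_or(order, cost, prob):
    """Expected cost of evaluating an OR by testing bits in `order` and
    stopping at the first bit equal to 1, where prob[i] = Pr[x_i = 1]."""
    total = 0.0
    reach = 1.0  # probability that every earlier bit in the order was 0
    for i in order:
        total += reach * cost[i]
        reach *= 1.0 - prob[i]
    return total

def ratio_order(cost, prob):
    """Order the bits by increasing cost[i] / prob[i]."""
    return tuple(sorted(range(len(cost)), key=lambda i: cost[i] / prob[i]))

def best_order_brute_force(cost, prob):
    """Try every permutation; feasible only for very small n."""
    return min(permutations(range(len(cost))),
               key=lambda order: expected_cost_or(order, cost, prob))

# Hypothetical instance with three bits.
cost = [3.0, 1.0, 2.0]
prob = [0.5, 0.5, 0.25]

order = ratio_order(cost, prob)
print(order, expected_cost_or(order, cost, prob))
print(best_order_brute_force(cost, prob))
```

    For richer function classes such as linear threshold or CDNF formulas, no simple ordering of this kind is known to suffice in general, which is where approximation algorithms like those in the abstract come in.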

    Sketch-based Influence Maximization and Computation: Scaling up with Guarantees

    Propagation of contagion through networks is a fundamental process. It is used to model the spread of information, influence, or a viral infection. Diffusion patterns can be specified by a probabilistic model, such as Independent Cascade (IC), or captured by a set of representative traces. Basic computational problems in the study of diffusion are influence queries (determining the potency of a specified seed set of nodes) and Influence Maximization (identifying the most influential seed set of a given size). Answering each influence query involves many edge traversals, and does not scale when there are many queries on very large graphs. The gold standard for Influence Maximization is the greedy algorithm, which iteratively adds to the seed set a node maximizing the marginal gain in influence. Greedy has a guaranteed approximation ratio of at least (1-1/e) and actually produces a sequence of nodes, with each prefix having an approximation guarantee with respect to the same-size optimum. Since Greedy does not scale well beyond a few million edges, for larger inputs one must currently use either heuristics or alternative algorithms designed for a pre-specified small seed set size. We develop a novel sketch-based design for influence computation. Our greedy Sketch-based Influence Maximization (SKIM) algorithm scales to graphs with billions of edges, with one to two orders of magnitude speedup over the best greedy methods. It still has a guaranteed approximation ratio, and in practice its quality nearly matches that of exact greedy. We also present influence oracles, which use linear-time preprocessing to generate a small sketch for each node, allowing the influence of any seed set to be quickly answered from the sketches of its nodes. Comment: 10 pages, 5 figures. Appeared at the 23rd Conference on Information and Knowledge Management (CIKM 2014) in Shanghai, Chin
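
    The greedy algorithm described in this abstract can be sketched on a single deterministic graph, taking the influence of a seed set to be the number of nodes reachable from it. This is a simplifying assumption: under a model such as IC, influence would be averaged over many sampled instances, and the sketch below is plain exact greedy, not SKIM or the sketch-based oracles. The graph and node names are hypothetical.

```python
def reachable(graph, seeds):
    """Set of nodes reachable from any seed by following directed edges."""
    seen = set(seeds)
    stack = list(seeds)
    while stack:
        u = stack.pop()
        for v in graph.get(u, ()):
            if v not in seen:
                seen.add(v)
                stack.append(v)
    return seen

def greedy_seeds(graph, nodes, k):
    """Pick k seeds one at a time, each time adding the node with the largest
    marginal gain, measured here as the number of newly reachable nodes."""
    seeds, covered = [], set()
    for _ in range(k):
        best = max((n for n in nodes if n not in seeds),
                   key=lambda n: len(reachable(graph, [n]) - covered))
        seeds.append(best)
        covered |= reachable(graph, [best])
    return seeds, covered

# Hypothetical directed graph given as adjacency lists.
graph = {"a": ["b", "c"], "b": ["c"], "d": ["e"], "e": []}
nodes = ["a", "b", "c", "d", "e"]

seeds, covered = greedy_seeds(graph, nodes, 2)
print(seeds, sorted(covered))
```

    Every candidate is re-evaluated with a fresh graph traversal in each round, which is the cost that limits exact greedy on large graphs and that sketch-based designs aim to avoid.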

    Pre-Reduction Graph Products: Hardnesses of Properly Learning DFAs and Approximating EDP on DAGs

    The study of graph products is a major research topic and typically concerns the term $f(G*H)$, e.g., to show that $f(G*H)=f(G)f(H)$. In this paper, we study graph products in a non-standard form $f(R[G*H])$ where $R$ is a "reduction", a transformation of any graph into an instance of an intended optimization problem. We resolve some open problems as applications. (1) A tight $n^{1-\epsilon}$-approximation hardness for the minimum consistent deterministic finite automaton (DFA) problem, where $n$ is the sample size. Due to Board and Pitt [Theoretical Computer Science 1992], this implies the hardness of properly learning DFAs assuming $NP \neq RP$ (the weakest possible assumption). (2) A tight $n^{1/2-\epsilon}$ hardness for the edge-disjoint paths (EDP) problem on directed acyclic graphs (DAGs), where $n$ denotes the number of vertices. (3) A tight hardness of packing vertex-disjoint $k$-cycles for large $k$. (4) An alternative (and perhaps simpler) proof for the hardness of properly learning DNF, CNF and intersection of halfspaces [Alekhnovich et al., FOCS 2004 and J. Comput. Syst. Sci. 2008].