22 research outputs found

    Approximate Hypergraph Coloring under Low-discrepancy and Related Promises

    A hypergraph is said to be $\chi$-colorable if its vertices can be colored with $\chi$ colors so that no hyperedge is monochromatic. $2$-colorability is a fundamental property (called Property B) of hypergraphs and is extensively studied in combinatorics. Algorithmically, however, given a $2$-colorable $k$-uniform hypergraph, it is NP-hard to find a $2$-coloring miscoloring fewer than a fraction $2^{-k+1}$ of hyperedges (which is achieved by a random $2$-coloring), and the best algorithms to color the hypergraph properly require $\approx n^{1-1/k}$ colors, approaching the trivial bound of $n$ as $k$ increases. In this work, we study the complexity of approximate hypergraph coloring, for both the maximization (finding a $2$-coloring with fewest miscolored edges) and minimization (finding a proper coloring using the fewest colors) versions, when the input hypergraph is promised to have the following stronger properties than $2$-colorability: (A) Low-discrepancy: If the hypergraph has discrepancy $\ell \ll \sqrt{k}$, we give an algorithm to color it with $\approx n^{O(\ell^2/k)}$ colors. However, for the maximization version, we prove NP-hardness of finding a $2$-coloring miscoloring less than a $2^{-O(k)}$ (resp. $k^{-O(k)}$) fraction of the hyperedges when $\ell = O(\log k)$ (resp. $\ell = 2$). Assuming the UGC, we improve the latter hardness factor to $2^{-O(k)}$ for almost discrepancy-$1$ hypergraphs. (B) Rainbow colorability: If the hypergraph has a $(k-\ell)$-coloring such that each hyperedge is polychromatic with all these colors, we give a $2$-coloring algorithm that miscolors at most a $k^{-\Omega(k)}$ fraction of the hyperedges when $\ell \ll \sqrt{k}$, and complement this with a matching UG hardness result showing that when $\ell = \sqrt{k}$, it is hard to even beat the $2^{-k+1}$ bound achieved by a random coloring. Comment: Approx 201
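The $2^{-k+1}$ baseline quoted in the abstract above can be checked directly on a small instance: under a uniformly random 2-coloring, each hyperedge with $k$ distinct vertices is monochromatic with probability $2 \cdot 2^{-k}$. The following is only an illustrative sketch, not code from the paper; the function names and the tiny 3-uniform hypergraph are hypothetical, and the expectation is computed by brute-force enumeration of all colorings.

```python
from itertools import product


def miscolored_fraction(edges, coloring):
    """Fraction of hyperedges whose vertices all receive the same color."""
    mono = sum(1 for e in edges if len({coloring[v] for v in e}) == 1)
    return mono / len(edges)


def average_over_all_2_colorings(n, edges):
    """Exact expected miscolored fraction under a uniform random 2-coloring,
    obtained by enumerating all 2**n colorings (feasible only for tiny n)."""
    total = 0.0
    for coloring in product((0, 1), repeat=n):
        total += miscolored_fraction(edges, coloring)
    return total / 2 ** n


# Hypothetical 3-uniform hypergraph on 5 vertices (illustration only).
k = 3
edges = [(0, 1, 2), (1, 2, 3), (2, 3, 4), (0, 3, 4)]
avg = average_over_all_2_colorings(5, edges)
print(avg, 2 ** (-k + 1))
```

The enumeration is exponential in the number of vertices, so it serves only to confirm the baseline on toy inputs; it says nothing about the hardness results themselves.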

    Approximation algorithms for a graph-cut problem with applications to a clustering problem in bioinformatics

    xiii, 71 leaves : ill. ; 29 cm. Clusters in protein interaction networks can potentially help identify functional relationships among proteins. We study the clustering problem by modeling it as graph cut problems. Given an edge-weighted graph, the goal is to partition the graph into a prescribed number of subsets obeying some capacity constraints, so as to maximize the total weight of the edges that are within a subset. Identification of a dense subset might shed some light on the biological function of all the proteins in the subset. We study integer programming formulations and exhibit large integrality gaps for various formulations. This is indicative of the difficulty in obtaining constant-factor approximation algorithms using the primal-dual schema. We propose three approximation algorithms for the problem. We evaluate the algorithms on the database of interacting proteins and on randomly generated graphs. Our experiments show that the algorithms are fast and have a good performance ratio in practice.

    Algorithms for string and graph layout

    Thesis (Ph. D.)--Massachusetts Institute of Technology, Dept. of Electrical Engineering and Computer Science, 2004. Includes bibliographical references (p. 121-125). Many graph optimization problems can be viewed as graph layout problems. A layout of a graph is a geometric arrangement of the vertices subject to given constraints. For example, the vertices of a graph can be arranged on a line or a circle, on a two- or three-dimensional lattice, etc. The goal is usually to place all the vertices so as to optimize some specified objective function. We develop combinatorial methods as well as models based on linear and semidefinite programming for graph layout problems. We apply these techniques to some well-known optimization problems. In particular, we give improved approximation algorithms for the string folding problem on the two- and three-dimensional square lattices. This combinatorial graph problem is motivated by the protein folding problem, which is central in computational biology. We then present a new semidefinite programming formulation for the linear ordering problem (also known as the maximum acyclic subgraph problem) and show that it provides an improved bound on the value of an optimal solution for random graphs. This is the first relaxation that improves on the trivial "all edges" bound for random graphs. by Alantha Newman. Ph.D.
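For context on the linear ordering (maximum acyclic subgraph) problem mentioned in the abstract above: a vertex ordering is scored by the number of directed edges that point forward, the total edge count is the trivial "all edges" upper bound, and any ordering or its reverse keeps at least half of the edges. The sketch below illustrates only this standard baseline, not the thesis's semidefinite programming relaxation; the function names and the small digraph are hypothetical.

```python
def forward_edges(order, edges):
    """Count directed edges (u, v) with u placed before v in the ordering."""
    pos = {v: i for i, v in enumerate(order)}
    return sum(1 for u, v in edges if pos[u] < pos[v])


def best_of_order_and_reverse(order, edges):
    """Return the better score of an ordering and its reverse.
    Every edge without a self-loop is forward in exactly one of the two,
    so the result is at least half of the edges."""
    fwd = forward_edges(order, edges)
    bwd = forward_edges(list(reversed(order)), edges)
    return max(fwd, bwd)


# Hypothetical directed graph on 4 vertices (illustration only).
edges = [(0, 1), (1, 2), (2, 0), (2, 3), (3, 1)]
order = [0, 1, 2, 3]
score = best_of_order_and_reverse(order, edges)
print(score, len(edges))  # score vs. the trivial "all edges" upper bound
```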

    Multi Layer Peeling for Linear Arrangement and Hierarchical Clustering

    We present a new multi-layer peeling technique to cluster points in a metric space. A well-known non-parametric objective is to embed the metric space into a simpler structured metric space such as a line (i.e., Linear Arrangement) or a binary tree (i.e., Hierarchical Clustering). Points which are close in the metric space should be mapped to close points/leaves in the line/tree; similarly, points which are far in the metric space should be far in the line or on the tree. In particular we consider the Maximum Linear Arrangement problem [Refael Hassin and Shlomi Rubinstein, 2001] and the Maximum Hierarchical Clustering problem [Vincent Cohen-Addad et al., 2018] applied to metrics. We design approximation schemes ((1-ε)-approximation for any constant ε > 0) for these objectives. In particular this shows that by considering metrics one may significantly improve former approximations (0.5 for Max Linear Arrangement and 0.74 for Max Hierarchical Clustering). Our main technique, which is called multi-layer peeling, consists of recursively peeling off points which are far from the "core" of the metric space. The recursion ends once the core becomes a sufficiently densely weighted metric space (i.e., the average distance is at least a constant times the diameter) or once it becomes negligible with respect to its inner contribution to the objective. Interestingly, the algorithm in the Linear Arrangement case is much more involved than that in the Hierarchical Clustering case, and uses a significantly more delicate peeling.

    Subset selection using nonlinear optimization

    A common problem in computer science is how to represent a large dataset in a smaller, more compact form. This thesis describes a generalized framework for selecting canonical subsets of data points that are highly representative of the original larger dataset. The contributions of the work are formulation of the subset selection problem as an optimization problem, an analysis of the complexity of the problem, the development of approximation algorithms to compute canonical subsets, and a demonstration of the utility of the algorithms in several problem domains. Ph.D., Computer Science -- Drexel University, 200

    ON THE HARDNESS OF APPROXIMATING MULTICUT AND SPARSEST-CUT
