22 research outputs found
Approximate Hypergraph Coloring under Low-discrepancy and Related Promises
A hypergraph is said to be $\chi$-colorable if its vertices can be colored
with $\chi$ colors so that no hyperedge is monochromatic. $2$-colorability is a
fundamental property (called Property B) of hypergraphs and is extensively
studied in combinatorics. Algorithmically, however, given a $2$-colorable
$k$-uniform hypergraph, it is NP-hard to find a $2$-coloring miscoloring fewer
than a fraction $2^{-k+1}$ of hyperedges (which is achieved by a random
$2$-coloring), and the best algorithms to color the hypergraph properly require
$\approx n^{1-1/k}$ colors, approaching the trivial bound of $n$ as $k$
increases.
In this work, we study the complexity of approximate hypergraph coloring, for
both the maximization (finding a $2$-coloring with fewest miscolored edges) and
minimization (finding a proper coloring using the fewest colors)
versions, when the input hypergraph is promised to have the following stronger
properties than $2$-colorability:
(A) Low-discrepancy: If the hypergraph has discrepancy $\ell \ll \sqrt{k}$,
we give an algorithm to color it with $\approx n^{O(\ell^2/k)}$ colors.
However, for the maximization version, we prove NP-hardness of finding a
$2$-coloring miscoloring a smaller than $2^{-O(k)}$ (resp. $k^{-O(k)}$)
fraction of the hyperedges when $\ell = O(\log k)$ (resp. $\ell = 2$). Assuming
the UGC, we improve the latter hardness factor to $2^{-O(k)}$ for almost
discrepancy-$1$ hypergraphs.
(B) Rainbow colorability: If the hypergraph has a $(k-\ell)$-coloring such
that each hyperedge is polychromatic with all these colors, we give a
$2$-coloring algorithm that miscolors at most $k^{-\Omega(k)}$ of the
hyperedges when $\ell \ll \sqrt{k}$, and complement this with a matching UG
hardness result showing that when $\ell = \sqrt{k}$, it is hard to even beat the $2^{-k+1}$
bound achieved by a random coloring.
Comment: Approx 201
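The random-coloring baseline mentioned in the abstract can be checked on a small instance. In a k-uniform hypergraph, a uniformly random 2-coloring makes any fixed hyperedge monochromatic with probability 2^(1-k), so that is also the expected miscolored fraction. The sketch below is only an illustration of that baseline, not the paper's algorithm; the function names and the complete 3-uniform test hypergraph are assumptions made for the example.

```python
from fractions import Fraction
from itertools import combinations, product


def miscolored_fraction(edges, coloring):
    """Fraction of hyperedges whose vertices all received the same color."""
    mono = sum(1 for e in edges if len({coloring[v] for v in e}) == 1)
    return Fraction(mono, len(edges))


def average_over_all_2_colorings(n, edges):
    """Exact expectation under a uniformly random 2-coloring, by enumeration."""
    total = sum(
        miscolored_fraction(edges, coloring)
        for coloring in product((0, 1), repeat=n)
    )
    return total / 2**n


# Hypothetical test instance: every 3-subset of 5 vertices is a hyperedge.
n, k = 5, 3
edges = list(combinations(range(n), k))
average = average_over_all_2_colorings(n, edges)
print(average, Fraction(1, 2 ** (k - 1)))
```

Enumeration is exponential in the number of vertices, so this is only practical for tiny hypergraphs; it is meant to confirm the closed-form baseline rather than to color anything.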
Approximation algorithms for a graph-cut problem with applications to a clustering problem in bioinformatics
xiii, 71 leaves : ill. ; 29 cm.
Clusters in protein interaction networks can potentially help identify functional relationships
among proteins. We study the clustering problem by modeling it as graph cut problems.
Given an edge weighted graph, the goal is to partition the graph into a prescribed
number of subsets obeying some capacity constraints, so as to maximize the total weight
of the edges that are within a subset. Identification of a dense subset might shed some light
on the biological function of all the proteins in the subset.
We study integer programming formulations and exhibit large integrality gaps for various
formulations. This is indicative of the difficulty in obtaining constant factor approximation
algorithms using the primal-dual schema. We propose three approximation algorithms for
the problem. We evaluate the algorithms on the database of interacting proteins and on
randomly generated graphs. Our experiments show that the algorithms are fast and have
good performance ratios in practice.
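The objective described in this abstract (partition the vertices into a prescribed number of capacity-bounded subsets so as to maximize the weight of edges inside subsets) can be stated concretely as follows. This is a minimal sketch under stated assumptions: the names `within_weight`, `is_feasible`, and `brute_force_best` are hypothetical, the four-vertex instance is invented for illustration, and the exhaustive search stands in for, and is not, any of the thesis's three approximation algorithms.

```python
from itertools import product


def within_weight(weights, assignment):
    """Total weight of edges whose endpoints lie in the same subset."""
    return sum(w for (u, v), w in weights.items() if assignment[u] == assignment[v])


def is_feasible(assignment, capacities):
    """Check that no subset holds more vertices than its capacity allows."""
    sizes = [0] * len(capacities)
    for part in assignment:
        sizes[part] += 1
    return all(size <= cap for size, cap in zip(sizes, capacities))


def brute_force_best(n, weights, capacities):
    """Try every assignment of n vertices to subsets; exponential, tiny inputs only."""
    best_value, best_assignment = None, None
    for assignment in product(range(len(capacities)), repeat=n):
        if not is_feasible(assignment, capacities):
            continue
        value = within_weight(weights, assignment)
        if best_value is None or value > best_value:
            best_value, best_assignment = value, assignment
    return best_value, best_assignment


# Invented instance: 4 vertices, two subsets of capacity 2 each.
weights = {(0, 1): 3, (2, 3): 2, (1, 2): 1, (0, 2): 1}
best_value, best_assignment = brute_force_best(4, weights, [2, 2])
print(best_value, best_assignment)
```

On this instance the capacities force two subsets of size two, and the search keeps the two heaviest edges inside subsets.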
Algorithms for string and graph layout
Thesis (Ph. D.)--Massachusetts Institute of Technology, Dept. of Electrical Engineering and Computer Science, 2004.
Includes bibliographical references (p. 121-125).
Many graph optimization problems can be viewed as graph layout problems. A layout of a graph is a geometric arrangement of the vertices subject to given constraints. For example, the vertices of a graph can be arranged on a line or a circle, on a two- or three-dimensional lattice, etc. The goal is usually to place all the vertices so as to optimize some specified objective function. We develop combinatorial methods as well as models based on linear and semidefinite programming for graph layout problems. We apply these techniques to some well-known optimization problems. In particular, we give improved approximation algorithms for the string folding problem on the two- and three-dimensional square lattices. This combinatorial graph problem is motivated by the protein folding problem, which is central in computational biology. We then present a new semidefinite programming formulation for the linear ordering problem (also known as the maximum acyclic subgraph problem) and show that it provides an improved bound on the value of an optimal solution for random graphs. This is the first relaxation that improves on the trivial "all edges" bound for random graphs.
by Alantha Newman.
Ph.D
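For the linear ordering (maximum acyclic subgraph) problem mentioned above, the trivial upper bound is the total number of edges, and a standard folklore lower bound is half of them: any vertex ordering or its reverse places at least half the edges in the forward direction. The sketch below illustrates only that baseline, not the semidefinite programming relaxation from the thesis; the function names and the five-edge digraph are assumptions made for the example, and the graph is assumed to have no self-loops.

```python
def forward_edges(order, edges):
    """Count directed edges (u, v) with u placed before v in the ordering."""
    position = {v: i for i, v in enumerate(order)}
    return sum(1 for u, v in edges if position[u] < position[v])


def better_of_order_and_reverse(order, edges):
    """Return whichever of the ordering and its reverse has more forward edges."""
    fwd = forward_edges(order, edges)
    rev = len(edges) - fwd  # valid because there are no self-loops
    if fwd >= rev:
        return list(order), fwd
    return list(reversed(order)), rev


# Invented digraph containing the directed cycle 0 -> 1 -> 2 -> 0.
edges = [(0, 1), (1, 2), (2, 0), (3, 1), (2, 3)]
order, value = better_of_order_and_reverse([0, 1, 2, 3], edges)
print(order, value, len(edges))
```

The forward edges of any ordering form an acyclic subgraph, so `value` always lies between half the edge count and the "all edges" bound.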
Multi Layer Peeling for Linear Arrangement and Hierarchical Clustering
We present a new multi-layer peeling technique to cluster points in a metric space. A well-known non-parametric objective is to embed the metric space into a simpler structured metric space such as a line (i.e., Linear Arrangement) or a binary tree (i.e., Hierarchical Clustering). Points which are close in the metric space should be mapped to close points/leaves in the line/tree; similarly, points which are far in the metric space should be far in the line or on the tree. In particular we consider the Maximum Linear Arrangement problem [Refael Hassin and Shlomi Rubinstein, 2001] and the Maximum Hierarchical Clustering problem [Vincent Cohen-Addad et al., 2018] applied to metrics.
We design approximation schemes ((1-ε)-approximation for any constant ε > 0) for these objectives. In particular, this shows that by considering metrics one may significantly improve former approximations (0.5 for Max Linear Arrangement and 0.74 for Max Hierarchical Clustering). Our main technique, which is called multi-layer peeling, consists of recursively peeling off points which are far from the "core" of the metric space. The recursion ends once the core becomes a sufficiently densely weighted metric space (i.e. the average distance is at least a constant times the diameter) or once it becomes negligible with respect to its inner contribution to the objective. Interestingly, the algorithm in the Linear Arrangement case is much more involved than that in the Hierarchical Clustering case, and uses a significantly more delicate peeling
Approximation Algorithms for NP-Hard Problems
The workshop was concerned with the most important recent developments in the area of efficient approximation algorithms for NP-hard optimization problems as well as with new techniques for proving intrinsic lower bounds for efficient approximation
Subset selection using nonlinear optimization
A common problem in computer science is how to represent a large dataset in a smaller, more compact form. This thesis describes a generalized framework for selecting canonical subsets of data points that are highly representative of the original larger dataset. The contributions of the work are formulation of the subset selection problem as an optimization problem, an analysis of the complexity of the problem, the development of approximation algorithms to compute canonical subsets, and a demonstration of the utility of the algorithms in several problem domains.
Ph.D., Computer Science -- Drexel University, 200