520 research outputs found
On the Hardness and Inapproximability of Recognizing Wheeler Graphs
In recent years several compressed indexes based on variants of the Burrows-Wheeler transformation have been introduced. Some of these are used to index structures far more complex than a single string, as was originally done with the FM-index [Ferragina and Manzini, J. ACM 2005]. As such, there has been an increasing effort to better understand under which conditions such an indexing scheme is possible. This has led to the introduction of Wheeler graphs [Gagie et al., Theor. Comput. Sci., 2017]. Gagie et al. showed that de Bruijn graphs, generalized compressed suffix arrays, and several other BWT related structures can be represented as Wheeler graphs, and that Wheeler graphs can be indexed in a way which is space efficient. Hence, being able to recognize whether a given graph is a Wheeler graph, or being able to approximate a given graph by a Wheeler graph, could have numerous applications in indexing. Here we resolve the open question of whether there exists an efficient algorithm for recognizing if a given graph is a Wheeler graph. We present:
- The problem of recognizing whether a given graph G=(V,E) is a Wheeler graph is NP-complete for any edge label alphabet of size sigma >= 2, even when G is a DAG. This holds even on a restricted, subset of graphs called d-NFA\u27s for d >= 5. This is in contrast to recent results demonstrating the problem can be solved in polynomial time for d-NFA\u27s where d <= 2. We also show the recognition problem can be solved in linear time for sigma =1;
- There exists an 2^{e log sigma + O(n + e)} time exact algorithm where n = |V| and e = |E|. This algorithm relies on graph isomorphism being computable in strictly sub-exponential time;
- We define an optimization variant of the problem called Wheeler Graph Violation, abbreviated WGV, where the aim is to remove the minimum number of edges in order to obtain a Wheeler graph. We show WGV is APX-hard, even when G is a DAG, implying there exists a constant C >= 1 for which there is no C-approximation algorithm (unless P = NP). Also, conditioned on the Unique Games Conjecture, for all C >= 1, it is NP-hard to find a C-approximation;
- We define the Wheeler Subgraph problem, abbreviated WS, where the aim is to find the largest subgraph which is a Wheeler Graph (the dual of the WGV). In contrast to WGV, we prove that the WS problem is in APX for sigma=O(1);
The above findings suggest that most problems under this theme are computationally difficult. However, we identify a class of graphs for which the recognition problem is polynomial time solvable, raising the open question of which parameters determine this problem\u27s difficulty
On the Complexity of BWT-Runs Minimization via Alphabet Reordering
The Burrows-Wheeler Transform (BWT) has been an essential tool in text
compression and indexing. First introduced in 1994, it went on to provide the
backbone for the first encoding of the classic suffix tree data structure in
space close to the entropy-based lower bound. Recently, there has been the
development of compact suffix trees in space proportional to "", the number
of runs in the BWT, as well as the appearance of in the time complexity of
new algorithms. Unlike other popular measures of compression, the parameter
is sensitive to the lexicographic ordering given to the text's alphabet.
Despite several past attempts to exploit this, a provably efficient algorithm
for finding, or approximating, an alphabet ordering which minimizes has
been open for years.
We present the first set of results on the computational complexity of
minimizing BWT-runs via alphabet reordering. We prove that the decision version
of this problem is NP-complete and cannot be solved in time unless the Exponential Time Hypothesis fails, where is the
size of the alphabet and is the length of the text. We also show that the
optimization problem is APX-hard. In doing so, we relate two previously
disparate topics: the optimal traveling salesperson path and the number of runs
in the BWT of a text, providing a surprising connection between problems on
graphs and text compression. Also, by relating recent results in the field of
dictionary compression, we illustrate that an arbitrary alphabet ordering
provides a -approximation.
We provide an optimal linear-time algorithm for the problem of finding a run
minimizing ordering on a subset of symbols (occurring only once) under ordering
constraints, and prove a generalization of this problem to a class of graphs
with BWT like properties called Wheeler graphs is NP-complete
Tight Localizations of Feedback Sets
The classical NP-hard feedback arc set problem (FASP) and feedback vertex set
problem (FVSP) ask for a minimum set of arcs or
vertices whose removal , makes a given multi-digraph acyclic, respectively. Though both
problems are known to be APX-hard, approximation algorithms or proofs of
inapproximability are unknown. We propose a new
-heuristic for the directed FASP. While a ratio of is known to be a lower bound for the APX-hardness, at least by
empirical validation we achieve an approximation of . The most
relevant applications, such as circuit testing, ask for solving the FASP on
large sparse graphs, which can be done efficiently within tight error bounds
due to our approach.Comment: manuscript submitted to AC
A PTAS for the Multiple Knapsack Problem
The Multiple Knapsack problem (MKP) is a natural and well known generalization of the single knapsack problem and is defined as follows. We are given a set of n items and m bins (knapsacks) such that each item i has a profit p(i) and a size s(i), and each bin j has a capacity c(j). The goal is to find a subset of items of maximum profit such that they have a feasible packing in the bins. MKP is a special case of the Generalized Assignment problem (GAP) where the profit and the size of an item can vary based on the specific bin that it is assigned to. GAP is APX-hard and a 2-approximation for it is implicit in the work of Shmoys and Tardos [26], and thus far, this was also the best known approximation for MKP. The main result of this paper is a polynomial time approximation scheme for MKP.
Apart from its inherent theoretical interest as a common generalization of the well-studied knapsack and bin packing problems, it appears to be the strongest special case of GAP that is not APX-hard. We substantiate this by showing that slight generalizations of MKP are APX-hard. Thus our results help demarcate the boundary at which instances of GAP become APX-hard. An interesting aspect of our approach is a ptas-preserving reduction from an arbitrary instance of MKP to an instance with O(log n) distinct sizes and profits
On the complexity of the vector connectivity problem
We study a relaxation of the Vector Domination problem called Vector
Connectivity (VecCon). Given a graph with a requirement for each
vertex , VecCon asks for a minimum cardinality set of vertices such that
every vertex is connected to via disjoint paths.
In the paper introducing the problem, Boros et al. [Networks, 2014] gave
polynomial-time solutions for VecCon in trees, cographs, and split graphs, and
showed that the problem can be approximated in polynomial time on -vertex
graphs to within a factor of , leaving open the question of whether
the problem is NP-hard on general graphs. We show that VecCon is APX-hard in
general graphs, and NP-hard in planar bipartite graphs and in planar line
graphs. We also generalize the polynomial result for trees by solving the
problem for block graphs.Comment: 14 page
- …