    On the Hardness and Inapproximability of Recognizing Wheeler Graphs

    In recent years several compressed indexes based on variants of the Burrows-Wheeler transformation have been introduced. Some of these are used to index structures far more complex than a single string, as was originally done with the FM-index [Ferragina and Manzini, J. ACM 2005]. As such, there has been an increasing effort to better understand under which conditions such an indexing scheme is possible. This has led to the introduction of Wheeler graphs [Gagie et al., Theor. Comput. Sci., 2017]. Gagie et al. showed that de Bruijn graphs, generalized compressed suffix arrays, and several other BWT-related structures can be represented as Wheeler graphs, and that Wheeler graphs can be indexed in a space-efficient way. Hence, being able to recognize whether a given graph is a Wheeler graph, or being able to approximate a given graph by a Wheeler graph, could have numerous applications in indexing. Here we resolve the open question of whether there exists an efficient algorithm for recognizing if a given graph is a Wheeler graph. We show the following:
    - The problem of recognizing whether a given graph G=(V,E) is a Wheeler graph is NP-complete for any edge label alphabet of size sigma >= 2, even when G is a DAG. This holds even on a restricted subset of graphs called d-NFAs for d >= 5. This is in contrast to recent results demonstrating the problem can be solved in polynomial time for d-NFAs where d <= 2. We also show the recognition problem can be solved in linear time for sigma = 1;
    - There exists a 2^{e log sigma + O(n + e)} time exact algorithm where n = |V| and e = |E|. This algorithm relies on graph isomorphism being computable in strictly sub-exponential time;
    - We define an optimization variant of the problem called Wheeler Graph Violation, abbreviated WGV, where the aim is to remove the minimum number of edges in order to obtain a Wheeler graph. We show WGV is APX-hard, even when G is a DAG, implying there exists a constant C >= 1 for which there is no C-approximation algorithm (unless P = NP). Also, conditioned on the Unique Games Conjecture, for all C >= 1, it is NP-hard to find a C-approximation;
    - We define the Wheeler Subgraph problem, abbreviated WS, where the aim is to find the largest subgraph which is a Wheeler graph (the dual of WGV). In contrast to WGV, we prove that the WS problem is in APX for sigma = O(1).
    The above findings suggest that most problems under this theme are computationally difficult. However, we identify a class of graphs for which the recognition problem is polynomial-time solvable, raising the open question of which parameters determine this problem's difficulty.
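    The abstract does not restate what makes a graph a Wheeler graph. The sketch below assumes the usual definition from Gagie et al.: there is a total order on the nodes in which nodes of in-degree 0 come first, and for any two edges (u, v) labeled a and (u', v') labeled a', a < a' implies v < v', while a = a' and u < u' implies v <= v'. The function names are hypothetical, and the recognizer simply tries every node ordering, so it only illustrates the recognition problem on tiny inputs; it is not the exact algorithm mentioned in the abstract.

```python
from itertools import permutations


def is_wheeler_order(order, edges):
    """Check the Wheeler conditions for one candidate node ordering.

    order: sequence of nodes, earliest first.
    edges: iterable of (u, v, label) triples with comparable labels.
    """
    edges = list(edges)
    pos = {node: i for i, node in enumerate(order)}
    has_incoming = {v for _, v, _ in edges}

    # Nodes with in-degree 0 must precede every node with positive in-degree.
    seen_incoming = False
    for node in order:
        if node in has_incoming:
            seen_incoming = True
        elif seen_incoming:
            return False

    # Pairwise edge conditions (assumed definition, see the lead-in).
    for u1, v1, a1 in edges:
        for u2, v2, a2 in edges:
            if a1 < a2 and not pos[v1] < pos[v2]:
                return False
            if a1 == a2 and pos[u1] < pos[u2] and not pos[v1] <= pos[v2]:
                return False
    return True


def is_wheeler_graph(nodes, edges):
    """Brute-force recognition: try all n! orderings (tiny graphs only)."""
    edges = list(edges)
    return any(is_wheeler_order(p, edges) for p in permutations(nodes))
```

    Under this definition, a node that receives edges with two different labels can never satisfy the first edge condition, which gives a quick negative example: `is_wheeler_graph([0, 1, 2], [(0, 2, "a"), (1, 2, "b")])` is false, whereas the labeled path `[(0, 1, "a"), (1, 2, "b")]` is accepted.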

    On the Complexity of BWT-Runs Minimization via Alphabet Reordering

    The Burrows-Wheeler Transform (BWT) has been an essential tool in text compression and indexing. First introduced in 1994, it went on to provide the backbone for the first encoding of the classic suffix tree data structure in space close to the entropy-based lower bound. Recently, there has been the development of compact suffix trees in space proportional to $r$, the number of runs in the BWT, as well as the appearance of $r$ in the time complexity of new algorithms. Unlike other popular measures of compression, the parameter $r$ is sensitive to the lexicographic ordering given to the text's alphabet. Despite several past attempts to exploit this, a provably efficient algorithm for finding, or approximating, an alphabet ordering which minimizes $r$ has been open for years. We present the first set of results on the computational complexity of minimizing BWT-runs via alphabet reordering. We prove that the decision version of this problem is NP-complete and cannot be solved in time $2^{o(\sigma + \sqrt{n})}$ unless the Exponential Time Hypothesis fails, where $\sigma$ is the size of the alphabet and $n$ is the length of the text. We also show that the optimization problem is APX-hard. In doing so, we relate two previously disparate topics: the optimal traveling salesperson path and the number of runs in the BWT of a text, providing a surprising connection between problems on graphs and text compression. Also, by relating recent results in the field of dictionary compression, we illustrate that an arbitrary alphabet ordering provides an $O(\log^2 n)$-approximation. We provide an optimal linear-time algorithm for the problem of finding a run-minimizing ordering on a subset of symbols (occurring only once) under ordering constraints, and prove that a generalization of this problem to a class of graphs with BWT-like properties, called Wheeler graphs, is NP-complete.
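    To make the dependence of the run count on the alphabet ordering concrete, here is a minimal sketch that builds the BWT by sorting rotations under a chosen ordering and counts its runs. The function names are hypothetical, the sentinel `$` is assumed to be absent from the text and smaller than every symbol, and the exhaustive search over orderings is only feasible for very small alphabets; it is not one of the algorithms from the paper.

```python
from itertools import permutations


def bwt(text, order):
    """BWT of text + '$' by sorting rotations under the alphabet order `order`.

    order: sequence listing every symbol of `text`, smallest first.
    Assumes '$' does not occur in `text`; it is treated as the smallest symbol.
    """
    rank = {c: i + 1 for i, c in enumerate(order)}
    rank["$"] = 0
    s = text + "$"
    rotations = [s[i:] + s[:i] for i in range(len(s))]
    rotations.sort(key=lambda rot: [rank[c] for c in rot])
    return "".join(rot[-1] for rot in rotations)


def count_runs(s):
    """Number of maximal runs of equal symbols in s."""
    return sum(1 for i, c in enumerate(s) if i == 0 or c != s[i - 1])


def best_order(text):
    """Exhaustive search for an ordering minimizing the run count."""
    alphabet = sorted(set(text))
    return min(permutations(alphabet), key=lambda o: count_runs(bwt(text, o)))
```

    For `"banana"`, the ordering a < b < n gives the BWT `annb$aa` with 5 runs, while n < b < a gives `aaa$nnb` with 4 runs, so the choice of ordering changes $r$ even on this small input.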

    Tight Localizations of Feedback Sets

    The classical NP-hard feedback arc set problem (FASP) and feedback vertex set problem (FVSP) ask for a minimum set of arcs $\varepsilon \subseteq E$ or vertices $\nu \subseteq V$ whose removal $G\setminus \varepsilon$, $G\setminus \nu$ makes a given multi-digraph $G=(V,E)$ acyclic, respectively. Though both problems are known to be APX-hard, approximation algorithms or proofs of inapproximability are unknown. We propose a new $\mathcal{O}(|V||E|^4)$-heuristic for the directed FASP. While a ratio of $r \approx 1.3606$ is known to be a lower bound for the APX-hardness, at least by empirical validation we achieve an approximation of $r \leq 2$. The most relevant applications, such as circuit testing, ask for solving the FASP on large sparse graphs, which can be done efficiently within tight error bounds due to our approach. Comment: manuscript submitted to AC
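    The FASP definition above can be illustrated with a small exhaustive solver. This is only a sketch with hypothetical function names: it tests arc subsets of increasing size and checks acyclicity with Kahn's algorithm, so it takes exponential time and is unrelated to the heuristic proposed in the paper. Parallel arcs are represented as repeated pairs in the arc list.

```python
from itertools import combinations


def is_acyclic(nodes, arcs):
    """Kahn's algorithm: True if the multi-digraph has no directed cycle."""
    arcs = list(arcs)
    indeg = {v: 0 for v in nodes}
    for _, v in arcs:
        indeg[v] += 1
    stack = [v for v in nodes if indeg[v] == 0]
    processed = 0
    while stack:
        u = stack.pop()
        processed += 1
        for a, b in arcs:
            if a == u:
                indeg[b] -= 1
                if indeg[b] == 0:
                    stack.append(b)
    # Every node is processed exactly when no cycle blocks the peeling.
    return processed == len(nodes)


def min_feedback_arc_set(nodes, arcs):
    """Smallest list of arcs whose removal leaves an acyclic graph."""
    arcs = list(arcs)
    for k in range(len(arcs) + 1):
        for chosen in combinations(range(len(arcs)), k):
            remaining = [a for i, a in enumerate(arcs) if i not in chosen]
            if is_acyclic(nodes, remaining):
                return [arcs[i] for i in chosen]
```

    For example, with arcs `[(0, 1), (1, 2), (2, 0), (1, 0)]` both cycles pass through the arc `(0, 1)`, so removing that single arc suffices.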

    A PTAS for the Multiple Knapsack Problem

    The Multiple Knapsack problem (MKP) is a natural and well-known generalization of the single knapsack problem and is defined as follows. We are given a set of n items and m bins (knapsacks) such that each item i has a profit p(i) and a size s(i), and each bin j has a capacity c(j). The goal is to find a subset of items of maximum profit such that they have a feasible packing in the bins. MKP is a special case of the Generalized Assignment problem (GAP) where the profit and the size of an item can vary based on the specific bin that it is assigned to. GAP is APX-hard and a 2-approximation for it is implicit in the work of Shmoys and Tardos [26], and thus far, this was also the best known approximation for MKP. The main result of this paper is a polynomial time approximation scheme for MKP. Apart from its inherent theoretical interest as a common generalization of the well-studied knapsack and bin packing problems, it appears to be the strongest special case of GAP that is not APX-hard. We substantiate this by showing that slight generalizations of MKP are APX-hard. Thus our results help demarcate the boundary at which instances of GAP become APX-hard. An interesting aspect of our approach is a PTAS-preserving reduction from an arbitrary instance of MKP to an instance with O(log n) distinct sizes and profits.
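    As a concrete reading of the MKP definition, the following sketch enumerates every assignment of items to bins (or to no bin) and keeps the most profitable feasible one. The function name is hypothetical and the search takes (m + 1)^n steps, so it only pins down the problem statement on toy instances; it is not the approximation scheme described in the abstract.

```python
from itertools import product


def mkp_brute_force(profits, sizes, capacities):
    """Exact MKP by exhaustive search.

    profits[i], sizes[i]: profit p(i) and size s(i) of item i.
    capacities[j]: capacity c(j) of bin j.
    Returns (best_profit, assignment), where assignment[i] is the bin index
    of item i, or None if item i is left out.
    """
    n, m = len(profits), len(capacities)
    best_profit, best_assignment = 0, (None,) * n
    for assignment in product([None, *range(m)], repeat=n):
        load = [0] * m
        for item, b in enumerate(assignment):
            if b is not None:
                load[b] += sizes[item]
        # Feasible only if no bin exceeds its capacity.
        if all(load[j] <= capacities[j] for j in range(m)):
            profit = sum(
                profits[i] for i, b in enumerate(assignment) if b is not None
            )
            if profit > best_profit:
                best_profit, best_assignment = profit, assignment
    return best_profit, best_assignment
```

    With profits `[10, 7, 6]`, sizes `[5, 4, 3]` and capacities `[5, 4]`, the best packing places the first item in the first bin and the second item in the second bin, for a total profit of 17.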

    On the complexity of the vector connectivity problem

    We study a relaxation of the Vector Domination problem called Vector Connectivity (VecCon). Given a graph $G$ with a requirement $r(v)$ for each vertex $v$, VecCon asks for a minimum cardinality set $S$ of vertices such that every vertex $v \in V \setminus S$ is connected to $S$ via $r(v)$ disjoint paths. In the paper introducing the problem, Boros et al. [Networks, 2014] gave polynomial-time solutions for VecCon in trees, cographs, and split graphs, and showed that the problem can be approximated in polynomial time on $n$-vertex graphs to within a factor of $\log n + 2$, leaving open the question of whether the problem is NP-hard on general graphs. We show that VecCon is APX-hard in general graphs, and NP-hard in planar bipartite graphs and in planar line graphs. We also generalize the polynomial result for trees by solving the problem for block graphs. Comment: 14 pages