    Quantum Meets Fine-Grained Complexity: Sublinear Time Quantum Algorithms for String Problems

    On Complexity of 1-Center in Various Metrics

    We consider the classic 1-center problem: given a set P of n points in a metric space, find the point in P that minimizes the maximum distance to the other points of P. We study the complexity of this problem in d-dimensional $\ell_p$-metrics and in edit and Ulam metrics over strings of length d. Our results for the 1-center problem may be classified based on d as follows. • Small d: We provide the first linear-time algorithm for the 1-center problem in fixed-dimensional $\ell_1$ metrics. On the other hand, assuming the hitting set conjecture (HSC), we show that when $d=\omega(\log n)$, no subquadratic algorithm can solve the 1-center problem in any of the $\ell_p$-metrics, or in edit or Ulam metrics. • Large d: When $d=\Omega(n)$, we extend our conditional lower bound to rule out subquartic algorithms for the 1-center problem in the edit metric (assuming Quantified SETH). On the other hand, we give a $(1+\epsilon)$-approximation for 1-center in the Ulam metric with running time $\tilde{O_{\epsilon}}(nd+n^2\sqrt{d})$. We also strengthen some of the above lower bounds by allowing approximations or by reducing the dimension d, but only against a weaker class of algorithms which list all requisite solutions. Moreover, we extend one of our hardness results to rule out subquartic algorithms for the well-studied 1-median problem in the edit metric, where given a set of n strings each of length n, the goal is to find a string in the set that minimizes the sum of the edit distances to the rest of the strings in the set.
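For orientation, the problem statement above can be illustrated with the naive quadratic baseline in the $\ell_1$ metric. This is only a sketch of the problem definition, not the linear-time algorithm the abstract announces; the function names `l1_distance` and `one_center_bruteforce` are hypothetical.

```python
from typing import Sequence, Tuple


def l1_distance(a: Sequence[float], b: Sequence[float]) -> float:
    """Sum of coordinate-wise absolute differences (the l_1 metric)."""
    return sum(abs(x - y) for x, y in zip(a, b))


def one_center_bruteforce(points: Sequence[Sequence[float]]) -> Tuple[int, float]:
    """Return (index, radius) of the input point whose maximum distance
    to the other input points is smallest.

    Compares all pairs, so it uses O(n^2) distance evaluations of O(d) each.
    """
    best_index, best_radius = -1, float("inf")
    for i, p in enumerate(points):
        # Radius of p: its farthest distance to any other point (0 if alone).
        radius = max(
            (l1_distance(p, q) for j, q in enumerate(points) if j != i),
            default=0,
        )
        if radius < best_radius:
            best_index, best_radius = i, radius
    return best_index, best_radius


if __name__ == "__main__":
    pts = [(0, 0), (4, 0), (2, 1), (2, -3)]
    print(one_center_bruteforce(pts))
```

The conditional lower bounds in the abstract concern whether this all-pairs comparison can be avoided once d grows beyond logarithmic.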

    Gap Edit Distance via Non-Adaptive Queries: Simple and Optimal

    We study the problem of approximating edit distance in sublinear time. This is formalized as a promise problem $(k,k^c)$-Gap Edit Distance, where the input is a pair of strings $X,Y$ and parameters $k,c>1$, and the goal is to return YES if $ED(X,Y)\leq k$ and NO if $ED(X,Y)> k^c$. Recent years have witnessed significant interest in designing sublinear-time algorithms for Gap Edit Distance. We resolve the non-adaptive query complexity of Gap Edit Distance, improving over several previous results. Specifically, we design a non-adaptive algorithm with query complexity $\tilde{O}(\frac{n}{k^{c-0.5}})$, and further prove that this bound is optimal up to polylogarithmic factors. Our algorithm also achieves optimal time complexity $\tilde{O}(\frac{n}{k^{c-0.5}})$ whenever $c\geq 1.5$. For $1<c<1.5$, the running time of our algorithm is $\tilde{O}(\frac{n}{k^{2c-1}})$. For the restricted case of $k^c=\Omega(n)$, this matches a known result [Batu, Ergün, Kilian, Magen, Raskhodnikova, Rubinfeld, and Sami, STOC 2003], and in all other (nontrivial) cases, our running time is strictly better than all previous algorithms, including the adaptive ones.

    Approximating Edit Distance Within Constant Factor in Truly Sub-Quadratic Time

    Edit distance is a measure of similarity of two strings based on the minimum number of character insertions, deletions, and substitutions required to transform one string into the other. The edit distance can be computed exactly using a dynamic programming algorithm that runs in quadratic time. Andoni, Krauthgamer and Onak (2010) gave a nearly linear time algorithm that approximates edit distance within approximation factor $\text{poly}(\log n)$. In this paper, we provide an algorithm with running time $\tilde{O}(n^{2-2/7})$ that approximates the edit distance within a constant factor.
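The quadratic-time dynamic program mentioned above is the textbook exact method, shown here as a minimal sketch for reference. It is not the sub-quadratic approximation algorithm of the paper, and the function name `edit_distance` is an illustrative choice.

```python
def edit_distance(x: str, y: str) -> int:
    """Exact edit distance (insertions, deletions, substitutions).

    Standard dynamic program: O(len(x) * len(y)) time, O(len(y)) space,
    keeping only the previous and current rows of the table.
    """
    # prev[j] = distance between the empty prefix of x and y[:j]
    prev = list(range(len(y) + 1))
    for i, cx in enumerate(x, start=1):
        curr = [i]  # distance between x[:i] and the empty prefix of y
        for j, cy in enumerate(y, start=1):
            curr.append(
                min(
                    prev[j] + 1,               # delete cx from x
                    curr[j - 1] + 1,           # insert cy into x
                    prev[j - 1] + (cx != cy),  # substitute, or match at no cost
                )
            )
        prev = curr
    return prev[-1]


if __name__ == "__main__":
    print(edit_distance("kitten", "sitting"))
```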

    Approximating the Center Ranking Under Ulam

    Estimating the Longest Increasing Subsequence in Nearly Optimal Time

    Longest Increasing Subsequence (LIS) is a fundamental statistic of a sequence, and has been studied for decades. While the LIS of a sequence of length $n$ can be computed exactly in time $O(n\log n)$, the complexity of estimating the (length of the) LIS in sublinear time, especially when LIS $\ll n$, is still open. We show that for any integer $n$ and any $\lambda = o(1)$, there exists a (randomized) non-adaptive algorithm that, given a sequence of length $n$ with LIS $\ge \lambda n$, approximates the LIS up to a factor of $1/\lambda^{o(1)}$ in $n^{o(1)} / \lambda$ time. Our algorithm improves upon prior work substantially in terms of both approximation and run-time: (i) we provide the first sub-polynomial approximation for LIS in sub-linear time; and (ii) our run-time complexity essentially matches the trivial sample complexity lower bound of $\Omega(1/\lambda)$, which is required to obtain any non-trivial approximation of the LIS. As part of our solution, we develop two novel ideas which may be of independent interest: First, we define a new Genuine-LIS problem, where each sequence element may either be genuine or corrupted. In this model, the user receives unrestricted access to the actual sequence, but does not know a priori which elements are genuine. The goal is to estimate the LIS using genuine elements only, with the minimal number of "genuineness tests". The second idea, Precision Forest, enables accurate estimations for composition of general functions from "coarse" (sub-)estimates. Precision Forest essentially generalizes classical precision sampling, which works only for summations. As a central tool, the Precision Forest is initially pre-processed on a set of samples, which thereafter is repeatedly reused by multiple sub-parts of the algorithm, improving their amortized complexity. Comment: Full version of FOCS 2022 paper.
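The exact $O(n\log n)$ computation referred to above is commonly done with the patience-sorting technique, sketched below. This is the classical exact baseline, not the sublinear-time estimator of the paper; the name `lis_length` is hypothetical.

```python
from bisect import bisect_left
from typing import List, Sequence


def lis_length(seq: Sequence[int]) -> int:
    """Length of the longest strictly increasing subsequence.

    tails[k] holds the smallest possible last element of an increasing
    subsequence of length k + 1 seen so far; tails stays sorted, so each
    element is placed with one binary search, giving O(n log n) overall.
    """
    tails: List[int] = []
    for value in seq:
        pos = bisect_left(tails, value)
        if pos == len(tails):
            tails.append(value)   # value extends the longest subsequence
        else:
            tails[pos] = value    # value gives a smaller tail for length pos + 1
    return len(tails)


if __name__ == "__main__":
    print(lis_length([10, 9, 2, 5, 3, 7, 101, 18]))
```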

    Fair Rank Aggregation

    Ranking algorithms find extensive usage in diverse areas such as web search, employment, college admission, voting, etc. The related rank aggregation problem deals with combining multiple rankings into a single aggregate ranking. However, algorithms for both these problems might be biased against some individuals or groups due to implicit prejudice or marginalization in the historical data. We study ranking and rank aggregation problems from a fairness or diversity perspective, where the candidates (to be ranked) may belong to different groups and each group should have a fair representation in the final ranking. We allow the designer to set the parameters that define fair representation. These parameters specify the allowed range of the number of candidates from a particular group in the top-$k$ positions of the ranking. Given any ranking, we provide a fast and exact algorithm for finding the closest fair ranking for the Kendall tau metric under block-fairness. We also provide an exact algorithm for finding the closest fair ranking for the Ulam metric under strict-fairness, when there are only $O(1)$ groups. Our algorithms are simple, fast, and might be extendable to other relevant metrics. We also give a novel meta-algorithm for the general rank aggregation problem under the fairness framework. Surprisingly, this meta-algorithm works for any generalized mean objective (including center and median problems) and any fairness criteria. As a byproduct, we obtain 3-approximation algorithms for both center and median problems, under both Kendall tau and Ulam metrics. Furthermore, using sophisticated techniques we obtain a $(3-\varepsilon)$-approximation algorithm, for a constant $\varepsilon>0$, for the Ulam metric under strong fairness. Comment: A preliminary version of this paper appeared in NeurIPS 202
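The two ranking metrics named above can be sketched as follows, assuming the usual definitions: Kendall tau counts pairs of candidates ordered differently by the two rankings, and Ulam counts the minimum number of "remove one candidate and reinsert it elsewhere" moves, which equals n minus the length of the longest common subsequence. The abstract does not spell out these definitions, so treat them as assumptions; the function names are hypothetical and no fairness constraint is modeled.

```python
from bisect import bisect_left
from itertools import combinations
from typing import Hashable, List, Sequence


def _relabel(pi: Sequence[Hashable], sigma: Sequence[Hashable]) -> List[int]:
    """Rewrite pi using each candidate's position in sigma."""
    position = {item: rank for rank, item in enumerate(sigma)}
    return [position[item] for item in pi]


def kendall_tau(pi: Sequence[Hashable], sigma: Sequence[Hashable]) -> int:
    """Number of candidate pairs that the two rankings order differently (O(n^2))."""
    mapped = _relabel(pi, sigma)
    return sum(1 for a, b in combinations(mapped, 2) if a > b)


def ulam(pi: Sequence[Hashable], sigma: Sequence[Hashable]) -> int:
    """n minus the longest common subsequence of the two rankings.

    After relabeling, the common subsequence becomes an increasing
    subsequence, so it is found with patience sorting in O(n log n).
    """
    tails: List[int] = []
    for value in _relabel(pi, sigma):
        pos = bisect_left(tails, value)
        if pos == len(tails):
            tails.append(value)
        else:
            tails[pos] = value
    return len(pi) - len(tails)


if __name__ == "__main__":
    a, b = [1, 2, 3, 4], [2, 3, 4, 1]
    print(kendall_tau(a, b), ulam(a, b))
```

In the example, moving candidate 1 from the front to the back is a single Ulam move but flips its order against three other candidates, which is why the two metrics can differ substantially on the same pair of rankings.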