On Complexity of 1-Center in Various Metrics
We consider the classic 1-center problem: Given a set P of n points in a
metric space, find the point in P that minimizes the maximum distance to the
other points of P. We study the complexity of this problem in d-dimensional
$\ell_p$-metrics and in the edit and Ulam metrics over strings of length d. Our
results for the 1-center problem may be classified based on d as follows.
Small d: We provide the first linear-time algorithm for the 1-center
problem in fixed-dimensional metrics. On the other hand, assuming the
hitting set conjecture (HSC), we show that when $d = \omega(\log n)$, no
subquadratic algorithm can solve the 1-center problem in any of the
$\ell_p$-metrics, or in the edit or Ulam metrics.
Large d: When $d = \Omega(n)$, we extend our conditional lower bound
to rule out subquartic algorithms for the 1-center problem in the edit metric
(assuming Quantified SETH). On the other hand, we give a
$(1+\varepsilon)$-approximation for 1-center in the Ulam metric with running time
$\tilde{O}_{\varepsilon}(nd + n^2\sqrt{d})$.
We also strengthen some of the above lower bounds by allowing approximations
or by reducing the dimension d, but only against a weaker class of algorithms
which list all requisite solutions. Moreover, we extend one of our hardness
results to rule out subquartic algorithms for the well-studied 1-median problem
in the edit metric, where, given a set of n strings each of length n, the goal
is to find a string in the set that minimizes the sum of the edit distances to
the rest of the strings in the set.
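To make the objective concrete, here is a small brute-force sketch (illustrative only, not the paper's algorithm) of 1-center under the Ulam metric, using the standard fact that the Ulam distance between two permutations of length n equals n minus the length of their longest common subsequence:

```python
def ulam(p: list[int], q: list[int]) -> int:
    # Ulam distance between two permutations of the same set:
    # the minimum number of "move one symbol" operations, which
    # equals n - |LCS(p, q)|.
    n = len(p)
    # O(n^2) LCS dynamic program with two rolling rows. (For
    # permutations this can be sped up to O(n log n) via LIS,
    # but the quadratic DP keeps the sketch short.)
    prev = [0] * (n + 1)
    for a in p:
        curr = [0]
        for j, b in enumerate(q, 1):
            curr.append(prev[j - 1] + 1 if a == b else max(prev[j], curr[j - 1]))
        prev = curr
    return n - prev[n]

def one_center(perms: list[list[int]]) -> list[int]:
    # Brute force: try every input point as the center and keep the
    # one minimizing the maximum distance -- O(n^2) distance calls.
    return min(perms, key=lambda p: max(ulam(p, q) for q in perms))
```

The same two-line `one_center` wrapper gives the 1-median variant by replacing `max` with `sum`.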
Gap Edit Distance via Non-Adaptive Queries: Simple and Optimal
We study the problem of approximating edit distance in sublinear time. This
is formalized as the promise problem $(k, K)$-Gap Edit Distance, where the input
is a pair of strings $X, Y$ and parameters $k < K$, and the goal is to return
YES if $\mathrm{ED}(X, Y) \le k$ and NO if $\mathrm{ED}(X, Y) > K$. Recent years have witnessed
significant interest in designing sublinear-time algorithms for Gap Edit
Distance.
We resolve the non-adaptive query complexity of Gap Edit Distance, improving
over several previous results. Specifically, we design a non-adaptive algorithm
with query complexity $\tilde{O}(n\sqrt{k}/K)$, and further prove that
this bound is optimal up to polylogarithmic factors.
Our algorithm also achieves optimal time complexity for a wide range of the
parameters $k$ and $K$. For the restricted case $K = \Omega(n)$, this matches a
known result [Batu, Ergün, Kilian, Magen, Raskhodnikova, Rubinfeld, and Sami,
STOC 2003], and in all other (nontrivial) cases, our running time is strictly
better than that of all previous algorithms, including adaptive ones.
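A quadratic-time, purely illustrative reference decider makes the promise semantics explicit: with the exact edit distance in hand, thresholding at k is correct under the promise; the point of the paper is to decide the same question with far fewer, non-adaptively chosen queries. The function name and interface below are mine, not the paper's.

```python
from functools import cache

def gap_edit_distance(x: str, y: str, k: int, K: int) -> bool:
    # Decide (k, K)-Gap Edit Distance: return True (YES) when
    # ED(x, y) <= k and False (NO) when ED(x, y) > K. The input is
    # promised to fall in one of these two cases, so thresholding the
    # exact distance at any value in [k, K] gives a correct answer.
    assert k < K

    @cache
    def ed(i: int, j: int) -> int:
        # Memoized Levenshtein recursion over prefixes x[:i], y[:j].
        if i == 0 or j == 0:
            return i + j
        return min(ed(i - 1, j) + 1,                          # delete x[i-1]
                   ed(i, j - 1) + 1,                          # insert y[j-1]
                   ed(i - 1, j - 1) + (x[i - 1] != y[j - 1]))  # substitute
    return ed(len(x), len(y)) <= k
```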
Approximating Edit Distance Within Constant Factor in Truly Sub-Quadratic Time
Edit distance is a measure of similarity of two strings based on the minimum
number of character insertions, deletions, and substitutions required to
transform one string into the other. The edit distance can be computed exactly
using a dynamic programming algorithm that runs in quadratic time. Andoni,
Krauthgamer and Onak (2010) gave a nearly linear time algorithm that
approximates edit distance within approximation factor $\mathrm{poly}(\log n)$.
In this paper, we provide an algorithm with running time $\tilde{O}(n^{2-2/7})$
that approximates the edit distance within a constant factor.
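The quadratic-time dynamic program mentioned above is the classic Wagner-Fischer recurrence; a minimal sketch:

```python
def edit_distance(a: str, b: str) -> int:
    # Wagner-Fischer DP: D[i][j] = edit distance between the first i
    # characters of a and the first j characters of b. O(|a|*|b|) time,
    # O(|b|) space using two rolling rows.
    prev = list(range(len(b) + 1))  # distances from the empty prefix of a
    for i, ca in enumerate(a, 1):
        curr = [i]                  # distance to the empty prefix of b
        for j, cb in enumerate(b, 1):
            curr.append(min(prev[j] + 1,                 # delete ca
                            curr[-1] + 1,                # insert cb
                            prev[j - 1] + (ca != cb)))   # substitute
        prev = curr
    return prev[-1]
```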
Estimating the Longest Increasing Subsequence in Nearly Optimal Time
Longest Increasing Subsequence (LIS) is a fundamental statistic of a
sequence, and has been studied for decades. While the LIS of a sequence of
length $n$ can be computed exactly in time $O(n \log n)$, the complexity of
estimating the (length of the) LIS in sublinear time, especially when LIS $\ll n$, is still open.
We show that for any integer $n$ and any $\lambda = o(1)$, there exists a
(randomized) non-adaptive algorithm that, given a sequence of length $n$ with
LIS $\ge \lambda n$, approximates the LIS up to a factor of $\lambda^{-o(1)}$
in $\lambda^{-1} \cdot n^{o(1)}$ time.
Our algorithm improves upon prior work substantially in terms of both
approximation and run-time: (i) we provide the first sub-polynomial
approximation for LIS in sub-linear time; and (ii) our run-time complexity
essentially matches the trivial sample complexity lower bound of
$\Omega(1/\lambda)$, which is required to obtain any non-trivial approximation
of the LIS.
As part of our solution, we develop two novel ideas which may be of
independent interest: First, we define a new Genuine-LIS problem, where each
sequence element may either be genuine or corrupted. In this model, the user
receives unrestricted access to the actual sequence, but does not know a priori
which elements are genuine. The goal is to estimate the LIS using genuine
elements only, with the minimal number of "genuineness tests". The second idea,
Precision Forest, enables accurate estimations for compositions of general
functions from "coarse" (sub-)estimates. Precision Forest essentially
generalizes classical precision sampling, which works only for summations. As a
central tool, the Precision Forest is initially pre-processed on a set of
samples, which is thereafter repeatedly reused by multiple sub-parts of the
algorithm, improving their amortized complexity.
Comment: Full version of FOCS 2022 paper
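For contrast with the sublinear-time setting, the exact $O(n \log n)$ LIS computation mentioned in the opening is a short patience-sorting routine:

```python
from bisect import bisect_left

def lis_length(seq: list[int]) -> int:
    # Patience sorting: tails[i] holds the smallest possible tail value
    # of a strictly increasing subsequence of length i + 1. Each element
    # costs one binary search, so the total time is O(n log n).
    tails: list[int] = []
    for x in seq:
        i = bisect_left(tails, x)  # first tail >= x
        if i == len(tails):
            tails.append(x)        # x extends the longest subsequence
        else:
            tails[i] = x           # x improves (lowers) an existing tail
    return len(tails)
```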
Fair Rank Aggregation
Ranking algorithms find extensive usage in diverse areas such as web search,
employment, college admission, voting, etc. The related rank aggregation
problem deals with combining multiple rankings into a single aggregate ranking.
However, algorithms for both these problems might be biased against some
individuals or groups due to implicit prejudice or marginalization in the
historical data. We study ranking and rank aggregation problems from a fairness
or diversity perspective, where the candidates (to be ranked) may belong to
different groups and each group should have a fair representation in the final
ranking. We allow the designer to set the parameters that define fair
representation. These parameters specify the allowed range of the number of
candidates from a particular group in the top-$k$ positions of the ranking.
Given any ranking, we provide a fast and exact algorithm for finding the
closest fair ranking for the Kendall tau metric under block-fairness. We also
provide an exact algorithm for finding the closest fair ranking for the Ulam
metric under strict-fairness, when there are only $O(1)$ groups. Our
algorithms are simple, fast, and might be extendable to other relevant metrics.
We also give a novel meta-algorithm for the general rank aggregation problem
under the fairness framework. Surprisingly, this meta-algorithm works for any
generalized mean objective (including center and median problems) and any
fairness criteria. As a byproduct, we obtain 3-approximation algorithms for
both center and median problems, under both Kendall tau and Ulam metrics.
Furthermore, using sophisticated techniques, we obtain a
$(3-\varepsilon)$-approximation algorithm, for some constant $\varepsilon > 0$, for
the Ulam metric under strong fairness.
Comment: A preliminary version of this paper appeared in NeurIPS 2022
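As a small illustration of the fairness constraints described above (the interface and names are hypothetical, not the paper's notation), a ranking satisfies the designer's parameters when each group's count among the top-k positions falls in its allowed range:

```python
from collections import Counter

def is_fair(ranking: list[str], group: dict[str, str],
            bounds: dict[str, tuple[int, int]], k: int) -> bool:
    # ranking: candidates in rank order; group: candidate -> group label;
    # bounds: group label -> (lo, hi) allowed count among the top-k.
    counts = Counter(group[c] for c in ranking[:k])
    return all(lo <= counts.get(g, 0) <= hi for g, (lo, hi) in bounds.items())
```

A closest-fair-ranking routine (the paper's subject) would search over fair rankings for the one nearest the input under Kendall tau or Ulam distance; the check above only tests membership in the fair set.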