    Deterministic Sparse Pattern Matching via the Baur-Strassen Theorem

    How fast can you test whether a constellation of stars appears in the night sky? This question can be modeled as the computational problem of testing whether a set of points $P$ can be moved into (or close to) another set $Q$ under some prescribed group of transformations. Consider, as a simple representative, the following problem: Given two sets of at most $n$ integers $P,Q\subseteq[N]$, determine whether there is some shift $s$ such that $P$ shifted by $s$ is a subset of $Q$, i.e., $P+s=\{p+s:p\in P\}\subseteq Q$. This problem, to which we refer as the Constellation problem, can be solved in near-linear time $O(n\log n)$ by a Monte Carlo randomized algorithm [Cardoze, Schulman; FOCS'98] and time $O(n\log^2 N)$ by a Las Vegas randomized algorithm [Cole, Hariharan; STOC'02]. Moreover, there is a deterministic algorithm running in time $n\cdot2^{O(\sqrt{\log n\log\log N})}$ [Chan, Lewenstein; STOC'15]. An interesting question left open by these previous works is whether Constellation is in deterministic near-linear time (i.e., with only polylogarithmic overhead). We answer this question positively by giving an $n\cdot(\log N)^{O(1)}$-time deterministic algorithm for the Constellation problem. Our algorithm extends to various more complex Point Pattern Matching problems in higher dimensions, under translations and rigid motions, and possibly with mismatches, and also to a near-linear-time derandomization of the Sparse Wildcard Matching problem on strings. We find it particularly interesting how we obtain our deterministic algorithm. All previous algorithms are based on the same baseline idea, using additive hashing and the Fast Fourier Transform. In contrast, our algorithms are based on new ideas, involving a surprising blend of combinatorial and algebraic techniques. At the heart lies an innovative application of the Baur-Strassen theorem from algebraic complexity theory.
    Comment: Abstract shortened to fit arXiv requirement
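
To make the Constellation problem statement concrete, here is a minimal brute-force sketch. It is not the algorithm of the paper: it simply tries every candidate shift and runs in quadratic time. The function name `find_shift` is a hypothetical choice made for this illustration.

```python
def find_shift(P, Q):
    """Return a shift s with P + s a subset of Q, or None if no such shift exists.

    Brute-force baseline for illustration only (quadratic time),
    not the near-linear deterministic algorithm described above.
    """
    P = sorted(set(P))
    Q = set(Q)
    if not P:
        return 0  # the empty pattern matches under any shift
    p0 = P[0]
    # The smallest pattern point must land on some q in Q,
    # so there are at most |Q| candidate shifts to try.
    for q in sorted(Q):
        s = q - p0
        if all(p + s in Q for p in P):
            return s
    return None
```

For instance, `find_shift([1, 3, 4], [2, 5, 7, 8, 10])` finds the shift 4, since {5, 7, 8} is contained in the second set.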

    Algorithms for sparse convolution and sublinear edit distance

    In this PhD thesis on fine-grained algorithm design and complexity, we investigate output-sensitive and sublinear-time algorithms for two important problems. (1) Sparse Convolution: Computing the convolution of two vectors is a basic algorithmic primitive with applications across all of Computer Science and Engineering. In the sparse convolution problem we assume that the input and output vectors have at most $t$ nonzero entries, and the goal is to design algorithms with running times dependent on $t$. For the special case where all entries are nonnegative, which is particularly important for algorithm design, it has been known for twenty years that sparse convolutions can be computed in near-linear randomized time $O(t \log^2 n)$. In this thesis we develop a randomized algorithm with running time $O(t \log t)$, which is optimal (under some mild assumptions), and the first near-linear deterministic algorithm for sparse nonnegative convolution. We also present an application of these results, leading to seemingly unrelated fine-grained lower bounds against distance oracles in graphs. (2) Sublinear Edit Distance: The edit distance of two strings is a well-studied similarity measure with numerous applications in computational biology. While computing the edit distance exactly provably requires quadratic time, a long line of research has led to a constant-factor approximation algorithm in almost-linear time. Perhaps surprisingly, it is also possible to approximate the edit distance $k$ within a large factor $O(k)$ in sublinear time $\widetilde{O}(n/k + \mathrm{poly}(k))$. We drastically improve the approximation factor of the known sublinear algorithms from $O(k)$ to $k^{o(1)}$ while preserving the $O(n/k + \mathrm{poly}(k))$ running time.
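
As a small illustration of the sparse convolution problem, the following sketch convolves two sparse vectors stored as index-to-value dictionaries. It is a naive baseline that touches every pair of nonzero entries, not the near-linear algorithms developed in the thesis; the name `sparse_convolve` and the dictionary representation are assumptions made for this example.

```python
from collections import defaultdict


def sparse_convolve(a, b):
    """Convolve two sparse vectors given as {index: value} dictionaries.

    Naive baseline: time proportional to (#nonzeros of a) * (#nonzeros of b),
    which can be much larger than the output sparsity t.
    """
    out = defaultdict(int)
    for i, x in a.items():
        for j, y in b.items():
            # Entry i of a and entry j of b contribute to output index i + j.
            out[i + j] += x * y
    # Drop entries that cancelled to zero so the result stays sparse.
    return {k: v for k, v in out.items() if v != 0}
```

With nonnegative entries no cancellation can occur, so the support of the output is exactly the sumset of the two input supports.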

    Deterministic 3SUM-Hardness

    As one of the three main pillars of fine-grained complexity theory, the 3SUM problem explains the hardness of many diverse polynomial-time problems via fine-grained reductions. Many of these reductions are either directly based on or heavily inspired by Pătrașcu's framework involving additive hashing and are thus randomized. Some selected reductions were derandomized in previous work [Chan, He; SOSA'20], but the current techniques are limited and a major fraction of the reductions remains randomized. In this work we gather a toolkit aimed at derandomizing reductions based on additive hashing. Using this toolkit, we manage to derandomize almost all known 3SUM-hardness reductions. As technical highlights we derandomize the hardness reductions to (offline) Set Disjointness, (offline) Set Intersection and Triangle Listing; these questions were explicitly left open in previous work [Kopelowitz, Pettie, Porat; SODA'16]. The few exceptions to our work fall into a special category of recent reductions based on structure-versus-randomness dichotomies. We expect that our toolkit can be readily applied to derandomize future reductions as well. As a conceptual innovation, our work thereby promotes the theory of deterministic 3SUM-hardness. As our second contribution, we prove that there is a deterministic universe reduction for 3SUM. Specifically, using additive hashing it is a standard trick to assume that the numbers in 3SUM have size at most $n^3$. We prove that this assumption is similarly valid for deterministic algorithms.
    Comment: To appear at ITCS 202
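
For readers unfamiliar with 3SUM, the problem itself can be stated in a few lines of code. The sketch below is the textbook quadratic-time baseline (sort, then two pointers); it is included only to fix the problem statement and is unrelated to the derandomization toolkit of the paper. The name `three_sum` is a hypothetical choice.

```python
def three_sum(nums):
    """Return three entries of nums (at distinct positions) summing to zero, or None.

    Textbook O(n^2) baseline: sort, then scan with two pointers for each
    choice of the smallest element.
    """
    xs = sorted(nums)
    n = len(xs)
    for i in range(n - 2):
        lo, hi = i + 1, n - 1
        while lo < hi:
            total = xs[i] + xs[lo] + xs[hi]
            if total == 0:
                return (xs[i], xs[lo], xs[hi])
            if total < 0:
                lo += 1  # sum too small: move to a larger middle element
            else:
                hi -= 1  # sum too large: move to a smaller top element
    return None
```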

    Negative-Weight Single-Source Shortest Paths in Near-Linear Time: Now Faster!

    In this work we revisit the fundamental Single-Source Shortest Paths (SSSP) problem with possibly negative edge weights. A recent breakthrough result by Bernstein, Nanongkai and Wulff-Nilsen established a near-linear $O(m \log^8(n) \log(W))$-time algorithm for negative-weight SSSP, where $W$ is an upper bound on the magnitude of the smallest negative-weight edge. In this work we improve the running time to $O(m \log^2(n) \log(nW) \log\log n)$, which is an improvement by nearly six log-factors. Some of these log-factors are easy to shave (e.g. replacing the priority queue used in Dijkstra's algorithm), while others are significantly more involved (e.g. to find negative cycles we design an algorithm reminiscent of noisy binary search and analyze it with drift analysis). As side results, we obtain an algorithm to compute the minimum cycle mean in the same running time as well as a new construction for computing Low-Diameter Decompositions in directed graphs.

    Stronger 3-SUM Lower Bounds for Approximate Distance Oracles via Additive Combinatorics

    The "short cycle removal" technique was recently introduced by Abboud, Bringmann, Khoury and Zamir (STOC '22) to prove fine-grained hardness of approximation. Its main technical result is that listing all triangles in an $n^{1/2}$-regular graph is $n^{2-o(1)}$-hard under the 3-SUM conjecture even when the number of short cycles is small; namely, when the number of $k$-cycles is $O(n^{k/2+\gamma})$ for $\gamma<1/2$. Abboud et al. achieve $\gamma\geq 1/4$ by applying structure vs. randomness arguments on graphs. In this paper, we take a step back and apply conceptually similar arguments on the numbers of the 3-SUM problem. Consequently, we achieve the best possible $\gamma=0$ and the following lower bounds under the 3-SUM conjecture:
    * Approximate distance oracles: The seminal Thorup-Zwick distance oracles achieve stretch $2k\pm O(1)$ after preprocessing a graph in $O(m n^{1/k})$ time. For the same stretch, and assuming the query time is $n^{o(1)}$, Abboud et al. proved an $\Omega(m^{1+\frac{1}{12.7552 \cdot k}})$ lower bound on the preprocessing time; we improve it to $\Omega(m^{1+\frac{1}{2k}})$, which is only a factor 2 away from the upper bound. We also obtain tight bounds for stretch $2+o(1)$ and $3-\epsilon$ and higher lower bounds for dynamic shortest paths.
    * Listing 4-cycles: Abboud et al. proved the first super-linear lower bound for listing all 4-cycles in a graph, ruling out $(m^{1.1927}+t)^{1+o(1)}$ time algorithms where $t$ is the number of 4-cycles. We settle the complexity of this basic problem by showing that the $\widetilde{O}(\min(m^{4/3},n^2)+t)$ upper bound is tight up to $n^{o(1)}$ factors.
    Our results exploit a rich tool set from additive combinatorics, most notably the Balog-Szemerédi-Gowers theorem and Ruzsa's covering lemma. A key ingredient that may be of independent interest is a subquadratic algorithm for 3-SUM if one of the sets has small doubling.
    Comment: Abstract shortened to fit arXiv requirement

    The Hardship That is Internet Deprivation and What it Means for Sentencing: Development of the Internet Sanction and Connectivity for Prisoners

    Twenty years ago, the internet was a novel tool. Now it is such an ingrained part of most people’s lives that they experience and exhibit signs of anxiety and stress if they cannot access it. Lack of access to the internet can also tangibly set back people’s social, educational, financial, and vocational pursuits and interests. In this Article, we argue that sentencing law needs to be reformed to adapt to the fundamental changes in human behavior caused by the internet. We present three novel and major implications for sentencing law and practice in the era of the internet. First, we argue that denial of access to the internet should be developed as a discrete sentencing sanction, which can be invoked for relatively minor offenses in much the same way that deprivation of other entitlements or privileges, such as the right to drive a motor vehicle, is currently imposed for certain crimes. Second, we argue that prisoners should have unfettered access to the internet. This would lessen the pain stemming from incarceration in a manner which does not undermine the principal objectives of imprisonment (community protection and infliction of a hardship) while at the same time providing prisoners with the opportunity to develop skills, knowledge, and relationships that will better equip them for a productive life once they are released. Previous arguments that have been made for denying internet access to prisoners are unsound. Technological advances can readily curb supposed risks associated with prisoners using the internet. Finally, if the second recommendation is not adopted, and prisoners continue to be denied access to the internet, there should be an acknowledgement that the burden of imprisonment is greater than is currently acknowledged. The internet is now such an ingrained and important aspect of people’s lives that prohibiting its use is a cause of considerable unpleasantness. This leads to our third proposal: continued denial of the internet to prisoners should result in a recalibration of the pain of imprisonment such that a sentencing reduction should be conferred to prisoners.

    Fine-Grained Completeness for Optimization in P

    We initiate the study of fine-grained completeness theorems for exact and approximate optimization in the polynomial-time regime. Inspired by the first completeness results for decision problems in P (Gao, Impagliazzo, Kolokolova, Williams, TALG 2019) as well as the classic class MaxSNP and MaxSNP-completeness for NP optimization problems (Papadimitriou, Yannakakis, JCSS 1991), we define polynomial-time analogues MaxSP and MinSP, which contain a number of natural optimization problems in P, including Maximum Inner Product, general forms of nearest neighbor search and optimization variants of the $k$-XOR problem. Specifically, we define MaxSP as the class of problems definable as $\max_{x_1,\dots,x_k} \#\{ (y_1,\dots,y_\ell) : \phi(x_1,\dots,x_k, y_1,\dots,y_\ell) \}$, where $\phi$ is a quantifier-free first-order property over a given relational structure (with MinSP defined analogously). On $m$-sized structures, we can solve each such problem in time $O(m^{k+\ell-1})$. Our results are:
    - We identify (a sparse variant of) the Maximum/Minimum Inner Product problem as complete under *deterministic* fine-grained reductions: A strongly subquadratic algorithm for Maximum/Minimum Inner Product would beat the baseline running time of $O(m^{k+\ell-1})$ for *all* problems in MaxSP/MinSP by a polynomial factor.
    - This completeness transfers to approximation: Maximum/Minimum Inner Product is also complete in the sense that a strongly subquadratic $c$-approximation would give a $(c+\varepsilon)$-approximation for all MaxSP/MinSP problems in time $O(m^{k+\ell-1-\delta})$, where $\varepsilon > 0$ can be chosen arbitrarily small.
    Combining our completeness with (Chen, Williams, SODA 2019), we obtain the perhaps surprising consequence that refuting the OV Hypothesis is *equivalent* to giving an $O(1)$-approximation for all MinSP problems in faster-than-$O(m^{k+\ell-1})$ time.
    Comment: Full version of APPROX'21 paper, abstract shortened to fit arXiv requirement
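
As an illustration of the kind of problem in MaxSP, Maximum Inner Product on Boolean vectors asks for the pair of vectors sharing the most 1-coordinates. The following sketch is the straightforward all-pairs baseline, not a construction from the paper. Representing each vector by the set of its 1-coordinates and the name `max_inner_product` are assumptions made for this example.

```python
def max_inner_product(A, B):
    """Return the maximum inner product over all pairs (a, b) with a in A, b in B.

    A and B are non-empty lists of sets; each set holds the coordinates where a
    Boolean vector equals 1, so the inner product of a and b is len(a & b).
    All-pairs baseline for illustration only.
    """
    return max(len(a & b) for a in A for b in B)
```

In the notation of the MaxSP definition, the maximum ranges over the pair of vectors and the counted objects are the coordinates on which both vectors are 1.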

    Faster Minimization of Tardy Processing Time on a Single Machine

    This paper is concerned with the $1||\sum p_jU_j$ problem, the problem of minimizing the total processing time of tardy jobs on a single machine. This is not only a fundamental scheduling problem, but also a very important problem from a theoretical point of view as it generalizes the Subset Sum problem and is closely related to the 0/1-Knapsack problem. The problem is well-known to be NP-hard, but only in a weak sense, meaning it admits pseudo-polynomial time algorithms. The fastest known pseudo-polynomial time algorithm for the problem is the famous Lawler and Moore algorithm, which runs in $O(P \cdot n)$ time, where $P$ is the total processing time of all $n$ jobs in the input. This algorithm was developed in the late 1960s and has yet to be improved. In this paper we develop two new algorithms for $1||\sum p_jU_j$, each improving on Lawler and Moore's algorithm in a different scenario. Both algorithms rely on basic primitive operations between sets of integers and vectors of integers for the speedup in their running times. The second algorithm relies on fast polynomial multiplication as its main engine, while for the first algorithm we define a new "skewed" version of $(\max,\min)$-convolution which is interesting in its own right.
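
The $O(P \cdot n)$ baseline mentioned above can be sketched as a subset-sum style dynamic program in the spirit of Lawler and Moore. This is a reconstruction for illustration, not code from the paper, and it is not one of the two new algorithms. It assumes positive integer processing times and integer due dates; the name `min_tardy_processing_time` is hypothetical.

```python
def min_tardy_processing_time(jobs):
    """Minimize the total processing time of tardy jobs on a single machine.

    jobs: list of (processing_time, due_date) pairs with positive integer
    processing times and integer due dates.
    Runs in O(P * n) time, where P is the total processing time.
    """
    total = sum(p for p, _ in jobs)
    # reachable[t] is True if some subset of the jobs seen so far can all be
    # completed on time with total processing time exactly t.
    reachable = [False] * (total + 1)
    reachable[0] = True
    # On-time jobs can always be scheduled in order of due date, so we
    # process jobs in that order and append each job last.
    for p, d in sorted(jobs, key=lambda job: job[1]):
        # The appended job finishes at time t and must satisfy t <= d.
        for t in range(min(d, total), p - 1, -1):
            if reachable[t - p]:
                reachable[t] = True
    best_on_time = max(t for t in range(total + 1) if reachable[t])
    return total - best_on_time
```

For the jobs `[(2, 2), (3, 4), (1, 3)]`, the best schedule completes the jobs of length 1 and 3 on time, so the tardy processing time is 2.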