Minimizing Convex Functions with Integral Minimizers
Given a separation oracle $\mathrm{SO}$ for a convex function $f$ that has an
integral minimizer inside a box with radius $R$, we show how to find an exact
minimizer of $f$ using at most (a) $O(n(n + \log R))$ calls to $\mathrm{SO}$
and $\mathrm{poly}(n, \log R)$ arithmetic operations, or (b) $O(n \log(nR))$
calls to $\mathrm{SO}$ and $\exp(O(n)) \cdot \mathrm{poly}(\log R)$ arithmetic
operations. When the set of minimizers of $f$ has integral extreme points, our
algorithm outputs an integral minimizer of $f$. This improves upon the
previously best oracle complexity of $O(n^2(n + \log R))$ for polynomial time
algorithms obtained by [Gr\"otschel, Lov\'asz and Schrijver, Prog. Comb. Opt.
1984, Springer 1988] over thirty years ago.
For the Submodular Function Minimization problem, our result immediately
implies a strongly polynomial algorithm that makes at most $O(n^3)$ calls to an
evaluation oracle, and an exponential time algorithm that makes at most
$O(n^2 \log n)$ calls to an evaluation oracle. These improve upon the
previously best $O(n^3 \log^2 n)$ oracle complexity for strongly polynomial
algorithms given in [Lee, Sidford and Wong, FOCS 2015] and [Dadush, V\'egh and
Zambelli, SODA 2018], and an exponential time algorithm with $O(n^3 \log n)$
oracle complexity given in the former work.
Our result is achieved via a reduction to the Shortest Vector Problem in
lattices. We show how an approximately shortest vector of a certain lattice can
be used to effectively reduce the dimension of the problem. Our analysis of the
oracle complexity is based on a potential function that simultaneously captures
the size of the search set and the density of the lattice, which we analyze via
technical tools from convex geometry.
Comment: This version of the paper simplifies and generalizes the results in
an earlier version which will appear in SODA 2021.
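To make the dimension-reduction step concrete, here is a minimal sketch in Python. It is our illustration of the general idea, not the paper's algorithm; the box $[-R, R]^n$, the vector $v$, and the slicing rule are all assumptions we introduce for the example:

```python
# Illustrative sketch, not the paper's algorithm: a short integer vector v
# lets us slice a box [-R, R]^n known to contain an integral minimizer x*.
# Since <v, x*> is an integer with |<v, x*>| <= ||v||_1 * R, branching on
# c = <v, x*> confines the search to the hyperplane {x : <v, x> = c},
# removing one dimension; a short v keeps the number of slices small.
import numpy as np

def candidate_hyperplanes(v: np.ndarray, R: float) -> range:
    """All integer values c that <v, x> can take over x in [-R, R]^n."""
    bound = int(np.floor(np.sum(np.abs(v)) * R))
    return range(-bound, bound + 1)

v = np.array([1, -1, 0, 2])   # a hypothetical short lattice vector
R = 10
print(len(candidate_hyperplanes(v, R)))  # 2 * 40 + 1 = 81 slices to branch on
```

The point of using an (approximately) shortest vector is precisely that $\|v\|_1$ is small, so only a few hyperplane slices need to be explored.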
Algorithms and Adaptivity Gaps for Stochastic k-TSP
Given a metric $(V, d)$ and a root vertex $\rho \in V$, the classic
\textsf{k-TSP} problem is to find a tour originating at the root $\rho$
of minimum length that visits at least $k$ nodes in $V$. In this work,
motivated by applications where the input to an optimization problem is
uncertain, we study two stochastic versions of \textsf{k-TSP}.
In Stoch-Reward $k$-TSP, originally defined by Ene-Nagarajan-Saket [ENS17],
each vertex $v$ in the given metric contains a stochastic reward $R_v$.
The goal is to adaptively find a tour of minimum expected length that collects
at least reward $k$; here "adaptively" means our next decision may depend on
previous outcomes. Ene et al. give an $O(\log k)$-approximation adaptive
algorithm for this problem, and left open whether there is an
$O(1)$-approximation algorithm. We completely resolve their open question and
even give an $O(1)$-approximation \emph{non-adaptive} algorithm for this problem.
We also introduce and obtain similar results for the Stoch-Cost $k$-TSP
problem. In this problem each vertex $v$ has a stochastic cost $C_v$, and the
goal is to visit and select at least $k$ vertices to minimize the expected
\emph{sum} of tour length and cost of selected vertices. This problem
generalizes the Price of Information framework [Singla18] from deterministic
probing costs to metric probing costs.
Our techniques are based on two crucial ideas: "repetitions" and "critical
scaling". We show using Freedman's and Jogdeo-Samuels' inequalities that for
our problems, if we truncate the random variables at an ideal threshold and
repeat, then their expected values form a good surrogate. Unfortunately, this
ideal threshold is adaptive as it depends on how far we are from achieving our
target $k$, so we truncate at various different scales and identify a
"critical" scale.
Comment: ITCS 2020
Forward and Inverse Approximation Theory for Linear Temporal Convolutional Networks
We present a theoretical analysis of the approximation properties of
convolutional architectures when applied to the modeling of temporal sequences.
Specifically, we prove an approximation rate estimate (Jackson-type result) and
an inverse approximation theorem (Bernstein-type result), which together
provide a comprehensive characterization of the types of sequential
relationships that can be efficiently captured by a temporal convolutional
architecture. The rate estimate improves upon a previous result via the
introduction of a refined complexity measure, whereas the inverse approximation
theorem is new.
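For orientation, the following LaTeX schematic shows the generic shape of such a pair of statements; the norm, the complexity measure $C(H)$, the model size $m$, and the rate $\alpha$ are placeholders of ours, not the paper's exact quantities:

```latex
% Schematic forward (Jackson-type) and inverse (Bernstein-type) results;
% C(H) is a complexity measure of the target H and m the model size.
\begin{align*}
  \text{Jackson: } & \inf_{\widehat{H} \in \mathcal{H}_m}
      \big\| H - \widehat{H} \big\| \le \frac{C(H)}{m^{\alpha}}, \\
  \text{Bernstein: } & \inf_{\widehat{H} \in \mathcal{H}_m}
      \big\| H - \widehat{H} \big\| = O(m^{-\alpha})
      \;\Longrightarrow\; C(H) < \infty .
\end{align*}
```

Read together, the forward direction bounds how fast approximation error decays for targets of finite complexity, while the inverse direction says only such targets admit that decay, which is what makes the characterization comprehensive.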
Approximation theory of transformer networks for sequence modeling
The transformer is a widely applied architecture in sequence modeling
applications, but the theoretical understanding of its working principles is
limited. In this work, we investigate the ability of transformers to
approximate sequential relationships. We first prove a universal approximation
theorem for the transformer hypothesis space. From its derivation, we identify
a novel notion of regularity under which we can prove an explicit approximation
rate estimate. This estimate reveals key structural properties of the
transformer and suggests the types of sequence relationships that the
transformer is adapted to approximating. In particular, it allows us to
concretely discuss the structural bias between the transformer and classical
sequence modeling methods, such as recurrent neural networks. Our findings are
supported by numerical experiments.
Sparse Submodular Function Minimization
In this paper we study the problem of minimizing a submodular function
$f : 2^V \rightarrow \mathbb{R}$ that is guaranteed to have a $k$-sparse
minimizer. We give a deterministic algorithm that computes an additive
$\epsilon$-approximate minimizer of such $f$ in
$\widetilde{O}(\mathrm{poly}(k) \log(|f|/\epsilon))$ parallel depth using a
polynomial number of queries to an evaluation oracle of $f$, where
$|f| = \max_{S \subseteq V} |f(S)|$. Further, we give a randomized algorithm
that computes an exact minimizer of $f$ with high probability using
$\widetilde{O}(|V| \cdot \mathrm{poly}(k))$ queries and polynomial time. When
$k = \widetilde{O}(1)$, our algorithms use either nearly-constant parallel
depth or a nearly-linear number of evaluation oracle queries. All previous
algorithms for this problem either use polynomially large parallel depth or a
number of evaluation oracle queries super-linear in $|V|$.
In contrast to state-of-the-art weakly-polynomial and strongly-polynomial
time algorithms for SFM, our algorithms use first-order optimization methods,
e.g., mirror descent and follow-the-regularized-leader. We introduce what we
call {\em sparse dual certificates}, which encode information on the structure
of sparse minimizers, and both our parallel and sequential algorithms provide
new algorithmic tools for allowing first-order optimization methods to
efficiently compute them. Correspondingly, our algorithms do not invoke fast
matrix multiplication or general linear system solvers and in this sense are
more combinatorial than previous state-of-the-art methods.
Comment: Accepted to FOCS 2023
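As background for the first-order approach, here is a minimal Python sketch of the classical baseline such methods build on: projected subgradient descent on the Lovász extension. It is not the paper's algorithm (in particular, it has no sparse dual certificates), and the toy function, step size, and rounding rule are our assumptions:

```python
# Classical baseline, not the paper's algorithm: projected subgradient
# descent on the Lovász extension of a normalized (f(empty) = 0) submodular
# f. Its minimum over [0, 1]^n equals min_S f(S), and Edmonds' greedy rule
# yields a subgradient from n evaluation-oracle calls.
import numpy as np

def lovasz_subgradient(f, x):
    """Subgradient of the Lovász extension at x via the greedy order."""
    g = np.zeros(len(x))
    prev, S = 0.0, set()                    # uses f(empty set) = 0
    for i in np.argsort(-x):                # coordinates in decreasing order
        S.add(int(i))
        val = f(S)
        g[i] = val - prev                   # marginal gain f(S) - f(S \ {i})
        prev = val
    return g

def minimize_sfm(f, n, steps=2000, lr=0.05):
    x = np.full(n, 0.5)
    for _ in range(steps):
        x -= lr * lovasz_subgradient(f, x)
        np.clip(x, 0.0, 1.0, out=x)         # project back onto the cube
    return {i for i in range(n) if x[i] > 0.5}  # round to a set

# Toy example with the 1-sparse minimizer {1}: f(S) = [0 in S] - [1 in S].
f = lambda S: (0 in S) - (1 in S)
print(minimize_sfm(f, n=4))                 # expect {1}
```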
A Unified PTAS for Prize Collecting TSP and Steiner Tree Problem in Doubling Metrics
We present a unified (randomized) polynomial-time approximation scheme (PTAS) for the prize collecting traveling salesman problem (PCTSP) and the prize collecting Steiner tree problem (PCSTP) in doubling metrics. Given a metric space and a penalty function on a subset of points known as terminals, a solution is a subgraph on points in the metric space, whose cost is the weight of its edges plus the penalty due to terminals not covered by the subgraph. Under our unified framework, the solution subgraph needs to be Eulerian for PCTSP, while it needs to be a tree for PCSTP. Before our work, even a QPTAS for the problems in doubling metrics was not known.
Our unified PTAS is based on the previous dynamic programming frameworks proposed in [Talwar STOC 2004] and [Bartal, Gottlieb, Krauthgamer STOC 2012]. However, since it is unknown which part of the optimal cost is due to edge lengths and which part is due to penalties of uncovered terminals, we need to develop new techniques to apply previous divide-and-conquer strategies and sparse instance decompositions.
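To pin down the objective, here is a small Python helper computing the prize-collecting cost of a candidate solution; the function and its names are ours, purely to restate the definition above:

```python
# Hypothetical helper, just to make the objective concrete: the cost of a
# prize-collecting solution is its total edge weight plus the penalties of
# the terminals it fails to cover.

def pc_cost(edges, weight, covered, penalty):
    """edges: iterable of (u, v); weight: dict[(u, v)] -> float;
    covered: set of terminals touched by the subgraph;
    penalty: dict[terminal] -> float."""
    edge_cost = sum(weight[e] for e in edges)
    penalty_cost = sum(p for t, p in penalty.items() if t not in covered)
    return edge_cost + penalty_cost

# Example: one edge (a, b) of weight 3 covering terminal a, leaving
# terminal c uncovered with penalty 5 -> total cost 8.
print(pc_cost([("a", "b")], {("a", "b"): 3.0}, {"a", "b"}, {"a": 2.0, "c": 5.0}))
```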
Parallel Submodular Function Minimization
We consider the parallel complexity of submodular function minimization
(SFM). We provide a pair of methods which obtain two new query versus depth
trade-offs for a submodular function defined on subsets of $n$ elements that
has integer values between $-M$ and $M$. The first method has depth $2$ and
query complexity $n^{O(M)}$ and the second method has depth
$\widetilde{O}(n^{1/3} M^{2/3})$ and query complexity $\mathrm{poly}(n, M)$.
Despite a line of work on improved parallel lower bounds for SFM, prior to our
work the only known algorithms for parallel SFM either followed from more
general methods for sequential SFM or from highly-parallel minimization of
convex $\ell_\infty$-Lipschitz functions. Interestingly, to obtain our second
result we provide the first highly-parallel algorithm for minimizing
$\ell_\infty$-Lipschitz functions over the hypercube which obtains near-optimal
depth for obtaining constant accuracy …
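To illustrate what a query versus depth trade-off means, here is a toy Python sketch of one extreme of the trade-off curve; it is our illustration only and reflects nothing of the paper's methods:

```python
# Toy illustration (ours, not from the paper) of the query/depth trade-off:
# "depth" counts rounds of adaptivity, where each round may issue many
# evaluation-oracle queries in parallel. The extreme below uses a single
# round with 2^n queries; the interesting regime is far fewer queries at
# slightly larger depth.
from itertools import chain, combinations

def brute_force_sfm(f, n):
    """Depth 1: query f on every subset in one parallel batch, take the min."""
    subsets = list(chain.from_iterable(
        combinations(range(n), r) for r in range(n + 1)))
    values = [f(set(S)) for S in subsets]   # one parallel round of 2^n queries
    return set(subsets[values.index(min(values))])

# Example: a minimizer of f(S) = [0 in S] - |S ∩ {1, 2}| is {1, 2}.
f = lambda S: (0 in S) - len(S & {1, 2})
print(brute_force_sfm(f, n=4))              # expect {1, 2}
```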