Simple Local Computation Algorithms for the General Lovász Local Lemma
We consider the task of designing Local Computation Algorithms (LCAs) for
applications of the Lov\'{a}sz Local Lemma (LLL). LCAs are a class of sublinear
algorithms, proposed by Rubinfeld et al.~\cite{Ronitt}, that have received a
lot of attention in recent years. The LLL is an existential, sufficient condition
for a collection of sets to have non-empty intersection (in applications,
often, each set comprises all objects having a certain property). The
ground-breaking algorithm of Moser and Tardos~\cite{MT} made the LLL fully
constructive, following earlier results by Beck~\cite{beck_lll} and
Alon~\cite{alon_lll} giving algorithms under significantly stronger LLL-like
conditions. LCAs under those stronger conditions were given in~\cite{Ronitt},
where it was asked if the Moser-Tardos algorithm can be used to design LCAs
under the standard LLL condition. The main contribution of this paper is to
answer this question affirmatively. In fact, our techniques yield LCAs for
settings beyond the standard LLL condition.
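For intuition, here is a minimal sketch of the Moser-Tardos resampling scheme, specialized to $k$-SAT (the clause encoding and all names are ours, for illustration only): draw a uniformly random assignment and, while some clause is violated, resample just that clause's variables.

```python
import random

def moser_tardos_ksat(clauses, num_vars, rng=random.Random(0)):
    """Moser-Tardos resampling for k-SAT (illustrative sketch).

    Each clause is a list of non-zero ints: literal v means
    "variable |v| is True", -v means "variable |v| is False".
    Under the LLL condition (each clause shares variables with few
    others), the expected number of resamplings is small.
    """
    assign = [rng.random() < 0.5 for _ in range(num_vars + 1)]  # index 0 unused

    def violated(clause):
        # A clause is violated iff every one of its literals is False.
        return all(assign[abs(lit)] != (lit > 0) for lit in clause)

    while True:
        bad = next((c for c in clauses if violated(c)), None)
        if bad is None:
            return assign[1:]
        for lit in bad:  # resample only the violated clause's variables
            assign[abs(lit)] = rng.random() < 0.5

# Example: (x1 or x2) and (not x1 or x3).
print(moser_tardos_ksat([[1, 2], [-1, 3]], num_vars=3))
```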
Certified Computation from Unreliable Datasets
A wide range of learning tasks require human input in labeling massive data.
The collected data, though, are usually of low quality and contain inaccuracies and
errors. As a result, modern science and business face the problem of learning
from unreliable data sets.
In this work, we provide a generic approach that is based on
\textit{verification} of only a few records of the data set to guarantee high
quality learning outcomes for various optimization objectives. Our method
identifies small sets of critical records and verifies their validity. We show
that many problems only need $\mathrm{poly}(1/\varepsilon)$ verifications to
ensure that the output of the computation is at most a factor of
$(1 \pm \varepsilon)$ away from the truth. For any given instance, we provide an
\textit{instance optimal} solution that verifies the minimum possible number of
records to approximately certify correctness. Then, using this instance optimal
formulation of the problem, we prove our main result: "every function that
satisfies some Lipschitz continuity condition can be certified with a small
number of verifications". We show that the required Lipschitz continuity
condition is satisfied even by some NP-complete problems, which illustrates the
generality and importance of this theorem.
In case this certification step fails, an invalid record will be identified.
Removing such records and repeating until success guarantees that the result
will be accurate and will depend only on the verified records. Surprisingly, as
we show, for several computation tasks more efficient methods are possible.
These methods always guarantee that the produced result is not affected by the
invalid records, since any invalid record that affects the output will be
detected and verified.
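As a toy illustration of this verify-remove-repeat idea (our example, not the paper's construction), consider computing a minimum over unreliable records: only the record achieving the current minimum can affect the output, so it alone ever needs verification.

```python
def certified_min(records, verify):
    """Minimum over unreliable records, verifying only records that
    actually determine the output. `verify(r)` is an oracle (e.g., a
    human checker) returning True iff record r is valid."""
    pool = list(records)
    while pool:
        candidate = min(pool)
        if verify(candidate):   # one verification per round
            return candidate    # the result depends only on verified data
        pool.remove(candidate)  # invalid record detected: remove and repeat
    raise ValueError("no valid records")

# Toy run: 3 is a corrupted entry; it is detected, verified, and removed.
print(certified_min([7, 3, 9], verify=lambda r: r != 3))  # -> 7
```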
Sampling Correctors
In many situations, sample data is obtained from a noisy or imperfect source.
In order to address such corruptions, this paper introduces the concept of a
sampling corrector. Such algorithms use structure that the distribution is
purported to have, in order to allow one to make "on-the-fly" corrections to
samples drawn from probability distributions. These algorithms then act as
filters between the noisy data and the end user.
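A classical toy instance of such a filter (our example, not from the paper): a coin purported to be fair can be corrected on the fly by the von Neumann trick, which turns i.i.d. bits of unknown bias into exactly fair ones.

```python
import random

def fair_coin(biased_bit):
    # In a pair (a, b) with a != b, the orders (1, 0) and (0, 1) are
    # equally likely, so returning `a` yields an exactly fair bit.
    while True:
        a, b = biased_bit(), biased_bit()
        if a != b:
            return a

rng = random.Random(1)
noisy = lambda: int(rng.random() < 0.7)             # unreliable source, bias 0.7
print(sum(fair_coin(noisy) for _ in range(10000)))  # close to 5000
```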
We show connections between sampling correctors, distribution learning
algorithms, and distribution property testing algorithms. We show that these
connections can be utilized to expand the applicability of known distribution
learning and property testing algorithms as well as to achieve improved
algorithms for those tasks.
As a first step, we show how to design sampling correctors using proper
learning algorithms. We then focus on the question of whether algorithms for
sampling correctors can be more efficient in terms of sample complexity than
learning algorithms for the analogous families of distributions. When
correcting monotonicity, we show that this is indeed the case when also granted
query access to the cumulative distribution function. We also obtain sampling
correctors for monotonicity without this stronger type of access, provided that
the distribution be originally very close to monotone (namely, at a distance
$O(1/\log^2 n)$). In addition to that, we consider a restricted error model
that aims at capturing "missing data" corruptions. In this model, we show that
distributions that are close to monotone have sampling correctors that are
significantly more efficient than achievable by the learning approach.
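The following sketch conveys one way such a lightweight corrector can avoid full-blown learning; it uses a Birg\'e-style geometric bucketing (the parameterization and names are illustrative, not taken from the paper) and flattens each source sample uniformly within its bucket, which keeps the output close to any source that is near monotone.

```python
import random

def flattening_corrector(sample_source, n, eps=0.1, rng=random.Random(0)):
    """Filter for distributions purported to be non-increasing on
    {0, ..., n-1}: each corrected sample costs one source sample."""
    # Precompute geometrically growing buckets [lo, hi).
    buckets, lo, size = [], 0, 1.0
    while lo < n:
        hi = min(n, lo + max(1, int(size)))
        buckets.append((lo, hi))
        lo, size = hi, size * (1 + eps)

    def corrected_sample():
        x = sample_source()                 # one draw from the noisy source
        for a, b in buckets:
            if a <= x < b:
                return rng.randrange(a, b)  # uniform within x's bucket
        return x

    return corrected_sample

src_rng = random.Random(1)
corr = flattening_corrector(lambda: src_rng.randrange(16), n=16)
print([corr() for _ in range(5)])
```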
We also consider the question of whether an additional source of independent
random bits is required by sampling correctors to implement the correction
process.
Fast Local Computation Algorithms
For input $x$, let $F(x)$ denote the set of outputs that are the "legal"
answers for a computational problem $F$. Suppose $x$ and members of $F(x)$ are
so large that there is not time to read them in their entirety. We propose a
model of {\em local computation algorithms} which, for a given input $x$,
support queries by a user to values of specified locations $y_i$ in a legal
output $y \in F(x)$. When more than one legal output $y$ exists for a given
$x$, the local computation algorithm should output in a way that is consistent
with at least one such $y$. Local computation algorithms are intended to
distill the common features of several concepts that have appeared in various
algorithmic subfields, including local distributed computation, local
algorithms, locally decodable codes, and local reconstruction.
We develop a technique, based on known constructions of small sample spaces
of $k$-wise independent random variables and Beck's analysis in his algorithmic
approach to the Lov{\'{a}}sz Local Lemma, which under certain conditions can be
applied to construct local computation algorithms that run in {\em
polylogarithmic} time and space. We apply this technique to maximal independent
set computations, scheduling radio network broadcasts, hypergraph coloring and
satisfying $k$-SAT formulas.
Comment: A preliminary version of this paper appeared in ICS 2011, pp. 223-23
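To make the model concrete, here is a sketch of a local computation algorithm for maximal independent set in this style (a standard greedy simulation; names and details are ours): a query "is $v$ in the MIS?" locally simulates the greedy algorithm under random vertex priorities, recursing only on earlier-ranked neighbors.

```python
import random

def lca_mis(adj, seed=0):
    """Query access to one fixed maximal independent set of the graph
    `adj` (a dict: vertex -> list of neighbors)."""
    # A random priority per vertex, deterministic given the seed, so
    # repeated queries are answered consistently with the same MIS.
    rank = {v: (random.Random(f"{seed}:{v}").random(), v) for v in adj}
    memo = {}

    def in_mis(v):
        if v not in memo:
            # v joins the greedy MIS iff no earlier-ranked neighbor joined.
            memo[v] = all(not in_mis(u) for u in adj[v] if rank[u] < rank[v])
        return memo[v]

    return in_mis

# Toy queries on the path 0-1-2: each answer explores only nearby vertices.
query = lca_mis({0: [1], 1: [0, 2], 2: [1]})
print(query(0), query(1), query(2))
```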
Agnostic proper learning of monotone functions: beyond the black-box correction barrier
We give the first agnostic, efficient, proper learning algorithm for monotone
Boolean functions. Given $2^{\tilde{O}(\sqrt{n}/\varepsilon)}$ uniformly random
examples of an unknown function $f : \{\pm 1\}^n \rightarrow \{\pm 1\}$, our
algorithm outputs a hypothesis $g : \{\pm 1\}^n \rightarrow \{\pm 1\}$ that is
monotone and $(\mathrm{opt} + \varepsilon)$-close to $f$, where $\mathrm{opt}$
is the distance from $f$ to the closest monotone function. The running time of
the algorithm (and consequently the size and evaluation time of the hypothesis)
is also $2^{\tilde{O}(\sqrt{n}/\varepsilon)}$, nearly matching the lower bound
of Blais et al. (RANDOM '15). We also give an algorithm for estimating up to
additive error $\varepsilon$ the distance of an unknown function $f$ to
monotone using a run-time of $2^{\tilde{O}(\sqrt{n}/\varepsilon)}$. Previously,
for both of these problems, sample-efficient algorithms were known, but these
algorithms were not run-time efficient. Our work thus closes this gap in our
knowledge between the run-time and sample complexity.
This work builds upon the improper learning algorithm of Bshouty and Tamon
(JACM '96) and the proper semiagnostic learning algorithm of Lange, Rubinfeld,
and Vasilyan (FOCS '22), which obtains a non-monotone Boolean-valued
hypothesis, then ``corrects'' it to monotone using query-efficient local
computation algorithms on graphs. This black-box correction approach can
achieve no error better than $2\,\mathrm{opt} + \varepsilon$
information-theoretically; we bypass this barrier by
a) augmenting the improper learner with a convex optimization step, and
b) learning and correcting a real-valued function before rounding its values
to Boolean.
Our real-valued correction algorithm solves the ``poset sorting'' problem of
[LRV22] for functions over general posets with non-Boolean labels.
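As a toy analogue of step (b) (ours, not the paper's algorithm, and on a totally ordered domain rather than a general poset): first correct real-valued predictions to monotone via isotonic regression, then round by thresholding, which preserves monotonicity.

```python
def isotonic_then_round(values, threshold=0.0):
    """Correct-then-round on a chain x_1 < ... < x_m: replace `values`
    with the closest (in L2) non-decreasing sequence via
    pool-adjacent-violators, then threshold to +/-1 labels."""
    blocks = []  # (sum, count); each block carries one fitted value
    for v in values:
        s, c = v, 1
        # Merge backwards while the non-decreasing order is violated.
        while blocks and blocks[-1][0] / blocks[-1][1] > s / c:
            ps, pc = blocks.pop()
            s, c = s + ps, c + pc
        blocks.append((s, c))
    corrected = [s / c for s, c in blocks for _ in range(c)]
    return [1 if v > threshold else -1 for v in corrected]

# Noisy, nearly monotone scores become a monotone +/-1 labeling.
print(isotonic_then_round([-0.9, 0.2, -0.1, 0.8]))  # -> [-1, 1, 1, 1]
```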