    Simple Local Computation Algorithms for the General Lovasz Local Lemma

    We consider the task of designing Local Computation Algorithms (LCAs) for applications of the Lov\'{a}sz Local Lemma (LLL). LCAs are a class of sublinear algorithms proposed by Rubinfeld et al.~\cite{Ronitt} that have received a lot of attention in recent years. The LLL is an existential, sufficient condition for a collection of sets to have non-empty intersection (in applications, each set often comprises all objects having a certain property). The ground-breaking algorithm of Moser and Tardos~\cite{MT} made the LLL fully constructive, following earlier results by Beck~\cite{beck_lll} and Alon~\cite{alon_lll} giving algorithms under significantly stronger LLL-like conditions. LCAs under those stronger conditions were given in~\cite{Ronitt}, where it was asked if the Moser-Tardos algorithm can be used to design LCAs under the standard LLL condition. The main contribution of this paper is to answer this question affirmatively. In fact, our techniques yield LCAs for settings beyond the standard LLL condition.
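
    For orientation, the sequential resampling procedure of Moser and Tardos~\cite{MT} mentioned above can be sketched for $k$-SAT as follows. This is only the classical global algorithm, not the LCA constructed in the paper. The function names, the signed-integer clause encoding, and the resample budget are assumptions made for illustration.

```python
import random


def satisfies(assignment, clauses):
    """Return True if every clause has at least one true literal.

    Hypothetical encoding: literal v > 0 means variable v-1 is True,
    literal v < 0 means variable |v|-1 is False.
    """
    return all(
        any(assignment[abs(lit) - 1] == (lit > 0) for lit in clause)
        for clause in clauses
    )


def moser_tardos_ksat(num_vars, clauses, seed=0, max_resamples=100_000):
    """Sequential Moser-Tardos resampling (illustrative sketch only)."""
    rng = random.Random(seed)
    # Start from a uniformly random assignment.
    assignment = [rng.random() < 0.5 for _ in range(num_vars)]

    def violated(clause):
        # A clause is violated when all of its literals are false.
        return all(assignment[abs(lit) - 1] != (lit > 0) for lit in clause)

    for _ in range(max_resamples):
        bad = next((c for c in clauses if violated(c)), None)
        if bad is None:
            return assignment
        # Resample only the variables of the violated clause.
        for lit in bad:
            assignment[abs(lit) - 1] = rng.random() < 0.5
    # The budget is an assumption of this sketch, not part of the algorithm.
    raise RuntimeError("resample budget exhausted")


clauses = [[1, 2, 3], [-1, 4, 5], [-2, -4, 6], [3, -5, -6]]
solution = moser_tardos_ksat(6, clauses)
print(satisfies(solution, clauses))
```

    Each step touches only the variables of one violated event, which is the locality that makes the question of an LCA under the standard LLL condition natural.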

    Certified Computation from Unreliable Datasets

    A wide range of learning tasks require human input in labeling massive data. The collected data, though, are usually of low quality and contain inaccuracies and errors. As a result, modern science and business face the problem of learning from unreliable data sets. In this work, we provide a generic approach that is based on \textit{verification} of only a few records of the data set to guarantee high quality learning outcomes for various optimization objectives. Our method identifies small sets of critical records and verifies their validity. We show that many problems only need $\text{poly}(1/\varepsilon)$ verifications to ensure that the output of the computation is at most a factor of $(1 \pm \varepsilon)$ away from the truth. For any given instance, we provide an \textit{instance optimal} solution that verifies the minimum possible number of records to approximately certify correctness. Then, using this instance optimal formulation of the problem, we prove our main result: "every function that satisfies some Lipschitz continuity condition can be certified with a small number of verifications". We show that the required Lipschitz continuity condition is satisfied even by some NP-complete problems, which illustrates the generality and importance of this theorem. In case this certification step fails, an invalid record will be identified. Removing these records and repeating until success guarantees that the result will be accurate and will depend only on the verified records. Surprisingly, as we show, for several computation tasks more efficient methods are possible. These methods always guarantee that the produced result is not affected by the invalid records, since any invalid record that affects the output will be detected and verified.

    Sampling Correctors

    In many situations, sample data is obtained from a noisy or imperfect source. In order to address such corruptions, this paper introduces the concept of a sampling corrector. Such algorithms use structure that the distribution is purported to have, in order to allow one to make "on-the-fly" corrections to samples drawn from probability distributions. These algorithms then act as filters between the noisy data and the end user. We show connections between sampling correctors, distribution learning algorithms, and distribution property testing algorithms. We show that these connections can be utilized to expand the applicability of known distribution learning and property testing algorithms as well as to achieve improved algorithms for those tasks. As a first step, we show how to design sampling correctors using proper learning algorithms. We then focus on the question of whether algorithms for sampling correctors can be more efficient in terms of sample complexity than learning algorithms for the analogous families of distributions. When correcting monotonicity, we show that this is indeed the case when also granted query access to the cumulative distribution function. We also obtain sampling correctors for monotonicity without this stronger type of access, provided that the distribution be originally very close to monotone (namely, at a distance $O(1/\log^2 n)$). In addition to that, we consider a restricted error model that aims at capturing "missing data" corruptions. In this model, we show that distributions that are close to monotone have sampling correctors that are significantly more efficient than achievable by the learning approach. We also consider the question of whether an additional source of independent random bits is required by sampling correctors to implement the correction process.
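
    The "learn, then sample" route to a corrector for monotonicity can be illustrated with a small sketch. It is not the construction from the paper: it assumes a finite domain $\{0, \dots, n-1\}$, takes "monotone" to mean non-increasing, and swaps in a least-squares projection (pool adjacent violators) in place of a proper learning algorithm with total variation guarantees. All function names are hypothetical.

```python
import random
from collections import Counter
from fractions import Fraction


def nonincreasing_fit(counts):
    """Pool adjacent violators: least-squares non-increasing fit.

    `counts` are integers; exact Fractions are returned so that the
    fitted values keep the same total as the input.
    """
    blocks = []  # each block is [sum of counts, number of points]
    for x in counts:
        blocks.append([x, 1])
        # Merge while the previous block's mean is below the last block's mean
        # (compared by cross-multiplication to stay in integers).
        while (len(blocks) > 1
               and blocks[-2][0] * blocks[-1][1] < blocks[-1][0] * blocks[-2][1]):
            s, c = blocks.pop()
            blocks[-1][0] += s
            blocks[-1][1] += c
    fitted = []
    for s, c in blocks:
        fitted.extend([Fraction(s, c)] * c)
    return fitted


def make_corrector(noisy_sampler, domain_size, num_learning_samples, rng):
    """Learn a non-increasing distribution from noisy samples, then sample from it."""
    counts = Counter(noisy_sampler() for _ in range(num_learning_samples))
    fitted = nonincreasing_fit([counts[i] for i in range(domain_size)])
    probs = [f / num_learning_samples for f in fitted]
    weights = [float(p) for p in probs]

    def sample():
        # Draw one corrected sample from the learned monotone distribution.
        return rng.choices(range(domain_size), weights=weights, k=1)[0]

    return probs, sample


# Noisy data with a bump at element 2 that breaks monotonicity.
data = iter([0] * 5 + [1] * 2 + [2] * 6 + [3] * 1)
probs, sample = make_corrector(lambda: next(data), 4, 14, random.Random(1))
print(probs)
```

    A corrector of this kind pays the full sample cost of learning up front; the abstract asks when that cost can be avoided.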

    Fast Local Computation Algorithms

    For input $x$, let $F(x)$ denote the set of outputs that are the "legal" answers for a computational problem $F$. Suppose $x$ and members of $F(x)$ are so large that there is not time to read them in their entirety. We propose a model of {\em local computation algorithms} which for a given input $x$, support queries by a user to values of specified locations $y_i$ in a legal output $y \in F(x)$. When more than one legal output $y$ exists for a given $x$, the local computation algorithm should output in a way that is consistent with at least one such $y$. Local computation algorithms are intended to distill the common features of several concepts that have appeared in various algorithmic subfields, including local distributed computation, local algorithms, locally decodable codes, and local reconstruction. We develop a technique, based on known constructions of small sample spaces of $k$-wise independent random variables and Beck's analysis in his algorithmic approach to the Lov{\'{a}}sz Local Lemma, which under certain conditions can be applied to construct local computation algorithms that run in {\em polylogarithmic} time and space. We apply this technique to maximal independent set computations, scheduling radio network broadcasts, hypergraph coloring and satisfying $k$-SAT formulas. Comment: A preliminary version of this paper appeared in ICS 2011, pp. 223-23
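
    The query model can be made concrete with a toy oracle for maximal independent set. The sketch below uses a different, simpler technique than the one in the abstract: it locally simulates greedy MIS under a pseudo-random vertex ranking, with a hash standing in for the shared random bits. It makes no claim about polylogarithmic running time, and the names `make_mis_oracle` and `neighbors` are hypothetical.

```python
import hashlib
from functools import lru_cache


def make_mis_oracle(neighbors, seed="demo"):
    """Return a function in_mis(v) answering "is v in the MIS?" by local exploration.

    `neighbors(v)` is assumed to return the list of vertices adjacent to v.
    All queries are consistent with the single MIS that greedy would build
    when it processes vertices in increasing rank order.
    """

    def rank(v):
        # Deterministic pseudo-random rank derived from the seed and the vertex;
        # the vertex itself breaks ties.
        digest = hashlib.sha256(f"{seed}:{v}".encode()).digest()
        return (int.from_bytes(digest[:8], "big"), v)

    @lru_cache(maxsize=None)  # memoization only saves work; answers do not depend on it
    def in_mis(v):
        # v joins the MIS unless some lower-ranked neighbor already joined.
        for u in sorted(neighbors(v), key=rank):
            if rank(u) < rank(v) and in_mis(u):
                return False
        return True

    return in_mis


# Example: a cycle on 12 vertices, queried one vertex at a time.
n = 12
cycle = lambda v: [(v - 1) % n, (v + 1) % n]
oracle = make_mis_oracle(cycle)
print([v for v in range(n) if oracle(v)])
```

    Because the ranks depend only on the seed and the vertex, two independent copies of the oracle give the same answers regardless of query order, which is the consistency requirement placed on the output $y$.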

    Agnostic proper learning of monotone functions: beyond the black-box correction barrier

    We give the first agnostic, efficient, proper learning algorithm for monotone Boolean functions. Given $2^{\tilde{O}(\sqrt{n}/\varepsilon)}$ uniformly random examples of an unknown function $f:\{\pm 1\}^n \rightarrow \{\pm 1\}$, our algorithm outputs a hypothesis $g:\{\pm 1\}^n \rightarrow \{\pm 1\}$ that is monotone and $(\mathrm{opt} + \varepsilon)$-close to $f$, where $\mathrm{opt}$ is the distance from $f$ to the closest monotone function. The running time of the algorithm (and consequently the size and evaluation time of the hypothesis) is also $2^{\tilde{O}(\sqrt{n}/\varepsilon)}$, nearly matching the lower bound of Blais et al. (RANDOM '15). We also give an algorithm for estimating up to additive error $\varepsilon$ the distance of an unknown function $f$ to monotone using a run-time of $2^{\tilde{O}(\sqrt{n}/\varepsilon)}$. Previously, for both of these problems, sample-efficient algorithms were known, but these algorithms were not run-time efficient. Our work thus closes this gap in our knowledge between the run-time and sample complexity. This work builds upon the improper learning algorithm of Bshouty and Tamon (JACM '96) and the proper semiagnostic learning algorithm of Lange, Rubinfeld, and Vasilyan (FOCS '22), which obtains a non-monotone Boolean-valued hypothesis, then ``corrects'' it to monotone using query-efficient local computation algorithms on graphs. This black-box correction approach can achieve no error better than $2\mathrm{opt} + \varepsilon$ information-theoretically; we bypass this barrier by a) augmenting the improper learner with a convex optimization step, and b) learning and correcting a real-valued function before rounding its values to Boolean. Our real-valued correction algorithm solves the ``poset sorting'' problem of [LRV22] for functions over general posets with non-Boolean labels.