Search CORE

1,559 research outputs found

Agnostic Learning from Tolerant Natural Proofs

Author: Carmosino Marco L.
Impagliazzo Russell
Kabanets Valentine
Kolokolova Antonina
Publication venue: LIPIcs - Leibniz International Proceedings in Informatics. Approximation, Randomization, and Combinatorial Optimization. Algorithms and Techniques (APPROX/RANDOM 2017)
Publication date: 01/01/2017
Field of study

We generalize the "learning algorithms from natural properties" framework of [CIKK16] to get agnostic learning algorithms from natural properties with extra features. We show that if a natural property (in the sense of Razborov and Rudich [RR97]) is useful also against functions that are close to the class of "easy" functions, rather than just against "easy" functions, then it can be used to get an agnostic learning algorithm over the uniform distribution with membership queries. * For AC0[q], any prime q (constant-depth circuits of polynomial size, with AND, OR, NOT, and MODq gates of unbounded fanin), which happens to have a natural property with the requisite extra feature by [Raz87, Smo87, RR97], we obtain the first agnostic learning algorithm for AC0[q], for every prime q. Our algorithm runs in randomized quasi-polynomial time, uses membership queries, and outputs a circuit for a given Boolean function f that agrees with f on all but at most polylog(n)*opt fraction of inputs, where opt is the relative distance between f and the closest function h in the class AC0[q]. * For the ideal case, a natural proof of strongly exponential correlation circuit lower bounds against a circuit class C containing AC0[2] (i.e., circuits of size exp(Omega(n)) cannot compute some n-variate function even with exp(-Omega(n)) advantage over random guessing) would yield a polynomial-time query agnostic learning algorithm for C with the approximation error O(opt)

Dagstuhl Research Online Publication Server

A Complete Characterization of Statistical Query Learning with Applications to Evolvability

Author: Feldman Vitaly
Publication venue
Publication date: 30/09/2012
Field of study

Statistical query (SQ) learning model of Kearns (1993) is a natural restriction of the PAC learning model in which a learning algorithm is allowed to obtain estimates of statistical properties of the examples but cannot see the examples themselves. We describe a new and simple characterization of the query complexity of learning in the SQ learning model. Unlike the previously known bounds on SQ learning our characterization preserves the accuracy and the efficiency of learning. The preservation of accuracy implies that that our characterization gives the first characterization of SQ learning in the agnostic learning framework. The preservation of efficiency is achieved using a new boosting technique and allows us to derive a new approach to the design of evolutionary algorithms in Valiant's (2006) model of evolvability. We use this approach to demonstrate the existence of a large class of monotone evolutionary learning algorithms based on square loss performance estimation. These results differ significantly from the few known evolutionary algorithms and give evidence that evolvability in Valiant's model is a more versatile phenomenon than there had been previous reason to suspect.Comment: Simplified Lemma 3.8 and it's application

arXiv.org e-Print Archive

Elsevier - Publisher Connector

Privately Releasing Conjunctions and the Statistical Query Barrier

Author: Gupta Anupam
Hardt Moritz
Roth Aaron
Ullman Jonathan
Publication venue
Publication date: 01/01/2011
Field of study

Suppose we would like to know all answers to a set of statistical queries C on a data set up to small error, but we can only access the data itself using statistical queries. A trivial solution is to exhaustively ask all queries in C. Can we do any better? + We show that the number of statistical queries necessary and sufficient for this task is---up to polynomial factors---equal to the agnostic learning complexity of C in Kearns' statistical query (SQ) model. This gives a complete answer to the question when running time is not a concern. + We then show that the problem can be solved efficiently (allowing arbitrary error on a small fraction of queries) whenever the answers to C can be described by a submodular function. This includes many natural concept classes, such as graph cuts and Boolean disjunctions and conjunctions. While interesting from a learning theoretic point of view, our main applications are in privacy-preserving data analysis: Here, our second result leads to the first algorithm that efficiently releases differentially private answers to of all Boolean conjunctions with 1% average error. This presents significant progress on a key open problem in privacy-preserving data analysis. Our first result on the other hand gives unconditional lower bounds on any differentially private algorithm that admits a (potentially non-privacy-preserving) implementation using only statistical queries. Not only our algorithms, but also most known private algorithms can be implemented using only statistical queries, and hence are constrained by these lower bounds. Our result therefore isolates the complexity of agnostic learning in the SQ-model as a new barrier in the design of differentially private algorithms

arXiv.org e-Print Archive

CiteSeerX

Sampling Correctors

Author: Canonne Clément
Gouleakis Themis
Rubinfeld Ronitt
Publication venue
Publication date: 31/03/2018
Field of study

In many situations, sample data is obtained from a noisy or imperfect source. In order to address such corruptions, this paper introduces the concept of a sampling corrector. Such algorithms use structure that the distribution is purported to have, in order to allow one to make "on-the-fly" corrections to samples drawn from probability distributions. These algorithms then act as filters between the noisy data and the end user. We show connections between sampling correctors, distribution learning algorithms, and distribution property testing algorithms. We show that these connections can be utilized to expand the applicability of known distribution learning and property testing algorithms as well as to achieve improved algorithms for those tasks. As a first step, we show how to design sampling correctors using proper learning algorithms. We then focus on the question of whether algorithms for sampling correctors can be more efficient in terms of sample complexity than learning algorithms for the analogous families of distributions. When correcting monotonicity, we show that this is indeed the case when also granted query access to the cumulative distribution function. We also obtain sampling correctors for monotonicity without this stronger type of access, provided that the distribution be originally very close to monotone (namely, at a distance

O(1/\log^2 n)

). In addition to that, we consider a restricted error model that aims at capturing "missing data" corruptions. In this model, we show that distributions that are close to monotone have sampling correctors that are significantly more efficient than achievable by the learning approach. We also consider the question of whether an additional source of independent random bits is required by sampling correctors to implement the correction process

arXiv.org e-Print Archive

DSpace@MIT

Moment-Matching Polynomials

Author: Klivans Adam
Meka Raghu
Publication venue
Publication date: 04/01/2013
Field of study

We give a new framework for proving the existence of low-degree, polynomial approximators for Boolean functions with respect to broad classes of non-product distributions. Our proofs use techniques related to the classical moment problem and deviate significantly from known Fourier-based methods, which require the underlying distribution to have some product structure. Our main application is the first polynomial-time algorithm for agnostically learning any function of a constant number of halfspaces with respect to any log-concave distribution (for any constant accuracy parameter). This result was not known even for the case of learning the intersection of two halfspaces without noise. Additionally, we show that in the "smoothed-analysis" setting, the above results hold with respect to distributions that have sub-exponential tails, a property satisfied by many natural and well-studied distributions in machine learning. Given that our algorithms can be implemented using Support Vector Machines (SVMs) with a polynomial kernel, these results give a rigorous theoretical explanation as to why many kernel methods work so well in practice

arXiv.org e-Print Archive

CiteSeerX

Agnostic Membership Query Learning with Nontrivial Savings: New Results, Techniques

Author: Karchmer Ari
Publication venue
Publication date: 11/11/2023
Field of study

(Abridged) Designing computationally efficient algorithms in the agnostic learning model (Haussler, 1992; Kearns et al., 1994) is notoriously difficult. In this work, we consider agnostic learning with membership queries for touchstone classes at the frontier of agnostic learning, with a focus on how much computation can be saved over the trivial runtime of 2^n$. This approach is inspired by and continues the study of ``learning with nontrivial savings'' (Servedio and Tan, 2017). To this end, we establish multiple agnostic learning algorithms, highlighted by: 1. An agnostic learning algorithm for circuits consisting of a sublinear number of gates, which can each be any function computable by a sublogarithmic degree k polynomial threshold function (the depth of the circuit is bounded only by size). This algorithm runs in time 2^{n -s(n)} for s(n) \approx n/(k+1), and learns over the uniform distribution over unlabelled examples on \{0,1\}^n. 2. An agnostic learning algorithm for circuits consisting of a sublinear number of gates, where each can be any function computable by a \sym^+ circuit of subexponential size and sublogarithmic degree k. This algorithm runs in time 2^{n-s(n)} for s(n) \approx n/(k+1), and learns over distributions of unlabelled examples that are products of k+1 arbitrary and unknown distributions, each over \{0,1\}^{n/(k+1)} (assume without loss of generality that k+1 divides n)

arXiv.org e-Print Archive

Testing k-Monotonicity

Author: Grigorescu Elena
Guo Siyao
Kumar Akash
Wimmer Karl
Publication venue: LIPIcs - Leibniz International Proceedings in Informatics. 8th Innovations in Theoretical Computer Science Conference (ITCS 2017)
Publication date: 01/01/2017
Field of study

A Boolean k-monotone function defined over a finite poset domain D alternates between the values 0 and 1 at most k times on any ascending chain in D. Therefore, k-monotone functions are natural generalizations of the classical monotone functions, which are the 1-monotone functions. Motivated by the recent interest in k-monotone functions in the context of circuit complexity and learning theory, and by the central role that monotonicity testing plays in the context of property testing, we initiate a systematic study of k-monotone functions, in the property testing model. In this model, the goal is to distinguish functions that are k-monotone (or are close to being k-monotone) from functions that are far from being k-monotone. Our results include the following: 1. We demonstrate a separation between testing k-monotonicity and testing monotonicity, on the hypercube domain {0,1}^d, for k >= 3; 2. We demonstrate a separation between testing and learning on {0,1}^d, for k=omega(log d): testing k-monotonicity can be performed with 2^{O(sqrt d . log d . log{1/eps})} queries, while learning k-monotone functions requires 2^{Omega(k . sqrt d .{1/eps})} queries (Blais et al. (RANDOM 2015)). 3. We present a tolerant test for functions fcolon[n]^dto {0,1}$with complexity independent of n, which makes progress on a problem left open by Berman et al. (STOC 2014). Our techniques exploit the testing-by-learning paradigm, use novel applications of Fourier analysis on the grid [n]^d, and draw connections to distribution testing techniques. Our techniques exploit the testing-by-learning paradigm, use novel applications of Fourier analysis on the grid [n]^d, and draw connections to distribution testing techniques

Dagstuhl Research Online Publication Server

Auditing: Active Learning with Outcome-Dependent Query Costs

Author: Sabato Sivan
Sarwate Anand D.
Srebro Nathan
Publication venue
Publication date: 12/07/2015
Field of study

We propose a learning setting in which unlabeled data is free, and the cost of a label depends on its value, which is not known in advance. We study binary classification in an extreme case, where the algorithm only pays for negative labels. Our motivation are applications such as fraud detection, in which investigating an honest transaction should be avoided if possible. We term the setting auditing, and consider the auditing complexity of an algorithm: the number of negative labels the algorithm requires in order to learn a hypothesis with low relative error. We design auditing algorithms for simple hypothesis classes (thresholds and rectangles), and show that with these algorithms, the auditing complexity can be significantly lower than the active label complexity. We also discuss a general competitive approach for auditing and possible modifications to the framework.Comment: Corrections in section

arXiv.org e-Print Archive

CiteSeerX

Learning Coverage Functions and Private Release of Marginals

Author: Feldman Vitaly
Kothari Pravesh
Publication venue
Publication date: 27/05/2014
Field of study

We study the problem of approximating and learning coverage functions. A function

c: 2^{[n]} \rightarrow \mathbf{R}^{+}

is a coverage function, if there exists a universe

U

with non-negative weights

w(u)

for each

u \in U

and subsets

A_1, A_2, \ldots, A_n

U

such that

c(S) = \sum_{u \in \cup_{i \in S} A_i} w(u)

. Alternatively, coverage functions can be described as non-negative linear combinations of monotone disjunctions. They are a natural subclass of submodular functions and arise in a number of applications. We give an algorithm that for any

\gamma,\delta>0

, given random and uniform examples of an unknown coverage function

c

, finds a function

h

that approximates

c

within factor

1+\gamma

on all but

\delta

-fraction of the points in time

poly(n,1/\gamma,1/\delta)

. This is the first fully-polynomial algorithm for learning an interesting class of functions in the demanding PMAC model of Balcan and Harvey (2011). Our algorithms are based on several new structural properties of coverage functions. Using the results in (Feldman and Kothari, 2014), we also show that coverage functions are learnable agnostically with excess

\ell_1

-error

\epsilon

over all product and symmetric distributions in time

n^{\log(1/\epsilon)}

. In contrast, we show that, without assumptions on the distribution, learning coverage functions is at least as hard as learning polynomial-size disjoint DNF formulas, a class of functions for which the best known algorithm runs in time

2^{\tilde{O}(n^{1/3})}

(Klivans and Servedio, 2004). As an application of our learning results, we give simple differentially-private algorithms for releasing monotone conjunction counting queries with low average error. In particular, for any

k \leq n

, we obtain private release of

k

-way marginals with average error

\bar{\alpha}

in time

n^{O(\log(1/\bar{\alpha}))}

arXiv.org e-Print Archive

CiteSeerX