
Improved Bounds on Quantum Learning Algorithms

Author: A. Blumer
A. Ehrenfeucht
Alp Atici
C. Bennett
D. Angluin
D. Deutsch
D.R. Simon
E. Bernstein
E. Farhi
L. Hellerstein
L.G. Valiant
M. Boyer
N. Bshouty
N.H. Bshouty
R.A. Servedio
Rocco A. Servedio
Y. Shi
Publication venue: 'Springer Science and Business Media LLC'
Publication date: 18/11/2004

In this article we give several new results on the complexity of algorithms that learn Boolean functions from quantum queries and quantum examples. Hunziker et al. conjectured that for any class C of Boolean functions, the number of quantum black-box queries which are required to exactly identify an unknown function from C is $O(\frac{\log |C|}{\sqrt{\hat{\gamma}^{C}}})$, where $\hat{\gamma}^{C}$ is a combinatorial parameter of the class C. We essentially resolve this conjecture in the affirmative by giving a quantum algorithm that, for any class C, identifies any unknown function from C using $O(\frac{\log |C| \log \log |C|}{\sqrt{\hat{\gamma}^{C}}})$ quantum black-box queries. We consider a range of natural problems intermediate between the exact learning problem (in which the learner must obtain all bits of information about the black-box function) and the usual problem of computing a predicate (in which the learner must obtain only one bit of information about the black-box function). We give positive and negative results on when the quantum and classical query complexities of these intermediate problems are polynomially related to each other. Finally, we improve the known lower bounds on the number of quantum examples (as opposed to quantum black-box queries) required for $(\epsilon,\delta)$-PAC learning any concept class of Vapnik-Chervonenkis dimension d over the domain $\{0,1\}^n$ from $\Omega(\frac{d}{n})$ to $\Omega(\frac{1}{\epsilon}\log \frac{1}{\delta}+d+\frac{\sqrt{d}}{\epsilon})$. This new lower bound comes closer to matching known upper bounds for classical PAC learning.

Comment: Minor corrections. 18 pages. To appear in Quantum Information Processing. Requires: algorithm.sty, algorithmic.sty to buil

arXiv.org e-Print Archive

Crossref

CERN Document Server

Database Learning: Toward a Database that Becomes Smarter Every Time

Author: Acharya S.
Agrawal S.
Bishop C. M.
Carbonell J. G.
Carlson A.
Condie T.
Ganti V.
Idreos S.
Lawrence N.
Meliou A.
Micchelli C. A.
Mozafari B.
Olston C.
Park Y.
Rusu F.
Sarawagi S.
Sidirourgos L.
Skilling J.
Wasserman L.
Williams C. K.
Publication venue: 'Association for Computing Machinery (ACM)'
Publication date: 28/03/2017

In today's databases, previous query answers rarely benefit answering future queries. For the first time, to the best of our knowledge, we change this paradigm in an approximate query processing (AQP) context. We make the following observation: the answer to each query reveals some degree of knowledge about the answer to another query because their answers stem from the same underlying distribution that has produced the entire dataset. Exploiting and refining this knowledge should allow us to answer queries more analytically, rather than by reading enormous amounts of raw data. Also, processing more queries should continuously enhance our knowledge of the underlying distribution, and hence lead to increasingly faster response times for future queries. We call this novel idea---learning from past query answers---Database Learning. We exploit the principle of maximum entropy to produce answers, which are in expectation guaranteed to be more accurate than existing sample-based approximations. Empowered by this idea, we build a query engine on top of Spark SQL, called Verdict. We conduct extensive experiments on real-world query traces from a large customer of a major database vendor. Our results demonstrate that Verdict supports 73.7% of these queries, speeding them up by up to 23.0x for the same accuracy level compared to existing AQP systems.

Comment: This manuscript is an extended report of the work published in ACM SIGMOD conference 201

arXiv.org e-Print Archive

Crossref

A Complete Characterization of Statistical Query Learning with Applications to Evolvability

Author: Feldman Vitaly
Publication date: 30/09/2012

The statistical query (SQ) learning model of Kearns (1993) is a natural restriction of the PAC learning model in which a learning algorithm is allowed to obtain estimates of statistical properties of the examples but cannot see the examples themselves. We describe a new and simple characterization of the query complexity of learning in the SQ learning model. Unlike the previously known bounds on SQ learning, our characterization preserves the accuracy and the efficiency of learning. The preservation of accuracy implies that our characterization gives the first characterization of SQ learning in the agnostic learning framework. The preservation of efficiency is achieved using a new boosting technique and allows us to derive a new approach to the design of evolutionary algorithms in Valiant's (2006) model of evolvability. We use this approach to demonstrate the existence of a large class of monotone evolutionary learning algorithms based on square loss performance estimation. These results differ significantly from the few known evolutionary algorithms and give evidence that evolvability in Valiant's model is a more versatile phenomenon than there had been previous reason to suspect.

Comment: Simplified Lemma 3.8 and its application
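The access restriction described in this abstract can be illustrated with a toy oracle. The following is a minimal sketch, not code from the paper: the class `SQOracle`, the helper `most_correlated_variable`, the tolerance parameter `tau`, and the target function are all hypothetical names chosen for illustration. It also assumes random noise within the tolerance, whereas the SQ model allows any answer within that tolerance.

```python
import random
from itertools import product


class SQOracle:
    """Toy statistical query oracle (hypothetical, for illustration only)."""

    def __init__(self, examples, seed=0):
        self._examples = list(examples)  # hidden from the learner
        self._rng = random.Random(seed)

    def query(self, chi, tau):
        """Return the mean of chi(x, y) over the examples, off by at most tau."""
        mean = sum(chi(x, y) for x, y in self._examples) / len(self._examples)
        # Assumption: noise is drawn at random; the model permits any
        # perturbation of size at most tau, including adversarial ones.
        return mean + self._rng.uniform(-tau, tau)


def most_correlated_variable(oracle, n, tau=0.1):
    """Use n statistical queries to find the bit best correlated with the label."""
    def correlation_query(i):
        # Query function: label times the +/-1 encoding of bit i.
        return lambda x, y: y * (1 if x[i] else -1)

    estimates = [oracle.query(correlation_query(i), tau) for i in range(n)]
    return max(range(n), key=lambda i: estimates[i])


# Hypothetical target: the label is the +/-1 encoding of bit 2.
examples = [(x, 1 if x[2] else -1) for x in product((0, 1), repeat=3)]
oracle = SQOracle(examples)
best = most_correlated_variable(oracle, n=3)
```

The learner only ever calls `oracle.query`, so it sees noisy averages rather than individual labelled examples, which is the restriction the SQ model imposes on PAC learning.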

arXiv.org e-Print Archive

Elsevier - Publisher Connector

Top-Down Induction of Decision Trees: Rigorous Guarantees and Inherent Limitations

Author: Blanc Guy
Lange Jane
Tan Li-Yang
Publication venue: LIPIcs - Leibniz International Proceedings in Informatics. 11th Innovations in Theoretical Computer Science Conference (ITCS 2020)
Publication date: 17/11/2019

Consider the following heuristic for building a decision tree for a function $f : \{0,1\}^n \to \{\pm 1\}$. Place the most influential variable $x_i$ of $f$ at the root, and recurse on the subfunctions $f_{x_i=0}$ and $f_{x_i=1}$ on the left and right subtrees respectively; terminate once the tree is an $\varepsilon$-approximation of $f$. We analyze the quality of this heuristic, obtaining near-matching upper and lower bounds:

$\circ$ Upper bound: For every $f$ with decision tree size $s$ and every $\varepsilon \in (0,\frac1{2})$, this heuristic builds a decision tree of size at most $s^{O(\log(s/\varepsilon)\log(1/\varepsilon))}$.

$\circ$ Lower bound: For every $\varepsilon \in (0,\frac1{2})$ and $s \le 2^{\tilde{O}(\sqrt{n})}$, there is an $f$ with decision tree size $s$ such that this heuristic builds a decision tree of size $s^{\tilde{\Omega}(\log s)}$.

We also obtain upper and lower bounds for monotone functions: $s^{O(\sqrt{\log s}/\varepsilon)}$ and $s^{\tilde{\Omega}(\sqrt[4]{\log s})}$ respectively. The lower bound disproves conjectures of Fiat and Pechyony (2004) and Lee (2009). Our upper bounds yield new algorithms for properly learning decision trees under the uniform distribution. We show that these algorithms---which are motivated by widely employed and empirically successful top-down decision tree learning heuristics such as ID3, C4.5, and CART---achieve provable guarantees that compare favorably with those of the current fastest algorithm (Ehrenfeucht and Haussler, 1989). Our lower bounds shed new light on the limitations of these heuristics. Finally, we revisit the classic work of Ehrenfeucht and Haussler. We extend it to give the first uniform-distribution proper learning algorithm that achieves polynomial sample and memory complexity, while matching its state-of-the-art quasipolynomial runtime.
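The heuristic described in this abstract can be sketched in a few lines for small $n$. The following is an illustrative sketch, not the authors' implementation: the function names are hypothetical, influences are computed by brute-force enumeration, and the stopping rule is a simplification. It stops each branch once the majority label errs on at most an `eps` fraction of that branch's inputs, which guarantees an $\varepsilon$-approximation overall but is stricter than the global criterion in the abstract.

```python
from itertools import product


def restricted_inputs(n, fixed):
    """Yield every x in {0,1}^n that agrees with the partial assignment `fixed`."""
    free = [i for i in range(n) if i not in fixed]
    for bits in product((0, 1), repeat=len(free)):
        x = [0] * n
        for i, b in fixed.items():
            x[i] = b
        for i, b in zip(free, bits):
            x[i] = b
        yield tuple(x)


def influence(f, n, fixed, i):
    """Fraction of consistent inputs on which flipping bit i changes f."""
    flips = total = 0
    for x in restricted_inputs(n, fixed):
        y = list(x)
        y[i] ^= 1
        flips += f(x) != f(tuple(y))
        total += 1
    return flips / total


def build_tree(f, n, eps=0.0, fixed=None):
    """Return a leaf label (+1/-1) or a tuple (variable, left, right)."""
    fixed = dict(fixed or {})
    values = [f(x) for x in restricted_inputs(n, fixed)]
    plus = sum(v == 1 for v in values)
    label = 1 if 2 * plus >= len(values) else -1
    error = min(plus, len(values) - plus) / len(values)
    free = [i for i in range(n) if i not in fixed]
    # Simplified per-branch stopping rule (an assumption of this sketch).
    if error <= eps or not free:
        return label
    # Split on the most influential free variable of the subfunction.
    i = max(free, key=lambda j: influence(f, n, fixed, j))
    left = build_tree(f, n, eps, {**fixed, i: 0})
    right = build_tree(f, n, eps, {**fixed, i: 1})
    return (i, left, right)


def evaluate(tree, x):
    """Follow x down the tree and return the label at the leaf reached."""
    while isinstance(tree, tuple):
        i, left, right = tree
        tree = right if x[i] else left
    return tree


def maj3(x):
    """Hypothetical example function: majority of three bits, as +/-1."""
    return 1 if sum(x) >= 2 else -1


exact_tree = build_tree(maj3, 3, eps=0.0)
```

With `eps=0.0` the tree computes `maj3` exactly; larger values of `eps` let branches stop early. The enumeration is exponential in $n$, so this is only useful for experimenting with small functions.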

arXiv.org e-Print Archive

Dagstuhl Research Online Publication Server

Query Complexity of Approximate Equilibria in Anonymous Games

Author: C Daskalakis
F Brandt
K Etessami
S Hart
X Chen
Y Azrieli
Y Babichenko
Publication date: 05/05/2016

We study the computation of equilibria of anonymous games, via algorithms that may proceed via a sequence of adaptive queries to the game's payoff function, assumed to be unknown initially. The general topic we consider is query complexity, that is, how many queries are necessary or sufficient to compute an exact or approximate Nash equilibrium. We show that exact equilibria cannot be found via query-efficient algorithms. We also give an example of a 2-strategy, 3-player anonymous game that does not have any exact Nash equilibrium in rational numbers. However, more positive query-complexity bounds are attainable if either further symmetries of the utility functions are assumed or we focus on approximate equilibria. We investigate four sub-classes of anonymous games previously considered by \cite{bfh09, dp14}. Our main result is a new randomized query-efficient algorithm that finds an $O(n^{-1/4})$-approximate Nash equilibrium querying $\tilde{O}(n^{3/2})$ payoffs and runs in time $\tilde{O}(n^{3/2})$. This improves on the running time of pre-existing algorithms for approximate equilibria of anonymous games, and is the first one to obtain an inverse polynomial approximation in poly-time. We also show how this can be utilized as an efficient polynomial-time approximation scheme (PTAS). Furthermore, we prove that $\Omega(n \log{n})$ payoffs must be queried in order to find any $\epsilon$-well-supported Nash equilibrium, even by randomized algorithms.

arXiv.org e-Print Archive

Crossref

Oxford University Research Archive

Privately Releasing Conjunctions and the Statistical Query Barrier

Author: Gupta Anupam
Hardt Moritz
Roth Aaron
Ullman Jonathan
Publication date: 01/01/2011

Suppose we would like to know all answers to a set of statistical queries C on a data set up to small error, but we can only access the data itself using statistical queries. A trivial solution is to exhaustively ask all queries in C. Can we do any better?

+ We show that the number of statistical queries necessary and sufficient for this task is---up to polynomial factors---equal to the agnostic learning complexity of C in Kearns' statistical query (SQ) model. This gives a complete answer to the question when running time is not a concern.

+ We then show that the problem can be solved efficiently (allowing arbitrary error on a small fraction of queries) whenever the answers to C can be described by a submodular function. This includes many natural concept classes, such as graph cuts and Boolean disjunctions and conjunctions.

While interesting from a learning theoretic point of view, our main applications are in privacy-preserving data analysis: Here, our second result leads to the first algorithm that efficiently releases differentially private answers to all Boolean conjunctions with 1% average error. This presents significant progress on a key open problem in privacy-preserving data analysis. Our first result on the other hand gives unconditional lower bounds on any differentially private algorithm that admits a (potentially non-privacy-preserving) implementation using only statistical queries. Not only our algorithms, but also most known private algorithms can be implemented using only statistical queries, and hence are constrained by these lower bounds. Our result therefore isolates the complexity of agnostic learning in the SQ-model as a new barrier in the design of differentially private algorithms.

arXiv.org e-Print Archive

CiteSeerX