
8 research outputs found

MCMC Learning

Author: Kanade Varun
Mossel Elchanan
Publication date: 12/06/2015

The theory of learning under the uniform distribution is rich and deep, with connections to cryptography, computational complexity, and the analysis of Boolean functions, to name a few areas. This theory, however, is very limited because the uniform distribution and the corresponding Fourier basis are rarely encountered as a statistical model. A family of distributions that vastly generalizes the uniform distribution on the Boolean cube is that of distributions represented by Markov Random Fields (MRFs). Markov Random Fields are one of the main tools for modeling high-dimensional data in many areas of statistics and machine learning. In this paper we initiate the investigation of extending central ideas, methods and algorithms from the theory of learning under the uniform distribution to the setup of learning concepts given examples from MRF distributions. In particular, our results establish a novel connection between properties of MCMC sampling of MRFs and learning under the MRF distribution. Comment: 28 pages, 1 figure
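As a rough illustration of what "MCMC sampling of MRFs" can look like in the simplest case, the sketch below runs single-site Gibbs (Glauber) updates on an Ising-type MRF over spins in {-1, +1}. It is not taken from the paper: the Ising form, the function names `conditional_plus_prob` and `gibbs_sample`, and the parameters `beta` and `steps` are all assumptions chosen for illustration. With `beta = 0` every update is a fair coin flip, which recovers the uniform distribution on the cube.

```python
import math
import random


def conditional_plus_prob(field):
    """P(x_i = +1 | neighbours) for an Ising model with local field `field`."""
    return 1.0 / (1.0 + math.exp(-2.0 * field))


def gibbs_sample(edges, n, beta, steps, rng):
    """Run `steps` single-site Gibbs updates on P(x) ~ exp(beta * sum_{(u,v)} x_u x_v).

    Hypothetical helper for illustration; `edges` is a list of (u, v) pairs
    on vertices 0..n-1 and `rng` is a random.Random instance.
    """
    nbrs = [[] for _ in range(n)]
    for u, v in edges:
        nbrs[u].append(v)
        nbrs[v].append(u)
    # Start from a uniformly random configuration.
    x = [rng.choice((-1, 1)) for _ in range(n)]
    for _ in range(steps):
        i = rng.randrange(n)                       # pick a site uniformly
        field = beta * sum(x[j] for j in nbrs[i])  # local field from neighbours
        x[i] = 1 if rng.random() < conditional_plus_prob(field) else -1
    return x


if __name__ == "__main__":
    path = [(i, i + 1) for i in range(4)]          # a 5-vertex path graph
    print(gibbs_sample(path, 5, 0.5, 1000, random.Random(0)))
```

How many steps such a chain needs before its samples are close to the MRF distribution depends on the model; the sketch makes no claim about mixing time.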

arXiv.org e-Print Archive

CiteSeerX

Top-Down Induction of Decision Trees: Rigorous Guarantees and Inherent Limitations

Author: Blanc Guy
Lange Jane
Tan Li-Yang
Publication venue: LIPIcs - Leibniz International Proceedings in Informatics. 11th Innovations in Theoretical Computer Science Conference (ITCS 2020)
Publication date: 17/11/2019

Consider the following heuristic for building a decision tree for a function $f : \{0,1\}^n \to \{\pm 1\}$. Place the most influential variable $x_i$ of $f$ at the root, and recurse on the subfunctions $f_{x_i=0}$ and $f_{x_i=1}$ on the left and right subtrees respectively; terminate once the tree is an $\varepsilon$-approximation of $f$. We analyze the quality of this heuristic, obtaining near-matching upper and lower bounds:

$\circ$ Upper bound: For every $f$ with decision tree size $s$ and every $\varepsilon \in (0,\frac{1}{2})$, this heuristic builds a decision tree of size at most $s^{O(\log(s/\varepsilon)\log(1/\varepsilon))}$.

$\circ$ Lower bound: For every $\varepsilon \in (0,\frac{1}{2})$ and $s \le 2^{\tilde{O}(\sqrt{n})}$, there is an $f$ with decision tree size $s$ such that this heuristic builds a decision tree of size $s^{\tilde{\Omega}(\log s)}$.

We also obtain upper and lower bounds for monotone functions: $s^{O(\sqrt{\log s}/\varepsilon)}$ and $s^{\tilde{\Omega}(\sqrt[4]{\log s})}$ respectively. The lower bound disproves conjectures of Fiat and Pechyony (2004) and Lee (2009). Our upper bounds yield new algorithms for properly learning decision trees under the uniform distribution. We show that these algorithms, which are motivated by widely employed and empirically successful top-down decision tree learning heuristics such as ID3, C4.5, and CART, achieve provable guarantees that compare favorably with those of the current fastest algorithm (Ehrenfeucht and Haussler, 1989). Our lower bounds shed new light on the limitations of these heuristics. Finally, we revisit the classic work of Ehrenfeucht and Haussler. We extend it to give the first uniform-distribution proper learning algorithm that achieves polynomial sample and memory complexity, while matching its state-of-the-art quasipolynomial runtime.
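The heuristic described in this abstract can be sketched for small $n$ by brute-force enumeration. The code below is an illustrative simplification, not the authors' algorithm: it stops locally, turning a subcube into a leaf once the majority label errs on at most an `eps` fraction of that subcube (which keeps the overall error at most `eps`), whereas the abstract states a global stopping rule. All function names, and the example function `example_f`, are hypothetical.

```python
from itertools import product


def subcube(n, restriction):
    """Yield every x in {0,1}^n that agrees with the fixed bits in `restriction`."""
    free = [j for j in range(n) if j not in restriction]
    for bits in product((0, 1), repeat=len(free)):
        x = [0] * n
        for j, b in restriction.items():
            x[j] = b
        for j, b in zip(free, bits):
            x[j] = b
        yield tuple(x)


def influence(f, n, i, restriction):
    """Fraction of subcube inputs on which flipping bit i changes f (i must be free)."""
    points = list(subcube(n, restriction))
    changed = 0
    for x in points:
        y = x[:i] + (1 - x[i],) + x[i + 1:]
        changed += f(x) != f(y)
    return changed / len(points)


def build_tree(f, n, eps, restriction=None):
    """Return a leaf label (+1/-1) or a tuple (variable, subtree_for_0, subtree_for_1)."""
    restriction = restriction or {}
    values = [f(x) for x in subcube(n, restriction)]
    pos = sum(v == 1 for v in values)
    majority = 1 if 2 * pos >= len(values) else -1
    error = min(pos, len(values) - pos) / len(values)
    free = [j for j in range(n) if j not in restriction]
    if error <= eps or not free:
        return majority
    # Query the most influential free variable of the restricted function.
    i = max(free, key=lambda j: influence(f, n, j, restriction))
    return (i,
            build_tree(f, n, eps, {**restriction, i: 0}),
            build_tree(f, n, eps, {**restriction, i: 1}))


def evaluate(tree, x):
    """Follow the tree on input x and return the leaf label."""
    while isinstance(tree, tuple):
        i, left, right = tree
        tree = right if x[i] else left
    return tree


def num_leaves(tree):
    """Size of the tree, measured in leaves."""
    if not isinstance(tree, tuple):
        return 1
    return num_leaves(tree[1]) + num_leaves(tree[2])


def example_f(x):
    """Hypothetical test function: x0 AND (x1 OR x2), with values in {+1, -1}."""
    return 1 if x[0] and (x[1] or x[2]) else -1
```

For `example_f`, the variable `x0` has influence 3/4 while `x1` and `x2` each have influence 1/4, so `build_tree(example_f, 3, 0.0)` puts `x0` at the root. The enumeration is exponential in `n`, so this is only useful for seeing the heuristic on toy functions.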

arXiv.org e-Print Archive

Dagstuhl Research Online Publication Server

Advances in Functional Encryption

Author: Wee Hoeteck
Publication venue: HAL CCSD
Publication date: 01/07/2016

Functional encryption is a novel paradigm for public-key encryption that enables both fine-grained access control and selective computation on encrypted data, as is necessary to protect big, complex data in the cloud. In this thesis, I provide a brief introduction to functional encryption, and an overview of my contributions to the area.

INRIA a CCSD electronic archive server

Optimal Cryptographic Hardness of Learning Monotone Functions

Author: A. Blum
A. Healy
A. Klivans
A. Razborov
E. Mossel
L. Valiant
M. Kharitonov
M. Naor
M.J. Kearns
N. Bshouty
N. Linial
R. O’Donnell
R. Servedio
Y. Mansour
Publication date: 01/01/2008

A wide range of positive and negative results have been established for learning different classes of Boolean functions from uniformly distributed random examples. However, polynomial-time algorithms have thus far been obtained almost exclusively for various classes of monotone functions, while the computational hardness results obtained to date have all been for various classes of general (nonmonotone) functions. Motivated by this disparity between known positive results (for monotone functions) and negative results (for nonmonotone functions), we establish strong computational limitations on the efficient learnability of various classes of monotone functions. We give several such hardness results which are provably almost optimal since they nearly match known positive results. Some of our results show cryptographic hardness of learning polynomial-size monotone circuits to accuracy only slightly greater than 1/2 + 1/√n; this accuracy bound is close to optimal by known positive results (Blum et al., FOCS ’98). Other results show that under a plausible cryptographic hardness assumption, a class of constant-depth, sub-polynomial-size circuits computing monotone functions is hard to learn; this result is close to optimal in terms of the circuit size parameter by known positive results as well (Servedio, Information and Computation ’04). Our main tool is a complexity-theoretic approach to hardness amplification via noise sensitivity of monotone functions that was pioneered by O’Donnell (JCSS ’04).
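The abstract names noise sensitivity as its main tool without defining it. Under the standard definition (an assumption here, since the abstract gives none), the noise sensitivity of f at rate δ is the probability that f(x) ≠ f(y), where x is uniform on {0,1}^n and y is obtained from x by flipping each bit independently with probability δ. A minimal brute-force sketch for small n, with the hypothetical function name `noise_sensitivity`, might look like this:

```python
from itertools import product


def noise_sensitivity(f, n, delta):
    """Exact Pr[f(x) != f(y)] for uniform x and y = x with each bit flipped w.p. delta."""
    total = 0.0
    for x in product((0, 1), repeat=n):
        for flips in product((0, 1), repeat=n):
            k = sum(flips)
            # Probability of this exact flip pattern.
            prob = (delta ** k) * ((1.0 - delta) ** (n - k))
            y = tuple(a ^ b for a, b in zip(x, flips))
            if f(x) != f(y):
                total += prob
    return total / 2 ** n


def dictator(x):
    """f(x) = x_0, a monotone function."""
    return x[0]


def parity(x):
    """XOR of all bits, a non-monotone function."""
    return sum(x) % 2
```

For the dictator function the result equals δ, and for parity on n bits it equals (1 − (1 − 2δ)^n)/2. The enumeration costs 4^n evaluations, so it is only meant for checking small cases by hand.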

CiteSeerX

City University of New York

Crossref


Unconditional Lower Bounds in Complexity Theory

Author: Carboni Oliveira Igor
Publication venue: Columbia University Libraries/Information Services
Publication date: 01/01/2015

This work investigates the hardness of solving natural computational problems according to different complexity measures. Our results and techniques span several areas in theoretical computer science and discrete mathematics. They have in common the following aspects: (i) the results are unconditional, i.e., they rely on no unproven hardness assumption from complexity theory; (ii) the corresponding lower bounds are essentially optimal. Among our contributions, we highlight the following results.

Constraint Satisfaction Problems and Monotone Complexity. We introduce a natural formulation of the satisfiability problem as a monotone function, and prove a near-optimal 2^{Ω(n/log n)} lower bound on the size of monotone formulas solving k-SAT on n-variable instances (for a large enough k ∈ ℕ). More generally, we investigate constraint satisfaction problems according to the geometry of their constraints, i.e., as a function of the hypergraph describing which variables appear in each constraint. Our results show in a certain technical sense that the monotone circuit depth complexity of the satisfiability problem is polynomially related to the tree-width of the corresponding graphs.

Interactive Protocols and Communication Complexity. We investigate interactive compression protocols, a hybrid model between computational complexity and communication complexity. We prove that the communication complexity of the Majority function on n-bit inputs with respect to Boolean circuits of size s and depth d extended with modulo p gates is precisely n/log^{Θ(d)} s, where p is a fixed prime number, and d ∈ ℕ. Further, we establish a strong round-separation theorem for bounded-depth circuits, showing that (r+1)-round protocols can be substantially more efficient than r-round protocols, for every r ∈ ℕ.

Negations in Computational Learning Theory. We study the learnability of circuits containing a given number of negation gates, a measure that interpolates between monotone functions and the class of all functions. Let C^t_n be the class of Boolean functions on n input variables that can be computed by Boolean circuits with at most t negations. We prove that any algorithm that learns every f ∈ C^t_n with membership queries according to the uniform distribution to accuracy ε has query complexity 2^{Ω(2^t sqrt(n)/ε)} (for a large range of these parameters). Moreover, we give an algorithm that learns C^t_n from random examples only, and with a running time that essentially matches this information-theoretic lower bound.

Negations in Theory of Cryptography. We investigate the power of negation gates in cryptography and related areas, and prove that many basic cryptographic primitives require essentially the maximum number of negations among all Boolean functions. In other words, cryptography is highly non-monotone. Our results rely on a variety of techniques, and give near-optimal lower bounds for pseudorandom functions, error-correcting codes, hardcore predicates, randomness extractors, and small-bias generators.

Algorithms versus Circuit Lower Bounds. We strengthen a few connections between algorithms and circuit lower bounds. We show that the design of faster algorithms in some widely investigated learning models would imply new unconditional lower bounds in complexity theory. In addition, we prove that the existence of non-trivial satisfiability algorithms for certain classes of Boolean circuits of depth d+2 leads to lower bounds for the corresponding class of circuits of depth d. These results show that either there are no faster algorithms for some computational tasks, or certain circuit lower bounds hold.

Columbia University Academic Commons