Tight Hardness Results for Training Depth-2 ReLU Networks

Author: Goel Surbhi
Klivans Adam
Manurangsi Pasin
Reichman Daniel
Publication venue: LIPIcs - Leibniz International Proceedings in Informatics. 12th Innovations in Theoretical Computer Science Conference (ITCS 2021)
Publication date: 26/11/2020

We prove several hardness results for training depth-2 neural networks with the ReLU activation function; these networks are simply weighted sums (that may include negative coefficients) of ReLUs. Our goal is to output a depth-2 neural network that minimizes the square loss with respect to a given training set. We prove that this problem is NP-hard already for a network with a single ReLU. We also prove NP-hardness for outputting a weighted sum of $k$ ReLUs minimizing the squared error (for $k>1$) even in the realizable setting (i.e., when the labels are consistent with an unknown depth-2 ReLU network). We are also able to obtain lower bounds on the running time in terms of the desired additive error $\epsilon$. To obtain our lower bounds, we use the Gap Exponential Time Hypothesis (Gap-ETH) as well as a new hypothesis regarding the hardness of approximating the well-known Densest $\kappa$-Subgraph problem in subexponential time (these hypotheses are used separately in proving different lower bounds). For example, we prove that under reasonable hardness assumptions, any proper learning algorithm for finding the best fitting ReLU must run in time exponential in $1/\epsilon^2$. Together with a previous work regarding improperly learning a ReLU (Goel et al., COLT'17), this implies the first separation between proper and improper algorithms for learning a ReLU. We also study the problem of properly learning a depth-2 network of ReLUs with bounded weights, giving new (worst-case) upper bounds on the running time needed to learn such networks both in the realizable and agnostic settings. Our upper bounds on the running time essentially match our lower bounds in terms of the dependency on $\epsilon$.
Comment: To appear in ITCS'2
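The objective described in this abstract can be made concrete with a minimal sketch. It assumes a bias-free network of the form $\sum_j a_j \,\mathrm{ReLU}(\langle w_j, x \rangle)$ and a loss averaged over the training set; both conventions, the function names, and the toy data are illustrative assumptions rather than details taken from the paper. The sketch only evaluates the loss of a given network and does not attempt the (NP-hard) minimization.

```python
from typing import List, Sequence, Tuple

Vector = Sequence[float]


def relu(z: float) -> float:
    """Rectified linear unit: max(0, z)."""
    return max(0.0, z)


def depth2_relu(x: Vector, weights: List[Vector], coeffs: Sequence[float]) -> float:
    """Weighted sum of ReLUs: sum_j a_j * relu(<w_j, x>).

    Assumption: no bias terms; coefficients a_j may be negative.
    """
    total = 0.0
    for a, w in zip(coeffs, weights):
        inner = sum(wi * xi for wi, xi in zip(w, x))
        total += a * relu(inner)
    return total


def square_loss(samples: List[Tuple[Vector, float]],
                weights: List[Vector],
                coeffs: Sequence[float]) -> float:
    """Average squared error over the training set (averaging is an assumption)."""
    errors = [(depth2_relu(x, weights, coeffs) - y) ** 2 for x, y in samples]
    return sum(errors) / len(errors)


# Toy realizable instance: labels come from f(x) = relu(x1) - relu(x2), so k = 2.
target_w = [(1.0, 0.0), (0.0, 1.0)]
target_a = [1.0, -1.0]
points = [(1.0, 2.0), (-1.0, 3.0), (2.0, -1.0), (0.0, 0.0)]
samples = [(x, depth2_relu(x, target_w, target_a)) for x in points]

loss_target = square_loss(samples, target_w, target_a)          # the generating network
loss_single = square_loss(samples, [(1.0, 0.0)], [1.0])         # one ReLU only
```

In the realizable setting the generating network attains zero loss by construction, while a smaller hypothesis (here a single ReLU) generally does not; the hardness results concern finding a minimizer, not evaluating this quantity.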

arXiv.org e-Print Archive

Dagstuhl Research Online Publication Server

Pre-Reduction Graph Products: Hardnesses of Properly Learning DFAs and Approximating EDP on DAGs

Author: Chalermsook Parinya
Laekhanukit Bundit
Nanongkai Danupon
Publication date: 01/01/2014

The study of graph products is a major research topic and typically concerns the term $f(G*H)$, e.g., to show that $f(G*H)=f(G)f(H)$. In this paper, we study graph products in a non-standard form $f(R[G*H])$ where $R$ is a "reduction", a transformation of any graph into an instance of an intended optimization problem. We resolve some open problems as applications. (1) A tight $n^{1-\epsilon}$-approximation hardness for the minimum consistent deterministic finite automaton (DFA) problem, where $n$ is the sample size. Due to Board and Pitt [Theoretical Computer Science 1992], this implies the hardness of properly learning DFAs assuming $NP\neq RP$ (the weakest possible assumption). (2) A tight $n^{1/2-\epsilon}$ hardness for the edge-disjoint paths (EDP) problem on directed acyclic graphs (DAGs), where $n$ denotes the number of vertices. (3) A tight hardness of packing vertex-disjoint $k$-cycles for large $k$. (4) An alternative (and perhaps simpler) proof for the hardness of properly learning DNF, CNF and intersection of halfspaces [Alekhnovich et al., FOCS 2004 and J. Comput. Syst. Sci. 2008].

arXiv.org e-Print Archive

Crossref

MPG.PuRe

Learning and selection

Author: Kingsbury Justine
Publication venue: Springer Science and Business Media LLC
Publication date: 01/01/2008

Are learning processes selection processes? This paper takes a slightly modified version of the account of selection presented in Hull et al. (Behav Brain Sci 24:511–527, 2001) and asks whether it applies to learning processes. The answer is that although some learning processes are selectional, many are not. This has consequences for teleological theories of mental content. According to these theories, mental states have content in virtue of having proper functions, and they have proper functions in virtue of being the products of selection processes. For some mental states, it is plausible that the relevant selection process is natural selection, but there are many for which it is not plausible. One response to this (due to David Papineau) is to suggest that the learning processes by which we acquire non-innate mental states are selection processes and can therefore confer proper functions on mental states. This paper considers two ways in which this response could be elaborated, and argues that neither of them succeeds: the teleosemanticist cannot rely on the claim that learning processes are selection processes in order to justify the attribution of proper functions to beliefs.

Research Commons@Waikato

Order-Revealing Encryption and the Hardness of Private Learning

Author: A Beimel
A Blum
A Boldyreva
A Gupta
B Chor
C Dwork
D Boneh
J Groth
J Thaler
J Ullman
L Pitt
LG Valiant
M Kearns
M Kharitonov
O Goldreich
O Pandey
RA Servedio
S Garg
S Goldwasser
SP Kasiviswanathan
T Graepel
Z Brakerski
Publication date: 01/01/2015

An order-revealing encryption scheme gives a public procedure by which two ciphertexts can be compared to reveal the ordering of their underlying plaintexts. We show how to use order-revealing encryption to separate computationally efficient PAC learning from efficient $(\epsilon, \delta)$-differentially private PAC learning. That is, we construct a concept class that is efficiently PAC learnable, but for which every efficient learner fails to be differentially private. This answers a question of Kasiviswanathan et al. (FOCS '08, SIAM J. Comput. '11). To prove our result, we give a generic transformation from an order-revealing encryption scheme into one with strongly correct comparison, which enables the consistent comparison of ciphertexts that are not obtained as the valid encryption of any message. We believe this construction may be of independent interest.
Comment: 28 page

arXiv.org e-Print Archive

CiteSeerX

Crossref

From average case complexity to improper learning complexity

Author: Berthet Q.
Daniely A.
Feige U.
Vapnik V. N.
Publication date: 01/01/2014

The basic problem in the PAC model of computational learning theory is to determine which hypothesis classes are efficiently learnable. There is presently a dearth of results showing hardness of learning problems. Moreover, the existing lower bounds fall short of the best known algorithms. The biggest challenge in proving complexity results is to establish hardness of improper learning (a.k.a. representation independent learning). The difficulty in proving lower bounds for improper learning is that the standard reductions from $\mathbf{NP}$-hard problems do not seem to apply in this context. There is essentially only one known approach to proving lower bounds on improper learning. It was initiated in (Kearns and Valiant 89) and relies on cryptographic assumptions. We introduce a new technique for proving hardness of improper learning, based on reductions from problems that are hard on average. We put forward a (fairly strong) generalization of Feige's assumption (Feige 02) about the complexity of refuting random constraint satisfaction problems. Combining this assumption with our new technique yields far-reaching implications. In particular, 1. Learning $\mathrm{DNF}$'s is hard. 2. Agnostically learning halfspaces with a constant approximation ratio is hard. 3. Learning an intersection of $\omega(1)$ halfspaces is hard.
Comment: 34 page

arXiv.org e-Print Archive

CiteSeerX

Crossref