
    Input Fast-Forwarding for Better Deep Learning

    This paper introduces a new architectural framework, known as input fast-forwarding, that can enhance the performance of deep networks. The main idea is to incorporate a parallel path that sends representations of input values forward to deeper network layers. This scheme is substantially different from "deep supervision", in which the loss layer is re-introduced at earlier layers. The parallel path provided by fast-forwarding enhances the training process in two ways. First, it enables the individual layers to combine higher-level information (from the standard processing path) with lower-level information (from the fast-forward path). Second, the new architecture substantially reduces the problem of vanishing gradients, because the fast-forward path provides a shorter route for gradient backpropagation. To evaluate the utility of the proposed technique, a Fast-Forward Network (FFNet), with 20 convolutional layers along with parallel fast-forward paths, has been created and tested. The paper presents empirical results demonstrating the improved learning capacity of FFNet due to fast-forwarding, as compared to GoogLeNet (with deep supervision) and CaffeNet, which are 4x and 18x larger in size, respectively. All of the source code and deep learning models described in this paper will be made available to the entire research community.
    Comment: Accepted at the 14th International Conference on Image Analysis and Recognition (ICIAR) 2017, Montreal, Canada
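The fast-forwarding idea described above can be illustrated with a minimal NumPy sketch (not the paper's actual FFNet architecture): the raw input is pooled down to the spatial size of a deeper layer's feature map, concatenated along the channel axis, and mixed by a toy 1x1 convolution. The function names, shapes, and the choice of average pooling and channel concatenation are illustrative assumptions.

```python
import numpy as np

def relu(x):
    return np.maximum(x, 0.0)

def avg_pool2x2(x):
    # x: (H, W, C) -> (H//2, W//2, C) via simple 2x2 average pooling
    H, W, C = x.shape
    return x.reshape(H // 2, 2, W // 2, 2, C).mean(axis=(1, 3))

def fast_forward_block(features, raw_input, weights):
    """Combine a deeper layer's features (standard path) with a
    fast-forwarded representation of the raw input (parallel path).
    The input is pooled until its spatial size matches the feature map,
    concatenated channel-wise, then mixed by a 1x1 conv (a per-pixel
    linear map, implemented as a matmul over the channel axis)."""
    ff = raw_input
    while ff.shape[0] > features.shape[0]:
        ff = avg_pool2x2(ff)  # match spatial resolution
    combined = np.concatenate([features, ff], axis=-1)  # channel concat
    return relu(combined @ weights)

rng = np.random.default_rng(0)
x = rng.standard_normal((32, 32, 3))         # raw input (hypothetical shape)
feats = rng.standard_normal((8, 8, 16))      # deeper-layer feature map
W = rng.standard_normal((16 + 3, 32)) * 0.1  # 1x1 conv weights
out = fast_forward_block(feats, x, W)
print(out.shape)  # (8, 8, 32)
```

Because the pooled input reaches the deep layer directly, gradients flowing back to the input skip the stacked convolutions on the standard path, which is the shorter backpropagation route the abstract refers to.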

    From average case complexity to improper learning complexity

    The basic problem in the PAC model of computational learning theory is to determine which hypothesis classes are efficiently learnable. There is presently a dearth of results showing hardness of learning problems. Moreover, the existing lower bounds fall short of the best known algorithms. The biggest challenge in proving complexity results is to establish hardness of improper learning (a.k.a. representation-independent learning). The difficulty in proving lower bounds for improper learning is that the standard reductions from NP-hard problems do not seem to apply in this context. There is essentially only one known approach to proving lower bounds on improper learning. It was initiated in (Kearns and Valiant 89) and relies on cryptographic assumptions. We introduce a new technique for proving hardness of improper learning, based on reductions from problems that are hard on average. We put forward a (fairly strong) generalization of Feige's assumption (Feige 02) about the complexity of refuting random constraint satisfaction problems. Combining this assumption with our new technique yields far-reaching implications. In particular:
    1. Learning DNFs is hard.
    2. Agnostically learning halfspaces with a constant approximation ratio is hard.
    3. Learning an intersection of ω(1) halfspaces is hard.
    Comment: 34 pages

    Coleman-Weinberg Potential In Good Agreement With WMAP

    We briefly summarize and update a class of inflationary models from the early eighties based on a quartic (Coleman-Weinberg) potential for a gauge singlet scalar (inflaton) field. For vacuum energy scales comparable to the grand unification scale, the scalar spectral index n_s = 0.94-0.97, in very good agreement with the WMAP three-year results. The tensor-to-scalar ratio r <~ 0.14, while the running of the spectral index, alpha = dn_s/d ln k, is =~ -10^-3. An SO(10) version naturally explains the observed baryon asymmetry via non-thermal leptogenesis.
    Comment: v1: 6 pages, 1 table. v2: minor corrections. v3: 8 pages, added some details, comments, references and 3 figures. v4: minor corrections, published version
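For reference, the quartic Coleman-Weinberg potential underlying this class of models takes the standard radiatively generated form (this sketch fills in the textbook expression, which the abstract does not spell out; A is the usual CW coupling and M the vacuum expectation value, with the constant term chosen so that V(M) = 0):

```latex
V(\phi) = A\,\phi^4 \left[ \ln\!\left(\frac{\phi}{M}\right) - \frac{1}{4} \right] + \frac{A M^4}{4}
```

The vacuum energy V(0) = A M^4 / 4 is what the abstract compares to the grand unification scale.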