4,293 research outputs found

Nonparametric Weight Initialization of Neural Networks via Integral Representation

Authors: Murata Noboru, Sonoda Sho
Publication date: 19/02/2014

A new initialization method for hidden parameters in a neural network is proposed. A nonparametric probability distribution of hidden parameters is derived from the integral representation of the neural network. In this proposal, hidden parameters are initialized with samples drawn from this distribution, and output parameters are fitted by ordinary linear regression. Numerical experiments show that backpropagation with the proposed initialization converges faster than with uniformly random initialization. In some cases, the proposed method also achieves sufficient accuracy by itself, without backpropagation. Comment: For ICLR2014, revised into 9 pages; revised into 12 pages (with supplements

arXiv.org e-Print Archive

CiteSeerX
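
The two-step procedure in the abstract above (sample the hidden parameters, then fit the output parameters by ordinary linear regression) can be illustrated with a minimal NumPy sketch. The abstract does not specify the nonparametric distribution derived from the integral representation, so a standard normal is used here purely as a placeholder assumption. The function names `init_by_sampling_and_regression` and `predict`, the `tanh` activation, and the toy sine data are also hypothetical and not taken from the paper.

```python
import numpy as np

def init_by_sampling_and_regression(X, y, n_hidden, rng):
    """Sample hidden parameters, then fit output weights by least squares."""
    n_features = X.shape[1]
    # ASSUMPTION: a standard normal stands in for the paper's nonparametric
    # distribution, which the abstract does not describe in detail.
    W = rng.standard_normal((n_features, n_hidden))
    b = rng.standard_normal(n_hidden)
    H = np.tanh(X @ W + b)                          # hidden activations
    H1 = np.hstack([H, np.ones((len(X), 1))])       # append an output bias column
    # Ordinary linear regression for the output parameters.
    beta, *_ = np.linalg.lstsq(H1, y, rcond=None)
    return W, b, beta

def predict(X, W, b, beta):
    """Forward pass of the one-hidden-layer network."""
    H = np.tanh(X @ W + b)
    return np.hstack([H, np.ones((len(X), 1))]) @ beta

# Toy regression problem (illustrative data only).
rng = np.random.default_rng(0)
X = np.linspace(-3.0, 3.0, 200).reshape(-1, 1)
y = np.sin(X[:, 0])

W, b, beta = init_by_sampling_and_regression(X, y, n_hidden=50, rng=rng)
mse = float(np.mean((predict(X, W, b, beta) - y) ** 2))
print("training MSE before any backpropagation:", mse)
```

In this sketch the parameters `W`, `b`, and `beta` would then serve as the starting point for backpropagation; because the output layer is already a least-squares fit, the initial error cannot exceed that of predicting the mean of `y`.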

Historical Document Image Segmentation with LDA-Initialized Deep Neural Networks

Authors: Alberti Michele, Ingold Rolf, Liwicki Marcus, Pondenkandath Vinaychandran, Seuret Mathias
Publication venue: 'Association for Computing Machinery (ACM)'
Publication date: 19/10/2017

In this paper, we present a novel approach to layer-wise weight initialization of deep neural networks using Linear Discriminant Analysis (LDA). Typically, the weights of a deep neural network are initialized with random values, with greedy layer-wise pre-training (usually as a Deep Belief Network or as an auto-encoder), or by re-using the layers from another network (transfer learning). Hence, many training epochs are needed before meaningful weights are learned, or a rather similar dataset is required to seed fine-tuning in transfer learning. We describe how to turn an LDA into either a neural layer or a classification layer. We analyze the initialization technique on historical documents. First, we show that an LDA-based initialization is quick and leads to a very stable initialization. Furthermore, for the task of layout analysis at pixel level, we investigate the effectiveness of LDA-based initialization and show that it outperforms state-of-the-art random weight initialization methods. Comment: 5 page

arXiv.org e-Print Archive

Crossref
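
The idea in the abstract above of turning an LDA into a neural layer can be illustrated with a minimal NumPy sketch. It uses textbook Fisher LDA (within-class and between-class scatter matrices, then a generalized eigenproblem) to fill the weights and bias of one linear layer. This is an assumption about what such an initialization might look like, not the authors' exact construction; the function name `lda_layer_init`, the small ridge term, and the toy two-class data are hypothetical.

```python
import numpy as np

def lda_layer_init(X, labels, n_out):
    """Return (W, b) for a linear layer z = X @ W + b from Fisher LDA."""
    d = X.shape[1]
    mean_all = X.mean(axis=0)
    Sw = np.zeros((d, d))   # within-class scatter
    Sb = np.zeros((d, d))   # between-class scatter
    for c in np.unique(labels):
        Xc = X[labels == c]
        mc = Xc.mean(axis=0)
        Sw += (Xc - mc).T @ (Xc - mc)
        diff = (mc - mean_all).reshape(-1, 1)
        Sb += len(Xc) * (diff @ diff.T)
    # ASSUMPTION: a small ridge keeps Sw positive definite.
    Sw += 1e-6 * np.eye(d)
    # Solve Sb v = lambda Sw v via Cholesky whitening (Sw = L L^T).
    L = np.linalg.cholesky(Sw)
    A = np.linalg.solve(L, Sb)              # L^-1 Sb
    M = np.linalg.solve(L, A.T).T           # L^-1 Sb L^-T
    M = (M + M.T) / 2.0                     # remove rounding asymmetry
    eigvals, V = np.linalg.eigh(M)
    order = np.argsort(eigvals)[::-1][:n_out]   # largest eigenvalues first
    W = np.linalg.solve(L.T, V[:, order])   # map back: L^-T V, shape (d, n_out)
    b = -mean_all @ W                       # center projections at the data mean
    return W, b

# Toy two-class data (illustrative only): blobs separated along the first axis.
rng = np.random.default_rng(0)
X0 = rng.standard_normal((100, 2))
X1 = rng.standard_normal((100, 2)) + np.array([4.0, 0.0])
X = np.vstack([X0, X1])
labels = np.array([0] * 100 + [1] * 100)

W, b = lda_layer_init(X, labels, n_out=1)
z = X @ W + b                               # layer pre-activations
pred = z[:, 0] > 0
# The sign of an eigenvector is arbitrary, so check both orientations.
acc = max(np.mean(pred == labels), np.mean(pred != labels))
print("sign-threshold accuracy of the LDA-initialized layer:", acc)
```

With at most C - 1 useful discriminant directions for C classes, `n_out` larger than that would add directions with near-zero eigenvalues; how the paper handles wider layers is not stated in the abstract.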