Elimination of All Bad Local Minima in Deep Learning
In this paper, we theoretically prove that adding one special neuron per
output unit eliminates all suboptimal local minima of any deep neural network,
for multi-class classification, binary classification, and regression with an
arbitrary loss function, under practical assumptions. At every local minimum of
any deep neural network with these added neurons, the set of parameters of the
original neural network (without added neurons) is guaranteed to be a global
minimum of the original neural network. The effects of the added neurons are
proven to automatically vanish at every local minimum. Moreover, we provide a
novel theoretical characterization of a failure mode of eliminating suboptimal
local minima via an additional theorem and several examples. This paper also
introduces a novel proof technique based on the perturbable gradient basis
(PGB) necessary condition of local minima, which provides new insight into the
elimination of local minima and is applicable to the analysis of various models and
transformations of objective functions beyond the elimination of local minima.
Comment: Accepted to appear in AISTATS 202
NeuroFlow: A General Purpose Spiking Neural Network Simulation Platform using Customizable Processors
© 2016 Cheung, Schultz and Luk. NeuroFlow is a scalable spiking neural network simulation platform for off-the-shelf high-performance computing systems using customizable hardware processors such as Field-Programmable Gate Arrays (FPGAs). Unlike multi-core processors and application-specific integrated circuits, the processor architecture of NeuroFlow can be redesigned and reconfigured to suit a particular simulation and deliver optimized performance, for example in the degree of parallelism to employ. The compilation process supports using PyNN, a simulator-independent neural network description language, to configure the processor. NeuroFlow supports a number of commonly used current- or conductance-based neuronal models, such as the integrate-and-fire and Izhikevich models, and the spike-timing-dependent plasticity (STDP) rule for learning. A 6-FPGA system can simulate a network of up to ~600,000 neurons and achieves real-time performance for up to 400,000 neurons. Using one FPGA, NeuroFlow delivers a speedup of up to 33.6 times over an 8-core processor, or 2.83 times over GPU-based platforms. With high flexibility and throughput, NeuroFlow provides a viable environment for large-scale neural network simulation.
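The integrate-and-fire model mentioned above can be illustrated with a minimal leaky integrate-and-fire sketch in plain Python. This is not NeuroFlow's FPGA implementation, and all parameter values (time constant, thresholds, drive current) are illustrative assumptions:

```python
# Minimal leaky integrate-and-fire (LIF) neuron, forward-Euler integration.
# Illustrative sketch only; parameter values are assumptions, not NeuroFlow's.

def simulate_lif(input_current, dt=1.0, tau_m=20.0, v_rest=-65.0,
                 v_reset=-70.0, v_thresh=-50.0):
    """Simulate one LIF neuron; return the membrane trace and spike times."""
    v = v_rest
    trace, spikes = [], []
    for t, i_ext in enumerate(input_current):
        # Euler step of tau_m * dv/dt = (v_rest - v) + I_ext
        v += dt * ((v_rest - v) + i_ext) / tau_m
        if v >= v_thresh:          # threshold crossing -> emit a spike
            spikes.append(t * dt)
            v = v_reset            # reset the membrane after the spike
        trace.append(v)
    return trace, spikes

trace, spikes = simulate_lif([20.0] * 200)
print(len(spikes))  # constant supra-threshold drive yields a regular spike train
```

With a constant drive whose steady-state voltage exceeds the threshold, the neuron fires periodically; with zero drive it stays at rest and never spikes.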
Shear capacity of reinforced concrete beams using neural network
Optimum multi-layered feed-forward neural network (NN) models using a resilient back-propagation algorithm and
an early-stopping technique are built to predict the shear capacity of reinforced concrete deep and slender beams. The input layer
neurons represent geometrical and material properties of reinforced concrete beams and the output layer produces the beam shear
capacity. Training, validation and testing of the developed neural network have been achieved using 50%, 25%, and 25%,
respectively, of a comprehensive database compiled from 631 deep and 549 slender beam specimens. The predictions obtained from
the developed neural network models are in much better agreement with test results than those determined from shear provisions of
different codes, such as KBCS, ACI 318-05, and EC2. The mean and standard deviation of the ratio of shear capacity predicted by the
neural network models to measured shear capacity are 1.02 and 0.18, respectively, for deep beams, and 1.04 and 0.17,
respectively, for slender beams. In addition, the influence of different parameters on the shear capacity of reinforced concrete beams
predicted by the developed neural network shows consistent agreement with that observed experimentally.
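The training setup described above (feed-forward network, 50%/25%/25% train/validation/test split, early stopping on validation loss) can be sketched in plain NumPy. Plain gradient descent stands in for the resilient back-propagation the paper uses, and the synthetic data below is not the beam database; the hidden-layer width and hyperparameters are assumptions:

```python
# Sketch of a one-hidden-layer feed-forward net with a 50/25/25 split and
# early stopping. Plain gradient descent replaces resilient back-propagation;
# the synthetic linear data stands in for the beam database.
import numpy as np

rng = np.random.default_rng(0)
X = rng.normal(size=(400, 5))                    # stand-in "beam properties"
y = (X @ rng.normal(size=5) + 0.1 * rng.normal(size=400)).reshape(-1, 1)

n = len(X)
tr, va = slice(0, n // 2), slice(n // 2, 3 * n // 4)   # 50% train, 25% val
te = slice(3 * n // 4, n)                              # 25% test

h = 8                                            # hidden-layer width (assumed)
W1 = rng.normal(scale=0.3, size=(5, h)); b1 = np.zeros(h)
W2 = rng.normal(scale=0.3, size=(h, 1)); b2 = np.zeros(1)

def forward(X):
    A = np.tanh(X @ W1 + b1)                     # hidden activations
    return A, A @ W2 + b2                        # prediction head

best_val, patience, lr = np.inf, 20, 0.01
for epoch in range(2000):
    A, pred = forward(X[tr])
    err = pred - y[tr]
    # backpropagation through the single hidden layer
    gW2 = A.T @ err / len(err); gb2 = err.mean(0)
    dA = (err @ W2.T) * (1 - A ** 2)
    gW1 = X[tr].T @ dA / len(err); gb1 = dA.mean(0)
    W1 -= lr * gW1; b1 -= lr * gb1; W2 -= lr * gW2; b2 -= lr * gb2
    val = float(np.mean((forward(X[va])[1] - y[va]) ** 2))
    if val < best_val - 1e-6:
        best_val, wait = val, 0
    else:
        wait += 1
        if wait >= patience:     # early stopping: validation loss has stalled
            break

test_mse = float(np.mean((forward(X[te])[1] - y[te]) ** 2))
print(round(test_mse, 3))
```

The validation split drives the stopping decision while the test split is held out for the final error estimate, mirroring the split in the abstract.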
Heterogeneous Multi-task Learning for Human Pose Estimation with Deep Convolutional Neural Network
We propose a heterogeneous multi-task learning framework for human pose
estimation from monocular images with a deep convolutional neural network. In
particular, we simultaneously learn a pose-joint regressor and a sliding-window
body-part detector in a deep network architecture. We show that including the
body-part detection task helps to regularize the network, directing it to
converge to a good solution. We report competitive and state-of-the-art results on
several data sets. We also empirically show that the learned neurons in the
middle layer of our network are tuned to localized body parts.
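The heterogeneous multi-task idea above (one shared representation feeding both a joint-coordinate regressor and a body-part detector, trained on a weighted sum of the two losses) can be sketched as follows. The shapes, the linear "backbone", and the weighting factor λ are illustrative assumptions, not the paper's architecture:

```python
# Sketch of heterogeneous multi-task learning: a shared feature extractor
# feeds a pose regressor (L2 loss) and a body-part detector (cross-entropy),
# combined into one training objective. All shapes and weights are assumed.
import numpy as np

rng = np.random.default_rng(1)
x = rng.normal(size=(4, 64))                     # batch of image features

W_shared = rng.normal(scale=0.1, size=(64, 32))
W_pose = rng.normal(scale=0.1, size=(32, 28))    # e.g. 14 joints x (u, v)
W_part = rng.normal(scale=0.1, size=(32, 14))    # per-part detection logits

feats = np.maximum(x @ W_shared, 0.0)            # shared representation (ReLU)
pose_pred = feats @ W_pose                       # regression head
part_logits = feats @ W_part                     # detection head

pose_target = rng.normal(size=pose_pred.shape)
part_target = rng.integers(0, 2, size=part_logits.shape).astype(float)

l2 = np.mean((pose_pred - pose_target) ** 2)     # pose-regression loss
p = 1.0 / (1.0 + np.exp(-part_logits))
bce = -np.mean(part_target * np.log(p) + (1 - part_target) * np.log(1 - p))

lam = 0.5                                        # task weight (assumed)
total = l2 + lam * bce                           # detection term regularizes
print(round(float(total), 3))
```

Training both heads against `total` is what lets the auxiliary detection task steer the shared features, which is the regularization effect the abstract describes.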