Elimination of All Bad Local Minima in Deep Learning
In this paper, we theoretically prove that adding one special neuron per
output unit eliminates all suboptimal local minima of any deep neural network,
for multi-class classification, binary classification, and regression with an
arbitrary loss function, under practical assumptions. At every local minimum of
any deep neural network with these added neurons, the set of parameters of the
original neural network (without added neurons) is guaranteed to be a global
minimum of the original neural network. The effects of the added neurons are
proven to automatically vanish at every local minimum. Moreover, we provide a
novel theoretical characterization of a failure mode of eliminating suboptimal
local minima via an additional theorem and several examples. This paper also
introduces a novel proof technique based on the perturbable gradient basis
(PGB) necessary condition of local minima, which provides new insight into the
elimination of local minima and is applicable to the analysis of various models and
transformations of objective functions beyond the elimination of local minima.
Comment: Accepted to appear in AISTATS 202
NeuroFlow: A General Purpose Spiking Neural Network Simulation Platform using Customizable Processors
© 2016 Cheung, Schultz and Luk. NeuroFlow is a scalable spiking neural network simulation platform for off-the-shelf high-performance computing systems using customizable hardware processors such as Field-Programmable Gate Arrays (FPGAs). Unlike multi-core processors and application-specific integrated circuits, the processor architecture of NeuroFlow can be redesigned and reconfigured to suit a particular simulation and deliver optimized performance, for example in the degree of parallelism to employ. The compilation process supports using PyNN, a simulator-independent neural network description language, to configure the processor. NeuroFlow supports a number of commonly used current- or conductance-based neuronal models, such as the integrate-and-fire and Izhikevich models, and the spike-timing-dependent plasticity (STDP) rule for learning. A 6-FPGA system can simulate a network of up to ~600,000 neurons and achieves real-time performance for up to 400,000 neurons. Using one FPGA, NeuroFlow delivers a speedup of up to 33.6 times over an 8-core processor, or 2.83 times over GPU-based platforms. With high flexibility and throughput, NeuroFlow provides a viable environment for large-scale neural network simulation.
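The integrate-and-fire model mentioned above can be illustrated with a minimal leaky integrate-and-fire sketch in plain Python. This is not NeuroFlow's FPGA implementation, and all parameter values (time constant, thresholds, drive current) are illustrative assumptions:

```python
# Minimal leaky integrate-and-fire (LIF) neuron, forward-Euler integration.
# Illustrative sketch only; parameter values are assumptions, not NeuroFlow's.

def simulate_lif(input_current, dt=1.0, tau_m=20.0, v_rest=-65.0,
                 v_reset=-70.0, v_thresh=-50.0):
    """Simulate one LIF neuron; return the membrane trace and spike times."""
    v = v_rest
    trace, spikes = [], []
    for t, i_ext in enumerate(input_current):
        # Euler step of tau_m * dv/dt = (v_rest - v) + I_ext
        v += dt * ((v_rest - v) + i_ext) / tau_m
        if v >= v_thresh:          # threshold crossing -> emit a spike
            spikes.append(t * dt)
            v = v_reset            # reset the membrane after the spike
        trace.append(v)
    return trace, spikes

trace, spikes = simulate_lif([20.0] * 200)
print(len(spikes))  # constant supra-threshold drive yields a regular spike train
```

With a constant drive whose steady-state voltage exceeds the threshold, the neuron fires periodically; with zero drive it stays at rest and never spikes.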
Shear capacity of reinforced concrete beams using neural network
Optimum multi-layered feed-forward neural network (NN) models using a resilient back-propagation algorithm and
an early-stopping technique are built to predict the shear capacity of reinforced concrete deep and slender beams. The input layer
neurons represent geometrical and material properties of reinforced concrete beams and the output layer produces the beam shear
capacity. Training, validation and testing of the developed neural network have been achieved using 50%, 25%, and 25%,
respectively, of a comprehensive database compiled from 631 deep and 549 slender beam specimens. The predictions obtained from
the developed neural network models are in much better agreement with test results than those determined from shear provisions of
different codes, such as KBCS, ACI 318-05, and EC2. The mean and standard deviation of the ratio of shear capacity predicted by the
neural network models to measured shear capacity are 1.02 and 0.18, respectively, for deep beams, and 1.04 and 0.17,
respectively, for slender beams. In addition, the influence of different parameters on the shear capacity of reinforced concrete beams
predicted by the developed neural network shows consistent agreement with that observed experimentally.
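The training setup described above (feed-forward network, 50%/25%/25% train/validation/test split, early stopping on validation loss) can be sketched in plain NumPy. Plain gradient descent stands in for the resilient back-propagation the paper uses, and the synthetic data below is not the beam database; the hidden-layer width and hyperparameters are assumptions:

```python
# Sketch of a one-hidden-layer feed-forward net with a 50/25/25 split and
# early stopping. Plain gradient descent replaces resilient back-propagation;
# the synthetic linear data stands in for the beam database.
import numpy as np

rng = np.random.default_rng(0)
X = rng.normal(size=(400, 5))                    # stand-in "beam properties"
y = (X @ rng.normal(size=5) + 0.1 * rng.normal(size=400)).reshape(-1, 1)

n = len(X)
tr, va = slice(0, n // 2), slice(n // 2, 3 * n // 4)   # 50% train, 25% val
te = slice(3 * n // 4, n)                              # 25% test

h = 8                                            # hidden-layer width (assumed)
W1 = rng.normal(scale=0.3, size=(5, h)); b1 = np.zeros(h)
W2 = rng.normal(scale=0.3, size=(h, 1)); b2 = np.zeros(1)

def forward(X):
    A = np.tanh(X @ W1 + b1)                     # hidden activations
    return A, A @ W2 + b2                        # prediction head

best_val, patience, lr = np.inf, 20, 0.01
for epoch in range(2000):
    A, pred = forward(X[tr])
    err = pred - y[tr]
    # backpropagation through the single hidden layer
    gW2 = A.T @ err / len(err); gb2 = err.mean(0)
    dA = (err @ W2.T) * (1 - A ** 2)
    gW1 = X[tr].T @ dA / len(err); gb1 = dA.mean(0)
    W1 -= lr * gW1; b1 -= lr * gb1; W2 -= lr * gW2; b2 -= lr * gb2
    val = float(np.mean((forward(X[va])[1] - y[va]) ** 2))
    if val < best_val - 1e-6:
        best_val, wait = val, 0
    else:
        wait += 1
        if wait >= patience:     # early stopping: validation loss has stalled
            break

test_mse = float(np.mean((forward(X[te])[1] - y[te]) ** 2))
print(round(test_mse, 3))
```

The validation split drives the stopping decision while the test split is held out for the final error estimate, mirroring the split in the abstract.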
Heterogeneous Multi-task Learning for Human Pose Estimation with Deep Convolutional Neural Network
We propose a heterogeneous multi-task learning framework for human pose
estimation from monocular images with a deep convolutional neural network. In
particular, we simultaneously learn a pose-joint regressor and a sliding-window
body-part detector in a deep network architecture. We show that including the
body-part detection task helps to regularize the network, directing it to
converge to a good solution. We report competitive and state-of-the-art results on
several data sets. We also empirically show that the learned neurons in the
middle layer of our network are tuned to localized body parts.
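The heterogeneous multi-task idea above (one shared representation feeding both a joint-coordinate regressor and a body-part detector, trained on a weighted sum of the two losses) can be sketched as follows. The shapes, the linear "backbone", and the weighting factor λ are illustrative assumptions, not the paper's architecture:

```python
# Sketch of heterogeneous multi-task learning: a shared feature extractor
# feeds a pose regressor (L2 loss) and a body-part detector (cross-entropy),
# combined into one training objective. All shapes and weights are assumed.
import numpy as np

rng = np.random.default_rng(1)
x = rng.normal(size=(4, 64))                     # batch of image features

W_shared = rng.normal(scale=0.1, size=(64, 32))
W_pose = rng.normal(scale=0.1, size=(32, 28))    # e.g. 14 joints x (u, v)
W_part = rng.normal(scale=0.1, size=(32, 14))    # per-part detection logits

feats = np.maximum(x @ W_shared, 0.0)            # shared representation (ReLU)
pose_pred = feats @ W_pose                       # regression head
part_logits = feats @ W_part                     # detection head

pose_target = rng.normal(size=pose_pred.shape)
part_target = rng.integers(0, 2, size=part_logits.shape).astype(float)

l2 = np.mean((pose_pred - pose_target) ** 2)     # pose-regression loss
p = 1.0 / (1.0 + np.exp(-part_logits))
bce = -np.mean(part_target * np.log(p) + (1 - part_target) * np.log(1 - p))

lam = 0.5                                        # task weight (assumed)
total = l2 + lam * bce                           # detection term regularizes
print(round(float(total), 3))
```

Training both heads against `total` is what lets the auxiliary detection task steer the shared features, which is the regularization effect the abstract describes.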