Deep Learning with S-shaped Rectified Linear Activation Units
Rectified linear activation units are important components for
state-of-the-art deep convolutional networks. In this paper, we propose a novel
S-shaped rectified linear activation unit (SReLU) to learn both convex and
non-convex functions, imitating the multiple function forms given by two
fundamental laws in psychophysics and the neural sciences, namely the
Weber-Fechner law and the Stevens law. Specifically, SReLU consists of three
piecewise linear functions, which are formulated by four learnable parameters.
SReLU is learned jointly with the training of the whole deep network
through backpropagation. To initialize SReLU in different layers during the
training phase, we propose a "freezing" method that degenerates SReLU into a
predefined leaky rectified linear unit for the first several training epochs
and then adaptively learns good initial values. SReLU can be universally
used in the existing deep networks with negligible additional parameters and
computation cost. Experiments with two popular CNN architectures, Network in
Network and GoogLeNet, on benchmarks of various scales, including CIFAR10,
CIFAR100, MNIST and ImageNet, demonstrate that SReLU achieves remarkable
improvements over other activation functions. Comment: Accepted by AAAI-16
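The abstract describes SReLU as three piecewise linear segments controlled by four learnable parameters, trained by backpropagation alongside the rest of the network. Below is a minimal sketch of one way such an activation could be written, assuming PyTorch and a single shared set of scalar parameters per layer; the parameter names (t_left, a_left, t_right, a_right) and default values are illustrative, not the paper's exact formulation.

```python
import torch
import torch.nn as nn

class SReLU(nn.Module):
    """Sketch of an S-shaped rectified linear unit: three piecewise
    linear segments governed by four learnable parameters."""

    def __init__(self, t_left=-1.0, a_left=0.1, t_right=1.0, a_right=1.0):
        super().__init__()
        # Four learnable parameters, updated by backpropagation
        # together with the rest of the network (values are illustrative).
        self.t_left = nn.Parameter(torch.tensor(t_left))
        self.a_left = nn.Parameter(torch.tensor(a_left))
        self.t_right = nn.Parameter(torch.tensor(t_right))
        self.a_right = nn.Parameter(torch.tensor(a_right))

    def forward(self, x):
        # Right segment: slope a_right above the threshold t_right.
        right = self.t_right + self.a_right * (x - self.t_right)
        # Left segment: slope a_left below the threshold t_left.
        left = self.t_left + self.a_left * (x - self.t_left)
        # Middle segment: identity between the two thresholds.
        return torch.where(x >= self.t_right, right,
                           torch.where(x <= self.t_left, left, x))
```

In use, the module would simply replace a ReLU layer in an existing network. The "freezing" initialization mentioned in the abstract could plausibly be approximated by starting a_left at a small leaky-ReLU slope and keeping the four parameters fixed (e.g., requires_grad_(False)) for the first few epochs before unfreezing them, though the paper's exact procedure may differ.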