Convolutional Kernel Networks
An important goal in visual recognition is to devise image representations
that are invariant to particular transformations. In this paper, we address
this goal with a new type of convolutional neural network (CNN) whose
invariance is encoded by a reproducing kernel. Unlike traditional approaches
where neural networks are learned either to represent data or for solving a
classification task, our network learns to approximate the kernel feature map
on training data. Such an approach enjoys several benefits over classical ones.
First, by teaching CNNs to be invariant, we obtain simple network architectures
that achieve a similar accuracy to more complex ones, while being easy to train
and robust to overfitting. Second, we bridge a gap between the neural network
literature and kernels, which are natural tools to model invariance. We
evaluate our methodology on visual recognition tasks where CNNs have proven to
perform well, e.g., digit recognition with the MNIST dataset, and the more
challenging CIFAR-10 and STL-10 datasets, where our accuracy is competitive
with the state of the art.
Comment: appears in Advances in Neural Information Processing Systems (NIPS), Dec 2014, Montreal, Canada, http://nips.c
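To make the core idea concrete, the sketch below is a minimal illustration, not the authors' implementation: it trains a small feature map whose inner products approximate a Gaussian kernel on image patches, which is the kind of kernel-feature-map approximation the abstract describes. The architecture, kernel, and hyperparameters are assumptions.

```python
import torch

# Minimal sketch: learn a finite-dimensional map phi(.) whose inner products
# approximate a Gaussian kernel on (normalized) patches. Illustrative only;
# all names and hyperparameters are assumptions, not the paper's settings.
patch_dim, n_filters, sigma = 25, 64, 0.5

def gaussian_kernel(x, y, sigma=sigma):
    # k(x, y) = exp(-||x - y||^2 / (2 * sigma^2))
    return torch.exp(-((x - y) ** 2).sum(dim=1) / (2 * sigma ** 2))

phi = torch.nn.Sequential(            # small "kernel feature map" network
    torch.nn.Linear(patch_dim, n_filters),
    torch.nn.Tanh(),
)
opt = torch.optim.Adam(phi.parameters(), lr=1e-2)

for step in range(200):
    # sample random pairs of l2-normalized patches
    x = torch.nn.functional.normalize(torch.randn(256, patch_dim), dim=1)
    y = torch.nn.functional.normalize(torch.randn(256, patch_dim), dim=1)
    target = gaussian_kernel(x, y)              # exact kernel values
    approx = (phi(x) * phi(y)).sum(dim=1)       # <phi(x), phi(y)>
    loss = ((approx - target) ** 2).mean()      # regress onto the kernel
    opt.zero_grad(); loss.backward(); opt.step()
```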
Activation Adaptation in Neural Networks
Many neural network architectures rely on the choice of the activation
function for each hidden layer. Given the activation function, the neural
network is trained over the bias and weight parameters. The bias captures
the center of the activation, and the weights capture the scale. Here we
propose to train the network over a shape parameter as well. This view allows
each neuron to tune its own activation function and adapt the neuron curvature
towards a better prediction. This modification only adds one further equation
to the back-propagation for each neuron. Re-formalizing activation functions as
cumulative distribution functions (CDFs) greatly widens the class of available
activations, and we use this view to study i) the skewness and ii) the
smoothness of activation functions. Here we introduce the adaptive Gumbel
activation function as a bridge between the Gumbel CDF and the sigmoid; a
similar approach yields a smooth version of ReLU. Comparison with common
activation functions suggests that the adapted activations learn different data
representations, especially in early network layers, and also improve prediction.
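As an illustration of a per-neuron trainable shape parameter, the sketch below implements an adaptive-Gumbel-style activation in PyTorch. The parameterization f(x; s) = 1 - (1 + s*exp(x))^(-1/s) is an assumption chosen because it reduces to the sigmoid at s = 1 and tends to the Gumbel-type CDF 1 - exp(-exp(x)) as s -> 0; it is not necessarily the paper's exact formula.

```python
import torch

class AdaptiveGumbel(torch.nn.Module):
    """Activation with a per-neuron trainable shape parameter.

    Assumed parameterization (illustrative, not necessarily the paper's):
    f(x; s) = 1 - (1 + s * exp(x)) ** (-1 / s),
    which equals the sigmoid at s = 1 and tends to the Gumbel-type CDF
    1 - exp(-exp(x)) as s -> 0.
    """
    def __init__(self, n_neurons):
        super().__init__()
        # raw parameter, mapped through softplus to keep the shape s > 0
        self.raw_shape = torch.nn.Parameter(torch.zeros(n_neurons))

    def forward(self, x):
        s = torch.nn.functional.softplus(self.raw_shape) + 1e-6
        return 1.0 - (1.0 + s * torch.exp(x)).pow(-1.0 / s)

# the shape parameter is trained by back-propagation like any weight or bias
layer = torch.nn.Sequential(torch.nn.Linear(10, 32), AdaptiveGumbel(32))
out = layer(torch.randn(4, 10))
```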
Neural Generalization of Multiple Kernel Learning
Multiple Kernel Learning (MKL) is a conventional way to learn the kernel function
in kernel-based methods, and MKL algorithms enhance the performance of kernel
methods. However, these methods have lower model complexity than deep
learning models and are inferior to them in terms of recognition
accuracy. Deep learning models can learn complex functions by applying
nonlinear transformations to data through several layers. In this paper, we
show that a typical MKL algorithm can be interpreted as a one-layer neural
network with linear activation functions. By this interpretation, we propose a
Neural Generalization of Multiple Kernel Learning (NGMKL), which extends the
conventional multiple kernel learning framework to a multi-layer neural network
with nonlinear activation functions. Our experiments on several benchmarks show
that the proposed method increases the model complexity of MKL algorithms and
leads to higher recognition accuracy.
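A minimal sketch of the interpretation described above, under assumed names, shapes, and base kernels (not the authors' code): a one-layer linear combination of base kernels plays the role of conventional MKL, and adding nonlinear layers over the per-kernel similarities gives an NGMKL-style generalization.

```python
import torch

# Illustrative sketch: conventional MKL as a one-layer linear combination of
# base kernels, and a multi-layer nonlinear network over the same per-kernel
# similarities as its neural generalization. Everything here is an assumption.
n_train, m_kernels = 100, 3
X = torch.randn(n_train, 5)

def base_kernels(a, b):
    # stack of m simple kernels evaluated pairwise: shape (|a|, |b|, m)
    lin = a @ b.t()
    rbf1 = torch.exp(-torch.cdist(a, b) ** 2 / 2.0)
    rbf2 = torch.exp(-torch.cdist(a, b) ** 2 / 10.0)
    return torch.stack([lin, rbf1, rbf2], dim=-1)

# one layer with linear activation == conventional MKL (weighted kernel sum)
mkl = torch.nn.Linear(m_kernels, 1, bias=False)

# multiple layers with nonlinear activations == NGMKL-style generalization
ngmkl = torch.nn.Sequential(
    torch.nn.Linear(m_kernels, 8), torch.nn.ReLU(),
    torch.nn.Linear(8, 1),
)

K = base_kernels(X, X)                 # (n, n, m) per-kernel similarities
combined_mkl = mkl(K).squeeze(-1)      # (n, n) learned kernel, linear mix
combined_ngmkl = ngmkl(K).squeeze(-1)  # (n, n) learned kernel, nonlinear mix
```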