Convolutional Neural Networks Applied to House Numbers Digit Classification
We classify digits of real-world house numbers using convolutional neural
networks (ConvNets). ConvNets are hierarchical feature learning neural networks
whose structure is biologically inspired. Unlike many popular vision approaches
that are hand-designed, ConvNets can automatically learn a unique set of
features optimized for a given task. We augmented the traditional ConvNet
architecture by learning multi-stage features and by using Lp pooling and
establish a new state-of-the-art of 94.85% accuracy on the SVHN dataset (45.2%
error improvement). Furthermore, we analyze the benefits of different pooling
methods and multi-stage features in ConvNets. The source code and a tutorial
are available at eblearn.sf.net.
Comment: 4 pages, 6 figures, 2 tables
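The Lp pooling used above can be sketched as follows. This is an illustrative NumPy version assuming non-overlapping square windows, not the eblearn implementation; `lp_pool` and its window-tiling scheme are my own naming and assumptions:

```python
import numpy as np

def lp_pool(x, window, p):
    """Lp pooling over non-overlapping windows of a 2D feature map.

    Computes (mean of |x|^p over each window)^(1/p): p = 1 behaves like
    average pooling of magnitudes, while large p approaches max pooling.
    Illustrative sketch only, not the authors' implementation.
    """
    h, w = x.shape
    assert h % window == 0 and w % window == 0
    # tile the map into (rows of windows, window, cols of windows, window)
    blocks = x.reshape(h // window, window, w // window, window)
    return (np.abs(blocks) ** p).mean(axis=(1, 3)) ** (1.0 / p)
```

Varying `p` interpolates between average-like and max-like pooling, which is the design space the paper's pooling analysis explores.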
Gradient-free activation maximization for identifying effective stimuli
A fundamental question for understanding brain function is what types of
stimuli drive neurons to fire. In visual neuroscience, this question has also
been posed as characterizing the receptive field of a neuron. The search for
effective stimuli has traditionally been based on a combination of insights
from previous studies, intuition, and luck. Recently, the same question has
emerged in the study of units in convolutional neural networks (ConvNets), and
together with this question a family of solutions was developed that is
generally referred to as "feature visualization by activation maximization."
We sought to bring in tools and techniques developed for studying ConvNets to
the study of biological neural networks. However, one key difference that
impedes direct translation of tools is that gradients can be obtained from
ConvNets using backpropagation, but such gradients are not available from the
brain. To circumvent this problem, we developed a method for gradient-free
activation maximization by combining a generative neural network with a genetic
algorithm. We termed this method XDream (EXtending DeepDream with real-time
evolution for activation maximization), and we have shown that this method can
reliably create strong stimuli for neurons in the macaque visual cortex (Ponce
et al., 2019). In this paper, we describe extensive experiments characterizing
the XDream method by using ConvNet units as in silico models of neurons. We
show that XDream is applicable across network layers, architectures, and
training sets; examine design choices in the algorithm; and provide practical
guides for choosing hyperparameters in the algorithm. XDream is an efficient
algorithm for uncovering neuronal tuning preferences in black-box networks
using a vast and diverse stimulus space.
Comment: 16 pages, 8 figures, 3 tables
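The core loop of gradient-free activation maximization can be sketched with a plain genetic algorithm that sees only black-box scores. This is a hypothetical minimal sketch in the spirit of XDream, not the released code: `evolve_codes` and its selection/recombination/mutation choices are my own, and plain vectors stand in for the generative network's latent codes:

```python
import numpy as np

def evolve_codes(score_fn, dim=16, pop=32, gens=50, sigma=0.3, seed=0):
    """Evolve latent codes to maximize a black-box score (e.g. a
    neuron's firing rate), using no gradients. Illustrative sketch
    in the spirit of XDream, not the published implementation."""
    rng = np.random.default_rng(seed)
    codes = rng.standard_normal((pop, dim))
    for _ in range(gens):
        scores = np.array([score_fn(c) for c in codes])
        # selection: keep the top half as parents
        parents = codes[np.argsort(scores)[-pop // 2:]]
        # recombination: average random parent pairs
        pairs = rng.integers(0, len(parents), size=(pop, 2))
        children = parents[pairs].mean(axis=1)
        # mutation: additive Gaussian noise keeps exploring
        codes = children + sigma * rng.standard_normal((pop, dim))
    scores = np.array([score_fn(c) for c in codes])
    return codes[np.argmax(scores)]

# toy "neuron": responds most strongly near a hidden preferred code
target = np.ones(16)
best = evolve_codes(lambda c: -np.sum((c - target) ** 2))
```

Because only scalar scores are consumed, the same loop applies equally to a ConvNet unit and to a recorded biological neuron, which is the key property the paper exploits.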
G-softmax: Improving Intra-class Compactness and Inter-class Separability of Features
Intra-class compactness and inter-class separability are crucial indicators
to measure the effectiveness of a model to produce discriminative features,
where intra-class compactness indicates how close the features with the same
label are to each other and inter-class separability indicates how far away the
features with different labels are. In this work, we investigate intra-class
compactness and inter-class separability of features learned by convolutional
networks and propose a Gaussian-based softmax (G-softmax) function
that can effectively improve intra-class compactness and inter-class
separability. The proposed function is simple to implement and can easily
replace the softmax function. We evaluate the proposed G-softmax
function on classification datasets (i.e., CIFAR-10, CIFAR-100, and Tiny
ImageNet) and on multi-label classification datasets (i.e., MS COCO and
NUS-WIDE). The experimental results show that the proposed
G-softmax function improves the state-of-the-art models across all
evaluated datasets. In addition, analysis of the intra-class compactness and
inter-class separability demonstrates the advantages of the proposed function
over the softmax function, which is consistent with the performance
improvement. More importantly, we observe that high intra-class compactness and
inter-class separability are linearly correlated with average precision on MS
COCO and NUS-WIDE. This implies that improvement of intra-class compactness and
inter-class separability would lead to improvement of average precision.
Comment: 15 pages, published in TNNLS
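The two quantities the abstract analyzes can be measured in a simple way: intra-class compactness as the mean distance of features to their class centroid (lower = more compact), and inter-class separability as the mean pairwise distance between centroids (higher = more separable). These are illustrative definitions for intuition, not the paper's exact metrics:

```python
import numpy as np

def compactness_separability(features, labels):
    """Rough centroid-based measures of intra-class compactness and
    inter-class separability. Illustrative sketch, not the paper's
    exact formulation."""
    classes = np.unique(labels)
    centroids = np.array([features[labels == c].mean(axis=0) for c in classes])
    # compactness: average distance of each feature to its class centroid
    intra = np.mean([
        np.linalg.norm(features[labels == c] - centroids[i], axis=1).mean()
        for i, c in enumerate(classes)
    ])
    # separability: average distance between distinct class centroids
    dists = np.linalg.norm(centroids[:, None, :] - centroids[None, :, :], axis=-1)
    inter = dists[np.triu_indices(len(classes), 1)].mean()
    return intra, inter
```

A discriminative feature space drives the first number down and the second up, which is the direction of improvement the G-softmax function targets.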
A Deep Learning Framework for Unsupervised Affine and Deformable Image Registration
Image registration, the process of aligning two or more images, is the core
technique of many (semi-)automatic medical image analysis tasks. Recent studies
have shown that deep learning methods, notably convolutional neural networks
(ConvNets), can be used for image registration. Thus far training of ConvNets
for registration was supervised using predefined example registrations.
However, obtaining example registrations is not trivial. To circumvent the need
for predefined examples, and thereby to increase convenience of training
ConvNets for image registration, we propose the Deep Learning Image
Registration (DLIR) framework for \textit{unsupervised} affine and deformable
image registration. In the DLIR framework ConvNets are trained for image
registration by exploiting image similarity analogous to conventional
intensity-based image registration. After a ConvNet has been trained with the
DLIR framework, it can be used to register pairs of unseen images in one shot.
We propose flexible ConvNet designs for affine image registration and for
deformable image registration. By stacking several of these ConvNets into a
larger architecture, we are able to perform coarse-to-fine image registration.
We show for registration of cardiac cine MRI and registration of chest CT that
performance of the DLIR framework is comparable to conventional image
registration while being several orders of magnitude faster.
Comment: Accepted: Medical Image Analysis - Elsevier
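The unsupervised principle behind DLIR, that image similarity alone provides the training signal, can be illustrated without the ConvNet. In this toy sketch an exhaustive search over integer translations stands in for the network's predicted transform, and normalized cross-correlation (a common intensity-based similarity) plays the role of the loss; `ncc` and `shift` are my own illustrative helpers, not the DLIR code:

```python
import numpy as np

def ncc(a, b):
    """Normalized cross-correlation: an intensity-based similarity
    that can drive registration without ground-truth transforms."""
    a = (a - a.mean()) / (a.std() + 1e-8)
    b = (b - b.mean()) / (b.std() + 1e-8)
    return float((a * b).mean())

def shift(img, dy, dx):
    """Integer translation (stand-in for the spatial transform that
    would apply a ConvNet-predicted deformation)."""
    return np.roll(np.roll(img, dy, axis=0), dx, axis=1)

# toy pair: the moving image is the fixed image shifted by (2, 3)
rng = np.random.default_rng(0)
fixed = rng.standard_normal((16, 16))
moving = shift(fixed, 2, 3)

# similarity alone recovers the alignment, with no example
# registrations needed anywhere in the objective
best = max((ncc(shift(moving, dy, dx), fixed), (dy, dx))
           for dy in range(-4, 5) for dx in range(-4, 5))
```

In the DLIR framework the ConvNet replaces the exhaustive search, amortizing the optimization so that, once trained, unseen image pairs are registered in one shot.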