Backward Gradient Normalization in Deep Neural Networks
We introduce a new technique for gradient normalization during neural network
training. The gradients are rescaled during the backward pass using
normalization layers introduced at certain points within the network
architecture. These normalization nodes do not affect forward activation
propagation, but they modify the backpropagation equations to permit a
well-scaled gradient flow that reaches the deepest network layers without
vanishing or exploding. Results on tests with very deep neural networks show
that the new technique effectively controls the gradient norm, allowing the
weights in the deepest layers to be updated and improving network accuracy
under several experimental conditions.
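The paper's implementation is not reproduced here, but the core idea of an identity-forward node that rescales the gradient on the backward pass can be sketched in a few lines of numpy. All names, the fixed target norm, and the toy layer stack are illustrative assumptions, not the authors' design:

```python
import numpy as np

def grad_norm_node_backward(grad, target_norm=1.0, eps=1e-8):
    """Backward pass of a hypothetical normalization node: its forward
    pass is the identity (activations untouched), while the incoming
    gradient is rescaled to a fixed norm before flowing upstream."""
    norm = np.linalg.norm(grad)
    return grad * (target_norm / (norm + eps))

# Toy backward pass through a deep stack of contractive linear layers,
# which would normally make the gradient vanish long before layer 1.
rng = np.random.default_rng(0)
grad = rng.normal(size=64)
for _ in range(50):
    W = 0.1 * rng.normal(size=(64, 64))    # small-weight layer
    grad = W.T @ grad                      # ordinary backprop step
    grad = grad_norm_node_backward(grad)   # normalization node

print(np.linalg.norm(grad))  # stays near 1.0 instead of underflowing
```

Without the normalization node, each step shrinks the gradient norm by roughly 0.8x, so after 50 layers it would be vanishingly small.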
Convolutional Neural Networks for Histopathology Image Classification: Training vs. Using Pre-Trained Networks
We explore the problem of classification within a medical image dataset
based on feature vectors extracted from the deepest layer of pre-trained
Convolutional Neural Networks. We have used feature vectors from several
pre-trained architectures, with and without transfer learning, to evaluate the
performance of pre-trained deep features against CNNs trained on the specific
dataset, as well as the impact of transfer learning with a small number of
samples. All experiments are done on the Kimia Path24 dataset, which consists
of 27,055 histopathology training patches in 24 tissue texture classes along
with 1,325 test patches for evaluation. The results show that pre-trained
networks are quite competitive against training from scratch. Moreover,
fine-tuning does not seem to add any tangible improvement for VGG16 that would
justify additional training, while we observed considerable improvement in
retrieval and classification accuracy when we fine-tuned the Inception
architecture.
Comment: To appear in proceedings of the 7th International Conference on Image
Processing Theory, Tools and Applications (IPTA 2017), Nov 28-Dec 1,
Montreal, Canada
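The pre-trained-features-plus-simple-classifier pipeline the abstract describes can be sketched without any real pre-trained network: below, a fixed random projection stands in for a frozen CNN's deepest layer (a loose assumption for illustration only), and a nearest-centroid rule plays the role of the downstream classifier:

```python
import numpy as np

rng = np.random.default_rng(1)

# Stand-in for a frozen pre-trained CNN: a fixed random projection plays
# the role of the deepest layer (hypothetical, shapes only).
W_frozen = rng.normal(size=(4096, 512))

def deep_features(images_flat):
    """'Deepest layer' feature vectors for flattened inputs (ReLU output)."""
    return np.maximum(images_flat @ W_frozen, 0.0)

# Tiny synthetic 'dataset': two well-separated classes.
X_train = np.vstack([rng.normal(0.0, 1, (20, 4096)),
                     rng.normal(2.0, 1, (20, 4096))])
y_train = np.array([0] * 20 + [1] * 20)

# Classifier trained on frozen features: nearest class centroid.
F = deep_features(X_train)
centroids = np.stack([F[y_train == c].mean(axis=0) for c in (0, 1)])

def classify(x):
    f = deep_features(x[None, :])[0]
    return int(np.argmin(np.linalg.norm(centroids - f, axis=1)))
```

The frozen extractor is never updated; only the cheap classifier on top sees the target dataset, which is what makes this approach attractive when labeled samples are scarce.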
Towards Automatic Wild Animal Monitoring: Identification of Animal Species in Camera-trap Images using Very Deep Convolutional Neural Networks
Non-intrusive monitoring of animals in the wild is possible using a
camera-trapping framework, which uses cameras triggered by sensors to take
bursts of images of animals in their habitat. However, camera trapping
produces a high volume of data (on the order of thousands or millions of
images) that must be analyzed by a human expert. In this work, a method for
animal species identification in the wild using very deep convolutional neural
networks is presented. Multiple versions of the Snapshot Serengeti dataset
were used in order to probe the ability of the method to cope with the
different challenges that camera-trap images pose. The method reached 88.9%
Top-1 and 98.1% Top-5 accuracy on the evaluation set using a residual network
topology. The results also show that the proposed method outperforms previous
approaches and proves that recognition in camera-trap images can be automated.
Comment: Submitted to ECCV1
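The Top-1 and Top-5 figures above follow the standard definition: a prediction counts as correct at Top-k if the true label is among the k highest-scoring classes. A minimal sketch:

```python
import numpy as np

def top_k_accuracy(scores, labels, k):
    """Fraction of samples whose true label is among the k highest scores."""
    topk = np.argsort(scores, axis=1)[:, -k:]
    return np.mean([labels[i] in topk[i] for i in range(len(labels))])

# Three samples, three classes: sample 1 is only correct at Top-2.
scores = np.array([[0.1, 0.7, 0.2],
                   [0.5, 0.3, 0.2],
                   [0.2, 0.2, 0.6]])
labels = np.array([1, 1, 2])
print(top_k_accuracy(scores, labels, 1))  # 2/3 correct at Top-1
print(top_k_accuracy(scores, labels, 2))  # 3/3 within Top-2
```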
On Minimization of a Quadratic Binary Functional
The problem of minimizing a quadratic functional depending on a large number
of binary variables is examined. Three variants of the minimization procedure
are studied with the aid of computer simulations for spin-glass matrices. It
is shown that, other conditions being equal, the maximal dynamics (the greedy
algorithm) is clearly superior. The dependence of the results on the distance
between the start points and the ground state is investigated. It is
determined that the character of the distribution of local minima depends
crucially on this distance.
Comment: 17 pages, 16 figures
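The greedy ("maximal") dynamics the abstract refers to can be sketched for the standard spin formulation E(s) = -1/2 sᵀJs with s_i = ±1, where flipping spin i changes the energy by 2 s_i (Js)_i. The matrix construction and stopping rule below are a generic sketch, not the paper's exact procedure:

```python
import numpy as np

def greedy_minimize(J, s):
    """Maximal dynamics (greedy descent) for E(s) = -1/2 s^T J s, s_i = ±1:
    always flip the spin giving the largest energy decrease, stopping at a
    local minimum where no single flip lowers the energy."""
    s = s.copy()
    while True:
        h = J @ s                   # local fields
        dE = 2 * s * h              # energy change from flipping each spin
        i = int(np.argmin(dE))
        if dE[i] >= 0:              # no flip helps: local minimum
            return s
        s[i] = -s[i]

rng = np.random.default_rng(2)
n = 30
A = rng.normal(size=(n, n))
J = (A + A.T) / 2                   # symmetric spin-glass matrix
np.fill_diagonal(J, 0.0)

E = lambda s: -0.5 * s @ J @ s
s0 = rng.choice([-1, 1], size=n)
s_min = greedy_minimize(J, s0)
print(E(s_min) <= E(s0))            # energy never increases under descent
```

Each flip strictly lowers the energy, so the loop terminates; recomputing the fields each step is O(n²) per flip, which incremental updates would avoid in a serious implementation.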
Deep Barcodes for Fast Retrieval of Histopathology Scans
We investigate the concept of deep barcodes and propose two methods to
generate them in order to expedite the process of classification and retrieval
of histopathology images. Since binary search is computationally less
expensive, in terms of both speed and storage, deep barcodes could be useful
when dealing with big-data retrieval. Our experiments use the Kimia Path24
dataset to test three pre-trained networks for image retrieval. The dataset
consists of 27,055 training images in 24 different classes with large
variability, and 1,325 test images. Apart from their speed and efficiency,
the results show a surprising retrieval accuracy of 71.62% for deep barcodes,
compared to 68.91% for deep features and 68.53% for compressed deep features.
Comment: Accepted for publication in proceedings of the IEEE World Congress on
Computational Intelligence (IEEE WCCI), Rio de Janeiro, Brazil, 8-3 July,
201
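A barcode in this spirit binarizes each deep-feature dimension, after which retrieval reduces to cheap Hamming-distance comparisons on bit vectors. The thresholding rule below (per-dimension mean over the gallery) is one plausible scheme, not necessarily either of the paper's two methods:

```python
import numpy as np

def barcode(features, thresholds):
    """Hypothetical barcoding: one bit per deep-feature dimension,
    set by comparing against a per-dimension threshold."""
    return (features > thresholds).astype(np.uint8)

def hamming_retrieve(query_bits, gallery_bits):
    """Gallery indices sorted by Hamming distance to the query barcode."""
    d = np.count_nonzero(gallery_bits != query_bits, axis=1)
    return np.argsort(d)

rng = np.random.default_rng(3)
gallery = rng.normal(size=(100, 64))          # stand-in deep features
thr = gallery.mean(axis=0)
bits = barcode(gallery, thr)

q = gallery[7] + 0.01 * rng.normal(size=64)   # slightly perturbed item 7
q_bits = barcode(q, thr)
print(hamming_retrieve(q_bits, bits)[0])      # nearest barcode is item 7
```

Bit vectors need one bit per dimension instead of 32 or 64, which is where the storage and speed advantage over raw deep features comes from.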
How Do Neural Networks Estimate Optical Flow? A Neuropsychology-Inspired Study
End-to-end trained convolutional neural networks have led to a breakthrough
in optical flow estimation. The most recent advances focus on improving the
optical flow estimation by improving the architecture and setting a new
benchmark on the publicly available MPI-Sintel dataset. Instead, in this
article, we investigate how deep neural networks estimate optical flow. A
better understanding of how these networks function is important for (i)
assessing their generalization capabilities to unseen inputs, and (ii)
suggesting changes to improve their performance. For our investigation, we
focus on FlowNetS, as it is the prototype of an encoder-decoder neural network
for optical flow estimation. Furthermore, we use a filter identification method
that has played a major role in uncovering the motion filters present in animal
brains in neuropsychological research. The method shows that the filters in the
deepest layer of FlowNetS are sensitive to a variety of motion patterns. Not
only do we find translation filters, as demonstrated in animal brains, but
thanks to the easier measurements in artificial neural networks, we even unveil
dilation, rotation, and occlusion filters. Furthermore, we find similarities in
the refinement part of the network and the perceptual filling-in process which
occurs in the mammalian primary visual cortex.
Comment: 16 pages, 15 figures
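The stimulus-based identification method can be illustrated in miniature: probe a fixed two-frame "motion filter" with translating stimuli in several directions and read off which direction maximizes its response. The filter construction here is a toy matched-template assumption, far simpler than a learned FlowNetS unit:

```python
import numpy as np

def shift(img, dx, dy):
    """Translate an image by (dx, dy) pixels with wrap-around."""
    return np.roll(np.roll(img, dy, axis=0), dx, axis=1)

rng = np.random.default_rng(4)
frame = rng.normal(size=(32, 32))

# Toy 'motion filter' tuned to a one-pixel rightward shift,
# built as a matched pair of frame templates.
template0, template1 = frame, shift(frame, 1, 0)

def response(f0, f1):
    """Filter response to a two-frame stimulus (inner products with templates)."""
    return float((f0 * template0).sum() + (f1 * template1).sum())

# Probe with translating stimuli in several directions and find the peak.
probes = {(1, 0): response(frame, shift(frame, 1, 0)),
          (-1, 0): response(frame, shift(frame, -1, 0)),
          (0, 1): response(frame, shift(frame, 0, 1))}
preferred = max(probes, key=probes.get)
print(preferred)   # (1, 0): rightward motion maximizes the response
```

Sweeping richer stimulus families (dilation, rotation, occlusion) in the same way is what lets the filter types mentioned above be distinguished.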
Projection-Based 2.5D U-net Architecture for Fast Volumetric Segmentation
Convolutional neural networks are state-of-the-art for various segmentation
tasks. While for 2D images these networks are also computationally efficient,
3D convolutions have huge storage requirements and require long training times.
To overcome this issue, we introduce a network structure for volumetric data
without 3D convolutional layers. The main idea is to include maximum intensity
projections from different directions to transform the volumetric data to a
sequence of images, where each image contains information of the full data. We
then apply 2D convolutions to these projection images and lift them again to
volumetric data using a trainable reconstruction algorithm. The proposed
network architecture has lower storage requirements than network structures
using 3D convolutions. For a tested binary segmentation task, it even shows
better performance than the 3D U-net and can be trained much faster.
Comment: presented at the SAMPTA 2019 conference
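The maximum intensity projections at the heart of the approach are a one-liner per axis: each 2D image keeps the brightest voxel along one viewing direction, so every projection summarizes the full volume. A minimal sketch (the reconstruction step, being trainable, is omitted):

```python
import numpy as np

def mips(volume):
    """Maximum intensity projections of a 3D volume along each axis,
    turning volumetric data into a short sequence of 2D images that a
    2D convolutional network could process."""
    return [volume.max(axis=a) for a in range(3)]

rng = np.random.default_rng(5)
vol = rng.random((16, 16, 16))
vol[4, 5, 6] = 2.0                  # one bright voxel

p0, p1, p2 = mips(vol)
print(p0.shape, p0[5, 6])           # (16, 16) 2.0 — the voxel survives projection
```

Three (or a few more, for oblique directions) 2D images replace the full 16x16x16 grid, which is where the storage savings over 3D convolutions come from.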
Adaptive Neural Networks for Efficient Inference
We present an approach to adaptively utilize deep neural networks in order to
reduce the evaluation time on new examples without loss of accuracy. Rather
than attempting to redesign or approximate existing networks, we propose two
schemes that adaptively utilize networks. We first pose an adaptive network
evaluation scheme, where we learn a system to adaptively choose the components
of a deep network to be evaluated for each example. By allowing examples
correctly classified using early layers of the system to exit, we avoid the
computational time associated with full evaluation of the network. We extend
this to learn a network selection system that adaptively selects the network to
be evaluated for each example. We show that computational time can be
dramatically reduced by exploiting the fact that many examples can be correctly
classified using relatively efficient networks and that complex,
computationally costly networks are only necessary for a small fraction of
examples. We pose a global objective for learning an adaptive early exit or
network selection policy and solve it by reducing the policy learning problem
to a layer-by-layer weighted binary classification problem. Empirically, these
approaches yield dramatic reductions in computational cost, with up to a 2.8x
speedup on state-of-the-art networks from the ImageNet image recognition
challenge, with minimal (<1%) loss of top-5 accuracy.
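The early-exit scheme can be sketched as a cascade of classifier stages: each example runs through cheap stages first and exits as soon as one stage is confident enough. The confidence measure (max softmax probability) and threshold below are illustrative assumptions; the paper learns its exit policy rather than fixing a threshold:

```python
import numpy as np

def softmax(z):
    e = np.exp(z - z.max())
    return e / e.sum()

def early_exit_predict(x, stages, threshold=0.9):
    """Run a cascade of classifier stages; exit as soon as one stage's
    confidence clears the threshold, skipping the costlier later stages.
    Returns (predicted label, index of the stage that answered)."""
    for depth, stage in enumerate(stages):
        probs = softmax(stage(x))
        if probs.max() >= threshold or depth == len(stages) - 1:
            return int(np.argmax(probs)), depth

# Toy stages: the cheap one is confident only for 'easy' inputs (x > 0).
cheap = lambda x: np.array([5.0, 0.0]) if x > 0 else np.array([0.1, 0.0])
costly = lambda x: np.array([0.0, 5.0])

print(early_exit_predict(1.0, [cheap, costly]))    # (0, 0): exits early
print(early_exit_predict(-1.0, [cheap, costly]))   # (1, 1): falls through
```

Average inference cost then depends on how many examples exit early, which is exactly the quantity the learned policy in the paper optimizes.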
SegNet: A Deep Convolutional Encoder-Decoder Architecture for Robust Semantic Pixel-Wise Labelling
We propose a novel deep architecture, SegNet, for semantic pixel-wise image
labelling. SegNet has several attractive properties; (i) it only requires
forward evaluation of a fully learnt function to obtain smooth label
predictions, (ii) with increasing depth, a larger context is considered for
pixel labelling which improves accuracy, and (iii) it is easy to visualise the
effect of feature activation(s) in the pixel label space at any depth. SegNet
is composed of a stack of encoders followed by a corresponding decoder stack
which feeds into a soft-max classification layer. The decoders help map low
resolution feature maps at the output of the encoder stack to full input image
size feature maps. This addresses an important drawback of recent deep learning
approaches which have adopted networks designed for object categorization for
pixel-wise labelling. These methods lack a mechanism to map deep-layer feature
maps to input dimensions. They resort to ad hoc methods to upsample features,
e.g. by replication. This results in noisy predictions and also restricts the
number of pooling layers in order to avoid too much upsampling and thus reduces
spatial context. SegNet overcomes these problems by learning to map encoder
outputs to image pixel labels. We test the performance of SegNet on outdoor RGB
scenes from CamVid, KITTI and indoor scenes from the NYU dataset. Our results
show that SegNet achieves state-of-the-art performance even without use of
additional cues such as depth, video frames or post-processing with CRF models.Comment: This version was first submitted to CVPR' 15 on November 14, 2014
with paper Id 1468. A similar architecture was proposed more recently on May
17, 2015, see http://arxiv.org/pdf/1505.04366.pd
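One concrete alternative to upsampling by replication, in the spirit of the decoder described above, is max-unpooling: the encoder's pooling layers record where each maximum came from, and the decoder writes values back to exactly those positions (this mechanism appears in the published SegNet; the numpy sketch below is a simplification):

```python
import numpy as np

def max_pool_2x2(x):
    """2x2 max pooling that also records argmax positions, so a decoder
    can later restore values to the locations the encoder selected."""
    H, W = x.shape
    pooled = np.zeros((H // 2, W // 2))
    idx = np.zeros((H // 2, W // 2), dtype=int)
    for i in range(H // 2):
        for j in range(W // 2):
            patch = x[2*i:2*i+2, 2*j:2*j+2]
            k = int(np.argmax(patch))
            pooled[i, j] = patch.flat[k]
            idx[i, j] = k
    return pooled, idx

def unpool_2x2(pooled, idx):
    """Sparse upsampling: each value returns to its recorded position
    instead of being replicated over the whole 2x2 block."""
    H, W = pooled.shape
    out = np.zeros((2 * H, 2 * W))
    for i in range(H):
        for j in range(W):
            out[2*i + idx[i, j] // 2, 2*j + idx[i, j] % 2] = pooled[i, j]
    return out

x = np.array([[1., 2., 0., 0.],
              [3., 4., 0., 5.],
              [6., 0., 7., 0.],
              [0., 0., 0., 8.]])
p, idx = max_pool_2x2(x)
out = unpool_2x2(p, idx)       # maxima 4, 5, 6, 8 land back where they were
print(out)
```

Because the surviving values return to their true spatial locations, boundaries stay sharp instead of smearing, which is precisely the noisy-prediction problem replication-based upsampling suffers from.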
Machine Learning as Statistical Data Assimilation
We identify a strong equivalence between neural network based machine
learning (ML) methods and the formulation of statistical data assimilation
(DA), known to be a problem in statistical physics. DA, as used widely in
physical and biological sciences, systematically transfers information in
observations to a model of the processes producing the observations. The
correspondence is that layer label in the ML setting is the analog of time in
the data assimilation setting. Utilizing aspects of this equivalence we discuss
how to establish the global minimum of the cost functions in the ML context,
using a variational annealing method from DA. This provides a design method for
optimal networks for ML applications and may serve as the basis for
understanding the success of "deep learning". Results from an ML example are
presented.
When the layer label is taken to be continuous, the Euler-Lagrange equation
for the ML optimization problem is an ordinary differential equation, and we
see that the problem being solved is a two point boundary value problem. The
use of continuous layers is denoted "deepest learning". The Hamiltonian version
provides a direct rationale for backpropagation as a solution method for the
canonical momentum; however, it suggests that other solution methods may be
preferred.
Comment: arXiv admin note: text overlap with arXiv:1707.0141
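Taking the layer label to be continuous turns the forward pass into the solution of an ordinary differential equation in "time". A toy Euler-integration sketch (the vector field and step count are illustrative assumptions, chosen so the exact solution is known):

```python
import numpy as np

def continuous_layer_forward(x0, field, t0=0.0, t1=1.0, steps=100):
    """Euler integration of dx/dt = field(x, t): with a continuous layer
    label, the forward pass is an ODE solve rather than a finite stack of
    discrete layers ('deepest learning')."""
    x, dt = x0, (t1 - t0) / steps
    for k in range(steps):
        x = x + dt * field(x, t0 + k * dt)
    return x

# Toy field with a known solution: dx/dt = -x  =>  x(1) = x0 * exp(-1).
x1 = continuous_layer_forward(np.array([1.0]), lambda x, t: -x)
print(x1)   # ≈ 0.366 with 100 Euler steps, approaching exp(-1) ≈ 0.368
```

Discrete layers correspond to fixed integration steps of this ODE, which is what makes the two-point boundary value formulation of training possible.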