DeepCorrect: Correcting DNN models against Image Distortions
In recent years, the widespread use of deep neural networks (DNNs) has
facilitated great improvements in performance for computer vision tasks like
image classification and object recognition. In most realistic computer vision
applications, an input image undergoes some form of image distortion such as
blur and additive noise during image acquisition or transmission. Deep networks
trained on pristine images perform poorly when tested on such distortions. In
this paper, we evaluate the effect of image distortions like Gaussian blur and
additive noise on the activations of pre-trained convolutional filters. We
propose a metric to identify the most noise-susceptible convolutional filters
and rank them in order of the highest gain in classification accuracy upon
correction. In our proposed approach called DeepCorrect, we apply small stacks
of convolutional layers with residual connections, at the output of these
ranked filters and train them to correct the worst distortion-affected filter
activations, whilst leaving the rest of the pre-trained filter outputs in the
network unchanged. Performance results show that applying DeepCorrect models
for common vision tasks like image classification (ImageNet), object
recognition (Caltech-101, Caltech-256) and scene classification (SUN-397)
significantly improves the robustness of DNNs against distorted images and
outperforms alternative approaches.
Comment: Accepted to IEEE Transactions on Image Processing, April 2019. For
associated code, see https://github.com/tsborkar/DeepCorrec
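The idea of correcting only the ranked filters while leaving the others untouched can be illustrated with a minimal sketch. It is not the authors' implementation: the function name `correct_selected`, the flattened-list representation of a feature map, and the per-filter `correction` callable (standing in for a trained stack of convolutional layers) are all assumptions made for illustration.

```python
from typing import Callable, List, Sequence

Activation = List[float]  # assumption: one flattened feature map per filter


def correct_selected(
    activations: Sequence[Activation],
    ranked_filters: Sequence[int],
    correction: Callable[[Activation], Activation],
) -> List[Activation]:
    """Apply a residual correction a + g(a) to the selected filters only.

    `correction` is a hypothetical stand-in for a trained correction unit;
    filters that are not in `ranked_filters` pass through unchanged.
    """
    selected = set(ranked_filters)
    corrected: List[Activation] = []
    for index, act in enumerate(activations):
        if index in selected:
            residual = correction(act)
            # Residual connection: the unit learns an additive fix.
            corrected.append([a + r for a, r in zip(act, residual)])
        else:
            corrected.append(list(act))
    return corrected
```

The residual form means a correction unit that outputs zeros leaves the pre-trained activation exactly as it was.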
Improving Image Classification Robustness through Selective CNN-Filters Fine-Tuning
Image quality plays a big role in CNN-based image classification performance.
Fine-tuning the network with distorted samples may be too costly for large
networks. To solve this issue, we propose a transfer learning approach
optimized to take into account that in each layer of a CNN some filters are
more susceptible to image distortion than others. Our method identifies the
most susceptible filters and applies retraining only to the filters that show
the largest activation-map distance between clean and distorted images.
Filters are ranked using the Borda count election method and then only the most
affected filters are fine-tuned. This significantly reduces the number of
parameters to retrain. We evaluate this approach on the CIFAR-10 and CIFAR-100
datasets, testing it on two different models and two different types of
distortion. Results show that the proposed transfer learning technique recovers
most of the lost performance due to input data distortion, at a considerably
faster pace with respect to existing methods, thanks to the reduced number of
parameters to fine-tune. When few noisy samples are provided for training, our
filter-level fine-tuning performs particularly well, also outperforming
state-of-the-art layer-level transfer learning approaches.
Comment: arXiv admin note: text overlap with arXiv:1705.02406 by other author
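A Borda count ranking of filters can be sketched as follows. The abstract does not say what the "voters" are, so this sketch assumes each voter is one set of per-filter activation-map distances (for example, one distorted sample); the function name `borda_rank` and the input layout are hypothetical.

```python
from typing import List, Sequence


def borda_rank(distances_per_voter: Sequence[Sequence[float]]) -> List[int]:
    """Rank filters from most to least affected using a Borda count.

    distances_per_voter[v][f] is the clean-vs-distorted distance that
    voter v measured for filter f (larger means more affected).
    """
    n_filters = len(distances_per_voter[0])
    points = [0] * n_filters
    for distances in distances_per_voter:
        # Each voter orders the filters, most affected first.
        order = sorted(range(n_filters), key=lambda f: distances[f], reverse=True)
        # First place earns n-1 points, last place earns 0.
        for position, f in enumerate(order):
            points[f] += n_filters - 1 - position
    return sorted(range(n_filters), key=lambda f: points[f], reverse=True)
```

Taking the first k entries of the returned list gives the k filters that would be selected for fine-tuning under these assumptions.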
Enhancing the Performance of Convolutional Neural Networks on Quality Degraded Datasets
Despite the appeal of deep neural networks that largely replace the
traditional handmade filters, they still suffer from isolated cases that cannot
be properly handled only by the training of convolutional filters. Abnormal
factors, including real-world noise, blur, or other quality degradations, ruin
the output of a neural network. These unexpected problems can produce critical
complications, and it is surprising that there has only been minimal research
into the effects of noise in the deep neural network model. Therefore, we
present an exhaustive investigation into the effect of noise in image
classification and suggest a generalized architecture of a dual-channel model
to treat quality degraded input images. We compare the proposed dual-channel
model with a simple single model and show it improves the overall performance
of neural networks on various types of quality degraded input datasets.
Comment: The International Conference on Digital Image Computing: Techniques
and Applications (DICTA), 201
Intriguing properties of neural networks
Deep neural networks are highly expressive models that have recently achieved
state of the art performance on speech and visual recognition tasks. While
their expressiveness is the reason they succeed, it also causes them to learn
uninterpretable solutions that could have counter-intuitive properties. In this
paper we report two such properties.
First, we find that there is no distinction between individual high level
units and random linear combinations of high level units, according to various
methods of unit analysis. It suggests that it is the space, rather than the
individual units, that contains the semantic information in the high layers
of neural networks.
Second, we find that deep neural networks learn input-output mappings that
are fairly discontinuous to a significant extent. We can cause the network to
misclassify an image by applying a certain imperceptible perturbation, which is
found by maximizing the network's prediction error. In addition, the specific
nature of these perturbations is not a random artifact of learning: the same
perturbation can cause a different network, that was trained on a different
subset of the dataset, to misclassify the same input.
Spatial Transformer Networks
Convolutional Neural Networks define an exceptionally powerful class of
models, but are still limited by the lack of ability to be spatially invariant
to the input data in a computationally and parameter efficient manner. In this
work we introduce a new learnable module, the Spatial Transformer, which
explicitly allows the spatial manipulation of data within the network. This
differentiable module can be inserted into existing convolutional
architectures, giving neural networks the ability to actively spatially
transform feature maps, conditional on the feature map itself, without any
extra training supervision or modification to the optimisation process. We show
that the use of spatial transformers results in models which learn invariance
to translation, scale, rotation and more generic warping, resulting in
state-of-the-art performance on several benchmarks, and for a number of classes
of transformations.
Polar Feature Based Deep Architectures for Automatic Modulation Classification Considering Channel Fading
To develop intelligent receivers, automatic modulation classification (AMC)
plays an important role for better spectrum utilization. The emerging deep
learning (DL) technique has received much attention in AMC due to its superior
performance in classifying data with deep structure. In this work, a novel
polar-based deep learning architecture with channel compensation network (CCN)
is proposed. Our test results show that learning features from polar domain
(r-theta) can improve recognition accuracy by 5% and reduce training overhead
by 48%. Besides, the proposed CCN is also robust to channel fading, such as
amplitude and phase offsets, and can improve the recognition accuracy by 14%
under practical channel environments.
Comment: 5 pages, accepted by the 2018 Sixth IEEE Global Conference on Signal
and Information Processin
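The polar (r-theta) representation mentioned above can be sketched as a simple coordinate change. The abstract does not state the input format, so this assumes the received signal is a list of in-phase/quadrature (I, Q) pairs; the function name `iq_to_polar` is hypothetical.

```python
import math
from typing import List, Tuple


def iq_to_polar(samples: List[Tuple[float, float]]) -> List[Tuple[float, float]]:
    """Convert (I, Q) samples to (r, theta) pairs.

    r is the amplitude and theta the phase in radians, in (-pi, pi].
    """
    return [(math.hypot(i, q), math.atan2(q, i)) for i, q in samples]
```

In this domain an amplitude offset only scales r and a phase offset only shifts theta, which is one plausible reason such offsets are easier to compensate there.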
Towards Distortion-Predictable Embedding of Neural Networks
Current research in Computer Vision has shown that Convolutional Neural
Networks (CNN) give state-of-the-art performance in many classification tasks
and Computer Vision problems. The embedding of CNN, which is the internal
representation produced by the last layer, can indirectly learn topological and
relational properties. Moreover, by using a suitable loss function, CNN models
can learn invariance to a wide range of non-linear distortions such as
rotation, viewpoint angle or lighting condition. In this work, new insights are
discovered about CNN embeddings and a new loss function is proposed, derived
from the contrastive loss, that creates models with more predictable mappings
and also quantifies distortions. In typical distortion-dependent methods, there
is no simple relation between the features of an image and the features of
its distorted version. Therefore, these methods must feed forward inputs
under every distortion in order to find the corresponding feature
representations. Our contribution takes a step towards embeddings where
features of distorted inputs are related and can be derived from each other
through the intensity of the distortion.
Comment: 54 pages, 28 figures. Master project at EPFL (Switzerland) in 2015.
For source code on GitHub, see https://github.com/axel-angel/master-projec
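For context, the standard contrastive loss that the proposed loss is said to derive from can be sketched as below. This is the widely used pairwise form, not the new loss from this work; the function name, the boolean `similar` flag, and the default margin are illustrative assumptions.

```python
import math
from typing import Sequence


def contrastive_loss(
    a: Sequence[float],
    b: Sequence[float],
    similar: bool,
    margin: float = 1.0,
) -> float:
    """Standard pairwise contrastive loss on two embeddings.

    Similar pairs are pulled together (loss grows with distance);
    dissimilar pairs are pushed apart until they are `margin` away.
    """
    d = math.dist(a, b)  # Euclidean distance, Python 3.8+
    if similar:
        return 0.5 * d * d
    return 0.5 * max(0.0, margin - d) ** 2
```

Note that this form only says whether two embeddings should be close or far; it carries no notion of how strong a distortion is.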
On the Use of Deep Learning for Blind Image Quality Assessment
In this work we investigate the use of deep learning for distortion-generic
blind image quality assessment. We report on different design choices, ranging
from the use of features extracted from pre-trained Convolutional Neural
Networks (CNNs) as a generic image description, to the use of features
extracted from a CNN fine-tuned for the image quality task. Our best proposal,
named DeepBIQ, estimates the image quality by average pooling the scores
predicted on multiple sub-regions of the original image. The score of each
sub-region is computed using a Support Vector Regression (SVR) machine taking
as input features extracted using a CNN fine-tuned for category-based image
quality assessment. Experimental results on the LIVE In the Wild Image Quality
Challenge Database and on the LIVE Image Quality Assessment Database show that
DeepBIQ outperforms the state-of-the-art methods compared, having a Linear
Correlation Coefficient (LCC) with human subjective scores of almost 0.91 and
0.98 respectively. Furthermore, in most of the cases, the quality score
predictions of DeepBIQ are closer to the average observer than those of a
generic human observer.
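The average-pooling step described above can be sketched as follows. How the sub-regions are chosen is not stated in the abstract, so a regular grid of square crops is assumed here; `crop_grid`, `pooled_quality`, and the `score_region` callable (standing in for the fine-tuned CNN features plus SVR) are hypothetical names.

```python
from typing import Callable, List

Image = List[List[float]]  # assumption: a grayscale image as rows of pixels


def crop_grid(image: Image, size: int, stride: int) -> List[Image]:
    """Cut square sub-regions of side `size` on a regular grid."""
    rows, cols = len(image), len(image[0])
    crops: List[Image] = []
    for top in range(0, rows - size + 1, stride):
        for left in range(0, cols - size + 1, stride):
            crops.append([row[left:left + size] for row in image[top:top + size]])
    return crops


def pooled_quality(
    image: Image,
    score_region: Callable[[Image], float],
    size: int,
    stride: int,
) -> float:
    """Average the per-region scores into one image-level quality score."""
    scores = [score_region(crop) for crop in crop_grid(image, size, stride)]
    if not scores:
        raise ValueError("image is smaller than the requested crop size")
    return sum(scores) / len(scores)
```

Any per-region predictor with the same signature can be passed as `score_region`; the pooling itself is just an arithmetic mean.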
Image Distortion Detection using Convolutional Neural Network
Image distortion classification and detection is an important task in many
applications. For example, when compressing images, if we know the exact
location of the distortion, then it is possible to re-compress images by
adjusting the local compression level dynamically. In this paper, we address
the problem of detecting the distortion region and classifying the distortion
type of a given image. We show that our model significantly outperforms the
state-of-the-art distortion classifier, and report accurate detection results
for the first time. We expect that such results prove the usefulness of our
approach in many potential applications such as image compression or distortion
restoration.
Comment: Accepted to ACPR 201
Can the early human visual system compete with Deep Neural Networks?
We study and compare the human visual system and state-of-the-art deep neural
networks on classification of distorted images. Different from previous works,
we limit the display time to 100ms to test only the early mechanisms of the
human visual system, without allowing time for any eye movements or other
higher level processes. Our findings show that the human visual system still
outperforms modern deep neural networks on blurry and noisy images. These
findings motivate future research into developing more robust deep networks.
Comment: Accepted as an oral paper at the Mutual Benefits of Cognitive and
Computer Vision Workshop (held in conjunction with ICCV2017