Distortion Robust Image Classification using Deep Convolutional Neural Network with Discrete Cosine Transform
Convolutional neural networks (CNNs) are good at image classification. However,
they are vulnerable to image quality degradation. Even a small amount of
distortion such as noise or blur can severely hamper the performance of these
CNN architectures. Most of the work in the literature strives to mitigate this
problem simply by fine-tuning a pre-trained CNN on mutually exclusive or a
union set of distorted training data. This iterative fine-tuning process with
all known types of distortion is exhaustive and the network struggles to handle
unseen distortions. In this work, we propose distortion robust DCT-Net, a
Discrete Cosine Transform based module integrated into a deep network which is
built on top of VGG16. Unlike other works in the literature, DCT-Net is "blind"
to the distortion type and level in an image both during training and testing.
As a part of the training process, the proposed DCT module discards input
information which mostly represents the contribution of high frequencies. The
DCT-Net is trained "blindly" only once and applied in generic situations without
further retraining. We also extend the idea of traditional dropout and present
a training-adaptive version of it. We evaluate our proposed method
against Gaussian blur, motion blur, salt and pepper noise, Gaussian noise and
speckle noise added to CIFAR-10/100 and ImageNet test sets. Experimental
results demonstrate that once trained, DCT-Net not only generalizes well to a
variety of unseen image distortions but also outperforms other methods in the
literature.
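The abstract above says the DCT module discards input information that mostly represents high frequencies. A minimal sketch of that general idea follows, in pure Python. It is not the authors' implementation: the function names (`dct_matrix`, `dct_lowpass`), the fixed square cutoff `keep`, and the use of a small square block are all assumptions made for illustration; the abstract does not specify how coefficients are selected.

```python
import math

def dct_matrix(n):
    """Orthonormal DCT-II basis as an n x n list of rows (hypothetical helper)."""
    rows = []
    for k in range(n):
        scale = math.sqrt(1.0 / n) if k == 0 else math.sqrt(2.0 / n)
        rows.append([scale * math.cos(math.pi * (2 * i + 1) * k / (2 * n))
                     for i in range(n)])
    return rows

def matmul(a, b):
    """Plain matrix product of two lists of rows."""
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*b)]
            for row in a]

def transpose(m):
    return [list(r) for r in zip(*m)]

def dct_lowpass(block, keep):
    """Zero every 2-D DCT coefficient whose row or column index is >= keep.

    `block` is a square list of rows. The fixed cutoff is an assumption;
    the paper's actual selection rule is not given in the abstract.
    """
    n = len(block)
    c = dct_matrix(n)
    ct = transpose(c)
    coeffs = matmul(matmul(c, block), ct)   # forward 2-D DCT: C X C^T
    for u in range(n):
        for v in range(n):
            if u >= keep or v >= keep:
                coeffs[u][v] = 0.0          # drop high-frequency content
    return matmul(matmul(ct, coeffs), c)    # inverse 2-D DCT: C^T Y C

# A checkerboard is the highest-frequency 4x4 pattern; keeping only the
# DC coefficient replaces it with its mean value.
checker = [[float((i + j) % 2) for j in range(4)] for i in range(4)]
smoothed = dct_lowpass(checker, keep=1)
```

With `keep` equal to the block size nothing is discarded and the input is reconstructed up to floating-point error, so the cutoff acts as a single knob for how much high-frequency detail survives.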
Deep Learning for Environmentally Robust Speech Recognition: An Overview of Recent Developments
Eliminating the negative effect of non-stationary environmental noise is a
long-standing research topic for automatic speech recognition that still
remains an important challenge. Data-driven supervised approaches, including
ones based on deep neural networks, have recently emerged as potential
alternatives to traditional unsupervised approaches and, with sufficient
training, can alleviate the shortcomings of the unsupervised methods in various
real-life acoustic environments. In this light, we review recently developed,
representative deep learning approaches for tackling non-stationary additive
and convolutional degradation of speech with the aim of providing guidelines
for those involved in the development of environmentally robust speech
recognition systems. We separately discuss single- and multi-channel techniques
developed for the front-end and back-end of speech recognition systems, as well
as joint front-end and back-end training frameworks.
Generative Compression
Traditional image and video compression algorithms rely on hand-crafted
encoder/decoder pairs (codecs) that lack adaptability and are agnostic to the
data being compressed. Here we describe the concept of generative compression,
the compression of data using generative models, and suggest that it is a
direction worth pursuing to produce more accurate and visually pleasing
reconstructions at much deeper compression levels for both image and video
data. We also demonstrate that generative compression is orders of magnitude
more resilient to bit error rates (e.g. from noisy wireless channels) than
traditional variable-length coding schemes.
Learning robust and efficient point cloud representations
The abstract is in the attachment.
Impact of Colour Variation on Robustness of Deep Neural Networks
Deep neural networks (DNNs) have shown state-of-the-art performance for
computer vision applications like image classification, segmentation and object
detection. However, recent advances have shown their vulnerability to manual
digital perturbations in the input data, namely adversarial attacks. The
accuracy of the networks is significantly affected by the data distribution of
their training dataset. Distortions or perturbations in the color space of input
images generate out-of-distribution data, which makes networks more likely to
misclassify them. In this work, we propose a color-variation dataset built by
distorting the RGB color of a subset of ImageNet with 27 different
combinations. The aim of our work is to study the impact of color variation on
the performance of DNNs. We perform experiments on several state-of-the-art DNN
architectures on the proposed dataset, and the result shows a significant
correlation between color variation and loss of accuracy. Furthermore, based on
the ResNet50 architecture, we report experiments on the performance
of recently proposed robust training techniques and strategies, such as AugMix,
revisit, and free normalizer, on our proposed dataset. Experimental results
indicate that these robust training techniques can improve the robustness of
deep networks to color variation. Comment: arXiv admin note: substantial text overlap with arXiv:2209.0213