    DeepCorrect: Correcting DNN models against Image Distortions

    In recent years, the widespread use of deep neural networks (DNNs) has facilitated great improvements in performance for computer vision tasks like image classification and object recognition. In most realistic computer vision applications, an input image undergoes some form of image distortion such as blur and additive noise during image acquisition or transmission. Deep networks trained on pristine images perform poorly when tested on such distortions. In this paper, we evaluate the effect of image distortions like Gaussian blur and additive noise on the activations of pre-trained convolutional filters. We propose a metric to identify the most noise-susceptible convolutional filters and rank them in order of the highest gain in classification accuracy upon correction. In our proposed approach, called DeepCorrect, we apply small stacks of convolutional layers with residual connections at the output of these ranked filters and train them to correct the worst distortion-affected filter activations, while leaving the rest of the pre-trained filter outputs in the network unchanged. Performance results show that applying DeepCorrect models for common vision tasks like image classification (ImageNet), object recognition (Caltech-101, Caltech-256) and scene classification (SUN-397) significantly improves the robustness of DNNs against distorted images and outperforms alternative approaches. Comment: Accepted to IEEE Transactions on Image Processing, April 2019. For associated code, see https://github.com/tsborkar/DeepCorrec

    Improving Image Classification Robustness through Selective CNN-Filters Fine-Tuning

    Image quality plays a big role in CNN-based image classification performance. Fine-tuning the network with distorted samples may be too costly for large networks. To solve this issue, we propose a transfer learning approach that takes into account that, in each layer of a CNN, some filters are more susceptible to image distortion than others. Our method identifies the most susceptible filters and applies retraining only to the filters that show the largest activation-map distance between clean and distorted images. Filters are ranked using the Borda count election method and then only the most affected filters are fine-tuned. This significantly reduces the number of parameters to retrain. We evaluate this approach on the CIFAR-10 and CIFAR-100 datasets, testing it on two different models and two different types of distortion. Results show that the proposed transfer learning technique recovers most of the performance lost to input data distortion, and does so considerably faster than existing methods, thanks to the reduced number of parameters to fine-tune. When few noisy samples are provided for training, our filter-level fine-tuning performs particularly well, also outperforming state-of-the-art layer-level transfer learning approaches. Comment: arXiv admin note: text overlap with arXiv:1705.02406 by other author
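
    The Borda count ranking mentioned in this abstract can be illustrated with a minimal sketch. The abstract does not specify how ballots are formed, so the setup here is an assumption: each "ballot" is one image pair, mapping a filter name to the distance between its clean and distorted activation maps. The function names `borda_rank` and `top_k_filters` and the sample distances are hypothetical, not taken from the paper.

```python
from typing import Dict, List


def borda_rank(ballots: List[Dict[str, float]]) -> List[str]:
    """Rank filters from most to least distortion-affected via a Borda count.

    Assumption: each ballot maps filter name -> activation-map distance
    (clean vs. distorted) for one image pair; larger means more affected.
    """
    scores: Dict[str, int] = {}
    for ballot in ballots:
        # Order this ballot's filters from smallest to largest distance.
        ordered = sorted(ballot, key=ballot.get)
        # The least affected filter earns 0 points, the most affected n-1.
        for points, name in enumerate(ordered):
            scores[name] = scores.get(name, 0) + points
    # Highest total first; ties are broken by name so the result is stable.
    return sorted(scores, key=lambda name: (-scores[name], name))


def top_k_filters(ranking: List[str], k: int) -> List[str]:
    """Select the k highest-ranked filters as candidates for fine-tuning."""
    return ranking[:k]


# Made-up distances for three filters over three image pairs.
ballots = [
    {"f0": 0.1, "f1": 0.9, "f2": 0.5},
    {"f0": 0.2, "f1": 0.7, "f2": 0.8},
    {"f0": 0.3, "f1": 0.6, "f2": 0.4},
]
ranking = borda_rank(ballots)
selected = top_k_filters(ranking, 1)
```

    A rank-based vote of this kind depends only on the ordering within each ballot, so a single image pair with unusually large distances cannot dominate the final ranking.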

    Enhancing the Performance of Convolutional Neural Networks on Quality Degraded Datasets

    Despite the appeal of deep neural networks, which largely replace traditional handcrafted filters, they still suffer from isolated cases that cannot be handled properly by training convolutional filters alone. Abnormal factors, including real-world noise, blur, or other quality degradations, ruin the output of a neural network. These unexpected problems can produce critical complications, and it is surprising that there has been only minimal research into the effects of noise on deep neural network models. Therefore, we present an exhaustive investigation into the effect of noise on image classification and suggest a generalized dual-channel architecture for handling quality-degraded input images. We compare the proposed dual-channel model with a simple single model and show that it improves the overall performance of neural networks on various types of quality-degraded input datasets. Comment: The International Conference on Digital Image Computing: Techniques and Applications (DICTA), 201

    Intriguing properties of neural networks

    Deep neural networks are highly expressive models that have recently achieved state-of-the-art performance on speech and visual recognition tasks. While their expressiveness is the reason they succeed, it also causes them to learn uninterpretable solutions that could have counter-intuitive properties. In this paper we report two such properties. First, we find that there is no distinction between individual high-level units and random linear combinations of high-level units, according to various methods of unit analysis. This suggests that it is the space, rather than the individual units, that contains the semantic information in the high layers of neural networks. Second, we find that deep neural networks learn input-output mappings that are discontinuous to a significant extent. We can cause the network to misclassify an image by applying a certain imperceptible perturbation, which is found by maximizing the network's prediction error. In addition, the specific nature of these perturbations is not a random artifact of learning: the same perturbation can cause a different network, trained on a different subset of the dataset, to misclassify the same input.

    Spatial Transformer Networks

    Convolutional Neural Networks define an exceptionally powerful class of models, but are still limited by the lack of ability to be spatially invariant to the input data in a computationally and parameter-efficient manner. In this work we introduce a new learnable module, the Spatial Transformer, which explicitly allows the spatial manipulation of data within the network. This differentiable module can be inserted into existing convolutional architectures, giving neural networks the ability to actively spatially transform feature maps, conditional on the feature map itself, without any extra training supervision or modification to the optimisation process. We show that the use of spatial transformers results in models which learn invariance to translation, scale, rotation and more generic warping, resulting in state-of-the-art performance on several benchmarks, and for a number of classes of transformations.

    Polar Feature Based Deep Architectures for Automatic Modulation Classification Considering Channel Fading

    In the development of intelligent receivers, automatic modulation classification (AMC) plays an important role in achieving better spectrum utilization. The emerging deep learning (DL) technique has received much attention in AMC due to its superior performance in classifying data with deep structure. In this work, a novel polar-based deep learning architecture with a channel compensation network (CCN) is proposed. Our test results show that learning features from the polar domain (r-theta) can improve recognition accuracy by 5% and reduce training overhead by 48%. In addition, the proposed CCN is robust to channel fading, such as amplitude and phase offsets, and can improve the recognition accuracy by 14% under practical channel environments. Comment: 5 pages, accepted by the 2018 Sixth IEEE Global Conference on Signal and Information Processin

    Towards Distortion-Predictable Embedding of Neural Networks

    Current research in Computer Vision has shown that Convolutional Neural Networks (CNNs) give state-of-the-art performance in many classification tasks and Computer Vision problems. The embedding of a CNN, which is the internal representation produced by the last layer, can indirectly learn topological and relational properties. Moreover, by using a suitable loss function, CNN models can learn invariance to a wide range of non-linear distortions such as rotation, viewpoint angle or lighting condition. In this work, new insights are discovered about CNN embeddings and a new loss function is proposed, derived from the contrastive loss, that creates models with more predictable mappings and also quantifies distortions. In typical distortion-dependent methods, there is no simple relation between the features corresponding to one image and the features of the same image when distorted. Therefore, these methods require feeding forward inputs under every distortion in order to find the corresponding feature representations. Our contribution makes a step towards embeddings where features of distorted inputs are related and can be derived from each other by the intensity of the distortion. Comment: 54 pages, 28 figures. Master project at EPFL (Switzerland) in 2015. For source code on GitHub, see https://github.com/axel-angel/master-projec
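
    For context, the standard contrastive loss that this abstract builds on can be sketched as follows. This is the commonly used pairwise form, not the new loss proposed in the work, whose exact formulation the abstract does not give. The function name `contrastive_loss`, the default margin and the sample embeddings are illustrative assumptions.

```python
import math
from typing import Sequence


def contrastive_loss(
    a: Sequence[float],
    b: Sequence[float],
    similar: bool,
    margin: float = 1.0,
) -> float:
    """Standard pairwise contrastive loss on two embedding vectors.

    Similar pairs are pulled together (penalty grows with distance);
    dissimilar pairs are pushed apart until they are at least `margin` away.
    """
    d = math.dist(a, b)  # Euclidean distance (Python 3.8+)
    if similar:
        return 0.5 * d ** 2
    return 0.5 * max(0.0, margin - d) ** 2


# Hypothetical 2-D embeddings.
pulled = contrastive_loss([0.0, 0.0], [3.0, 4.0], similar=True)
pushed = contrastive_loss([0.0, 0.0], [0.0, 0.5], similar=False, margin=1.0)
far_apart = contrastive_loss([0.0, 0.0], [3.0, 4.0], similar=False, margin=1.0)
```

    Because dissimilar pairs contribute nothing once they are farther apart than the margin, this loss alone does not fix how far a distorted input lands from its original, which is the kind of relation the abstract says the proposed loss aims to make predictable.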

    On the Use of Deep Learning for Blind Image Quality Assessment

    In this work we investigate the use of deep learning for distortion-generic blind image quality assessment. We report on different design choices, ranging from the use of features extracted from pre-trained Convolutional Neural Networks (CNNs) as a generic image description, to the use of features extracted from a CNN fine-tuned for the image quality task. Our best proposal, named DeepBIQ, estimates the image quality by average-pooling the scores predicted on multiple sub-regions of the original image. The score of each sub-region is computed using a Support Vector Regression (SVR) machine taking as input features extracted using a CNN fine-tuned for category-based image quality assessment. Experimental results on the LIVE In the Wild Image Quality Challenge Database and on the LIVE Image Quality Assessment Database show that DeepBIQ outperforms the state-of-the-art methods compared, having a Linear Correlation Coefficient (LCC) with human subjective scores of almost 0.91 and 0.98 respectively. Furthermore, in most cases, the quality score predictions of DeepBIQ are closer to the average observer than those of a generic human observer.

    Image Distortion Detection using Convolutional Neural Network

    Image distortion classification and detection is an important task in many applications. For example, when compressing images, if we know the exact location of the distortion, then it is possible to re-compress images by adjusting the local compression level dynamically. In this paper, we address the problem of detecting the distortion region and classifying the distortion type of a given image. We show that our model significantly outperforms the state-of-the-art distortion classifier, and report accurate detection results for the first time. We expect these results to demonstrate the usefulness of our approach in many potential applications such as image compression or distortion restoration. Comment: Accepted to ACPR 201

    Can the early human visual system compete with Deep Neural Networks?

    We study and compare the human visual system and state-of-the-art deep neural networks on classification of distorted images. Unlike previous works, we limit the display time to 100 ms to test only the early mechanisms of the human visual system, without allowing time for any eye movements or other higher-level processes. Our findings show that the human visual system still outperforms modern deep neural networks on blurry and noisy images. These findings motivate future research into developing more robust deep networks. Comment: Accepted as an oral paper at the Mutual Benefits of Cognitive and Computer Vision Workshop (held in conjunction with ICCV2017