3 research outputs found

    Understanding How Image Quality Affects Deep Neural Networks

    Full text link
    Image quality is an important practical challenge that is often overlooked in the design of machine vision systems. Commonly, machine vision systems are trained and tested on high quality image datasets, yet in practical applications the input images can not be assumed to be of high quality. Recently, deep neural networks have obtained state-of-the-art performance on many machine vision tasks. In this paper we provide an evaluation of 4 state-of-the-art deep neural network models for image classification under quality distortions. We consider five types of quality distortions: blur, noise, contrast, JPEG, and JPEG2000 compression. We show that the existing networks are susceptible to these quality distortions, particularly to blur and noise. These results enable future work in developing deep neural networks that are more invariant to quality distortions.Comment: Final version will appear in IEEE Xplore in the Proceedings of the Conference on the Quality of Multimedia Experience (QoMEX), June 6-8, 201

    Effects of Degradations on Deep Neural Network Architectures

    Full text link
    Recently, image classification methods based on capsules (groups of neurons) and a novel dynamic routing protocol are proposed. The methods show promising performances than the state-of-the-art CNN-based models in some of the existing datasets. However, the behavior of capsule-based models and CNN-based models are largely unknown in presence of noise. So it is important to study the performance of these models under various noises. In this paper, we demonstrate the effect of image degradations on deep neural network architectures for image classification task. We select six widely used CNN architectures to analyse their performances for image classification task on datasets of various distortions. Our work has three main contributions: 1) we observe the effects of degradations on different CNN models; 2) accordingly, we propose a network setup that can enhance the robustness of any CNN architecture for certain degradations, and 3) we propose a new capsule network that achieves high recognition accuracy. To the best of our knowledge, this is the first study on the performance of CapsuleNet (CapsNet) and other state-of-the-art CNN architectures under different types of image degradations. Also, our datasets and source code are available publicly to the researchers.Comment: Journa

    Tree-Based Deep Mixture of Experts with Applications to Visual Saliency Prediction and Quality Robust Visual Recognition

    Get PDF
    abstract: Mixture of experts is a machine learning ensemble approach that consists of individual models that are trained to be ``experts'' on subsets of the data, and a gating network that provides weights to output a combination of the expert predictions. Mixture of experts models do not currently see wide use due to difficulty in training diverse experts and high computational requirements. This work presents modifications of the mixture of experts formulation that use domain knowledge to improve training, and incorporate parameter sharing among experts to reduce computational requirements. First, this work presents an application of mixture of experts models for quality robust visual recognition. First it is shown that human subjects outperform deep neural networks on classification of distorted images, and then propose a model, MixQualNet, that is more robust to distortions. The proposed model consists of ``experts'' that are trained on a particular type of image distortion. The final output of the model is a weighted sum of the expert models, where the weights are determined by a separate gating network. The proposed model also incorporates weight sharing to reduce the number of parameters, as well as increase performance. Second, an application of mixture of experts to predict visual saliency is presented. A computational saliency model attempts to predict where humans will look in an image. In the proposed model, each expert network is trained to predict saliency for a set of closely related images. The final saliency map is computed as a weighted mixture of the expert networks' outputs, with weights determined by a separate gating network. The proposed model achieves better performance than several other visual saliency models and a baseline non-mixture model. Finally, this work introduces a saliency model that is a weighted mixture of models trained for different levels of saliency. Levels of saliency include high saliency, which corresponds to regions where almost all subjects look, and low saliency, which corresponds to regions where some, but not all subjects look. The weighted mixture shows improved performance compared with baseline models because of the diversity of the individual model predictions.Dissertation/ThesisDoctoral Dissertation Electrical Engineering 201
    corecore