1,343 research outputs found
Effects of Degradations on Deep Neural Network Architectures
Recently, image classification methods based on capsules (groups of neurons)
and a novel dynamic routing protocol are proposed. The methods show promising
performances than the state-of-the-art CNN-based models in some of the existing
datasets. However, the behavior of capsule-based models and CNN-based models
are largely unknown in presence of noise. So it is important to study the
performance of these models under various noises. In this paper, we demonstrate
the effect of image degradations on deep neural network architectures for image
classification task. We select six widely used CNN architectures to analyse
their performances for image classification task on datasets of various
distortions. Our work has three main contributions: 1) we observe the effects
of degradations on different CNN models; 2) accordingly, we propose a network
setup that can enhance the robustness of any CNN architecture for certain
degradations, and 3) we propose a new capsule network that achieves high
recognition accuracy. To the best of our knowledge, this is the first study on
the performance of CapsuleNet (CapsNet) and other state-of-the-art CNN
architectures under different types of image degradations. Also, our datasets
and source code are available publicly to the researchers.Comment: Journa
How Image Degradations Affect Deep CNN-based Face Recognition?
Face recognition approaches that are based on deep convolutional neural
networks (CNN) have been dominating the field. The performance improvements
they have provided in the so called in-the-wild datasets are significant,
however, their performance under image quality degradations have not been
assessed, yet. This is particularly important, since in real-world face
recognition applications, images may contain various kinds of degradations due
to motion blur, noise, compression artifacts, color distortions, and occlusion.
In this work, we have addressed this problem and analyzed the influence of
these image degradations on the performance of deep CNN-based face recognition
approaches using the standard LFW closed-set identification protocol. We have
evaluated three popular deep CNN models, namely, the AlexNet, VGG-Face, and
GoogLeNet. Results have indicated that blur, noise, and occlusion cause a
significant decrease in performance, while deep CNN models are found to be
robust to distortions, such as color distortions and change in color balance.Comment: 8 pages, 3 figure
Deep Learning for Single Image Super-Resolution: A Brief Review
Single image super-resolution (SISR) is a notoriously challenging ill-posed
problem, which aims to obtain a high-resolution (HR) output from one of its
low-resolution (LR) versions. To solve the SISR problem, recently powerful deep
learning algorithms have been employed and achieved the state-of-the-art
performance. In this survey, we review representative deep learning-based SISR
methods, and group them into two categories according to their major
contributions to two essential aspects of SISR: the exploration of efficient
neural network architectures for SISR, and the development of effective
optimization objectives for deep SISR learning. For each category, a baseline
is firstly established and several critical limitations of the baseline are
summarized. Then representative works on overcoming these limitations are
presented based on their original contents as well as our critical
understandings and analyses, and relevant comparisons are conducted from a
variety of perspectives. Finally we conclude this review with some vital
current challenges and future trends in SISR leveraging deep learning
algorithms.Comment: Accepted by IEEE Transactions on Multimedia (TMM
Learning to Rank Question-Answer Pairs using Hierarchical Recurrent Encoder with Latent Topic Clustering
In this paper, we propose a novel end-to-end neural architecture for ranking
candidate answers, that adapts a hierarchical recurrent neural network and a
latent topic clustering module. With our proposed model, a text is encoded to a
vector representation from an word-level to a chunk-level to effectively
capture the entire meaning. In particular, by adapting the hierarchical
structure, our model shows very small performance degradations in longer text
comprehension while other state-of-the-art recurrent neural network models
suffer from it. Additionally, the latent topic clustering module extracts
semantic information from target samples. This clustering module is useful for
any text related tasks by allowing each data sample to find its nearest topic
cluster, thus helping the neural network model analyze the entire data. We
evaluate our models on the Ubuntu Dialogue Corpus and consumer electronic
domain question answering dataset, which is related to Samsung products. The
proposed model shows state-of-the-art results for ranking question-answer
pairs.Comment: 10 pages, Accepted as a conference paper at NAACL 201
- …