A Taxonomy of Deep Convolutional Neural Nets for Computer Vision
Traditional architectures for solving computer vision problems and the degree
of success they enjoyed have been heavily reliant on hand-crafted features.
However, of late, deep learning techniques have offered a compelling
alternative -- that of automatically learning problem-specific features. With
this new paradigm, every problem in computer vision is now being re-examined
from a deep learning perspective. Therefore, it has become important to
understand what kind of deep networks are suitable for a given problem.
Although general surveys of this fast-moving paradigm (i.e., deep networks)
exist, a survey specific to computer vision is missing. We specifically
consider one form of deep network widely used in computer vision:
convolutional neural networks (CNNs). We start with "AlexNet" as our base CNN
and then examine the broad variations proposed over time to suit different
applications. We hope that our recipe-style survey will serve as a guide,
particularly for novice practitioners intending to use deep-learning techniques
for computer vision.
Comment: Published in Frontiers in Robotics and AI (http://goo.gl/6691Bm
Kervolutional Neural Networks
Convolutional neural networks (CNNs) have enabled state-of-the-art
performance in many computer vision tasks. However, little effort has been
devoted to establishing convolution in non-linear space. Existing works mainly
rely on the activation layers, which can only provide point-wise
non-linearity. To solve this problem, a new operation, kervolution (kernel
convolution), is introduced to approximate complex behaviors of human
perception systems by leveraging the kernel trick. It generalizes convolution,
enhances the model capacity, and captures higher order interactions of
features, via patch-wise kernel functions, but without introducing additional
parameters. Extensive experiments show that kervolutional neural networks (KNN)
achieve higher accuracy and faster convergence than baseline CNN.
Comment: oral paper in CVPR 201
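
The idea in the abstract above, replacing the patch-wise inner product of convolution with a kernel function that adds no learnable parameters, can be illustrated with a minimal one-dimensional sketch. The polynomial kernel, its hyperparameters `c` and `d`, and all function names here are assumptions chosen for illustration; they are not taken from the listing and should not be read as the authors' implementation.

```python
def convolve1d(signal, weights):
    """Plain sliding-window inner product (cross-correlation, as used in CNNs)."""
    k = len(weights)
    return [
        sum(x * w for x, w in zip(signal[i:i + k], weights))
        for i in range(len(signal) - k + 1)
    ]


def kervolve1d(signal, weights, kernel):
    """Apply kernel(patch, weights) to every patch instead of the inner product."""
    k = len(weights)
    return [kernel(signal[i:i + k], weights) for i in range(len(signal) - k + 1)]


def linear_kernel(patch, weights):
    # With a linear kernel, kervolution reduces to ordinary convolution.
    return sum(x * w for x, w in zip(patch, weights))


def polynomial_kernel(patch, weights, c=1.0, d=2):
    # Hypothetical hyperparameters c and d; the weights are the only
    # learnable values, so no parameters are added relative to convolution.
    return (linear_kernel(patch, weights) + c) ** d


signal = [1.0, 2.0, 3.0, 4.0]
weights = [1.0, 0.5]

linear_out = kervolve1d(signal, weights, linear_kernel)
poly_out = kervolve1d(signal, weights, polynomial_kernel)
```

Here `linear_out` matches `convolve1d(signal, weights)`, while `poly_out` responds non-linearly to each patch as a whole rather than point-wise, using the same two weights.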
Advancements in Image Classification using Convolutional Neural Network
Convolutional neural networks (CNNs) are the state of the art for image
classification tasks. Here we briefly discuss the different components of a
CNN and explain different CNN architectures for image
classification. We show the advancements in CNNs from
LeNet-5 to the latest SENet model, discuss the model description and
training details of each model, and draw a comparison among those
models.
Comment: 9 pages, 15 figures, 3 tables. Submitted to 2018 Fourth International
Conference on Research in Computational Intelligence and Communication
Networks (ICRCICN 2018
Lip2AudSpec: Speech reconstruction from silent lip movements video
In this study, we propose a deep neural network for reconstructing
intelligible speech from silent lip movement videos. We use an auditory
spectrogram as the spectral representation of speech, together with its
corresponding sound generation method, resulting in more natural-sounding
reconstructed speech. Our proposed network consists of an autoencoder that
extracts bottleneck features from the auditory spectrogram, which are then
used as the target for our main lip reading network comprising CNN, LSTM,
and fully connected layers. Our
experiments show that the autoencoder is able to reconstruct the original
auditory spectrogram with a 98% correlation and also improves the quality of
reconstructed speech from the main lip reading network. Our model, trained
jointly on different speakers, is able to extract individual speaker
characteristics and gives promising results of reconstructing intelligible
speech with superior word recognition accuracy
- …