A multi-task learning CNN for image steganalysis
Convolutional neural network (CNN) based image steganalysis methods are increasingly popular because of their superior accuracy. The most straightforward way to employ a CNN for image steganalysis is to learn a CNN-based classifier that determines whether secret messages have been embedded into an image. However, such a classifier is difficult to learn because the stego signal is weak and the useful information is limited. To address this issue, this paper proposes a multi-task learning CNN. In addition to the typical use of a CNN, learning a classifier for the whole image, our multi-task CNN is trained with an auxiliary pixel-level binary classification task that estimates whether each pixel in an image has been modified by steganography. To the best of our knowledge, we are the first to employ a CNN for pixel-level classification of this type. Experimental results demonstrate the effectiveness and efficiency of the proposed multi-task learning CNN.
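The combination of an image-level task with an auxiliary per-pixel task can be illustrated with a toy loss function. This is a minimal sketch under stated assumptions, not the paper's formulation: the abstract does not give the loss, so the use of binary cross-entropy for both tasks, the averaging over pixels, and the `aux_weight` parameter are all hypothetical choices made for illustration.

```python
import math


def bce(prob, label, eps=1e-12):
    """Binary cross-entropy for one prediction (prob of class 1, label in {0, 1})."""
    prob = min(max(prob, eps), 1.0 - eps)  # clamp to avoid log(0)
    return -(label * math.log(prob) + (1 - label) * math.log(1.0 - prob))


def multitask_loss(image_prob, image_label, pixel_probs, pixel_labels, aux_weight=0.5):
    """Image-level loss plus a weighted mean of per-pixel losses.

    aux_weight is a hypothetical balancing factor, not a value from the paper.
    """
    main = bce(image_prob, image_label)
    aux = sum(bce(p, y) for p, y in zip(pixel_probs, pixel_labels)) / len(pixel_probs)
    return main + aux_weight * aux


# Toy inputs: one stego image (label 1) with two pixels, one modified and one not.
loss = multitask_loss(0.5, 1, [0.5, 0.5], [1, 0])
```

In a real network both terms would be computed from shared convolutional features, so the gradient of the pixel-level term also shapes the features used by the image-level classifier.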
Short-segment heart sound classification using an ensemble of deep convolutional neural networks
This paper proposes a framework based on deep convolutional neural networks
(CNNs) for automatic heart sound classification using short-segments of
individual heart beats. We design a 1D-CNN that directly learns features from
raw heart-sound signals, and a 2D-CNN that takes inputs of two-dimensional
time-frequency feature maps based on Mel-frequency cepstral coefficients
(MFCC). We further develop a time-frequency CNN ensemble (TF-ECNN) combining
the 1D-CNN and 2D-CNN based on score-level fusion of the class probabilities.
On the large PhysioNet CinC challenge 2016 database, the proposed CNN models
outperformed traditional classifiers based on support vector machine and hidden
Markov models with various hand-crafted time- and frequency-domain features.
Best classification scores with 89.22% accuracy and 89.94% sensitivity were
achieved by the ECNN, and 91.55% specificity and 88.82% modified accuracy by
the 2D-CNN alone on the test set. Comment: 8 pages, 1 figure, conferenc
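Score-level fusion of class probabilities, as described for the 1D-CNN and 2D-CNN ensemble, can be sketched in a few lines. The abstract does not state the fusion rule, so the weighted average used here, the `weight` parameter, and the example probabilities are assumptions for illustration only.

```python
def fuse_scores(probs_a, probs_b, weight=0.5):
    """Weighted average of two class-probability vectors, renormalized to sum to 1.

    weight is a hypothetical mixing factor; 0.5 gives a plain average.
    """
    if len(probs_a) != len(probs_b):
        raise ValueError("both models must score the same set of classes")
    fused = [weight * a + (1.0 - weight) * b for a, b in zip(probs_a, probs_b)]
    total = sum(fused)
    return [f / total for f in fused]


def predict(probs):
    """Index of the class with the highest fused probability."""
    return max(range(len(probs)), key=lambda i: probs[i])


# Made-up outputs for one heartbeat segment: [normal, abnormal].
p_1d = [0.60, 0.40]  # stand-in for the 1D-CNN on the raw signal
p_2d = [0.30, 0.70]  # stand-in for the 2D-CNN on MFCC maps
fused = fuse_scores(p_1d, p_2d)
label = predict(fused)
```

Fusing at the score level keeps the two networks independent: each can be trained and tuned separately, and only their output probabilities are combined.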
When Face Recognition Meets with Deep Learning: an Evaluation of Convolutional Neural Networks for Face Recognition
Deep learning, in particular Convolutional Neural Network (CNN), has achieved
promising results in face recognition recently. However, it remains an open
question: why CNNs work well and how to design a 'good' architecture. The
existing works tend to focus on reporting CNN architectures that work well for
face recognition rather than investigate the reason. In this work, we conduct
an extensive evaluation of CNN-based face recognition systems (CNN-FRS) on a
common ground to make our work easily reproducible. Specifically, we use public
database LFW (Labeled Faces in the Wild) to train CNNs, unlike most existing
CNNs trained on private databases. We propose three CNN architectures which are
the first reported architectures trained using LFW data. This paper
quantitatively compares the architectures of CNNs and evaluates the effect of
different implementation choices. We identify several useful properties of
CNN-FRS. For instance, the dimensionality of the learned features can be
significantly reduced without adverse effect on face recognition accuracy. In
addition, a traditional metric learning method exploiting CNN-learned features is
evaluated. Experiments show that two crucial factors for good CNN-FRS performance are
the fusion of multiple CNNs and metric learning. To make our work reproducible,
source code and models will be made publicly available. Comment: 7 pages, 4 figures, 7 table
Genetic CNN
The deep Convolutional Neural Network (CNN) is the state-of-the-art solution
for large-scale visual recognition. Following basic principles such as
increasing the depth and constructing highway connections, researchers have
manually designed a lot of fixed network structures and verified their
effectiveness.
In this paper, we discuss the possibility of learning deep network structures
automatically. Note that the number of possible network structures increases
exponentially with the number of layers in the network, which inspires us to
adopt the genetic algorithm to efficiently traverse this large search space. We
first propose an encoding method to represent each network structure in a
fixed-length binary string, and initialize the genetic algorithm by generating
a set of randomized individuals. In each generation, we define standard genetic
operations, e.g., selection, mutation and crossover, to eliminate weak
individuals and then generate more competitive ones. The competitiveness of
each individual is defined as its recognition accuracy, which is obtained via
training the network from scratch and evaluating it on a validation set. We run
the genetic process on two small datasets, i.e., MNIST and CIFAR10,
demonstrating its ability to evolve and find high-quality structures which have
been little studied before. These structures are also transferable to the
large-scale ILSVRC2012 dataset. Comment: Submitted to CVPR 2017 (10 pages, 5 figures
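The genetic loop described in this abstract (fixed-length binary encoding, random initialization, then selection, crossover and mutation in each generation) can be sketched as follows. This is an illustration under assumptions, not the authors' implementation: tournament selection, single-point crossover, the mutation rate, and all function names are hypothetical choices, and `toy_fitness` merely counts 1-bits as a stand-in for training a network from scratch and measuring validation accuracy.

```python
import random


def random_individual(length, rng):
    """A network structure encoded as a fixed-length binary string."""
    return [rng.randint(0, 1) for _ in range(length)]


def mutate(bits, rate, rng):
    """Flip each bit independently with probability `rate`."""
    return [1 - b if rng.random() < rate else b for b in bits]


def crossover(a, b, rng):
    """Single-point crossover (an assumed operator) producing two children."""
    point = rng.randint(1, len(a) - 1)
    return a[:point] + b[point:], b[:point] + a[point:]


def tournament(population, fitness, rng, k=2):
    """Pick the fittest of k randomly drawn individuals (assumed selection rule)."""
    return max(rng.sample(population, k), key=fitness)


def evolve(fitness, length=12, pop_size=10, generations=5, rate=0.05, seed=0):
    """Run the genetic process; pop_size is expected to be even."""
    rng = random.Random(seed)
    population = [random_individual(length, rng) for _ in range(pop_size)]
    for _ in range(generations):
        parents = [tournament(population, fitness, rng) for _ in range(pop_size)]
        children = []
        for i in range(0, pop_size - 1, 2):
            c1, c2 = crossover(parents[i], parents[i + 1], rng)
            children += [mutate(c1, rate, rng), mutate(c2, rate, rng)]
        population = children
    return max(population, key=fitness)


def toy_fitness(bits):
    """Stand-in for validation accuracy: NOT a real network evaluation."""
    return sum(bits)


best = evolve(toy_fitness)
```

In the setting the abstract describes, each fitness call would mean training a decoded network from scratch, which is why the search is run on small datasets first.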
Multi-scale Orderless Pooling of Deep Convolutional Activation Features
Deep convolutional neural networks (CNN) have shown their promise as a
universal representation for recognition. However, global CNN activations lack
geometric invariance, which limits their robustness for classification and
matching of highly variable scenes. To improve the invariance of CNN
activations without degrading their discriminative power, this paper presents a
simple but effective scheme called multi-scale orderless pooling (MOP-CNN).
This scheme extracts CNN activations for local patches at multiple scale
levels, performs orderless VLAD pooling of these activations at each level
separately, and concatenates the result. The resulting MOP-CNN representation
can be used as a generic feature for either supervised or unsupervised
recognition tasks, from image classification to instance-level retrieval; it
consistently outperforms global CNN activations without requiring any joint
training of prediction layers for a particular target dataset. In absolute
terms, it achieves state-of-the-art results on the challenging SUN397 and MIT
Indoor Scenes classification datasets, and competitive results on
ILSVRC2012/2013 classification and INRIA Holidays retrieval datasets.
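The pooling scheme in this abstract (VLAD pooling of local activations at each scale level, then concatenation) can be sketched with plain lists. This is a simplified illustration under assumptions: the CNN feature extractor and the codebook learning step are omitted, the descriptors and centers are made-up 2-D values, and the function names are hypothetical.

```python
import math


def nearest(centers, x):
    """Index of the codebook center closest to descriptor x (squared Euclidean)."""
    return min(
        range(len(centers)),
        key=lambda k: sum((a - b) ** 2 for a, b in zip(x, centers[k])),
    )


def vlad(descriptors, centers):
    """Orderless VLAD pooling: sum residuals per center, flatten, L2-normalize."""
    dim = len(centers[0])
    residuals = [[0.0] * dim for _ in centers]
    for x in descriptors:
        k = nearest(centers, x)
        for j in range(dim):
            residuals[k][j] += x[j] - centers[k][j]
    flat = [v for row in residuals for v in row]
    norm = math.sqrt(sum(v * v for v in flat))
    return [v / norm for v in flat] if norm > 0 else flat


def multi_scale_pool(levels):
    """Pool each scale level separately and concatenate the results."""
    pooled = []
    for descriptors, centers in levels:
        pooled.extend(vlad(descriptors, centers))
    return pooled


# Toy data: patch activations at one scale level and a two-word codebook.
centers = [[0.0, 0.0], [10.0, 10.0]]
descriptors = [[1.0, 0.0], [0.0, 1.0], [11.0, 10.0]]
level_vector = vlad(descriptors, centers)
representation = multi_scale_pool([(descriptors, centers), (descriptors, centers)])
```

Because the residuals are summed, the result does not depend on the order or position of the patches, which is the source of the invariance the abstract refers to.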
One-to-many face recognition with bilinear CNNs
The recent explosive growth in convolutional neural network (CNN) research
has produced a variety of new architectures for deep learning. One intriguing
new architecture is the bilinear CNN (B-CNN), which has shown dramatic
performance gains on certain fine-grained recognition problems [15]. We apply
this new CNN to the challenging new face recognition benchmark, the IARPA Janus
Benchmark A (IJB-A) [12]. It features faces from a large number of identities
in challenging real-world conditions. Because the face images were not
identified automatically using a computerized face detection system, it does
not have the bias inherent in such a database. We demonstrate the performance
of the B-CNN model beginning from an AlexNet-style network pre-trained on
ImageNet. We then show results for fine-tuning using a moderate-sized and
public external database, FaceScrub [17]. We also present results with
additional fine-tuning on the limited training data provided by the protocol.
In each case, the fine-tuned bilinear model shows substantial improvements over
the standard CNN. Finally, we demonstrate how a standard CNN pre-trained on a
large face database, the recently released VGG-Face model [20], can be
converted into a B-CNN without any additional feature training. This B-CNN
improves upon the CNN performance on the IJB-A benchmark, achieving 89.5%
rank-1 recall. Comment: Published version at WACV 201
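The abstract does not spell out how a standard CNN becomes a B-CNN. In the usual bilinear CNN formulation, the features of two streams are combined by an outer product at every spatial location, sum-pooled over locations, and then passed through a signed square root and L2 normalization. The sketch below follows that general recipe; it is an assumption that this paper uses exactly these steps, and the function name and toy features are hypothetical. Passing the same features as both streams illustrates how an existing network's features could be reused without further feature training.

```python
import math


def bilinear_pool(feats_a, feats_b):
    """Bilinear pooling of two per-location feature lists.

    feats_a[l] and feats_b[l] are the feature vectors of the two streams
    at spatial location l.
    """
    da, db = len(feats_a[0]), len(feats_b[0])
    pooled = [0.0] * (da * db)
    for fa, fb in zip(feats_a, feats_b):
        for i in range(da):
            for j in range(db):
                pooled[i * db + j] += fa[i] * fb[j]  # outer product, summed over locations
    # Signed square root, then L2 normalization.
    signed = [math.copysign(math.sqrt(abs(v)), v) for v in pooled]
    norm = math.sqrt(sum(v * v for v in signed))
    return [v / norm for v in signed] if norm > 0 else signed


# Toy convolutional features: 2 spatial locations, 2 channels each.
feats = [[1.0, 2.0], [3.0, 0.0]]
descriptor = bilinear_pool(feats, feats)  # same features used for both streams
```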
