Invisibility in billiards
The question of invisibility for bodies with a mirror surface is studied in the
framework of geometrical optics. We construct bodies that are invisible/have
zero resistance in two mutually orthogonal directions, and prove that there do
not exist bodies which are invisible/have zero resistance in all possible
directions of incidence.
Multi-scale Orderless Pooling of Deep Convolutional Activation Features
Deep convolutional neural networks (CNN) have shown their promise as a
universal representation for recognition. However, global CNN activations lack
geometric invariance, which limits their robustness for classification and
matching of highly variable scenes. To improve the invariance of CNN
activations without degrading their discriminative power, this paper presents a
simple but effective scheme called multi-scale orderless pooling (MOP-CNN).
This scheme extracts CNN activations for local patches at multiple scale
levels, performs orderless VLAD pooling of these activations at each level
separately, and concatenates the result. The resulting MOP-CNN representation
can be used as a generic feature for either supervised or unsupervised
recognition tasks, from image classification to instance-level retrieval; it
consistently outperforms global CNN activations without requiring any joint
training of prediction layers for a particular target dataset. In absolute
terms, it achieves state-of-the-art results on the challenging SUN397 and MIT
Indoor Scenes classification datasets, and competitive results on
ILSVRC2012/2013 classification and INRIA Holidays retrieval datasets.
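The pooling scheme described in this abstract (local activations per scale level, orderless VLAD pooling per level, then concatenation) might be sketched roughly as follows. This is a toy illustration in plain Python, not the authors' implementation: the names `vlad_pool` and `mop_pool`, the tiny 2-D descriptors, and the hand-picked centroids are all assumptions, and any dimensionality reduction or extra normalization the paper may apply is omitted.

```python
import math

def vlad_pool(descriptors, centroids):
    """VLAD pooling: sum residuals to the nearest centroid, then L2-normalize.

    descriptors: list of equal-length float lists (stand-ins for CNN patch activations).
    centroids:   list of float lists (a hypothetical, pre-learned codebook).
    """
    dim = len(centroids[0])
    acc = [[0.0] * dim for _ in centroids]
    for d in descriptors:
        # Index of the nearest centroid by squared Euclidean distance.
        k = min(range(len(centroids)),
                key=lambda i: sum((a - b) ** 2 for a, b in zip(d, centroids[i])))
        for j in range(dim):
            acc[k][j] += d[j] - centroids[k][j]
    flat = [v for row in acc for v in row]
    norm = math.sqrt(sum(v * v for v in flat))
    return [v / norm for v in flat] if norm > 0 else flat

def mop_pool(levels, codebooks):
    """Pool each scale level separately and concatenate the results."""
    out = []
    for descriptors, centroids in zip(levels, codebooks):
        out.extend(vlad_pool(descriptors, centroids))
    return out

# Toy usage: two scale levels sharing one illustrative codebook.
level = [[1.0, 0.0], [0.0, 1.0], [3.0, 2.0]]
codebook = [[0.0, 0.0], [2.0, 2.0]]
feature = mop_pool([level, level], [codebook, codebook])
```

Because the residuals are summed, the pooled vector does not depend on the order in which patches are visited, which is the sense in which the pooling is orderless.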
Short-segment heart sound classification using an ensemble of deep convolutional neural networks
This paper proposes a framework based on deep convolutional neural networks
(CNNs) for automatic heart sound classification using short segments of
individual heart beats. We design a 1D-CNN that directly learns features from
raw heart-sound signals, and a 2D-CNN that takes inputs of two-dimensional
time-frequency feature maps based on Mel-frequency cepstral coefficients
(MFCC). We further develop a time-frequency CNN ensemble (TF-ECNN) combining
the 1D-CNN and 2D-CNN based on score-level fusion of the class probabilities.
On the large PhysioNet CinC challenge 2016 database, the proposed CNN models
outperformed traditional classifiers based on support vector machine and hidden
Markov models with various hand-crafted time- and frequency-domain features.
Best classification scores with 89.22% accuracy and 89.94% sensitivity were
achieved by the ECNN, and 91.55% specificity and 88.82% modified accuracy by
the 2D-CNN alone on the test set. Comment: 8 pages, 1 figure, conferenc
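Score-level fusion of class probabilities, as mentioned for the TF-ECNN above, can be illustrated with a small sketch. The abstract does not state the fusion rule, so the weighted average below is an assumption, as are the function name `fuse_scores` and the `weight` parameter.

```python
def fuse_scores(p_1d, p_2d, weight=0.5):
    """Fuse two class-probability vectors by a weighted average (assumed rule).

    p_1d, p_2d: per-class probabilities from the 1D-CNN and the 2D-CNN.
    weight:     hypothetical mixing weight given to the 1D-CNN scores.
    Returns the fused probabilities and the index of the predicted class.
    """
    if len(p_1d) != len(p_2d):
        raise ValueError("both models must score the same set of classes")
    if not 0.0 <= weight <= 1.0:
        raise ValueError("weight must lie in [0, 1]")
    fused = [weight * a + (1.0 - weight) * b for a, b in zip(p_1d, p_2d)]
    label = max(range(len(fused)), key=fused.__getitem__)
    return fused, label

# Toy usage: the two models disagree, and the fused score decides.
fused, label = fuse_scores([0.6, 0.4], [0.2, 0.8])
```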
One-to-many face recognition with bilinear CNNs
The recent explosive growth in convolutional neural network (CNN) research
has produced a variety of new architectures for deep learning. One intriguing
new architecture is the bilinear CNN (B-CNN), which has shown dramatic
performance gains on certain fine-grained recognition problems [15]. We apply
this new CNN to the challenging new face recognition benchmark, the IARPA Janus
Benchmark A (IJB-A) [12]. It features faces from a large number of identities
in challenging real-world conditions. Because the face images were not
identified automatically using a computerized face detection system, it does
not have the bias inherent in such a database. We demonstrate the performance
of the B-CNN model beginning from an AlexNet-style network pre-trained on
ImageNet. We then show results for fine-tuning using a moderate-sized and
public external database, FaceScrub [17]. We also present results with
additional fine-tuning on the limited training data provided by the protocol.
In each case, the fine-tuned bilinear model shows substantial improvements over
the standard CNN. Finally, we demonstrate how a standard CNN pre-trained on a
large face database, the recently released VGG-Face model [20], can be
converted into a B-CNN without any additional feature training. This B-CNN
improves upon the CNN performance on the IJB-A benchmark, achieving 89.5%
rank-1 recall. Comment: Published version at WACV 201
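The abstract does not spell out how a bilinear CNN combines features. As a rough illustration, bilinear pooling in its common formulation takes the outer product of two feature vectors at each spatial location, sums over locations, and normalizes the result. The sketch below follows that general formulation in plain Python. The name `bilinear_pool` and the toy inputs are assumptions, and nothing here is taken from the authors' code.

```python
import math

def bilinear_pool(feats_a, feats_b):
    """Sum-pooled outer product of two feature streams, then normalization.

    feats_a, feats_b: lists with one feature vector per spatial location.
    Passing the same list twice mimics reusing a single network for both streams.
    """
    n, m = len(feats_a[0]), len(feats_b[0])
    acc = [0.0] * (n * m)
    for fa, fb in zip(feats_a, feats_b):
        for i in range(n):
            for j in range(m):
                acc[i * m + j] += fa[i] * fb[j]
    # Signed square root followed by L2 normalization.
    signed = [math.copysign(math.sqrt(abs(v)), v) for v in acc]
    norm = math.sqrt(sum(v * v for v in signed))
    return [v / norm for v in signed] if norm > 0 else signed

# Toy usage: two locations, 2-D features, same stream used twice.
locations = [[1.0, 0.0], [0.0, 1.0]]
descriptor = bilinear_pool(locations, locations)
```

Since the pooling step itself has no learned parameters, a sketch like this is consistent with the abstract's remark that a pre-trained CNN can be converted into a B-CNN without additional feature training.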
A multi-task learning CNN for image steganalysis
Convolutional neural network (CNN) based image steganalysis methods are increasingly popular because of their superior accuracy. The most straightforward way to employ a CNN for image steganalysis is to learn a CNN-based classifier that decides whether a secret message has been embedded in an image. However, such a classifier is difficult to learn because the stego signal is weak and the useful information is limited. To address this issue, this paper proposes a multi-task learning CNN. In addition to the typical use of a CNN, learning a classifier for the whole image, our multi-task CNN is trained with an auxiliary pixel-level binary classification task that estimates whether each pixel in an image has been modified by steganography. To the best of our knowledge, we are the first to employ a CNN to perform pixel-level classification of this type. Experimental results demonstrate the effectiveness and efficiency of the proposed multi-task learning CNN.
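One way to picture the multi-task objective described above is as an image-level classification loss plus a weighted pixel-level loss. The sketch below assumes binary cross-entropy for both tasks and a hypothetical weighting factor `aux_weight`. The abstract does not give the actual loss, so this only shows the general shape of such an objective.

```python
import math

def bce(p, y, eps=1e-7):
    """Binary cross-entropy for one predicted probability p and label y in {0, 1}."""
    p = min(max(p, eps), 1.0 - eps)  # clamp to avoid log(0)
    return -(y * math.log(p) + (1 - y) * math.log(1.0 - p))

def multitask_loss(image_prob, image_label, pixel_probs, pixel_labels, aux_weight=0.5):
    """Image-level loss plus a weighted mean pixel-level loss (assumed form).

    image_prob:   predicted probability that the image carries a hidden message.
    pixel_probs:  predicted probability, per pixel, that the pixel was modified.
    aux_weight:   hypothetical weight of the auxiliary pixel task.
    """
    main = bce(image_prob, image_label)
    aux = sum(bce(p, y) for p, y in zip(pixel_probs, pixel_labels)) / len(pixel_probs)
    return main + aux_weight * aux

# Toy usage: an uninformative prediction on a stego image with two pixels.
loss = multitask_loss(0.5, 1, [0.5, 0.5], [1, 0])
```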
P-CNN: Pose-based CNN Features for Action Recognition
This work targets human action recognition in video. While recent methods
typically represent actions by statistics of local video features, here we
argue for the importance of a representation derived from human pose. To this
end we propose a new Pose-based Convolutional Neural Network descriptor (P-CNN)
for action recognition. The descriptor aggregates motion and appearance
information along tracks of human body parts. We investigate different schemes
of temporal aggregation and experiment with P-CNN features obtained both for
automatically estimated and manually annotated human poses. We evaluate our
method on the recent and challenging JHMDB and MPII Cooking datasets. For both
datasets our method shows consistent improvement over the state of the art. Comment: ICCV, December 2015, Santiago, Chil
