One-to-many face recognition with bilinear CNNs
The recent explosive growth in convolutional neural network (CNN) research
has produced a variety of new architectures for deep learning. One intriguing
new architecture is the bilinear CNN (B-CNN), which has shown dramatic
performance gains on certain fine-grained recognition problems [15]. We apply
this new CNN to the challenging new face recognition benchmark, the IARPA Janus
Benchmark A (IJB-A) [12]. It features faces from a large number of identities
in challenging real-world conditions. Because the face images were not
identified automatically using a computerized face detection system, the
benchmark does not have the bias inherent in databases collected that way. We
demonstrate the performance
of the B-CNN model beginning from an AlexNet-style network pre-trained on
ImageNet. We then show results for fine-tuning using a moderate-sized and
public external database, FaceScrub [17]. We also present results with
additional fine-tuning on the limited training data provided by the protocol.
In each case, the fine-tuned bilinear model shows substantial improvements over
the standard CNN. Finally, we demonstrate how a standard CNN pre-trained on a
large face database, the recently released VGG-Face model [20], can be
converted into a B-CNN without any additional feature training. This B-CNN
improves upon the CNN performance on the IJB-A benchmark, achieving 89.5%
rank-1 recall.
Comment: Published version at WACV 201
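The abstract does not spell out the pooling step, so the following is only a minimal sketch of bilinear pooling as it is commonly described: the outer product of two local feature vectors is summed over all image locations, followed by a signed square root and L2 normalisation. The function name `bilinear_pool` and the list-of-lists input layout are assumptions made for illustration, not the authors' implementation.

```python
import math


def bilinear_pool(features_a, features_b):
    """Pool two feature maps into one flat bilinear descriptor.

    features_a: list of L location vectors, each of length M
    features_b: list of L location vectors, each of length N
    Returns a flat list of length M * N.
    (Hypothetical helper; the layout is an assumption of this sketch.)
    """
    if len(features_a) != len(features_b):
        raise ValueError("both feature maps must cover the same locations")
    m, n = len(features_a[0]), len(features_b[0])
    pooled = [0.0] * (m * n)
    # Sum the outer product of the two local descriptors over all locations.
    for fa, fb in zip(features_a, features_b):
        for i in range(m):
            for j in range(n):
                pooled[i * n + j] += fa[i] * fb[j]
    # Signed square root, then L2 normalisation of the pooled vector.
    rooted = [math.copysign(math.sqrt(abs(v)), v) for v in pooled]
    norm = math.sqrt(sum(v * v for v in rooted))
    return [v / norm for v in rooted] if norm > 0 else rooted


# Two locations with two channels each; the same map feeds both streams.
conv_features = [[1.0, 0.0], [0.0, 2.0]]
descriptor = bilinear_pool(conv_features, conv_features)
```

Feeding the same feature map to both arguments, as in the usage lines, is one plausible reading of converting an existing CNN into a B-CNN without additional feature training, since no new weights are introduced by the pooling itself.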
Statistically Motivated Second Order Pooling
Second-order pooling, a.k.a. bilinear pooling, has proven effective for deep
learning based visual recognition. However, the resulting second-order networks
yield a final representation that is orders of magnitude larger than that of
standard, first-order ones, making them memory-intensive and cumbersome to
deploy. Here, we introduce a general, parametric compression strategy that can
produce more compact representations than existing compression techniques, yet
outperform both compressed and uncompressed second-order models. Our approach
is motivated by a statistical analysis of the network's activations, relying on
operations that lead to a Gaussian-distributed final representation, as
inherently used by first-order deep networks. As evidenced by our experiments,
this lets us outperform the state-of-the-art first-order and second-order
models on several benchmark recognition datasets.
Comment: Accepted to ECCV 2018. Camera-ready version. 14 pages, 5 figures, 3 tables
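To make the size claim in the abstract above concrete, here is a small arithmetic sketch comparing the length of a first-order (average-pooled) descriptor with an uncompressed second-order one. The channel count of 512 is an assumed example value, not a figure taken from the paper, and `representation_sizes` is a hypothetical helper.

```python
def representation_sizes(channels):
    """Return (first_order, second_order) descriptor lengths.

    First-order pooling keeps one value per channel; uncompressed
    second-order pooling keeps one value per pair of channels.
    """
    first_order = channels
    second_order = channels * channels
    return first_order, second_order


# 512 channels is an assumption chosen only for illustration.
first, second = representation_sizes(512)
ratio = second // first
```

Under this assumption the second-order descriptor has 262,144 entries against 512, a factor of 512, which is the kind of gap that motivates compression.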