Pairwise Confusion for Fine-Grained Visual Classification
Fine-Grained Visual Classification (FGVC) datasets contain small sample
sizes, along with significant intra-class variation and inter-class similarity.
While prior work has addressed intra-class variation using localization and
segmentation techniques, inter-class similarity may also affect feature
learning and reduce classification performance. In this work, we address this
problem using a novel optimization procedure for end-to-end neural network
training on FGVC tasks. Our procedure, called Pairwise Confusion (PC), reduces
overfitting by intentionally introducing confusion in the activations. With
PC regularization, we obtain state-of-the-art performance on six of the most
widely-used FGVC datasets and demonstrate improved localization ability. PC
is easy to implement, does not need excessive hyperparameter tuning during
training, and does not add significant overhead during test time.
Comment: Camera-Ready version for ECCV 201
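The abstract does not give the PC formula, so the following is only a minimal sketch of one way to "introduce confusion" between two training samples: the usual cross-entropy terms plus a penalty on the distance between the two predicted class distributions. The squared Euclidean distance, the rule of applying the penalty only to pairs with different labels, and the names `pairwise_confusion`, `pc_objective`, and `weight` are assumptions made for illustration, not details confirmed by the text above.

```python
import math


def softmax(logits):
    """Numerically stable softmax over a list of raw scores."""
    m = max(logits)
    exps = [math.exp(z - m) for z in logits]
    total = sum(exps)
    return [e / total for e in exps]


def cross_entropy(logits, label):
    """Negative log-probability assigned to the true class."""
    return -math.log(softmax(logits)[label])


def pairwise_confusion(logits_a, logits_b):
    """Assumed penalty: squared Euclidean distance between the two
    predicted class distributions. It is zero when both samples get
    the same prediction and grows as the predictions diverge."""
    pa, pb = softmax(logits_a), softmax(logits_b)
    return sum((x - y) ** 2 for x, y in zip(pa, pb))


def pc_objective(logits_a, label_a, logits_b, label_b, weight=0.1):
    """Cross-entropy on both samples plus a weighted confusion penalty.

    Assumption: the penalty is applied only when the pair has different
    labels, so the model is discouraged from separating similar classes
    with excessive confidence. `weight` is a hypothetical hyperparameter.
    """
    ce = cross_entropy(logits_a, label_a) + cross_entropy(logits_b, label_b)
    penalty = 0.0
    if label_a != label_b:
        penalty = pairwise_confusion(logits_a, logits_b)
    return ce + weight * penalty
```

Because the penalty only touches the training objective, a sketch like this is consistent with the abstract's claim that nothing is added at test time: inference still uses the plain forward pass.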
Kervolutional Neural Networks
Convolutional neural networks (CNNs) have enabled state-of-the-art
performance in many computer vision tasks. However, little effort has been
devoted to establishing convolution in non-linear space. Existing work mainly
relies on activation layers, which can only provide point-wise
non-linearity. To solve this problem, a new operation, kervolution (kernel
convolution), is introduced to approximate complex behaviors of human
perception systems by leveraging the kernel trick. It generalizes convolution,
enhances the model capacity, and captures higher order interactions of
features, via patch-wise kernel functions, but without introducing additional
parameters. Extensive experiments show that kervolutional neural networks (KNN)
achieve higher accuracy and faster convergence than baseline CNNs.
Comment: oral paper in CVPR 201
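To make the idea of a patch-wise kernel function concrete, here is a minimal 1-D sketch that replaces the inner product of a sliding window and a filter with a polynomial kernel of that inner product. The choice of a polynomial kernel, the 1-D setting, and the names `kervolve1d_poly`, `degree`, and `offset` are assumptions for illustration; the abstract does not say which kernels the authors use.

```python
def convolve1d(signal, weights):
    """Plain sliding-window inner product (cross-correlation, as in
    CNN layers), with no padding and stride 1."""
    k = len(weights)
    return [
        sum(signal[i + j] * weights[j] for j in range(k))
        for i in range(len(signal) - k + 1)
    ]


def kervolve1d_poly(signal, weights, degree=2, offset=1.0):
    """Assumed polynomial kervolution: (<patch, weights> + offset) ** degree.

    The learnable parameters are still just `weights`; `degree` and
    `offset` are fixed hyperparameters here, so the parameter count
    matches the plain convolution above.
    """
    k = len(weights)
    out = []
    for i in range(len(signal) - k + 1):
        inner = sum(signal[i + j] * weights[j] for j in range(k))
        out.append((inner + offset) ** degree)
    return out
```

With `degree=1` and `offset=0.0` the sketch reduces to `convolve1d`, which illustrates the sense in which the operation generalizes convolution; higher degrees mix the elements of each patch multiplicatively, which point-wise activations applied afterwards cannot do on their own.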
The DKU-OPPO System for the 2022 Spoofing-Aware Speaker Verification Challenge
This paper describes our DKU-OPPO system for the 2022 Spoofing-Aware Speaker
Verification (SASV) Challenge. First, we split the joint task into speaker
verification (SV) and spoofing countermeasure (CM) tasks, which are
optimized separately. For ASV systems, four state-of-the-art methods are
employed. For CM systems, we propose two methods on top of the challenge
baseline to further improve the performance, namely Embedding Random Sampling
Augmentation (ERSA) and One-Class Confusion Loss (OCCL). Second, we also explore
whether SV embeddings could help improve CM system performance. We observe a
dramatic performance degradation of existing CM systems on the
domain-mismatched VoxCeleb2 dataset. Third, we compare different fusion
strategies, including parallel score fusion and sequential cascaded systems.
Compared to the 1.71% SASV-EER baseline, our submitted cascaded system obtains
a 0.21% SASV-EER on the official challenge evaluation set.
Comment: Accepted by Interspeech202
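The two fusion strategies compared in the abstract can be contrasted with a small sketch. It assumes each trial already has an SV score and a CM score, both scaled so that higher means "more likely a genuine target trial"; the weighted-sum rule, the thresholds, and the names `parallel_score_fusion` and `cascaded_decision` are hypothetical and are not taken from the DKU-OPPO system.

```python
def parallel_score_fusion(sv_score, cm_score, cm_weight=0.5):
    """Parallel fusion (assumed form): both subsystems score the trial
    independently and the scores are combined by a weighted sum."""
    return (1.0 - cm_weight) * sv_score + cm_weight * cm_score


def cascaded_decision(sv_score, cm_score, cm_threshold=0.5, sv_threshold=0.5):
    """Sequential cascade (assumed form): the CM system acts as a gate.

    A trial flagged as spoofed is rejected outright; only trials that
    pass the CM gate are judged by the SV score.
    """
    if cm_score < cm_threshold:
        return False  # rejected as spoofed, SV score is ignored
    return sv_score >= sv_threshold
```

In the parallel form a very confident SV score can partly compensate for a poor CM score, whereas in the cascade a failed CM check is final. Which behaviour is preferable depends on how the thresholds and weights are tuned, which this sketch does not address.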