25 research outputs found
Multi-Modal Multi-Scale Deep Learning for Large-Scale Image Annotation
Image annotation aims to annotate a given image with a variable number of
class labels corresponding to diverse visual concepts. In this paper, we
address two main issues in large-scale image annotation: 1) how to learn a rich
feature representation suitable for predicting a diverse set of visual concepts
ranging from object, scene to abstract concept; 2) how to annotate an image
with the optimal number of class labels. To address the first issue, we propose
a novel multi-scale deep model for extracting rich and discriminative features
capable of representing a wide range of visual concepts. Specifically, a novel
two-branch deep neural network architecture is proposed which comprises a very
deep main network branch and a companion feature fusion network branch designed
for fusing the multi-scale features computed from the main branch. The deep
model is also made multi-modal by taking noisy user-provided tags as model
input to complement the image input. For tackling the second issue, we
introduce a label quantity prediction auxiliary task to the main label
prediction task to explicitly estimate the optimal label number for a given
image. Extensive experiments are carried out on two large-scale image
annotation benchmark datasets and the results show that our method
significantly outperforms the state-of-the-art.Comment: Submited to IEEE TI
Learning with Biased Complementary Labels
In this paper, we study the classification problem in which we have access to
easily obtainable surrogate for true labels, namely complementary labels, which
specify classes that observations do \textbf{not} belong to. Let and
be the true and complementary labels, respectively. We first model
the annotation of complementary labels via transition probabilities
, where is the number of
classes. Previous methods implicitly assume that , are identical, which is not true in practice because humans are
biased toward their own experience. For example, as shown in Figure 1, if an
annotator is more familiar with monkeys than prairie dogs when providing
complementary labels for meerkats, she is more likely to employ "monkey" as a
complementary label. We therefore reason that the transition probabilities will
be different. In this paper, we propose a framework that contributes three main
innovations to learning with \textbf{biased} complementary labels: (1) It
estimates transition probabilities with no bias. (2) It provides a general
method to modify traditional loss functions and extends standard deep neural
network classifiers to learn with biased complementary labels. (3) It
theoretically ensures that the classifier learned with complementary labels
converges to the optimal one learned with true labels. Comprehensive
experiments on several benchmark datasets validate the superiority of our
method to current state-of-the-art methods.Comment: ECCV 2018 Ora
FRCSyn Challenge at WACV 2024:Face Recognition Challenge in the Era of Synthetic Data
Despite the widespread adoption of face recognition technology around the
world, and its remarkable performance on current benchmarks, there are still
several challenges that must be covered in more detail. This paper offers an
overview of the Face Recognition Challenge in the Era of Synthetic Data
(FRCSyn) organized at WACV 2024. This is the first international challenge
aiming to explore the use of synthetic data in face recognition to address
existing limitations in the technology. Specifically, the FRCSyn Challenge
targets concerns related to data privacy issues, demographic biases,
generalization to unseen scenarios, and performance limitations in challenging
scenarios, including significant age disparities between enrollment and
testing, pose variations, and occlusions. The results achieved in the FRCSyn
Challenge, together with the proposed benchmark, contribute significantly to
the application of synthetic data to improve face recognition technology.Comment: 10 pages, 1 figure, WACV 2024 Workshop
Binary Classification with Instance and Label Dependent Label Noise
Learning with label dependent label noise has been extensively explored in
both theory and practice; however, dealing with instance (i.e., feature) and
label dependent label noise continues to be a challenging task. The difficulty
arises from the fact that the noise rate varies for each instance, making it
challenging to estimate accurately. The question of whether it is possible to
learn a reliable model using only noisy samples remains unresolved. We answer
this question with a theoretical analysis that provides matching upper and
lower bounds. Surprisingly, our results show that, without any additional
assumptions, empirical risk minimization achieves the optimal excess risk
bound. Specifically, we derive a novel excess risk bound proportional to the
noise level, which holds in very general settings, by comparing the empirical
risk minimizers obtained from clean samples and noisy samples. Second, we show
that the minimax lower bound for the 0-1 loss is a constant proportional to the
average noise rate. Our findings suggest that learning solely with noisy
samples is impossible without access to clean samples or strong assumptions on
the distribution of the data