Self-tuned Visual Subclass Learning with Shared Samples: An Incremental Approach
Computer vision tasks are traditionally defined and evaluated using semantic
categories. However, it is well known in the field that semantic classes do not
necessarily correspond to a unique visual class (e.g. the inside and the outside
of a car). Furthermore, many of the learning techniques feasible in practice
cannot model a visual class that appears consistent to the human eye. These
problems have motivated the use of: 1) unsupervised or supervised clustering as
a preprocessing step to identify the visual subclasses used in a
mixture-of-experts learning regime; 2) latent variables for mixture assignment,
optimized during learning, as in the part model of Felzenszwalb et al. and
related work; and 3) highly non-linear classifiers that can inherently model a
multi-modal input space but are inefficient at test time. In this work, we
promote an incremental view of recognizing semantic classes with varied
appearance. We propose an optimization technique that incrementally finds
maximal visual subclasses in a regularized risk minimization framework. Our
proposed approach unifies the clustering and classification steps in a single
algorithm. Its importance lies in its compatibility with the classifier: unlike
preprocessing clustering methods, it does not need to know the number of
clusters, the representation, or the similarity measure a priori. Following this
approach, we report significant qualitative and quantitative results. We show
that the visual subclasses follow a long-tailed distribution. Finally, we show
that state-of-the-art object detection methods (e.g. DPM) are unable to use the
tail of this distribution, which comprises 50% of the training samples; in fact,
DPM performance slightly increases on average when this half of the data is
removed.
Comment: Updated ICCV 2013 submission
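To make the mixture-of-experts setting contrasted above concrete, the following
is a minimal, generic sketch of latent-subclass training that alternates between
assigning each positive sample to its best-scoring subclass and retraining one
linear classifier per subclass against shared negatives. The function name, the
use of LinearSVC, and all parameter values are illustrative assumptions; the
paper's incremental maximal-subclass optimization is not reproduced here.

    import numpy as np
    from sklearn.svm import LinearSVC

    def latent_subclass_training(X_pos, X_neg, n_subclasses=3, n_iters=5):
        """Generic latent-subclass baseline (illustrative, not the paper's
        method): alternate between fitting one linear classifier per subclass
        against shared negatives and reassigning each positive sample to its
        highest-scoring subclass."""
        assign = np.arange(len(X_pos)) % n_subclasses        # round-robin init
        classifiers = [LinearSVC(C=1.0) for _ in range(n_subclasses)]
        y_neg = np.zeros(len(X_neg))
        for _ in range(n_iters):
            # fit each subclass classifier on its currently assigned positives
            for k, clf in enumerate(classifiers):
                X_k = X_pos[assign == k]
                if len(X_k) == 0:
                    continue                                 # keep previous fit
                clf.fit(np.vstack([X_k, X_neg]),
                        np.concatenate([np.ones(len(X_k)), y_neg]))
            # reassign each positive to the subclass that scores it highest
            scores = np.stack([clf.decision_function(X_pos)
                               for clf in classifiers], axis=1)
            assign = scores.argmax(axis=1)
        return classifiers, assign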
Generalized Jensen-Shannon Divergence Loss for Learning with Noisy Labels
Prior works have found it beneficial to combine provably noise-robust loss
functions, e.g., mean absolute error (MAE), with standard categorical loss
functions, e.g., cross entropy (CE), to improve their learnability. Here, we
propose to use the Jensen-Shannon divergence as a noise-robust loss function and
show that, interestingly, it interpolates between CE and MAE with a controllable
mixing parameter. Furthermore, we make a crucial observation that CE exhibits
lower consistency around noisy data points. Based on this observation, we adopt
a generalized version of the Jensen-Shannon divergence for multiple
distributions to encourage consistency around data points. Using this loss
function, we show state-of-the-art results on both synthetic (CIFAR) and
real-world (e.g., WebVision) noise with varying noise rates.
Comment: Neural Information Processing Systems (NeurIPS 2021)
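As a rough illustration of the idea, below is a minimal sketch of a
Jensen-Shannon-style loss between the softmax prediction and the one-hot label
distribution, weighted by a mixing parameter pi. The function name, the default
value of pi, and the omission of the paper's normalization and multi-distribution
consistency term are simplifications of this sketch, not the paper's exact
formulation.

    import torch
    import torch.nn.functional as F

    def js_divergence_loss(logits, targets, pi=0.5, eps=1e-7):
        """Jensen-Shannon-style loss between the softmax prediction and the
        one-hot label distribution (illustrative sketch only).

        logits: (N, C) float tensor; targets: (N,) long tensor of class indices.
        """
        num_classes = logits.size(-1)
        p = F.one_hot(targets, num_classes).float()   # label distribution
        q = F.softmax(logits, dim=-1)                 # predicted distribution
        m = pi * p + (1.0 - pi) * q                   # mixture of the two
        kl_pm = (p * (torch.log(p + eps) - torch.log(m + eps))).sum(dim=-1)
        kl_qm = (q * (torch.log(q + eps) - torch.log(m + eps))).sum(dim=-1)
        return (pi * kl_pm + (1.0 - pi) * kl_qm).mean()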
CNN Features off-the-shelf: an Astounding Baseline for Recognition
Recent results indicate that generic descriptors extracted from convolutional
neural networks are very powerful. This paper adds to the mounting evidence that
this is indeed the case. We report on a series of experiments conducted for
different recognition tasks using the publicly available code and model of the
OverFeat network, which was trained to perform object classification on
ILSVRC13. We use features extracted from the OverFeat network as a generic image
representation to tackle a diverse range of recognition tasks: object image
classification, scene recognition, fine-grained recognition, attribute detection,
and image retrieval, applied to a diverse set of datasets. We selected these
tasks and datasets as they gradually move further away from the original task
and data the OverFeat network was trained to solve. Astonishingly, we report
consistently superior results compared to the highly tuned state-of-the-art
systems in all the visual classification tasks on various datasets. For instance
retrieval, it consistently outperforms low-memory-footprint methods, except on
the sculptures dataset. The results are achieved using a linear SVM classifier
(or distance in the case of retrieval) applied to a feature representation of
size 4096 extracted from a layer in the net. The representations are further
modified using simple augmentation techniques, e.g., jittering. The results
strongly suggest that features obtained from deep learning with convolutional
nets should be the primary candidate in most visual recognition tasks.
Comment: Version 3 revisions: 1) added results using feature processing and
data augmentation; 2) referring to the most recent efforts of using CNNs for
different visual recognition tasks; 3) updated text/caption
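The pipeline described above reduces, at training time, to a linear classifier
on fixed ConvNet activations. Below is a minimal sketch of that classification
step; the feature extraction itself (OverFeat in the paper) is assumed to have
already produced the 4096-dimensional descriptors, and the L2 normalization and
C value are common choices rather than the paper's exact settings.

    import numpy as np
    from sklearn.svm import LinearSVC

    def train_on_cnn_features(train_feats, train_labels, C=1.0):
        """Linear SVM on fixed, off-the-shelf CNN descriptors (e.g., 4096-d
        activations of a pre-trained network)."""
        # L2-normalize each descriptor before the linear classifier
        feats = train_feats / np.linalg.norm(train_feats, axis=1, keepdims=True)
        clf = LinearSVC(C=C)
        clf.fit(feats, train_labels)
        return clf

    # Usage with placeholder arrays standing in for extracted descriptors:
    # feats = np.random.randn(100, 4096); labels = np.random.randint(0, 10, 100)
    # clf = train_on_cnn_features(feats, labels)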
Spotlight the Negatives: A Generalized Discriminative Latent Model
Discriminative latent variable models (LVMs) are frequently applied to various
visual recognition tasks. In these systems the latent (hidden) variables
provide a formalism for modeling structured variation of visual features.
Conventionally, latent variables are defined on the variation of the
foreground (positive) class. In this work we augment LVMs to include negative
latent variables corresponding to the background class. We formalize the
scoring function of such a generalized LVM (GLVM). Then we discuss a framework
for learning a model based on the GLVM scoring function. We theoretically
showcase how some current visual recognition methods can benefit from
this generalization. Finally, we experiment on a generalized form of Deformable
Part Models with negative latent variables and show significant improvements on
two different detection tasks.
Comment: Published in the proceedings of BMVC 2015
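One plausible reading of a scoring function with latent variables on both
classes is the best foreground (positive) latent configuration score minus the
best background (negative) one, sketched below. The function and argument names
and the exact way the two terms are combined are assumptions made for
illustration, not the formalization given in the paper.

    import numpy as np

    def glvm_style_score(x, pos_weights, neg_weights, features):
        """Illustrative GLVM-style score: best positive latent configuration
        minus best negative latent configuration.

        features(x, z) maps an input and a latent choice z to a feature vector;
        pos_weights / neg_weights are lists of weight vectors, one per latent
        configuration. All names are hypothetical placeholders."""
        pos = max(float(np.dot(w, features(x, z)))
                  for z, w in enumerate(pos_weights))
        neg = max(float(np.dot(w, features(x, z)))
                  for z, w in enumerate(neg_weights))
        return pos - neg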
Logistic-Normal Likelihoods for Heteroscedastic Label Noise
A natural way of estimating heteroscedastic label noise in regression is to
model the observed (potentially noisy) target as a sample from a normal
distribution, whose parameters can be learned by minimizing the negative
log-likelihood. This formulation has desirable loss attenuation properties, as
it reduces the contribution of high-error examples. Intuitively, this behavior
can improve robustness against label noise by reducing overfitting. We propose
an extension of this simple and probabilistic approach to classification that
has the same desirable loss attenuation properties. Furthermore, we discuss and
address some practical challenges of this extension. We evaluate the
effectiveness of the method by measuring its robustness against label noise in
classification. We perform enlightening experiments exploring the inner
workings of the method, including sensitivity to hyperparameters, ablation
studies, and other insightful analyses.
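The regression formulation this abstract starts from is the standard
heteroscedastic Gaussian negative log-likelihood, in which a network predicts a
mean and a log-variance per example; a minimal sketch is below. The paper's
logistic-normal extension to classification is not reproduced here.

    import torch

    def heteroscedastic_gaussian_nll(mu, log_var, y):
        """Negative log-likelihood of targets y under N(mu, exp(log_var)),
        averaged over the batch and up to an additive constant.

        The exp(-log_var) factor attenuates the squared error of examples with
        high predicted variance (likely noisy labels), while the +log_var term
        keeps the network from inflating the variance everywhere."""
        inv_var = torch.exp(-log_var)
        return 0.5 * (inv_var * (y - mu) ** 2 + log_var).mean()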
On the Lipschitz Constant of Deep Networks and Double Descent
Existing bounds on the generalization error of deep networks assume some form
of smooth or bounded dependence on the input variable, falling short of
investigating the mechanisms controlling such factors in practice. In this
work, we present an extensive experimental study of the empirical Lipschitz
constant of deep networks undergoing double descent, and highlight
non-monotonic trends that strongly correlate with the test error. Building a
connection between parameter-space and input-space gradients for SGD around a
critical point, we isolate two important factors, namely loss landscape
curvature and distance of the parameters from initialization, which respectively
control the optimization dynamics around a critical point and bound model
function complexity, even beyond the training data. Our study presents novel
insights on implicit regularization via overparameterization and effective
model complexity for networks trained in practice.
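One common way to probe the empirical Lipschitz constant is to lower-bound it by
the largest input-gradient norm observed over the data; a minimal sketch is
below. Taking the gradient of the output norm and using the Euclidean norm are
assumptions of this sketch and may differ from the estimator used in the study.

    import torch

    def empirical_lipschitz_lower_bound(model, loader, device="cpu"):
        """Lower-bounds the empirical Lipschitz constant of `model` by the
        largest per-sample input-gradient norm seen over a data loader
        (illustrative proxy only)."""
        model.eval()
        worst = 0.0
        for x, _ in loader:
            x = x.to(device).requires_grad_(True)
            out = model(x)
            # gradient of a scalar summary (the output norm) w.r.t. the inputs
            (grad,) = torch.autograd.grad(out.norm(), x)
            worst = max(worst, grad.flatten(1).norm(dim=1).max().item())
        return worst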