Challenges in Disentangling Independent Factors of Variation
We study the problem of building models that disentangle independent factors
of variation. Such models could be used to encode features that can efficiently
be used for classification and to transfer attributes between different images
in image synthesis. As training data we use a weakly labeled set: each weak
label indicates which single factor has changed between two data samples,
although the relative value of the change is unknown. This labeling is of
particular interest as it may be readily available without annotation costs. To
make use of weak labels we introduce an autoencoder model and train it through
constraints on image pairs and triplets. We formally prove that without
additional knowledge there is no guarantee that two images with the same factor
of variation will be mapped to the same feature. We call this issue the
reference ambiguity. Moreover, we show the role of the feature dimensionality
and adversarial training. We demonstrate experimentally that the proposed model
can successfully transfer attributes on several datasets, but also show cases
where the reference ambiguity occurs. Comment: Submitted to ICLR 201
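The pair constraint described above, where only one factor changes between two samples and the magnitude of the change is unknown, can be sketched as a feature-matching penalty. The chunked feature layout and the function name below are illustrative assumptions, not the paper's actual model:

```python
import numpy as np

def weak_pair_loss(z1, z2, changed, n_factors):
    """Penalty on an encoded pair (z1, z2) under a weak label.

    Assumes the feature vector splits into n_factors equal chunks, one
    per factor of variation (an illustrative layout, not the paper's).
    The weak label `changed` names the single factor that differs, so
    every *other* chunk should match; the changed chunk is left
    unconstrained because the relative value of the change is unknown.
    """
    chunks1 = np.split(z1, n_factors)
    chunks2 = np.split(z2, n_factors)
    loss = 0.0
    for k, (c1, c2) in enumerate(zip(chunks1, chunks2)):
        if k != changed:
            loss += np.sum((c1 - c2) ** 2)
    return loss

# Toy check: two 6-dim codes split into 3 factors; only factor 1 differs,
# so the loss under the correct weak label is zero.
z1 = np.array([1.0, 1.0, 5.0, 5.0, 2.0, 2.0])
z2 = np.array([1.0, 1.0, 9.0, 9.0, 2.0, 2.0])
print(weak_pair_loss(z1, z2, changed=1, n_factors=3))  # 0.0
```

In a full model this penalty would be combined with a reconstruction loss and, as the abstract notes, adversarial training on swapped triplets.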
On the Foundations of Shortcut Learning
Deep-learning models can extract a rich assortment of features from data.
Which features a model uses depends not only on predictivity (how reliably a
feature indicates train-set labels) but also on availability (how easily the
feature can be extracted, or leveraged, from inputs). The literature on shortcut
learning has noted examples in which models privilege one feature over another,
for example texture over shape and image backgrounds over foreground objects.
Here, we test hypotheses about which input properties are more available to a
model, and systematically study how predictivity and availability interact to
shape models' feature use. We construct a minimal, explicit generative
framework for synthesizing classification datasets with two latent features
that vary in predictivity and in factors we hypothesize to relate to
availability, and quantify a model's shortcut bias: its over-reliance on the
shortcut (more available, less predictive) feature at the expense of the core
(less available, more predictive) feature. We find that linear models are
relatively unbiased, but introducing a single hidden layer with ReLU or Tanh
units yields a bias. Our empirical findings are consistent with a theoretical
account based on Neural Tangent Kernels. Finally, we study how models used in
practice trade off predictivity and availability in naturalistic datasets,
discovering availability manipulations which increase models' degree of
shortcut bias. Taken together, these findings suggest that the propensity to
learn shortcut features is a fundamental characteristic of deep nonlinear
architectures, warranting systematic study given its role in shaping how models
solve tasks.
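A minimal version of such a two-feature generative setup can be sketched as follows. The specific encoding (a large-magnitude "available" shortcut dimension versus a small-magnitude core dimension) and the conflict-item probe are illustrative assumptions, not the paper's framework:

```python
import numpy as np

rng = np.random.default_rng(0)

def make_dataset(n, p_core=0.95, p_shortcut=0.85, shortcut_scale=5.0):
    """Two latent binary features differing in predictivity and availability.

    Each feature copies the label y with its own probability (its
    predictivity); the shortcut feature is written at a larger magnitude,
    a simple stand-in for availability. Illustrative only.
    """
    y = rng.choice([-1.0, 1.0], size=n)
    core = np.where(rng.random(n) < p_core, y, -y)       # more predictive
    short = np.where(rng.random(n) < p_shortcut, y, -y)  # less predictive
    X = np.stack([core, shortcut_scale * short], axis=1)  # more "available"
    return X, y

X, y = make_dataset(20000)
# Closed-form least-squares linear classifier.
w = np.linalg.lstsq(X, y, rcond=None)[0]

# Bias probe: on a conflict item (core says +1, shortcut says -1, using the
# default shortcut_scale of 5), a shortcut-biased model would predict -1.
conflict = np.array([[1.0, -5.0]])
pred = np.sign(conflict @ w)[0]
# With these settings the linear model follows the core (more predictive)
# feature, consistent with the abstract's finding that linear models are
# relatively unbiased.
print("weights:", w, "conflict prediction:", pred)
```

Replacing the linear fit with a one-hidden-layer ReLU network is where, per the abstract, a measurable shortcut bias appears.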
On automatic age estimation from facial profile view
In recent years, automatic facial age estimation has gained popularity due to its numerous applications. Much work has been done on frontal images, and lately minimal estimation errors have been achieved on most of the benchmark databases. However, in reality, images obtained in unconstrained environments are not always frontal. For instance, when conducting a demographic study or crowd analysis, one may obtain profile images of the face. To the best of our knowledge, no attempt has been made to estimate ages from the side view of face images. Here we address this gap by using a pre-trained deep residual neural network (ResNet) to extract features. We then utilize a sparse partial least squares regression approach to estimate ages. Despite having less information compared to frontal images, our results show that the extracted deep features achieve promising performance
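A one-component sparse PLS regressor in the spirit described above can be sketched in a few lines. The soft-thresholding formulation and the random stand-in for ResNet features are assumptions for illustration, not the authors' pipeline; practical sparse PLS (e.g. Chun and Keles, 2010) extracts multiple components and tunes the sparsity level:

```python
import numpy as np

def sparse_pls_1comp(X, y, threshold=0.1):
    """One-component sparse PLS: soft-threshold the PLS direction X^T y."""
    x_mean, y_mean = X.mean(axis=0), y.mean()
    Xc, yc = X - x_mean, y - y_mean
    w = Xc.T @ yc                        # classical PLS weight direction
    w = w / np.abs(w).max()              # scale to [-1, 1]
    w = np.sign(w) * np.maximum(np.abs(w) - threshold, 0.0)  # sparsify
    w = w / np.linalg.norm(w)
    t = Xc @ w                           # latent score per sample
    b = (t @ yc) / (t @ t)               # regress age onto the score
    return w, b, x_mean, y_mean

# Stand-in for deep ResNet features: only a few dimensions carry age signal.
rng = np.random.default_rng(1)
X = rng.normal(size=(200, 50))
age = 30 + 10 * X[:, 0] - 5 * X[:, 1] + rng.normal(scale=0.5, size=200)

w, b, x_mean, y_mean = sparse_pls_1comp(X, age, threshold=0.3)
preds = (X - x_mean) @ w * b + y_mean
print("nonzero weights:", np.count_nonzero(w))  # most dims are zeroed out
```

Sparsity matters here because deep feature vectors are high-dimensional relative to typical age-labeled datasets, so selecting a few informative dimensions guards against overfitting.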
Chroma-VAE: Mitigating Shortcut Learning with Generative Classifiers
Deep neural networks are susceptible to shortcut learning, using simple
features to achieve low training loss without discovering essential semantic
structure. Contrary to prior belief, we show that generative models alone are
not sufficient to prevent shortcut learning, despite an incentive to recover a
more comprehensive representation of the data than discriminative approaches.
However, we observe that shortcuts are preferentially encoded with minimal
information, a fact that generative models can exploit to mitigate shortcut
learning. In particular, we propose Chroma-VAE, a two-pronged approach where a
VAE classifier is initially trained to isolate the shortcut in a small latent
subspace, allowing a secondary classifier to be trained on the complementary,
shortcut-free latent subspace. In addition to demonstrating the efficacy of
Chroma-VAE on benchmark and real-world shortcut learning tasks, our work
highlights the potential for manipulating the latent space of generative
classifiers to isolate or interpret specific correlations. Comment: Presented at the 36th Conference on Neural Information Processing
Systems (NeurIPS 2022).
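The two-stage idea, isolating a shortcut in a small latent subspace and training a secondary classifier on the complement, can be illustrated on fixed latent codes. The logistic-regression probes and the synthetic codes below are assumptions for illustration, not the Chroma-VAE architecture:

```python
import numpy as np

rng = np.random.default_rng(2)

def make_latents(n, shortcut_agrees):
    """Toy latent codes: a 2-dim shortcut subspace and a 6-dim core subspace."""
    y = rng.choice([0.0, 1.0], size=n)
    sign = (2 * y - 1) if shortcut_agrees else -(2 * y - 1)
    shortcut = sign[:, None] * np.ones(2) + 0.1 * rng.normal(size=(n, 2))
    core = (2 * y - 1)[:, None] * np.ones(6) * 0.5 + rng.normal(size=(n, 6))
    return np.hstack([shortcut, core]), y

def fit_logreg(X, y, steps=500, lr=0.5):
    """Plain logistic regression by gradient descent (no intercept)."""
    w = np.zeros(X.shape[1])
    for _ in range(steps):
        p = 1.0 / (1.0 + np.exp(-X @ w))
        w -= lr * X.T @ (p - y) / len(y)
    return w

Z_tr, y_tr = make_latents(2000, shortcut_agrees=True)
Z_te, y_te = make_latents(2000, shortcut_agrees=False)  # shortcut flips at test

w_short = fit_logreg(Z_tr[:, :2], y_tr)  # stage 1: shortcut subspace only
w_core = fit_logreg(Z_tr[:, 2:], y_tr)   # stage 2: complementary subspace

acc = lambda Z, w, y: np.mean((Z @ w > 0) == (y == 1))
# The shortcut-subspace probe collapses when the spurious correlation
# reverses at test time; the complement-subspace classifier does not.
print("shortcut-subspace test acc:", acc(Z_te[:, :2], w_short, y_te))
print("complement-subspace test acc:", acc(Z_te[:, 2:], w_core, y_te))
```

In Chroma-VAE itself the subspace split is learned by a VAE classifier rather than fixed by construction; this toy only shows why a classifier restricted to the shortcut-free subspace is robust when the shortcut's correlation with the label breaks.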