OLÉ: Orthogonal Low-rank Embedding, A Plug and Play Geometric Loss for Deep Learning
Deep neural networks trained using a softmax layer at the top and the
cross-entropy loss are ubiquitous tools for image classification. Yet this
setup naturally enforces neither intra-class similarity nor an inter-class
margin in the learned deep representations. To simultaneously achieve these two goals,
different solutions have been proposed in the literature, such as the pairwise
or triplet losses. However, such solutions carry the extra task of selecting
pairs or triplets, and the extra computational burden of computing and learning
for many combinations of them. In this paper, we propose a plug-and-play loss
term for deep networks that explicitly reduces intra-class variance and
enforces inter-class margin simultaneously, in a simple and elegant geometric
manner. For each class, the deep features are collapsed into a learned linear
subspace, or union of them, and inter-class subspaces are pushed to be as
orthogonal as possible. Our proposed Orthogonal Low-rank Embedding (OLÉ) does
not require carefully crafting pairs or triplets of samples for training, and
works standalone as a classification loss, being the first reported deep metric
learning framework of its kind. Because of the improved margin between features
of different classes, the resulting deep networks generalize better, are more
discriminative, and more robust. We demonstrate improved classification
performance in general object recognition, plugging the proposed loss term into
existing off-the-shelf architectures. In particular, we show the advantage of
the proposed loss in the small data/model scenario, and we significantly
advance the state of the art on the Stanford STL-10 benchmark.
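
The geometric idea in this abstract (each class collapsed into a low-rank subspace, with different classes pushed toward orthogonality) can be illustrated with a small NumPy sketch. The abstract does not state the loss formula, so the nuclear-norm form below, the function name `ole_loss`, and the lower bound `delta` are assumptions made for illustration and are not taken from the text above.

```python
import numpy as np

def ole_loss(features, labels, delta=1.0):
    """Illustrative low-rank / orthogonality penalty (assumed form).

    features: (n_samples, n_dims) array of deep features.
    labels:   (n_samples,) array of class labels.
    delta:    hypothetical floor that keeps class features from collapsing to zero.
    """
    features = np.asarray(features, dtype=float)
    labels = np.asarray(labels)

    # Sum of per-class nuclear norms: small when each class is low rank.
    intra = 0.0
    for c in np.unique(labels):
        class_feats = features[labels == c]
        intra += max(delta, np.linalg.norm(class_feats, ord="nuc"))

    # Nuclear norm of all features together: large when classes span
    # different (ideally orthogonal) directions.
    total = np.linalg.norm(features, ord="nuc")
    return intra - total

labels = np.array([0, 0, 1, 1])
orthogonal = np.array([[1.0, 0.0], [1.0, 0.0], [0.0, 1.0], [0.0, 1.0]])
collinear = np.array([[1.0, 0.0], [1.0, 0.0], [1.0, 0.0], [1.0, 0.0]])
print(ole_loss(orthogonal, labels))  # close to zero: classes are orthogonal
print(ole_loss(collinear, labels))   # positive: classes share one direction
```

Under this assumed form, the penalty vanishes when the class subspaces are mutually orthogonal and grows as they overlap, which matches the behaviour the abstract describes. In practice such a term would be added to the cross-entropy loss with a weighting factor.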
A New PHO-rmula for Improved Performance of Semi-Structured Networks
Recent advances in combining structured regression models and deep neural
networks for better interpretability, more expressiveness, and statistically
valid uncertainty quantification demonstrate the versatility of semi-structured
neural networks (SSNs). We show that techniques to properly identify the
contributions of the different model components in SSNs, however, lead to
suboptimal network estimation, slower convergence, and degenerated or erroneous
predictions. In order to solve these problems while preserving favorable model
properties, we propose a non-invasive post-hoc orthogonalization (PHO) that
guarantees identifiability of model components and provides better estimation
and prediction quality. Our theoretical findings are supported by numerical
experiments, a benchmark comparison as well as a real-world application to
COVID-19 infections.
Comment: ICML 202
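
The abstract does not spell out how the post-hoc orthogonalization is constructed. One plausible reading, sketched below under that assumption, is a linear projection applied after training: the part of the network's learned features that lies in the column space of the structured design matrix is moved into the structured coefficients. The function name `post_hoc_orthogonalize` and the symbols `X`, `U`, `beta`, and `gamma` are hypothetical and do not come from the text above.

```python
import numpy as np

def post_hoc_orthogonalize(X, U, beta, gamma):
    """Illustrative post-hoc orthogonalization (assumed linear-projection form).

    X:     (n, p) structured design matrix.
    U:     (n, q) features learned by the deep network.
    beta:  (p,) structured coefficients.
    gamma: (q,) weights of the final layer on U.
    """
    X = np.asarray(X, dtype=float)
    U = np.asarray(U, dtype=float)

    # Least-squares coefficients of each column of U regressed on X.
    coef, *_ = np.linalg.lstsq(X, U, rcond=None)

    # Remove from U everything that X can already explain.
    U_orth = U - X @ coef

    # Shift the removed part into the structured coefficients so that
    # X @ beta_adj + U_orth @ gamma equals X @ beta + U @ gamma.
    beta_adj = beta + coef @ gamma
    return U_orth, beta_adj

rng = np.random.default_rng(0)
X = rng.normal(size=(20, 2))
U = rng.normal(size=(20, 3))
beta = rng.normal(size=2)
gamma = rng.normal(size=3)

U_orth, beta_adj = post_hoc_orthogonalize(X, U, beta, gamma)
print(np.allclose(X.T @ U_orth, 0.0))
print(np.allclose(X @ beta + U @ gamma, X @ beta_adj + U_orth @ gamma))
```

In this sketch the predictions are unchanged and only the attribution between the structured and deep parts is adjusted, which is one way to read "non-invasive" in the abstract.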