Energy-based Self-attentive Learning of Abstractive Communities for Spoken Language Understanding
Abstractive community detection is an important spoken language understanding
task, whose goal is to group utterances in a conversation according to whether
they can be jointly summarized by a common abstractive sentence. This paper
provides a novel approach to this task. We first introduce a neural contextual
utterance encoder featuring three types of self-attention mechanisms. We then
train it using the siamese and triplet energy-based meta-architectures.
Experiments on the AMI corpus show that our system outperforms multiple
energy-based and non-energy-based state-of-the-art baselines. Code and
data are publicly available.
Comment: Update baseline
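The siamese and triplet training objectives named in this abstract can be illustrated with a minimal sketch. The abstract does not give the paper's exact energy functions, so the Euclidean distance, the margin value, and the function names below are assumptions based on the commonly used forms of these losses, not the authors' implementation. Here the vectors stand in for utterance embeddings produced by some encoder.

```python
import math


def euclidean(u, v):
    """Euclidean distance between two equal-length embedding vectors."""
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(u, v)))


def siamese_contrastive_loss(u, v, same, margin=1.0):
    """Pairwise (siamese) loss: pull matching pairs together and push
    non-matching pairs at least `margin` apart (assumed standard form)."""
    d = euclidean(u, v)
    if same:
        return d ** 2
    return max(0.0, margin - d) ** 2


def triplet_loss(anchor, positive, negative, margin=1.0):
    """Triplet loss: the anchor should be closer to the positive than to
    the negative by at least `margin` (assumed standard form)."""
    return max(0.0, euclidean(anchor, positive) - euclidean(anchor, negative) + margin)
```

In this setting, a positive pair would be two utterances from the same abstractive community and a negative pair two utterances from different communities.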
Learning Similarity Attention
We consider the problem of learning similarity functions. While there has
been substantial progress in learning suitable distance metrics, these
techniques in general lack decision reasoning, i.e., explaining why the input
set of images is similar or dissimilar. In this work, we solve this key problem
by proposing the first method to generate generic visual similarity
explanations with gradient-based attention. We demonstrate that our technique
is agnostic to the specific similarity model type, e.g., we show applicability
to Siamese, triplet, and quadruplet models. Furthermore, we make our proposed
similarity attention a principled part of the learning process, resulting in a
new paradigm for learning similarity functions. We demonstrate that our
learning mechanism results in more generalizable, as well as explainable,
similarity models. Finally, we demonstrate the generality of our framework by
means of experiments on a variety of tasks, including image retrieval, person
re-identification, and low-shot semantic segmentation.
Comment: 10 pages, 7 figures, 4 tables
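The abstract does not spell out how the gradient-based attention is computed. As a rough illustration of the general idea, the sketch below applies a Grad-CAM-style recipe (a swapped-in, well-known technique, not necessarily the authors' method) to a similarity score between two feature maps. The toy model, in which embeddings are global average pools of feature maps and similarity is their dot product, is an assumption chosen so the gradient can be written analytically.

```python
def global_average_pool(fmap):
    """fmap is a list of K channels, each a flat list of N spatial activations."""
    return [sum(channel) / len(channel) for channel in fmap]


def similarity(fmap_a, fmap_b):
    """Toy similarity score: dot product of the pooled embeddings."""
    emb_a = global_average_pool(fmap_a)
    emb_b = global_average_pool(fmap_b)
    return sum(x * y for x, y in zip(emb_a, emb_b))


def similarity_attention(fmap_a, fmap_b):
    """Grad-CAM-style attention over the positions of fmap_a, explaining
    which locations contribute to similarity(fmap_a, fmap_b)."""
    n = len(fmap_a[0])
    emb_b = global_average_pool(fmap_b)
    # For this toy model, d(score)/d(fmap_a[k][i]) = emb_b[k] / n at every
    # position i, so the channel weight (mean gradient) is also emb_b[k] / n.
    weights = [e / n for e in emb_b]
    # Weighted sum of channels at each position, followed by a ReLU.
    return [
        max(0.0, sum(w * channel[i] for w, channel in zip(weights, fmap_a)))
        for i in range(n)
    ]
```

With a real network the gradients would come from automatic differentiation rather than a closed form; the structure of the computation (gradient-derived channel weights, weighted sum, ReLU) stays the same.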
Class interference regularization
Contrastive losses yield state-of-the-art performance for person re-identification, face verification and few-shot learning. They have recently outperformed the cross-entropy loss on ImageNet-scale classification and, with SimCLR, surpassed all prior self-supervised results by a large margin. Simple and effective regularization techniques such as label smoothing and self-distillation no longer apply, because they act on the multinomial label distributions used in cross-entropy losses, not on the tuple comparative terms that characterize contrastive losses.
Here we propose a novel, simple and effective regularization technique, the Class Interference Regularization (CIR), which applies to cross-entropy losses but is especially effective on contrastive losses. CIR perturbs the output features by randomly moving them towards the average embeddings of the negative classes. To the best of our knowledge, CIR is the first regularization technique to act on the output features.
In experimental evaluation, the combination of CIR and a plain Siamese net with triplet loss yields the best few-shot learning performance on the challenging tieredImageNet. CIR also improves the state-of-the-art technique in person re-identification on the Market-1501 dataset, based on triplet loss, and the state-of-the-art technique in person search on the CUHK-SYSU dataset, based on a cross-entropy loss. Finally, on classification, CIR performs on par with the popular label smoothing, as demonstrated for CIFAR-10 and CIFAR-100.
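The perturbation described in this abstract (randomly moving output features towards the average embeddings of negative classes) might look roughly like the sketch below. The abstract gives no formula, so the interpolation rule, the step size `alpha`, the perturbation probability `p`, the uniform choice of one negative class, and all function names are assumptions made for illustration only.

```python
import random


def class_means(features, labels):
    """Average embedding per class, computed from the current batch."""
    sums, counts = {}, {}
    for f, y in zip(features, labels):
        acc = sums.setdefault(y, [0.0] * len(f))
        for i, x in enumerate(f):
            acc[i] += x
        counts[y] = counts.get(y, 0) + 1
    return {y: [s / counts[y] for s in acc] for y, acc in sums.items()}


def cir_perturb(features, labels, alpha=0.1, p=0.5, rng=None):
    """With probability p, move each feature a fraction alpha of the way
    towards the mean embedding of a randomly chosen negative class.
    alpha, p and the sampling scheme are hypothetical choices."""
    rng = rng or random.Random(0)
    means = class_means(features, labels)
    perturbed = []
    for f, y in zip(features, labels):
        negatives = [c for c in means if c != y]
        if negatives and rng.random() < p:
            target = means[rng.choice(negatives)]
            f = [(1 - alpha) * x + alpha * t for x, t in zip(f, target)]
        else:
            f = list(f)  # leave this feature unperturbed
        perturbed.append(f)
    return perturbed
```

The perturbed features would then be fed to the loss (for example a triplet loss) in place of the original ones during training, and left untouched at test time.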