Text2Model: Model Induction for Zero-shot Generalization Using Task Descriptions
We study the problem of generating a training-free task-dependent visual
classifier from text descriptions without visual samples. This
\textit{Text-to-Model} (T2M) problem is closely related to zero-shot learning,
but unlike previous work, a T2M model infers a model tailored to a task, taking
into account all classes in the task. We analyze the symmetries of T2M, and
characterize the equivariance and invariance properties of corresponding
models. In light of these properties, we design an architecture based on
hypernetworks that given a set of new class descriptions predicts the weights
for an object recognition model which classifies images from those zero-shot
classes. We demonstrate the benefits of our approach compared to zero-shot
learning from text descriptions in image and point-cloud classification using
various types of text descriptions, from single words to rich text
descriptions.
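The core idea above, a hypernetwork that maps a set of class descriptions to classifier weights while respecting the symmetry of the class set, can be illustrated with a minimal sketch. This is not the authors' architecture: the single linear layer, the mean-pooled task context, and the names `init_hypernetwork`, `predict_classifier_weights`, and `classify` are all assumptions made for illustration, and random vectors stand in for real text embeddings and image features.

```python
import numpy as np

rng = np.random.default_rng(0)

def init_hypernetwork(text_dim, feat_dim):
    """Hypothetical parameters of a one-layer equivariant hypernetwork."""
    return {
        "A": rng.normal(scale=0.1, size=(text_dim, feat_dim)),  # per-class term
        "B": rng.normal(scale=0.1, size=(text_dim, feat_dim)),  # task-context term
    }

def predict_classifier_weights(params, class_embeddings):
    """Map (n_classes, text_dim) description embeddings to
    (n_classes, feat_dim) classifier weights.

    Each class weight depends on its own description and on the mean of all
    descriptions in the task, so permuting the input classes permutes the
    output rows in the same way (permutation equivariance).
    """
    context = class_embeddings.mean(axis=0, keepdims=True)
    return class_embeddings @ params["A"] + context @ params["B"]

def classify(weights, image_features):
    """Return the index of the highest-scoring class for each image."""
    return (image_features @ weights.T).argmax(axis=1)

params = init_hypernetwork(text_dim=8, feat_dim=5)
descriptions = rng.normal(size=(4, 8))  # stand-in for embedded class descriptions
images = rng.normal(size=(6, 5))        # stand-in for extracted image features

weights = predict_classifier_weights(params, descriptions)
predictions = classify(weights, images)
```

Because the shared context term is a mean over classes, reordering the descriptions only reorders the predicted weight rows, which is the kind of equivariance property the abstract refers to.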
Example-based Hypernetworks for Out-of-Distribution Generalization
As Natural Language Processing (NLP) algorithms continually achieve new
milestones, out-of-distribution generalization remains a significant challenge.
This paper addresses the issue of multi-source adaptation for unfamiliar
domains: We leverage labeled data from multiple source domains to generalize to
target domains that are unknown at training time. Our innovative framework employs
example-based Hypernetwork adaptation: a T5 encoder-decoder initially generates
a unique signature from an input example, embedding it within the source
domains' semantic space. This signature is subsequently utilized by a
Hypernetwork to generate the task classifier's weights. We evaluated our method
across two tasks - sentiment classification and natural language inference - in
29 adaptation scenarios, where it outpaced established algorithms. In an
advanced version, the signature also enriches the input example's
representation. We also compare our finetuned architecture to few-shot GPT-3,
demonstrating its effectiveness in essential use cases. To our knowledge, this
marks the first application of Hypernetworks to adaptation to unknown
domains.
Comment: First two authors contributed equally to this work. Our code and data
are available at: https://github.com/TomerVolk/Hyper-PAD
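The example-based adaptation described in this abstract, where a signature derived from the input example drives a Hypernetwork that generates the task classifier's weights, can be sketched roughly as follows. This is a simplified illustration and not the Hyper-PAD implementation: the T5 signature generator and the text encoder are replaced by random vectors, the Hypernetwork is assumed to be a single linear map, and the names `generate_classifier` and `classify_example` are hypothetical.

```python
import numpy as np

rng = np.random.default_rng(0)

HIDDEN, SIG_DIM, N_LABELS = 6, 4, 2

# Hypothetical hypernetwork parameters: linear maps from a signature
# embedding to the classifier's flattened weight matrix and to its bias.
hyper_W = rng.normal(scale=0.1, size=(SIG_DIM, N_LABELS * HIDDEN))
hyper_b = rng.normal(scale=0.1, size=(SIG_DIM, N_LABELS))

def generate_classifier(signature_embedding):
    """Generate per-example classifier weights and bias from a signature."""
    weights = (signature_embedding @ hyper_W).reshape(N_LABELS, HIDDEN)
    bias = signature_embedding @ hyper_b
    return weights, bias

def classify_example(example_repr, signature_embedding):
    """Apply the generated classifier to the example's representation."""
    weights, bias = generate_classifier(signature_embedding)
    return weights @ example_repr + bias  # logits, shape (N_LABELS,)

example_repr = rng.normal(size=HIDDEN)  # stand-in for the encoded input example
sig_a = rng.normal(size=SIG_DIM)        # stand-in for one embedded signature
sig_b = rng.normal(size=SIG_DIM)        # stand-in for a different signature

logits_a = classify_example(example_repr, sig_a)
logits_b = classify_example(example_repr, sig_b)
```

The same example representation yields different logits under different signatures, which is how a per-example signature can adapt the classifier to a domain not seen during training.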