2 research outputs found

Text2Model: Model Induction for Zero-shot Generalization Using Task Descriptions

Authors: Amosy Ohad; Ben-David Eyal; Chechik Gal; Reichart Roi; Volk Tomer
Publication date: 27/10/2022

We study the problem of generating a training-free, task-dependent visual classifier from text descriptions without visual samples. This Text-to-Model (T2M) problem is closely related to zero-shot learning, but unlike previous work, a T2M model infers a model tailored to a task, taking into account all classes in the task. We analyze the symmetries of T2M and characterize the equivariance and invariance properties of corresponding models. In light of these properties, we design an architecture based on hypernetworks that, given a set of new class descriptions, predicts the weights for an object recognition model which classifies images from those zero-shot classes. We demonstrate the benefits of our approach compared to zero-shot learning from text descriptions in image and point-cloud classification using various types of text descriptions, from single words to rich text descriptions.

arXiv.org e-Print Archive

Example-based Hypernetworks for Out-of-Distribution Generalization

Authors: Amosy Ohad; Ben-David Eyal; Chechik Gal; Reichart Roi; Volk Tomer
Publication date: 18/10/2023

As Natural Language Processing (NLP) algorithms continually achieve new milestones, out-of-distribution generalization remains a significant challenge. This paper addresses the issue of multi-source adaptation for unfamiliar domains: we leverage labeled data from multiple source domains to generalize to target domains that are unknown at training time. Our innovative framework employs example-based Hypernetwork adaptation: a T5 encoder-decoder first generates a unique signature from an input example, embedding it within the source domains' semantic space. A Hypernetwork then uses this signature to generate the task classifier's weights. We evaluated our method on two tasks (sentiment classification and natural language inference) in 29 adaptation scenarios, where it outperformed established algorithms. In an advanced version, the signature also enriches the input example's representation. We also compare our finetuned architecture to few-shot GPT-3, demonstrating its effectiveness in essential use cases. To our knowledge, this marks the first application of Hypernetworks to adaptation for unknown domains.

Comment: First two authors contributed equally to this work. Our code and data are available at: https://github.com/TomerVolk/Hyper-PAD

arXiv.org e-Print Archive