
    Learning with Unavailable Data: Generalized and Open Zero-Shot Learning

    The field of visual object recognition has seen significant progress in recent years thanks to the availability of large-scale annotated datasets. However, labelling a large amount of data is difficult and costly, and can be simply infeasible for some classes because of the long-tailed distribution of instances. Zero-Shot Learning (ZSL) is a framework that considers the case in which, for some of the classes, no labeled training examples are available to train the model. To solve the problem, a multi-modal source of information, the class (semantic) embeddings, is exploited to extract knowledge from the available classes (the seen classes) and to recognize novel categories for which the class embeddings are the only information available (the unseen classes). To directly target the extreme imbalance in the data, in this thesis we first propose a methodology to improve synthetic data generation for the unseen classes through their class embeddings. Second, we propose to generalize the Zero-Shot Learning framework towards a more competitive and real-world oriented scenario. Thus, we formalize the problem of Open Zero-Shot Learning as the problem of recognizing seen and unseen classes, as in ZSL, while also rejecting instances from unknown categories, for which neither visual data nor class embeddings are provided. Finally, we propose methodologies to generate not only the unseen categories but also the unknown ones.
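The Open Zero-Shot Learning task described in this abstract (label a sample as a seen or unseen class, or reject it as unknown) can be illustrated with a minimal sketch. It is not the thesis's method, which relies on generating synthetic data. It only shows the shape of the decision, assuming samples and class embeddings already live in a shared space and using a hypothetical distance threshold `reject_distance` for rejection. All names and toy vectors are illustrative assumptions.

```python
import math


def euclidean(u, v):
    """Euclidean distance between two equal-length vectors."""
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(u, v)))


def open_zsl_predict(sample, class_embeddings, reject_distance):
    """Return the nearest class name, or "unknown" if even the nearest
    class embedding is farther away than reject_distance.

    Assumption: thresholding on distance stands in for whatever
    rejection mechanism a real Open ZSL model would learn.
    """
    nearest = min(class_embeddings,
                  key=lambda name: euclidean(sample, class_embeddings[name]))
    if euclidean(sample, class_embeddings[nearest]) > reject_distance:
        return "unknown"
    return nearest


# Toy class embeddings (hypothetical values).
seen_classes = {"horse": [1.0, 0.0]}    # visual training data available
unseen_classes = {"zebra": [1.0, 1.0]}  # only the class embedding available
known_classes = {**seen_classes, **unseen_classes}

near_zebra = open_zsl_predict([0.9, 0.9], known_classes, reject_distance=0.5)
far_away = open_zsl_predict([-3.0, 4.0], known_classes, reject_distance=0.5)
```

Here `near_zebra` resolves to an unseen class because its embedding is close, while `far_away` is rejected because no known class embedding lies within the threshold.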

    Towards Effective Deep Embedding for Zero-Shot Learning

    © 1991-2012 IEEE. Zero-shot learning (ZSL) can be formulated as a cross-domain matching problem: after being projected into a joint embedding space, a visual sample is matched against all candidate class-level semantic descriptions and assigned to the nearest class. In this process, the embedding space underpins the success of such matching and is crucial for ZSL. In this paper, we conduct an in-depth study on the construction of the embedding space for ZSL and posit that an ideal embedding space should satisfy two criteria: intra-class compactness and inter-class separability. While the former encourages the embeddings of visual samples of one class to cluster tightly around the semantic description embedding of that class, the latter requires embeddings from different classes to be well separated from each other. Towards this goal, we present a simple but effective two-branch network that simultaneously maps semantic descriptions and visual samples into a joint space, in which visual embeddings are forced to regress to their class-level semantic embeddings and embeddings from different classes are required to be distinguishable by a trainable classifier. Furthermore, we extend our method to a transductive setting to better handle the model bias problem in ZSL (i.e., samples from unseen classes tend to be categorized into seen classes) with minimal extra supervision. Specifically, we propose a pseudo-labeling strategy to progressively incorporate the testing samples into the training process and thus balance the model between seen and unseen classes. Experimental results on five standard ZSL datasets show the superior performance of the proposed method and its transductive extension.
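The matching step this abstract describes (assign a projected visual sample to the nearest class-level semantic embedding) can be sketched as follows. This is a minimal illustration, not the paper's two-branch network: it assumes both the sample and the class descriptions have already been projected into the joint space, and it uses cosine similarity as the notion of "nearest", which the abstract does not specify. The function names and toy vectors are hypothetical.

```python
import math


def cosine_similarity(u, v):
    """Cosine similarity between two equal-length, non-zero vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)


def assign_nearest_class(visual_embedding, class_embeddings):
    """Return the class whose semantic embedding is most similar to the
    visual embedding (assumption: cosine similarity defines nearness)."""
    return max(
        class_embeddings,
        key=lambda name: cosine_similarity(visual_embedding,
                                           class_embeddings[name]),
    )


# Toy joint-space embeddings (hypothetical values).
class_embeddings = {
    "zebra": [0.9, 0.1, 0.8],
    "horse": [0.9, 0.1, 0.0],
    "tiger": [0.1, 0.9, 0.8],
}
sample = [0.8, 0.2, 0.7]

predicted = assign_nearest_class(sample, class_embeddings)
```

In this toy space the sample lies closest in angle to the "zebra" embedding, so that is the class it is assigned to. The two criteria named in the abstract, intra-class compactness and inter-class separability, are what would make such a nearest-class rule reliable.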