
    Balancing feature alignment and uniformity for few-shot classification

    In Few-Shot Learning (FSL), the objective is to correctly recognize new samples from novel classes with only a few available samples per class. Existing methods in FSL primarily focus on learning transferable knowledge from base classes by maximizing the information between feature representations and their corresponding labels. However, this approach may suffer from the "supervision collapse" issue, which arises due to a bias towards the base classes. In this paper, we propose a solution that addresses this issue by preserving the intrinsic structure of the data and enabling the learning of a generalized model for the novel classes. Following the InfoMax principle, our approach maximizes two types of mutual information (MI): between the samples and their feature representations, and between the feature representations and their class labels. This allows us to strike a balance between discrimination (capturing class-specific information) and generalization (capturing characteristics common across classes) in the feature representations. To achieve this, we adopt a unified framework that perturbs the feature embedding space using two low-bias estimators. The first estimator maximizes the MI between a pair of intra-class samples, while the second maximizes the MI between a sample and its augmented views. This framework effectively combines knowledge distillation between class-wise pairs and enlarges the diversity of feature representations. Extensive experiments on popular FSL benchmarks show that our approach achieves performance comparable to state-of-the-art competitors, e.g., 69.53% accuracy on the miniImageNet dataset and 77.06% on the CIFAR-FS dataset for the 5-way 1-shot task.
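    The abstract does not give the estimators' exact form, but a common low-bias lower bound on MI between paired embeddings is InfoNCE. The sketch below is a minimal PyTorch illustration of such an estimator, applicable to either kind of pair the paper names (a sample and its augmented view, or two intra-class samples); the function name and temperature are illustrative, not the paper's.

```python
# Minimal sketch (not the paper's exact estimator): an InfoNCE-style
# lower bound on the mutual information between two batches of paired
# embeddings, where row i of z_a and row i of z_b form a positive pair.
import torch
import torch.nn.functional as F

def info_nce_mi(z_a: torch.Tensor, z_b: torch.Tensor, temperature: float = 0.1) -> torch.Tensor:
    """z_a, z_b: (batch, dim) embeddings; returns a scalar to maximize."""
    z_a = F.normalize(z_a, dim=1)
    z_b = F.normalize(z_b, dim=1)
    logits = z_a @ z_b.t() / temperature       # (batch, batch) similarity matrix
    targets = torch.arange(z_a.size(0))        # positives lie on the diagonal
    # log(batch) minus this cross-entropy loss lower-bounds the MI,
    # so returning the negated loss gives a quantity to maximize.
    return -F.cross_entropy(logits, targets)

# Usage: maximize info_nce_mi(f(x), f(augment(x))) alongside the usual
# supervised loss to balance generalization against discrimination.
z1, z2 = torch.randn(32, 128), torch.randn(32, 128)
mi_lower_bound = info_nce_mi(z1, z2)
```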

    Unsupervised Learning via Total Correlation Explanation

    Learning by children and animals occurs effortlessly and largely without obvious supervision. Successes in automating supervised learning have not translated to the more ambiguous realm of unsupervised learning, where goals and labels are not provided. Barlow (1961) suggested that the signal brains leverage for unsupervised learning is dependence, or redundancy, in the sensory environment. Dependence can be characterized using the information-theoretic multivariate mutual information measure called total correlation. The principle of Total Correlation Explanation (CorEx) is to learn representations of data that "explain" as much dependence in the data as possible. We review some manifestations of this principle along with successes in unsupervised learning problems across diverse domains including human behavior, biology, and language. Comment: Invited contribution for IJCAI 2017 Early Career Spotlight. 5 pages, 1 figure.
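    Total correlation is the quantity CorEx tries to explain: TC(X) = sum_i H(X_i) - H(X_1, ..., X_n), which is zero exactly when the variables are independent. A small self-contained sketch estimating it from discrete samples follows; it only illustrates the measure itself, whereas CorEx learns latent factors that maximize the TC they account for.

```python
# Hedged sketch: empirical total correlation for a few discrete variables.
import numpy as np
from collections import Counter

def entropy(samples) -> float:
    """Empirical Shannon entropy (bits) of a sequence of hashable symbols."""
    counts = Counter(samples)
    n = sum(counts.values())
    probs = np.array([c / n for c in counts.values()])
    return float(-(probs * np.log2(probs)).sum())

def total_correlation(data: np.ndarray) -> float:
    """data: (n_samples, n_vars) array of discrete values."""
    marginal = sum(entropy(data[:, j]) for j in range(data.shape[1]))
    joint = entropy(map(tuple, data))   # joint entropy over whole rows
    return marginal - joint             # zero iff the variables are independent

rng = np.random.default_rng(0)
x = rng.integers(0, 2, size=(1000, 1))
data = np.hstack([x, x, rng.integers(0, 2, size=(1000, 1))])  # two copies + noise
print(total_correlation(data))          # ~1 bit of redundancy from the copied column
```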

    Spectral Unsupervised Domain Adaptation for Visual Recognition

    Unsupervised domain adaptation (UDA) aims to learn a well-performing model in an unlabeled target domain by leveraging labeled data from one or more related source domains. It remains a great challenge due to 1) the lack of annotations in the target domain and 2) the substantial discrepancy between the distributions of source and target data. We propose Spectral UDA (SUDA), an efficient yet effective UDA technique that works in the spectral space and is generic across different visual recognition tasks in detection, classification, and segmentation. SUDA addresses the UDA challenges from two perspectives. First, it mitigates inter-domain discrepancies with a spectrum transformer (ST) that maps source and target images into spectral space and learns to enhance domain-invariant spectra while simultaneously suppressing domain-variant spectra. To this end, we design a novel adversarial multi-head spectrum attention that leverages contextual information to identify domain-variant and domain-invariant spectra effectively. Second, it mitigates the lack of annotations in the target domain by introducing multi-view spectral learning, which learns comprehensive yet confident target representations by maximizing the mutual information among multiple ST augmentations that capture different spectral views of each target sample. Extensive experiments over different visual tasks (e.g., detection, classification, and segmentation) show that SUDA achieves superior accuracy; it is also complementary to state-of-the-art UDA methods, delivering consistent performance boosts with little extra computation.
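    The abstract does not detail the spectrum transformer, so the sketch below is only an assumption-laden illustration of the underlying spectral-space idea: map an image to the frequency domain with a plain FFT and damp a low-frequency amplitude band (amplitude spectra often carry domain-specific appearance) while keeping phase. SUDA's learned adversarial multi-head attention replaces the fixed mask, radius, and scale used here.

```python
# Hedged sketch: one "spectral view" of an image via a fixed amplitude mask.
import torch

def suppress_low_freq_amplitude(img: torch.Tensor, radius: int = 8, scale: float = 0.5) -> torch.Tensor:
    """img: (C, H, W). Damp centered low-frequency amplitudes by `scale`."""
    spec = torch.fft.fftshift(torch.fft.fft2(img), dim=(-2, -1))
    amp, phase = spec.abs(), spec.angle()
    _, h, w = img.shape
    yy, xx = torch.meshgrid(torch.arange(h), torch.arange(w), indexing="ij")
    mask = ((yy - h // 2) ** 2 + (xx - w // 2) ** 2) <= radius ** 2
    amp = torch.where(mask, amp * scale, amp)   # suppress the chosen spectral band
    spec = torch.polar(amp, phase)              # recombine amplitude and phase
    return torch.fft.ifft2(torch.fft.ifftshift(spec, dim=(-2, -1))).real

view = suppress_low_freq_amplitude(torch.rand(3, 64, 64))  # one augmented spectral view
```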

    Deep Multi-view Learning to Rank

    We study the problem of learning to rank from multiple information sources. Though multi-view learning and learning to rank have each been studied extensively, leading to a wide range of applications, multi-view learning to rank as a synergy of the two topics has received little attention. The aim of this paper is to propose a composite ranking method that simultaneously maintains a close correlation with the individual rankings. We present a generic framework for multi-view subspace learning to rank (MvSL2R), and two novel solutions are introduced under the framework. The first solution captures information of feature mappings both within each view and across views using autoencoder-like networks. Novel feature embedding methods are formulated in the optimization of multi-view unsupervised and discriminant autoencoders. Moreover, we introduce an end-to-end solution that learns towards both the joint ranking objective and the individual rankings. The proposed solution enhances the joint ranking with minimum view-specific ranking loss, so that it can achieve the maximum global view agreement in a single optimization process. The proposed method is evaluated on three different ranking problems, i.e., university ranking, multi-view lingual text ranking, and image data ranking, providing superior results compared to related methods. Comment: Published in IEEE TKDE.
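    The abstract omits the objective's exact form, but its shape, a joint ranking loss plus view-specific ranking losses encouraging view agreement, can be sketched with a standard pairwise logistic ranking loss. The fusion network and autoencoder components are omitted; all names and the weight alpha are illustrative, not the paper's.

```python
# Hedged sketch: joint + per-view pairwise ranking losses in PyTorch.
import torch
import torch.nn.functional as F

def pairwise_rank_loss(scores: torch.Tensor, labels: torch.Tensor) -> torch.Tensor:
    """Logistic loss over all pairs (i, j) where labels[i] > labels[j]."""
    diff = scores.unsqueeze(1) - scores.unsqueeze(0)        # s_i - s_j
    pref = (labels.unsqueeze(1) > labels.unsqueeze(0)).float()
    return (F.softplus(-diff) * pref).sum() / pref.sum().clamp(min=1)

def multiview_rank_loss(view_scores, joint_scores, labels, alpha: float = 0.5):
    joint = pairwise_rank_loss(joint_scores, labels)
    per_view = torch.stack([pairwise_rank_loss(s, labels) for s in view_scores]).mean()
    return joint + alpha * per_view      # joint objective + view-specific agreement

labels = torch.tensor([3.0, 2.0, 1.0])                       # graded relevance
views = [torch.randn(3, requires_grad=True) for _ in range(2)]
joint = sum(views) / len(views)                              # trivial score fusion
loss = multiview_rank_loss(views, joint, labels)
loss.backward()
```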