
    Recent Advances in Transfer Learning for Cross-Dataset Visual Recognition: A Problem-Oriented Perspective

    This paper takes a problem-oriented perspective and presents a comprehensive review of transfer learning methods, both shallow and deep, for cross-dataset visual recognition. Specifically, it categorises cross-dataset recognition into seventeen problems based on a set of carefully chosen data and label attributes. This problem-oriented taxonomy has allowed us to examine how different transfer learning approaches tackle each problem and how well each problem has been researched to date. The problem-oriented review of advances in transfer learning has revealed not only the challenges in transfer learning for visual recognition, but also the problems (eight of the seventeen) that have so far been scarcely studied. This survey not only presents an up-to-date technical review for researchers, but also offers a systematic approach and a reference for machine learning practitioners to categorise a real problem and look up a possible solution accordingly.
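    The survey's practical value lies in mapping a concrete recognition setting onto one of its seventeen problems from a few data and label attributes. The toy sketch below illustrates that kind of attribute-driven lookup in Python; the attribute names and the three example entries are hypothetical stand-ins, not the paper's actual taxonomy.

    from dataclasses import dataclass

    @dataclass(frozen=True)
    class ScenarioAttributes:
        target_labels_available: bool   # labeled data in the target domain?
        label_spaces_identical: bool    # same classes in source and target?
        feature_spaces_identical: bool  # homogeneous vs. heterogeneous features

    # Hypothetical lookup table: attribute combination -> problem family.
    TAXONOMY = {
        ScenarioAttributes(False, True, True): "unsupervised domain adaptation",
        ScenarioAttributes(True, True, True): "supervised domain adaptation",
        ScenarioAttributes(False, False, True): "open set / partial transfer",
    }

    def categorise(attrs: ScenarioAttributes) -> str:
        # Fall back to a generic answer for combinations not covered here.
        return TAXONOMY.get(attrs, "scarcely studied setting; consult the survey")

    print(categorise(ScenarioAttributes(target_labels_available=False,
                                        label_spaces_identical=True,
                                        feature_spaces_identical=True)))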

    Multi-Domain Adaptation for Image Classification, Depth Estimation, and Semantic Segmentation

    The appearance of scenes may change for many reasons, including the viewpoint, the time of day, the weather, and the seasons. Traditionally, deep neural networks are trained and evaluated on images from the same scene and domain to avoid the domain gap. Recent advances in domain adaptation have led to a new class of methods that bridge such domain gaps and learn from multiple domains. This dissertation proposes methods for multi-domain adaptation across several computer vision tasks, including image classification, depth estimation, and semantic segmentation. The first work focuses on semi-supervised domain adaptation: I address this setting with dynamic feature alignment to reduce both inter- and intra-domain discrepancy. The second work addresses monocular depth estimation in the multi-domain setting with a unified approach that combines adversarial knowledge distillation and uncertainty-guided self-supervised reconstruction. The third work considers semantic segmentation for aerial imagery with diverse environments and viewing geometries: I present CrossSeg, a novel framework that learns a semantic segmentation network that generalizes well in a cross-scene setting with only a few labeled samples. I believe this line of work is applicable to many domain adaptation scenarios and aerial applications.
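    As a rough illustration of the kind of feature-space alignment such methods rely on (not the dissertation's specific dynamic feature alignment), the sketch below penalizes the maximum mean discrepancy (MMD) between source and target encoder features; the feature dimension, kernel bandwidth, and weighting are placeholders.

    import torch

    def gaussian_mmd(src_feats: torch.Tensor, tgt_feats: torch.Tensor, sigma: float = 1.0) -> torch.Tensor:
        """Biased estimate of squared MMD between two feature batches under an RBF kernel."""
        def rbf(a, b):
            d2 = torch.cdist(a, b) ** 2
            return torch.exp(-d2 / (2 * sigma ** 2))
        return rbf(src_feats, src_feats).mean() + rbf(tgt_feats, tgt_feats).mean() - 2 * rbf(src_feats, tgt_feats).mean()

    # Usage: add `lambda_da * gaussian_mmd(f_s, f_t)` to the supervised task loss,
    # where f_s and f_t are encoder features from source and target batches.
    f_s, f_t = torch.randn(32, 128), torch.randn(32, 128)
    print(gaussian_mmd(f_s, f_t).item())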

    On the Effectiveness of Image Rotation for Open Set Domain Adaptation

    Open Set Domain Adaptation (OSDA) bridges the domain gap between a labeled source domain and an unlabeled target domain, while also rejecting target classes that are not present in the source. To avoid negative transfer, OSDA can be tackled by first separating the known from the unknown target samples and then aligning the known target samples with the source data. We propose a novel method that addresses both of these problems using the self-supervised task of rotation recognition. Moreover, we assess performance with a new open set metric that properly balances the contribution of recognizing the known classes and rejecting the unknown samples. Comparative experiments with existing OSDA methods on the standard Office-31 and Office-Home benchmarks show that: (i) our method outperforms its competitors, (ii) reproducibility is a crucial issue for this field to tackle, and (iii) our metric provides a reliable tool for fair open set evaluation.
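    One natural reading of such a balanced open set metric is a harmonic mean of the average accuracy on known classes and the rejection accuracy on unknown samples, so that neither term can dominate. The sketch below implements that reading; treat the exact formula as an assumption rather than the paper's definition.

    import numpy as np

    def open_set_score(y_true, y_pred, known_classes, unknown_label=-1):
        y_true, y_pred = np.asarray(y_true), np.asarray(y_pred)
        per_class_acc = [
            np.mean(y_pred[y_true == c] == c) for c in known_classes if np.any(y_true == c)
        ]
        os_star = float(np.mean(per_class_acc))                  # average known-class accuracy
        unk_mask = y_true == unknown_label
        unk = float(np.mean(y_pred[unk_mask] == unknown_label))  # unknown rejection accuracy
        return 2 * os_star * unk / (os_star + unk + 1e-12)       # harmonic mean of the two

    # Example: two known classes, two unknown target samples, one of which is rejected.
    print(open_set_score([0, 1, -1, -1], [0, 1, -1, 0], known_classes=[0, 1]))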

    DDAM-PS: Diligent Domain Adaptive Mixer for Person Search

    Person search (PS) is a challenging computer vision problem whose objective is to jointly optimize pedestrian detection and re-identification (ReID). Although previous advances have shown promising performance under fully and weakly supervised learning, there is a major gap in investigating the domain adaptation ability of PS models. In this paper, we propose a diligent domain adaptive mixer (DDAM) for person search (DDAM-PS), a framework that aims to bridge this gap and improve knowledge transfer from the labeled source domain to the unlabeled target domain. Specifically, we introduce a novel DDAM module that generates moderate mixed-domain representations by combining source and target domain representations. The proposed DDAM module encourages domain mixing to minimize the distance between the two extreme domains, thereby enhancing the ReID task. To achieve this, we introduce two bridge losses and a disparity loss. The two bridge losses guide the moderate mixed-domain representations to maintain an appropriate distance from both the source and target domain representations. The disparity loss prevents the moderate mixed-domain representations from being biased towards either the source or target domain, thereby avoiding overfitting. Furthermore, we address the conflict between the two subtasks, localization and ReID, during domain adaptation. To handle this cross-task conflict, we forcefully decouple the norm-aware embedding, which aids in better learning of the moderate mixed-domain representation. We conduct experiments to validate the effectiveness of our proposed method, which demonstrates favorable performance on the challenging PRW and CUHK-SYSU datasets. Our source code is publicly available at \url{https://github.com/mustansarfiaz/DDAM-PS}.
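    The sketch below is a schematic reading of the loss design described above, not the authors' implementation: features from the two domains are mixed, two bridge terms keep the mixed representation at a moderate distance from each side, and a disparity term discourages bias toward either domain. The mixing coefficient and the choice of distance are assumptions.

    import torch
    import torch.nn.functional as F

    def ddam_style_losses(f_src: torch.Tensor, f_tgt: torch.Tensor, lam: float = 0.5):
        f_mix = lam * f_src + (1.0 - lam) * f_tgt       # moderate mixed-domain representation
        bridge_src = F.mse_loss(f_mix, f_src.detach())  # keep the mix close to the source side
        bridge_tgt = F.mse_loss(f_mix, f_tgt.detach())  # ...and to the target side
        disparity = (bridge_src - bridge_tgt).abs()     # penalize bias toward either domain
        return bridge_src + bridge_tgt, disparity

    f_s, f_t = torch.randn(16, 256), torch.randn(16, 256)
    bridges, disparity = ddam_style_losses(f_s, f_t)
    print(bridges.item(), disparity.item())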

    CO2 emission based GDP prediction using intuitionistic fuzzy transfer learning

    Industrialization has been the primary cause of the economic boom in almost all countries. However, this has come at the cost of the environment, as industrialization has also caused carbon emissions to increase exponentially. According to the established literature, Gross Domestic Product (GDP) is related to carbon (CO2) emissions, which could therefore be employed to estimate a country's GDP. However, data scarcity is a significant bottleneck, which can be handled using transfer learning (TL): TL uses previously learned information to resolve new, related tasks. Notably, TL is highly vulnerable to performance degradation due to the deficiency of suitable information and hesitancy in decision-making. Therefore, this paper proposes Intuitionistic Fuzzy Transfer Learning (IFTL), which is trained on CO2 emission data of developed nations and tested on its prediction of GDP in a developing nation. IFTL exploits the concepts of intuitionistic fuzzy sets (IFSs) and a newly introduced function called the modified Hausdorff distance function. The proposed IFTL is investigated to demonstrate its capabilities for modelling hesitancy in TL. To further emphasize the role of hesitancy modelled with IFSs, we also propose an ordinary fuzzy set (FS) based transfer learning method. The prediction accuracy of IFTL is further compared with widely used machine learning approaches: extreme learning machines, support vector regression, and generalized regression neural networks. IFTL ensures significant improvements in prediction accuracy over the other approaches whenever the training and testing data have large distribution differences. Moreover, the proposed IFTL is deterministic in nature and presents a novel way of mathematically computing the intuitionistic hesitation degree.
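    For background, the sketch below shows the standard intuitionistic-fuzzy ingredients the abstract builds on: each element carries a membership degree mu, a non-membership degree nu, and a hesitation degree pi = 1 - mu - nu, and a widely used Hausdorff-type distance between two IFSs averages max(|dmu|, |dnu|) over elements. The paper's modified Hausdorff distance is its own construction and is not reproduced here.

    import numpy as np

    def hesitation(mu: np.ndarray, nu: np.ndarray) -> np.ndarray:
        # Intuitionistic hesitation degree pi = 1 - mu - nu, valid when mu + nu <= 1.
        assert np.all(mu + nu <= 1.0 + 1e-9), "IFS requires mu + nu <= 1"
        return 1.0 - mu - nu

    def hausdorff_ifs_distance(mu_a, nu_a, mu_b, nu_b) -> float:
        # Standard Hausdorff-type distance between two IFSs over the same universe.
        return float(np.mean(np.maximum(np.abs(mu_a - mu_b), np.abs(nu_a - nu_b))))

    mu_a, nu_a = np.array([0.6, 0.3]), np.array([0.2, 0.5])
    mu_b, nu_b = np.array([0.5, 0.4]), np.array([0.3, 0.4])
    print(hesitation(mu_a, nu_a), hausdorff_ifs_distance(mu_a, nu_a, mu_b, nu_b))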

    Learning with Single View Co-training and Marginalized Dropout

    The generalization properties of most existing machine learning techniques are predicated on the assumptions that 1) a sufficiently large quantity of training data is available; and 2) the training and testing data come from a common distribution. Although these assumptions are often met in practice, there are also many scenarios in which training data from the relevant distribution is insufficient. We focus on making use of additional data, which is readily available or can be obtained easily but comes from a different distribution than the testing data, to aid learning. We present five learning scenarios, depending on how the distribution used to sample the additional training data differs from the testing distribution: 1) learning with weak supervision; 2) domain adaptation; 3) learning from multiple domains; 4) learning from corrupted data; and 5) learning with partial supervision. We introduce two strategies and manifest them in five ways to cope with the difference between the training and testing distributions. The first strategy, which gives rise to Pseudo Multi-view Co-training (PMC) and Co-training for Domain Adaptation (CODA), is inspired by the co-training algorithm for multi-view data. PMC generalizes co-training to the more common single-view data and allows us to learn from weakly labeled data retrieved for free from the web. CODA integrates PMC with another feature selection component to address the feature incompatibility between domains for domain adaptation. PMC and CODA are evaluated on a variety of real datasets, and both yield record performance. The second strategy, marginalized dropout, leads to marginalized Stacked Denoising Autoencoders (mSDA), Marginalized Corrupted Features (MCF), and FastTag. mSDA diminishes the difference between distributions associated with different domains by learning a new representation through marginalized corruption and reconstruction. MCF learns from a known distribution created by corrupting a small set of training data, and improves the robustness of learned classifiers by training on "infinitely" many data points sampled from that distribution. FastTag applies marginalized dropout to the output of partially labeled data to recover missing labels for multi-label tasks. These three algorithms not only achieve state-of-the-art performance on various tasks, but also deliver orders-of-magnitude speed-ups in training and testing compared to competing algorithms.
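    The appeal of marginalized dropout is that, for a linear reconstruction objective with features dropped independently with probability p, the expectation over corruptions has a closed-form solution, so corrupted copies never need to be sampled. The sketch below follows the commonly cited closed form for a single marginalized denoising layer; bias handling and regularization are standard choices and may differ in detail from the systems described above.

    import numpy as np

    def marginalized_denoising_map(X: np.ndarray, p: float, reg: float = 1e-5) -> np.ndarray:
        """X is d x n (features x examples); returns W reconstructing clean features from corrupted inputs."""
        d, n = X.shape
        Xb = np.vstack([X, np.ones((1, n))])      # append a constant bias feature
        q = np.append(np.full(d, 1.0 - p), 1.0)   # feature survival probabilities (bias never dropped)
        S = Xb @ Xb.T                             # scatter matrix of the augmented data
        Q = S * np.outer(q, q)                    # expected scatter of corrupted inputs (off-diagonal)
        np.fill_diagonal(Q, q * np.diag(S))       # ...with the correct diagonal terms
        P = S[:d, :] * q                          # expected cross-scatter of clean vs. corrupted inputs
        W = P @ np.linalg.inv(Q + reg * np.eye(d + 1))
        return W

    X = np.random.rand(20, 100)
    W = marginalized_denoising_map(X, p=0.5)
    hidden = np.tanh(W @ np.vstack([X, np.ones((1, 100))]))  # one mSDA-style layer
    print(hidden.shape)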