2 research outputs found

    Joint Progressive Knowledge Distillation and Unsupervised Domain Adaptation

    Full text link
    Currently, the divergence in distributions of design and operational data, and large computational complexity are limiting factors in the adoption of CNNs in real-world applications. For instance, person re-identification systems typically rely on a distributed set of cameras, where each camera has different capture conditions. This can translate to a considerable shift between source (e.g. lab setting) and target (e.g. operational camera) domains. Given the cost of annotating image data captured for fine-tuning in each target domain, unsupervised domain adaptation (UDA) has become a popular approach to adapt CNNs. Moreover, state-of-the-art deep learning models that provide a high level of accuracy often rely on architectures that are too complex for real-time applications. Although several compression and UDA approaches have recently been proposed to overcome these limitations, they do not allow optimizing a CNN to simultaneously address both. In this paper, we propose an unexplored direction -- the joint optimization of CNNs to provide a compressed model that is adapted to perform well for a given target domain. In particular, the proposed approach performs unsupervised knowledge distillation (KD) from a complex teacher model to a compact student model, by leveraging both source and target data. It also improves upon existing UDA techniques by progressively teaching the student about domain-invariant features, instead of directly adapting a compact model on target domain data. Our method is compared against state-of-the-art compression and UDA techniques, using two popular classification datasets for UDA -- Office31 and ImageClef-DA. In both datasets, results indicate that our method can achieve the highest level of accuracy while requiring a comparable or lower time complexity.Comment: Accepted to WCCI/IJCNN 202

    Exploiting Prunability for Person Re-Identification

    Full text link
    Recent years have witnessed a substantial increase in the deep learning (DL)architectures proposed for visual recognition tasks like person re-identification,where individuals must be recognized over multiple distributed cameras. Althoughthese architectures have greatly improved the state-of-the-art accuracy, thecomputational complexity of the CNNs commonly used for feature extractionremains an issue, hindering their deployment on platforms with limited resources,or in applications with real-time constraints. There is an obvious advantage toaccelerating and compressing DL models without significantly decreasing theiraccuracy. However, the source (pruning) domain differs from operational (target)domains, and the domain shift between image data captured with differentnon-overlapping camera viewpoints leads to lower recognition accuracy. In thispaper, we investigate the prunability of these architectures under different designscenarios. This paper first revisits pruning techniques that are suitable forreducing the computational complexity of deep CNN networks applied to personre-identification. Then, these techniques are analysed according to their pruningcriteria and strategy, and according to different scenarios for exploiting pruningmethods to fine-tuning networks to target domains. Experimental resultsobtained using DL models with ResNet feature extractors, and multiplebenchmarks re-identification datasets, indicate that pruning can considerablyreduce network complexity while maintaining a high level of accuracy. Inscenarios where pruning is performed with large pre-training or fine-tuningdatasets, the number of FLOPS required by ResNet architectures is reduced byhalf, while maintaining a comparable rank-1 accuracy (within 1% of the originalmodel). Pruning while training a larger CNNs can also provide a significantlybetter performance than fine-tuning smaller ones.Comment: Accepted for EURASIP Journal on Image and Video Processin
    corecore