Visual Learning Beyond Human Curated Datasets

Abstract

The success of deep neural networks on a variety of computer vision tasks relies heavily on large-scale datasets. However, manually acquiring labels for large datasets is expensive. Given the cost of human annotation and the scarcity of data, the challenge is to learn efficiently from insufficiently labeled data. In this dissertation, we propose several approaches to data-efficient learning in the contexts of few-shot learning, long-tailed visual recognition, and unsupervised and semi-supervised learning. In the first part, we propose a novel paradigm of Task-Agnostic Meta-Learning (TAML) algorithms to improve few-shot learning. In the second part, we analyze the long-tailed problem from a domain adaptation perspective and propose to augment classic class-balanced learning for long-tailed recognition by explicitly estimating the differences between the class-conditioned distributions with a meta-learning approach. Following this, we propose a lazy approach based on an intuitive teacher-student scheme that enables gradient-based meta-learning algorithms to explore long horizons. Finally, in the third part, we propose a novel face detector adaptation approach that is applicable whenever the target domain supplies many representative images, whether or not they are labeled. Experiments on several benchmark datasets verify the efficacy of the proposed methods under all settings.