Learning from Limited Labeled Data for Visual Recognition
Recent advances in computer vision are in part due to the widespread use of deep neural networks. However, training deep networks requires enormous amounts of labeled data, which can be a bottleneck. In this thesis, we propose several approaches to mitigate this in the context of modern deep networks and computer vision tasks.
While transfer learning is an effective strategy for natural image tasks where large labeled datasets such as ImageNet are available, it is less effective for distant domains such as medical images and 3D shapes. Chapter 2 focuses on transfer learning from natural image representations to other modalities. In many cases, cross-modal data can be generated using computer graphics techniques. By forcing predictions to agree across modalities, we show that the models become more robust to image degradation, such as low resolution, grayscale, or line drawings in place of high-resolution color images. Similarly, we show that 3D shape classifiers learned from multi-view images can be transferred to models that operate on voxel or point-cloud representations.
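The cross-modal agreement idea can be illustrated with a small sketch (a hypothetical NumPy formulation, not the thesis's exact loss): one common way to force agreement is to penalize the KL divergence between the class predictions produced from a clean image and from its degraded counterpart.

```python
import numpy as np

def softmax(logits):
    """Numerically stable softmax over the last axis."""
    z = logits - logits.max(axis=-1, keepdims=True)
    e = np.exp(z)
    return e / e.sum(axis=-1, keepdims=True)

def agreement_loss(logits_color, logits_degraded):
    """KL(p_color || p_degraded): penalizes the degraded-input branch
    for disagreeing with the color-image branch's predictions."""
    p = softmax(logits_color)
    q = softmax(logits_degraded)
    return float(np.sum(p * (np.log(p + 1e-12) - np.log(q + 1e-12)), axis=-1).mean())

# Toy example: logits from two modalities for a batch of 2 examples, 3 classes.
logits_color = np.array([[2.0, 0.5, -1.0], [0.1, 1.5, 0.2]])
logits_gray  = np.array([[1.8, 0.6, -0.9], [0.0, 1.4, 0.3]])
print(agreement_loss(logits_color, logits_gray))  # small value: predictions agree
```

In practice this consistency term would be added to the usual supervised loss on the modality where labels exist.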
Another line of work has focused on techniques for few-shot learning. In particular, meta-learning approaches explicitly aim to learn generalizable representations by emphasizing transferability to novel tasks. In Chapter 3, we analyze how to improve these techniques by exploiting unlabeled data from related tasks. We show that combining unsupervised objectives with meta-learning objectives can boost performance on novel tasks. However, we find that small amounts of domain-specific data can be more beneficial than large amounts of generic data.
While transfer learning, unsupervised learning, and few-shot learning have been studied in isolation, in practice one often finds that transfer learning from large labeled datasets is more effective than the alternatives. This is partly due to a lack of evaluation on benchmarks that contain challenges such as class imbalance and domain mismatch. In Chapter 4, we explore the role of expert models in the context of semi-supervised learning on a realistic benchmark. Unlike existing semi-supervised benchmarks, our dataset is designed to expose challenges encountered in realistic settings, such as fine-grained similarity between classes, significant class imbalance, and domain mismatch between the labeled and unlabeled data. We show that current semi-supervised methods are negatively affected by out-of-class data, and that their performance pales compared to a transfer learning baseline. Lastly, we leverage coarse labels from a large collection of images to improve semi-supervised learning. In Chapter 5, we show that incorporating hierarchical labels from the taxonomy improves state-of-the-art semi-supervised methods.
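The coarse-label idea can be sketched as follows (a hypothetical NumPy illustration, not the thesis's exact formulation): fine-grained class probabilities are marginalized up a taxonomy, so an image carrying only a coarse label can still supervise the fine-grained classifier.

```python
import numpy as np

def coarse_log_likelihood(fine_probs, fine_to_coarse, coarse_label):
    """Marginalize fine-class probabilities up the taxonomy and score
    the coarse label. fine_to_coarse[i] = coarse class of fine class i."""
    n_coarse = fine_to_coarse.max() + 1
    coarse_probs = np.zeros(n_coarse)
    for fine, coarse in enumerate(fine_to_coarse):
        coarse_probs[coarse] += fine_probs[fine]
    return np.log(coarse_probs[coarse_label] + 1e-12)

# Toy taxonomy: fine classes 0,1 -> coarse 0 ("dog"); fine 2,3 -> coarse 1 ("cat").
fine_to_coarse = np.array([0, 0, 1, 1])
fine_probs = np.array([0.5, 0.3, 0.1, 0.1])  # model is unsure which dog breed
print(coarse_log_likelihood(fine_probs, fine_to_coarse, coarse_label=0))  # log(0.8) ≈ -0.223
```

Maximizing this marginal likelihood rewards the model for concentrating mass on the correct coarse branch even when the exact fine class is unknown.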
A Survey of Self-supervised Learning from Multiple Perspectives: Algorithms, Applications and Future Trends
Deep supervised learning algorithms generally require large numbers of labeled examples to achieve satisfactory performance. However, collecting and labeling many examples can be costly and time-consuming. As a subset of unsupervised learning, self-supervised learning (SSL) aims to learn useful features from unlabeled examples without any human-annotated labels. SSL has recently attracted much attention, and many related algorithms have been developed. However, there are few comprehensive studies that explain the connections and evolution of different SSL variants. In this paper, we review various SSL methods from the perspectives of algorithms, applications, three main trends, and open questions. First, the motivations of most SSL algorithms are introduced in detail, and their commonalities and differences are compared. Second, typical applications of SSL in domains such as image processing and computer vision (CV), as well as natural language processing (NLP), are discussed. Finally, the three main trends of SSL and the open research questions are discussed. A collection of useful materials is available at https://github.com/guijiejie/SSL.
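As a concrete illustration of the pretext-task idea underlying many SSL algorithms (this example is not from the survey itself), the classic rotation-prediction task derives free labels directly from the data:

```python
import numpy as np

def make_rotation_task(images):
    """RotNet-style self-supervised pretext task: rotate each image by
    0/90/180/270 degrees; the rotation index serves as a free label."""
    inputs, labels = [], []
    for img in images:
        for k in range(4):                     # k quarter-turns
            inputs.append(np.rot90(img, k=k))
            labels.append(k)
    return np.stack(inputs), np.array(labels)

# Toy batch of two 4x4 "images".
images = np.random.rand(2, 4, 4)
x, y = make_rotation_task(images)
print(x.shape, y)  # (8, 4, 4) [0 1 2 3 0 1 2 3]
```

A network trained to predict `y` from `x` must learn features sensitive to object orientation and layout, without any human annotation.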
To Compress or Not to Compress -- Self-Supervised Learning and Information Theory: A Review
Deep neural networks have demonstrated remarkable performance in supervised learning tasks but require large amounts of labeled data. Self-supervised learning offers an alternative paradigm, enabling the model to learn from data without explicit labels. Information theory has been instrumental in understanding and optimizing deep neural networks. Specifically, the information bottleneck principle has been applied to optimize the trade-off between compression and relevant information preservation in supervised settings. However, the optimal information objective in self-supervised learning remains unclear. In this paper, we review various approaches to self-supervised learning from an information-theoretic standpoint and present a unified framework that formalizes the \textit{self-supervised information-theoretic learning problem}. We integrate existing research into a coherent framework, examine recent self-supervised methods, and identify research opportunities and challenges. Moreover, we discuss empirical measurement of information-theoretic quantities and their estimators. This paper offers a comprehensive review of the intersection between information theory, self-supervised learning, and deep neural networks.
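The information bottleneck trade-off mentioned above is commonly written as a Lagrangian (the standard formulation from the IB literature; notation assumed, not quoted from this paper): compress the representation $Z$ of the input $X$ while preserving information about the target $Y$:

```latex
\min_{p(z \mid x)} \; \mathcal{L}_{\mathrm{IB}} \;=\; I(X; Z) \;-\; \beta \, I(Z; Y)
```

Here $\beta$ controls the compression/preservation trade-off. In self-supervised settings the target $Y$ is unavailable, which is precisely why the optimal objective remains unclear, as the abstract notes.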
Learning Dense Object Descriptors from Multiple Views for Low-shot Category Generalization
A hallmark of the deep learning era for computer vision is the successful use of large-scale labeled datasets to train feature representations for tasks ranging from object recognition and semantic segmentation to optical flow estimation and novel view synthesis of 3D scenes. In this work, we aim to learn dense discriminative object representations for low-shot category recognition without requiring any category labels. To this end, we propose Deep Object Patch Encodings (DOPE), which can be trained from multiple views of object instances without any category or semantic object part labels. To train DOPE, we assume access to sparse depths, foreground masks, and known cameras, which we use to obtain pixel-level correspondences between views of an object and formulate a self-supervised learning task to learn discriminative object patches. We find that DOPE can be used directly for low-shot classification of novel categories using local-part matching, and is competitive with or outperforms supervised and self-supervised learning baselines. Accepted at NeurIPS 2022. Code and data available at https://github.com/rehg-lab/dope_selfsup.
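The pixel-level correspondence step can be sketched with standard multi-view geometry (a minimal NumPy illustration; the function name and toy camera are assumptions, not the authors' code): a pixel with known depth is unprojected to 3D using the first camera's intrinsics, transformed by the relative pose, and reprojected into the second view.

```python
import numpy as np

def correspond(uv, depth, K1, K2, R, t):
    """Map pixel uv in view 1 to its corresponding pixel in view 2, given
    its depth, intrinsics K1/K2, and the relative pose (R, t) taking
    view-1 camera coordinates to view-2 camera coordinates."""
    u, v = uv
    p1 = depth * np.linalg.inv(K1) @ np.array([u, v, 1.0])  # unproject to 3D
    p2 = R @ p1 + t                                         # into view-2 frame
    q = K2 @ p2                                             # project
    return q[:2] / q[2]

# Sanity check: identical cameras and an identity pose map a pixel to itself.
K = np.array([[100.0, 0, 32], [0, 100.0, 32], [0, 0, 1]])
uv2 = correspond((40.0, 20.0), depth=2.0, K1=K, K2=K, R=np.eye(3), t=np.zeros(3))
print(uv2)  # [40. 20.]
```

Such correspondences give positive pairs for free: patch embeddings at corresponding pixels can be pulled together and non-corresponding ones pushed apart, with no semantic labels involved.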
Recent Advances in Transfer Learning for Cross-Dataset Visual Recognition: A Problem-Oriented Perspective
This paper takes a problem-oriented perspective and presents a comprehensive review of transfer learning methods, both shallow and deep, for cross-dataset visual recognition. Specifically, it categorises cross-dataset recognition into seventeen problems based on a set of carefully chosen data and label attributes. This problem-oriented taxonomy has allowed us to examine how different transfer learning approaches tackle each problem and how well each problem has been researched to date. The review reveals not only the challenges in transfer learning for visual recognition, but also the problems (eight of the seventeen) that have been scarcely studied. This survey thus offers an up-to-date technical review for researchers, as well as a systematic approach and a reference for machine learning practitioners to categorise a real problem and look up a possible solution accordingly.