    A Survey on Deep Semi-supervised Learning

    Full text link
    Deep semi-supervised learning is a fast-growing field with a range of practical applications. This paper provides a comprehensive survey of both the fundamentals and recent advances in deep semi-supervised learning methods from the perspectives of model design and unsupervised loss functions. We first present a taxonomy for deep semi-supervised learning that categorizes existing methods into deep generative methods, consistency regularization methods, graph-based methods, pseudo-labeling methods, and hybrid methods. Then we offer a detailed comparison of these methods in terms of loss types, contributions, and architectural differences. Beyond the past few years' progress, we further discuss some shortcomings of existing methods and provide tentative heuristic solutions to these open problems. (Comment: 24 pages, 6 figures.)
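
    As a rough illustration of two of the surveyed families, the sketch below (not taken from the survey) combines a supervised cross-entropy term with a consistency-regularization term on unlabeled data; `model` and `augment` are hypothetical placeholders, and PyTorch is assumed.

```python
import torch
import torch.nn.functional as F

def semi_supervised_loss(model, x_lab, y_lab, x_unlab, augment, lam=1.0):
    # Supervised cross-entropy on the labeled batch.
    sup_loss = F.cross_entropy(model(x_lab), y_lab)

    # Consistency regularization: predictions on two augmented views of the
    # same unlabeled inputs should agree (MSE between softmax outputs).
    p1 = F.softmax(model(augment(x_unlab)), dim=1)
    p2 = F.softmax(model(augment(x_unlab)), dim=1).detach()
    cons_loss = F.mse_loss(p1, p2)

    return sup_loss + lam * cons_loss
```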

    A Survey on Semi-Supervised Learning for Delayed Partially Labelled Data Streams

    Full text link
    Unlabelled data appear in many domains and are particularly relevant to streaming applications, where even though data are abundant, labelled data are rare. To address the learning problems associated with such data, one can ignore the unlabelled data and focus only on the labelled data (supervised learning); use the labelled data and attempt to leverage the unlabelled data (semi-supervised learning); or assume some labels will be available on request (active learning). The first approach is the simplest, but the amount of labelled data available limits the achievable predictive performance. The second relies on finding and exploiting the underlying characteristics of the data distribution. The third depends on an external agent to provide the required labels in a timely fashion. This survey pays special attention to methods that leverage unlabelled data in a semi-supervised setting. We also discuss the delayed labelling issue, which impacts both fully supervised and semi-supervised methods. We propose a unified problem setting, discuss the learning guarantees and existing methods, and explain the differences between related problem settings. Finally, we review current benchmarking practices and propose adaptations to enhance them.
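
    To make the delayed-labelling setting concrete, here is a minimal sketch (not from the survey), assuming scikit-learn's SGDClassifier: each instance is predicted on arrival, its true label only becomes available after a fixed delay, and in the meantime high-confidence predictions are used as pseudo-labels (self-training). The stream itself is synthetic.

```python
from collections import deque
import numpy as np
from sklearn.linear_model import SGDClassifier

rng = np.random.default_rng(0)

def make_sample():
    x = rng.normal(size=2)
    return x, int(x.sum() > 0)          # synthetic instance and its label

stream = [make_sample() for _ in range(500)]

model = SGDClassifier(loss="log_loss")
x0, y0 = stream[0]
model.partial_fit(x0.reshape(1, -1), [y0], classes=[0, 1])

DELAY = 50                              # label arrives DELAY steps after the instance
pending = deque()                       # (arrival_step, x, y) awaiting their label

for t, (x, y) in enumerate(stream[1:], start=1):
    pending.append((t, x, y))

    # Self-training on the new, still-unlabelled instance.
    proba = model.predict_proba(x.reshape(1, -1))[0]
    if proba.max() > 0.95:
        model.partial_fit(x.reshape(1, -1), [int(proba.argmax())])

    # Labels whose delay has elapsed become available for supervised updates.
    while pending and t - pending[0][0] >= DELAY:
        _, xd, yd = pending.popleft()
        model.partial_fit(xd.reshape(1, -1), [yd])
```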

    CONSS: Contrastive Learning Approach for Semi-Supervised Seismic Facies Classification

    Full text link
    Recently, seismic facies classification based on convolutional neural networks (CNNs) has garnered significant research interest. However, existing CNN-based supervised learning approaches require massive amounts of labeled data. Labeling is laborious and time-consuming, particularly for 3D seismic data volumes. To overcome this challenge, we propose a semi-supervised method based on pixel-level contrastive learning, termed CONSS, which can efficiently identify seismic facies using only 1% of the original annotations. Furthermore, the absence of a unified data division and standardized metrics hinders fair comparison of facies classification approaches. To this end, we develop an objective benchmark for the evaluation of semi-supervised methods, including self-training, consistency regularization, and the proposed CONSS. Our benchmark is publicly available to enable researchers to objectively compare different approaches. Experimental results demonstrate that our approach achieves state-of-the-art performance on the F3 survey.
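
    The following is a minimal sketch of a pixel-level contrastive (InfoNCE-style) loss of the kind the abstract refers to; it is not the paper's CONSS implementation. PyTorch is assumed, and pixels sharing a (pseudo-)label are treated as positives.

```python
import torch
import torch.nn.functional as F

def pixel_contrastive_loss(embeddings, labels, temperature=0.1):
    # embeddings: (N, D) per-pixel features; labels: (N,) class ids.
    z = F.normalize(embeddings, dim=1)
    sim = z @ z.t() / temperature                        # pairwise similarities
    mask_self = torch.eye(len(labels), device=z.device)
    mask_pos = (labels[:, None] == labels[None, :]).float() - mask_self

    # Log-softmax over all other pixels, averaged over positive pairs.
    logits = sim - 1e9 * mask_self                       # exclude self-similarity
    log_prob = F.log_softmax(logits, dim=1)
    pos_count = mask_pos.sum(dim=1).clamp(min=1)
    loss = -(mask_pos * log_prob).sum(dim=1) / pos_count
    return loss.mean()
```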

    A Survey on Metric Learning for Feature Vectors and Structured Data

    Full text link
    The need for appropriate ways to measure the distance or similarity between data is ubiquitous in machine learning, pattern recognition, and data mining, but handcrafting good metrics for specific problems is generally difficult. This has led to the emergence of metric learning, which aims to learn a metric from data automatically and has attracted much interest in machine learning and related fields over the past ten years. This survey proposes a systematic review of the metric learning literature, highlighting the pros and cons of each approach. We pay particular attention to Mahalanobis distance metric learning, a well-studied and successful framework, but additionally present a wide range of methods that have recently emerged as powerful alternatives, including nonlinear metric learning, similarity learning, and local metric learning. Recent trends and extensions, such as semi-supervised metric learning, metric learning for histogram data, and the derivation of generalization guarantees, are also covered. Finally, this survey addresses metric learning for structured data, in particular edit distance learning, and attempts to give an overview of the remaining challenges in metric learning for the years to come. (Comment: Technical report, 59 pages.)
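
    As a toy illustration of the Mahalanobis framework the survey highlights (not code from the survey), the sketch below parameterizes M = LᵀL and learns L from same-class/different-class pairs with a simple margin loss; PyTorch is assumed, and the data are random placeholders.

```python
import torch

def mahalanobis_sq(L, x, y):
    # Squared Mahalanobis distance (x - y)^T M (x - y) with M = L^T L.
    diff = (x - y) @ L.t()
    return (diff ** 2).sum(dim=1)

def pairwise_metric_loss(L, x1, x2, same_class, margin=1.0):
    d2 = mahalanobis_sq(L, x1, x2)
    pos = same_class * d2                                     # pull same-class pairs together
    neg = (1 - same_class) * torch.clamp(margin - d2, min=0)  # push different-class pairs apart
    return (pos + neg).mean()

# Usage: L is a learnable (d, d) matrix optimized with any gradient method.
d = 8
L = torch.randn(d, d, requires_grad=True)
opt = torch.optim.SGD([L], lr=0.01)
x1, x2 = torch.randn(32, d), torch.randn(32, d)
same = (torch.rand(32) > 0.5).float()

opt.zero_grad()
loss = pairwise_metric_loss(L, x1, x2, same)
loss.backward()
opt.step()
```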

    Label-Efficient Deep Learning in Medical Image Analysis: Challenges and Future Directions

    Full text link
    Deep learning has seen rapid growth in recent years and achieved state-of-the-art performance in a wide range of applications. However, training such models typically requires the expensive and time-consuming collection of large quantities of labeled data. This is particularly true within the scope of medical image analysis (MIA), where data are limited and labels are expensive to acquire. Thus, label-efficient deep learning methods have been developed to make comprehensive use of the labeled data as well as the abundance of unlabeled and weakly labeled data. In this survey, we extensively investigate over 300 recent papers to provide a comprehensive overview of recent progress on label-efficient learning strategies in MIA. We first present the background of label-efficient learning and categorize the approaches into different schemes. Next, we examine the current state-of-the-art methods in detail for each scheme. Specifically, we provide an in-depth investigation covering not only the canonical semi-supervised, self-supervised, and multi-instance learning schemes, but also recently emerged active learning and annotation-efficient learning strategies. Moreover, as a comprehensive contribution to the field, this survey not only elucidates the commonalities and unique features of the surveyed methods but also presents a detailed analysis of the current challenges in the field and suggests potential avenues for future research.
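
    As one concrete example of the annotation-efficient strategies such surveys cover, the sketch below shows entropy-based query selection for active learning; it is illustrative only, PyTorch is assumed, and `model` and `unlabeled_loader` are hypothetical placeholders.

```python
import torch
import torch.nn.functional as F

@torch.no_grad()
def select_queries(model, unlabeled_loader, budget=16):
    """Pick the `budget` most uncertain unlabeled samples to send for annotation."""
    model.eval()
    scores, items = [], []
    for x, idx in unlabeled_loader:        # idx identifies each unlabeled image
        probs = F.softmax(model(x), dim=1)
        entropy = -(probs * probs.clamp(min=1e-12).log()).sum(dim=1)
        scores.append(entropy)
        items.append(idx)
    scores = torch.cat(scores)
    items = torch.cat(items)
    top = scores.topk(budget).indices      # highest predictive entropy
    return items[top]
```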

    Deep Semi-Supervised Semantic Segmentation in Multi-Frequency Echosounder Data

    Get PDF
    Multi-frequency echosounder data can provide a broad understanding of the underwater environment in a non-invasive manner. The analysis of echosounder data is, hence, a topic of great importance for the marine ecosystem. Semantic segmentation, a deep-learning-based analysis method that predicts the class attribute of each acoustic intensity, has recently been in the spotlight of the fisheries and aquatic industry, since its results can be used to estimate the abundance of marine organisms. However, a fundamental problem with current methods is their massive reliance on large amounts of annotated training data, which can only be acquired through expensive handcrafted annotation processes, making such approaches unrealistic in practice. As a solution to this challenge, we propose a novel approach that leverages a small amount of annotated data (supervised deep learning) and a large amount of readily available unannotated data (unsupervised learning), yielding a new data-efficient and accurate semi-supervised semantic segmentation method, all embodied in a single end-to-end trainable convolutional neural network architecture. Our method is evaluated on representative data from a sandeel survey in the North Sea conducted by the Norwegian Institute of Marine Research. Rigorous experiments validate that, by leveraging unannotated data, our method achieves results comparable to the supervised method while using only 40 percent of the annotated data on which that method is trained. The code is available at https://github.com/SFI-Visual-Intelligence/PredKlus-semisup-segmentation.
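
    A minimal sketch of one common way to combine the two data sources in a segmentation loss (not the paper's method), assuming PyTorch: supervised cross-entropy on the annotated pixels plus entropy minimization on the unannotated ones, with `IGNORE` marking unannotated pixels by assumption.

```python
import torch
import torch.nn.functional as F

IGNORE = 255  # label value marking unannotated pixels (assumption)

def semi_sup_seg_loss(logits, labels, lam=0.1):
    # logits: (B, C, H, W); labels: (B, H, W) with IGNORE where unannotated.
    sup = F.cross_entropy(logits, labels, ignore_index=IGNORE)

    # Entropy minimization on the unannotated pixels only.
    probs = F.softmax(logits, dim=1)
    entropy = -(probs * probs.clamp(min=1e-12).log()).sum(dim=1)   # (B, H, W)
    unlabeled = (labels == IGNORE).float()
    unsup = (entropy * unlabeled).sum() / unlabeled.sum().clamp(min=1)

    return sup + lam * unsup
```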