Semi-Supervised Deep Regression with Uncertainty Consistency and Variational Model Ensembling via Bayesian Neural Networks
Deep regression is an important problem with numerous applications. These
range from computer vision tasks such as age estimation from photographs, to
medical tasks such as ejection fraction estimation from echocardiograms for
disease tracking. However, semi-supervised approaches for deep regression are
notably under-explored compared to classification and segmentation tasks.
Unlike classification tasks, which rely on thresholding functions for
generating class pseudo-labels, regression tasks use real number target
predictions directly as pseudo-labels, making them more sensitive to prediction
quality. In this work, we propose a novel approach to semi-supervised
regression, namely Uncertainty-Consistent Variational Model Ensembling (UCVME),
which improves training by generating high-quality pseudo-labels and
uncertainty estimates for heteroscedastic regression. Given that aleatoric
uncertainty is only dependent on input data by definition and should be equal
for the same inputs, we present a novel uncertainty consistency loss for
co-trained models. Our consistency loss significantly improves uncertainty
estimates and allows higher quality pseudo-labels to be assigned greater
importance under heteroscedastic regression. Furthermore, we introduce a novel
variational model ensembling approach to reduce prediction noise and generate
more robust pseudo-labels. We analytically show our method generates higher
quality targets for unlabeled data and further improves training. Experiments
show that our method outperforms state-of-the-art alternatives on different
tasks and can be competitive with supervised methods that use full labels. Our
code is available at https://github.com/xmed-lab/UCVME.
Comment: Accepted by AAAI2
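A minimal numpy sketch of the two loss terms the abstract names: the heteroscedastic regression negative log-likelihood, which down-weights pseudo-labels where predicted aleatoric uncertainty is high, and a consistency penalty between the two co-trained models' uncertainty estimates. Function and variable names are illustrative assumptions, not taken from the UCVME code.

```python
import numpy as np

def heteroscedastic_nll(y_pred, log_var, y_true):
    """NLL for heteroscedastic regression: residuals are scaled by
    exp(-log_var), so high-uncertainty predictions contribute less."""
    return np.mean(0.5 * np.exp(-log_var) * (y_pred - y_true) ** 2
                   + 0.5 * log_var)

def uncertainty_consistency(log_var_a, log_var_b):
    """Aleatoric uncertainty depends only on the input, so the two
    co-trained models are penalised for disagreeing on it."""
    return np.mean((log_var_a - log_var_b) ** 2)
```

In the full method these terms would be combined with a supervised loss and a pseudo-label loss on unlabeled data; this sketch only shows the uncertainty-related pieces.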
Robust Surgical Tools Detection in Endoscopic Videos with Noisy Data
Over the past few years, surgical data science has attracted substantial
interest from the machine learning (ML) community. Various studies have
demonstrated the efficacy of emerging ML techniques in analysing surgical data,
particularly recordings of procedures, for digitizing clinical and non-clinical
functions like preoperative planning, context-aware decision-making, and
operating skill assessment. However, this field is still in its infancy and
lacks representative, well-annotated datasets for training robust models in
intermediate ML tasks. Also, existing datasets suffer from inaccurate labels,
hindering the development of reliable models. In this paper, we propose a
systematic methodology for developing robust models for surgical tool detection
using noisy data. Our methodology introduces two key innovations: (1) an
intelligent active learning strategy for minimal dataset identification and
label correction by human experts; and (2) an assembling strategy for a
student-teacher model-based self-training framework to achieve the robust
classification of 14 surgical tools in a semi-supervised fashion. Furthermore,
we employ weighted data loaders to handle difficult class labels and address
class imbalance issues. The proposed methodology achieves an average F1-score
of 85.88\% for the ensemble model-based self-training with class weights, and
80.88\% without class weights for noisy labels. Moreover, our proposed method
significantly outperforms existing approaches, demonstrating its effectiveness.
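The abstract mentions weighted data loaders for handling class imbalance across the 14 tool classes. One common scheme, shown here as an assumption rather than the paper's exact formulation, is inverse-frequency class weighting:

```python
import numpy as np

def inverse_frequency_weights(labels, n_classes):
    """Weight each class inversely to its frequency so that rare
    surgical tools contribute as much to the loss as common ones."""
    counts = np.bincount(labels, minlength=n_classes).astype(float)
    counts[counts == 0] = 1.0  # guard against empty classes
    # Normalised so a perfectly balanced dataset yields weight 1.0 per class.
    return counts.sum() / (n_classes * counts)
```

Such weights can be passed to a weighted loss or used to bias a sampler so minority-class frames appear more often per epoch.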
Less is More: Reducing Task and Model Complexity for 3D Point Cloud Semantic Segmentation
Whilst the availability of 3D LiDAR point cloud data has significantly grown in recent years, annotation remains expensive and time-consuming, leading to a demand for semi-supervised semantic segmentation methods in application domains such as autonomous driving. Existing work very often employs relatively large segmentation backbone networks to improve segmentation accuracy, at the expense of computational costs. In addition, many use uniform sampling to reduce ground truth data requirements for learning, often resulting in sub-optimal performance. To address these issues, we propose a new pipeline that employs a smaller architecture, requiring fewer ground-truth annotations to achieve superior segmentation accuracy compared to contemporary approaches. This is facilitated via a novel Sparse Depthwise Separable Convolution module that significantly reduces the network parameter count while retaining overall task performance. To effectively sub-sample our training data, we propose a new Spatio-Temporal Redundant Frame Downsampling (ST-RFD) method that leverages knowledge of sensor motion within the environment to extract a more diverse subset of training data frame samples. To make use of limited annotated data samples, we further propose a soft pseudo-label method informed by LiDAR reflectivity. Our method outperforms contemporary semi-supervised work in terms of mIoU, using less labeled data, on the SemanticKITTI (59.5@5%) and ScribbleKITTI (58.1@5%) benchmark datasets, based on a 2.3× reduction in model parameters and 641× fewer multiply-add operations, whilst also demonstrating significant performance improvement on limited training data (i.e., Less is More).
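The parameter savings behind depthwise separable convolution can be seen with simple arithmetic: a standard k×k convolution couples spatial filtering and channel mixing, while the separable form splits them into a per-channel depthwise filter followed by a 1×1 pointwise convolution. The sketch below shows the dense parameter counts only; the paper's sparse variant would reduce them further.

```python
def conv_params(k, c_in, c_out):
    """Weights in a standard k x k convolution (bias omitted)."""
    return k * k * c_in * c_out

def depthwise_separable_params(k, c_in, c_out):
    """One k x k depthwise filter per input channel, then a 1x1
    pointwise convolution that mixes channels."""
    return k * k * c_in + c_in * c_out
```

For a 3×3 layer with 64 input and 64 output channels this is 36,864 weights versus 4,672, roughly a 7.9× reduction for that layer.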
Semi-supervised Learning for Real-time Segmentation of Ultrasound Video Objects: A Review
Real-time intelligent segmentation of ultrasound video objects is a demanding task in the field of medical image processing and serves as an essential and critical step in image-guided clinical procedures. However, obtaining reliable and accurate medical image annotations often necessitates expert guidance, making the acquisition of large-scale annotated datasets challenging and costly. This presents obstacles for traditional supervised learning methods. Consequently, semi-supervised learning (SSL) has emerged as a promising solution, capable of utilizing unlabeled data to enhance model performance, and has been widely adopted in medical image segmentation tasks. However, striking a balance between segmentation accuracy and inference speed remains a challenge for real-time segmentation. This paper provides a comprehensive review of research progress in real-time intelligent semi-supervised ultrasound video object segmentation (SUVOS) and offers insights into future developments in this area.
How good is good enough? Strategies for dealing with unreliable segmentation annotations of medical data
Medical image segmentation is an essential topic in computer vision and medical image analysis, because it enables the precise and accurate segmentation of organs and lesions for healthcare applications. Deep learning has dominated in medical image segmentation due to increasingly powerful computational resources, successful neural network architecture engineering, and access to large amounts of medical imaging data with high-quality annotations. However, annotating medical imaging data is time-consuming and expensive, and sometimes the annotations are unreliable.
This DPhil thesis presents a comprehensive study that explores deep learning techniques in medical image segmentation under various challenging situations of unreliable medical imaging data. These situations include: (1) conventional supervised learning to tackle comprehensive data annotation with full dense masks, (2) semi-supervised learning to tackle partial data annotation with full dense masks, (3) noise-robust learning to tackle comprehensive data annotation with noisy dense masks, and (4) weakly-supervised learning to tackle comprehensive data annotation with sketchy contours for network training.
The proposed medical image segmentation strategies improve deep learning techniques to effectively address a series of challenges in medical image analysis, including limited annotated data, noisy annotations, and sparse annotations. These advancements aim to bring deep learning techniques of medical image analysis into practical clinical scenarios. By overcoming these challenges, the strategies establish a more robust and reliable application of deep learning methods, which is valuable for improving diagnostic precision and patient care outcomes in real-world clinical environments.
Self-ensembling for visual domain adaptation
This paper explores the use of self-ensembling for visual domain adaptation
problems. Our technique is derived from the mean teacher variant (Tarvainen et
al., 2017) of temporal ensembling (Laine et al., 2017), a technique that
achieved state of the art results in the area of semi-supervised learning. We
introduce a number of modifications to their approach for challenging domain
adaptation scenarios and evaluate its effectiveness. Our approach achieves
state of the art results in a variety of benchmarks, including our winning
entry in the VISDA-2017 visual domain adaptation challenge. In small image
benchmarks, our algorithm not only outperforms prior art, but can also achieve
accuracy that is close to that of a classifier trained in a supervised fashion.
Comment: 20 pages, 3 figures, accepted as a poster at ICLR 201
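The mean teacher scheme underlying this work maintains a teacher model whose weights are an exponential moving average of the student's, smoothing out per-step training noise. A minimal sketch, with parameters represented as a plain dict (a simplification of real framework state):

```python
def ema_update(teacher, student, alpha=0.99):
    """Exponential moving average update: the teacher's weights track
    the student's slowly, acting as a temporal ensemble of recent
    student states (mean teacher)."""
    return {name: alpha * teacher[name] + (1.0 - alpha) * student[name]
            for name in teacher}
```

In self-ensembling domain adaptation, the teacher's predictions on target-domain inputs then serve as consistency targets for the student.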