352 research outputs found

    Semi-Supervised Deep Regression with Uncertainty Consistency and Variational Model Ensembling via Bayesian Neural Networks

    Full text link
    Deep regression is an important problem with numerous applications. These range from computer vision tasks such as age estimation from photographs, to medical tasks such as ejection fraction estimation from echocardiograms for disease tracking. However, semi-supervised approaches for deep regression are notably under-explored compared to classification and segmentation tasks. Unlike classification tasks, which rely on thresholding functions for generating class pseudo-labels, regression tasks use real-number target predictions directly as pseudo-labels, making them more sensitive to prediction quality. In this work, we propose a novel approach to semi-supervised regression, namely Uncertainty-Consistent Variational Model Ensembling (UCVME), which improves training by generating high-quality pseudo-labels and uncertainty estimates for heteroscedastic regression. Given that aleatoric uncertainty is by definition dependent only on the input data and should be equal for identical inputs, we present a novel uncertainty consistency loss for co-trained models. Our consistency loss significantly improves uncertainty estimates and allows higher-quality pseudo-labels to be assigned greater importance under heteroscedastic regression. Furthermore, we introduce a novel variational model ensembling approach to reduce prediction noise and generate more robust pseudo-labels. We analytically show our method generates higher-quality targets for unlabeled data and further improves training. Experiments show that our method outperforms state-of-the-art alternatives on different tasks and can be competitive with supervised methods that use full labels. Our code is available at https://github.com/xmed-lab/UCVME. Comment: Accepted by AAAI2
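    The core objects the abstract describes — a heteroscedastic Gaussian regression loss, a consistency penalty on the two co-trained models' aleatoric log-variances, and ensembled pseudo-labels — can be sketched as follows. This is a minimal illustration in plain Python; the function names and scalar formulation are our own, not taken from the UCVME code.

    ```python
    import math

    def heteroscedastic_nll(y, mu, log_var):
        # Gaussian NLL with predicted aleatoric variance: low-variance
        # (high-confidence) pseudo-labels get a larger effective weight.
        return 0.5 * math.exp(-log_var) * (y - mu) ** 2 + 0.5 * log_var

    def uncertainty_consistency(log_var_a, log_var_b):
        # Aleatoric uncertainty depends only on the input, so two
        # co-trained models should predict the same log-variance;
        # penalize their squared disagreement.
        return (log_var_a - log_var_b) ** 2

    def ensembled_pseudo_label(mus, log_vars):
        # Variational model ensembling: average predictions and
        # log-variances over stochastic forward passes of both models
        # to reduce pseudo-label noise.
        n = len(mus)
        return sum(mus) / n, sum(log_vars) / n
    ```

    In this sketch, the averaged prediction serves as the pseudo-label for an unlabeled sample and the averaged log-variance sets its weight in the heteroscedastic loss.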

    Robust Surgical Tools Detection in Endoscopic Videos with Noisy Data

    Full text link
    Over the past few years, surgical data science has attracted substantial interest from the machine learning (ML) community. Various studies have demonstrated the efficacy of emerging ML techniques in analysing surgical data, particularly recordings of procedures, for digitizing clinical and non-clinical functions like preoperative planning, context-aware decision-making, and operating skill assessment. However, this field is still in its infancy and lacks representative, well-annotated datasets for training robust models in intermediate ML tasks. Moreover, existing datasets suffer from inaccurate labels, hindering the development of reliable models. In this paper, we propose a systematic methodology for developing robust models for surgical tool detection using noisy data. Our methodology introduces two key innovations: (1) an intelligent active learning strategy for minimal dataset identification and label correction by human experts; and (2) an assembling strategy for a student-teacher model-based self-training framework to achieve the robust classification of 14 surgical tools in a semi-supervised fashion. Furthermore, we employ weighted data loaders to handle difficult class labels and address class imbalance issues. The proposed methodology achieves an average F1-score of 85.88% for the ensemble model-based self-training with class weights, and 80.88% without class weights for noisy labels. Our proposed method also significantly outperforms existing approaches, demonstrating its effectiveness.
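    One common way to realize the class-weighting the abstract mentions is to weight each of the 14 tool classes inversely to its frequency in the training set, so rare tools contribute more to the loss or to sampling probability. A minimal sketch (the normalization choice is an assumption, not taken from the paper):

    ```python
    def inverse_frequency_weights(counts):
        # counts[i] = number of training samples for tool class i.
        # Weight each class by total/count, then normalize so the
        # weights average to 1 (keeps the loss scale comparable).
        total = sum(counts)
        raw = [total / c for c in counts]
        mean = sum(raw) / len(raw)
        return [w / mean for w in raw]
    ```

    Such weights can be passed to a weighted loss or used as per-sample weights in a weighted data loader, which is the role they play in the described pipeline.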

    Less is More: Reducing Task and Model Complexity for 3D Point Cloud Semantic Segmentation

    Get PDF
    Whilst the availability of 3D LiDAR point cloud data has significantly grown in recent years, annotation remains expensive and time-consuming, leading to a demand for semi-supervised semantic segmentation methods with application domains such as autonomous driving. Existing work very often employs relatively large segmentation backbone networks to improve segmentation accuracy, at the expense of computational cost. In addition, many use uniform sampling to reduce the ground-truth data required for learning, often resulting in sub-optimal performance. To address these issues, we propose a new pipeline that employs a smaller architecture, requiring fewer ground-truth annotations to achieve superior segmentation accuracy compared to contemporary approaches. This is facilitated via a novel Sparse Depthwise Separable Convolution module that significantly reduces the network parameter count while retaining overall task performance. To effectively sub-sample our training data, we propose a new Spatio-Temporal Redundant Frame Downsampling (ST-RFD) method that leverages knowledge of sensor motion within the environment to extract a more diverse subset of training data frame samples. To leverage the use of limited annotated data samples, we further propose a soft pseudo-label method informed by LiDAR reflectivity. Our method outperforms contemporary semi-supervised work in terms of mIoU, using less labeled data, on the SemanticKITTI (59.5@5%) and ScribbleKITTI (58.1@5%) benchmark datasets, based on a 2.3× reduction in model parameters and 641× fewer multiply-add operations, whilst also demonstrating significant performance improvement on limited training data (i.e., Less is More).
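    The parameter saving from depthwise separable convolution, which the abstract credits for its smaller architecture, is easy to quantify: a standard convolution couples every input channel to every output channel through a full kernel, whereas the separable form factors this into a per-channel spatial kernel plus a pointwise channel mixer. A back-of-the-envelope comparison for a 3D kernel (illustrative numbers, not the paper's actual layer shapes, and bias terms are ignored):

    ```python
    def standard_conv_params(c_in, c_out, k):
        # Dense 3D convolution: one k*k*k kernel per (input, output)
        # channel pair.
        return c_in * c_out * k ** 3

    def depthwise_separable_params(c_in, c_out, k):
        # Depthwise: one k*k*k kernel per input channel;
        # pointwise: a 1x1x1 convolution mixing channels.
        return c_in * k ** 3 + c_in * c_out
    ```

    For example, with 64 input and 64 output channels and k = 3, the dense layer needs 110,592 weights while the separable layer needs 5,824, roughly a 19× reduction at that single layer.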

    Semi-supervised Learning for Real-time Segmentation of Ultrasound Video Objects: A Review

    Get PDF
    Real-time intelligent segmentation of ultrasound video objects is a demanding task in the field of medical image processing and serves as an essential and critical step in image-guided clinical procedures. However, obtaining reliable and accurate medical image annotations often necessitates expert guidance, making the acquisition of large-scale annotated datasets challenging and costly. This presents obstacles for traditional supervised learning methods. Consequently, semi-supervised learning (SSL) has emerged as a promising solution, capable of utilizing unlabeled data to enhance model performance, and has been widely adopted in medical image segmentation tasks. However, striking a balance between segmentation accuracy and inference speed remains a challenge for real-time segmentation. This paper provides a comprehensive review of research progress in real-time intelligent semi-supervised ultrasound video object segmentation (SUVOS) and offers insights into future developments in this area.

    How good is good enough? Strategies for dealing with unreliable segmentation annotations of medical data

    Get PDF
    Medical image segmentation is an essential topic in computer vision and medical image analysis, because it enables the precise and accurate segmentation of organs and lesions for healthcare applications. Deep learning has dominated medical image segmentation due to increasingly powerful computational resources, successful neural network architecture engineering, and access to large amounts of medical imaging data with high-quality annotations. However, annotating medical imaging data is time-consuming and expensive, and sometimes the annotations are unreliable. This DPhil thesis presents a comprehensive study that explores deep learning techniques for medical image segmentation under various challenging situations involving unreliable medical imaging data. These situations include: (1) conventional supervised learning to tackle comprehensive data annotation with full dense masks, (2) semi-supervised learning to tackle partial data annotation with full dense masks, (3) noise-robust learning to tackle comprehensive data annotation with noisy dense masks, and (4) weakly-supervised learning to tackle comprehensive data annotation with sketchy contours for network training. The proposed medical image segmentation strategies improve deep learning techniques to effectively address a series of challenges in medical image analysis, including limited annotated data, noisy annotations, and sparse annotations. These advancements aim to bring deep learning techniques for medical image analysis into practical clinical scenarios. By overcoming these challenges, the strategies establish a more robust and reliable application of deep learning methods, which is valuable for improving diagnostic precision and patient care outcomes in real-world clinical environments.

    Self-ensembling for visual domain adaptation

    Get PDF
    This paper explores the use of self-ensembling for visual domain adaptation problems. Our technique is derived from the mean teacher variant (Tarvainen et al., 2017) of temporal ensembling (Laine et al., 2017), a technique that achieved state-of-the-art results in the area of semi-supervised learning. We introduce a number of modifications to their approach for challenging domain adaptation scenarios and evaluate its effectiveness. Our approach achieves state-of-the-art results in a variety of benchmarks, including our winning entry in the VISDA-2017 visual domain adaptation challenge. In small image benchmarks, our algorithm not only outperforms prior art, but can also achieve accuracy that is close to that of a classifier trained in a supervised fashion. Comment: 20 pages, 3 figures, accepted as a poster at ICLR 201
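    The mean-teacher mechanism underlying this self-ensembling approach has a compact core: after each student update, the teacher's weights are set to an exponential moving average (EMA) of the student's, and a consistency loss pulls the student's predictions toward the teacher's on unlabeled (target-domain) inputs. A minimal sketch over flat weight lists (the scalar form and default decay are illustrative, not the paper's exact configuration):

    ```python
    def ema_update(teacher, student, alpha=0.99):
        # Mean teacher: teacher weights track an exponential moving
        # average of the student weights after each training step.
        return [alpha * t + (1 - alpha) * s for t, s in zip(teacher, student)]

    def consistency_loss(student_out, teacher_out):
        # Mean squared difference between student and teacher
        # predictions for the same (differently augmented) input.
        n = len(student_out)
        return sum((s - t) ** 2 for s, t in zip(student_out, teacher_out)) / n
    ```

    Because the teacher is an average over many recent students, its predictions are less noisy, which is what makes it a useful consistency target across domains.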