PCDAL: A Perturbation Consistency-Driven Active Learning Approach for Medical Image Segmentation and Classification
In recent years, deep learning has become a breakthrough technique in
assisting medical image diagnosis. Supervised learning using convolutional
neural networks (CNNs) provides state-of-the-art performance and has served as a
benchmark for various medical image segmentation and classification tasks. However,
supervised learning deeply relies on large-scale annotated data, which is
expensive, time-consuming, and even impractical to acquire in medical imaging
applications. Active Learning (AL) methods have been widely applied in natural
image classification tasks to reduce annotation costs by selecting more
valuable examples from the unlabeled data pool. However, their application in
medical image segmentation tasks is limited, and there is currently no
effective and universal AL-based method specifically designed for 3D medical
image segmentation. To address this limitation, we propose an AL-based method
that can be simultaneously applied to 2D medical image classification,
segmentation, and 3D medical image segmentation tasks. We extensively validated
our proposed active learning method on three publicly available and challenging
medical image datasets: the Kvasir Dataset, the COVID-19 Infection Segmentation
Dataset, and the BraTS2019 Dataset. The experimental results demonstrate that our
PCDAL can achieve significantly improved performance with fewer annotations in
2D classification and segmentation as well as 3D segmentation tasks. The code for
this study is available at https://github.com/ortonwang/PCDAL
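The perturbation-consistency idea named in the title can be sketched as follows: predict on each unlabeled sample and on several randomly perturbed copies of it, then prioritise for annotation the samples whose predictions disagree most. This is only a rough illustration under assumed choices (Gaussian noise and horizontal flips as the perturbations, per-pixel prediction variance as the inconsistency score); the `predict` callable and all names are hypothetical, not the paper's implementation.

```python
import numpy as np

def perturbation_inconsistency(predict, images, n_perturbations=4,
                               noise_std=0.05, rng=None):
    """Score each unlabeled image by how much the model's predictions vary
    under random perturbations (Gaussian noise plus a horizontal flip here).
    A higher score means less consistent, i.e. presumed more valuable to label."""
    rng = np.random.default_rng(rng)
    scores = []
    for img in images:
        preds = [predict(img)]                  # prediction on the clean image
        for _ in range(n_perturbations):
            noisy = img + rng.normal(0.0, noise_std, img.shape)
            flipped = noisy[:, ::-1]            # horizontal flip
            pred = predict(flipped)[:, ::-1]    # undo the flip on the prediction
            preds.append(pred)
        preds = np.stack(preds)
        scores.append(float(preds.var(axis=0).mean()))  # mean per-pixel variance
    return np.array(scores)

def select_for_annotation(predict, pool, budget):
    """Return the indices of the `budget` most inconsistent pool samples."""
    scores = perturbation_inconsistency(predict, pool)
    return np.argsort(scores)[::-1][:budget]
```

With a toy thresholding "model", an image whose values sit right at the decision boundary flips under noise and is ranked first, while saturated images score near zero, which is the behaviour an annotation-budget-limited pipeline would exploit.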
Automated interpretation of benthic stereo imagery
Autonomous benthic imaging reduces human risk and increases the amount of collected data. However, manually interpreting these high volumes of data is onerous, time-consuming and in many cases infeasible. The objective of this thesis is to improve the scientific utility of large image datasets. Fine-scale terrain complexity is typically quantified by rugosity and measured by divers using chains and tape measures. This thesis proposes a new technique for measuring terrain complexity from 3D stereo image reconstructions, which is non-contact and can be calculated at multiple scales over large spatial extents. Using robots, terrain complexity can be measured beyond scuba depths without endangering humans. Results show that this approach is more robust, flexible and easily repeatable than traditional methods. These proposed terrain complexity features are combined with visual colour and texture descriptors and applied to classifying imagery. New multi-dataset feature selection methods are proposed for performing feature selection across multiple datasets, and are shown to improve overall classification performance. The results show that the most informative predictors of benthic habitat types are the new terrain complexity measurements. This thesis also presents a method that aims to reduce human labelling effort while maximising classification performance by combining pre-clustering with active learning. The results support that utilising the structure of the unlabelled data in conjunction with uncertainty sampling can significantly reduce the number of labels required for a given level of accuracy. Typically only 0.00001–0.00007% of image data is annotated and processed for science purposes (20–50 points in 1–2% of the images).
Finally, this thesis proposes a framework that uses existing human-annotated point labels to train a superpixel-based automated classification system, which can extrapolate the classified results to every pixel across all the images of an entire survey.
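Combining pre-clustering with uncertainty sampling, as the abstract describes, can be sketched roughly like this: cluster the unlabeled pool so the labelling budget is spread over the data's structure, then within each cluster request a label for the sample the current classifier is least certain about. This is a generic illustration, not the thesis's actual pipeline; the plain k-means, the entropy-based uncertainty measure, and all function names are assumptions.

```python
import numpy as np

def kmeans(X, k, n_iter=20, seed=0):
    """Plain k-means: returns one cluster label per row of X."""
    rng = np.random.default_rng(seed)
    centers = X[rng.choice(len(X), size=k, replace=False)]
    for _ in range(n_iter):
        # assign each point to its nearest centre, then recompute centres
        dists = np.linalg.norm(X[:, None, :] - centers[None, :, :], axis=2)
        labels = dists.argmin(axis=1)
        for c in range(k):
            if np.any(labels == c):
                centers[c] = X[labels == c].mean(axis=0)
    return labels

def cluster_uncertainty_select(X, probs, budget, seed=0):
    """Spread the labelling budget over clusters of the unlabeled pool,
    taking the highest-entropy (most uncertain) sample from each cluster.
    `probs` holds the current classifier's class probabilities per sample."""
    labels = kmeans(X, budget, seed=seed)
    entropy = -np.sum(probs * np.log(probs + 1e-12), axis=1)
    chosen = []
    for c in range(budget):
        members = np.flatnonzero(labels == c)
        if members.size:
            chosen.append(members[entropy[members].argmax()])
    return np.array(chosen)
```

The design point is that pure uncertainty sampling can spend the whole budget on one confusing region, whereas the cluster constraint forces the selected samples to cover distinct modes of the unlabelled data.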
Crowdsourcing in Computer Vision
Computer vision systems require large amounts of manually annotated data to
properly learn challenging visual concepts. Crowdsourcing platforms offer an
inexpensive method to capture human knowledge and understanding, for a vast
number of visual perception tasks. In this survey, we describe the types of
annotations computer vision researchers have collected using crowdsourcing, and
how they have ensured that this data is of high quality while annotation effort
is minimized. We begin by discussing data collection on both classic (e.g.,
object recognition) and recent (e.g., visual story-telling) vision tasks. We
then summarize key design decisions for creating effective data collection
interfaces and workflows, and present strategies for intelligently selecting
the most important data instances to annotate. Finally, we conclude with some
thoughts on the future of crowdsourcing in computer vision.
Comment: A 69-page meta review of the field, Foundations and Trends in Computer Graphics and Vision, 201