Learning Disentangled Representations in the Imaging Domain
Disentangled representation learning has been proposed as an approach to
learning general representations even in the absence of, or with limited,
supervision. A good general representation can be fine-tuned for new target
tasks using modest amounts of data, or used directly in unseen domains
achieving remarkable performance in the corresponding task. This alleviation of
the data and annotation requirements offers tantalising prospects for
applications in computer vision and healthcare. In this tutorial paper, we
motivate the need for disentangled representations, present key theory, and
detail practical building blocks and criteria for learning such
representations. We discuss applications in medical imaging and computer vision
emphasising choices made in exemplar key works. We conclude by presenting
remaining challenges and opportunities. Comment: Submitted. This paper follows a tutorial style but also surveys a
considerable number of works (more than 200 citations)
Semi-supervised Pathology Segmentation with Disentangled Representations.
Automated pathology segmentation remains a valuable diagnostic tool in clinical practice. However, collecting training data is challenging. Semi-supervised approaches that combine labelled and unlabelled data can offer a solution to data scarcity. One such approach relies on reconstruction objectives (as self-supervision objectives) to jointly learn representations suitable for the task. Here, we propose the Anatomy-Pathology Disentanglement Network (APD-Net), a pathology segmentation model that, for the first time, attempts to jointly learn a disentanglement into anatomy, modality, and pathology factors. The model is trained in a semi-supervised fashion with new reconstruction losses that directly aim to improve pathology segmentation with limited annotations. In addition, a joint optimization strategy is proposed to take full advantage of the available annotations. We evaluate our method on two private cardiac infarction segmentation datasets of LGE-MRI scans. APD-Net can perform pathology segmentation with few annotations, maintains performance across different amounts of supervision, and outperforms related deep learning methods.
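To make the reconstruction-driven semi-supervision described above concrete, here is a minimal PyTorch-style sketch that mixes a supervised Dice term on annotated scans with a reconstruction term on unannotated ones. The module names (`seg_net`, `dec_net`), the loss weights, and the choice to reconstruct from the soft mask alone are illustrative assumptions; APD-Net itself also conditions its decoder on anatomy and modality factors.

```python
import torch
import torch.nn.functional as F

def dice_loss(pred_logits, target, eps=1e-6):
    """Soft Dice loss for a single foreground channel (the pathology mask)."""
    pred = torch.sigmoid(pred_logits)
    inter = (pred * target).sum(dim=(1, 2, 3))
    union = pred.sum(dim=(1, 2, 3)) + target.sum(dim=(1, 2, 3))
    return 1.0 - ((2.0 * inter + eps) / (union + eps)).mean()

def semi_supervised_loss(seg_net, dec_net, labelled, masks, unlabelled,
                         w_sup=1.0, w_rec=1.0):
    """Mix a supervised Dice term (annotated scans) with a self-supervised
    reconstruction term (unannotated scans). `seg_net` predicts a pathology
    logit map; `dec_net` reconstructs the image from the soft prediction."""
    # Supervised branch: annotated images with pathology masks.
    loss_sup = dice_loss(seg_net(labelled), masks)

    # Self-supervised branch: unannotated images only need to be reconstructed.
    soft_pred = torch.sigmoid(seg_net(unlabelled))
    loss_rec = F.l1_loss(dec_net(soft_pred), unlabelled)

    return w_sup * loss_sup + w_rec * loss_rec
```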
Deep learning for unsupervised domain adaptation in medical imaging: Recent advancements and future perspectives
Deep learning has demonstrated remarkable performance across various tasks in
medical imaging. However, these approaches primarily focus on supervised
learning, assuming that the training and testing data are drawn from the same
distribution. Unfortunately, this assumption may not always hold true in
practice. To address these issues, unsupervised domain adaptation (UDA)
techniques have been developed to transfer knowledge from a labeled domain to a
related but unlabeled domain. In recent years, significant advancements have
been made in UDA, resulting in a wide range of methodologies, including feature
alignment, image translation, self-supervision, and disentangled representation
methods, among others. In this paper, we provide a comprehensive literature
review of recent deep UDA approaches in medical imaging from a technical
perspective. Specifically, we categorize current UDA research in medical
imaging into six groups and further divide them into finer subcategories based
on the different tasks they perform. We also discuss the respective datasets
used in the studies to assess the divergence between the different domains.
Finally, we discuss emerging areas and provide insights and discussions on
future research directions to conclude this survey. Comment: Under Review
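As a concrete illustration of the feature-alignment family of UDA methods surveyed here, the sketch below penalises the maximum mean discrepancy (MMD) between source and target features while training the classifier only on labelled source data. The Gaussian kernel, the single bandwidth, and the module names are illustrative choices rather than a specific method from the survey.

```python
import torch
import torch.nn.functional as F

def gaussian_kernel(x, y, sigma=1.0):
    """RBF kernel matrix between two batches of feature vectors."""
    sq_dists = torch.cdist(x, y) ** 2
    return torch.exp(-sq_dists / (2.0 * sigma ** 2))

def mmd_loss(src_feats, tgt_feats, sigma=1.0):
    """Biased MMD^2 estimate used as a feature-alignment penalty."""
    k_ss = gaussian_kernel(src_feats, src_feats, sigma).mean()
    k_tt = gaussian_kernel(tgt_feats, tgt_feats, sigma).mean()
    k_st = gaussian_kernel(src_feats, tgt_feats, sigma).mean()
    return k_ss + k_tt - 2.0 * k_st

def uda_loss(encoder, classifier, src_x, src_y, tgt_x, lam=0.1):
    """Supervised loss on the labelled source domain plus an MMD term that
    pulls unlabelled target features towards the source feature distribution."""
    f_src = encoder(src_x)   # assumed to return (batch, dim) features
    f_tgt = encoder(tgt_x)
    loss_cls = F.cross_entropy(classifier(f_src), src_y)
    return loss_cls + lam * mmd_loss(f_src, f_tgt)
```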
Interpretable Diabetic Retinopathy Diagnosis based on Biomarker Activation Map
Deep learning classifiers provide the most accurate means of automatically
diagnosing diabetic retinopathy (DR) based on optical coherence tomography
(OCT) and its angiography (OCTA). The power of these models is attributable in
part to the inclusion of hidden layers that provide the complexity required to
achieve a desired task. However, hidden layers also render algorithm outputs
difficult to interpret. Here we introduce a novel biomarker activation map
(BAM) framework based on generative adversarial learning that allows clinicians
to verify and understand a classifier's decision-making. A data set of 456
macular scans was graded as non-referable or referable DR based on current
clinical standards. A DR classifier that was used to evaluate our BAM was first
trained based on this data set. The BAM generation framework was designed by
combining two U-shaped generators to provide meaningful interpretability to this
classifier. The main generator was trained to take referable scans as input and
produce an output that would be classified by the classifier as non-referable.
The BAM is then constructed as the difference image between the output and
input of the main generator. To ensure that the BAM highlights only
classifier-utilized biomarkers, an assistant generator was trained to do the
opposite, producing, from non-referable scans, outputs that would be classified
as referable by the classifier. The generated BAMs highlighted known
pathologic features including nonperfusion area and retinal fluid. A fully
interpretable classifier based on these highlights could help clinicians better
utilize and verify automated DR diagnosis. Comment: 12 pages, 8 figures
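The central construction of the BAM, the difference image between the main generator's output and its input, can be written in a few lines. The interfaces below and the percentile-based post-processing are illustrative assumptions; the paper's full framework also involves the assistant generator and the frozen classifier during training.

```python
import torch

@torch.no_grad()
def biomarker_activation_map(main_generator, referable_scan):
    """BAM construction as described in the abstract: the main generator maps a
    referable scan towards one the classifier would call non-referable, and the
    BAM is the difference image between the generator's output and its input."""
    converted = main_generator(referable_scan)
    return converted - referable_scan

def bam_heatmap(bam, percentile=95.0):
    """Optional post-processing (an assumption, not from the paper): keep only
    the strongest absolute changes to visualise the highlighted biomarkers."""
    mag = bam.abs()
    thresh = torch.quantile(mag.flatten(), percentile / 100.0)
    return torch.where(mag >= thresh, mag, torch.zeros_like(mag))
```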
Multimodal and disentangled representation learning for medical image analysis
Automated medical image analysis is a growing research field with various applications in
modern healthcare. Furthermore, a multitude of imaging techniques (or modalities) have been
developed, such as Magnetic Resonance (MR) and Computed Tomography (CT), to accentuate
different organ characteristics. Research on image analysis is predominantly driven by deep
learning methods due to their demonstrated performance. In this thesis, we argue that their success and generalisation rely on learning good latent representations. We propose methods for
learning spatial representations that are suitable for medical image data, and can combine information coming from different modalities. Specifically, we aim to improve cardiac MR segmentation, a challenging task due to varied images and limited expert annotations, by considering
complementary information present in (potentially unaligned) images of other modalities.
In order to evaluate the benefit of multimodal learning, we initially consider a synthesis task
on spatially aligned multimodal brain MR images. We propose a deep network of multiple
encoders and decoders, which we demonstrate outperforms existing approaches. The encoders
(one per input modality) map the multimodal images into modality invariant spatial feature
maps. Common and unique information is combined into a fused representation, that is robust
to missing modalities, and can be decoded into synthetic images of the target modalities. Different experimental settings demonstrate the benefit of multimodal over unimodal synthesis,
although input and output image pairs are required for training. The need for paired images can
be overcome with the cycle consistency principle, which we use in conjunction with adversarial
training to transform images from one modality (e.g. MR) to images in another (e.g. CT). This
is useful especially in cardiac datasets, where different spatial and temporal resolutions make
image pairing difficult, if not impossible.
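The cycle consistency principle mentioned above can be sketched compactly: two generators translate in opposite directions and are penalised when a round trip fails to return the original image, while adversarial losses make the translated images look realistic. The module names and the least-squares GAN objective below are illustrative placeholders, not the thesis implementation.

```python
import torch
import torch.nn.functional as F

def cycle_consistency_loss(gen_mr2ct, gen_ct2mr, mr_batch, ct_batch):
    """Round-trip penalty for unpaired MR <-> CT translation: an image
    translated to the other modality and back should match the original."""
    fake_ct = gen_mr2ct(mr_batch)
    fake_mr = gen_ct2mr(ct_batch)
    cyc_mr = gen_ct2mr(fake_ct)   # MR -> CT -> MR
    cyc_ct = gen_mr2ct(fake_mr)   # CT -> MR -> CT
    return F.l1_loss(cyc_mr, mr_batch) + F.l1_loss(cyc_ct, ct_batch)

def adversarial_generator_loss(disc_ct, fake_ct):
    """Least-squares GAN term pushing translated images to look real to the
    target-domain discriminator (one of several common GAN objectives)."""
    return ((disc_ct(fake_ct) - 1.0) ** 2).mean()
```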
Segmentation can also be considered as a form of image synthesis, if one modality consists of
semantic maps. We consider the task of extracting segmentation masks for cardiac MR images,
and aim to overcome the challenge of limited annotations by taking into account unannotated images, which are commonly ignored. We achieve this by defining suitable latent spaces,
which represent the underlying anatomies (spatial latent variable), as well as the imaging characteristics (non-spatial latent variable). Anatomical information is required for tasks such as
segmentation and regression, whereas imaging information can capture variability in intensity
characteristics for example due to different scanners. We propose two models that disentangle
cardiac images at different levels: the first extracts the myocardium from the surrounding information, whereas the second fully separates the anatomical from the imaging characteristics.
Experimental analysis confirms the utility of disentangled representations in semi-supervised
segmentation, and in regression of cardiac indices, while maintaining robustness to intensity
variations such as the ones induced by different modalities.
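A toy sketch of the second, fully disentangled model described above: one encoder produces a spatial anatomy factor, another produces a non-spatial imaging vector, a decoder recombines them for reconstruction, and a segmentation head reads only the anatomy factor. The layer choices and the broadcast-and-concatenate recombination are simplifying assumptions, not the thesis architecture.

```python
import torch
import torch.nn as nn

class ToyDisentangler(nn.Module):
    """Toy anatomy/imaging disentanglement: a spatial anatomy factor (channel
    maps) and a non-spatial imaging vector, recombined for reconstruction."""
    def __init__(self, anat_channels=8, z_dim=16):
        super().__init__()
        self.anatomy_enc = nn.Sequential(
            nn.Conv2d(1, 32, 3, padding=1), nn.ReLU(),
            nn.Conv2d(32, anat_channels, 3, padding=1))
        self.imaging_enc = nn.Sequential(
            nn.Conv2d(1, 16, 4, stride=4), nn.ReLU(),
            nn.AdaptiveAvgPool2d(1), nn.Flatten(), nn.Linear(16, z_dim))
        self.decoder = nn.Sequential(
            nn.Conv2d(anat_channels + z_dim, 32, 3, padding=1), nn.ReLU(),
            nn.Conv2d(32, 1, 3, padding=1))
        self.seg_head = nn.Conv2d(anat_channels, 1, 1)  # e.g. myocardium mask

    def forward(self, x):
        anatomy = torch.softmax(self.anatomy_enc(x), dim=1)  # spatial factor
        z = self.imaging_enc(x)                              # non-spatial factor
        z_map = z[:, :, None, None].expand(-1, -1, x.shape[2], x.shape[3])
        recon = self.decoder(torch.cat([anatomy, z_map], dim=1))
        seg = self.seg_head(anatomy)                         # uses anatomy only
        return recon, seg, anatomy, z
```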
Finally, our prior research is aggregated into one framework that encodes multimodal images
into disentangled anatomical and imaging factors. Several challenges of multimodal cardiac
imaging, such as input misalignments and the lack of expert annotations, are successfully handled in the shared anatomy space. Furthermore, we demonstrate that this approach can be used
to combine complementary anatomical information for the purpose of multimodal segmentation. This can be achieved even when no annotations are provided for one of the modalities.
This thesis creates new avenues for further research in the area of multimodal and disentangled learning with spatial representations, which we believe are key to more generalised deep
learning solutions in healthcare.
Deep generative models for medical image synthesis and strategies to utilise them
Medical imaging has revolutionised the diagnosis and treatments of diseases since the first
medical image was taken using X-rays in 1895. As medical imaging became an essential tool
in a modern healthcare system, more medical imaging techniques have been invented, such
as Magnetic Resonance Imaging (MRI), Positron Emission Tomography (PET), Computed
Tomography (CT), Ultrasound, etc. With the advance of medical imaging techniques, the
demand for processing and analysing these complex medical images is increasing rapidly.
Efforts have been put into developing approaches that can automatically analyse medical images. With the recent success of deep learning (DL) in computer vision, researchers have
applied and proposed many DL-based methods in the field of medical image analysis. However, one problem with data-driven DL-based methods is the lack of data. Unlike natural
images, medical images are more expensive to acquire and label. One way to alleviate the
lack of medical data is medical image synthesis.
In this thesis, I start with pseudo healthy synthesis, i.e. creating a ‘healthy’-looking
medical image from a pathological one. The synthesised pseudo healthy images can be used
for the detection of pathology, segmentation, etc. Several challenges exist with this task. The
first challenge is the lack of ground-truth data, as a subject cannot be healthy and diseased at
the same time. The second challenge is how to evaluate the generated images. In this thesis,
I propose a deep learning method to learn to generate pseudo healthy images with adversarial
and cycle consistency losses to overcome the lack of ground-truth data. I also propose several
metrics to evaluate the quality of synthetic ‘healthy’ images. Pseudo healthy synthesis can be
viewed as transforming images between discrete domains, e.g. from pathological domain to
healthy domain. However, there are some changes in medical data that are continuous, e.g.
brain ageing progression.
The brain changes as age increases. With the ageing global population, research on brain ageing
has attracted increasing attention. In this thesis, I propose a deep learning method that can
simulate such brain ageing progression. However, longitudinal brain data are not easy to
acquire, and where they do exist they cover only several years. Thus, the proposed method focuses on
learning subject-specific brain ageing progression without training on longitudinal data. As
there are other factors, such as neurodegenerative diseases, that can affect brain ageing, the
proposed model also considers health status, i.e. the existence of Alzheimer’s Disease (AD).
Furthermore, to evaluate the quality of synthetic aged images, I define several metrics and
conduct a series of experiments.
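A minimal sketch of how such conditioning might look in code: the generator receives the brain image together with broadcast maps of the target age difference and the AD status, and predicts a residual change so the output stays close to the subject's own anatomy. All module and variable names here are hypothetical, and conditioning-by-concatenation is only one of several plausible designs, not necessarily the one used in the thesis.

```python
import torch
import torch.nn as nn

class ConditionedAgeingGenerator(nn.Module):
    """Hypothetical conditional generator: a 2D brain slice plus two broadcast
    condition channels (target age difference, AD status) go in, and a residual
    change is predicted and added back to the input slice."""
    def __init__(self):
        super().__init__()
        self.net = nn.Sequential(
            nn.Conv2d(3, 32, 3, padding=1), nn.ReLU(),
            nn.Conv2d(32, 32, 3, padding=1), nn.ReLU(),
            nn.Conv2d(32, 1, 3, padding=1))

    def forward(self, brain, age_diff, has_ad):
        b, _, h, w = brain.shape
        age_map = age_diff.float().view(b, 1, 1, 1).expand(b, 1, h, w)
        ad_map = has_ad.float().view(b, 1, 1, 1).expand(b, 1, h, w)
        delta = self.net(torch.cat([brain, age_map, ad_map], dim=1))
        return brain + delta
```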
Suppose we have a pre-trained deep generative model and a downstream task model, say
a classifier. One question is how to make the best of the generative model to improve the
performance of the classifier. In this thesis, I propose a simple procedure that can discover
the ‘weakness’ of the classifier and guide the generator to synthesise counterfactuals (synthetic
data) that are hard for the classifier. The proposed procedure constructs an adversarial
game between generative factors of the generator and the classifier. We demonstrate the effectiveness
of this proposed procedure through a series of experiments. Furthermore, we
consider the application of generative models in a continual learning context and investigate
their usefulness in alleviating spurious correlations.
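The adversarial game over generative factors can be sketched as follows: with both the pre-trained generator and the classifier frozen, the latent factors are optimised so that the synthesised counterfactuals become hard for the classifier, and these samples can then be fed back as additional training data. The step count, learning rate, and plain cross-entropy objective below are illustrative assumptions.

```python
import torch
import torch.nn.functional as F

def find_hard_counterfactuals(generator, classifier, z_init, labels,
                              steps=50, lr=0.05):
    """Adversarial game over generative factors (a sketch): optimise the latent
    factors so that the synthesised images become hard for the classifier,
    i.e. the classifier's loss on the intended labels is maximised."""
    z = z_init.clone().requires_grad_(True)
    opt = torch.optim.Adam([z], lr=lr)   # only the latent factors are updated
    for _ in range(steps):
        opt.zero_grad()
        logits = classifier(generator(z))
        loss = -F.cross_entropy(logits, labels)  # ascend the classifier's loss
        loss.backward()
        opt.step()
    with torch.no_grad():
        hard_samples = generator(z)
    return hard_samples, labels
```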
This thesis creates new avenues for further research in the area of medical image synthesis
and how to utilise the medical generative models, which we believe could be important for
future studies in medical image analysis with deep learning.
Towards generalizable machine learning models for computer-aided diagnosis in medicine
Hidden stratification is a phenomenon in which a training dataset contains unlabeled (hidden) subsets of cases that may affect machine learning model performance. Machine learning models that ignore hidden stratification, despite promising overall performance measured as accuracy and sensitivity, often fail at predicting the low-prevalence cases, yet those cases remain important. In the medical domain, patients with diseases are often less common than healthy patients, and a misdiagnosis of a patient with a disease can have significant clinical impact. Therefore, to build a robust and trustworthy computer-aided diagnosis (CAD) system and a reliable treatment effect prediction model, we cannot pursue only machine learning models with high overall accuracy; we also need to discover any hidden stratification in the data and evaluate the proposed machine learning models with respect to both overall performance and performance on certain subsets (groups) of the data, such as the ‘worst group’.
In this study, I investigated three approaches to data stratification: a novel algorithmic deep learning (DL) approach that learns similarities among cases, and two schema completion approaches that utilize domain expert knowledge. I further proposed an innovative way to integrate the discovered latent groups into the loss functions of DL models to allow for better model generalizability under domain shift caused by data heterogeneity.
My results on lung nodule Computed Tomography (CT) images and breast cancer histopathology images demonstrate that learning homogeneous groups within heterogeneous data significantly improves the performance of the CAD system, particularly for low-prevalence or worst-performing cases. This study emphasizes the importance of discovering and learning the latent stratification within the data, as it is a critical step towards building ML models that are generalizable and reliable. Ultimately, this discovery can have a profound impact on clinical decision-making, particularly for low-prevalence cases.
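One simple way to integrate discovered latent groups into a loss function, in the spirit described above, is a group-DRO-style objective that optimises the worst-group average loss. The sketch below assumes integer group labels per sample and is a generic choice rather than the exact formulation used in this study.

```python
import torch
import torch.nn.functional as F

def worst_group_loss(logits, labels, group_ids, num_groups):
    """Group-aware objective (a generic group-DRO-style choice): average the
    loss inside each discovered latent group and optimise the worst one, so
    low-prevalence groups are not drowned out by the overall average."""
    per_sample = F.cross_entropy(logits, labels, reduction="none")
    group_losses = []
    for g in range(num_groups):
        mask = group_ids == g
        if mask.any():
            group_losses.append(per_sample[mask].mean())
    return torch.stack(group_losses).max()
```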