Multimodal cardiac segmentation using disentangled representation learning.
Magnetic Resonance (MR) protocols use several sequences to evaluate pathology and organ status. Yet, despite recent advances, the analysis of each sequence’s images (modality hereafter) is treated in isolation. We propose a method suitable for multimodal and multi-input learning and analysis, that disentangles anatomical and imaging factors, and combines anatomical content across the modalities to extract more accurate segmentation masks. Mis-registrations between the inputs are handled with a Spatial Transformer Network, which non-linearly aligns the (now intensity-invariant) anatomical factors. We demonstrate applications in Late Gadolinium Enhanced (LGE) and cine MRI segmentation. We show that multi-input outperforms single-input models, and that we can train a (semi-supervised) model with few (or no) annotations for one of the modalities. Code is available at https://github.com/agis85/multimodal_segmentation
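The abstract does not give implementation details, but the idea of combining intensity-invariant anatomical content across modalities can be sketched minimally. The following is an illustrative assumption, not the authors' code: anatomical factors are taken to be channel-wise spatial maps (aligned beforehand, e.g. by a Spatial Transformer Network), and an element-wise maximum keeps any structure that is visible in either modality.

```python
import numpy as np

def fuse_anatomy(anatomy_a, anatomy_b):
    """Combine two aligned, intensity-invariant anatomical factor maps
    (channels x H x W) into one fused representation. The element-wise
    max preserves structure visible in either input modality.
    (Illustrative fusion rule; the paper may use a learned combiner.)"""
    assert anatomy_a.shape == anatomy_b.shape
    return np.maximum(anatomy_a, anatomy_b)

# Toy factors: a structure seen in only one modality survives fusion.
a = np.zeros((1, 4, 4)); a[0, 1, 1] = 1.0   # visible in modality A only
b = np.zeros((1, 4, 4)); b[0, 2, 2] = 1.0   # visible in modality B only
fused = fuse_anatomy(a, b)
```

A segmentation decoder operating on `fused` would then see anatomy from both inputs, which is one plausible reading of why multi-input outperforms single-input models here.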
Disentangled representation learning in cardiac image analysis
Typically, a medical image offers spatial information on the anatomy (and pathology) modulated by imaging specific characteristics. Many imaging modalities including Magnetic Resonance Imaging (MRI) and Computed Tomography (CT) can be interpreted in this way. We can venture further and consider that a medical image naturally factors into some spatial factors depicting anatomy and factors that denote the imaging characteristics. Here, we explicitly learn this decomposed (disentangled) representation of imaging data, focusing in particular on cardiac images. We propose Spatial Decomposition Network (SDNet), which factorises 2D medical images into spatial anatomical factors and non-spatial modality factors. We demonstrate that this high-level representation is ideally suited for several medical image analysis tasks, such as semi-supervised segmentation, multi-task segmentation and regression, and image-to-image synthesis. Specifically, we show that our model can match the performance of fully supervised segmentation models, using only a fraction of the labelled images. Critically, we show that our factorised representation also benefits from supervision obtained either when we use auxiliary tasks to train the model in a multi-task setting (e.g. regressing to known cardiac indices), or when aggregating multimodal data from different sources (e.g. pooling together MRI and CT data). To explore the properties of the learned factorisation, we perform latent-space arithmetic and show that we can synthesise CT from MR and vice versa, by swapping the modality factors. We also demonstrate that the factor holding image specific information can be used to predict the input modality with high accuracy. Code will be made available at https://github.com/agis85/anatomy_modality_decomposition
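The latent-space arithmetic described above (synthesising CT from MR by swapping modality factors) can be sketched with a toy decoder. Everything below is a hedged illustration, assuming the SDNet-style split into a spatial anatomy factor and a non-spatial modality factor; `decode` is a stand-in for the learned decoder, not the actual network.

```python
import numpy as np

def decode(anatomy, modality_factor):
    """Toy decoder: renders an image by weighting each anatomical
    channel (C x H x W) with a per-channel intensity from the
    non-spatial modality factor (length C). A stand-in for SDNet's
    learned decoder."""
    return np.tensordot(modality_factor, anatomy, axes=([0], [0]))

# One-channel toy anatomy shared by both modalities.
anatomy = np.array([[[0.0, 1.0],
                     [1.0, 0.0]]])   # (C=1, H=2, W=2)
z_mr = np.array([0.3])               # hypothetical MR modality factor
z_ct = np.array([0.9])               # hypothetical CT modality factor

mr_image = decode(anatomy, z_mr)
fake_ct = decode(anatomy, z_ct)      # same anatomy, swapped modality factor
```

The anatomy (where structure is) stays fixed while the swapped factor changes only how it is rendered, which is the essence of the MR-to-CT synthesis experiment described in the abstract.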
SGDR: Semantic-guided Disentangled Representation for Unsupervised Cross-modality Medical Image Segmentation
Disentangled representation learning is a powerful technique for tackling the domain-shift problem in medical image analysis under the unsupervised domain adaptation setting. However, previous methods focus only on extracting domain-invariant features and ignore whether the extracted features are meaningful for downstream tasks. We propose a novel framework, semantic-guided disentangled representation (SGDR), an effective method to extract semantically meaningful features for the segmentation task, improving the performance of cross-modality medical image segmentation in the unsupervised domain adaptation setting. To extract meaningful domain-invariant features from the different modalities, we introduce a content discriminator that forces the content representations to be embedded in the same space, and a feature discriminator that extracts meaningful representations. We also use pixel-level annotations to guide the encoder to learn features that are meaningful for the segmentation task. We validated our method on two public datasets, and the experimental results show that our approach outperforms state-of-the-art methods on two evaluation metrics by a significant margin. Comment: Tech Report
Multi-modality cardiac image computing: a survey
Multi-modality cardiac imaging plays a key role in the management of patients with cardiovascular diseases. It allows a combination of complementary anatomical, morphological and functional information, increases diagnosis accuracy, and improves the efficacy of cardiovascular interventions and clinical outcomes. Fully-automated processing and quantitative analysis of multi-modality cardiac images could have a direct impact on clinical research and evidence-based patient management. However, these require overcoming significant challenges including inter-modality misalignment and finding optimal methods to integrate information from different modalities.
This paper aims to provide a comprehensive review of multi-modality imaging in cardiology, the computing methods, the validation strategies, the related clinical workflows and future perspectives. For the computing methodologies, we place particular focus on three tasks, i.e., registration, fusion and segmentation, which generally involve multi-modality imaging data, either combining information from different modalities or transferring information across modalities. The review highlights that multi-modality cardiac imaging data has the potential of wide applicability in the clinic, such as trans-aortic valve implantation guidance, myocardial viability assessment, and catheter ablation therapy and its patient selection. Nevertheless, many challenges remain unsolved, such as missing modality, modality selection, combination of imaging and non-imaging data, and uniform analysis and representation of different modalities. There is also work to do in defining how the well-developed techniques fit in clinical workflows and how much additional and relevant information they introduce. These problems are likely to remain an active field of research, and the questions to be answered in the future.
Learning Disentangled Representations in the Imaging Domain
Disentangled representation learning has been proposed as an approach to
learning general representations even in the absence of, or with limited,
supervision. A good general representation can be fine-tuned for new target
tasks using modest amounts of data, or used directly in unseen domains
achieving remarkable performance in the corresponding task. This alleviation of
the data and annotation requirements offers tantalising prospects for
applications in computer vision and healthcare. In this tutorial paper, we
motivate the need for disentangled representations, present key theory, and
detail practical building blocks and criteria for learning such
representations. We discuss applications in medical imaging and computer vision
emphasising choices made in exemplar key works. We conclude by presenting
remaining challenges and opportunities. Comment: Submitted. This paper follows a tutorial style but also surveys a considerable number of works (more than 200 citations).
Multimodal and disentangled representation learning for medical image analysis
Automated medical image analysis is a growing research field with various applications in modern healthcare. Furthermore, a multitude of imaging techniques (or modalities), such as Magnetic Resonance (MR) and Computed Tomography (CT), have been developed to accentuate different organ characteristics. Research on image analysis is predominantly driven by deep learning methods due to their demonstrated performance. In this thesis, we argue that their success and generalisation rely on learning good latent representations. We propose methods for learning spatial representations that are suitable for medical image data and can combine information coming from different modalities. Specifically, we aim to improve cardiac MR segmentation, a challenging task due to varied images and limited expert annotations, by considering complementary information present in (potentially unaligned) images of other modalities.
In order to evaluate the benefit of multimodal learning, we initially consider a synthesis task
on spatially aligned multimodal brain MR images. We propose a deep network of multiple
encoders and decoders, which we demonstrate outperforms existing approaches. The encoders
(one per input modality) map the multimodal images into modality invariant spatial feature
maps. Common and unique information is combined into a fused representation, that is robust
to missing modalities, and can be decoded into synthetic images of the target modalities. Different experimental settings demonstrate the benefit of multimodal over unimodal synthesis,
although input and output image pairs are required for training. The need for paired images can
be overcome with the cycle consistency principle, which we use in conjunction with adversarial
training to transform images from one modality (e.g. MR) to images in another (e.g. CT). This is especially useful in cardiac datasets, where different spatial and temporal resolutions make image pairing difficult, if not impossible.
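The cycle consistency principle invoked above can be stated compactly: translating an image to the other modality and back should reconstruct the input, which lets unpaired data supervise the translators. A minimal sketch, with hypothetical translators standing in for the learned adversarial networks:

```python
import numpy as np

def cycle_loss(x, g_ab, g_ba):
    """Cycle-consistency loss (sketch): translating A -> B -> A should
    return the input, so no paired A/B images are needed. L1 distance
    is the usual reconstruction penalty."""
    return float(np.mean(np.abs(g_ba(g_ab(x)) - x)))

# Toy 'translators': a consistent pair yields (near-)zero cycle loss.
g_ab = lambda x: 2.0 * x + 1.0       # hypothetical MR -> CT mapping
g_ba = lambda x: (x - 1.0) / 2.0     # its exact inverse, CT -> MR
x = np.array([[0.2, 0.7],
              [0.5, 0.1]])           # toy MR 'image'
```

In the thesis setting this penalty, combined with adversarial training, replaces the paired-image requirement of the earlier synthesis network.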
Segmentation can also be considered as a form of image synthesis, if one modality consists of
semantic maps. We consider the task of extracting segmentation masks for cardiac MR images,
and aim to overcome the challenge of limited annotations by taking into account unannotated images, which are commonly ignored. We achieve this by defining suitable latent spaces,
which represent the underlying anatomies (spatial latent variable), as well as the imaging characteristics (non-spatial latent variable). Anatomical information is required for tasks such as
segmentation and regression, whereas imaging information can capture variability in intensity
characteristics for example due to different scanners. We propose two models that disentangle
cardiac images at different levels: the first extracts the myocardium from the surrounding information, whereas the second fully separates the anatomical from the imaging characteristics.
Experimental analysis confirms the utility of disentangled representations in semi-supervised
segmentation, and in regression of cardiac indices, while maintaining robustness to intensity
variations such as the ones induced by different modalities.
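The semi-supervised setup described in this paragraph can be sketched at the objective level: every image, annotated or not, contributes a reconstruction term through the disentangled factors, while only annotated images add a segmentation term on the spatial factor. The function below is a hedged illustration of that composition, not the thesis's actual loss; `w` and the term names are assumptions.

```python
import numpy as np  # kept for consistency with the other sketches

def semi_supervised_loss(recon_err, seg_err, has_mask, w=1.0):
    """Sketch of a semi-supervised disentanglement objective: every
    image contributes a reconstruction error (computed from the
    anatomy and modality factors), and only images with expert masks
    add a weighted segmentation error on the spatial factor."""
    return recon_err + (w * seg_err if has_mask else 0.0)

# Unannotated images still provide a training signal via reconstruction.
annotated = semi_supervised_loss(recon_err=0.1, seg_err=0.4, has_mask=True)
unannotated = semi_supervised_loss(recon_err=0.1, seg_err=0.0, has_mask=False)
```

This is one plausible reading of how the commonly ignored unannotated images are put to work: they shape the latent spaces even when no mask is available.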
Finally, our prior research is aggregated into one framework that encodes multimodal images
into disentangled anatomical and imaging factors. Several challenges of multimodal cardiac
imaging, such as input misalignments and the lack of expert annotations, are successfully handled in the shared anatomy space. Furthermore, we demonstrate that this approach can be used
to combine complementary anatomical information for the purpose of multimodal segmentation. This can be achieved even when no annotations are provided for one of the modalities.
This thesis creates new avenues for further research in the area of multimodal and disentangled learning with spatial representations, which we believe are key to more generalised deep
learning solutions in healthcare.