
    Multimodal cardiac segmentation using disentangled representation learning.

    Magnetic Resonance (MR) protocols use several sequences to evaluate pathology and organ status. Yet, despite recent advances, the analysis of each sequence’s images (modality hereafter) is treated in isolation. We propose a method suitable for multimodal and multi-input learning and analysis that disentangles anatomical and imaging factors and combines anatomical content across the modalities to extract more accurate segmentation masks. Mis-registrations between the inputs are handled with a Spatial Transformer Network, which non-linearly aligns the (now intensity-invariant) anatomical factors. We demonstrate applications in Late Gadolinium Enhanced (LGE) and cine MRI segmentation. We show that multi-input outperforms single-input models, and that we can train a (semi-supervised) model with few (or no) annotations for one of the modalities. Code is available at https://github.com/agis85/multimodal_segmentation
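
    The pipeline described above can be sketched in a few lines of PyTorch (illustrative only, not the authors' released code at https://github.com/agis85/multimodal_segmentation): each modality is encoded into an intensity-invariant anatomy map, one map is non-linearly warped onto the other with a spatial-transformer-style deformation, and the aligned anatomies are fused before segmentation. The layer sizes and the max-fusion rule are assumptions.

import torch
import torch.nn as nn
import torch.nn.functional as F

class AnatomyEncoder(nn.Module):
    def __init__(self, channels=8):
        super().__init__()
        self.net = nn.Sequential(
            nn.Conv2d(1, 16, 3, padding=1), nn.ReLU(),
            nn.Conv2d(16, channels, 3, padding=1),
        )

    def forward(self, x):
        # Channel-wise softmax gives a quasi-categorical, intensity-invariant anatomy map.
        return torch.softmax(self.net(x), dim=1)

class DeformableAligner(nn.Module):
    # Predicts a dense offset field that warps one anatomy map onto another
    # (a spatial-transformer-style non-linear alignment).
    def __init__(self, channels=8):
        super().__init__()
        self.offset = nn.Conv2d(2 * channels, 2, 3, padding=1)

    def forward(self, moving, fixed):
        flow = self.offset(torch.cat([moving, fixed], dim=1))       # (B, 2, H, W)
        b = moving.size(0)
        identity = torch.eye(2, 3, device=moving.device).unsqueeze(0).expand(b, -1, -1)
        base = F.affine_grid(identity, moving.shape, align_corners=False)
        grid = base + flow.permute(0, 2, 3, 1)                      # (B, H, W, 2)
        return F.grid_sample(moving, grid, align_corners=False)

class MultimodalSegmentor(nn.Module):
    def __init__(self, channels=8, n_classes=4):
        super().__init__()
        self.enc_a = AnatomyEncoder(channels)      # e.g. LGE encoder
        self.enc_b = AnatomyEncoder(channels)      # e.g. cine encoder
        self.align = DeformableAligner(channels)
        self.seg = nn.Conv2d(channels, n_classes, 1)

    def forward(self, x_a, x_b):
        s_a, s_b = self.enc_a(x_a), self.enc_b(x_b)
        s_b_aligned = self.align(s_b, s_a)         # warp cine anatomy onto LGE anatomy
        fused = torch.maximum(s_a, s_b_aligned)    # simple channel-wise max fusion
        return self.seg(fused)

model = MultimodalSegmentor()
lge, cine = torch.randn(2, 1, 64, 64), torch.randn(2, 1, 64, 64)
print(model(lge, cine).shape)                      # torch.Size([2, 4, 64, 64])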

    Disentangled representation learning in cardiac image analysis

    Typically, a medical image offers spatial information on the anatomy (and pathology) modulated by imaging specific characteristics. Many imaging modalities including Magnetic Resonance Imaging (MRI) and Computed Tomography (CT) can be interpreted in this way. We can venture further and consider that a medical image naturally factors into some spatial factors depicting anatomy and factors that denote the imaging characteristics. Here, we explicitly learn this decomposed (disentangled) representation of imaging data, focusing in particular on cardiac images. We propose Spatial Decomposition Network (SDNet), which factorises 2D medical images into spatial anatomical factors and non-spatial modality factors. We demonstrate that this high-level representation is ideally suited for several medical image analysis tasks, such as semi-supervised segmentation, multi-task segmentation and regression, and image-to-image synthesis. Specifically, we show that our model can match the performance of fully supervised segmentation models, using only a fraction of the labelled images. Critically, we show that our factorised representation also benefits from supervision obtained either when we use auxiliary tasks to train the model in a multi-task setting (e.g. regressing to known cardiac indices), or when aggregating multimodal data from different sources (e.g. pooling together MRI and CT data). To explore the properties of the learned factorisation, we perform latent-space arithmetic and show that we can synthesise CT from MR and vice versa, by swapping the modality factors. We also demonstrate that the factor holding image specific information can be used to predict the input modality with high accuracy. Code will be made available at https://github.com/agis85/anatomy_modality_decomposition
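
    As a rough illustration of the SDNet-style factorisation described above (not the released code at https://github.com/agis85/anatomy_modality_decomposition), the following PyTorch sketch splits an image into a spatial anatomy factor and a small non-spatial modality vector, segments from the anatomy factor alone, and sketches the MR-to-CT synthesis by recombining one image's anatomy with another's modality vector. All layer sizes are assumptions.

import torch
import torch.nn as nn

class SDNetSketch(nn.Module):
    def __init__(self, anatomy_ch=8, z_dim=8, n_classes=4):
        super().__init__()
        self.anatomy_enc = nn.Sequential(
            nn.Conv2d(1, 16, 3, padding=1), nn.ReLU(),
            nn.Conv2d(16, anatomy_ch, 3, padding=1), nn.Softmax(dim=1),
        )
        self.modality_enc = nn.Sequential(        # non-spatial: global pooling -> vector
            nn.Conv2d(1 + anatomy_ch, 16, 3, padding=1), nn.ReLU(),
            nn.AdaptiveAvgPool2d(1), nn.Flatten(), nn.Linear(16, z_dim),
        )
        self.decoder = nn.Sequential(
            nn.Conv2d(anatomy_ch + z_dim, 16, 3, padding=1), nn.ReLU(),
            nn.Conv2d(16, 1, 3, padding=1),
        )
        self.segmentor = nn.Conv2d(anatomy_ch, n_classes, 1)

    def forward(self, x):
        s = self.anatomy_enc(x)                              # spatial anatomy factor
        z = self.modality_enc(torch.cat([x, s], dim=1))      # non-spatial modality vector
        return s, z

    def decode(self, s, z):
        # Broadcast the modality vector over the spatial grid, then decode an image.
        z_map = z[:, :, None, None].expand(-1, -1, s.size(2), s.size(3))
        return self.decoder(torch.cat([s, z_map], dim=1))

net = SDNetSketch()
mr, ct = torch.randn(1, 1, 64, 64), torch.randn(1, 1, 64, 64)
s_mr, z_mr = net(mr)
s_ct, z_ct = net(ct)
seg_mr = net.segmentor(s_mr)          # segmentation uses the anatomy factor only
fake_ct = net.decode(s_mr, z_ct)      # latent-space swap: MR anatomy + CT modality factor
print(seg_mr.shape, fake_ct.shape)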

    SGDR: Semantic-guided Disentangled Representation for Unsupervised Cross-modality Medical Image Segmentation

    Disentangled representation learning is a powerful technique for tackling the domain shift problem in medical image analysis under the unsupervised domain adaptation setting. However, previous methods focus only on extracting domain-invariant features and ignore whether the extracted features are meaningful for downstream tasks. We propose a novel framework, called semantic-guided disentangled representation (SGDR), an effective method to extract semantically meaningful features for the segmentation task, improving the performance of cross-modality medical image segmentation in the unsupervised domain adaptation setting. To extract meaningful domain-invariant features from the different modalities, we introduce a content discriminator that forces the content representations to be embedded in the same space, and a feature discriminator that encourages the representation to be meaningful. We also use pixel-level annotations to guide the encoder to learn features that are meaningful for the segmentation task. We validated our method on two public datasets, and the experimental results show that our approach outperforms state-of-the-art methods on two evaluation metrics by a significant margin. Comment: Tech Report
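
    The content-alignment idea in this abstract can be sketched in a few lines of PyTorch: a content discriminator tries to tell which modality a content feature came from, the encoders are trained adversarially to fool it so that both content representations land in a shared space, and a segmentation head supervised with source-domain pixel labels keeps the shared features task-relevant. This is a hedged reconstruction of the general technique, not the authors' implementation; all module sizes and names are assumptions.

import torch
import torch.nn as nn
import torch.nn.functional as F

content_enc_a = nn.Sequential(nn.Conv2d(1, 16, 3, padding=1), nn.ReLU(),
                              nn.Conv2d(16, 8, 3, padding=1))   # e.g. MR encoder
content_enc_b = nn.Sequential(nn.Conv2d(1, 16, 3, padding=1), nn.ReLU(),
                              nn.Conv2d(16, 8, 3, padding=1))   # e.g. CT encoder
content_disc = nn.Sequential(nn.Conv2d(8, 16, 3, stride=2), nn.ReLU(),
                             nn.AdaptiveAvgPool2d(1), nn.Flatten(), nn.Linear(16, 1))
seg_head = nn.Conv2d(8, 4, 1)                                   # 4 tissue classes

x_a, x_b = torch.randn(2, 1, 64, 64), torch.randn(2, 1, 64, 64)
y_a = torch.randint(0, 4, (2, 64, 64))                          # pixel labels only for modality A

c_a, c_b = content_enc_a(x_a), content_enc_b(x_b)

# Discriminator loss: distinguish modality-A content (label 1) from modality-B content (label 0).
d_loss = F.binary_cross_entropy_with_logits(content_disc(c_a.detach()), torch.ones(2, 1)) \
       + F.binary_cross_entropy_with_logits(content_disc(c_b.detach()), torch.zeros(2, 1))

# Encoder losses: fool the discriminator (align the content spaces) and segment well
# on the annotated modality, so the shared features stay semantically meaningful.
adv_loss = F.binary_cross_entropy_with_logits(content_disc(c_b), torch.ones(2, 1))
seg_loss = F.cross_entropy(seg_head(c_a), y_a)
print(d_loss.item(), (adv_loss + seg_loss).item())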

    Multi-modality cardiac image computing: a survey

    Multi-modality cardiac imaging plays a key role in the management of patients with cardiovascular diseases. It allows a combination of complementary anatomical, morphological and functional information, increases diagnosis accuracy, and improves the efficacy of cardiovascular interventions and clinical outcomes. Fully-automated processing and quantitative analysis of multi-modality cardiac images could have a direct impact on clinical research and evidence-based patient management. However, these require overcoming significant challenges including inter-modality misalignment and finding optimal methods to integrate information from different modalities. This paper aims to provide a comprehensive review of multi-modality imaging in cardiology, the computing methods, the validation strategies, the related clinical workflows and future perspectives. For the computing methodologies, we focus in particular on three tasks, i.e., registration, fusion and segmentation, which generally involve multi-modality imaging data, either combining information from different modalities or transferring information across modalities. The review highlights that multi-modality cardiac imaging data has the potential for wide applicability in the clinic, such as trans-aortic valve implantation guidance, myocardial viability assessment, and catheter ablation therapy and its patient selection. Nevertheless, many challenges remain unsolved, such as missing modality, modality selection, combination of imaging and non-imaging data, and uniform analysis and representation of different modalities. There is also work to do in defining how the well-developed techniques fit into clinical workflows and how much additional and relevant information they introduce. These problems are likely to remain an active field of research, and the questions posed here remain to be answered in the future.

    Learning Disentangled Representations in the Imaging Domain

    Disentangled representation learning has been proposed as an approach to learning general representations even in the absence of, or with limited, supervision. A good general representation can be fine-tuned for new target tasks using modest amounts of data, or used directly in unseen domains achieving remarkable performance in the corresponding task. This alleviation of the data and annotation requirements offers tantalising prospects for applications in computer vision and healthcare. In this tutorial paper, we motivate the need for disentangled representations, present key theory, and detail practical building blocks and criteria for learning such representations. We discuss applications in medical imaging and computer vision emphasising choices made in exemplar key works. We conclude by presenting remaining challenges and opportunities. Comment: Submitted. This paper follows a tutorial style but also surveys a considerable (more than 200 citations) number of works

    Multimodal and disentangled representation learning for medical image analysis

    Automated medical image analysis is a growing research field with various applications in modern healthcare. Furthermore, a multitude of imaging techniques (or modalities) have been developed, such as Magnetic Resonance (MR) and Computed Tomography (CT), to accentuate different organ characteristics. Research on image analysis is predominantly driven by deep learning methods due to their demonstrated performance. In this thesis, we argue that their success and generalisation rely on learning good latent representations. We propose methods for learning spatial representations that are suitable for medical image data and can combine information coming from different modalities. Specifically, we aim to improve cardiac MR segmentation, a challenging task due to varied images and limited expert annotations, by considering complementary information present in (potentially unaligned) images of other modalities.
    In order to evaluate the benefit of multimodal learning, we initially consider a synthesis task on spatially aligned multimodal brain MR images. We propose a deep network of multiple encoders and decoders, which we demonstrate outperforms existing approaches. The encoders (one per input modality) map the multimodal images into modality-invariant spatial feature maps. Common and unique information is combined into a fused representation that is robust to missing modalities and can be decoded into synthetic images of the target modalities. Different experimental settings demonstrate the benefit of multimodal over unimodal synthesis, although input and output image pairs are required for training. The need for paired images can be overcome with the cycle consistency principle, which we use in conjunction with adversarial training to transform images from one modality (e.g. MR) to images in another (e.g. CT). This is especially useful in cardiac datasets, where different spatial and temporal resolutions make image pairing difficult, if not impossible.
    Segmentation can also be considered a form of image synthesis, if one modality consists of semantic maps. We consider the task of extracting segmentation masks for cardiac MR images, and aim to overcome the challenge of limited annotations by taking into account unannotated images, which are commonly ignored. We achieve this by defining suitable latent spaces, which represent the underlying anatomies (spatial latent variable) as well as the imaging characteristics (non-spatial latent variable). Anatomical information is required for tasks such as segmentation and regression, whereas imaging information can capture variability in intensity characteristics, for example due to different scanners. We propose two models that disentangle cardiac images at different levels: the first extracts the myocardium from the surrounding information, whereas the second fully separates the anatomical from the imaging characteristics. Experimental analysis confirms the utility of disentangled representations in semi-supervised segmentation and in regression of cardiac indices, while maintaining robustness to intensity variations such as the ones induced by different modalities.
    Finally, our prior research is aggregated into one framework that encodes multimodal images into disentangled anatomical and imaging factors. Several challenges of multimodal cardiac imaging, such as input misalignments and the lack of expert annotations, are successfully handled in the shared anatomy space. Furthermore, we demonstrate that this approach can be used to combine complementary anatomical information for the purpose of multimodal segmentation. This can be achieved even when no annotations are provided for one of the modalities. This thesis creates new avenues for further research in the area of multimodal and disentangled learning with spatial representations, which we believe are key to more generalised deep learning solutions in healthcare.
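
    The multi-encoder fusion described in the thesis abstract can be illustrated with a short PyTorch sketch: each modality is mapped to a modality-invariant spatial feature map, the maps are fused with an element-wise max so that any non-empty subset of inputs still yields a valid fused representation (robustness to missing modalities), and per-modality decoders synthesise the target images. The architecture details and modality names are illustrative assumptions, not the thesis implementation.

import torch
import torch.nn as nn

def encoder():
    return nn.Sequential(nn.Conv2d(1, 16, 3, padding=1), nn.ReLU(),
                         nn.Conv2d(16, 8, 3, padding=1))

def decoder():
    return nn.Sequential(nn.Conv2d(8, 16, 3, padding=1), nn.ReLU(),
                         nn.Conv2d(16, 1, 3, padding=1))

encoders = nn.ModuleDict({"T1": encoder(), "T2": encoder()})    # one encoder per input modality
decoders = nn.ModuleDict({"FLAIR": decoder()})                  # one decoder per target modality

def fuse(features):
    # Element-wise max keeps the strongest response per location, so the fused map
    # is defined for any non-empty subset of modalities (missing-input robustness).
    return torch.stack(features, dim=0).max(dim=0).values

inputs = {"T1": torch.randn(1, 1, 64, 64), "T2": torch.randn(1, 1, 64, 64)}
fused_all = fuse([encoders[m](x) for m, x in inputs.items()])
fused_t1_only = fuse([encoders["T1"](inputs["T1"])])            # T2 missing at test time
print(decoders["FLAIR"](fused_all).shape, decoders["FLAIR"](fused_t1_only).shape)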