    ResViT: Residual vision transformers for multi-modal medical image synthesis

    Multi-modal imaging is a key healthcare technology that is often underutilized due to the costs associated with multiple separate scans. This limitation yields the need to synthesize unacquired modalities from the subset of available modalities. In recent years, generative adversarial network (GAN) models with superior depiction of structural details have been established as state-of-the-art in numerous medical image synthesis tasks. GANs are characteristically based on convolutional neural network (CNN) backbones that perform local processing with compact filters. This inductive bias compromises the learning of contextual features. Here, we propose a novel generative adversarial approach for medical image synthesis, ResViT, that combines the local precision of convolution operators with the contextual sensitivity of vision transformers. ResViT employs a central bottleneck comprising novel aggregated residual transformer (ART) blocks that synergistically combine convolutional and transformer modules. Comprehensive demonstrations are performed for synthesizing missing sequences in multi-contrast MRI, and CT images from MRI. Our results indicate the superiority of ResViT over competing methods in terms of both qualitative observations and quantitative metrics.
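    The exact ART design is specific to the paper, but a minimal PyTorch sketch can illustrate the general idea of fusing a convolutional path with a transformer path in one residual block. The module name, channel count, and head count below are illustrative assumptions, not ResViT's actual configuration.

import torch
import torch.nn as nn

class ARTBlockSketch(nn.Module):
    """Hypothetical residual block mixing local conv and global attention."""
    def __init__(self, channels=64, heads=4):
        super().__init__()
        # Local path: compact conv filters, as in a CNN-based GAN generator.
        self.conv = nn.Sequential(
            nn.Conv2d(channels, channels, 3, padding=1),
            nn.ReLU(inplace=True),
            nn.Conv2d(channels, channels, 3, padding=1),
        )
        # Contextual path: self-attention over flattened spatial positions.
        self.attn = nn.TransformerEncoderLayer(
            d_model=channels, nhead=heads, batch_first=True
        )

    def forward(self, x):                             # x: (B, C, H, W)
        local = self.conv(x)
        b, c, h, w = x.shape
        tokens = x.flatten(2).transpose(1, 2)         # (B, H*W, C)
        context = self.attn(tokens).transpose(1, 2).reshape(b, c, h, w)
        return x + local + context                    # residual fusion

feat = torch.randn(1, 64, 32, 32)
print(ARTBlockSketch()(feat).shape)                   # torch.Size([1, 64, 32, 32])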

    Dual-Domain Multi-Contrast MRI Reconstruction with Synthesis-based Fusion Network

    Purpose: To develop an efficient dual-domain reconstruction framework for multi-contrast MRI, focusing on minimising cross-contrast misalignment in both the image and frequency domains to enhance optimisation. Theory and Methods: Our proposed framework, based on deep learning, facilitates optimisation for an under-sampled target contrast using a fully-sampled reference contrast that is quicker to acquire. The method consists of three key steps: 1) learning to synthesise data resembling the target contrast from the reference contrast; 2) registering the multi-contrast data to reduce inter-scan motion; and 3) utilising the registered data to reconstruct the target contrast. These steps involve learning in both domains, with regularisation applied to ensure their consistency. We also compare reconstruction performance with existing deep learning-based methods using a dataset of brain MRI scans. Results: Extensive experiments demonstrate the superiority of our proposed framework, for up to an 8-fold acceleration rate, compared to state-of-the-art algorithms. Comprehensive analysis and ablation studies further demonstrate the effectiveness of the proposed components. Conclusion: Our dual-domain framework offers a promising approach to multi-contrast MRI reconstruction. It can also be integrated with existing methods to further enhance reconstruction.
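    The paper's three-step pipeline is its own; as a minimal sketch of the dual-domain ingredient, the snippet below shows a standard k-space data-consistency step for under-sampled MRI, assuming a Cartesian sampling mask. All names and the toy data are illustrative, not the paper's implementation.

import torch

def data_consistency(recon_img, measured_kspace, mask):
    """Replace predicted k-space values with measured ones where sampled."""
    pred_k = torch.fft.fft2(recon_img)
    merged = torch.where(mask.bool(), measured_kspace, pred_k)
    return torch.fft.ifft2(merged).real

# Toy usage: a random "image" and a 25%-sampled k-space.
img = torch.randn(1, 1, 64, 64)
mask = torch.rand(1, 1, 64, 64) < 0.25
kspace = torch.fft.fft2(img) * mask
refined = data_consistency(torch.zeros_like(img), kspace, mask)
print(refined.shape)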

    Learning to map between domains

    Humans consume visual content avidly. We are in the midst of an imaging revolution enabled by inexpensive digital cameras and the internet: almost everyone's cell phone has a camera, and the photos taken by these cameras are shared massively and rapidly online. However, there is an asymmetry: each individual can consume only limited visual content in a limited lifetime, and only a talented few can express and understand something unseen both visually and effectively. The rest of us try to understand and express the unseen by translating it into something seen before. Similarly, in medical imaging and radiological science, tens of thousands of medical images (MRI, CT, etc.) of patients are taken, and these images need to be studied and interpreted. In this dissertation, we investigate a number of data-driven approaches for mapping from an 'unseen' or hard-to-understand domain to a 'seen' or easy-to-understand domain. Our work includes mapping between two image domains and mapping from an image domain to a language domain, which in computer vision are called, respectively, image-to-image translation and image captioning. The presented methods not only help users easily and accurately synthesize useful photos, but also enable new visual and linguistic effects not possible before this work. In clinical diagnosis, these approaches can improve the accuracy and efficiency of the diagnostic process for the experienced radiologist. Moreover, mapping from the image domain to the text domain can mimic the work of the experienced radiologist for automatic medical report generation.

    Part I describes image segmentation, which can be treated as a special case of image-to-image translation, and includes two works. The first solves the anisotropic resolution problem for 3D medical image semantic segmentation (Appendix A). The second describes our US-patented cross-domain medical image segmentation: the first domain has labels while the second has none; by designing a special domain mapping, we enable semantic segmentation on the second domain. Both works can improve computer-aided medical image interpretation and help radiologists read medical images more efficiently and accurately.

    Part II addresses the clinical need to combine the advantages of multiple medical imaging modalities, which requires medical image registration or cross-domain image translation. A crucial requirement for both is one-to-one correspondence, because the images from multiple modalities (such as MRI and CT) come from the same patients. This part presents learning a self-inverse network to realize one-to-one mapping for both paired and unpaired image-to-image translation, as sketched below.

    Part III expands to learning the mapping from the image domain to the language domain. In clinical diagnosis, the final output is in the text domain (such as medical reports and prescriptions). Since medical report writing based on medical images can be error-prone for inexperienced physicians, and time-consuming and tedious for experienced physicians, automatic generation of medical image reports can make this difficult task efficient. Specifically, the mapping is done by learning a language representation to form the language domain.
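    As a minimal sketch of the self-inverse constraint from Part II, the snippet below trains a single tiny generator G so that G(G(x)) stays close to x alongside a paired synthesis loss, yielding a one-to-one mapping with one network. The generator, loss weighting, and data are illustrative assumptions, not the dissertation's model.

import torch
import torch.nn as nn

# Tiny stand-in generator; a real model would be far deeper.
G = nn.Sequential(nn.Conv2d(1, 16, 3, padding=1), nn.ReLU(),
                  nn.Conv2d(16, 1, 3, padding=1))
opt = torch.optim.Adam(G.parameters(), lr=1e-4)

x = torch.randn(8, 1, 32, 32)      # stand-in for source-modality images
y = torch.randn(8, 1, 32, 32)      # stand-in for paired target images

for _ in range(3):                 # a few illustrative training steps
    fake_y = G(x)
    # Synthesis loss plus the self-inverse penalty G(G(x)) ~= x.
    loss = nn.functional.l1_loss(fake_y, y) \
         + nn.functional.l1_loss(G(fake_y), x)
    opt.zero_grad()
    loss.backward()
    opt.step()
print(float(loss))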

    Learning Disentangled Representations in the Imaging Domain

    Disentangled representation learning has been proposed as an approach to learning general representations even in the absence of, or with limited, supervision. A good general representation can be fine-tuned for new target tasks using modest amounts of data, or used directly in unseen domains, achieving remarkable performance in the corresponding task. This alleviation of the data and annotation requirements offers tantalising prospects for applications in computer vision and healthcare. In this tutorial paper, we motivate the need for disentangled representations, present key theory, and detail practical building blocks and criteria for learning such representations. We discuss applications in medical imaging and computer vision, emphasising choices made in exemplar key works. We conclude by presenting remaining challenges and opportunities.
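    As a minimal sketch of one common disentanglement building block discussed in this literature, the snippet below splits an encoder's output into a spatial "content" code and a global "style" vector that a decoder recombines. All module sizes are illustrative assumptions.

import torch
import torch.nn as nn

class DisentangledAE(nn.Module):
    """Toy content/style split autoencoder."""
    def __init__(self):
        super().__init__()
        self.content_enc = nn.Conv2d(1, 8, 3, padding=1)   # spatial factors
        self.style_enc = nn.Sequential(                    # global factors
            nn.AdaptiveAvgPool2d(1), nn.Flatten(), nn.Linear(1, 8)
        )
        self.dec = nn.Conv2d(16, 1, 3, padding=1)

    def forward(self, x):
        c = self.content_enc(x)                            # (B, 8, H, W)
        s = self.style_enc(x)                              # (B, 8)
        # Broadcast the style vector over the content map and decode.
        s_map = s[:, :, None, None].expand(-1, -1, *c.shape[2:])
        return self.dec(torch.cat([c, s_map], dim=1))

x = torch.randn(2, 1, 32, 32)
print(DisentangledAE()(x).shape)                           # torch.Size([2, 1, 32, 32])

    Swapping style codes between two images while keeping content fixed is a standard qualitative check that the factors have actually disentangled.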

    Cross-Modality Feature Learning for Three-Dimensional Brain Image Synthesis

    Multi-modality medical imaging is increasingly used for the comprehensive assessment of complex diseases, in both diagnostic examinations and medical research trials. Different imaging modalities provide complementary information about living tissues. However, multi-modal examinations are not always possible due to adverse factors such as patient discomfort, increased cost, prolonged scanning time and scanner unavailability. In addition, in large imaging studies, incomplete records are not uncommon owing to image artifacts, data corruption or data loss, which compromise the potential of multi-modal acquisitions. Moreover, regardless of how good an imaging system is, the performance of the equipment is ultimately limited by its physical components. Further interferences, particularly in medical imaging systems, such as limited acquisition times, sophisticated and costly equipment, and patients with severe medical conditions, also cause image degradation. The acquisitions can thus be considered degraded versions of the original high-quality images. In this dissertation, we explore the problems of image super-resolution and cross-modality synthesis of one Magnetic Resonance Imaging (MRI) modality from an image of another MRI modality of the same subject, using an image synthesis framework to reconstruct the missing/complex modality data. We develop models and techniques that connect the domain of source modality data and the domain of target modality data, enabling transformation between elements of the two domains. In particular, we first introduce models that project both source and target modality data into a common multi-modality feature space in a supervised setting. This common space allows us to connect cross-modality features that depict a relationship with each other, and we can impose the learned association function to synthesize any target modality image. Moreover, we develop a weakly-supervised method that takes a few registered multi-modality image pairs as training data and generates the desired modality data without requiring a large collection of well-processed (e.g., skull-stripped and strictly registered) multi-modality brain data. Finally, we propose an approach that provides a generic way of learning a dual mapping between source and target domains while considering both visually high-fidelity synthesis and task practicability. We demonstrate that this model can take any arbitrary modality and efficiently synthesize the desired modality data in an unsupervised manner. We show that these proposed models advance the state of the art on image super-resolution and cross-modality synthesis tasks that require joint processing of multi-modality images, and that we can design the algorithms to generate data that is practically beneficial to medical image analysis.
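    As a minimal sketch of the common-feature-space idea, the snippet below maps paired source- and target-modality images into one shared space with separate encoders, aligns the paired features, and decodes the target modality from the source features. The encoders, losses, and toy data are illustrative assumptions, not the dissertation's models.

import torch
import torch.nn as nn

enc_src = nn.Conv2d(1, 32, 3, padding=1)   # e.g. T1-weighted MRI encoder
enc_tgt = nn.Conv2d(1, 32, 3, padding=1)   # e.g. T2-weighted MRI encoder
dec_tgt = nn.Conv2d(32, 1, 3, padding=1)   # synthesizes the target modality

src = torch.randn(4, 1, 32, 32)            # paired, registered toy images
tgt = torch.randn(4, 1, 32, 32)

f_src, f_tgt = enc_src(src), enc_tgt(tgt)
align = nn.functional.mse_loss(f_src, f_tgt)        # shared-space alignment
synth = nn.functional.l1_loss(dec_tgt(f_src), tgt)  # synthesis from source
loss = align + synth
loss.backward()
print(float(loss))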

    A Review on Low-Dose Emission Tomography Post-Reconstruction Denoising with Neural Network Approaches

    Low-dose emission tomography (ET) plays a crucial role in medical imaging, enabling the acquisition of functional information for various biological processes while minimizing the patient dose. However, the inherent randomness of the photon counting process is a source of noise that is amplified in low-dose ET. This review article provides an overview of existing post-processing techniques, with an emphasis on deep neural network (NN) approaches. Furthermore, we explore future directions in the field of NN-based low-dose ET. This comprehensive examination sheds light on the potential of deep learning to enhance the quality and resolution of low-dose ET images, ultimately advancing the field of medical imaging.
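    As a minimal sketch of one typical NN post-processing approach covered by such reviews, the snippet below trains a small residual CNN to predict the noise in a low-dose reconstruction against a standard-dose target. The architecture, loss, and toy data are illustrative assumptions, not a specific method from the review.

import torch
import torch.nn as nn

denoiser = nn.Sequential(
    nn.Conv2d(1, 32, 3, padding=1), nn.ReLU(),
    nn.Conv2d(32, 32, 3, padding=1), nn.ReLU(),
    nn.Conv2d(32, 1, 3, padding=1),
)

low_dose = torch.randn(4, 1, 64, 64)        # noisy PET/SPECT slices (toy data)
standard = torch.randn(4, 1, 64, 64)        # higher-dose targets (toy data)

denoised = low_dose - denoiser(low_dose)    # network predicts the residual noise
loss = nn.functional.mse_loss(denoised, standard)
loss.backward()
print(float(loss))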
