
    Chan-Vese Attention U-Net: An attention mechanism for robust segmentation

    When studying the results of a segmentation algorithm based on convolutional neural networks, one wonders about the reliability and consistency of the results. This leads to questioning whether such an algorithm can be used in applications where there is little room for doubt. We propose in this paper a new attention gate based on Chan-Vese energy minimization to control more precisely the segmentation masks produced by a standard CNN architecture such as the U-Net model. This mechanism constrains the segmentation through the resolution of a PDE. Analysis of the results allows us to observe the spatial information retained by the neural network on the region of interest and yields competitive results on binary segmentation. We illustrate the efficiency of this approach for medical image segmentation on a database of MRI brain images.
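
    As a rough illustration of the energy the proposed attention gate builds on, the sketch below evaluates the classical Chan-Vese functional for a soft mask over a grayscale image (a minimal NumPy sketch; parameter names and weights are illustrative, not the authors' exact formulation):

        import numpy as np

        def chan_vese_energy(image, mask, mu=0.1, lam1=1.0, lam2=1.0):
            # Region means inside and outside the (soft) mask.
            c1 = (image * mask).sum() / (mask.sum() + 1e-8)
            c2 = (image * (1 - mask)).sum() / ((1 - mask).sum() + 1e-8)
            # Data-fidelity terms: how well each region is explained by its mean.
            fit_in = lam1 * ((image - c1) ** 2 * mask).sum()
            fit_out = lam2 * ((image - c2) ** 2 * (1 - mask)).sum()
            # Length term approximated by the total variation of the mask.
            gy, gx = np.gradient(mask)
            length = np.sqrt(gx ** 2 + gy ** 2).sum()
            return mu * length + fit_in + fit_out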

    A Survey on Deep Learning in Medical Image Registration: New Technologies, Uncertainty, Evaluation Metrics, and Beyond

    Over the past decade, deep learning technologies have greatly advanced the field of medical image registration. The initial developments, such as ResNet-based and U-Net-based networks, laid the groundwork for deep learning-driven image registration. Subsequent progress has been made in various aspects of deep learning-based registration, including similarity measures, deformation regularizations, and uncertainty estimation. These advancements have not only enriched the field of deformable image registration but have also facilitated its application in a wide range of tasks, including atlas construction, multi-atlas segmentation, motion estimation, and 2D-3D registration. In this paper, we present a comprehensive overview of the most recent advancements in deep learning-based image registration. We begin with a concise introduction to the core concepts of deep learning-based image registration. Then, we delve into innovative network architectures, loss functions specific to registration, and methods for estimating registration uncertainty. Additionally, this paper explores appropriate evaluation metrics for assessing the performance of deep learning models in registration tasks. Finally, we highlight the practical applications of these novel techniques in medical imaging and discuss the future prospects of deep learning-based image registration.
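
    The surveyed methods largely share one objective shape: an image-similarity term plus a regularizer on the predicted deformation. Below is a minimal sketch of that pattern (MSE similarity with a diffusion regularizer; individual papers swap in NCC, mutual information, or other penalties):

        import numpy as np

        def registration_loss(fixed, warped, disp, weight=0.01):
            # Similarity between the fixed image and the warped moving image.
            similarity = np.mean((fixed - warped) ** 2)
            # Diffusion regularizer: penalize the spatial gradients of the
            # displacement field disp, shape (2, H, W).
            grads = np.gradient(disp, axis=(1, 2))
            smoothness = sum(np.mean(g ** 2) for g in grads)
            return similarity + weight * smoothness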

    Visual In-Context Learning for Few-Shot Eczema Segmentation

    Automated diagnosis of eczema from digital camera images is crucial for developing applications that allow patients to self-monitor their recovery. An important component of this is the segmentation of the eczema region from such images. Current methods for eczema segmentation rely on deep neural networks such as the convolutional (CNN)-based U-Net or the transformer-based Swin U-Net. While effective, these methods require a high volume of annotated data, which can be difficult to obtain. Here, we investigate the capabilities of visual in-context learning, which can perform few-shot eczema segmentation with just a handful of examples and without any need for retraining. Specifically, we propose a strategy for applying in-context learning to eczema segmentation with a generalist vision model called SegGPT. When benchmarked on a dataset of annotated eczema images, we show that SegGPT with just 2 representative example images from the training dataset performs better (mIoU: 36.69) than a CNN U-Net trained on 428 images (mIoU: 32.60). We also discover that using more examples with SegGPT may in fact be harmful to its performance. Our results highlight the importance of visual in-context learning in developing faster and better solutions for skin imaging tasks. They also pave the way for inclusive solutions that can cater to demographic minorities who are typically heavily under-represented in training data.
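
    The mIoU figures quoted above (36.69 vs. 32.60) follow the standard mean-intersection-over-union definition; a plain reimplementation for reference (not the authors' evaluation code):

        import numpy as np

        def mean_iou(pred, target, num_classes=2):
            # Average IoU over all classes present in prediction or target.
            ious = []
            for c in range(num_classes):
                inter = np.logical_and(pred == c, target == c).sum()
                union = np.logical_or(pred == c, target == c).sum()
                if union > 0:
                    ious.append(inter / union)
            return float(np.mean(ious))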

    Deep learning for unsupervised domain adaptation in medical imaging: Recent advancements and future perspectives

    Deep learning has demonstrated remarkable performance across various tasks in medical imaging. However, these approaches primarily focus on supervised learning, assuming that the training and testing data are drawn from the same distribution. Unfortunately, this assumption may not always hold in practice. To address this issue, unsupervised domain adaptation (UDA) techniques have been developed to transfer knowledge from a labeled domain to a related but unlabeled domain. In recent years, significant advancements have been made in UDA, resulting in a wide range of methodologies, including feature alignment, image translation, self-supervision, and disentangled representation methods, among others. In this paper, we provide a comprehensive literature review of recent deep UDA approaches in medical imaging from a technical perspective. Specifically, we categorize current UDA research in medical imaging into six groups and further divide them into finer subcategories based on the different tasks they perform. We also discuss the respective datasets used in the studies to assess the divergence between the different domains. Finally, we discuss emerging areas and provide insights and discussions on future research directions to conclude this survey.
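
    Of the method families this survey covers, feature alignment is the easiest to sketch: match the statistics of source and target features. Below is a maximum mean discrepancy term with a linear kernel (illustrative only; RBF kernels and adversarial critics are more common in the reviewed work):

        import numpy as np

        def mmd_linear(source_feats, target_feats):
            # Squared distance between the mean source and target embeddings;
            # minimizing it pulls the two feature distributions together.
            delta = source_feats.mean(axis=0) - target_feats.mean(axis=0)
            return float(delta @ delta)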

    Fetal-BET: Brain Extraction Tool for Fetal MRI

    Fetal brain extraction is a necessary first step in most computational fetal brain MRI pipelines. However, it has been a very challenging task due to non-standard fetal head poses, fetal movements during examination, and the vastly heterogeneous appearance of the developing fetal brain and the neighboring fetal and maternal anatomy across various sequences and scanning conditions. Developing a machine learning method to effectively address this task requires a large and rich labeled dataset that has not previously been available. As a result, there is currently no method for accurate fetal brain extraction across the range of fetal MRI sequences. In this work, we first built a large annotated dataset of approximately 72,000 2D fetal brain MRI images. Our dataset covers three common MRI sequences, T2-weighted, diffusion-weighted, and functional MRI, acquired with different scanners, and includes both normal and pathological brains. Using this dataset, we developed and validated deep learning methods that exploit U-Net-style architectures, the attention mechanism, multi-contrast feature learning, and data augmentation for fast, accurate, and generalizable automatic fetal brain extraction. Our approach leverages the rich information in multi-contrast (multi-sequence) fetal MRI data, enabling precise delineation of fetal brain structures. Evaluations on independent test data show that our method achieves accurate brain extraction on heterogeneous test data acquired with different scanners, on pathological brains, and at various gestational stages. This robustness underscores the potential utility of our deep learning model for fetal brain imaging and image analysis.
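
    Brain-extraction models like this one are typically scored by Dice overlap between predicted and reference masks; a standard implementation for context (the paper's exact evaluation protocol may differ):

        import numpy as np

        def dice_score(pred_mask, true_mask, eps=1e-8):
            # Dice = 2|A ∩ B| / (|A| + |B|) for binary masks.
            inter = np.logical_and(pred_mask, true_mask).sum()
            return (2.0 * inter + eps) / (pred_mask.sum() + true_mask.sum() + eps)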

    FaceAtt: Enhancing Image Captioning with Facial Attributes for Portrait Images

    Automated image caption generation is a critical area of research that enhances the accessibility and understanding of visual content for diverse audiences. In this study, we propose the FaceAtt model, a novel approach to attribute-focused image captioning that emphasizes the accurate depiction of facial attributes within images. FaceAtt automatically detects and describes a wide range of attributes, including emotions, expressions, pointed noses, fair skin tones, hair textures, attractiveness, and approximate age ranges. Leveraging deep learning techniques, we explore the impact of different image feature extraction methods on caption quality and evaluate our model's performance using metrics such as BLEU and METEOR. Our FaceAtt model uses annotated portrait attributes as supplementary prior knowledge before captioning. This addition yields a subtle yet discernible improvement in the resulting scores, exemplifying the potency of incorporating additional attribute vectors during training. Furthermore, our research contributes to the broader discourse on ethical considerations in automated captioning. This study sets the stage for future research in refining attribute-focused captioning techniques, with a focus on enhancing linguistic coherence, addressing biases, and accommodating diverse user needs.
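
    The attribute prior amounts to fusing an annotated-attribute vector with the image features before the caption decoder runs. A hypothetical sketch of such a fusion (concatenation plus a linear projection; the name fuse_attributes, the proj matrix, and the shapes are assumptions, not the paper's architecture):

        import numpy as np

        def fuse_attributes(image_feats, attr_vector, proj):
            # Concatenate visual features with the attribute prior, then
            # project back to the decoder's expected input width.
            fused = np.concatenate([image_feats, attr_vector])
            return proj @ fused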

    A Multi-scale Learning of Data-driven and Anatomically Constrained Image Registration for Adult and Fetal Echo Images

    Temporal echo image registration is a basis for clinical quantifications such as cardiac motion estimation, myocardial strain assessment, and stroke volume quantification. Deep learning image registration (DLIR) is consistently accurate, requires less computing effort, and has shown encouraging results in earlier applications. However, we propose that a greater focus on the warped moving image's anatomic plausibility and image quality can support robust DLIR performance. Further, past implementations have focused on adult echo, and there is an absence of DLIR implementations for fetal echo. We propose a framework combining three strategies for DLIR in both fetal and adult echo: (1) an anatomic shape-encoded loss to preserve physiological myocardial and left ventricular anatomical topologies in warped images; (2) a data-driven loss that is trained adversarially to preserve good image texture features in warped images; and (3) a multi-scale training scheme of a data-driven and anatomically constrained algorithm to improve accuracy. Our experiments show that the shape-encoded loss and the data-driven adversarial loss are strongly correlated with good anatomical topology and image textures, respectively. They improve different aspects of registration performance in non-overlapping ways, justifying their combination. We show that these strategies can provide excellent registration results in both adult and fetal echo using the publicly available CAMUS adult echo dataset and our private multi-demographic fetal echo dataset, despite fundamental distinctions between adult and fetal echo images. Our approach also outperforms traditional non-DL gold-standard registration approaches, including Optical Flow and Elastix. Registration improvements could also be translated into more accurate and precise clinical quantification of cardiac ejection fraction, demonstrating a potential for translation.
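
    Strategy (1) can be pictured as a soft-Dice penalty between the warped moving segmentation and the fixed segmentation, so deformations that break the myocardial or ventricular topology are punished (a sketch under that assumption; the paper's shape encoding may be more elaborate):

        import numpy as np

        def shape_encoded_loss(warped_seg, fixed_seg, eps=1e-8):
            # Soft Dice between probabilistic segmentations in [0, 1];
            # a loss of 0 means perfect anatomical overlap after warping.
            inter = (warped_seg * fixed_seg).sum()
            dice = (2.0 * inter + eps) / (warped_seg.sum() + fixed_seg.sum() + eps)
            return 1.0 - dice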

    3D Visualization, Skeletonization and Branching Analysis of Blood Vessels in Angiogenesis

    Angiogenesis is the process of new blood vessels growing from existing vasculature. Visualizing these vessels as a three-dimensional (3D) model is a challenging yet relevant task, as it would be of great help to researchers, pathologists, and medical doctors. A branching analysis on the 3D model would further facilitate research and diagnostic purposes. In this paper, a pipeline of vision algorithms is elaborated to visualize and analyze blood vessels in 3D from formalin-fixed paraffin-embedded (FFPE) granulation tissue sections with two different staining methods. First, a U-Net neural network is used to segment blood vessels from the tissue. Second, image registration is used to align the consecutive images: coarse registration using an image-intensity optimization technique, followed by fine-tuning using a neural network based on Spatial Transformers, results in an excellent alignment of images. Lastly, the corresponding segmented masks depicting the blood vessels are aligned and interpolated using the results of the image registration, resulting in a visualized 3D model. Additionally, a skeletonization algorithm is used to analyze the branching characteristics of the 3D vascular model. In summary, computer vision and deep learning are used to reconstruct, visualize, and analyze a 3D vascular model from a set of parallel tissue samples. Our technique opens innovative perspectives on the pathophysiological understanding of vascular morphogenesis under different pathophysiological conditions and its potential diagnostic role.
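
    The branching-analysis step reduces the vessel mask to a skeleton and looks for junctions. Below is a simple 2D stand-in for the paper's 3D analysis, using scikit-image's skeletonize (the branch-point rule here, three or more skeleton neighbours, is a common heuristic, not necessarily the authors'):

        import numpy as np
        from skimage.morphology import skeletonize

        def branch_points(vessel_mask):
            # Reduce the binary vessel mask to a one-pixel-wide skeleton.
            skel = skeletonize(vessel_mask.astype(bool))
            # Count 8-connected skeleton neighbours of every pixel.
            padded = np.pad(skel, 1).astype(int)
            neighbours = sum(
                np.roll(np.roll(padded, dy, axis=0), dx, axis=1)
                for dy in (-1, 0, 1) for dx in (-1, 0, 1)
                if (dy, dx) != (0, 0)
            )
            # Branch points: skeleton pixels with >= 3 skeleton neighbours.
            return skel & (neighbours[1:-1, 1:-1] >= 3)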