Chan-Vese Attention U-Net: An attention mechanism for robust segmentation
When studying the results of a segmentation algorithm based on convolutional
neural networks, one wonders about the reliability and consistency of the
results. This raises the question of whether such an algorithm can be used
in applications where there is little room for doubt. In this paper we propose
a new attention gate based on Chan-Vese energy minimization to control more
precisely the segmentation masks produced by a standard CNN architecture such
as the U-Net model. This mechanism imposes a constraint on the segmentation
based on the solution of a PDE. Studying the results allows us to observe the
spatial information retained by the neural network on the region of interest,
and we obtain competitive results on binary segmentation. We illustrate the
efficiency of this approach for medical image segmentation on a database of
MRI brain images.
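As a rough illustration of the energy the attention gate builds on (a minimal sketch of the Chan-Vese data term only, not the authors' implementation, which also involves PDE-based curve evolution):

```python
def chan_vese_energy(image, mask):
    """Data term of the Chan-Vese energy for a binary mask.

    image: 2D list of grayscale intensities
    mask:  2D list of 0/1 labels (1 = inside the segmented region)
    Returns the sum of squared deviations of each pixel from the mean
    intensity of its region (inside vs. outside the mask).
    """
    inside = [v for row_i, row_m in zip(image, mask)
              for v, m in zip(row_i, row_m) if m == 1]
    outside = [v for row_i, row_m in zip(image, mask)
               for v, m in zip(row_i, row_m) if m == 0]
    c1 = sum(inside) / len(inside) if inside else 0.0    # mean inside
    c2 = sum(outside) / len(outside) if outside else 0.0  # mean outside
    e_in = sum((v - c1) ** 2 for v in inside)
    e_out = sum((v - c2) ** 2 for v in outside)
    return e_in + e_out
```

A mask that cleanly separates two homogeneous intensity regions minimizes this energy; shifting the boundary into either region raises it, which is what lets the energy act as a constraint on the network's mask.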
A Survey on Deep Learning in Medical Image Registration: New Technologies, Uncertainty, Evaluation Metrics, and Beyond
Over the past decade, deep learning technologies have greatly advanced the
field of medical image registration. The initial developments, such as
ResNet-based and U-Net-based networks, laid the groundwork for deep
learning-driven image registration. Subsequent progress has been made in
various aspects of deep learning-based registration, including similarity
measures, deformation regularizations, and uncertainty estimation. These
advancements have not only enriched the field of deformable image registration
but have also facilitated its application in a wide range of tasks, including
atlas construction, multi-atlas segmentation, motion estimation, and 2D-3D
registration. In this paper, we present a comprehensive overview of the most
recent advancements in deep learning-based image registration. We begin with a
concise introduction to the core concepts of deep learning-based image
registration. Then, we delve into innovative network architectures, loss
functions specific to registration, and methods for estimating registration
uncertainty. Additionally, this paper explores appropriate evaluation metrics
for assessing the performance of deep learning models in registration tasks.
Finally, we highlight the practical applications of these novel techniques in
medical imaging and discuss the future prospects of deep learning-based image
registration.
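The survey's recurring ingredients, a similarity measure plus a deformation regularizer, can be sketched in a toy 1D form (illustrative only; weights and the first-difference penalty are stand-ins for the losses the surveyed methods actually use):

```python
def registration_loss(fixed, warped, disp, lam=0.1):
    """Toy deformable-registration objective:
    mean squared intensity difference between the fixed image and the
    warped moving image, plus a first-difference smoothness penalty on
    a 1D displacement field. `lam` is a hypothetical trade-off weight.
    """
    n = len(fixed)
    mse = sum((f - w) ** 2 for f, w in zip(fixed, warped)) / n
    smooth = sum((disp[i + 1] - disp[i]) ** 2
                 for i in range(len(disp) - 1))
    return mse + lam * smooth
```

A perfect alignment under a constant (rigid-like) displacement scores zero; intensity mismatch or a jagged displacement field both raise the loss, mirroring how similarity and regularization terms pull in different directions.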
Visual In-Context Learning for Few-Shot Eczema Segmentation
Automated diagnosis of eczema from digital camera images is crucial for
developing applications that allow patients to self-monitor their recovery. An
important component of this is the segmentation of eczema region from such
images. Current methods for eczema segmentation rely on deep neural networks
such as convolutional (CNN)-based U-Net or transformer-based Swin U-Net. While
effective, these methods require a high volume of annotated data, which can be
difficult to obtain. Here, we investigate the capabilities of visual in-context
learning that can perform few-shot eczema segmentation with just a handful of
examples and without any need for retraining models. Specifically, we propose a
strategy for applying in-context learning for eczema segmentation with a
generalist vision model called SegGPT. When benchmarked on a dataset of
annotated eczema images, we show that SegGPT with just 2 representative example
images from the training dataset performs better (mIoU: 36.69) than a CNN U-Net
trained on 428 images (mIoU: 32.60). We also find that using a larger number
of examples for SegGPT may in fact harm its performance. Our result
highlights the importance of visual in-context learning in developing faster
and better solutions to skin imaging tasks. Our result also paves the way for
developing inclusive solutions that can cater to minorities in the demographics
who are typically heavily under-represented in the training data.
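The mIoU scores quoted above are mean intersection-over-union across classes; a minimal sketch of the metric on flattened label lists (not the benchmark's exact evaluation code):

```python
def miou(pred, target, num_classes=2):
    """Mean intersection-over-union over classes.

    pred, target: flat lists of integer class labels per pixel.
    Classes absent from both prediction and target are skipped.
    """
    ious = []
    for c in range(num_classes):
        inter = sum(1 for p, t in zip(pred, target) if p == c and t == c)
        union = sum(1 for p, t in zip(pred, target) if p == c or t == c)
        if union:
            ious.append(inter / union)
    return sum(ious) / len(ious) if ious else 0.0
```

Because the mean is taken over classes, a small foreground class weighs as much as the background, which is why mIoU is stricter than plain pixel accuracy for lesion-style segmentation.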
Deep learning for unsupervised domain adaptation in medical imaging: Recent advancements and future perspectives
Deep learning has demonstrated remarkable performance across various tasks in
medical imaging. However, these approaches primarily focus on supervised
learning, assuming that the training and testing data are drawn from the same
distribution. Unfortunately, this assumption may not always hold true in
practice. To address these issues, unsupervised domain adaptation (UDA)
techniques have been developed to transfer knowledge from a labeled domain to a
related but unlabeled domain. In recent years, significant advancements have
been made in UDA, resulting in a wide range of methodologies, including feature
alignment, image translation, self-supervision, and disentangled representation
methods, among others. In this paper, we provide a comprehensive literature
review of recent deep UDA approaches in medical imaging from a technical
perspective. Specifically, we categorize current UDA research in medical
imaging into six groups and further divide them into finer subcategories based
on the different tasks they perform. We also discuss the respective datasets
used in the studies to assess the divergence between the different domains.
Finally, we discuss emerging areas and provide insights and discussions on
future research directions to conclude this survey.
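Of the UDA families listed, feature alignment is the simplest to sketch: minimize a discrepancy between source and target feature distributions. A toy linear-kernel maximum mean discrepancy on 1D features (illustrative; real methods use multi-kernel MMD or adversarial critics on high-dimensional features):

```python
def mmd_linear(source, target):
    """Linear-kernel maximum mean discrepancy between two 1D feature
    samples: the squared difference of their sample means. Feature-
    alignment UDA methods minimize such a discrepancy so that the
    labeled source and unlabeled target domains become indistinguishable
    in feature space.
    """
    ms = sum(source) / len(source)
    mt = sum(target) / len(target)
    return (ms - mt) ** 2
```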
Fetal-BET: Brain Extraction Tool for Fetal MRI
Fetal brain extraction is a necessary first step in most computational fetal
brain MRI pipelines. However, it has been a very challenging task due to
non-standard fetal head pose, fetal movements during examination, and vastly
heterogeneous appearance of the developing fetal brain and the neighboring
fetal and maternal anatomy across various sequences and scanning conditions.
Development of a machine learning method to effectively address this task
requires a large and rich labeled dataset that has not been previously
available. As a result, there is currently no method for accurate fetal brain
extraction on various fetal MRI sequences. In this work, we first built a large
annotated dataset of approximately 72,000 2D fetal brain MRI images. Our
dataset covers the three common MRI sequences including T2-weighted,
diffusion-weighted, and functional MRI acquired with different scanners.
Moreover, it includes normal and pathological brains. Using this dataset, we
developed and validated deep learning methods, by exploiting the power of the
U-Net style architectures, the attention mechanism, multi-contrast feature
learning, and data augmentation for fast, accurate, and generalizable automatic
fetal brain extraction. Our approach leverages the rich information from
multi-contrast (multi-sequence) fetal MRI data, enabling precise delineation of
the fetal brain structures. Evaluations on independent test data show that our
method achieves accurate brain extraction on heterogeneous test data acquired
with different scanners, on pathological brains, and at various gestational
stages. This robustness underscores the potential utility of our deep learning
model for fetal brain imaging and image analysis.
Comment: 10 pages, 6 figures, 2 tables. This work has been submitted to the
IEEE Transactions on Medical Imaging for possible publication. Copyright may
be transferred without notice, after which this version may no longer be
accessible.
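The attention mechanism cited here follows the additive attention-gate pattern popularized by Attention U-Net; a scalar sketch (the weights stand in for the learned 1x1 convolutions, and the actual model operates on feature maps):

```python
import math

def attention_gate(x, g, w_x=1.0, w_g=1.0, w_psi=1.0):
    """Scalar sketch of an additive attention gate.

    x: skip-connection feature, g: coarser gating signal.
    The gate combines both, passes them through ReLU and a sigmoid,
    and reweights the skip feature by the resulting coefficient,
    suppressing features irrelevant to the gating signal.
    """
    a = max(0.0, w_x * x + w_g * g)              # ReLU of combined signal
    alpha = 1.0 / (1.0 + math.exp(-w_psi * a))   # attention coefficient
    return alpha * x
```

When the gating signal agrees with the skip feature, the coefficient approaches 1 and the feature passes through almost unchanged; a conflicting signal drives the coefficient down toward 0.5 in this scalar toy (and toward 0 in the learned multi-channel version).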
FaceAtt: Enhancing Image Captioning with Facial Attributes for Portrait Images
Automated image caption generation is a critical area of research that
enhances accessibility and understanding of visual content for diverse
audiences. In this study, we propose the FaceAtt model, a novel approach to
attribute-focused image captioning that emphasizes the accurate depiction of
facial attributes within images. FaceAtt automatically detects and describes a
wide range of attributes, including emotions, expressions, pointed noses, fair
skin tones, hair textures, attractiveness, and approximate age ranges.
Leveraging deep learning techniques, we explore the impact of different image
feature extraction methods on caption quality and evaluate our model's
performance using metrics such as BLEU and METEOR. Our FaceAtt model leverages
annotated attributes of portraits as supplementary prior knowledge before
captioning. This addition yields a subtle yet
discernible enhancement in the resulting scores, exemplifying the potency of
incorporating additional attribute vectors during training. Furthermore, our
research contributes to the broader discourse on ethical considerations in
automated captioning. This study sets the stage for future research in refining
attribute-focused captioning techniques, with a focus on enhancing linguistic
coherence, addressing biases, and accommodating diverse user needs.
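BLEU, one of the two caption metrics named above, reduces at order 1 to clipped unigram precision times a brevity penalty; a minimal sketch (full BLEU also averages higher-order n-gram precisions):

```python
import math
from collections import Counter

def bleu1(candidate, reference):
    """Unigram BLEU on whitespace-tokenized strings.

    Counts candidate words clipped by their reference frequency
    (so repeating a word cannot inflate the score), divides by
    candidate length, and applies the brevity penalty that
    discounts captions shorter than the reference.
    """
    cand = candidate.split()
    ref = reference.split()
    ref_counts = Counter(ref)
    clipped = sum(min(n, ref_counts[w]) for w, n in Counter(cand).items())
    precision = clipped / len(cand) if cand else 0.0
    bp = 1.0 if len(cand) > len(ref) else math.exp(1 - len(ref) / max(len(cand), 1))
    return bp * precision
```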
A Multi-scale Learning of Data-driven and Anatomically Constrained Image Registration for Adult and Fetal Echo Images
Temporal echo image registration is a basis for clinical quantifications such
as cardiac motion estimation, myocardial strain assessments, and stroke volume
quantifications. Deep learning image registration (DLIR) is consistently
accurate, requires less computing effort, and has shown encouraging results in
earlier applications. However, we propose that a greater focus on the warped
moving image's anatomic plausibility and image quality can support robust DLIR
performance. Further, past implementations have focused on adult echo, and
there is an absence of DLIR implementations for fetal echo. We propose a
framework combining three strategies for DLIR for both fetal and adult echo:
(1) an anatomic shape-encoded loss to preserve physiological myocardial and
left ventricular anatomical topologies in warped images; (2) a data-driven loss
that is trained adversarially to preserve good image texture features in warped
images; and (3) a multi-scale training scheme of a data-driven and anatomically
constrained algorithm to improve accuracy. Our experiments show that the
shape-encoded loss and the data-driven adversarial loss are strongly correlated
to good anatomical topology and image textures, respectively. They improve
different aspects of registration performance in a non-overlapping way,
justifying their combination. We show that these strategies can provide
excellent registration results in both adult and fetal echo using the publicly
available CAMUS adult echo dataset and our private multi-demographic fetal echo
dataset, despite fundamental distinctions between adult and fetal echo images.
Our approach also outperforms traditional non-DL gold standard registration
approaches, including Optical Flow and Elastix. Registration improvements could
also be translated to more accurate and precise clinical quantification of
cardiac ejection fraction, demonstrating a potential for clinical translation.
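The three strategies amount to a weighted multi-term objective; as an illustrative sketch, a soft Dice loss on masks (a common choice for shape-preserving terms, used here as a stand-in for the paper's shape-encoded loss) combined with hypothetical weights:

```python
def dice_loss(pred_mask, true_mask, eps=1e-6):
    """Soft Dice loss on flat binary masks: 1 minus the Dice overlap.
    Penalizes warps whose anatomy no longer overlaps the target shape.
    `eps` avoids division by zero on empty masks."""
    inter = sum(p * t for p, t in zip(pred_mask, true_mask))
    total = sum(pred_mask) + sum(true_mask)
    return 1.0 - (2.0 * inter + eps) / (total + eps)

def combined_loss(sim, shape, adv, w_shape=1.0, w_adv=0.5):
    """Weighted sum of similarity, shape, and adversarial terms.
    The weights are hypothetical; the paper tunes its own balance."""
    return sim + w_shape * shape + w_adv * adv
```

The abstract's finding that the shape and adversarial terms improve non-overlapping aspects of performance is precisely the argument for summing them rather than choosing one.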
3D Visualization, Skeletonization and Branching Analysis of Blood Vessels in Angiogenesis
Angiogenesis is the process of new blood vessels growing from existing vasculature. Visualizing them as a three-dimensional (3D) model is a challenging yet relevant task, as it would be of great help to researchers, pathologists, and medical doctors. A branching analysis of the 3D model would further facilitate research and diagnostic purposes. In this paper, a pipeline of vision algorithms is elaborated to visualize and analyze blood vessels in 3D from formalin-fixed paraffin-embedded (FFPE) granulation tissue sections with two different staining methods. First, a U-Net neural network is used to segment blood vessels from the tissues. Second, image registration is used to align the consecutive images: coarse registration using an image-intensity optimization technique, followed by fine-tuning using a neural network based on Spatial Transformers, results in an excellent alignment of images. Lastly, the corresponding segmented masks depicting the blood vessels are aligned and interpolated using the results of the image registration, resulting in a visualized 3D model. Additionally, a skeletonization algorithm is used to analyze the branching characteristics of the 3D vascular model. In summary, computer vision and deep learning are used to reconstruct, visualize, and analyze a 3D vascular model from a set of parallel tissue samples. Our technique opens innovative perspectives in the understanding of vascular morphogenesis under different pathophysiological conditions and its potential diagnostic role.
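Once a skeleton is available, branching analysis often starts by locating branch points; a minimal 2D sketch (the paper works on a 3D skeleton, and this neighbor-counting rule is one simple convention, not necessarily the authors' exact criterion):

```python
def branch_points(skel):
    """Count branch points of a binary skeleton on a 2D grid.

    skel: 2D list of 0/1 values (1 = skeleton pixel).
    A skeleton pixel is counted as a branch point when three or more
    skeleton pixels lie in its 8-neighborhood; a simple path has at
    most two such neighbors per pixel.
    """
    h, w = len(skel), len(skel[0])
    count = 0
    for y in range(h):
        for x in range(w):
            if not skel[y][x]:
                continue
            nbrs = sum(skel[y + dy][x + dx]
                       for dy in (-1, 0, 1) for dx in (-1, 0, 1)
                       if (dy or dx) and 0 <= y + dy < h and 0 <= x + dx < w)
            if nbrs >= 3:
                count += 1
    return count
```

A straight vessel segment yields zero branch points, while a T-junction yields a positive count; counting such junctions per unit length is one way to quantify the branching characteristics mentioned above.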