NiftyNet: a deep-learning platform for medical imaging
Medical image analysis and computer-assisted intervention problems are
increasingly being addressed with deep-learning-based solutions. Established
deep-learning platforms are flexible but do not provide specific functionality
for medical image analysis, and adapting them for this application requires
substantial implementation effort. As a result, effort has been duplicated and
incompatible infrastructure developed across many research
groups. This work presents the open-source NiftyNet platform for deep learning
in medical imaging. The ambition of NiftyNet is to accelerate and simplify the
development of these solutions, and to provide a common mechanism for
disseminating research outputs for the community to use, adapt and build upon.
NiftyNet provides a modular deep-learning pipeline for a range of medical
imaging applications, including segmentation, regression, image generation and
representation learning. Components of the NiftyNet pipeline
including data loading, data augmentation, network architectures, loss
functions and evaluation metrics are tailored to, and take advantage of, the
idiosyncrasies of medical image analysis and computer-assisted intervention.
NiftyNet is built on TensorFlow and supports TensorBoard visualization of 2D
and 3D images and computational graphs by default.
We present three illustrative medical image analysis applications built using
NiftyNet: (1) segmentation of multiple abdominal organs from computed
tomography; (2) image regression to predict computed tomography attenuation
maps from brain magnetic resonance images; and (3) generation of simulated
ultrasound images for specified anatomical poses.
NiftyNet enables researchers to rapidly develop and distribute deep-learning
solutions for segmentation, regression, image generation and representation
learning, or to extend the platform to new applications.
Comment: Wenqi Li and Eli Gibson contributed equally to this work. M. Jorge
Cardoso and Tom Vercauteren contributed equally to this work. 26 pages, 6
figures; update includes additional applications, updated author list and
formatting for journal submission
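The segmentation loss functions the NiftyNet abstract mentions can be illustrated with the soft Dice loss, a standard objective in medical image segmentation. The sketch below is a minimal NumPy version for illustration only, not NiftyNet's actual implementation; the function name `soft_dice_loss` is hypothetical.

```python
import numpy as np

def soft_dice_loss(pred, target, eps=1e-6):
    """Soft Dice loss between per-voxel foreground probabilities and labels.

    Returns a value near 0 for a perfect prediction and near 1 for a
    completely disjoint one. `eps` avoids division by zero on empty masks.
    """
    pred = np.asarray(pred, dtype=float).ravel()
    target = np.asarray(target, dtype=float).ravel()
    intersection = np.sum(pred * target)
    denom = np.sum(pred) + np.sum(target)
    return 1.0 - (2.0 * intersection + eps) / (denom + eps)
```

Dice-style losses are popular in medical imaging because they are insensitive to the large class imbalance between organ voxels and background.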
Constrained CycleGAN for Effective Generation of Ultrasound Sector Images of Improved Spatial Resolution
Objective. A phased or a curvilinear array produces ultrasound (US) images
with a sector field of view (FOV), which inherently exhibits spatially-varying
image resolution with inferior quality in the far zone and towards the two
sides azimuthally. Sector US images with improved spatial resolutions are
favorable for accurate quantitative analysis of large and dynamic organs, such
as the heart. Therefore, this study aims to translate US images with
spatially-varying resolution to ones with less spatially-varying resolution.
CycleGAN has been a prominent choice for unpaired medical image translation;
however, it neither guarantees structural consistency nor preserves
backscattering patterns between input and generated images for unpaired US
images. Approach. To circumvent this limitation, we propose a constrained
CycleGAN (CCycleGAN), which directly performs US image generation with unpaired
images acquired by different ultrasound array probes. In addition to
conventional adversarial and cycle-consistency losses of CycleGAN, CCycleGAN
introduces an identical loss and a correlation coefficient loss based on
intrinsic US backscattered signal properties to constrain structural
consistency and backscattering patterns, respectively. Instead of
post-processed B-mode images, CCycleGAN uses envelope data directly obtained
from beamformed radio-frequency signals without any other non-linear
postprocessing. Main Results. In vitro phantom results demonstrate that
CCycleGAN successfully generates images with improved spatial resolution as
well as higher peak signal-to-noise ratio (PSNR) and structural similarity
(SSIM) compared with benchmarks. Significance. CCycleGAN-generated US images of
the in vivo human beating heart further facilitate higher quality heart wall
motion estimation than benchmark-generated ones, particularly in deep regions.
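The correlation coefficient loss the abstract describes can be sketched as one minus the Pearson correlation between the input and generated envelope data, so that preserving backscattering patterns drives the loss toward zero. This is an illustrative NumPy version under that assumption; the exact formulation in CCycleGAN may differ.

```python
import numpy as np

def correlation_loss(x, y, eps=1e-8):
    """1 - Pearson correlation between two envelope images.

    0 when x and y are perfectly correlated, 2 when anti-correlated.
    """
    x = np.asarray(x, dtype=float).ravel()
    y = np.asarray(y, dtype=float).ravel()
    xm, ym = x - x.mean(), y - y.mean()
    r = np.sum(xm * ym) / (np.sqrt(np.sum(xm ** 2) * np.sum(ym ** 2)) + eps)
    return 1.0 - r
```

Because the loss depends only on second-order statistics of the two signals, it constrains the speckle pattern without forcing pixel-wise identity, which suits unpaired translation.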
Incorporating Ultrasound Tongue Images for Audio-Visual Speech Enhancement
Audio-visual speech enhancement (AV-SE) aims to enhance degraded speech along
with extra visual information such as lip videos, and has been shown to be more
effective than audio-only speech enhancement. This paper proposes the
incorporation of ultrasound tongue images to improve the performance of
lip-based AV-SE systems further. To address the challenge of acquiring
ultrasound tongue images during inference, we first propose to employ knowledge
distillation during training to investigate the feasibility of leveraging
tongue-related information without directly inputting ultrasound tongue images.
Specifically, we guide an audio-lip speech enhancement student model to learn
from a pre-trained audio-lip-tongue speech enhancement teacher model, thus
transferring tongue-related knowledge. To better model the alignment between
the lip and tongue modalities, we further propose the introduction of a
lip-tongue key-value memory network into the AV-SE model. This network enables
the retrieval of tongue features based on readily available lip features,
thereby assisting the subsequent speech enhancement task. Experimental results
demonstrate that both methods significantly improve the quality and
intelligibility of the enhanced speech compared to traditional lip-based AV-SE
baselines. Moreover, both proposed methods exhibit strong generalization
performance on unseen speakers and in the presence of unseen noises.
Furthermore, phone error rate (PER) analysis of automatic speech recognition
(ASR) reveals that while all phonemes benefit from introducing ultrasound
tongue images, palatal and velar consonants benefit most.
Comment: Submitted to IEEE/ACM Transactions on Audio, Speech and Language
Processing. arXiv admin note: text overlap with arXiv:2305.1493
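The teacher-student knowledge distillation the abstract describes can be sketched as a weighted sum of the ordinary enhancement loss and a term pulling the student's output toward the teacher's. The MSE terms and the weighting below are illustrative assumptions, not the paper's actual objective.

```python
import numpy as np

def distillation_loss(student_out, teacher_out, target, alpha=0.5):
    """Combine the task loss (student vs. clean target) with a
    teacher-mimicking term (student vs. teacher output).

    alpha=1 recovers plain supervised training; alpha=0 trains the
    student purely to imitate the audio-lip-tongue teacher.
    """
    student_out = np.asarray(student_out, dtype=float)
    task = np.mean((student_out - np.asarray(target, dtype=float)) ** 2)
    distill = np.mean((student_out - np.asarray(teacher_out, dtype=float)) ** 2)
    return alpha * task + (1.0 - alpha) * distill
```

The point of this setup is that the teacher sees ultrasound tongue images during training, so imitating its outputs transfers tongue-related information to a student that needs only audio and lip video at inference time.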
Deep Learning in Breast Cancer Imaging: A Decade of Progress and Future Directions
Breast cancer has reached the highest incidence rate worldwide among all
malignancies since 2020. Breast imaging plays a significant role in early
diagnosis and intervention to improve the outcome of breast cancer patients. In
the past decade, deep learning has shown remarkable progress in breast cancer
imaging analysis, holding great promise in interpreting the rich information
and complex context of breast imaging modalities. Considering the rapid
improvement in the deep learning technology and the increasing severity of
breast cancer, it is critical to summarize past progress and identify future
challenges to be addressed. In this paper, we provide an extensive survey of
deep learning-based breast cancer imaging research, covering studies on
mammogram, ultrasound, magnetic resonance imaging, and digital pathology images
over the past decade. The major deep learning methods, publicly available
datasets, and applications on imaging-based screening, diagnosis, treatment
response prediction, and prognosis are described in detail. Drawn from the
findings of this survey, we present a comprehensive discussion of the
challenges and potential avenues for future research in deep learning-based
breast cancer imaging.
Comment: Survey, 41 pages
Comparative Analysis of Segment Anything Model and U-Net for Breast Tumor Detection in Ultrasound and Mammography Images
In this study, the main objective is to develop an algorithm capable of
identifying and delineating tumor regions in breast ultrasound (BUS) and
mammographic images. The technique employs two advanced deep learning
architectures, namely U-Net and pretrained SAM, for tumor segmentation. The
U-Net model is specifically designed for medical image segmentation and
leverages its deep convolutional neural network framework to extract meaningful
features from input images. On the other hand, the pretrained SAM architecture
incorporates a mechanism to capture spatial dependencies and generate
segmentation results. Evaluation is conducted on a diverse dataset containing
annotated tumor regions in BUS and mammographic images, covering both benign
and malignant tumors. This dataset enables a comprehensive assessment of the
algorithm's performance across different tumor types. Results demonstrate that
the U-Net model outperforms the pretrained SAM architecture in accurately
identifying and segmenting tumor regions in both BUS and mammographic images.
The U-Net exhibits superior performance in challenging cases involving
irregular shapes, indistinct boundaries, and high tumor heterogeneity. In
contrast, the pretrained SAM architecture exhibits limitations in accurately
identifying tumor areas, particularly for malignant tumors and objects with
weak boundaries or complex shapes. These findings highlight the importance of
selecting appropriate deep learning architectures tailored for medical image
segmentation. The U-Net model showcases its potential as a robust and accurate
tool for tumor detection, while the pretrained SAM architecture suggests the
need for further improvements to enhance segmentation performance.
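Segmentation comparisons like the U-Net versus SAM study above are typically scored with overlap metrics between predicted and annotated tumor masks; intersection-over-union is a minimal example. This is a generic sketch, not the study's actual evaluation code.

```python
import numpy as np

def iou(mask_a, mask_b):
    """Intersection-over-union between two binary segmentation masks.

    1.0 for identical masks, 0.0 for disjoint ones. Two empty masks
    are treated as a perfect match.
    """
    a = np.asarray(mask_a, dtype=bool)
    b = np.asarray(mask_b, dtype=bool)
    union = np.logical_or(a, b).sum()
    if union == 0:
        return 1.0
    return np.logical_and(a, b).sum() / union
```

Overlap metrics such as IoU and Dice penalize both over- and under-segmentation, which is why they expose failures on irregular shapes and weak boundaries like those reported for the pretrained SAM.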