Segmentation of the Prostatic Gland and the Intraprostatic Lesions on Multiparametric MRI Using Mask-RCNN
Prostate cancer (PCa) is the most common cancer in men in the United States.
Multiparametric magnetic resonance imaging (mp-MRI) has been explored by many
researchers for targeted prostate biopsies and radiation therapy. However,
assessment of mp-MRI can be subjective, so the development of computer-aided
diagnosis systems that automatically delineate the prostate gland and the
intraprostatic lesions (ILs) is important to assist radiologists in clinical
practice. In this paper, we first study the implementation of the Mask-RCNN
model to segment the prostate and ILs. We trained and evaluated models on 120
patients from two different cohorts of patients. We also used 2D U-Net and 3D
U-Net as benchmarks to segment the prostate and compared the model's
performance. The contour variability of ILs using the algorithm was also
benchmarked against the interobserver variability between two different
radiation oncologists on 19 patients. Our results indicate that the Mask-RCNN
model is able to reach state-of-the-art performance in prostate segmentation
and outperforms several competitive baselines in IL segmentation.
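
As a rough illustration of the setup described above, the sketch below configures an off-the-shelf torchvision Mask R-CNN with two foreground classes (prostate gland and IL) and runs a single training step on dummy data. The class labels, image size, and training details are assumptions; the paper's actual implementation is not specified in the abstract.

# Minimal sketch, assuming a two-class Mask R-CNN (prostate gland, intraprostatic
# lesion) built with torchvision; not the authors' exact implementation.
import torch
import torchvision
from torchvision.models.detection.faster_rcnn import FastRCNNPredictor
from torchvision.models.detection.mask_rcnn import MaskRCNNPredictor

num_classes = 3  # background + prostate gland + IL (assumed label scheme)

# Start from COCO-pretrained weights (downloads on first use).
model = torchvision.models.detection.maskrcnn_resnet50_fpn(weights="DEFAULT")

# Replace the box and mask heads so they predict our two foreground classes.
in_features = model.roi_heads.box_predictor.cls_score.in_features
model.roi_heads.box_predictor = FastRCNNPredictor(in_features, num_classes)
in_features_mask = model.roi_heads.mask_predictor.conv5_mask.in_channels
model.roi_heads.mask_predictor = MaskRCNNPredictor(in_features_mask, 256, num_classes)

# One training step on a dummy mp-MRI slice (replace with real data loading).
model.train()
images = [torch.rand(3, 256, 256)]
masks = torch.zeros(1, 256, 256, dtype=torch.uint8)
masks[0, 60:200, 60:200] = 1
targets = [{
    "boxes": torch.tensor([[60.0, 60.0, 200.0, 200.0]]),
    "labels": torch.tensor([1]),
    "masks": masks,
}]
loss_dict = model(images, targets)
loss = sum(loss_dict.values())
loss.backward()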
A Large RGB-D Dataset for Semi-supervised Monocular Depth Estimation
Current self-supervised methods for monocular depth estimation are largely
based on deeply nested convolutional networks that leverage stereo image pairs
or monocular sequences during a training phase. However, they often exhibit
inaccurate results around occluded regions and depth boundaries. In this paper,
we present a simple yet effective approach for monocular depth estimation using
stereo image pairs. The study aims to propose a student-teacher strategy in
which a shallow student network is trained with the auxiliary information
obtained from a deeper and more accurate teacher network. Specifically, we
first train the stereo teacher network by fully utilizing the binocular
perception of 3-D geometry and then use the depth predictions of the teacher
network to train the student network for monocular depth inference. This
enables us to exploit all available depth data from massive unlabeled stereo
pairs. We propose a strategy that involves the use of a data ensemble to merge
the multiple depth predictions of the teacher network to improve the training
samples by collecting non-trivial knowledge beyond a single prediction. To
refine the inaccurate depth estimation that is used when training the student
network, we further propose stereo confidence-guided regression loss that
handles unreliable pseudo depth values in occlusions, texture-less regions,
and repetitive patterns. To complement the existing dataset comprising outdoor
driving scenes, we built a novel large-scale dataset consisting of one million
outdoor stereo images taken using hand-held stereo cameras. Finally, we
demonstrate that the monocular depth estimation network provides feature
representations that are suitable for high-level vision tasks. The experimental
results for various outdoor scenarios demonstrate the effectiveness and
flexibility of our approach, which outperforms state-of-the-art approaches.
Comment: https://dimlrgbd.github.io
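
To make the confidence-guided distillation step concrete, here is a minimal sketch of one plausible form of the stereo confidence-guided regression loss: an L1 term against the teacher's pseudo depth, down-weighted where the stereo confidence is low. The exact loss used in the paper is not given in the abstract, so the formulation and tensor shapes below are assumptions.

# Sketch of a confidence-guided regression loss; the precise form used in the
# paper is an assumption here.
import torch

def confidence_guided_depth_loss(student_depth, teacher_depth, confidence, eps=1e-6):
    """L1 regression to the teacher's pseudo depth, down-weighting pixels where the
    stereo confidence is low (occlusions, texture-less or repetitive regions)."""
    per_pixel = torch.abs(student_depth - teacher_depth)
    return (confidence * per_pixel).sum() / (confidence.sum() + eps)

# Usage with dummy tensors shaped (batch, 1, H, W).
student = torch.rand(2, 1, 128, 416, requires_grad=True)   # monocular student prediction
teacher = torch.rand(2, 1, 128, 416)                       # pseudo depth from the stereo teacher
conf = torch.rand(2, 1, 128, 416)                          # stereo confidence map in [0, 1]
loss = confidence_guided_depth_loss(student, teacher, conf)
loss.backward()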
A Survey of the Recent Architectures of Deep Convolutional Neural Networks
Deep Convolutional Neural Network (CNN) is a special type of Neural Networks,
which has shown exemplary performance on several competitions related to
Computer Vision and Image Processing. Some of the exciting application areas of
CNN include Image Classification and Segmentation, Object Detection, Video
Processing, Natural Language Processing, and Speech Recognition. The powerful
learning ability of deep CNN is primarily due to the use of multiple feature
extraction stages that can automatically learn representations from the data.
The availability of a large amount of data and improvement in the hardware
technology have accelerated research in CNNs, and recently interesting deep
CNN architectures have been reported. Several inspiring ideas to bring
advancements in CNNs have been explored, such as the use of different
activation and loss functions, parameter optimization, regularization, and
architectural innovations. However, the significant improvement in the
representational capacity of the deep CNN is achieved through architectural
innovations. Notably, the ideas of exploiting spatial and channel information,
depth and width of architecture, and multi-path information processing have
gained substantial attention. Similarly, the idea of using a block of layers as
a structural unit is also gaining popularity. This survey thus focuses on the
intrinsic taxonomy present in the recently reported deep CNN architectures and,
consequently, classifies the recent innovations in CNN architectures into seven
different categories. These seven categories are based on spatial exploitation,
depth, multi-path, width, feature-map exploitation, channel boosting, and
attention. Additionally, the elementary understanding of CNN components,
current challenges, and applications of CNN are also provided.
Comment: Number of Pages: 70, Number of Figures: 11, Number of Tables: 11.
Artif Intell Rev (2020)
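
The survey is taxonomic rather than algorithmic, but the "block of layers as a structural unit" idea it highlights can be illustrated with a standard residual block; the sketch below is a generic example, not code from the survey.

# Generic residual block, illustrating the block-as-structural-unit idea.
import torch
import torch.nn as nn

class ResidualBlock(nn.Module):
    def __init__(self, channels):
        super().__init__()
        self.body = nn.Sequential(
            nn.Conv2d(channels, channels, kernel_size=3, padding=1, bias=False),
            nn.BatchNorm2d(channels),
            nn.ReLU(inplace=True),
            nn.Conv2d(channels, channels, kernel_size=3, padding=1, bias=False),
            nn.BatchNorm2d(channels),
        )
        self.relu = nn.ReLU(inplace=True)

    def forward(self, x):
        # The identity shortcut lets gradients bypass the block, enabling very deep CNNs.
        return self.relu(x + self.body(x))

block = ResidualBlock(64)
out = block(torch.rand(1, 64, 56, 56))  # shape preserved: (1, 64, 56, 56)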
Enhancing endoscopic navigation and polyp detection using artificial intelligence
Colorectal cancer (CRC) is one of the most common and deadly forms of cancer. It has a very high mortality rate if the disease advances to late stages; however, early diagnosis and treatment can be curative, so early detection is essential to effective disease management. Colonoscopy is considered the gold standard for CRC screening and early therapeutic treatment. The effectiveness of colonoscopy is highly dependent on the operator’s skill, as a high level of hand-eye coordination is required to control the endoscope and fully examine the colon wall. Because of this, detection rates can vary between different gastroenterologists, and technological solutions have been proposed to assist disease detection and standardise detection rates. This thesis focuses on developing artificial intelligence algorithms to assist gastroenterologists during colonoscopy, with the potential to ensure a baseline standard of quality in CRC screening. To achieve such assistance, the technical contributions develop deep learning methods and architectures for automated endoscopic image analysis that address both the detection of lesions in the endoscopic image and the 3D mapping of the endoluminal environment. The proposed detection models can run in real time and assist visualization of different polyp types. Meanwhile, the 3D reconstruction and mapping models developed are the basis for ensuring that the entire colon has been examined appropriately and for supporting quantitative measurement of polyp sizes from the image during a procedure. Results and validation studies presented within the thesis demonstrate how the developed algorithms perform on both general scenes and on clinical data. The feasibility of clinical translation is demonstrated for all of the models on endoscopic data from human participants during CRC screening examinations.
A Survey on Deep Learning in Medical Image Analysis
Deep learning algorithms, in particular convolutional networks, have rapidly
become a methodology of choice for analyzing medical images. This paper reviews
the major deep learning concepts pertinent to medical image analysis and
summarizes over 300 contributions to the field, most of which appeared in the
last year. We survey the use of deep learning for image classification, object
detection, segmentation, registration, and other tasks and provide concise
overviews of studies per application area. Open challenges and directions for
future research are discussed.
Comment: Revised survey includes expanded discussion section and reworked
introductory section on common deep architectures. Added missed papers from
before Feb 1st 201
Lose The Views: Limited Angle CT Reconstruction via Implicit Sinogram Completion
Computed Tomography (CT) reconstruction is a fundamental component of a wide
variety of applications, ranging from security to healthcare. The classical
techniques require measuring projections, called sinograms, from a full
180° view of the object. This is impractical in a limited angle
scenario, when the viewing angle is less than 180°, which can occur due
to different factors including restrictions on scanning time, limited
flexibility of scanner rotation, etc. The sinograms obtained as a result cause
existing techniques to produce highly artifact-laden reconstructions. In this
paper, we propose to address this problem through implicit sinogram completion,
on a challenging real world dataset containing scans of common checked-in
luggage. We propose a system, consisting of 1D and 2D convolutional neural
networks, that operates on a limited angle sinogram to directly produce the
best estimate of a reconstruction. Next, we use the x-ray transform on this
reconstruction to obtain a "completed" sinogram, as if it came from a full
180° measurement. We feed this to standard analytical and iterative
reconstruction techniques to obtain the final reconstruction. We show with
extensive experimentation that this combined strategy outperforms many
competitive baselines. We also propose a measure of confidence for the
reconstruction that enables a practitioner to gauge the reliability of a
prediction made by our network. We show that this measure is a strong indicator
of quality as measured by the PSNR, while not requiring ground truth at test
time. Finally, using a segmentation experiment, we show that our reconstruction
preserves the 3D structure of objects effectively.
Comment: Spotlight presentation at CVPR 201
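
The abstract describes a three-stage pipeline: a learned estimate from the limited-angle sinogram, forward projection of that estimate to obtain a "completed" sinogram, then standard reconstruction. The skeleton below mirrors that flow using scikit-image's radon/iradon as stand-ins for the x-ray transform and the analytical (FBP) step; reconstruct_from_limited_sinogram is a placeholder for the paper's 1D/2D CNN system, and the 120° limited range is an assumed example.

# Pipeline skeleton for implicit sinogram completion; the network is a placeholder
# and skimage's radon/iradon are generic stand-ins, not the paper's exact tooling.
import numpy as np
from skimage.transform import radon, iradon

def reconstruct_from_limited_sinogram(limited_sino, limited_angles):
    # Placeholder for the learned best-estimate reconstruction; a trained CNN goes here.
    return iradon(limited_sino, theta=limited_angles, filter_name="ramp")

full_angles = np.arange(0.0, 180.0)       # full 180° coverage
limited_angles = np.arange(0.0, 120.0)    # limited-angle scan (assumed 120°)

image = np.zeros((128, 128)); image[40:80, 50:90] = 1.0   # toy phantom
limited_sino = radon(image, theta=limited_angles)

# 1) Produce a best-estimate reconstruction from the limited-angle sinogram.
estimate = reconstruct_from_limited_sinogram(limited_sino, limited_angles)

# 2) Forward-project the estimate over the full angular range -> "completed" sinogram.
completed_sino = radon(estimate, theta=full_angles)

# 3) Standard analytical reconstruction (FBP) from the completed sinogram.
final = iradon(completed_sino, theta=full_angles, filter_name="ramp")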
Residual Attention U-Net for Automated Multi-Class Segmentation of COVID-19 Chest CT Images
The novel coronavirus disease 2019 (COVID-19) has been spreading rapidly
around the world and has caused a significant impact on public health and the
economy. However, there is still a lack of studies on effectively quantifying the
lung infection caused by COVID-19. As a basic but challenging task of the
diagnostic framework, segmentation plays a crucial role in accurate
quantification of COVID-19 infection measured by computed tomography (CT)
images. To this end, we proposed a novel deep learning algorithm for automated
segmentation of multiple COVID-19 infection regions. Specifically, we use the
Aggregated Residual Transformations to learn a robust and expressive feature
representation and apply the soft attention mechanism to improve the capability
of the model to distinguish a variety of symptoms of COVID-19. With a
public CT image dataset, we validate the efficacy of the proposed algorithm in
comparison with other competing methods. Experimental results demonstrate the
outstanding performance of our algorithm for automated segmentation of COVID-19
chest CT images. Our study provides a promising deep learning-based segmentation
tool that lays a foundation for the quantitative diagnosis of COVID-19 lung
infection in CT images.
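
One common realisation of the "soft attention mechanism" mentioned above is an attention gate that reweights skip-connection features, as in attention U-Nets; the sketch below shows that generic form. The paper's exact gate design and channel sizes are not given in the abstract, so everything here is an assumption.

# Generic soft attention gate for U-Net skip connections; an illustrative form only.
import torch
import torch.nn as nn
import torch.nn.functional as F

class SoftAttentionGate(nn.Module):
    def __init__(self, skip_channels, gating_channels, inter_channels):
        super().__init__()
        self.theta = nn.Conv2d(skip_channels, inter_channels, kernel_size=1)
        self.phi = nn.Conv2d(gating_channels, inter_channels, kernel_size=1)
        self.psi = nn.Conv2d(inter_channels, 1, kernel_size=1)

    def forward(self, skip, gate):
        # The gating signal comes from a coarser decoder stage; upsample it to the
        # skip-connection resolution before combining.
        gate = F.interpolate(gate, size=skip.shape[2:], mode="bilinear", align_corners=False)
        attn = torch.sigmoid(self.psi(torch.relu(self.theta(skip) + self.phi(gate))))
        # Reweight the skip features: suppress irrelevant regions, keep likely lesions.
        return skip * attn

gate = SoftAttentionGate(skip_channels=64, gating_channels=128, inter_channels=32)
out = gate(torch.rand(1, 64, 128, 128), torch.rand(1, 128, 64, 64))  # -> (1, 64, 128, 128)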
DoDNet: Learning to segment multi-organ and tumors from multiple partially labeled datasets
Due to the intensive cost of labor and expertise in annotating 3D medical
images at a voxel level, most benchmark datasets are equipped with the
annotations of only one type of organs and/or tumors, resulting in the
so-called partial labeling issue. To address this, we propose a dynamic
on-demand network (DoDNet) that learns to segment multiple organs and tumors on
partially labeled datasets. DoDNet consists of a shared encoder-decoder
architecture, a task encoding module, a controller for generating dynamic
convolution filters, and a single but dynamic segmentation head. The
information of the current segmentation task is encoded as a task-aware prior
to tell the model what task it is expected to solve. Unlike existing
approaches, which fix kernels after training, the kernels in the dynamic head are
generated adaptively by the controller, conditioned on both input image and
assigned task. Thus, DoDNet is able to segment multiple organs and tumors, as
done by multiple networks or a multi-head network, in a much more efficient and
flexible manner. We have created a large-scale partially labeled dataset,
termed MOTS, and demonstrated the superior performance of our DoDNet over other
competitors on seven organ and tumor segmentation tasks. We also transferred
the weights pre-trained on MOTS to a downstream multi-organ segmentation task
and achieved state-of-the-art performance. This study provides a general 3D
medical image segmentation model that has been pre-trained on a large-scale
partially labelled dataset and can be extended (after fine-tuning) to
downstream volumetric medical data segmentation tasks. The dataset and code
are available at: https://git.io/DoDNe
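
The dynamic-head idea, a controller that generates segmentation kernels conditioned on a task encoding and the input features, can be sketched as follows. Channel sizes, the number of tasks, the single 1x1 conv head, and the use of 2D rather than 3D convolutions are simplifications for illustration; see the linked repository for the actual DoDNet implementation.

# Sketch of a dynamic segmentation head with a task-conditioned controller.
import torch
import torch.nn as nn
import torch.nn.functional as F

class DynamicHead(nn.Module):
    def __init__(self, feat_channels=32, num_tasks=7, out_channels=2):
        super().__init__()
        self.feat_channels = feat_channels
        self.num_tasks = num_tasks
        self.out_channels = out_channels
        # Controller: task one-hot + globally pooled features -> kernel of a 1x1 conv head.
        self.controller = nn.Linear(num_tasks + feat_channels,
                                    out_channels * feat_channels + out_channels)

    def forward(self, feats, task_id):
        task_onehot = F.one_hot(task_id, self.num_tasks).float()   # (B, T)
        pooled = feats.mean(dim=(2, 3))                            # (B, C)
        params = self.controller(torch.cat([task_onehot, pooled], dim=1))
        w_end = self.out_channels * self.feat_channels
        outs = []
        for b in range(feats.size(0)):  # kernels are generated per sample
            weight = params[b, :w_end].view(self.out_channels, self.feat_channels, 1, 1)
            bias = params[b, w_end:]
            outs.append(F.conv2d(feats[b:b + 1], weight, bias))
        return torch.cat(outs, dim=0)                              # (B, out_channels, H, W)

head = DynamicHead()
logits = head(torch.rand(2, 32, 64, 64), torch.tensor([0, 3]))     # -> (2, 2, 64, 64)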
Federated learning for medical imaging radiology
Federated learning (FL) is gaining wide acceptance across the medical AI domains. FL promises to provide fairly acceptable clinical-grade accuracy, privacy, and generalisability of machine learning models across multiple institutions. However, the research on FL for medical imaging AI is still in its early stages. This paper presents a review of recent research to outline the difference between state-of-the-art [SOTA] (published literature) and state-of-the-practice [SOTP] (applied research in realistic clinical environments). Furthermore, the review outlines future research directions considering various factors such as data, learning models, system design, governance, and human-in-the-loop, to translate the SOTA into SOTP and to enable effective collaboration across multiple institutions.
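
For illustration only, the snippet below shows FedAvg-style weight averaging, the canonical aggregation scheme behind most cross-institution FL training; the review itself does not prescribe this particular algorithm, and the function and argument names here are hypothetical.

# Hypothetical FedAvg-style aggregation; not taken from the reviewed works.
import copy
import torch

def federated_average(global_model, client_state_dicts, client_sizes):
    """Average client weights, weighted by local dataset size (FedAvg-style)."""
    total = float(sum(client_sizes))
    avg_state = copy.deepcopy(client_state_dicts[0])
    for key in avg_state:
        if avg_state[key].is_floating_point():
            avg_state[key] = sum(
                sd[key] * (n / total) for sd, n in zip(client_state_dicts, client_sizes)
            )
        # Non-float entries (e.g. BatchNorm counters) are kept from the first client.
    global_model.load_state_dict(avg_state)
    return global_model

# Hypothetical usage: three institutions train local copies, then the server averages.
model = torch.nn.Linear(10, 2)
local_states = [copy.deepcopy(model).state_dict() for _ in range(3)]
model = federated_average(model, local_states, client_sizes=[120, 340, 95])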
Implicit Label Augmentation on Partially Annotated Clips via Temporally-Adaptive Features Learning
Partially annotated clips contain rich temporal contexts that can complement
the sparse key frame annotations in providing supervision for model training.
We present a novel paradigm called Temporally-Adaptive Features (TAF) learning
that can utilize such data to learn better single frame models. By imposing
distinct temporal change rate constraints on different factors in the model,
TAF enables learning from unlabeled frames using context to enhance model
accuracy. TAF generalizes "slow feature" learning and we present much stronger
empirical evidence than prior works, showing convincing gains for the
challenging semantic segmentation task over a variety of architecture designs
and on two popular datasets. TAF can be interpreted as an implicit label
augmentation method but is a more principled formulation compared to existing
explicit augmentation techniques. Our work thus connects two promising methods
that utilize partially annotated clips for single frame model training and can
inspire future explorations in this direction.
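
The basic "slow feature" ingredient that TAF is said to generalize can be sketched as a penalty on the temporal change of features between neighbouring (possibly unlabeled) frames; how TAF actually factorizes the model and assigns distinct change rates per factor is not detailed in the abstract, so the loss below is only the baseline idea.

# Baseline slow-feature-style temporal consistency loss; an assumed simplification of TAF.
import torch

def slowness_loss(feats_t, feats_t1):
    """Mean squared temporal change between features of consecutive frames."""
    return ((feats_t1 - feats_t) ** 2).mean()

# Dummy per-frame features from a shared backbone, shape (B, C, H, W).
f_t = torch.rand(2, 64, 32, 32, requires_grad=True)
f_t1 = torch.rand(2, 64, 32, 32, requires_grad=True)
loss = slowness_loss(f_t, f_t1)   # add to the supervised loss on annotated key frames
loss.backward()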