Deep Learning and Conditional Random Fields-based Depth Estimation and Topographical Reconstruction from Conventional Endoscopy
Colorectal cancer is the fourth leading cause of cancer deaths worldwide and
the second leading cause in the United States. The risk of colorectal cancer
can be mitigated by the identification and removal of premalignant lesions
through optical colonoscopy. Unfortunately, conventional colonoscopy misses
more than 20% of the polyps that should be removed, due in part to poor
contrast of lesion topography. Imaging tissue topography during a colonoscopy
is difficult because of the size constraints of the endoscope and the deforming
mucosa. Most existing methods make geometric assumptions or incorporate a
priori information, which limits accuracy and sensitivity. In this paper, we
present a method that avoids these restrictions, using a joint deep
convolutional neural network-conditional random field (CNN-CRF) framework.
Estimated depth is used to reconstruct the topography of the surface of the
colon from a single image. We jointly train the unary and pairwise potential
functions of a CRF within a CNN on synthetic data, generated by developing an endoscope
camera model and rendering over 100,000 images of an anatomically-realistic
colon. We validate our approach with real endoscopy images from a porcine
colon, transferred to a synthetic-like domain, with ground truth from
registered computed tomography measurements. The CNN-CRF approach estimates
depths with a relative error of 0.152 for synthetic endoscopy images and 0.242
for real endoscopy images. We show that the estimated depth maps can be used
for reconstructing the topography of the mucosa from conventional colonoscopy
images. This approach can easily be integrated into existing endoscopy systems
and provides a foundation for improving computer-aided algorithms for the
detection, segmentation, and classification of lesions.
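The abstract does not spell out the joint CNN-CRF objective, but a common construction pairs a per-pixel CNN regression (unary) term with an image-guided smoothness (pairwise) term. A minimal PyTorch sketch of such a loss follows; the edge-weighting scheme and the pairwise weight are illustrative assumptions, not the paper's exact formulation.

```python
# Sketch of a CNN-CRF-style training loss for monocular depth: a per-pixel
# (unary) regression term plus an image-guided pairwise smoothness term.
# All weights and design choices here are illustrative assumptions.
import torch
import torch.nn.functional as F

def crf_depth_loss(pred_depth, gt_depth, image, pairwise_weight=0.1):
    """pred_depth, gt_depth: (B, 1, H, W); image: (B, 3, H, W)."""
    # Unary potential: per-pixel regression to ground-truth depth.
    unary = F.mse_loss(pred_depth, gt_depth)

    # Pairwise potential: penalize depth differences between neighboring
    # pixels, down-weighted across image edges (edge-aware smoothness).
    dzdx = torch.abs(pred_depth[:, :, :, 1:] - pred_depth[:, :, :, :-1])
    dzdy = torch.abs(pred_depth[:, :, 1:, :] - pred_depth[:, :, :-1, :])
    didx = torch.mean(torch.abs(image[:, :, :, 1:] - image[:, :, :, :-1]),
                      dim=1, keepdim=True)
    didy = torch.mean(torch.abs(image[:, :, 1:, :] - image[:, :, :-1, :]),
                      dim=1, keepdim=True)
    pairwise = (dzdx * torch.exp(-didx)).mean() + (dzdy * torch.exp(-didy)).mean()

    return unary + pairwise_weight * pairwise
```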
Unsupervised Reverse Domain Adaptation for Synthetic Medical Images via Adversarial Training
To realize the full potential of deep learning for medical imaging, large
annotated datasets are required for training. Such datasets are difficult to
acquire because labeled medical images are not usually available due to privacy
issues, lack of experts available for annotation, underrepresentation of rare
conditions and poor standardization. Lack of annotated data has been addressed
in conventional vision applications using synthetic images refined via
unsupervised adversarial training to look like real images. However, this
approach is difficult to extend to general medical imaging because of the
complex and diverse set of features found in real human tissues. We propose an
alternative framework that uses a reverse flow, where adversarial training is
used to make real medical images more like synthetic images, and hypothesize
that clinically-relevant features can be preserved via self-regularization.
These domain-adapted images can then be accurately interpreted by networks
trained on large datasets of synthetic medical images. We test this approach
for the notoriously difficult task of depth-estimation from endoscopy. We train
a depth estimator on a large dataset of synthetic images generated using an
accurate forward model of an endoscope and an anatomically-realistic colon.
This network predicts significantly more accurate depths from synthetic-like,
domain-adapted images than from the raw real images, confirming that clinically
relevant depth cues are preserved.
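The reverse-flow idea can be summarized as a refiner trained against a discriminator, plus an L1 self-regularization term that ties the output to its input. A hedged PyTorch sketch under those assumptions; the network definitions and the weight lam are illustrative, not the paper's exact objective.

```python
# Sketch of the reverse-adaptation objective: a refiner R maps real images
# toward the synthetic domain, a discriminator D tries to tell refined
# images from true synthetic ones, and an L1 self-regularization term
# preserves clinically relevant structure. Weights are illustrative.
import torch
import torch.nn.functional as F

def refiner_loss(R, D, real_img, lam=10.0):
    refined = R(real_img)
    logits = D(refined)
    # Adversarial term: refined images should score as "synthetic" (label 1).
    adv = F.binary_cross_entropy_with_logits(logits, torch.ones_like(logits))
    # Self-regularization: keep the refined image close to its input so
    # diagnostic content is not hallucinated away.
    self_reg = F.l1_loss(refined, real_img)
    return adv + lam * self_reg

def discriminator_loss(R, D, real_img, synth_img):
    refined_logits = D(R(real_img).detach())
    synth_logits = D(synth_img)
    return (F.binary_cross_entropy_with_logits(
                synth_logits, torch.ones_like(synth_logits))
            + F.binary_cross_entropy_with_logits(
                refined_logits, torch.zeros_like(refined_logits)))
```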
Large dynamic range autorefraction with a low-cost diffuser wavefront sensor
Wavefront sensing with a thin diffuser has emerged as a potential low-cost
alternative to a lenslet array for aberrometry. Diffuser wavefront sensors
(DWS) have previously relied on tracking speckle displacement and consequently
require coherent illumination. Here we show that displacement of caustic
patterns can be tracked for estimating wavefront gradient, enabling the use of
incoherent light sources and large dynamic-range wavefront measurements. We
compare the precision of a DWS to a Shack-Hartmann wavefront sensor (SHWS) when
using coherent, partially coherent, and incoherent illumination, in the
application of autorefraction. We induce spherical and cylindrical errors in a
model eye and use a multi-level Demons non-rigid registration algorithm to
estimate caustic displacements relative to an emmetropic model eye. When
compared to spherical error measurements with the SHWS using partially coherent
illumination, the DWS demonstrates a 5-fold improvement in dynamic range
(-4.0 to +4.5 D vs. -22.0 to +19.5 D) with less than a 2-fold reduction in
resolution (0.072 vs. 0.116 D), enabling a 3-fold increase in the number
of resolvable prescriptions (118 vs. 358). In addition to being roughly 40x lower in cost,
the unique, non-periodic nature of the caustic pattern formed by a diffuser
enables a larger dynamic range of aberration measurements compared to a lenslet
array.
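As a rough illustration of how a caustic displacement field maps to a spherical prescription: for pure defocus W(r) = -(P/2) r^2, the caustic shift after propagating a distance z is proportional to the local wavefront slope, so the power P falls out of a linear least-squares fit of displacement versus pupil position. In this NumPy sketch the registration step that produces the displacement field (e.g., the Demons algorithm above) is assumed done, and the geometry values are illustrative placeholders.

```python
# Sketch: recover spherical power (diopters) from a caustic displacement
# field measured relative to an emmetropic reference. Geometry values
# (pixel pitch, diffuser-sensor gap) are illustrative assumptions.
import numpy as np

def spherical_power(ux, uy, pixel_pitch=2e-6, z=2e-3):
    """ux, uy: (H, W) caustic displacements in meters; z: sensor gap in m."""
    h, w = ux.shape
    y, x = np.mgrid[:h, :w].astype(float)
    x = (x - w / 2) * pixel_pitch          # pupil coordinates in meters
    y = (y - h / 2) * pixel_pitch
    # For defocus, dW/dx = -P*x and dW/dy = -P*y; displacement = z * gradient,
    # so both axes stack into one least-squares problem for the slope -P*z.
    coords = np.concatenate([x.ravel(), y.ravel()])
    shifts = np.concatenate([ux.ravel(), uy.ravel()])
    slope = np.dot(coords, shifts) / np.dot(coords, coords)
    return -slope / z                       # spherical power in diopters
```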
Rethinking Monocular Depth Estimation with Adversarial Training
Monocular depth estimation is an extensively studied computer vision problem
with a vast variety of applications. Deep learning-based methods have
demonstrated promise for both supervised and unsupervised depth estimation from
monocular images. Most existing approaches treat depth estimation as a
regression problem with a local pixel-wise loss function. In this work, we
innovate beyond existing approaches by using adversarial training to learn a
context-aware, non-local loss function. Such an approach penalizes the joint
configuration of predicted depth values at the patch-level instead of the
pixel-level, which allows networks to incorporate more global information. In
this framework, the generator learns a mapping from an RGB image to its
corresponding depth map, while the discriminator learns to distinguish
predicted depth-RGB pairs from ground-truth pairs. This conditional GAN depth estimation
framework is stabilized using spectral normalization to prevent mode collapse
when learning from diverse datasets. We test this approach using a diverse set
of generators that include U-Net and joint CNN-CRF. We benchmark this approach
on the NYUv2, Make3D and KITTI datasets, and observe that adversarial training
reduces relative error severalfold, achieving state-of-the-art performance.
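The abstract does not give the discriminator architecture; a plausible reading is a PatchGAN-style conditional discriminator with spectral normalization on every convolution, scoring (RGB, depth) pairs jointly at the patch level. The layer sizes in this PyTorch sketch are illustrative assumptions.

```python
# Sketch of a spectrally-normalized, patch-level conditional discriminator:
# it scores (RGB, depth) pairs jointly so the adversarial loss acts on the
# joint configuration of a patch rather than on individual pixels.
import torch
import torch.nn as nn
from torch.nn.utils import spectral_norm

class PatchDiscriminator(nn.Module):
    def __init__(self, in_ch=4):  # 3 RGB channels + 1 depth channel
        super().__init__()
        def block(cin, cout, stride=2):
            return nn.Sequential(
                spectral_norm(nn.Conv2d(cin, cout, 4, stride, 1)),
                nn.LeakyReLU(0.2, inplace=True))
        self.net = nn.Sequential(
            block(in_ch, 64), block(64, 128), block(128, 256),
            spectral_norm(nn.Conv2d(256, 1, 4, 1, 1)))  # per-patch logits

    def forward(self, rgb, depth):
        # Condition on the RGB image by channel-wise concatenation.
        return self.net(torch.cat([rgb, depth], dim=1))
```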
DeepLSR: a deep learning approach for laser speckle reduction
Speckle artifacts degrade image quality in virtually all modalities that
utilize coherent energy, including optical coherence tomography, reflectance
confocal microscopy, ultrasound, and widefield imaging with laser illumination.
We present an adversarial deep learning framework for laser speckle reduction,
called DeepLSR (https://durr.jhu.edu/DeepLSR), that transforms images from a
source domain of coherent illumination to a target domain of speckle-free,
incoherent illumination. We apply this method to widefield images of objects
and tissues illuminated with a multi-wavelength laser, using light emitting
diode-illuminated images as ground truth. In images of gastrointestinal
tissues, DeepLSR reduces laser speckle noise by 6.4 dB, compared to a 2.9 dB
reduction from optimized non-local means processing, a 3.0 dB reduction from
BM3D, and a 3.7 dB reduction from an optical speckle reducer utilizing an
oscillating diffuser. Further, DeepLSR can be combined with optical speckle
reduction to reduce speckle noise by 9.4 dB. This dramatic reduction in speckle
noise may enable the use of coherent light sources in applications that require
small illumination sources and high-quality imaging, including medical
endoscopy.
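The dB figures suggest a PSNR-style comparison against the incoherent (LED-illuminated) ground truth; the exact metric is not stated here, so the following NumPy sketch is an assumption about how such a speckle-reduction figure could be computed, not the paper's stated definition.

```python
# Sketch of a PSNR-based speckle-reduction figure: improvement in dB of
# the despeckled image over the raw laser image, both measured against
# the LED-illuminated reference. The metric definition is an assumption.
import numpy as np

def psnr(img, ref, max_val=1.0):
    mse = np.mean((img.astype(float) - ref.astype(float)) ** 2)
    return 10 * np.log10(max_val ** 2 / mse)

def speckle_reduction_db(speckled, despeckled, led_reference):
    return (psnr(despeckled, led_reference)
            - psnr(speckled, led_reference))
```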
Rapid tissue oxygenation mapping from snapshot structured-light images with adversarial deep learning
Spatial frequency domain imaging (SFDI) is a powerful technique for mapping
tissue oxygen saturation over a wide field of view. However, current SFDI
methods either require a sequence of several images with different illumination
patterns or, in the case of single snapshot optical properties (SSOP),
introduce artifacts and sacrifice accuracy. To avoid this tradeoff, we
introduce OxyGAN: a data-driven, content-aware method to estimate tissue
oxygenation directly from single structured light images using end-to-end
generative adversarial networks. Conventional SFDI is used to obtain ground
truth tissue oxygenation maps for ex vivo human esophagi, in vivo hands and
feet, and an in vivo pig colon sample under 659 nm and 851 nm sinusoidal
illumination. We benchmark OxyGAN by comparing to SSOP and to a two-step hybrid
technique that uses a previously-developed deep learning model to predict
optical properties followed by a physical model to calculate tissue
oxygenation. When tested on human feet, a cross-validated OxyGAN maps tissue
oxygenation with an accuracy of 96.5%. When applied to sample types not
included in the training set, such as human hands and pig colon, OxyGAN
achieves a 93.0% accuracy, demonstrating robustness to various tissue types. On
average, OxyGAN outperforms SSOP and a hybrid model in estimating tissue
oxygenation by 24.9% and 24.7%, respectively. Lastly, we optimize OxyGAN
inference so that oxygenation maps are computed ~10 times faster than previous
work, enabling video-rate imaging at 25 Hz. Due to its rapid acquisition and
processing speed, OxyGAN has the potential to enable real-time, high-fidelity
tissue oxygenation mapping that may be useful for many clinical applications.
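For context on the two-step baseline, the final physical-model step can be as simple as solving a 2x2 Beer-Lambert system at the two wavelengths for oxy- and deoxyhemoglobin concentrations. The extinction coefficients in this NumPy sketch are order-of-magnitude placeholders, not vetted literature values.

```python
# Sketch of the physical-model step: solve mua = E @ [HbO2, Hb] at two
# wavelengths, then StO2 = HbO2 / (HbO2 + Hb). Extinction values below
# are placeholders for illustration only.
import numpy as np

# Rows: wavelengths (659 nm, 851 nm); columns: (HbO2, Hb) extinction.
EXTINCTION = np.array([[0.08, 0.30],    # placeholder values, 1/(mm*mM)
                       [0.25, 0.18]])

def tissue_oxygenation(mua_659, mua_851):
    """mua_*: (H, W) absorption maps in 1/mm; returns StO2 in [0, 1]."""
    mua = np.stack([mua_659.ravel(), mua_851.ravel()])   # (2, N)
    hbo2, hb = np.linalg.solve(EXTINCTION, mua)          # (2, N) concentrations
    return (hbo2 / np.clip(hbo2 + hb, 1e-9, None)).reshape(mua_659.shape)
```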
Structured Prediction using cGANs with Fusion Discriminator
We propose the fusion discriminator, a single unified framework for
incorporating conditional information into a generative adversarial network
(GAN) for a variety of distinct structured prediction tasks, including image
synthesis, semantic segmentation, and depth estimation. Much like commonly used
convolutional neural network -- conditional Markov random field (CNN-CRF)
models, the proposed method is able to enforce higher-order consistency in the
model, but without being limited to a very specific class of potentials. The
method is conceptually simple and flexible, and our experimental results
demonstrate improvement on several diverse structured prediction tasks.
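The core idea is to encode the conditioning image and the predicted output in separate convolutional branches and fuse their feature maps inside the discriminator, rather than concatenating at the input. A minimal PyTorch sketch with illustrative layer sizes; fusion by element-wise addition is one plausible choice, not necessarily the paper's exact operator.

```python
# Sketch of a fusion discriminator: two branches encode the condition and
# the prediction separately; their features are fused before the final
# scoring layers. Layer sizes are illustrative assumptions.
import torch
import torch.nn as nn

class FusionDiscriminator(nn.Module):
    def __init__(self, cond_ch=3, pred_ch=1, width=64):
        super().__init__()
        def branch(cin):
            return nn.Sequential(
                nn.Conv2d(cin, width, 4, 2, 1), nn.LeakyReLU(0.2, True),
                nn.Conv2d(width, width * 2, 4, 2, 1), nn.LeakyReLU(0.2, True))
        self.cond_branch = branch(cond_ch)
        self.pred_branch = branch(pred_ch)
        self.head = nn.Sequential(
            nn.Conv2d(width * 2, width * 4, 4, 2, 1), nn.LeakyReLU(0.2, True),
            nn.Conv2d(width * 4, 1, 4, 1, 1))

    def forward(self, cond, pred):
        # Fuse the two feature streams by element-wise addition.
        return self.head(self.cond_branch(cond) + self.pred_branch(pred))
```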
Imaging human blood cells in vivo with oblique back-illumination capillaroscopy
We present a non-invasive, label-free method of imaging blood cells flowing
through human capillaries in vivo using oblique back-illumination
capillaroscopy (OBC). Green light illumination allows simultaneous phase and
absorption contrast, enhancing the ability to distinguish red and white blood
cells. Single-sided illumination through the objective lens enables 200 Hz
imaging with close illumination-detection separation and a simplified setup.
Phase contrast is optimized when the illumination axis is offset from the
detection axis by approximately 225 µm when imaging 80 µm deep in phantoms and
human ventral tongue. We demonstrate high-speed imaging of individual red blood
cells, white blood cells with sub-cellular detail, and platelets flowing
through capillaries and vessels in human tongue. A custom pneumatic cap placed
over the objective lens stabilizes the field of view, enabling longitudinal
imaging of a single capillary for up to seven minutes. We present high-quality
images of blood cells in individuals with Fitzpatrick skin phototypes II, IV,
and VI, showing that the technique is robust to high peripheral melanin
concentration. The signal quality, speed, simplicity, and robustness of this
approach underscores its potential for non-invasive blood cell counting.
Speckle illumination SFDI for projector-free optical property mapping
Spatial Frequency Domain Imaging can map tissue scattering and absorption
properties over a wide field of view, making it useful for clinical
applications such as wound assessment and surgical guidance. This technique has
previously required the projection of fully-characterized illumination
patterns. Here, we show that random and unknown speckle illumination can be
used to sample the modulation transfer function of tissues at known spatial
frequencies, allowing the quantitative mapping of optical properties with
simple laser diode illumination. We compute low- and high-spatial frequency
response parameters from the local power spectral density for each pixel and
use a look-up-table to accurately estimate absorption and scattering
coefficients in tissue phantoms, in vivo human hand, and ex vivo swine
esophagus. Because speckle patterns can be generated over a large depth of
field and field of view with simple coherent illumination, this approach may
enable optical property mapping in new form-factors and applications, including
endoscopy.
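One way to realize the per-pixel frequency analysis is a sliding windowed FFT whose power spectrum is integrated over a low and a high spatial-frequency band; the two responses then index a precomputed look-up table. The window size, step, and band edges in this NumPy sketch are illustrative assumptions.

```python
# Sketch of the local frequency analysis: estimate a windowed power
# spectral density across the image and integrate it in a low and a high
# spatial-frequency band. The (low, high) responses would then be mapped
# to absorption and scattering via a precomputed LUT (not shown).
import numpy as np

def local_band_power(img, win=32, step=8, f_lo=(0.0, 0.05), f_hi=(0.1, 0.3)):
    """Return (rows, cols, 2) low/high-band power for each window position."""
    fy = np.fft.fftfreq(win)[:, None]
    fx = np.fft.fftfreq(win)[None, :]
    fr = np.hypot(fy, fx)                  # radial frequency of each FFT bin
    lo = (fr >= f_lo[0]) & (fr < f_lo[1])
    hi = (fr >= f_hi[0]) & (fr < f_hi[1])
    hann = np.outer(np.hanning(win), np.hanning(win))
    rows = range(0, img.shape[0] - win, step)
    cols = range(0, img.shape[1] - win, step)
    out = np.zeros((len(rows), len(cols), 2))
    for i, r in enumerate(rows):
        for j, c in enumerate(cols):
            psd = np.abs(np.fft.fft2(img[r:r+win, c:c+win] * hann)) ** 2
            out[i, j] = psd[lo].sum(), psd[hi].sum()
    return out
```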
A Deep Learning Bidirectional Temporal Tracking Algorithm for Automated Blood Cell Counting from Non-invasive Capillaroscopy Videos
Oblique back-illumination capillaroscopy has recently been introduced as a
method for high-quality, non-invasive blood cell imaging in human capillaries.
To make this technique practical for clinical blood cell counting, solutions
for automatic processing of acquired videos are needed. Here, we take the first
step towards this goal, by introducing a deep learning multi-cell tracking
model, named CycleTrack, which achieves accurate blood cell counting from
capillaroscopic videos. CycleTrack combines two simple online tracking models,
SORT and CenterTrack, and is tailored to features of capillary blood cell flow.
Blood cells are tracked by displacement vectors in two opposing temporal
directions (forward- and backward-tracking) between consecutive frames. This
approach yields accurate tracking despite rapidly moving and deforming blood
cells. The proposed model outperforms other baseline trackers, achieving 65.57%
Multiple Object Tracking Accuracy and 73.95% ID F1 score on test videos.
Compared to manual blood cell counting, CycleTrack achieves 96.58 ± 2.43%
cell counting accuracy across 8 test videos of 1,000 frames each, versus
93.45% and 77.02% accuracy for standalone CenterTrack and SORT, with almost
no additional time expense. It takes 800 s to track and count approximately 8,000
blood cells from 9,600 frames captured in a typical one-minute video. Moreover,
the blood cell velocity measured by CycleTrack demonstrates a consistent,
pulsatile pattern within the physiological range of heart rate. Lastly, we
discuss future improvements for the CycleTrack framework, which would enable
clinical translation of the oblique back-illumination microscope towards a
real-time and non-invasive point-of-care blood cell counting and analyzing
technology.
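The bidirectional tracking step can be pictured as a single assignment problem whose cost combines forward- and backward-predicted positions between consecutive frames. This sketch uses the Hungarian algorithm and an illustrative gating rule; it is a simplification of the idea, not CycleTrack's exact design.

```python
# Sketch of bidirectional association: detections in frame t are matched
# to frame t+1 using a forward displacement predicted from frame t and a
# backward displacement predicted from frame t+1, combined into one
# assignment cost. Cost weighting and gating are illustrative.
import numpy as np
from scipy.optimize import linear_sum_assignment

def bidirectional_match(pos_t, fwd_disp, pos_t1, bwd_disp, max_dist=20.0):
    """pos_t: (N, 2); fwd_disp: (N, 2); pos_t1: (M, 2); bwd_disp: (M, 2)."""
    fwd_pred = pos_t + fwd_disp        # where each cell should land in t+1
    bwd_pred = pos_t1 + bwd_disp       # where each cell came from in t
    cost = (np.linalg.norm(fwd_pred[:, None] - pos_t1[None], axis=2)
            + np.linalg.norm(pos_t[:, None] - bwd_pred[None], axis=2))
    rows, cols = linear_sum_assignment(cost)
    # Gate out implausible matches (cells entering or leaving the view).
    keep = cost[rows, cols] < 2 * max_dist
    return list(zip(rows[keep], cols[keep]))
```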