
    Deep Learning and Conditional Random Fields-based Depth Estimation and Topographical Reconstruction from Conventional Endoscopy

    Colorectal cancer is the fourth leading cause of cancer deaths worldwide and the second leading cause in the United States. The risk of colorectal cancer can be mitigated by the identification and removal of premalignant lesions through optical colonoscopy. Unfortunately, conventional colonoscopy misses more than 20% of the polyps that should be removed, due in part to poor contrast of lesion topography. Imaging tissue topography during a colonoscopy is difficult because of the size constraints of the endoscope and the deforming mucosa. Most existing methods make geometric assumptions or incorporate a priori information, which limits accuracy and sensitivity. In this paper, we present a method that avoids these restrictions, using a joint deep convolutional neural network-conditional random field (CNN-CRF) framework. Estimated depth is used to reconstruct the topography of the surface of the colon from a single image. We train the unary and pairwise potential functions of a CRF in a CNN on synthetic data, generated by developing an endoscope camera model and rendering over 100,000 images of an anatomically realistic colon. We validate our approach with real endoscopy images from a porcine colon, transferred to a synthetic-like domain, with ground truth from registered computed tomography measurements. The CNN-CRF approach estimates depths with a relative error of 0.152 for synthetic endoscopy images and 0.242 for real endoscopy images. We show that the estimated depth maps can be used to reconstruct the topography of the mucosa from conventional colonoscopy images. This approach can easily be integrated into existing endoscopy systems and provides a foundation for improving computer-aided algorithms for the detection, segmentation, and classification of lesions. Comment: 10 pages, 10 figures.
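    Below is a minimal, illustrative sketch (not the authors' implementation) of the joint CNN-CRF idea described above: a small CNN regresses per-pixel unary depths, and a simple appearance-weighted pairwise term smooths neighboring depths in a mean-field-style loop. The network sizes, the Gaussian pairwise kernel, and all parameter values are assumptions.

```python
# Sketch of a CNN-CRF depth estimator: CNN unary depths + appearance-weighted pairwise smoothing.
import torch
import torch.nn as nn

class UnaryCNN(nn.Module):
    """Small encoder-decoder that predicts a coarse depth map (the unary potentials)."""
    def __init__(self):
        super().__init__()
        self.enc = nn.Sequential(
            nn.Conv2d(3, 32, 3, stride=2, padding=1), nn.ReLU(),
            nn.Conv2d(32, 64, 3, stride=2, padding=1), nn.ReLU())
        self.dec = nn.Sequential(
            nn.ConvTranspose2d(64, 32, 4, stride=2, padding=1), nn.ReLU(),
            nn.ConvTranspose2d(32, 1, 4, stride=2, padding=1))
    def forward(self, rgb):
        return self.dec(self.enc(rgb))

def crf_refine(depth, rgb, iters=5, sigma_color=0.1, alpha=0.5):
    """Mean-field-style refinement: average each pixel's depth with its 4-neighbors,
    weighted by RGB similarity (a toy stand-in for the learned pairwise potential)."""
    for _ in range(iters):
        refined = torch.zeros_like(depth)
        weight_sum = torch.zeros_like(depth)
        for dy, dx in [(-1, 0), (1, 0), (0, -1), (0, 1)]:
            shifted_d = torch.roll(depth, shifts=(dy, dx), dims=(2, 3))
            shifted_rgb = torch.roll(rgb, shifts=(dy, dx), dims=(2, 3))
            w = torch.exp(-((rgb - shifted_rgb) ** 2).mean(1, keepdim=True) / (2 * sigma_color ** 2))
            refined += w * shifted_d
            weight_sum += w
        depth = (1 - alpha) * depth + alpha * refined / (weight_sum + 1e-8)
    return depth

rgb = torch.rand(1, 3, 64, 64)      # stand-in for an endoscopy frame
unary = UnaryCNN()(rgb)             # coarse depth from the CNN
depth = crf_refine(unary, rgb)      # pairwise-smoothed depth map
```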

    Unsupervised Reverse Domain Adaptation for Synthetic Medical Images via Adversarial Training

    To realize the full potential of deep learning for medical imaging, large annotated datasets are required for training. Such datasets are difficult to acquire because labeled medical images are not usually available due to privacy issues, lack of experts available for annotation, underrepresentation of rare conditions, and poor standardization. Lack of annotated data has been addressed in conventional vision applications using synthetic images refined via unsupervised adversarial training to look like real images. However, this approach is difficult to extend to general medical imaging because of the complex and diverse set of features found in real human tissues. We propose an alternative framework that uses a reverse flow, where adversarial training is used to make real medical images more like synthetic images, and hypothesize that clinically relevant features can be preserved via self-regularization. These domain-adapted images can then be accurately interpreted by networks trained on large datasets of synthetic medical images. We test this approach on the notoriously difficult task of depth estimation from endoscopy. We train a depth estimator on a large dataset of synthetic images generated using an accurate forward model of an endoscope and an anatomically realistic colon. This network predicts significantly better depths when given synthetic-like domain-adapted images than when given the original real images, confirming that the clinically relevant depth features are preserved. Comment: 10 pages, 8 figures.
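    The "reverse flow" can be sketched as follows (assumed architectures and loss weights, not the paper's code): a refiner maps real endoscopy frames toward the synthetic domain, a discriminator separates refined-real from synthetic images, and an L1 self-regularization term keeps the refined image close to its input so depth-relevant structure is preserved.

```python
# Sketch of reverse domain adaptation: real -> synthetic-like, with self-regularization.
import torch
import torch.nn as nn

refiner = nn.Sequential(                       # maps real frames toward the synthetic domain
    nn.Conv2d(3, 64, 3, padding=1), nn.ReLU(),
    nn.Conv2d(64, 3, 3, padding=1))
discriminator = nn.Sequential(                 # separates synthetic from refined-real images
    nn.Conv2d(3, 64, 4, stride=2, padding=1), nn.LeakyReLU(0.2),
    nn.Conv2d(64, 1, 4, stride=2, padding=1))

bce = nn.BCEWithLogitsLoss()
lam = 10.0                                     # self-regularization weight (assumed value)

real = torch.rand(4, 3, 128, 128)              # real endoscopy batch (stand-in data)
synthetic = torch.rand(4, 3, 128, 128)         # rendered colon batch (stand-in data)

refined = refiner(real)
d_refined = discriminator(refined)

# Refiner objective: fool the discriminator while staying close to the input image (L1).
g_loss = bce(d_refined, torch.ones_like(d_refined)) + lam * (refined - real).abs().mean()

# Discriminator objective: synthetic images are "real" targets, refined-real images are "fake".
d_loss = bce(discriminator(synthetic), torch.ones_like(d_refined)) + \
         bce(discriminator(refined.detach()), torch.zeros_like(d_refined))
```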

    Large dynamic range autorefraction with a low-cost diffuser wavefront sensor

    Wavefront sensing with a thin diffuser has emerged as a potential low-cost alternative to a lenslet array for aberrometry. Diffuser wavefront sensors (DWS) have previously relied on tracking speckle displacement and consequently require coherent illumination. Here we show that displacement of caustic patterns can be tracked to estimate wavefront gradient, enabling the use of incoherent light sources and large-dynamic-range wavefront measurements. We compare the precision of a DWS to a Shack-Hartmann wavefront sensor (SHWS) using coherent, partially coherent, and incoherent illumination in the application of autorefraction. We induce spherical and cylindrical errors in a model eye and use a multi-level Demons non-rigid registration algorithm to estimate caustic displacements relative to an emmetropic model eye. Compared with spherical error measurements from the SHWS under partially coherent illumination, the DWS demonstrates a ~5-fold improvement in dynamic range (-4.0 to +4.5 D vs. -22.0 to +19.5 D) while the resolution degrades by less than a factor of two (0.072 vs. 0.116 D), enabling a ~3-fold increase in the number of resolvable prescriptions (118 vs. 358). In addition to being 40x lower in cost, the unique, non-periodic nature of the caustic pattern formed by a diffuser enables a larger dynamic range of aberration measurements compared to a lenslet array. Comment: 18 pages, 11 figures.
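    A toy sketch of the DWS principle follows: local displacement of the caustic pattern relative to an emmetropic reference is proportional to the local wavefront slope (slope ≈ displacement / propagation distance). The brute-force block matching below stands in for the multi-level Demons registration used in the paper, and the patch size and propagation distance are placeholder values.

```python
# Sketch: estimate wavefront slopes from caustic-pattern displacements (toy block matching).
import numpy as np

def local_shift(ref_patch, test_patch, max_shift=5):
    """Integer-pixel shift that best aligns test_patch to ref_patch (brute-force search)."""
    best, best_err = (0, 0), np.inf
    for dy in range(-max_shift, max_shift + 1):
        for dx in range(-max_shift, max_shift + 1):
            shifted = np.roll(np.roll(test_patch, dy, axis=0), dx, axis=1)
            err = np.mean((shifted - ref_patch) ** 2)
            if err < best_err:
                best, best_err = (dy, dx), err
    return best

def wavefront_gradient(ref, img, patch=32, z_mm=3.0):
    """Wavefront slope maps (dW/dy, dW/dx) from per-patch caustic displacements."""
    gy = np.zeros((ref.shape[0] // patch, ref.shape[1] // patch))
    gx = np.zeros_like(gy)
    for i in range(gy.shape[0]):
        for j in range(gy.shape[1]):
            sl = np.s_[i * patch:(i + 1) * patch, j * patch:(j + 1) * patch]
            dy, dx = local_shift(ref[sl], img[sl])
            gy[i, j], gx[i, j] = dy / z_mm, dx / z_mm   # slope = displacement / distance
    return gy, gx

ref = np.random.rand(128, 128)       # caustics from an emmetropic model eye (stand-in)
img = np.roll(ref, 2, axis=1)        # an aberrated eye shifts the caustic pattern
gy, gx = wavefront_gradient(ref, img)
```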

    Rethinking Monocular Depth Estimation with Adversarial Training

    Monocular depth estimation is an extensively studied computer vision problem with a vast variety of applications. Deep learning-based methods have demonstrated promise for both supervised and unsupervised depth estimation from monocular images. Most existing approaches treat depth estimation as a regression problem with a local pixel-wise loss function. In this work, we move beyond existing approaches by using adversarial training to learn a context-aware, non-local loss function. Such an approach penalizes the joint configuration of predicted depth values at the patch level instead of the pixel level, which allows networks to incorporate more global information. In this framework, the generator learns a mapping from RGB images to their corresponding depth maps, while the discriminator learns to distinguish predicted depth and RGB pairs from ground-truth pairs. This conditional GAN depth estimation framework is stabilized using spectral normalization to prevent mode collapse when learning from diverse datasets. We test this approach with a diverse set of generators, including U-Net and joint CNN-CRF models. We benchmark the approach on the NYUv2, Make3D, and KITTI datasets, and observe that adversarial training reduces relative error by several fold, achieving state-of-the-art performance.
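    A minimal sketch of this conditional-GAN setup (illustrative stand-ins, not the benchmarked models): a generator maps RGB to depth, and a spectrally normalized discriminator scores (RGB, depth) pairs at the patch level, so the loss penalizes joint configurations of depths rather than individual pixels.

```python
# Sketch of adversarial depth estimation with a spectrally normalized patch discriminator.
import torch
import torch.nn as nn
from torch.nn.utils import spectral_norm

generator = nn.Sequential(                       # RGB -> depth (stand-in for U-Net / CNN-CRF)
    nn.Conv2d(3, 64, 3, padding=1), nn.ReLU(),
    nn.Conv2d(64, 1, 3, padding=1))

discriminator = nn.Sequential(                   # patch-level scores on (RGB, depth) pairs
    spectral_norm(nn.Conv2d(4, 64, 4, stride=2, padding=1)), nn.LeakyReLU(0.2),
    spectral_norm(nn.Conv2d(64, 1, 4, stride=2, padding=1)))

bce = nn.BCEWithLogitsLoss()
rgb = torch.rand(2, 3, 128, 128)                 # stand-in RGB batch
gt_depth = torch.rand(2, 1, 128, 128)            # stand-in ground-truth depth

pred = generator(rgb)
fake_pair = torch.cat([rgb, pred], dim=1)        # condition the discriminator on the RGB image
real_pair = torch.cat([rgb, gt_depth], dim=1)

d_fake = discriminator(fake_pair)
g_loss = bce(d_fake, torch.ones_like(d_fake)) + (pred - gt_depth).abs().mean()
d_loss = bce(discriminator(real_pair), torch.ones_like(d_fake)) + \
         bce(discriminator(fake_pair.detach()), torch.zeros_like(d_fake))
```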

    DeepLSR: a deep learning approach for laser speckle reduction

    Speckle artifacts degrade image quality in virtually all modalities that utilize coherent energy, including optical coherence tomography, reflectance confocal microscopy, ultrasound, and widefield imaging with laser illumination. We present an adversarial deep learning framework for laser speckle reduction, called DeepLSR (https://durr.jhu.edu/DeepLSR), that transforms images from a source domain of coherent illumination to a target domain of speckle-free, incoherent illumination. We apply this method to widefield images of objects and tissues illuminated with a multi-wavelength laser, using images illuminated with light-emitting diodes as ground truth. In images of gastrointestinal tissues, DeepLSR reduces laser speckle noise by 6.4 dB, compared to a 2.9 dB reduction from optimized non-local means processing, a 3.0 dB reduction from BM3D, and a 3.7 dB reduction from an optical speckle reducer utilizing an oscillating diffuser. Further, DeepLSR can be combined with optical speckle reduction to reduce speckle noise by 9.4 dB. This dramatic reduction in speckle noise may enable the use of coherent light sources in applications that require small illumination sources and high-quality imaging, including medical endoscopy.
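    The dB figures quoted above can be read as a ratio of residual noise power before and after speckle reduction relative to an incoherently illuminated reference; the exact metric is not specified in this summary, so the helper below is only one reasonable, assumed way to compute such a number.

```python
# Sketch: express speckle reduction in dB relative to a speckle-free reference image.
import numpy as np

def speckle_reduction_db(noisy, denoised, reference):
    """Reduction in mean-squared deviation from the speckle-free reference, in dB."""
    mse_before = np.mean((noisy - reference) ** 2)
    mse_after = np.mean((denoised - reference) ** 2)
    return 10 * np.log10(mse_before / mse_after)

reference = np.random.rand(256, 256)                      # LED-illuminated ground truth (stand-in)
noisy = reference + 0.2 * np.random.randn(256, 256)       # laser-illuminated, speckled image
denoised = reference + 0.05 * np.random.randn(256, 256)   # e.g., output of a speckle-reduction model
print(f"speckle reduction: {speckle_reduction_db(noisy, denoised, reference):.1f} dB")
```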

    Rapid tissue oxygenation mapping from snapshot structured-light images with adversarial deep learning

    Spatial frequency domain imaging (SFDI) is a powerful technique for mapping tissue oxygen saturation over a wide field of view. However, current SFDI methods either require a sequence of several images with different illumination patterns or, in the case of single snapshot optical properties (SSOP), introduce artifacts and sacrifice accuracy. To avoid this tradeoff, we introduce OxyGAN: a data-driven, content-aware method to estimate tissue oxygenation directly from single structured-light images using end-to-end generative adversarial networks. Conventional SFDI is used to obtain ground-truth tissue oxygenation maps for ex vivo human esophagi, in vivo hands and feet, and an in vivo pig colon sample under 659 nm and 851 nm sinusoidal illumination. We benchmark OxyGAN against SSOP and against a two-step hybrid technique that uses a previously developed deep learning model to predict optical properties, followed by a physical model to calculate tissue oxygenation. When tested on human feet, OxyGAN maps tissue oxygenation with a cross-validated accuracy of 96.5%. When applied to sample types not included in the training set, such as human hands and pig colon, OxyGAN achieves 93.0% accuracy, demonstrating robustness to various tissue types. On average, OxyGAN outperforms SSOP and the hybrid model in estimating tissue oxygenation by 24.9% and 24.7%, respectively. Lastly, we optimize OxyGAN inference so that oxygenation maps are computed ~10 times faster than in previous work, enabling video-rate, 25 Hz imaging. Due to its rapid acquisition and processing speed, OxyGAN has the potential to enable real-time, high-fidelity tissue oxygenation mapping that may be useful for many clinical applications.
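    The two-step hybrid baseline mentioned above can be sketched as a linear unmixing step: given absorption maps at the two illumination wavelengths, solve a 2x2 system for oxy- and deoxyhemoglobin and report StO2. The extinction coefficients below are placeholders, not the tabulated values the authors would use.

```python
# Sketch: per-pixel StO2 from two-wavelength absorption maps via linear unmixing.
import numpy as np

# Rows: wavelengths (659 nm, 851 nm); columns: [HbO2, Hb] extinction (placeholder values/units).
E = np.array([[0.1, 0.9],
              [0.3, 0.2]])

def oxygenation(mua_659, mua_851):
    """Per-pixel oxygen saturation from absorption maps at the two wavelengths."""
    mua = np.stack([mua_659.ravel(), mua_851.ravel()])   # shape (2, Npix)
    conc = np.linalg.solve(E, mua)                        # [HbO2; Hb] concentration per pixel
    sto2 = conc[0] / (conc[0] + conc[1])
    return sto2.reshape(mua_659.shape)

mua_659 = np.full((4, 4), 0.05)      # stand-in absorption maps (mm^-1)
mua_851 = np.full((4, 4), 0.04)
print(oxygenation(mua_659, mua_851))
```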

    Structured Prediction using cGANs with Fusion Discriminator

    We propose the fusion discriminator, a single unified framework for incorporating conditional information into a generative adversarial network (GAN) for a variety of distinct structured prediction tasks, including image synthesis, semantic segmentation, and depth estimation. Much like commonly used convolutional neural network-conditional Markov random field (CNN-CRF) models, the proposed method is able to enforce higher-order consistency in the model, but without being limited to a very specific class of potentials. The method is conceptually simple and flexible, and our experimental results demonstrate improvement on several diverse structured prediction tasks. Comment: 13 pages, 5 figures, 3 tables.
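    The fusion idea can be illustrated as follows (layer sizes and fusion-by-summation are assumptions): rather than concatenating the conditioning image with the output at the discriminator input, each is encoded by its own branch and their feature maps are fused at an intermediate layer before the real/fake score.

```python
# Sketch of a fusion discriminator: separate branches whose features are fused mid-network.
import torch
import torch.nn as nn

class FusionDiscriminator(nn.Module):
    def __init__(self):
        super().__init__()
        self.cond_branch = nn.Sequential(          # encodes the conditioning RGB image
            nn.Conv2d(3, 64, 4, stride=2, padding=1), nn.LeakyReLU(0.2))
        self.out_branch = nn.Sequential(           # encodes the predicted or real output map
            nn.Conv2d(1, 64, 4, stride=2, padding=1), nn.LeakyReLU(0.2))
        self.head = nn.Sequential(                 # operates on the fused features
            nn.Conv2d(64, 128, 4, stride=2, padding=1), nn.LeakyReLU(0.2),
            nn.Conv2d(128, 1, 4, stride=1, padding=1))
    def forward(self, cond_rgb, output_map):
        fused = self.cond_branch(cond_rgb) + self.out_branch(output_map)  # feature-level fusion
        return self.head(fused)

d = FusionDiscriminator()
score = d(torch.rand(1, 3, 128, 128), torch.rand(1, 1, 128, 128))  # patch-level scores
```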

    Imaging human blood cells in vivo with oblique back-illumination capillaroscopy

    We present a non-invasive, label-free method of imaging blood cells flowing through human capillaries in vivo using oblique back-illumination capillaroscopy (OBC). Green light illumination allows simultaneous phase and absorption contrast, enhancing the ability to distinguish red and white blood cells. Single-sided illumination through the objective lens enables 200 Hz imaging with close illumination-detection separation and a simplified setup. Phase contrast is optimized when the illumination axis is offset from the detection axis by approximately 225 µm when imaging 80 µm deep in phantoms and human ventral tongue. We demonstrate high-speed imaging of individual red blood cells, white blood cells with sub-cellular detail, and platelets flowing through capillaries and vessels in the human tongue. A custom pneumatic cap placed over the objective lens stabilizes the field of view, enabling longitudinal imaging of a single capillary for up to seven minutes. We present high-quality images of blood cells in individuals with Fitzpatrick skin phototypes II, IV, and VI, showing that the technique is robust to high peripheral melanin concentration. The signal quality, speed, simplicity, and robustness of this approach underscore its potential for non-invasive blood cell counting. Comment: 10 pages, 7 figures.

    Speckle illumination SFDI for projector-free optical property mapping

    Spatial frequency domain imaging (SFDI) can map tissue scattering and absorption properties over a wide field of view, making it useful for clinical applications such as wound assessment and surgical guidance. This technique has previously required the projection of fully characterized illumination patterns. Here, we show that random and unknown speckle illumination can be used to sample the modulation transfer function of tissues at known spatial frequencies, allowing the quantitative mapping of optical properties with simple laser diode illumination. We compute low- and high-spatial-frequency response parameters from the local power spectral density at each pixel and use a look-up table to accurately estimate absorption and scattering coefficients in tissue phantoms, an in vivo human hand, and ex vivo swine esophagus. Because speckle patterns can be generated over a large depth of field and field of view with simple coherent illumination, this approach may enable optical property mapping in new form factors and applications, including endoscopy.
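    A toy sketch of the processing chain described above (window, frequencies, and look-up-table values are all placeholders): for each image region, the windowed power spectral density of the speckle-illuminated image is sampled at a low and a high spatial frequency, and the response pair is mapped to absorption and reduced scattering through a precomputed look-up table.

```python
# Sketch: local power spectral density -> low/high frequency response -> LUT optical properties.
import numpy as np

def local_response(patch, fx_low=0.0, fx_high=0.2, pixel_mm=0.1):
    """Low- and high-spatial-frequency response from the windowed power spectral density."""
    win = np.outer(np.hanning(patch.shape[0]), np.hanning(patch.shape[1]))
    psd = np.abs(np.fft.rfft2(patch * win)) ** 2
    freqs = np.fft.rfftfreq(patch.shape[1], d=pixel_mm)   # cycles/mm along one axis
    low = psd[:, np.argmin(np.abs(freqs - fx_low))].mean()
    high = psd[:, np.argmin(np.abs(freqs - fx_high))].mean()
    return low, high

# Toy look-up table mapping (low, high) responses to (mu_a, mu_s'); values are placeholders,
# not a calibrated table.
lut_responses = np.array([[1.0, 0.10], [0.8, 0.20], [0.6, 0.30], [0.4, 0.40]])
lut_props = np.array([[0.01, 1.0], [0.02, 1.2], [0.03, 1.5], [0.05, 2.0]])

patch = np.random.rand(64, 64)                            # speckle-illuminated tissue patch (stand-in)
low, high = local_response(patch)
mua, musp = lut_props[np.argmin(np.sum((lut_responses - np.array([low, high])) ** 2, axis=1))]
```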

    A Deep Learning Bidirectional Temporal Tracking Algorithm for Automated Blood Cell Counting from Non-invasive Capillaroscopy Videos

    Oblique back-illumination capillaroscopy has recently been introduced as a method for high-quality, non-invasive blood cell imaging in human capillaries. To make this technique practical for clinical blood cell counting, solutions for automatic processing of acquired videos are needed. Here, we take the first step towards this goal by introducing a deep learning multi-cell tracking model, named CycleTrack, which achieves accurate blood cell counting from capillaroscopic videos. CycleTrack combines two simple online tracking models, SORT and CenterTrack, and is tailored to the features of capillary blood cell flow. Blood cells are tracked by displacement vectors in two opposing temporal directions (forward- and backward-tracking) between consecutive frames. This approach yields accurate tracking despite rapidly moving and deforming blood cells. The proposed model outperforms other baseline trackers, achieving 65.57% Multiple Object Tracking Accuracy and a 73.95% ID F1 score on test videos. Compared to manual blood cell counting, CycleTrack achieves 96.58 ± 2.43% cell counting accuracy across 8 test videos of 1,000 frames each, versus 93.45% and 77.02% accuracy for standalone CenterTrack and SORT, with almost no additional time expense. It takes 800 s to track and count approximately 8,000 blood cells from the 9,600 frames captured in a typical one-minute video. Moreover, the blood cell velocity measured by CycleTrack demonstrates a consistent, pulsatile pattern within the physiological range of heart rates. Lastly, we discuss future improvements to the CycleTrack framework that would enable clinical translation of the oblique back-illumination microscope towards a real-time, non-invasive, point-of-care blood cell counting and analysis technology. Comment: 10 pages, 6 figures.
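    The bidirectional matching idea can be illustrated with a toy centroid matcher (this is not the trained CenterTrack/SORT combination): candidate correspondences are formed both forward (frame t to t+1) and backward (t+1 to t), and only mutually consistent pairs within a distance threshold are accepted as the same cell.

```python
# Sketch of bidirectional (forward + backward) matching of cell centroids between frames.
import numpy as np

def nearest(src, dst):
    """For each point in src, the index of its nearest point in dst."""
    d = np.linalg.norm(src[:, None, :] - dst[None, :, :], axis=2)
    return d.argmin(axis=1)

def bidirectional_match(cells_t, cells_t1, max_dist=15.0):
    fwd = nearest(cells_t, cells_t1)        # forward-tracking candidates (t -> t+1)
    bwd = nearest(cells_t1, cells_t)        # backward-tracking candidates (t+1 -> t)
    matches = []
    for i, j in enumerate(fwd):
        if bwd[j] == i and np.linalg.norm(cells_t[i] - cells_t1[j]) <= max_dist:
            matches.append((i, j))          # mutually consistent pair -> same cell ID
    return matches

cells_t = np.array([[10.0, 12.0], [40.0, 45.0]])     # centroids detected in frame t (stand-in)
cells_t1 = np.array([[14.0, 13.0], [46.0, 47.0]])    # centroids detected in frame t+1
print(bidirectional_match(cells_t, cells_t1))        # [(0, 0), (1, 1)]
```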