A deep learning framework for quality assessment and restoration in video endoscopy
Endoscopy is a routine imaging technique used for both diagnosis and
minimally invasive surgical treatment. Artifacts such as motion blur, bubbles,
specular reflections, floating objects and pixel saturation impede the visual
interpretation and the automated analysis of endoscopy videos. Given the
widespread use of endoscopy in different clinical applications, we contend that
the robust and reliable identification of such artifacts and the automated
restoration of corrupted video frames together constitute a fundamental medical imaging problem.
Existing state-of-the-art methods only deal with the detection and restoration
of selected artifacts. However, endoscopy videos typically contain numerous
artifacts, which motivates a comprehensive solution.
We propose a fully automatic framework that can: 1) detect and classify six
different primary artifacts, 2) provide a quality score for each frame and 3)
restore mildly corrupted frames. To detect different artifacts, our framework
exploits a fast multi-scale, single-stage convolutional neural network detector.
We introduce a quality metric to assess frame quality and predict image
restoration success. Generative adversarial networks with carefully chosen
regularization are finally used to restore corrupted frames.
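The abstract does not give the quality-score formula, so the triage stage described above can only be sketched under assumptions. The sketch below assumes the per-frame score is one minus a severity-weighted fraction of the frame area covered by detected artifact boxes, and that frames are routed to keep/restore/discard by two thresholds; the artifact classes match those named in the abstract, but the weights, thresholds, and helper names (`frame_quality`, `triage`) are illustrative, not the paper's.

```python
# Hypothetical quality score and triage for detected artifact boxes.
# Severity weights and thresholds below are illustrative assumptions,
# not values from the paper.

ARTIFACT_WEIGHTS = {
    "blur": 1.0, "bubbles": 0.5, "specularity": 0.5,
    "saturation": 1.0, "debris": 0.7, "contrast": 0.8,
}

def frame_quality(detections, frame_w, frame_h):
    """detections: list of (class_name, x, y, w, h) boxes from the detector.
    Returns a score in [0, 1]; 1.0 means no detected artifacts."""
    area = float(frame_w * frame_h)
    penalty = 0.0
    for cls, x, y, w, h in detections:
        penalty += ARTIFACT_WEIGHTS.get(cls, 1.0) * (w * h) / area
    return max(0.0, 1.0 - penalty)

def triage(quality, keep_thr=0.9, restore_thr=0.5):
    """Route a frame: keep as-is, attempt restoration, or discard."""
    if quality >= keep_thr:
        return "keep"
    if quality >= restore_thr:
        return "restore"
    return "discard"
```

Under this scheme a frame with a single blur box covering a quarter of the image would score 0.75 and be sent to the restoration models rather than discarded.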
Our detector yields the highest mean average precision (mAP at 5% threshold)
of 49.0 and the lowest computational time of 88 ms, allowing for accurate
real-time processing. Our restoration models for blind deblurring, saturation
correction and inpainting demonstrate significant improvements over previous
methods. On a set of 10 test videos, we show that our approach preserves an
average of 68.7% of frames, 25% more than are retained from the raw
videos.
Continuous Facial Motion Deblurring
We introduce a novel framework for continuous facial motion deblurring that
restores the continuous sharp moment latent in a single motion-blurred face
image via a moment control factor. Although a motion-blurred image is the
accumulated signal of continuous sharp moments during the exposure time, most
existing single image deblurring approaches aim to restore a fixed number of
frames using multiple networks and training stages. To address this problem, we
propose a continuous facial motion deblurring network based on GAN (CFMD-GAN),
which is a novel framework for restoring the continuous moment latent in a
single motion-blurred face image with a single network and a single training
stage. To stabilize the network training, we train the generator to restore
continuous moments in the order determined by our facial motion-based
reordering process (FMR) utilizing domain-specific knowledge of the face.
Moreover, we propose an auxiliary regressor that helps our generator produce
more accurate images by estimating continuous sharp moments. Furthermore, we
introduce a control-adaptive (ContAda) block that performs spatially deformable
convolution and channel-wise attention as a function of the control factor.
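The ContAda block's channel-wise attention conditioned on the control factor can be sketched as follows. This is a hedged reconstruction from the one-sentence description only: the spatially deformable convolution branch is omitted, and the pooling, two-layer gating MLP, and all shapes are assumptions rather than the paper's exact design.

```python
import numpy as np

# Sketch of control-conditioned channel attention in the spirit of the
# ContAda block. The real block also performs spatially deformable
# convolution, which is not modeled here.

rng = np.random.default_rng(0)

def channel_attention(feat, control, w1, w2):
    """feat: (C, H, W) feature map; control: scalar moment control factor.
    Rescales each channel by a gate in (0, 1) computed from both the
    globally pooled features and the control factor."""
    pooled = feat.mean(axis=(1, 2))              # global average pool -> (C,)
    z = np.concatenate([pooled, [control]])      # condition on control factor
    hidden = np.maximum(w1 @ z, 0.0)             # ReLU hidden layer
    gate = 1.0 / (1.0 + np.exp(-(w2 @ hidden)))  # sigmoid -> (C,) gates
    return feat * gate[:, None, None]

C, H, W = 8, 4, 4
w1 = rng.normal(scale=0.1, size=(16, C + 1))     # assumed hidden width of 16
w2 = rng.normal(scale=0.1, size=(C, 16))
feat = rng.normal(size=(C, H, W))
out = channel_attention(feat, control=0.3, w1=w1, w2=w2)
```

Because the control factor enters the gating input, sweeping it changes the per-channel weighting, which is one plausible way a single network could modulate its output toward different sharp moments.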
Extensive experiments on the 300VW dataset demonstrate that the proposed
framework generates varying numbers of continuous output frames by varying the
moment control factor. Compared with the recent single-to-single image
deblurring networks trained on the same 300VW training set, the proposed
method shows superior performance in restoring the central sharp frame in
terms of perceptual metrics, including LPIPS, FID and ArcFace identity
distance. The proposed method also outperforms the existing single-to-video
deblurring method in both qualitative and quantitative comparisons.
- …