A deep learning framework for quality assessment and restoration in video endoscopy
Endoscopy is a routine imaging technique used for both diagnosis and
minimally invasive surgical treatment. Artifacts such as motion blur, bubbles,
specular reflections, floating objects and pixel saturation impede the visual
interpretation and the automated analysis of endoscopy videos. Given the
widespread use of endoscopy in different clinical applications, we contend that
the robust and reliable identification of such artifacts and the automated
restoration of corrupted video frames is a fundamental medical imaging problem.
Existing state-of-the-art methods only deal with the detection and restoration
of selected artifacts. However, endoscopy videos typically contain numerous
artifacts, which motivates a comprehensive solution.
We propose a fully automatic framework that can: 1) detect and classify six
different primary artifacts, 2) provide a quality score for each frame and 3)
restore mildly corrupted frames. To detect different artifacts, our framework
exploits a fast multi-scale, single-stage convolutional neural network detector.
We introduce a quality metric to assess frame quality and predict image
restoration success. Generative adversarial networks with carefully chosen
regularization are finally used to restore corrupted frames.
Our detector yields the highest mean average precision (mAP at 5% threshold)
of 49.0 and the lowest computational time of 88 ms allowing for accurate
real-time processing. Our restoration models for blind deblurring, saturation
correction and inpainting demonstrate significant improvements over previous
methods. On a set of 10 test videos we show that our approach preserves an
average of 68.7% of frames, which is 25% more than are retained from the raw
videos.
Comment: 14 pages
Photometric stereo for strong specular highlights
Photometric stereo (PS) is a fundamental technique in computer vision known
to produce 3-D shape with high accuracy. The setting of PS is defined by using
several input images of a static scene taken from one and the same camera
position but under varying illumination. The vast majority of studies in this
3-D reconstruction method assume orthographic projection for the camera model.
In addition, they mainly consider the Lambertian reflectance model as the way
that light scatters at surfaces. So, providing reliable PS results from real
world objects still remains a challenging task. We address 3-D reconstruction
by PS using a more realistic set of assumptions combining for the first time
the complete Blinn-Phong reflectance model and perspective projection. To this
end, we will compare two different methods of incorporating the perspective
projection into our model. Experiments are performed on both synthetic and real
world images. Note that our real-world experiments do not benefit from
laboratory conditions. The results show the high potential of our method even
for complex real world applications such as medical endoscopy images which may
include high amounts of specular highlights.
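The Blinn-Phong reflectance model named in this abstract can be illustrated with a minimal sketch. The abstract gives no formula or parameters, so the coefficients `k_d`, `k_s` and `shininess` below are hypothetical values chosen for illustration, and the function evaluates only the usual diffuse-plus-specular form for a single light, not the authors' full perspective photometric stereo model.

```python
import numpy as np

def normalize(v):
    """Return v scaled to unit length."""
    v = np.asarray(v, dtype=float)
    return v / np.linalg.norm(v)

def blinn_phong(normal, light_dir, view_dir, k_d=0.7, k_s=0.3, shininess=32.0):
    """Blinn-Phong intensity for one light; k_d, k_s, shininess are illustrative."""
    n, l, v = normalize(normal), normalize(light_dir), normalize(view_dir)
    n_dot_l = float(np.dot(n, l))
    if n_dot_l <= 0.0:
        return 0.0  # surface faces away from the light: no diffuse or specular term
    h = normalize(l + v)  # half-vector between light and viewing directions
    diffuse = k_d * n_dot_l  # Lambertian term
    specular = k_s * max(float(np.dot(n, h)), 0.0) ** shininess  # highlight term
    return diffuse + specular
```

Setting `k_s = 0` reduces the sketch to the Lambertian model that the abstract says most photometric stereo work assumes.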
Recovery of surface orientation from diffuse polarization
When unpolarized light is reflected from a smooth dielectric surface, it becomes partially polarized. This is due to the orientation of dipoles induced in the reflecting medium and applies to both specular and diffuse reflection. This paper is concerned with exploiting polarization by surface reflection, using images of smooth dielectric objects, to recover surface normals and, hence, height. This paper presents the underlying physics of polarization by reflection, starting with the Fresnel equations. These equations are used to interpret images taken with a linear polarizer and digital camera, revealing the shape of the objects. Experimental results are presented that illustrate that the technique is accurate near object limbs, as the theory predicts, with less precise, but still useful, results elsewhere. A detailed analysis of the accuracy of the technique for a variety of materials is presented. A method for estimating refractive indices using a laser and linear polarizer is also given.
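As a hedged illustration of how images taken through a rotating linear polarizer can be summarised, the sketch below fits the standard sinusoid I(t) = a + b cos(2t) + c sin(2t) to three measurements. The choice of polarizer angles (0, 45 and 90 degrees) and the function name are assumptions made for this example; the abstract does not say how many images the authors capture or how they fit them.

```python
import numpy as np

def polarization_from_three_angles(i0, i45, i90):
    """Degree and phase of linear polarization from intensities at 0, 45, 90 degrees.

    Assumes the mean intensity is non-zero at every pixel.
    """
    i0, i45, i90 = (np.asarray(x, dtype=float) for x in (i0, i45, i90))
    a = 0.5 * (i0 + i90)            # mean intensity, (Imax + Imin) / 2
    b = 0.5 * (i0 - i90)            # cos(2t) coefficient
    c = i45 - a                     # sin(2t) coefficient
    amplitude = np.hypot(b, c)      # (Imax - Imin) / 2
    degree = amplitude / a          # degree of polarization
    phase = 0.5 * np.arctan2(c, b)  # polarizer angle of maximum intensity, radians
    return degree, phase
```

The phase is only defined up to 180 degrees, and turning the degree of polarization into a zenith angle would additionally need the Fresnel-based relation and a refractive index, which this sketch does not attempt.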
Robust Principal Component Analysis?
This paper is about a curious phenomenon. Suppose we have a data matrix,
which is the superposition of a low-rank component and a sparse component. Can
we recover each component individually? We prove that under some suitable
assumptions, it is possible to recover both the low-rank and the sparse
components exactly by solving a very convenient convex program called Principal
Component Pursuit; among all feasible decompositions, simply minimize a
weighted combination of the nuclear norm and of the L1 norm. This suggests the
possibility of a principled approach to robust principal component analysis
since our methodology and results assert that one can recover the principal
components of a data matrix even though a positive fraction of its entries are
arbitrarily corrupted. This extends to the situation where a fraction of the
entries are missing as well. We discuss an algorithm for solving this
optimization problem, and present applications in the area of video
surveillance, where our methodology allows for the detection of objects in a
cluttered background, and in the area of face recognition, where it offers a
principled way of removing shadows and specularities in images of faces.
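The convex program described in this abstract (minimize a weighted sum of the nuclear norm of the low-rank part and the L1 norm of the sparse part) can be sketched with a simple augmented Lagrangian iteration. The abstract does not specify the solver, so the alternating scheme, the weight `lam = 1/sqrt(max(n1, n2))`, the step size `mu` and the stopping rule below are assumptions for illustration rather than a statement of the authors' algorithm.

```python
import numpy as np

def shrink(x, tau):
    """Soft-thresholding: proximal operator of tau times the L1 norm."""
    return np.sign(x) * np.maximum(np.abs(x) - tau, 0.0)

def svt(x, tau):
    """Singular value thresholding: proximal operator of tau times the nuclear norm."""
    u, s, vt = np.linalg.svd(x, full_matrices=False)
    return (u * shrink(s, tau)) @ vt

def principal_component_pursuit(m, max_iter=1000, tol=1e-7):
    """Split m into low-rank + sparse parts (illustrative parameter choices)."""
    m = np.asarray(m, dtype=float)
    n1, n2 = m.shape
    lam = 1.0 / np.sqrt(max(n1, n2))         # weight on the L1 term (assumed)
    mu = n1 * n2 / (4.0 * np.abs(m).sum())   # penalty parameter (assumed)
    low = np.zeros_like(m)
    sparse = np.zeros_like(m)
    dual = np.zeros_like(m)
    for _ in range(max_iter):
        low = svt(m - sparse + dual / mu, 1.0 / mu)      # update low-rank part
        sparse = shrink(m - low + dual / mu, lam / mu)   # update sparse part
        residual = m - low - sparse
        dual = dual + mu * residual                      # dual ascent step
        if np.linalg.norm(residual) <= tol * np.linalg.norm(m):
            break
    return low, sparse
```

In a video surveillance setting, each column of `m` would hold one vectorised frame, so that `low` approximates the static background and `sparse` the moving objects.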
Unsupervised Odometry and Depth Learning for Endoscopic Capsule Robots
In the last decade, many medical companies and research groups have tried to
convert passive capsule endoscopes as an emerging and minimally invasive
diagnostic technology into actively steerable endoscopic capsule robots which
will provide more intuitive disease detection, targeted drug delivery and
biopsy-like operations in the gastrointestinal (GI) tract. In this study, we
introduce a fully unsupervised, real-time odometry and depth learner for
monocular endoscopic capsule robots. We establish the supervision by warping
view sequences and assigning the re-projection minimization to the loss
function, which we adopt in multi-view pose estimation and single-view depth
estimation network. Detailed quantitative and qualitative analyses of the
proposed framework performed on non-rigidly deformable ex-vivo porcine stomach
datasets prove the effectiveness of the method in terms of motion estimation
and depth recovery.
Comment: submitted to IROS 201
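The view-warping supervision mentioned in this abstract can be illustrated by the geometric step it relies on: mapping each pixel of one view into a neighbouring view using a predicted depth map and a relative camera pose. The sketch below assumes a pinhole camera with intrinsic matrix `K` and a rigid pose `(R, t)`; these names and the function itself are hypothetical and are not taken from the paper.

```python
import numpy as np

def reproject_pixels(depth, K, R, t):
    """Map every pixel of a target view into a source view.

    depth: (h, w) depth map of the target view
    K:     (3, 3) pinhole intrinsics (assumed shared by both views)
    R, t:  rotation (3, 3) and translation (3,) from target to source camera
    Returns an array of shape (2, h, w) with source-view (u, v) coordinates.
    """
    h, w = depth.shape
    u, v = np.meshgrid(np.arange(w), np.arange(h))
    pixels = np.stack([u, v, np.ones_like(u)], axis=0).reshape(3, -1).astype(float)
    rays = np.linalg.inv(K) @ pixels                   # back-project to unit-depth rays
    points = rays * depth.reshape(1, -1)               # 3-D points in the target frame
    points_src = R @ points + np.asarray(t, dtype=float).reshape(3, 1)
    proj = K @ points_src                              # project into the source image
    uv = proj[:2] / proj[2:3]                          # perspective division
    return uv.reshape(2, h, w)
```

A re-projection loss would then sample the source image at these coordinates and compare the result with the target image, so that errors in either depth or pose show up as photometric differences.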
On Recognizing Transparent Objects in Domestic Environments Using Fusion of Multiple Sensor Modalities
Current object recognition methods fail on object sets that include
diffuse, reflective and transparent materials, although such materials are very common in
domestic scenarios. We show that a combination of cues from multiple sensor
modalities, including specular reflectance and unavailable depth information,
allows us to capture a larger subset of household objects by extending a state
of the art object recognition method. This leads to a significant increase in
robustness of recognition over a larger set of commonly used objects.
Comment: 12 pages
On the Subspace of Image Gradient Orientations
We introduce the notion of Principal Component Analysis (PCA) of image
gradient orientations. As image data is typically noisy, and the noise is
substantially different from Gaussian, traditional PCA of pixel intensities
very often fails to estimate reliably the low-dimensional subspace of a given
data population. We show that replacing intensities with gradient orientations
and the L2 norm with a cosine-based distance measure offers, to some
extent, a remedy to this problem. Our scheme requires the eigen-decomposition
of a covariance matrix and is as computationally efficient as standard
PCA. We demonstrate some of its favorable properties on robust subspace
estimation.
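One way to realise PCA with a cosine-based distance on gradient orientations is to map each orientation to a unit complex number and run an eigen-decomposition on the resulting data. The sketch below follows that idea; the complex-exponential mapping, the use of the small Gram matrix in place of the full covariance matrix, and all function names are assumptions for illustration, since the abstract does not spell out the construction.

```python
import numpy as np

def gradient_orientations(image):
    """Per-pixel gradient orientation in radians."""
    gy, gx = np.gradient(np.asarray(image, dtype=float))
    return np.arctan2(gy, gx)

def cosine_dissimilarity(phi_a, phi_b):
    """Cosine-based distance between two orientation maps."""
    return np.sum(1.0 - np.cos(phi_a - phi_b))

def igo_pca(images, n_components):
    """Principal subspace of orientation maps embedded as exp(j * phi)."""
    # Each column is one image; the squared Euclidean distance between columns
    # equals twice the cosine dissimilarity of the orientation maps.
    z = np.stack(
        [np.exp(1j * gradient_orientations(im)).ravel() for im in images], axis=1
    )
    gram = z.conj().T @ z                    # small Hermitian matrix (n_images squared)
    eigvals, eigvecs = np.linalg.eigh(gram)  # eigenvalues in ascending order
    order = np.argsort(eigvals)[::-1][:n_components]
    basis = z @ eigvecs[:, order] / np.sqrt(eigvals[order])  # orthonormal columns
    return basis, eigvals[order]
```

Because the Gram matrix has one row per image rather than one per pixel, the cost is comparable to the usual small-sample PCA trick, in line with the abstract's remark about computational efficiency.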