923 research outputs found

    A deep learning framework for quality assessment and restoration in video endoscopy

    Full text link
    Endoscopy is a routine imaging technique used for both diagnosis and minimally invasive surgical treatment. Artifacts such as motion blur, bubbles, specular reflections, floating objects and pixel saturation impede the visual interpretation and the automated analysis of endoscopy videos. Given the widespread use of endoscopy in different clinical applications, we contend that the robust and reliable identification of such artifacts and the automated restoration of corrupted video frames is a fundamental medical imaging problem. Existing state-of-the-art methods only deal with the detection and restoration of selected artifacts. However, typically endoscopy videos contain numerous artifacts which motivates to establish a comprehensive solution. We propose a fully automatic framework that can: 1) detect and classify six different primary artifacts, 2) provide a quality score for each frame and 3) restore mildly corrupted frames. To detect different artifacts our framework exploits fast multi-scale, single stage convolutional neural network detector. We introduce a quality metric to assess frame quality and predict image restoration success. Generative adversarial networks with carefully chosen regularization are finally used to restore corrupted frames. Our detector yields the highest mean average precision (mAP at 5% threshold) of 49.0 and the lowest computational time of 88 ms allowing for accurate real-time processing. Our restoration models for blind deblurring, saturation correction and inpainting demonstrate significant improvements over previous methods. On a set of 10 test videos we show that our approach preserves an average of 68.7% which is 25% more frames than that retained from the raw videos.Comment: 14 page

    Photometric stereo for strong specular highlights

    Full text link
    Photometric stereo (PS) is a fundamental technique in computer vision known to produce 3-D shape with high accuracy. The setting of PS is defined by using several input images of a static scene taken from one and the same camera position but under varying illumination. The vast majority of studies in this 3-D reconstruction method assume orthographic projection for the camera model. In addition, they mainly consider the Lambertian reflectance model as the way that light scatters at surfaces. So, providing reliable PS results from real world objects still remains a challenging task. We address 3-D reconstruction by PS using a more realistic set of assumptions combining for the first time the complete Blinn-Phong reflectance model and perspective projection. To this end, we will compare two different methods of incorporating the perspective projection into our model. Experiments are performed on both synthetic and real world images. Note that our real-world experiments do not benefit from laboratory conditions. The results show the high potential of our method even for complex real world applications such as medical endoscopy images which may include high amounts of specular highlights

    Recovery of surface orientation from diffuse polarization

    Get PDF
    When unpolarized light is reflected from a smooth dielectric surface, it becomes partially polarized. This is due to the orientation of dipoles induced in the reflecting medium and applies to both specular and diffuse reflection. This paper is concerned with exploiting polarization by surface reflection, using images of smooth dielectric objects, to recover surface normals and, hence, height. This paper presents the underlying physics of polarization by reflection, starting with the Fresnel equations. These equations are used to interpret images taken with a linear polarizer and digital camera, revealing the shape of the objects. Experimental results are presented that illustrate that the technique is accurate near object limbs, as the theory predicts, with less precise, but still useful, results elsewhere. A detailed analysis of the accuracy of the technique for a variety of materials is presented. A method for estimating refractive indices using a laser and linear polarizer is also given

    Robust Principal Component Analysis?

    Full text link
    This paper is about a curious phenomenon. Suppose we have a data matrix, which is the superposition of a low-rank component and a sparse component. Can we recover each component individually? We prove that under some suitable assumptions, it is possible to recover both the low-rank and the sparse components exactly by solving a very convenient convex program called Principal Component Pursuit; among all feasible decompositions, simply minimize a weighted combination of the nuclear norm and of the L1 norm. This suggests the possibility of a principled approach to robust principal component analysis since our methodology and results assert that one can recover the principal components of a data matrix even though a positive fraction of its entries are arbitrarily corrupted. This extends to the situation where a fraction of the entries are missing as well. We discuss an algorithm for solving this optimization problem, and present applications in the area of video surveillance, where our methodology allows for the detection of objects in a cluttered background, and in the area of face recognition, where it offers a principled way of removing shadows and specularities in images of faces

    Unsupervised Odometry and Depth Learning for Endoscopic Capsule Robots

    Full text link
    In the last decade, many medical companies and research groups have tried to convert passive capsule endoscopes as an emerging and minimally invasive diagnostic technology into actively steerable endoscopic capsule robots which will provide more intuitive disease detection, targeted drug delivery and biopsy-like operations in the gastrointestinal(GI) tract. In this study, we introduce a fully unsupervised, real-time odometry and depth learner for monocular endoscopic capsule robots. We establish the supervision by warping view sequences and assigning the re-projection minimization to the loss function, which we adopt in multi-view pose estimation and single-view depth estimation network. Detailed quantitative and qualitative analyses of the proposed framework performed on non-rigidly deformable ex-vivo porcine stomach datasets proves the effectiveness of the method in terms of motion estimation and depth recovery.Comment: submitted to IROS 201

    On Recognizing Transparent Objects in Domestic Environments Using Fusion of Multiple Sensor Modalities

    Full text link
    Current object recognition methods fail on object sets that include both diffuse, reflective and transparent materials, although they are very common in domestic scenarios. We show that a combination of cues from multiple sensor modalities, including specular reflectance and unavailable depth information, allows us to capture a larger subset of household objects by extending a state of the art object recognition method. This leads to a significant increase in robustness of recognition over a larger set of commonly used objects.Comment: 12 page

    On the Subspace of Image Gradient Orientations

    Full text link
    We introduce the notion of Principal Component Analysis (PCA) of image gradient orientations. As image data is typically noisy, but noise is substantially different from Gaussian, traditional PCA of pixel intensities very often fails to estimate reliably the low-dimensional subspace of a given data population. We show that replacing intensities with gradient orientations and the â„“2\ell_2 norm with a cosine-based distance measure offers, to some extend, a remedy to this problem. Our scheme requires the eigen-decomposition of a covariance matrix and is as computationally efficient as standard â„“2\ell_2 PCA. We demonstrate some of its favorable properties on robust subspace estimation
    • …
    corecore