Dense Vision in Image-guided Surgery
Image-guided surgery needs an efficient and effective camera tracking system in order to perform augmented reality, overlaying preoperative models or labelling cancerous tissue on the 2D video images of the surgical scene. Tracking in endoscopic/laparoscopic scenes, however, is an extremely difficult task, primarily due to tissue deformation, instrument intrusion into the surgical scene and the presence of specular highlights. State-of-the-art feature-based SLAM systems such as PTAM fail to track such scenes because the number of good features to track is very limited, and smoke or instrument motion causes feature-based tracking to fail immediately.
The work of this thesis provides a systematic approach to this problem using dense vision. We initially attempted to register a 3D preoperative model to multiple 2D endoscopic/laparoscopic images using a dense method, but this approach did not perform well. We subsequently proposed stereo reconstruction to obtain the 3D structure of the scene directly. By combining the dense reconstructed model with robust estimation, we demonstrate that dense stereo tracking remains robust even in extremely challenging endoscopic/laparoscopic scenes.
Several validation experiments have been conducted in this thesis. The proposed stereo reconstruction algorithm achieves state-of-the-art results on several publicly available ground-truth datasets. Furthermore, the proposed robust dense stereo tracking algorithm has proved highly accurate in a synthetic environment (< 0.1 mm RMSE) and qualitatively extremely robust when applied to real scenes from robot-assisted laparoscopic prostatectomy (RALP). This is an important step toward accurate image-guided laparoscopic surgery.
Photometric Depth Super-Resolution
This study explores the use of photometric techniques (shape-from-shading and uncalibrated photometric stereo) for upsampling the low-resolution depth map from an RGB-D sensor to the higher resolution of the companion RGB image. A single-shot variational approach is first put forward, which is effective as long as the target's reflectance is piecewise-constant. It is then shown that this dependency upon a specific reflectance model can be relaxed by focusing on a specific class of objects (e.g., faces) and delegating reflectance estimation to a deep neural network. A multi-shot strategy based on randomly varying lighting conditions is eventually discussed. It requires no training or prior on the reflectance, but this comes at the price of a dedicated acquisition setup. Both quantitative and qualitative evaluations illustrate the effectiveness of the proposed methods on synthetic and real-world scenarios.

Comment: IEEE Transactions on Pattern Analysis and Machine Intelligence (T-PAMI), 2019. First three authors contributed equally.
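For context, the baseline that such photometric refinement improves upon is plain interpolation of the low-resolution depth map, which is smooth but detail-free. A minimal sketch of bilinear upsampling by an integer factor (the helper name is illustrative, not from the paper):

```python
import numpy as np

def upsample_depth_bilinear(depth, scale):
    """Bilinearly upsample a low-resolution depth map by an integer factor.

    This is the naive interpolation baseline: it matches the RGB image's
    resolution but adds no fine geometric detail, which shading cues can
    then be used to recover.
    """
    h, w = depth.shape
    H, W = h * scale, w * scale
    # Fractional coordinates in the low-resolution grid for each high-res pixel.
    ys = np.linspace(0, h - 1, H)
    xs = np.linspace(0, w - 1, W)
    y0 = np.floor(ys).astype(int); y1 = np.minimum(y0 + 1, h - 1)
    x0 = np.floor(xs).astype(int); x1 = np.minimum(x0 + 1, w - 1)
    wy = (ys - y0)[:, None]
    wx = (xs - x0)[None, :]
    top = depth[np.ix_(y0, x0)] * (1 - wx) + depth[np.ix_(y0, x1)] * wx
    bot = depth[np.ix_(y1, x0)] * (1 - wx) + depth[np.ix_(y1, x1)] * wx
    return top * (1 - wy) + bot * wy
```

On a 2x2 depth map with scale 2 this returns a 4x4 map whose corner values equal the original corners, with intermediate depths interpolated linearly.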
Detailed and Practical 3D Reconstruction with Advanced Photometric Stereo Modelling
3D object reconstruction has always been one of the main objectives of computer vision. After many decades of research, most techniques are still unable to recover high-resolution surfaces, especially for objects with limited surface texture. Moreover, most shiny materials are particularly hard to reconstruct.
Photometric Stereo (PS), which operates by capturing multiple images under changing illumination, has traditionally been one of the most successful techniques at recovering a large amount of surface detail, by exploiting the relationship between shading and local shape. However, PS has been highly impractical because most approaches are only applicable in a very controlled lab setting and are limited to objects exhibiting diffuse reflection.
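The shading-shape relationship exploited here is, in its classical Lambertian form, the per-pixel relation I = (albedo * n) . l. With at least three images under known lights this is a linear least-squares problem. A minimal sketch of that classical baseline (not the extended models contributed by the thesis; the function name is illustrative):

```python
import numpy as np

def lambertian_photometric_stereo(intensities, lights):
    """Recover per-pixel albedo and surface normals from m >= 3 images.

    intensities: (m, n) matrix, one row per image, one column per pixel.
    lights:      (m, 3) known unit lighting directions, one row per image.
    Solves I = L @ G in the least-squares sense, where G = albedo * normal.
    """
    G, *_ = np.linalg.lstsq(lights, intensities, rcond=None)  # (3, n)
    albedo = np.linalg.norm(G, axis=0)
    normals = G / np.maximum(albedo, 1e-12)  # guard against zero albedo
    return albedo, normals
```

The thesis's contributions relax exactly the assumptions baked into this sketch: known, ambient-free lighting of known brightness and purely diffuse reflectance.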
Nevertheless, recent advances in differential modelling have made sophisticated Photometric Stereo models tractable, and variational optimisations for these kinds of models show remarkable resilience to real-world imperfections such as non-Gaussian noise and other outliers. Thus, a highly accurate, photometric-based reconstruction system is now possible.
The contribution of this thesis is threefold. First, the Photometric Stereo model is extended to deal with arbitrary ambient lighting, a step towards acquisition outside a fully controlled lab setting. Secondly, the need for a priori knowledge of the light-source brightness and attenuation characteristics is relaxed: an alternating optimisation procedure is proposed that estimates these parameters. This extension allows quick acquisition with inexpensive LEDs that exhibit unpredictable illumination characteristics (flickering, etc.). Finally, a volumetric parameterisation is proposed that tackles the multi-view Photometric Stereo problem in a similar manner, within a simple, unified differential model. This final extension allows complete object reconstruction, merging information from multiple images taken from multiple viewpoints under variable illumination.
The theoretical work in this thesis is experimentally evaluated in a number of challenging real-world experiments, with data captured by custom-made hardware. In addition, the generality of the proposed models is demonstrated by presenting a differential model for the shape-from-polarisation problem, which leads to a unified optimisation problem fusing information from both methods. This allows the acquisition of geometric information about objects, such as semi-transparent glass, that have hitherto been hard to deal with.
Surface analysis and fingerprint recognition from multi-light imaging collections
Multi-light imaging captures a scene from a fixed viewpoint through multiple photographs, each of which is illuminated from a different direction. Every image reveals information about the surface, with the intensity reflected from each point being measured for all lighting directions. The captured images are known as multi-light image collections (MLICs), and a variety of techniques have been developed over recent decades to extract information from them. These techniques include shape-from-shading, photometric stereo and reflectance transformation imaging (RTI). Because the camera does not move, pixel coordinates in one image of an MLIC correspond to exactly the same position on the surface across all images in the MLIC.
In chapter 1 we review the literature relevant to the methods presented in this thesis, describe different types of reflections and surface types, and explain the multi-light imaging process. In chapter 2 we present a novel automated RTI method which requires no calibration equipment (i.e. the shiny reference spheres or 3D-printed structures that other methods require) and automatically computes the lighting direction while compensating for non-uniform illumination.
Then in chapter 3 we describe our novel MLIC method, termed Remote Extraction of Latent Fingerprints (RELF), which segments each multi-light imaging photograph into superpixels (small groups of pixels) and uses a neural network classifier to determine whether or not each superpixel contains fingerprint. The RELF algorithm then mosaics the superpixels classified as fingerprint in order to obtain a complete latent print image, entirely contactlessly.
In chapter 4 we detail our work with the Metropolitan Police Service (MPS) UK, who described their needs and requirements to us; this helped us create a prototype RELF imaging device, which is now being tested by MPS officers validating the quality of the latent prints extracted using our technique.
In chapter 5 we further develop our multi-light imaging latent fingerprint technique to extract latent prints from curved surfaces and automatically correct for surface-curvature distortions. We have a patent pending for this method.
Integrating Shape-from-Shading & Stereopsis
This thesis is concerned with inferring scene shape by combining two specific techniques: shape-from-shading and stereopsis. Shape-from-shading calculates shape using the lighting equation, which maps surface orientation and lighting information to irradiance. As irradiance and lighting information are provided, this is the problem of inverting a many-to-one function to get surface orientation. Surface orientation may then be integrated to get depth. Stereopsis matches pixels between two images of the same scene taken from different locations - this is the correspondence problem. Depth can then be calculated using camera calibration information, via triangulation. These methods both fail for certain inputs; the advantage of combining them is that where one fails the other may continue to work. Notably, shape-from-shading requires a smoothly shaded surface, without texture, whilst stereopsis requires texture - each works where the other does not.

The first work of this thesis tackles the problem directly. A novel modular solution is proposed to combine both methods; the combination is itself performed using Gaussian belief propagation. This modular approach highlights missing and weak modules; the rest of the thesis is then concerned with providing a new module and an improved module. The improved module, given in the second research chapter, consists of a new shape-from-shading algorithm. It again uses belief propagation, but this time with directional statistics to represent surface orientation. Message passing is performed using a novel analytical method, which makes this algorithm particularly fast. In the final research chapter a new module is provided, to estimate the light source direction. Without such a module the user of the system has to provide it, which is tedious, error-prone, and impedes automation. It is a probabilistic method that uniquely estimates the light source direction using a stereo pair as input.
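The triangulation step mentioned above has a particularly simple form for a rectified stereo pair: with focal length f (in pixels) and camera baseline B, a match with horizontal disparity d lies at depth Z = f * B / d. A minimal sketch of this relation (a textbook illustration, not the thesis's pipeline):

```python
def depth_from_disparity(disparity_px, focal_px, baseline_m):
    """Triangulate depth for a rectified stereo pair.

    For two cameras with focal length f (pixels) separated by baseline B
    (metres), a pixel matched with horizontal disparity d (pixels) lies at
    depth Z = f * B / d. Larger disparity means the point is closer, and
    zero disparity corresponds to a point at infinity.
    """
    if disparity_px <= 0:
        raise ValueError("disparity must be positive for a finite depth")
    return focal_px * baseline_m / disparity_px
```

For example, with a 500-pixel focal length and a 10 cm baseline, a 100-pixel disparity corresponds to a depth of 0.5 m; this sensitivity to correspondence accuracy is why untextured regions, where matching fails, are handed to shape-from-shading instead.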
Image-based surface reflectance remapping for consistent and tool-independent material appearance
Physically-based rendering in computer graphics requires knowledge of material properties, in addition to 3D shapes, textures and colours, in order to solve the rendering equation. A number of material models have been developed, since no single model is currently able to reproduce the full range of available materials. Although only a few material models have been widely adopted in current rendering systems, the lack of standardisation causes several issues in the 3D modelling workflow, leading to a heavy tool dependency of material appearance. In industry, final decisions about products are often based on a virtual prototype, a crucial step in the production pipeline, usually developed through collaboration among several departments which exchange data. Unfortunately, exchanged data often differs from the original when imported into a different application. As a result, delivering consistent visual results requires time, labour and computational cost.
This thesis begins with an examination of the current state of the art in material appearance representation and capture, in order to identify a suitable strategy to tackle material appearance consistency. Automatic solutions to this problem are suggested in this work, accounting for the constraints of real-world scenarios, where the only available information is a reference rendering and the renderer used to obtain it, with no access to the implementation of the shaders. In particular, two image-based frameworks are proposed, working under these constraints.
The first one, validated by means of perceptual studies, is aimed at remapping BRDF parameters and is useful when the parameters used for the reference rendering are available. The second one provides consistent material appearance across different renderers even when the parameters used for the reference are unknown. It allows the selection of an arbitrary reference rendering tool, and manipulates the output of other renderers so as to be consistent with the reference.
Estimation of three dimensional structure from passport-style photographic images for enhanced face recognition performance in humans
EThOS - Electronic Theses Online Service, GB, United Kingdom.
3D shape reconstruction using a polarisation reflectance model in conjunction with shading and stereo
Reconstructing the 3D geometry of objects from images is a fundamental problem in computer vision. This thesis focuses on shape from polarisation where the goal is to reconstruct a dense depth map from a sequence of polarisation images.
Firstly, we propose a linear differential-constraints approach to depth estimation from polarisation images. We demonstrate that colour images deliver more robust polarimetric measurements than monochrome images. We then explore different constraints by taking the polarisation images under two different lighting conditions with a fixed view, and show that a dense depth map, an albedo map and the refractive index can be recovered.
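The polarimetric measurements referred to above are commonly obtained by fitting the transmitted-radiance sinusoid I(phi) = (I_un / 2) * (1 + rho * cos(2*phi - 2*theta)) to intensities captured at several polariser angles phi, recovering the unpolarised intensity I_un, degree of polarisation rho and phase angle theta. A minimal linear least-squares sketch of that standard fit (the function name is illustrative, and this is not the thesis's exact formulation):

```python
import numpy as np

def fit_polarisation_sinusoid(angles, intensities):
    """Fit I(phi) = a + b*cos(2*phi) + c*sin(2*phi) by linear least squares.

    angles:      polariser angles in radians (>= 3 distinct values).
    intensities: observed intensity at each angle, same length.
    Returns (unpolarised intensity, degree of polarisation, phase angle),
    using I_un = 2a, rho = sqrt(b^2 + c^2) / a, theta = 0.5 * atan2(c, b).
    """
    A = np.stack([np.ones_like(angles),
                  np.cos(2 * angles),
                  np.sin(2 * angles)], axis=1)
    (a, b, c), *_ = np.linalg.lstsq(A, intensities, rcond=None)
    i_un = 2.0 * a
    dop = np.hypot(b, c) / a           # degree of polarisation
    phase = 0.5 * np.arctan2(c, b)     # phase (azimuth) angle
    return i_un, dop, phase
```

Because the sinusoid has period pi, the recovered phase is only determined up to a 180-degree ambiguity; resolving the resulting surface-normal ambiguity is exactly what the graphical-model stage of this thesis addresses.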
Secondly, we propose an end-to-end nonlinear method to reconstruct depth. We re-parameterise a polarisation reflectance model with respect to the depth map, and recover an optimal depth map by minimising, with nonlinear least squares, an energy cost function between the reflectance model's prediction and the observed data.
Thirdly, we propose to augment the polarisation camera with an additional RGB camera in a second view. We construct a higher-order graphical model utilising an initial rough depth map estimated from the stereo views. The graphical model corrects the surface-normal ambiguity which arises from the polarisation reflectance model. We then build a linear system that combines the corrected surface normals, the polarimetric information and the rough depth map to produce an accurate and dense depth map.
Lastly, we derive a mixed polarisation model that describes specular and diffuse polarisation as well as mixtures of the two. This model is more physically accurate and allows us to decompose specular and diffuse reflectance from multi-view images.
Modelling appearance and geometry from images
Acquisition of realistic and relightable 3D models of large outdoor structures, such as buildings, requires the modelling of detailed geometry and visual appearance. Recovering these material characteristics can be very time-consuming and needs specially dedicated equipment. Alternatively, surface detail can be conveyed by textures recovered from images, whose appearance is only valid under the originally photographed viewing and lighting conditions. Methods to easily capture locally detailed geometry, such as cracks in stone walls, and visual appearance require control of lighting conditions, which usually restricts them to small portions of surfaces captured at close range.

This thesis investigates the acquisition of high-quality models from images, using simple photographic equipment and modest user intervention. The main focus of this investigation is on approximating detailed local depth information and visual appearance, obtained using a new image-based approach, and combining this with gross-scale 3D geometry. This is achieved by capturing these surface characteristics in small accessible regions and transferring them to the complete façade. This approach yields high-quality models, imparting the illusion of measured reflectance. In this thesis, we first present two novel algorithms for surface detail and visual appearance transfer, where these material properties are captured for small exemplars using an image-based technique. Second, we develop an interactive solution to the problems of performing the transfer over both a large change in scale and to the different materials contained in a complete façade. Aiming to completely automate this process, a novel algorithm to differentiate between materials in the façade and associate them with the correct exemplars is introduced, with promising results.
Third, we present a new method for texture reconstruction from multiple images that optimises texture quality by choosing the best view for every point and minimising seams. Material properties are transferred from the exemplars to the texture map, approximating reflectance and meso-structure. The combination of these techniques results in a complete working system capable of producing realistic relightable models of full building façades, containing high-resolution geometry and plausible visual appearance.