139 research outputs found

    Laparoscopic Image Recovery and Stereo Matching

    Laparoscopic imaging can play a significant role in minimally invasive surgical procedures. However, laparoscopic images often suffer from insufficient and irregular light sources, specular highlight surfaces, and a lack of depth information. These problems can negatively influence surgeons during surgery, and lead to erroneous visual tracking and potential surgical risks. Thus, developing effective image-processing algorithms for laparoscopic vision recovery and stereo matching is of significant importance. Most related algorithms are effective on natural images, but less effective on laparoscopic images. The first purpose of this thesis is to restore low-light laparoscopic vision, where an effective image enhancement method is proposed by identifying different illumination regions and designing the enhancement criteria for desired image quality. This method can enhance the low-light region while reducing noise amplification during the enhancement process. In addition, this thesis also proposes a simplified Retinex optimization method for non-uniform illumination enhancement. By integrating the prior information of the illumination and reflectance into the optimization process, this method can significantly enhance the dark region while preserving naturalness, texture details, and image structures. Moreover, because the total variation term is replaced with two $\ell_2$-norm terms, the proposed algorithm has a significant computational advantage. Second, a global optimization method for specular highlight removal from a single laparoscopic image is proposed. This method consists of a modified dichromatic reflection model and a novel diffuse chromaticity estimation technique. By exploiting the limited color variation of the laparoscopic image, the estimated diffuse chromaticity can approximate the true diffuse chromaticity, which allows the specular highlight to be removed effectively while preserving texture detail.
Third, a robust edge-preserving stereo matching method is proposed, based on sparse feature matching, left and right illumination equalization, and refined disparity optimization processes. The sparse feature matching and illumination equalization techniques can provide a good disparity map initialization so that the refined disparity optimization can quickly obtain an accurate disparity map. This approach is particularly promising on surgical tool edges, smooth soft tissues, and surfaces with strong specular highlights.

    Visual Computing and Machine Learning Techniques for Digital Forensics

    It is impressive how fast science has advanced in so many different fields. In particular, technological advances astonish many people by bringing into reality things that were previously beyond their imagination. Inspired by methods earlier presented in science fiction shows, the computer science community has created a new research area named Digital Forensics, which aims at developing and deploying methods for fighting digital crimes such as digital image forgery. This work presents some of the main concepts associated with Digital Forensics and, complementarily, presents some recent and powerful techniques relying on Computer Graphics, Image Processing, Computer Vision and Machine Learning concepts for detecting forgeries in photographs. Topics addressed in this work include source attribution, spoofing detection, pornography detection, multimedia phylogeny, and forgery detection. Finally, this work highlights the challenges and open problems in Digital Image Forensics to show readers the many opportunities available for research.

    Ridge Regression Approach to Color Constancy

    This thesis presents work on color constancy and its application in the field of computer vision. Color constancy is the phenomenon of representing (visualizing) the reflectance properties of the scene independent of the illumination spectrum. The motivation behind this work is twofold. The primary motivation is to seek ‘consistency and stability’ in color reproduction and algorithm performance respectively, because color is used as one of the important features in many computer vision applications; consistency of the color features is therefore essential for high application success. The second motivation is to reduce ‘computational complexity’ without sacrificing the primary motivation. This work presents a machine learning approach to color constancy. An empirical model is developed from the training data. Neural networks and support vector machines are two prominent nonlinear learning theories. The work on support vector machine based color constancy shows its superior performance over neural network based color constancy in terms of stability. However, the support vector machine is a time-consuming method. An alternative to the support vector machine is a simple, fast and analytically solvable linear modeling technique known as ‘Ridge regression’. It learns the dependency between the surface reflectance and illumination from a presented training sample of data. Ridge regression answers the twofold motivation behind this work, i.e., it is a stable and computationally simple approach. The proposed algorithms, ‘Support vector machine’ and ‘Ridge regression’, involve a three-step process: First, an input matrix constructed from the preprocessed training data set is used to obtain a trained model. Second, test images are presented to the trained model to obtain the chromaticity estimate of the illuminants present in the test images. Finally, a linear diagonal transformation is performed to obtain the color-corrected image.
The results show the effectiveness of the proposed algorithms on both calibrated and uncalibrated data sets in comparison to the methods discussed in the literature review. Finally, the thesis concludes with a complete discussion and summary of the comparison between the proposed approaches and other algorithms.
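The three-step process described in this abstract (train a model, estimate the illuminant chromaticity, apply a diagonal transform) can be illustrated with a minimal sketch. It is not the thesis's implementation: the feature vectors, the function names `ridge_fit`, `ridge_predict` and `diagonal_correct`, and the choice of a neutral target chromaticity of (1/3, 1/3, 1/3) are assumptions made for illustration. The only parts taken as given are the closed-form ridge solution W = (XᵀX + λI)⁻¹XᵀY and the idea of a per-channel (diagonal) correction.

```python
def ridge_fit(X, Y, lam):
    """Closed-form ridge regression: solve (X^T X + lam*I) W = X^T Y.

    X is a list of n feature rows (length d), Y a list of n target rows
    (length k). Returns W as a d x k list of lists.
    """
    n, d, k = len(X), len(X[0]), len(Y[0])
    # Augmented system [X^T X + lam*I | X^T Y].
    A = [[sum(X[r][i] * X[r][j] for r in range(n)) + (lam if i == j else 0.0)
          for j in range(d)]
         + [sum(X[r][i] * Y[r][c] for r in range(n)) for c in range(k)]
         for i in range(d)]
    # Gauss-Jordan elimination with partial pivoting.
    for col in range(d):
        pivot = max(range(col, d), key=lambda r: abs(A[r][col]))
        A[col], A[pivot] = A[pivot], A[col]
        p = A[col][col]
        A[col] = [v / p for v in A[col]]
        for r in range(d):
            if r != col:
                f = A[r][col]
                A[r] = [a - f * b for a, b in zip(A[r], A[col])]
    return [row[d:] for row in A]


def ridge_predict(W, x):
    """Apply the trained linear model to one feature vector."""
    return [sum(x[i] * W[i][c] for i in range(len(x)))
            for c in range(len(W[0]))]


def diagonal_correct(pixel, illum_rg):
    """Diagonal (per-channel) correction of one RGB pixel.

    illum_rg is the estimated illuminant chromaticity (r, g); b = 1 - r - g.
    Assumption: the target illuminant is neutral, chromaticity (1/3, 1/3, 1/3).
    """
    r, g = illum_rg
    b = 1.0 - r - g
    gains = [(1.0 / 3.0) / c for c in (r, g, b)]
    return [p * gain for p, gain in zip(pixel, gains)]


if __name__ == "__main__":
    # Toy, hypothetical training data: 2 features -> 1 target.
    W = ridge_fit([[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]],
                  [[2.0], [3.0], [5.0]], lam=0.0)
    print(ridge_predict(W, [1.0, 1.0]))
    # A grey surface lit by a reddish illuminant is mapped back to neutral.
    print(diagonal_correct([0.6, 0.3, 0.1], (0.6, 0.3)))
```

In practice the regularization weight `lam` would be chosen by validation; a larger value trades fit for stability, which is the property the abstract emphasizes.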

    Highlights Analysis System (HAnS) for low dynamic range to high dynamic range conversion of cinematic low dynamic range content

    We propose a novel and efficient algorithm for detection of specular reflections and light sources (highlights) in cinematic content. The detection of highlights is important for reconstructing them properly in the conversion of low dynamic range (LDR) to high dynamic range (HDR) content. Highlights are often difficult to distinguish from bright diffuse surfaces, because their brightness is reduced in conventional LDR content production. Moreover, cinematic LDR content is subject to the artistic use of effects that change the apparent brightness of certain image regions (e.g. limiting depth of field, grading, complex multi-lighting setup, etc.). To ensure the robustness of highlights detection to these effects, the proposed algorithm goes beyond considering only absolute brightness and considers five different features. These features are: the size of the highlight relative to the size of the surrounding image structures, the relative contrast in the surrounding of the highlight, and its absolute brightness expressed through the luminance (luma feature), through the saturation in the color space (maxRGB feature) and through the saturation in white (minRGB feature). We evaluate the algorithm on two different image data-sets. The first one is a publicly available LDR image data-set without cinematic content, which allows comparison to the broader state of the art. Additionally, for the evaluation on cinematic content, we create an image data-set consisting of manually annotated cinematic frames and real-world images. To demonstrate the proposed highlights detection algorithm in a complete LDR-to-HDR conversion pipeline, we additionally propose a simple inverse-tone-mapping algorithm. The experimental analysis shows that the proposed approach outperforms conventional highlights detection algorithms on both image data-sets, achieves high quality reconstruction of the HDR content and is suited for use in LDR-to-HDR conversion.
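The three brightness-related features named in this abstract (luma, maxRGB, minRGB) can be illustrated per pixel with a short sketch. The function name `brightness_features` and the use of Rec. 709 luma weights are assumptions; the paper's exact definitions may differ, and the size and relative-contrast features are not covered here.

```python
# Rec. 709 luma weights (an assumption; the paper may use another definition).
REC709 = (0.2126, 0.7152, 0.0722)


def brightness_features(rgb):
    """Return luma, maxRGB and minRGB for one pixel with channels in [0, 1]."""
    luma = sum(w * c for w, c in zip(REC709, rgb))
    return {
        "luma": luma,        # perceptual brightness
        "maxRGB": max(rgb),  # high when any single channel is near clipping
        "minRGB": min(rgb),  # high only when all channels are bright (white)
    }


if __name__ == "__main__":
    print(brightness_features((1.0, 0.2, 0.1)))  # saturated red surface
    print(brightness_features((1.0, 1.0, 1.0)))  # clipped white highlight
```

The two example pixels show why the features are complementary: a saturated red surface has a high maxRGB but a low minRGB, whereas a clipped white highlight scores high on both.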

    Haze visibility enhancement: A Survey and quantitative benchmarking

    This paper provides a comprehensive survey of methods dealing with visibility enhancement of images taken in hazy or foggy scenes. The survey begins by discussing the optical models of atmospheric scattering media and image formation. This is followed by a survey of existing methods, which are categorized into: multiple image methods, polarizing filter-based methods, methods with known depth, and single-image methods. We also provide a benchmark of a number of well-known single-image methods, based on a recent dataset provided by Fattal (2014) and our newly generated scattering media dataset that contains ground truth images for quantitative evaluation. To our knowledge, this is the first benchmark using numerical metrics to evaluate dehazing techniques. This benchmark allows us to objectively compare the results of existing methods and to better identify the strengths and limitations of each method. This study is supported by an Nvidia GPU Grant and a Canadian NSERC Discovery grant. R. T. Tan’s work in this research is supported by the National Research Foundation, Prime Minister’s Office, Singapore under its International Research Centre in Singapore Funding Initiative.
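The abstract does not state the optical model it surveys. As a hedged illustration, the sketch below uses the atmospheric scattering model commonly used in the dehazing literature, I = J·t + A·(1 − t) with transmission t = exp(−β·d), and inverts it for the "known depth" case. The function names, the scalar (single-channel, single-pixel) formulation and the lower bound `t_min` are assumptions made for brevity.

```python
import math


def transmission(depth, beta):
    """Fraction of scene radiance surviving a path of length `depth`."""
    return math.exp(-beta * depth)


def add_haze(J, depth, beta, A):
    """Forward model: attenuated scene radiance plus scattered airlight."""
    t = transmission(depth, beta)
    return J * t + A * (1.0 - t)


def remove_haze(I, depth, beta, A, t_min=0.05):
    """Invert the model when depth is known.

    t is clamped from below (t_min, an assumed value) so that distant,
    almost fully hazed pixels do not amplify noise without bound.
    """
    t = max(transmission(depth, beta), t_min)
    return (I - A) / t + A


if __name__ == "__main__":
    hazy = add_haze(J=0.2, depth=10.0, beta=0.1, A=0.9)
    print(hazy, remove_haze(hazy, depth=10.0, beta=0.1, A=0.9))
```

Single-image methods face the harder problem of estimating both `A` and the transmission from the hazy image alone, which is why a benchmark with ground truth is useful.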

    Multisensory Imagery Cues for Object Separation, Specularity Detection and Deep Learning based Inpainting

    Multisensory imagery cues have been actively investigated in diverse applications in the computer vision community to provide additional geometric information that is either absent or difficult to capture from mainstream two-dimensional imaging. The inherent features of multispectral polarimetric light field imagery (MSPLFI) include object distribution over spectra, surface properties, shape, shading and pixel flow in light space. The aim of this dissertation is to explore these inherent properties to exploit new structures and methodologies for the tasks of object separation, specularity detection and deep learning-based inpainting in MSPLFI. In the first part of this research, an application to separate foreground objects from the background in both outdoor and indoor scenes using multispectral polarimetric imagery (MSPI) cues is examined. Based on the pixel neighbourhood relationship, an on-demand clustering technique is proposed and implemented to separate artificial objects from natural background in a complex outdoor scene. However, because indoor scenes contain only artificial objects, with vast variations in energy levels among spectra, a multiband fusion technique followed by a background segmentation algorithm is proposed to separate the foreground from the background. In this regard, first, each spectrum is decomposed into low and high frequencies using the fast Fourier transform (FFT) method. Second, principal component analysis (PCA) is applied on both frequency images of the individual spectrum and then combined with the first principal components as a fused image. Finally, a polarimetric background segmentation (BS) algorithm based on the Stokes vector is proposed and implemented on the fused image. The performance of the proposed approaches is evaluated and compared using publicly available MSPI datasets and the dice similarity coefficient (DSC).
The proposed multiband fusion and BS methods demonstrate better fusion quality and higher segmentation accuracy compared with other studies for several metrics, including mean absolute percentage error (MAPE), peak signal-to-noise ratio (PSNR), Pearson correlation coefficient (PCOR), mutual information (MI), accuracy, Geometric Mean (G-mean), precision, recall and F1-score. In the second part of this work, a twofold framework for specular reflection detection (SRD) and specular reflection inpainting (SRI) in transparent objects is proposed. The SRD algorithm is based on the mean, the covariance and the Mahalanobis distance for predicting anomalous pixels in MSPLFI. The SRI algorithm first selects four-connected neighbouring pixels from sub-aperture images and then replaces the SRD pixel with the closest matched pixel. For both algorithms, a 6D MSPLFI transparent object dataset is captured from multisensory imagery cues due to the unavailability of this kind of dataset. The experimental results demonstrate that the proposed algorithms achieve higher SRD accuracy and better SRI quality than the existing approaches reported in this part in terms of F1-score, G-mean, accuracy, the structural similarity index (SSIM), the PSNR, the mean squared error (IMMSE) and the mean absolute deviation (MAD). However, because SRD pixels are synthesised based on the pixel neighbourhood relationship, the proposed inpainting method produces artefacts and errors when inpainting large specularity areas with irregular holes. Therefore, in the last part of this research, the emphasis is on inpainting large specularity areas with irregular holes based on deep feature extraction from multisensory imagery cues. The proposed six-stage deep learning inpainting (DLI) framework is based on the generative adversarial network (GAN) architecture and consists of a generator network and a discriminator network.
First, pixels’ global flow in the sub-aperture images is calculated by applying the large displacement optical flow (LDOF) method. The proposed training algorithm combines global flow with local flow and coarse inpainting results predicted from the baseline method. The generator attempts to generate best-matched features, while the discriminator seeks to predict the maximum difference between the predicted results and the actual results. The experimental results demonstrate that in terms of the PSNR, MSSIM, IMMSE and MAD, the proposed DLI framework achieves superior inpainting quality to the baseline method and the previous part of this research.
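The specular reflection detection step in this abstract is described as using the mean, the covariance and the Mahalanobis distance to flag anomalous pixels. The sketch below shows that general idea only, under strong simplifying assumptions: each pixel is reduced to a hypothetical two-component feature vector so the covariance can be inverted in closed form, and the function name `mahalanobis_outliers` and the fixed threshold are illustrative choices, not the dissertation's algorithm.

```python
import math


def mahalanobis_outliers(samples, threshold):
    """Flag 2-D samples whose Mahalanobis distance from the mean exceeds threshold.

    samples: list of (x, y) feature pairs, e.g. intensities in two bands.
    Returns a list of booleans, True for anomalous samples.
    """
    n = len(samples)
    mx = sum(s[0] for s in samples) / n
    my = sum(s[1] for s in samples) / n
    # Sample covariance matrix [[sxx, sxy], [sxy, syy]].
    sxx = sum((s[0] - mx) ** 2 for s in samples) / (n - 1)
    syy = sum((s[1] - my) ** 2 for s in samples) / (n - 1)
    sxy = sum((s[0] - mx) * (s[1] - my) for s in samples) / (n - 1)
    det = sxx * syy - sxy * sxy
    if det <= 0.0:
        raise ValueError("covariance matrix is singular")
    # Closed-form inverse of the 2x2 covariance.
    ixx, iyy, ixy = syy / det, sxx / det, -sxy / det
    flags = []
    for x, y in samples:
        dx, dy = x - mx, y - my
        d2 = dx * dx * ixx + 2.0 * dx * dy * ixy + dy * dy * iyy
        flags.append(math.sqrt(d2) > threshold)
    return flags


if __name__ == "__main__":
    # Nine dim pixels and one bright, specular-like pixel (made-up values).
    pixels = [(0.20, 0.21), (0.22, 0.19), (0.19, 0.20), (0.21, 0.22),
              (0.20, 0.18), (0.18, 0.21), (0.23, 0.20), (0.21, 0.19),
              (0.19, 0.22), (1.00, 1.00)]
    print(mahalanobis_outliers(pixels, threshold=2.5))
```

Because the outlier itself inflates the estimated covariance, distances computed this way are bounded for small samples; a robust estimate of the mean and covariance would be a natural refinement.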

    Illumination Invariant Outdoor Perception

    This thesis proposes the use of a multi-modal sensor approach to achieve illumination invariance in images taken in outdoor environments. The approach is automatic in that it does not require user input for initialisation, and is not reliant on the input of atmospheric radiative transfer models. While it is common to use pixel colour and intensity as features in high level vision algorithms, their performance is severely limited by the uncontrolled lighting and complex geometric structure of outdoor scenes. The appearance of a material is dependent on the incident illumination, which can vary due to spatial and temporal factors. This variability causes identical materials to appear different depending on their location. Illumination invariant representations of the scene can potentially improve the performance of high level vision algorithms as they allow discrimination between pixels to occur based on the underlying material characteristics. The proposed approach to obtaining illumination invariance utilises fused image and geometric data. An approximation of the outdoor illumination is used to derive per-pixel scaling factors. This has the effect of relighting the entire scene using a single illuminant that is common in terms of colour and intensity for all pixels. The approach is extended to radiometric normalisation and the multi-image scenario, meaning that the resultant dataset is both spatially and temporally illumination invariant. The proposed illumination invariance approach is evaluated on several datasets, and the results show that spatial and temporal invariance can be achieved without loss of spectral dimensionality. The system requires very few tuning parameters, meaning that expert knowledge is not required to operate it. This has potential implications for robotics and remote sensing applications where perception systems play an integral role in developing a rich understanding of the scene.