    Endoscopic video defogging using luminance blending.

    Endoscopic video sequences provide surgeons with direct visualisation of the surgical field and anatomical targets in the patient during robotic surgery. Unfortunately, these video images are often hazy or foggy, preventing surgeons from obtaining a clear surgical view, owing to typical surgical operations such as ablation and cauterisation. This Letter aims at removing fog or smoke from endoscopic video sequences to enhance and maintain a direct and clear visualisation of the operating field during robotic surgery. The authors propose a new luminance blending framework that integrates contrast enhancement with visibility restoration for foggy endoscopic video processing. The proposed method was validated on clinical endoscopic videos collected during robotic surgery. The experimental results demonstrate that the method provides a promising means of effectively removing fog or smoke from endoscopic video images. In particular, the visual quality score of defogged endoscopic images improved from 0.5088 to 0.6475.
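
    The abstract does not spell out the blending scheme, so the following is only a minimal sketch of the general idea of luminance-based defogging. CLAHE for the contrast-enhancement branch, a Gaussian haze-veil estimate for the visibility-restoration branch, and the fixed blend weight alpha are all assumptions of this sketch, not details confirmed by the Letter.

        import cv2
        import numpy as np

        def defog_luminance_blend(bgr, alpha=0.5):
            # Rough illustration of luminance-blending defogging, NOT the
            # Letter's exact method: CLAHE, the Gaussian veil estimate, and
            # the blend weight `alpha` are assumptions of this sketch.
            ycc = cv2.cvtColor(bgr, cv2.COLOR_BGR2YCrCb)
            y, cr, cb = cv2.split(ycc)

            # Branch 1: contrast enhancement of the luminance channel.
            clahe = cv2.createCLAHE(clipLimit=2.0, tileGridSize=(8, 8))
            y_contrast = clahe.apply(y)

            # Branch 2: crude visibility restoration -- subtract a smooth
            # estimate of the fog veil (heavily blurred luminance).
            veil = cv2.GaussianBlur(y, (0, 0), sigmaX=31).astype(np.float32)
            y_restored = y.astype(np.float32) - 0.6 * veil
            y_restored = cv2.normalize(y_restored, None, 0, 255,
                                       cv2.NORM_MINMAX).astype(np.uint8)

            # Luminance blending of the two branches; chroma is untouched.
            y_out = cv2.addWeighted(y_contrast, alpha, y_restored, 1.0 - alpha, 0)
            return cv2.cvtColor(cv2.merge([y_out, cr, cb]), cv2.COLOR_YCrCb2BGR)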

    Mutual Information-driven Triple Interaction Network for Efficient Image Dehazing

    Multi-stage architectures have exhibited efficacy in image dehazing: they decompose a challenging task into multiple more tractable sub-tasks and progressively estimate the latent haze-free image. Despite the remarkable progress, existing methods still suffer from the following shortcomings: (1) limited exploration of frequency-domain information; (2) insufficient information interaction; (3) severe feature redundancy. To remedy these issues, we propose a novel Mutual Information-driven Triple interaction Network (MITNet) based on spatial-frequency dual-domain information and a two-stage architecture. To be specific, the first stage, named amplitude-guided haze removal, aims to recover the amplitude spectrum of the hazy images for haze removal. The second stage, named phase-guided structure refinement, is devoted to learning the transformation and refinement of the phase spectrum. To facilitate information exchange between the two stages, an Adaptive Triple Interaction Module (ATIM) is developed to simultaneously aggregate cross-domain, cross-scale, and cross-stage features; the fused features are further used to generate content-adaptive dynamic filters, which are then applied to enhance the global context representation. In addition, we impose a mutual information minimization constraint on paired-scale encoder and decoder features from both stages. This operation effectively reduces information redundancy and enhances cross-stage feature complementarity. Extensive experiments on multiple public datasets show that MITNet achieves superior performance with lower model complexity. The code and models are available at https://github.com/it-hao/MITNet.
    Comment: Accepted in ACM MM 202
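
    The two-stage design rests on the observation that haze degradation is concentrated in the Fourier amplitude spectrum, while the phase largely carries scene structure. The NumPy sketch below shows that underlying decomposition; it is the motivating operation only, not MITNet's learned pipeline, and the commented amplitude/phase swap at the end is a toy illustration with hypothetical variable names.

        import numpy as np

        def split_amplitude_phase(img):
            # Fourier decomposition of a grayscale image into amplitude and
            # phase; MITNet learns to restore the amplitude (stage 1) and
            # refine the phase (stage 2) rather than using a closed-form swap.
            spec = np.fft.fft2(img)
            return np.abs(spec), np.angle(spec)

        def recombine(amplitude, phase):
            # Rebuild an image from (possibly restored) amplitude and phase.
            return np.real(np.fft.ifft2(amplitude * np.exp(1j * phase)))

        # Toy check: pairing a clear image's amplitude with a hazy image's
        # phase yields a result far closer to the clear image, which is the
        # observation motivating amplitude-guided haze removal.
        # clear_amp, _ = split_amplitude_phase(clear)
        # _, hazy_phase = split_amplitude_phase(hazy)
        # pseudo_dehazed = recombine(clear_amp, hazy_phase)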

    A Fast-Dehazing Technique using Generative Adversarial Network model for Illumination Adjustment in Hazy Videos

    Haze significantly lowers the quality of captured photos and videos. Besides degrading the reliability of monitoring equipment, this can be potentially dangerous. Recent years have seen an increase in problems caused by foggy conditions, necessitating the development of real-time dehazing techniques. Intelligent vision systems, such as surveillance and monitoring systems, depend fundamentally on the characteristics of the input images, which have a significant impact on the accuracy of object detection. This paper presents a fast video dehazing technique using a Generative Adversarial Network (GAN) model. The haze in the input video is estimated from scene depth extracted with a pre-trained monocular depth ResNet model. Based on the amount of haze, an appropriate model is selected that was trained for those specific haze conditions. The novelty of the proposed work is that the generator model is kept simple to produce faster results in real time, while the discriminator is kept complex to make the generator more effective. The traditional loss function is replaced with a Visual Geometry Group (VGG) feature loss for better dehazing. The proposed model produced better results than existing models: the Peak Signal to Noise Ratio (PSNR) obtained for most frames is above 32, and the execution time is under 60 milliseconds, which makes the proposed model well suited for video dehazing.
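
    The "VGG feature loss" mentioned above is commonly implemented as a perceptual loss on frozen VGG features. The sketch below assumes VGG16 truncated at relu3_3 and an exponential depth-to-transmission mapping with a hypothetical coefficient beta; the paper's exact layer choice and haze-estimation formula are not given in the abstract.

        import torch
        import torch.nn as nn
        from torchvision.models import vgg16

        class VGGFeatureLoss(nn.Module):
            # Perceptual loss on frozen VGG16 features. The cut at layer 16
            # (up to relu3_3) is an assumption; the paper does not state
            # which layers it compares.
            def __init__(self):
                super().__init__()
                self.features = vgg16(weights="IMAGENET1K_V1").features[:16].eval()
                for p in self.features.parameters():
                    p.requires_grad = False
                self.criterion = nn.L1Loss()

            def forward(self, dehazed, ground_truth):
                return self.criterion(self.features(dehazed),
                                      self.features(ground_truth))

        def haze_level(depth, beta=1.0):
            # Hypothetical mapping from a monocular depth map to a haze
            # score via the transmission t = exp(-beta * d); a lower mean
            # transmission would select a generator trained on denser haze.
            return torch.exp(-beta * depth).mean()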