469 research outputs found

    $L_0$-Sampler: An $L_0$ Model Guided Volume Sampling for NeRF

    Since being proposed, Neural Radiance Fields (NeRF) have achieved great success in related tasks, mainly adopting the hierarchical volume sampling (HVS) strategy for volume rendering. However, the HVS of NeRF approximates distributions using piecewise constant functions, which provides a relatively rough estimation. Based on the observation that a well-trained weight function $w(t)$ and the $L_0$ distance between points and the surface have very high similarity, we propose $L_0$-Sampler by incorporating the $L_0$ model into $w(t)$ to guide the sampling process. Specifically, we propose to use piecewise exponential functions rather than piecewise constant functions for interpolation, which not only approximate quasi-$L_0$ weight distributions along rays quite well but can also be implemented in a few lines of code without additional computational burden. Stable performance improvements can be achieved by applying $L_0$-Sampler to NeRF and its related tasks like 3D reconstruction. Code is available at https://ustc3dv.github.io/L0-Sampler/. Comment: Project page: https://ustc3dv.github.io/L0-Sampler
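For orientation, the piecewise constant baseline that the abstract contrasts against can be sketched as inverse-transform sampling over per-bin weights along a ray. This is a minimal illustration of that generic baseline only, not the authors' code and not the piecewise exponential interpolation they propose. The function name `sample_piecewise_constant` and its parameters are hypothetical.

```python
import bisect
import random


def sample_piecewise_constant(edges, weights, n_samples, rng=None):
    """Draw samples along a ray from a piecewise constant density.

    edges:   bin boundaries t_0 < t_1 < ... < t_n (hypothetical input layout)
    weights: one non-negative weight per bin (length n)
    """
    rng = rng or random.Random(0)
    total = sum(weights)
    if total <= 0:
        raise ValueError("weights must have a positive sum")

    # Build the cumulative distribution over bins.
    cdf = [0.0]
    for w in weights:
        cdf.append(cdf[-1] + w / total)
    cdf[-1] = 1.0  # guard against floating-point drift

    samples = []
    for _ in range(n_samples):
        u = rng.random()  # uniform in [0, 1)
        # Find the bin i with cdf[i] <= u < cdf[i + 1]; zero-weight bins are skipped.
        i = min(bisect.bisect_right(cdf, u) - 1, len(weights) - 1)
        span = cdf[i + 1] - cdf[i]
        frac = (u - cdf[i]) / span if span > 0 else 0.0
        # A constant density inside the bin means linear interpolation of the CDF.
        samples.append(edges[i] + frac * (edges[i + 1] - edges[i]))
    return samples
```

Because the density is constant inside each bin, samples are spread uniformly within a bin regardless of where the surface lies in it, which is the coarseness the abstract refers to.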

    A NOVEL ALGORITHM BASED ON CASCADING OF NEURAL NETWORK MODELS AND WAVELET TRANSFORM FOR IMAGE ENHANCEMENT.

    Image enhancement and restoration are prerequisites of computer vision. Image distortion and degradation impair pattern matching and image quality. The wavelet transform plays an important role in image enhancement and image de-noising, where it is used with soft thresholding and hard thresholding. Processing data through wavelets is also very efficient when combined with a neural network. In this paper we discuss a proposed algorithm for image enhancement based on a self-organizing map network and the wavelet transform. Because the self-organizing map network is an unsupervised training mechanism, its processing is very fast in comparison with other artificial neural network methods. The combination of the wavelet transform and the self-organizing map network has a great advantage over conventional image enhancement techniques such as histogram equalization and multi-point histogram equalization.
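The soft and hard thresholding mentioned in the abstract are standard operations on wavelet coefficients, and the difference between them can be shown in a few lines. This is a generic sketch of the two rules, not the paper's algorithm. The function names and the threshold parameter `t` are chosen here for illustration.

```python
import math


def hard_threshold(coeffs, t):
    """Keep coefficients whose magnitude exceeds t; zero out the rest."""
    return [c if abs(c) > t else 0.0 for c in coeffs]


def soft_threshold(coeffs, t):
    """Zero out small coefficients and shrink the survivors toward zero by t."""
    return [math.copysign(max(abs(c) - t, 0.0), c) for c in coeffs]


# Example detail coefficients (made-up values for illustration only).
detail = [3.0, -0.5, -2.0]
print(hard_threshold(detail, 1.0))
print(soft_threshold(detail, 1.0))
```

Hard thresholding leaves large coefficients untouched, while soft thresholding also reduces their magnitude by `t`, which avoids a jump at the threshold.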

    GazeStereo3D: seamless disparity manipulations

    Producing a high-quality stereoscopic impression on current displays is a challenging task. The content has to be carefully prepared in order to maintain visual comfort, which typically affects the quality of depth reproduction. In this work, we show that this problem can be significantly alleviated when the eye fixation regions can be roughly estimated. We propose a new method for stereoscopic depth adjustment that utilizes eye tracking or other gaze prediction information. The key idea that distinguishes our approach from previous work is to apply gradual depth adjustments at the eye fixation stage, so that they remain unnoticeable. To this end, we measure the limits imposed on the speed of disparity changes in various depth adjustment scenarios, and formulate a new model that can guide such seamless stereoscopic content processing. Based on this model, we propose a real-time controller that applies local manipulations to stereoscopic content to find the optimum between depth reproduction and visual comfort. We show that the controller is mostly immune to the limitations of low-cost eye tracking solutions. We also demonstrate benefits of our model in off-line applications, such as stereoscopic movie production, where skillful directors can reliably guide and predict viewers' attention or where attended image regions are identified during eye tracking sessions. We validate both our model and the controller in a series of user experiments. They show significant improvements in depth perception without sacrificing the visual quality when our techniques are applied.

    Towards Transcervical Ultrasound Image Guidance for Transoral Robotic Surgery

    Purpose: Trans-oral robotic surgery (TORS) using the da Vinci surgical robot is a new minimally-invasive surgery method to treat oropharyngeal tumors, but it is a challenging operation. Augmented reality (AR) based on intra-operative ultrasound (US) has the potential to enhance the visualization of the anatomy and cancerous tumors to provide additional tools for decision-making in surgery. Methods: We propose and carry out preliminary evaluations of a US-guided AR system for TORS, with the transducer placed on the neck for a transcervical view. Firstly, we perform a novel MRI-transcervical 3D US registration study. Secondly, we develop a US-robot calibration method with an optical tracker and an AR system to display the anatomy mesh model in the real-time endoscope images inside the surgeon console. Results: Our AR system reaches a mean projection error of 26.81 and 27.85 pixels for the projection from the US to stereo cameras in a water bath experiment. The average target registration error for MRI to 3D US is 8.90 mm for the 3D US transducer and 5.85 mm for freehand 3D US, and the average distance between the vessel centerlines is 2.32 mm. Conclusion: We demonstrate the first proof-of-concept transcervical US-guided AR system for TORS and the feasibility of trans-cervical 3D US-MRI registration. Our results show that trans-cervical 3D US is a promising technique for TORS image guidance. Comment: 12 pages, 8 figures. Accepted by Information Processing for Computer Assisted Interventions (IPCAI 2023)

    Learning GAN-based Foveated Reconstruction to Recover Perceptually Important Image Features

    A foveated image can be entirely reconstructed from a sparse set of samples distributed according to the retinal sensitivity of the human visual system, which rapidly decreases with increasing eccentricity. The use of Generative Adversarial Networks has recently been shown to be a promising solution for such a task, as they can successfully hallucinate missing image information. As in the case of other supervised learning approaches, the definition of the loss function and the training strategy heavily influence the quality of the output. In this work, we consider the problem of efficiently guiding the training of foveated reconstruction techniques such that they are more aware of the capabilities and limitations of the human visual system, and thus can reconstruct visually important image features. Our primary goal is to make the training procedure less sensitive to distortions that humans cannot detect and focus on penalizing perceptually important artifacts. Given the nature of GAN-based solutions, we focus on the sensitivity of human vision to hallucination in case of input samples with different densities. We propose psychophysical experiments, a dataset, and a procedure for training foveated image reconstruction. The proposed strategy renders the generator network flexible by penalizing only perceptually important deviations in the output. As a result, the method emphasizes the recovery of perceptually important image features. We evaluated our strategy and compared it with alternative solutions by using a newly trained objective metric, a recent foveated video quality metric, and user experiments. Our evaluations revealed significant improvements in the perceived image reconstruction quality compared with the standard GAN-based training approach.