32 research outputs found

    Roadmap on 3D integral imaging: Sensing, processing, and display

    This Roadmap article on three-dimensional integral imaging provides an overview of some of the research activities in the field of integral imaging. The article discusses various aspects of the field including sensing of 3D scenes, processing of captured information, and 3D display and visualization of information. The paper consists of a series of 15 sections from the experts presenting various aspects of the field on sensing, processing, displays, augmented reality, microscopy, object recognition, and other applications. Each section represents the vision of its author to describe the progress, potential, vision, and challenging issues in this field.

    Efficient training procedures for multi-spectral demosaicing

    The simultaneous acquisition of multi-spectral images on a single sensor can be efficiently performed by single shot capture using a multi-spectral filter array. This paper focused on the demosaicing of color and near-infrared bands and relied on a convolutional neural network (CNN). To train the deep learning model robustly and accurately, it is necessary to provide enough training data, with sufficient variability. We focused on the design of an efficient training procedure by discovering an optimal training dataset. We propose two data selection strategies, motivated by slightly different concepts. The general term that will be used for the proposed models trained using data selection is data selection-based multi-spectral demosaicing (DSMD). The first idea is clustering-based data selection (DSMD-C), with the goal to discover a representative subset with a high variance so as to train a robust model. The second is an adaptive-based data selection (DSMD-A), a self-guided approach that selects new data based on the current model accuracy. We performed a controlled experimental evaluation of the proposed training strategies and the results show that a careful selection of data does benefit the speed and accuracy of training. We are still able to achieve high reconstruction accuracy with a lightweight model.

    Polarimetric Multi-View Inverse Rendering

    A polarization camera has great potential for 3D reconstruction since the angle of polarization (AoP) and the degree of polarization (DoP) of reflected light are related to an object's surface normal. In this paper, we propose a novel 3D reconstruction method called Polarimetric Multi-View Inverse Rendering (Polarimetric MVIR) that effectively exploits geometric, photometric, and polarimetric cues extracted from input multi-view color-polarization images. We first estimate camera poses and an initial 3D model by geometric reconstruction with a standard structure-from-motion and multi-view stereo pipeline. We then refine the initial model by optimizing photometric rendering errors and polarimetric errors using multi-view RGB, AoP, and DoP images, where we propose a novel polarimetric cost function that enables an effective constraint on the estimated surface normal of each vertex, while considering four possible ambiguous azimuth angles revealed from the AoP measurement. The weight for the polarimetric cost is effectively determined based on the DoP measurement, which is regarded as the reliability of polarimetric information. Experimental results using both synthetic and real data demonstrate that our Polarimetric MVIR can reconstruct a detailed 3D shape without assuming a specific surface material and lighting condition. Comment: Paper accepted in IEEE Transactions on Pattern Analysis and Machine Intelligence (2022). arXiv admin note: substantial text overlap with arXiv:2007.0883
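
As a rough illustration of the AoP and DoP quantities mentioned in the abstract above, the sketch below computes them for a single pixel from intensities measured behind linear polarizers at 0, 45, 90, and 135 degrees, using the standard linear Stokes parameters. This is not the authors' code: the function names are hypothetical, the four-angle measurement layout is an assumption, and the way the four azimuth candidates are enumerated (AoP plus multiples of 90 degrees) is one plausible reading of the "four possible ambiguous azimuth angles" in the abstract.

```python
import math


def stokes_from_four_angles(i0, i45, i90, i135):
    """Linear Stokes parameters from intensities behind 0/45/90/135 degree polarizers."""
    s0 = 0.5 * (i0 + i45 + i90 + i135)  # total intensity
    s1 = i0 - i90                       # horizontal vs. vertical component
    s2 = i45 - i135                     # diagonal vs. anti-diagonal component
    return s0, s1, s2


def aop_dop(i0, i45, i90, i135):
    """Return (angle of polarization in [0, pi), degree of linear polarization in [0, 1])."""
    s0, s1, s2 = stokes_from_four_angles(i0, i45, i90, i135)
    if s0 <= 0.0:
        return 0.0, 0.0  # no light: polarization is undefined, report zeros
    dop = min(1.0, math.hypot(s1, s2) / s0)
    aop = (0.5 * math.atan2(s2, s1)) % math.pi
    return aop, dop


def candidate_azimuths(aop):
    """Assumed ambiguity set: the AoP shifted by multiples of 90 degrees, wrapped to [0, 2*pi)."""
    return [(aop + k * math.pi / 2.0) % (2.0 * math.pi) for k in range(4)]
```

A low DoP means the AoP is poorly determined, which matches the abstract's use of DoP as a reliability weight for the polarimetric cost.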

    Quanta Burst Photography

    Single-photon avalanche diodes (SPADs) are an emerging sensor technology capable of detecting individual incident photons, and capturing their time-of-arrival with high timing precision. While these sensors were limited to single-pixel or low-resolution devices in the past, recently, large (up to 1 MPixel) SPAD arrays have been developed. These single-photon cameras (SPCs) are capable of capturing high-speed sequences of binary single-photon images with no read noise. We present quanta burst photography, a computational photography technique that leverages SPCs as passive imaging devices for photography in challenging conditions, including ultra low-light and fast motion. Inspired by recent success of conventional burst photography, we design algorithms that align and merge binary sequences captured by SPCs into intensity images with minimal motion blur and artifacts, high signal-to-noise ratio (SNR), and high dynamic range. We theoretically analyze the SNR and dynamic range of quanta burst photography, and identify the imaging regimes where it provides significant benefits. We demonstrate, via a recently developed SPAD array, that the proposed method is able to generate high-quality images for scenes with challenging lighting, complex geometries, high dynamic range and moving objects. With the ongoing development of SPAD arrays, we envision quanta burst photography finding applications in both consumer and scientific photography. Comment: A version with better-quality images can be found on the project webpage: http://wisionlab.cs.wisc.edu/project/quanta-burst-photography
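
To make the idea of merging binary single-photon frames into an intensity image more concrete, here is a minimal sketch of the merge step only. It assumes a static scene (so the alignment stage described in the abstract is skipped entirely) and Poisson photon arrivals, under which a pixel reports 1 with probability 1 - exp(-H), where H is the mean number of detected photons per frame. The maximum-likelihood estimate from S ones in N frames is then H = -ln(1 - S/N). The function names and the clamping of saturated pixels are illustrative assumptions, not the paper's implementation.

```python
import math


def sum_binary_frames(frames):
    """Sum a list of equally sized binary frames (lists of rows of 0/1) per pixel."""
    height, width = len(frames[0]), len(frames[0][0])
    counts = [[0] * width for _ in range(height)]
    for frame in frames:
        for y in range(height):
            for x in range(width):
                counts[y][x] += frame[y][x]
    return counts, len(frames)


def mle_photons_per_frame(count, n_frames):
    """Maximum-likelihood mean photon count per frame from `count` ones in `n_frames`.

    A pixel that fired in every frame would give an infinite estimate, so the
    count is clamped half a detection below saturation (an arbitrary choice).
    """
    count = min(count, n_frames - 0.5)
    return -math.log(1.0 - count / n_frames)


def merge_to_intensity(frames):
    """Per-pixel intensity estimate (photons per frame) for a static scene."""
    counts, n_frames = sum_binary_frames(frames)
    return [[mle_photons_per_frame(c, n_frames) for c in row] for row in counts]
```

The logarithm is what distinguishes this from plain averaging: it compensates for bright pixels saturating at one detection per frame, which is one source of the extended dynamic range the abstract refers to.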

    Optical Camera Communications: Principles, Modulations, Potential and Challenges

    Optical wireless communications (OWC) are emerging as cost-effective and practical alternatives to congested radio frequency-based wireless technologies. As part of OWC, optical camera communications (OCC) have become very attractive, considering recent developments in cameras and the use of fitted cameras in smart devices. OCC, together with visible light communications (VLC), is considered within the framework of the IEEE 802.15.7m standardization. OCCs based on both organic and inorganic light sources as well as cameras are being considered for low-rate transmissions and localization in indoor as well as outdoor short-range applications. This paper introduces the underlying principles of OCC and gives a comprehensive overview of this emerging technology with recent standardization activities in OCC. It also outlines the key technical issues such as mobility, coverage, interference, and performance enhancement. Future research directions and open issues are also presented.

    From Capture to Display: A Survey on Volumetric Video

    Volumetric video, which offers immersive viewing experiences, is gaining increasing prominence. With its six degrees of freedom, it provides viewers with greater immersion and interactivity compared to traditional videos. Despite their potential, volumetric video services pose significant challenges. This survey conducts a comprehensive review of the existing literature on volumetric video. We first provide a general framework of volumetric video services, followed by a discussion on prerequisites for volumetric video, encompassing representations, open datasets, and quality assessment metrics. Then we delve into the current methodologies for each stage of the volumetric video service pipeline, detailing capturing, compression, transmission, rendering, and display techniques. Lastly, we explore various applications enabled by this pioneering technology and we present an array of research challenges and opportunities in the domain of volumetric video services. This survey aspires to provide a holistic understanding of this burgeoning field and shed light on potential future research trajectories, aiming to bring the vision of volumetric video to fruition. Comment: Submitte

    Deformed Reality

    We present Deformed Reality, a new way of interacting with an augmented reality environment by manipulating 3D objects in an intuitive and physically-consistent manner. Using the core principle of augmented reality to estimate rigid pose over time, our method makes it possible for the user to deform the targeted object while it is being rendered with its natural texture, giving the sense of an interactive scene editing. Our framework follows a computationally efficient pipeline that uses a proxy CAD model for pose computation, physically-based manipulations, and scene appearance estimation. The final composition is built upon a continuous image completion and re-texturing process to preserve visual consistency. The presented results show that our method can open new ways of using augmented reality by not only augmenting the environment but also interacting with objects intuitively.

    Modeling and Mapping Location-Dependent Human Appearance

    Human appearance is highly variable and depends on individual preferences, such as fashion, facial expression, and makeup. These preferences depend on many factors including a person's sense of style, what they are doing, and the weather. These factors, in turn, are dependent upon geographic location and time. In our work, we build computational models to learn the relationship between human appearance, geographic location, and time. The primary contributions are a framework for collecting and processing geotagged imagery of people, a large dataset collected by our framework, and several generative and discriminative models that use our dataset to learn the relationship between human appearance, location, and time. Additionally, we build interactive maps that allow for inspection and demonstration of what our models have learned.