32 research outputs found
Roadmap on 3D integral imaging: Sensing, processing, and display
This Roadmap article on three-dimensional integral imaging provides an overview of some of the research activities in the field of integral imaging. The article discusses various aspects of the field including sensing of 3D scenes, processing of captured information, and 3D display and visualization of information. The paper consists of a series of 15 sections from the experts presenting various aspects of the field on sensing, processing, displays, augmented reality, microscopy, object recognition, and other applications. Each section represents the vision of its author to describe the progress, potential, vision, and challenging issues in this field
Efficient training procedures for multi-spectral demosaicing
The simultaneous acquisition of multi-spectral images on a single sensor can be efficiently performed by single shot capture using a mutli-spectral filter array. This paper focused on the demosaicing of color and near-infrared bands and relied on a convolutional neural network (CNN). To train the deep learning model robustly and accurately, it is necessary to provide enough training data, with sufficient variability. We focused on the design of an efficient training procedure by discovering an optimal training dataset. We propose two data selection strategies, motivated by slightly different concepts. The general term that will be used for the proposed models trained using data selection is data selection-based multi-spectral demosaicing (DSMD). The first idea is clustering-based data selection (DSMD-C), with the goal to discover a representative subset with a high variance so as to train a robust model. The second is an adaptive-based data selection (DSMD-A), a self-guided approach that selects new data based on the current model accuracy. We performed a controlled experimental evaluation of the proposed training strategies and the results show that a careful selection of data does benefit the speed and accuracy of training. We are still able to achieve high reconstruction accuracy with a lightweight model
Polarimetric Multi-View Inverse Rendering
A polarization camera has great potential for 3D reconstruction since the
angle of polarization (AoP) and the degree of polarization (DoP) of reflected
light are related to an object's surface normal. In this paper, we propose a
novel 3D reconstruction method called Polarimetric Multi-View Inverse Rendering
(Polarimetric MVIR) that effectively exploits geometric, photometric, and
polarimetric cues extracted from input multi-view color-polarization images. We
first estimate camera poses and an initial 3D model by geometric reconstruction
with a standard structure-from-motion and multi-view stereo pipeline. We then
refine the initial model by optimizing photometric rendering errors and
polarimetric errors using multi-view RGB, AoP, and DoP images, where we propose
a novel polarimetric cost function that enables an effective constraint on the
estimated surface normal of each vertex, while considering four possible
ambiguous azimuth angles revealed from the AoP measurement. The weight for the
polarimetric cost is effectively determined based on the DoP measurement, which
is regarded as the reliability of polarimetric information. Experimental
results using both synthetic and real data demonstrate that our Polarimetric
MVIR can reconstruct a detailed 3D shape without assuming a specific surface
material and lighting condition.Comment: Paper accepted in IEEE Transactions on Pattern Analysis and Machine
Intelligence (2022). arXiv admin note: substantial text overlap with
arXiv:2007.0883
Quanta Burst Photography
Single-photon avalanche diodes (SPADs) are an emerging sensor technology
capable of detecting individual incident photons, and capturing their
time-of-arrival with high timing precision. While these sensors were limited to
single-pixel or low-resolution devices in the past, recently, large (up to 1
MPixel) SPAD arrays have been developed. These single-photon cameras (SPCs) are
capable of capturing high-speed sequences of binary single-photon images with
no read noise. We present quanta burst photography, a computational photography
technique that leverages SPCs as passive imaging devices for photography in
challenging conditions, including ultra low-light and fast motion. Inspired by
recent success of conventional burst photography, we design algorithms that
align and merge binary sequences captured by SPCs into intensity images with
minimal motion blur and artifacts, high signal-to-noise ratio (SNR), and high
dynamic range. We theoretically analyze the SNR and dynamic range of quanta
burst photography, and identify the imaging regimes where it provides
significant benefits. We demonstrate, via a recently developed SPAD array, that
the proposed method is able to generate high-quality images for scenes with
challenging lighting, complex geometries, high dynamic range and moving
objects. With the ongoing development of SPAD arrays, we envision quanta burst
photography finding applications in both consumer and scientific photography.Comment: A version with better-quality images can be found on the project
webpage: http://wisionlab.cs.wisc.edu/project/quanta-burst-photography
Optical Camera Communications: Principles, Modulations, Potential and Challenges
Optical wireless communications (OWC) are emerging as cost-effective and practical solutions to the congested radio frequency-based wireless technologies. As part of OWC, optical camera communications (OCC) have become very attractive, considering recent developments in cameras and the use of fitted cameras in smart devices. OCC together with visible light communications (VLC) is considered within the framework of the IEEE 802.15.7m standardization. OCCs based on both organic and inorganic light sources as well as cameras are being considered for low-rate transmissions and localization in indoor as well as outdoor short-range applications and within the framework of the IEEE 802.15.7m standardization together with VLC. This paper introduces the underlying principles of OCC and gives a comprehensive overview of this emerging technology with recent standardization activities in OCC. It also outlines the key technical issues such as mobility, coverage, interference, performance enhancement, etc. Future research directions and open issues are also presented
From Capture to Display: A Survey on Volumetric Video
Volumetric video, which offers immersive viewing experiences, is gaining
increasing prominence. With its six degrees of freedom, it provides viewers
with greater immersion and interactivity compared to traditional videos.
Despite their potential, volumetric video services poses significant
challenges. This survey conducts a comprehensive review of the existing
literature on volumetric video. We firstly provide a general framework of
volumetric video services, followed by a discussion on prerequisites for
volumetric video, encompassing representations, open datasets, and quality
assessment metrics. Then we delve into the current methodologies for each stage
of the volumetric video service pipeline, detailing capturing, compression,
transmission, rendering, and display techniques. Lastly, we explore various
applications enabled by this pioneering technology and we present an array of
research challenges and opportunities in the domain of volumetric video
services. This survey aspires to provide a holistic understanding of this
burgeoning field and shed light on potential future research trajectories,
aiming to bring the vision of volumetric video to fruition.Comment: Submitte
Deformed Reality
International audienceWe present Deformed Reality, a new way of interacting with an augmented reality environment by manipulating 3D objects in an intuitive and physically-consistent manner. Using the core principle of augmented reality to estimate rigid pose over time, our method makes it possible for the user to deform the targeted object while it is being rendered with its natural texture, giving the sense of a interactive scene editing. Our framework follows a computationally efficient pipeline that uses a proxy CAD model for both pose computation, physically-based manipulations and scene appearance estimation. The final composition is built upon a continuous image completion and re-texturing process to preserve visual consistency. The presented results show that our method can open new ways of using augmented reality by not only augmenting the environment but also interacting with objects intuitively
Modeling and Mapping Location-Dependent Human Appearance
Human appearance is highly variable and depends on individual preferences, such as fashion, facial expression, and makeup. These preferences depend on many factors including a person\u27s sense of style, what they are doing, and the weather. These factors, in turn, are dependent upon geographic location and time. In our work, we build computational models to learn the relationship between human appearance, geographic location, and time. The primary contributions are a framework for collecting and processing geotagged imagery of people, a large dataset collected by our framework, and several generative and discriminative models that use our dataset to learn the relationship between human appearance, location, and time. Additionally, we build interactive maps that allow for inspection and demonstration of what our models have learned