Adaptive foveated single-pixel imaging with dynamic super-sampling
As an alternative to conventional multi-pixel cameras, single-pixel cameras
enable images to be recorded using a single detector that measures the
correlations between the scene and a set of patterns. However, to fully sample
a scene in this way requires at least the same number of correlation
measurements as there are pixels in the reconstructed image. Therefore
single-pixel imaging systems typically exhibit low frame-rates. To mitigate
this, a range of compressive sensing techniques have been developed which rely
on a priori knowledge of the scene to reconstruct images from an under-sampled
set of measurements. In this work we take a different approach and adopt a
strategy inspired by the foveated vision systems found in the animal kingdom -
a framework that exploits the spatio-temporal redundancy present in many
dynamic scenes. In our single-pixel imaging system a high-resolution foveal
region follows motion within the scene, but unlike a simple zoom, every frame
delivers new spatial information from across the entire field-of-view. Using
this approach we demonstrate a four-fold reduction in the time taken to record
the detail of rapidly evolving features, whilst simultaneously accumulating
detail of more slowly evolving regions over several consecutive frames. This
tiered super-sampling technique enables the reconstruction of video streams in
which both the resolution and the effective exposure-time spatially vary and
adapt dynamically in response to the evolution of the scene. The methods
described here can complement existing compressive sensing approaches and may
be applied to enhance a variety of computational imagers that rely on
sequential correlation measurements.
Comment: 13 pages, 5 figures
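The fully sampled regime the abstract starts from can be sketched numerically. The snippet below uses a Hadamard pattern basis (an assumed but common choice in single-pixel imaging, not specified by the abstract): the single detector records one correlation per pattern, and recording as many patterns as pixels permits exact reconstruction.

```python
import numpy as np

def hadamard(n):
    # Sylvester construction; n must be a power of two.
    H = np.array([[1.0]])
    while H.shape[0] < n:
        H = np.block([[H, H], [H, -H]])
    return H

# Hypothetical 8x8 "scene" flattened to 64 pixels, standing in for the
# physical scene viewed by the detector.
rng = np.random.default_rng(0)
scene = rng.random(64)

# Each row of H is one illumination pattern; the single-pixel detector
# measures one correlation (inner product) per pattern.
H = hadamard(64)
measurements = H @ scene            # 64 measurements for 64 pixels

# Full reconstruction: invert the orthogonal pattern basis
# (H.T @ H = 64 * I for a 64x64 Hadamard matrix).
recon = H.T @ measurements / 64
assert np.allclose(recon, scene)
```

The foveated strategy in the paper changes which patterns are displayed, spending more of this fixed measurement budget on rapidly evolving regions.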
IntrinsicNGP: Intrinsic Coordinate based Hash Encoding for Human NeRF
Recently, many works have been proposed to utilize the neural radiance field
for novel view synthesis of human performers. However, most of these methods
require hours of training, making them difficult to use in practice. To address
this challenging problem, we propose IntrinsicNGP, which can train from scratch
and achieve high-fidelity results in a few minutes with videos of a human
performer. To achieve this target, we introduce a continuous and optimizable
intrinsic coordinate rather than the original explicit Euclidean coordinate in
the hash encoding module of instant-NGP. With this novel intrinsic coordinate,
IntrinsicNGP can aggregate inter-frame information for dynamic objects with the
help of proxy geometry shapes. Moreover, the results trained with the given
rough geometry shapes can be further refined with an optimizable offset field
based on the intrinsic coordinate. Extensive experimental results on several
datasets demonstrate the effectiveness and efficiency of IntrinsicNGP. We also
illustrate our approach's ability to edit the shape of reconstructed subjects.
Comment: Project page: https://ustc3dv.github.io/IntrinsicNGP/. arXiv admin
note: substantial text overlap with arXiv:2210.0165
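The hash-encoding module IntrinsicNGP modifies can be sketched as below. This is a simplified, single-level version of the instant-NGP spatial hash (nearest-vertex lookup only, omitting trilinear interpolation and the multiresolution pyramid, with a random stand-in for the learned table); the point relevant to the paper is that `coords` may be intrinsic coordinates derived from proxy geometry rather than Euclidean positions.

```python
import numpy as np

PRIMES = (1, 2654435761, 805459861)  # spatial-hash primes used by instant-NGP

def hash_encode(coords, table, resolution):
    """Look up a feature vector for each (possibly intrinsic) 3D coordinate.

    coords: (N, 3) array in [0, 1]^3; table: (T, F) feature table.
    Nearest-vertex lookup only; instant-NGP interpolates the 8
    surrounding grid vertices, which is omitted here for brevity.
    """
    T = table.shape[0]
    grid = np.floor(coords * resolution).astype(np.int64)
    idx = np.zeros(len(grid), dtype=np.int64)
    for d, p in enumerate(PRIMES):
        idx ^= grid[:, d] * p        # XOR-combine per-axis hashes
    return table[idx % T]

# Random stand-in for a learned feature table (2^14 entries, 2 features).
rng = np.random.default_rng(1)
table = rng.normal(size=(2**14, 2))
feats = hash_encode(rng.random((5, 3)), table, resolution=16)
```

Because the lookup depends only on the coordinate, frames that map different observations of a moving performer to the same intrinsic coordinate share the same table entries, which is how inter-frame information is aggregated.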
Graph Element Networks: adaptive, structured computation and memory
We explore the use of graph neural networks (GNNs) to model spatial processes
in which there is no a priori graphical structure. Similar to finite element
analysis, we assign nodes of a GNN to spatial locations and use a computational
process defined on the graph to model the relationship between an initial
function defined over a space and a resulting function in the same space. We
use GNNs as a computational substrate, and show that the locations of the nodes
in space as well as their connectivity can be optimized to focus on the most
complex parts of the space. Moreover, this representational strategy allows the
learned input-output relationship to generalize over the size of the underlying
space and run the same model at different levels of precision, trading
computation for accuracy. We demonstrate this method on a traditional PDE
problem, a physical prediction problem from robotics, and learning to predict
scene images from novel viewpoints.
Comment: Accepted to ICML 201
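A single message-passing round of the kind GENs run over spatially placed nodes might look like the following minimal sketch. The graph, node placement, and weight matrices here are illustrative stand-ins (untrained, randomly initialized), not the paper's architecture.

```python
import numpy as np

def gnn_step(node_states, edges, W_msg, W_upd):
    """One round of message passing on a directed spatial graph.

    node_states: (N, D); edges: list of (src, dst) index pairs;
    W_msg, W_upd: stand-ins for learned parameter matrices.
    """
    agg = np.zeros_like(node_states)
    for s, d in edges:
        agg[d] += node_states[s] @ W_msg   # message from node s to node d
    return np.tanh(node_states @ W_upd + agg)

# Nodes assigned to spatial locations; in GENs both the positions and
# the connectivity can be optimized to concentrate capacity where the
# modelled function is most complex.
positions = np.array([[0.0, 0.0], [1.0, 0.0], [0.0, 1.0]])
states = np.eye(3)                         # one-hot initial states
edges = [(0, 1), (1, 0), (1, 2), (2, 1)]
rng = np.random.default_rng(0)
W_msg, W_upd = rng.normal(size=(3, 3)), rng.normal(size=(3, 3))
new_states = gnn_step(states, edges, W_msg, W_upd)
```

Running the same learned step on a denser node placement is what trades extra computation for accuracy.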
Extracting Triangular 3D Models, Materials, and Lighting From Images
We present an efficient method for joint optimization of topology, materials
and lighting from multi-view image observations. Unlike recent multi-view
reconstruction approaches, which typically produce entangled 3D representations
encoded in neural networks, we output triangle meshes with spatially-varying
materials and environment lighting that can be deployed in any traditional
graphics engine unmodified. We leverage recent work in differentiable
rendering and coordinate-based networks to compactly represent volumetric
texturing, alongside differentiable marching tetrahedrons to enable
gradient-based optimization directly on the surface mesh. Finally, we introduce
a differentiable formulation of the split sum approximation of environment
lighting to efficiently recover all-frequency lighting. Experiments show our
extracted models used in advanced scene editing, material decomposition, and
high quality view interpolation, all running at interactive rates in
triangle-based renderers (rasterizers and path tracers). Project website:
https://nvlabs.github.io/nvdiffrec/.
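The split sum approximation mentioned above factorizes the lighting integral into a prefiltered-environment term and a preintegrated BRDF term. A toy numerical check illustrates the idea; with a constant environment the factorization is exact, and the error grows as the radiance varies over the hemisphere (the real formulation integrates over directions with roughness-dependent prefiltering).

```python
import numpy as np

# Split sum: E[L(l) * w(l)] is approximated by E[L(l)] * E[w(l)], where
# w(l) bundles the BRDF and cosine terms. Samples below are stand-ins
# for hemisphere directions.
rng = np.random.default_rng(0)
w = rng.random(10_000)                   # stand-in BRDF * (n . l) samples

L_const = np.full_like(w, 2.0)           # constant environment radiance
full = np.mean(L_const * w)              # joint (reference) integral
split = np.mean(L_const) * np.mean(w)    # split-sum factorization
assert np.isclose(full, split)           # exact for constant lighting
```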
Development of a calibration pipeline for a monocular-view structured illumination 3D sensor utilizing an array projector
Commercial off-the-shelf digital projection systems are commonly used in active structured illumination photogrammetry of macro-scale surfaces due to their relatively low cost, accessibility, and ease of use. They can be described with an inverse pinhole model, and the calibration pipeline for a 3D sensor utilizing pinhole devices in a projector-camera configuration is already well-established. Recently, there have been advances in projection systems offering projection speeds greater than those available from conventional off-the-shelf digital projectors. However, these systems cannot be calibrated using well-established techniques based on the pinhole assumption, as they are chip-less and have no projection lens. This work utilizes such unconventional projection systems, known as array projectors, which contain not one but multiple projection channels that project a temporal sequence of illumination patterns. None of the channels implements a digital projection chip or a projection lens. To work around the calibration problem, previous realizations of a 3D sensor based on an array projector required a stereo-camera setup, with triangulation taking place between the two pinhole-modelled cameras instead. However, a monocular setup is desirable, as a single-camera configuration reduces cost, weight, and form-factor. This study presents a novel calibration pipeline that realizes a single-camera setup. A generalized intrinsic calibration process without model assumptions was developed that directly samples the illumination frustum of each array projection channel. An extrinsic calibration process was then created that determines the pose of the single camera through a downhill simplex optimization initialized by particle swarm. Lastly, a method to store the intrinsic calibration with the aid of an easily realizable calibration jig was developed for re-use in arbitrary measurement camera positions, so that the intrinsic calibration does not have to be repeated.
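The extrinsic step described above, a downhill simplex search seeded by a particle swarm, can be sketched as follows. The cost function, search bounds, and swarm hyperparameters here are illustrative stand-ins (a simple quadratic bowl), not the pipeline's actual reprojection objective.

```python
import numpy as np
from scipy.optimize import minimize

TRUE_POSE = np.array([0.5, -1.2, 2.0])   # hypothetical ground-truth pose

def pose_cost(pose):
    # Stand-in for the camera-pose cost; the real pipeline would compare
    # observed against predicted illumination-frustum samples.
    return float(np.sum((pose - TRUE_POSE) ** 2))

def particle_swarm(f, dim, n=30, iters=100, seed=0):
    """Minimal global-best PSO, used only to seed the simplex search."""
    rng = np.random.default_rng(seed)
    x = rng.uniform(-5.0, 5.0, (n, dim))
    v = np.zeros((n, dim))
    pbest, pcost = x.copy(), np.array([f(p) for p in x])
    gbest = pbest[pcost.argmin()]
    for _ in range(iters):
        r1, r2 = rng.random((2, n, dim))
        v = 0.7 * v + 1.5 * r1 * (pbest - x) + 1.5 * r2 * (gbest - x)
        x = x + v
        cost = np.array([f(p) for p in x])
        better = cost < pcost
        pbest[better], pcost[better] = x[better], cost[better]
        gbest = pbest[pcost.argmin()]
    return gbest

# Global PSO pass finds a coarse pose; Nelder-Mead (downhill simplex)
# then refines it locally.
x0 = particle_swarm(pose_cost, dim=3)
res = minimize(pose_cost, x0, method="Nelder-Mead")
```

Seeding the simplex with a swarm result is a common pairing: PSO explores the multimodal pose space globally, while the derivative-free simplex polishes the final estimate.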