76 research outputs found
PDO-eCNNs: Partial Differential Operator Based Equivariant Spherical CNNs
Spherical signals exist in many applications, e.g., planetary data, LiDAR
scans and digitalization of 3D objects, calling for models that can process
spherical data effectively. Simply projecting spherical data onto the 2D plane
and then applying planar convolutional neural networks (CNNs) performs poorly,
because of the distortion from projection and ineffective translation
equivariance. Two good principles for designing spherical CNNs are to avoid
distortions and to convert the shift equivariance property of planar CNNs into
rotation equivariance in the spherical domain. In this work, we
use partial differential operators (PDOs) to design a spherical equivariant
CNN, PDO-eCNN, which is exactly rotation equivariant in the
continuous domain. We then discretize PDO-eCNNs and analyze
the equivariance error resulting from discretization. This is the first time
that the equivariance error is theoretically analyzed in the spherical domain.
In experiments, PDO-eCNNs show greater parameter efficiency
and outperform other spherical CNNs significantly on several tasks.
Comment: Accepted by AAAI202
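For reference, the rotation equivariance property mentioned in this abstract is commonly written as below for the simplest case, where both the input and the output of a layer are scalar functions on the sphere. The symbols $\Phi$, $\tilde\Phi$, $L_R$ and $\varepsilon$ are notation assumed here for illustration; they are not taken from the paper, whose own formulation may differ.

```latex
% Rotation acting on a spherical signal f : S^2 -> R
\[
  (L_R f)(x) = f(R^{-1} x), \qquad R \in SO(3),\; x \in S^2 .
\]
% A layer \Phi is rotation equivariant if rotating the input rotates the output
\[
  \Phi(L_R f) = L_R\, \Phi(f) \quad \text{for all } R \in SO(3) .
\]
% For a discretized layer \tilde\Phi the identity typically holds only
% approximately; one possible measure of the gap is
\[
  \varepsilon(R, f) = \bigl\| \tilde\Phi(L_R f) - L_R\, \tilde\Phi(f) \bigr\| .
\]
```

The planar analogue replaces $L_R$ with a translation, which is the shift equivariance of ordinary CNNs that the abstract refers to.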
Deep representations of structures in the 3D-world
This thesis demonstrates a collection of neural network tools that leverage the structures and symmetries of the 3D-world. We have explored various aspects of a vision system ranging from relative pose estimation to 3D-part decomposition from 2D images. For any vision system, it is crucially important to understand and to resolve visual ambiguities in 3D arising from imaging methods. This thesis has shown that leveraging prior knowledge about the structures and the symmetries of the 3D-world in neural network architectures brings about better representations for ambiguous situations. It helps solve problems which are inherently ill-posed
360° Optical Flow using Tangent Images
Omnidirectional 360° images have found many promising and exciting applications in computer vision, robotics and other fields, thanks to their increasing affordability, portability and their 360° field of view. The most common format for storing, processing and visualising 360° images is equirectangular projection (ERP). However, the distortion introduced by the nonlinear mapping from 360° images to ERP images is still a barrier that holds back ERP images from being used as easily as conventional perspective images. This is especially relevant when estimating 360° optical flow, as the distortions need to be mitigated appropriately. In this paper, we propose a 360° optical flow method based on tangent images. Our method leverages gnomonic projection to locally convert ERP images to perspective images, and uniformly samples the ERP image by projection to a cubemap and regular icosahedron faces, to incrementally refine the estimated 360° flow fields even in the presence of large rotations. Our experiments demonstrate the benefits of our proposed method both quantitatively and qualitatively
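The gnomonic projection mentioned in this abstract can be sketched as follows. This is a minimal illustration using the standard textbook formulas for a unit sphere, with angles in radians; the function names are hypothetical and this is not the authors' implementation.

```python
import math


def gnomonic_forward(lon, lat, lon0, lat0):
    """Project the point (lon, lat) onto the plane tangent at (lon0, lat0)."""
    # cos_c is the cosine of the angular distance to the tangent point.
    cos_c = (math.sin(lat0) * math.sin(lat)
             + math.cos(lat0) * math.cos(lat) * math.cos(lon - lon0))
    if cos_c <= 0.0:
        # Only the hemisphere facing the tangent plane can be projected.
        raise ValueError("point is not on the hemisphere facing the tangent plane")
    x = math.cos(lat) * math.sin(lon - lon0) / cos_c
    y = (math.cos(lat0) * math.sin(lat)
         - math.sin(lat0) * math.cos(lat) * math.cos(lon - lon0)) / cos_c
    return x, y


def gnomonic_inverse(x, y, lon0, lat0):
    """Map tangent-plane coordinates (x, y) back to (lon, lat) on the sphere."""
    rho = math.hypot(x, y)
    if rho == 0.0:
        return lon0, lat0  # the plane origin is the tangent point itself
    c = math.atan(rho)
    lat = math.asin(math.cos(c) * math.sin(lat0)
                    + y * math.sin(c) * math.cos(lat0) / rho)
    lon = lon0 + math.atan2(x * math.sin(c),
                            rho * math.cos(lat0) * math.cos(c)
                            - y * math.sin(lat0) * math.sin(c))
    return lon, lat
```

Distortion grows with distance from the tangent point, which is one reason to cover the sphere with several small tangent images (for example, one per cubemap or icosahedron face) rather than a single wide one.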
360MonoDepth: High-Resolution 360° Monocular Depth Estimation
360° cameras can capture complete environments in a single shot, which
makes 360° imagery alluring in many computer vision tasks. However,
monocular depth estimation remains a challenge for 360° data, particularly
for high resolutions like 2K (2048x1024) and beyond that are important for
novel-view synthesis and virtual reality applications. Current CNN-based
methods do not support such high resolutions due to limited GPU memory. In this
work, we propose a flexible framework for monocular depth estimation from
high-resolution 360° images using tangent images. We project the 360°
input image onto a set of tangent planes that produce perspective views, which
are suitable for the latest, most accurate state-of-the-art perspective
monocular depth estimators. To achieve globally consistent disparity estimates,
we recombine the individual depth estimates using deformable multi-scale
alignment followed by gradient-domain blending. The result is a dense,
high-resolution 360° depth map with a high level of detail, also for
outdoor scenes which are not supported by existing methods. Our source code and
data are available at https://manurare.github.io/360monodepth/.
Comment: CVPR 2022. Project page: https://manurare.github.io/360monodepth
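The first step described in this abstract, projecting an equirectangular image onto a tangent plane to obtain a perspective view, might look roughly like the sketch below. It is an illustration only and not the authors' code. The function name, the nearest-neighbour sampling, and the image layout (longitude 0 at the centre column, latitude +π/2 at the top row, image stored as a nested list) are all assumptions made for the example.

```python
import math


def sample_tangent_image(erp, lon0, lat0, fov, size):
    """Render a size x size perspective view tangent at (lon0, lat0).

    erp  : equirectangular image as a nested list, erp[row][col]
    fov  : field of view of the tangent image in radians (less than pi)
    """
    height, width = len(erp), len(erp[0])
    # Orthonormal basis: d0 points at the tangent point; east and north
    # span the tangent plane.
    d0 = (math.cos(lat0) * math.cos(lon0),
          math.cos(lat0) * math.sin(lon0),
          math.sin(lat0))
    east = (-math.sin(lon0), math.cos(lon0), 0.0)
    north = (-math.sin(lat0) * math.cos(lon0),
             -math.sin(lat0) * math.sin(lon0),
             math.cos(lat0))
    half = math.tan(fov / 2.0)  # half-extent of the plane at unit distance
    out = []
    for i in range(size):
        y = half * (1.0 - 2.0 * (i + 0.5) / size)  # top row looks north
        row = []
        for j in range(size):
            x = half * (2.0 * (j + 0.5) / size - 1.0)
            # Ray through the plane point; its direction gives lon/lat.
            rx, ry, rz = (d0[k] + x * east[k] + y * north[k] for k in range(3))
            lon = math.atan2(ry, rx)
            lat = math.atan2(rz, math.hypot(rx, ry))
            # Assumed equirectangular convention, nearest-neighbour lookup.
            col = int((lon / (2.0 * math.pi) + 0.5) * width) % width
            r = min(max(int((0.5 - lat / math.pi) * height), 0), height - 1)
            row.append(erp[r][col])
        out.append(row)
    return out
```

Each such view can then be passed to an ordinary perspective depth estimator. The alignment and gradient-domain blending steps that the abstract describes for recombining the per-view estimates are not covered by this sketch.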
Virtual Home Staging: Inverse Rendering and Editing an Indoor Panorama under Natural Illumination
We propose a novel inverse rendering method that enables the transformation
of existing indoor panoramas with new indoor furniture layouts under natural
illumination. To achieve this, we captured indoor HDR panoramas along with
real-time outdoor hemispherical HDR photographs. Indoor and outdoor HDR images
were linearly calibrated with measured absolute luminance values for accurate
scene relighting. Our method consists of three key components: (1) panoramic
furniture detection and removal, (2) automatic floor layout design, and (3)
global rendering with scene geometry, new furniture objects, and a real-time
outdoor photograph. We demonstrate the effectiveness of our workflow in
rendering indoor scenes under different outdoor illumination conditions.
Additionally, we contribute a new calibrated HDR (Cali-HDR) dataset that
consists of 137 calibrated indoor panoramas and their associated outdoor
photographs