OmniZoomer: Learning to Move and Zoom in on Sphere at High-Resolution
Omnidirectional images (ODIs) have become increasingly popular, as their
large field-of-view (FoV) can offer viewers the chance to freely choose the
view directions in immersive environments such as virtual reality. The Möbius
transformation is typically employed to provide further opportunities for
movement and zoom on ODIs, but applying it at the image level often results in
blurry effects and aliasing problems. In this paper, we propose a novel deep
learning-based approach, called OmniZoomer, to incorporate the
Möbius transformation into the network for movement and zoom on ODIs. By
learning various transformed feature maps under different conditions, the
network is enhanced to handle the increased edge curvature, which alleviates
blurring. Moreover, to address the aliasing problem, we propose two
key components. Firstly, to compensate for the lack of pixels for describing
curves, we enhance the feature maps in the high-resolution (HR) space and
calculate the transformed index map with a spatial index generation module.
Secondly, considering that ODIs are inherently represented in the spherical
space, we propose a spherical resampling module that combines the index map and
HR feature maps to transform the feature maps for better spherical correlation.
The transformed feature maps are decoded to output a zoomed ODI. Experiments
show that our method can produce HR and high-quality ODIs with the flexibility
to move and zoom in on the object of interest. The project page is available at
http://vlislab22.github.io/OmniZoomer/.
Comment: Accepted by ICCV 2023
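To make the problematic baseline concrete: a Möbius transformation acts on the sphere through stereographic projection, so image-level movement and zoom can be sketched in a few lines. The NumPy sketch below is illustrative only (it is not the OmniZoomer network); the function names, pole conventions, and nearest-neighbour sampling are assumptions, and that sampling step is exactly the aliasing-prone operation the paper avoids by transforming HR feature maps instead.

```python
import numpy as np

def sphere_to_plane(theta, phi):
    # Stereographic projection from the south pole onto the equatorial
    # complex plane; the north pole (theta = 0) maps to the origin.
    return np.tan(theta / 2.0) * np.exp(1j * phi)

def plane_to_sphere(z):
    # Inverse projection: complex plane -> (colatitude, longitude).
    return 2.0 * np.arctan(np.abs(z)), np.angle(z)

def mobius_warp(img, a, b, c, d):
    """Backward-warp an equirectangular image (H x W x C) through the
    Mobius transformation w = (a z + b) / (c z + d), with ad - bc != 0."""
    H, W = img.shape[:2]
    v, u = np.meshgrid(np.arange(H), np.arange(W), indexing="ij")
    theta = (v + 0.5) / H * np.pi                 # colatitude in (0, pi)
    phi = (u + 0.5) / W * 2.0 * np.pi - np.pi     # longitude in [-pi, pi)
    w = sphere_to_plane(theta, phi)
    z = (d * w - b) / (-c * w + a)                # inverse Mobius transform
    theta_s, phi_s = plane_to_sphere(z)
    # Nearest-neighbour lookup into the source image: the image-level
    # resampling whose blur/aliasing OmniZoomer is designed to remove.
    vs = np.clip((theta_s / np.pi * H).astype(int), 0, H - 1)
    us = ((phi_s + np.pi) / (2.0 * np.pi) * W).astype(int) % W
    return img[vs, us]

# Example: a = 2, b = c = 0, d = 1 gives w = 2z, a pure 2x zoom
# towards the north pole:
#   zoomed = mobius_warp(odi, a=2.0, b=0.0, c=0.0, d=1.0)
```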
360MonoDepth: High-Resolution 360° Monocular Depth Estimation
360° cameras can capture complete environments in a single shot, which
makes 360° imagery alluring in many computer vision tasks. However,
monocular depth estimation remains a challenge for 360° data, particularly
for high resolutions like 2K (2048×1024) and beyond that are important for
novel-view synthesis and virtual reality applications. Current CNN-based
methods do not support such high resolutions due to limited GPU memory. In this
work, we propose a flexible framework for monocular depth estimation from
high-resolution 360° images using tangent images. We project the 360°
input image onto a set of tangent planes that produce perspective views, which
are suitable for state-of-the-art perspective
monocular depth estimators. To achieve globally consistent disparity estimates,
we recombine the individual depth estimates using deformable multi-scale
alignment followed by gradient-domain blending. The result is a dense,
high-resolution 360° depth map with a high level of detail, including for
outdoor scenes, which existing methods do not support. Our source code and
data are available at https://manurare.github.io/360monodepth/.
Comment: CVPR 2022. Project page: https://manurare.github.io/360monodepth
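The projection step at the heart of this pipeline can be sketched independently of the alignment and blending stages: each tangent image is obtained by backward-mapping perspective pixels to the sphere with the inverse gnomonic projection and sampling the equirectangular input. A minimal NumPy sketch follows; the function name, default field of view, and nearest-neighbour lookup are illustrative assumptions, not the paper's implementation.

```python
import numpy as np

def tangent_view(erp, lat0, lon0, fov_deg=80.0, size=256):
    """Sample one perspective (gnomonic) tangent image of shape
    (size, size, C) from an equirectangular image `erp` (H x W x C),
    centred at latitude lat0 / longitude lon0 (radians)."""
    H, W = erp.shape[:2]
    half = np.tan(np.radians(fov_deg) / 2.0)
    t = np.linspace(-half, half, size)
    x, y = np.meshgrid(t, -t)                  # tangent-plane coordinates
    rho = np.hypot(x, y)
    c = np.arctan(rho)                         # angular distance from centre
    cos_c, sin_c = np.cos(c), np.sin(c)
    safe_rho = np.where(rho == 0.0, 1.0, rho)  # avoid 0/0 at the centre
    # Inverse gnomonic projection: tangent plane -> sphere.
    lat = np.arcsin(cos_c * np.sin(lat0)
                    + (y * sin_c * np.cos(lat0)) / safe_rho)
    lon = lon0 + np.arctan2(
        x * sin_c,
        rho * np.cos(lat0) * cos_c - y * np.sin(lat0) * sin_c)
    # Sphere -> equirectangular pixel indices (nearest neighbour).
    v = np.clip(((0.5 - lat / np.pi) * H).astype(int), 0, H - 1)
    u = (((lon + np.pi) / (2.0 * np.pi)) * W).astype(int) % W
    return erp[v, u]
```

A full pipeline renders a set of such views (e.g., one per icosahedron face), runs a perspective monocular depth estimator on each, and merges the per-view estimates, in the paper via deformable multi-scale alignment followed by gradient-domain blending.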
3D Scene Geometry Estimation from 360 Imagery: A Survey
This paper provides a comprehensive survey of pioneering and state-of-the-art 3D
scene geometry estimation methodologies based on single, two, or multiple
images captured with omnidirectional optics. We first revisit the basic
concepts of the spherical camera model, and review the most common acquisition
technologies and representation formats suitable for omnidirectional (also
called 360°, spherical, or panoramic) images and videos. We then survey
monocular layout and depth inference approaches, highlighting the recent
advances in learning-based solutions suited for spherical data. Classical
stereo matching is then revisited in the spherical domain, where methodologies
for detecting and describing sparse and dense features become crucial. The
stereo matching concepts are then extended to multi-view camera setups,
which we categorize among light fields, multi-view stereo, and structure from
motion (or visual simultaneous localization and mapping). We also compile and
discuss commonly adopted datasets and figures of merit indicated for each
purpose and list recent results for completeness. We conclude this paper by
pointing out current and future trends.
Comment: Published in ACM Computing Surveys
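As a concrete anchor for the spherical camera model the survey opens with: under the common equirectangular format, each pixel corresponds to a viewing ray on the unit sphere. A minimal sketch of this bidirectional mapping follows; the function names and the axis convention are assumptions for illustration.

```python
import numpy as np

def pixel_to_ray(u, v, W, H):
    """Unit viewing ray for equirectangular pixel (u, v); the x-forward,
    y-left, z-up axis convention is an arbitrary choice here."""
    lon = (u + 0.5) / W * 2.0 * np.pi - np.pi     # longitude in [-pi, pi)
    lat = np.pi / 2.0 - (v + 0.5) / H * np.pi     # latitude in (-pi/2, pi/2)
    return np.stack([np.cos(lat) * np.cos(lon),
                     np.cos(lat) * np.sin(lon),
                     np.sin(lat)], axis=-1)

def ray_to_pixel(d, W, H):
    """Inverse mapping: unit direction(s) d -> continuous pixel coords."""
    lon = np.arctan2(d[..., 1], d[..., 0])
    lat = np.arcsin(np.clip(d[..., 2], -1.0, 1.0))
    u = (lon + np.pi) / (2.0 * np.pi) * W - 0.5
    v = (np.pi / 2.0 - lat) / np.pi * H - 0.5
    return u, v
```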
Coordinate Independent Convolutional Networks -- Isometry and Gauge Equivariant Convolutions on Riemannian Manifolds
Motivated by the vast success of deep convolutional networks, there is a
great interest in generalizing convolutions to non-Euclidean manifolds. A major
complication in comparison to flat spaces is that it is unclear in which
alignment a convolution kernel should be applied on a manifold. The underlying
reason for this ambiguity is that general manifolds do not come with a
canonical choice of reference frames (gauge). Kernels and features therefore
have to be expressed relative to arbitrary coordinates. We argue that the
particular choice of coordinatization should not affect a network's inference
-- it should be coordinate independent. A simultaneous demand for coordinate
independence and weight sharing is shown to result in a requirement on the
network to be equivariant under local gauge transformations (changes of local
reference frames). The ambiguity of reference frames thereby depends on the
G-structure of the manifold, such that the necessary level of gauge
equivariance is prescribed by the corresponding structure group G. Coordinate
independent convolutions are proven to be equivariant w.r.t. those isometries
that are symmetries of the G-structure. The resulting theory is formulated in a
coordinate free fashion in terms of fiber bundles. To exemplify the design of
coordinate independent convolutions, we implement a convolutional network on
the M\"obius strip. The generality of our differential geometric formulation of
convolutional networks is demonstrated by an extensive literature review which
explains a large number of Euclidean CNNs, spherical CNNs and CNNs on general
surfaces as specific instances of coordinate independent convolutions.
Comment: The implementation of orientation independent Möbius convolutions
is publicly available at https://github.com/mauriceweiler/MobiusCNN
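The paper's central requirement can be summarized by the kernel constraint it induces. As a hedged sketch for the special case of a structure group G ≤ O(d) (the paper's general formulation carries an additional determinant factor), weight sharing across the frames of the G-structure forces the template kernel to be G-steerable:

```latex
% G-steerability (gauge equivariance) constraint on a kernel
% K : R^d -> R^{c_out x c_in}, for a structure group G <= O(d);
% rho_in and rho_out are the group representations under which the
% input and output feature fields transform.
\[
  K(g\,v) \;=\; \rho_{\mathrm{out}}(g)\, K(v)\, \rho_{\mathrm{in}}(g)^{-1}
  \qquad \text{for all } g \in G,\ v \in \mathbb{R}^d .
\]
```

Solving this linear constraint for a given pair of representations yields the admissible kernels; for instance, for G = SO(2) with scalar input and output fields it forces isotropic kernels, which illustrates how coordinate independence trades kernel expressivity for well-definedness on the manifold.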