4 research outputs found
Object Detection in Equirectangular Panorama
We introduce a high-resolution equirectangular panorama (360-degree, virtual
reality) dataset for object detection and propose a multi-projection variant of
the YOLO detector. The main challenges with equirectangular panorama images are i)
the lack of annotated training data, ii) high-resolution imagery and iii)
severe geometric distortions of objects near the panorama projection poles. In
this work, we solve the challenges by i) using training examples available in
the "conventional datasets" (ImageNet and COCO), ii) employing only
low-resolution images that require only moderate GPU computing power and
memory, and iii) handling projection distortions in our multi-projection YOLO by
making multiple stereographic sub-projections. In our experiments, YOLO
outperforms the other state-of-the-art detector, Faster R-CNN, and our
multi-projection YOLO achieves the best accuracy with low-resolution input.
Comment: 6 page
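The stereographic sub-projection idea can be illustrated with a minimal sketch. It assumes a unit sphere and the textbook stereographic formulas; the function names and the choice of tangent point (`lon0`, `lat0`) are hypothetical and are not taken from the paper.

```python
import math

def equirect_pixel_to_lonlat(u, v, width, height):
    """Map a pixel position (u, v) in an equirectangular image to
    longitude in [-pi, pi] and latitude in [-pi/2, pi/2] (radians)."""
    lon = ((u + 0.5) / width - 0.5) * 2.0 * math.pi
    lat = (0.5 - (v + 0.5) / height) * math.pi
    return lon, lat

def stereographic(lon, lat, lon0=0.0, lat0=0.0):
    """Stereographic projection of a sphere point onto the plane tangent
    at (lon0, lat0). The tangent point is an illustrative assumption."""
    cos_c = (math.sin(lat0) * math.sin(lat)
             + math.cos(lat0) * math.cos(lat) * math.cos(lon - lon0))
    if cos_c <= -1.0 + 1e-12:
        raise ValueError("the antipode of the tangent point has no image")
    k = 2.0 / (1.0 + cos_c)  # scale factor, grows away from the tangent point
    x = k * math.cos(lat) * math.sin(lon - lon0)
    y = k * (math.cos(lat0) * math.sin(lat)
             - math.sin(lat0) * math.cos(lat) * math.cos(lon - lon0))
    return x, y
```

Evaluating `stereographic` for several tangent points gives several locally low-distortion views of the same panorama, which is the general idea behind running a detector on multiple sub-projections.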
Automatic Content-aware Projection for 360° Videos
To watch 360° videos on normal 2D displays, we need to project the
selected part of the 360° image onto the 2D display plane. In this paper,
we propose a fully-automated framework for generating content-aware 2D
normal-view perspective videos from 360° videos. In particular, we focus on
the projection step, preserving important image content and reducing image
distortion. Our projection method is based on the Pannini projection
model. First, salient content such as linear structures and salient
regions in the image is preserved by optimizing a single Pannini projection
model. Then, multiple Pannini projection models at salient regions are
interpolated to suppress image distortion globally. Finally, temporal
consistency of the image projection is enforced to produce temporally stable
normal-view videos. Our proposed projection method does not require any
user interaction and is much faster than previous content-preserving methods.
It can be applied not only to images but also to videos, taking the temporal
consistency of projection into account. Experiments on various 360° videos
show the superiority of the proposed projection method quantitatively and
qualitatively.
Comment: Accepted to International Conference on Computer Vision (ICCV), 201
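For readers unfamiliar with the Pannini projection, the following sketch shows the commonly used general form with a single distance parameter `d`. It is an assumption that this form resembles the model optimized in the paper; the function name and default value are illustrative only.

```python
import math

def pannini(lon, lat, d=1.0):
    """General Pannini projection of a viewing direction.

    lon, lat: longitude and latitude in radians, relative to the view centre.
    d: projection distance parameter (d = 0 gives the rectilinear
       perspective projection, d = 1 the classic Pannini projection).
    """
    denom = d + math.cos(lon)
    if denom <= 0.0:
        raise ValueError("direction is outside the projectable field of view")
    s = (d + 1.0) / denom
    return s * math.sin(lon), s * math.tan(lat)
```

With `d = 0` the horizontal coordinate reduces to `tan(lon)`, the ordinary perspective mapping, while larger `d` compresses the periphery horizontally and keeps vertical lines straight.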
Snap Angle Prediction for 360° Panoramas
360° panoramas are a rich medium, yet notoriously difficult to
visualize in the 2D image plane. We explore how intelligent rotations of a
spherical image may enable content-aware projection with fewer perceptible
distortions. Whereas existing approaches assume the viewpoint is fixed,
intuitively some viewing angles within the sphere preserve high-level objects
better than others. To discover the relationship between these optimal snap
angles and the spherical panorama's content, we develop a reinforcement
learning approach for the cubemap projection model. Implemented as a deep
recurrent neural network, our method selects a sequence of rotation actions and
receives reward for avoiding cube boundaries that overlap with important
foreground objects. We show our approach creates more visually pleasing
panoramas while using 5x less computation than the baseline.
Comment: ECCV 201
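The geometric effect of a rotation on a cubemap can be sketched as follows. This only illustrates why the viewing angle matters; it is not the reinforcement learning method from the abstract, and the axis convention, face labels and `edge_margin` score are hypothetical.

```python
import math

def direction(lon, lat):
    """Unit vector for a longitude/latitude pair (y is up, +z is forward)."""
    return (math.cos(lat) * math.sin(lon),
            math.sin(lat),
            math.cos(lat) * math.cos(lon))

def rotate_yaw(vec, angle):
    """Rotate a direction about the vertical axis; adds `angle` to longitude."""
    x, y, z = vec
    c, s = math.cos(angle), math.sin(angle)
    return (c * x + s * z, y, -s * x + c * z)

def cube_face(vec):
    """Label of the cube face that a direction falls on."""
    x, y, z = vec
    ax, ay, az = abs(x), abs(y), abs(z)
    if ax >= ay and ax >= az:
        return "+x" if x > 0 else "-x"
    if ay >= az:
        return "+y" if y > 0 else "-y"
    return "+z" if z > 0 else "-z"

def edge_margin(vec):
    """Score in [0, 1]: 0 on a cube edge, 1 at the centre of a face."""
    comps = sorted(abs(c) for c in vec)
    return (comps[2] - comps[1]) / comps[2]
```

An object at longitude 45° sits on the boundary between two faces (`edge_margin` near 0), and a yaw rotation of -45° moves it to the centre of the front face. A snap angle predictor searches for rotations with this effect on important foreground objects.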
Wide-angle Image Rectification: A Survey
Wide field-of-view (FOV) cameras, which capture a larger scene area than
narrow FOV cameras, are used in many applications including 3D reconstruction,
autonomous driving, and video surveillance. However, wide-angle images contain
distortions that violate the assumptions underlying pinhole camera models,
resulting in object distortion, difficulties in estimating scene distance,
area, and direction, and preventing the use of off-the-shelf deep models
trained on undistorted images for downstream computer vision tasks. Image
rectification, which aims to correct these distortions, can solve these
problems. In this paper, we comprehensively survey progress in wide-angle image
rectification from transformation models to rectification methods.
Specifically, we first present a detailed description and discussion of the
camera models used in different approaches. Then, we summarize several
distortion models including radial distortion and projection distortion. Next,
we review both traditional geometry-based image rectification methods and deep
learning-based methods, where the former formulate distortion parameter
estimation as an optimization problem and the latter treat it as a regression
problem by leveraging the power of deep neural networks. We evaluate the
performance of state-of-the-art methods on public datasets and show that
although both kinds of methods can achieve good results, these methods only
work well for specific camera models and distortion types. We also provide a
strong baseline model and carry out an empirical study of different distortion
models on synthetic datasets and real-world wide-angle images. Finally, we
discuss several potential research directions that are expected to further
advance this area in the future.
Comment: Accepted by the International Journal of Computer Vision (IJCV). Both
the datasets and source code are available at
https://github.com/loong8888/WAI
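As a concrete example of a radial distortion model, the sketch below uses the widely known even-order polynomial model in normalized image coordinates and inverts it by fixed-point iteration. Treating this as representative of the models covered by the survey is an assumption; the function names, coefficients and iteration count are illustrative.

```python
def distort(x, y, k1, k2=0.0):
    """Apply polynomial radial distortion to an undistorted point (x, y):
    scale by 1 + k1*r^2 + k2*r^4, where r is the distance from the centre."""
    r2 = x * x + y * y
    factor = 1.0 + k1 * r2 + k2 * r2 * r2
    return x * factor, y * factor

def undistort(xd, yd, k1, k2=0.0, iterations=20):
    """Invert distort() by fixed-point iteration, starting from the
    distorted point. Converges for mild distortion; strong distortion
    may need a more robust solver."""
    x, y = xd, yd
    for _ in range(iterations):
        r2 = x * x + y * y
        factor = 1.0 + k1 * r2 + k2 * r2 * r2
        x, y = xd / factor, yd / factor
    return x, y
```

In this framing, rectification amounts to estimating `k1` and `k2` (by optimization or regression) and then applying `undistort` to every pixel coordinate.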