Perceptual Quality Assessment of Omnidirectional Images as Moving Camera Videos
Omnidirectional images (also referred to as static 360° panoramas)
impose viewing conditions much different from those of regular 2D images. How
humans perceive image distortions in immersive virtual reality (VR)
environments is an important problem that has received relatively little attention. We argue
that, apart from the distorted panorama itself, two types of VR viewing
conditions are crucial in determining the viewing behaviors of users and the
perceived quality of the panorama: the starting point and the exploration time.
We first carry out a psychophysical experiment to investigate the interplay
among the VR viewing conditions, the user viewing behaviors, and the perceived
quality of 360° images. Then, we provide a thorough analysis of the
collected human data, leading to several interesting findings. Moreover, we
propose a computational framework for objective quality assessment of 360°
images, embodying viewing conditions and behaviors in a unified way.
Specifically, we first transform an omnidirectional image to several video
representations using different user viewing behaviors under different viewing
conditions. We then leverage advanced 2D full-reference video quality models to
compute the perceived quality. We construct a set of specific quality measures
within the proposed framework, and demonstrate their promise on three VR
quality databases. Comment: 11 pages, 11 figures, 9 tables. This paper has been accepted by IEEE
Transactions on Visualization and Computer Graphics.
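As a rough illustration of the viewport-video idea described in this abstract, the sketch below renders rectilinear viewports from an equirectangular panorama along a hypothetical scanpath; the resulting frames could then be scored by any off-the-shelf 2D full-reference video quality model. Function and variable names (viewport, scanpath, equi_img) are illustrative assumptions, not the authors' code.

```python
import numpy as np
import cv2

def viewport(equi, lon, lat, fov_deg=90.0, out_hw=(256, 256)):
    """Rectilinear (gnomonic) view of an equirectangular panorama.
    lon/lat give the viewing direction in radians (yaw, pitch)."""
    H, W = equi.shape[:2]
    h, w = out_hw
    f = 0.5 * w / np.tan(0.5 * np.radians(fov_deg))
    xv, yv = np.meshgrid(np.arange(w) - (w - 1) / 2.0,
                         np.arange(h) - (h - 1) / 2.0)
    # Unit rays in camera coordinates (x right, y down, z forward).
    dirs = np.stack([xv, yv, np.full_like(xv, f)], axis=-1)
    dirs /= np.linalg.norm(dirs, axis=-1, keepdims=True)
    # Rotate the rays to the fixation: pitch about x, then yaw about y.
    Rx = cv2.Rodrigues(np.array([-lat, 0.0, 0.0]))[0]
    Ry = cv2.Rodrigues(np.array([0.0, lon, 0.0]))[0]
    dirs = dirs @ (Ry @ Rx).T
    # Convert back to spherical, then to equirectangular pixel coordinates.
    lon_map = np.arctan2(dirs[..., 0], dirs[..., 2])
    lat_map = np.arcsin(np.clip(dirs[..., 1], -1.0, 1.0))
    u = (lon_map / (2.0 * np.pi) + 0.5) * (W - 1)
    v = (lat_map / np.pi + 0.5) * (H - 1)
    return cv2.remap(equi, u.astype(np.float32), v.astype(np.float32),
                     cv2.INTER_LINEAR, borderMode=cv2.BORDER_WRAP)

# A hypothetical scanpath (a list of (yaw, pitch) fixations) turns the panorama
# into a frame sequence for a 2D full-reference video quality model:
# frames = [viewport(equi_img, yaw, pitch) for yaw, pitch in scanpath]
```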
OmniDepth: Dense Depth Estimation for Indoors Spherical Panoramas
Recent work on depth estimation has so far focused only on projective
images, ignoring 360° content, which is now increasingly and more easily produced.
We show that monocular depth estimation models trained on traditional images
produce sub-optimal results on omnidirectional images, showcasing the need for
training directly on 360 datasets, which however, are hard to acquire. In this
work, we circumvent the challenges associated with acquiring high quality 360
datasets with ground truth depth annotations, by re-using recently released
large scale 3D datasets and re-purposing them to 360 via rendering. This
dataset, which is considerably larger than similar projective datasets, is
publicly offered to the community to enable future research in this direction.
We use this dataset to learn in an end-to-end fashion the task of depth
estimation from 360° images. We show promising results on our synthesized data
as well as on unseen realistic images. Comment: Pre-print to appear in ECCV 2018.
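The abstract does not spell out training details; a common ingredient when learning dense depth directly on equirectangular images (an assumption on my part, not a claim about the paper's loss) is to weight per-pixel errors by solid angle, since rows near the poles are heavily oversampled by the projection. A minimal PyTorch-style sketch:

```python
import math
import torch

def spherical_l1_depth_loss(pred, gt):
    """pred, gt: (B, 1, H, W) equirectangular depth maps (shapes are assumptions)."""
    B, _, H, W = pred.shape
    # Latitude of each pixel row: 0 at the equator, +-pi/2 at the poles.
    rows = (torch.arange(H, dtype=pred.dtype, device=pred.device) + 0.5) / H
    lat = (rows - 0.5) * math.pi
    weight = torch.cos(lat).clamp(min=0).view(1, 1, H, 1)  # solid-angle weight per row
    # Weighted mean absolute error over the sphere.
    return (weight * (pred - gt).abs()).sum() / (weight.sum() * B * W)
```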
Self-Supervised Learning of Depth and Camera Motion from 360° Videos
As 360° cameras become prevalent in many autonomous systems (e.g.,
self-driving cars and drones), efficient 360° perception becomes more and
more important. We propose a novel self-supervised learning approach for
predicting the omnidirectional depth and camera motion from a 360° video.
In particular, starting from the SfMLearner, which is designed for cameras with
normal field-of-view, we introduce three key features to process 360°
images efficiently. Firstly, we convert each image from equirectangular
projection to cubic projection in order to avoid image distortion. In each
network layer, we use Cube Padding (CP), which pads intermediate features from
adjacent faces, to avoid image boundaries. Secondly, we propose a novel
"spherical" photometric consistency constraint on the whole viewing sphere. In
this way, no pixel will be projected outside the image boundary which typically
happens in images with normal field-of-view. Finally, rather than naively
estimating six independent camera motions (i.e., naively applying SfM-Learner
to each face on a cube), we propose a novel camera pose consistency loss to
ensure the estimated camera motions reaching consensus. To train and evaluate
our approach, we collect a new PanoSUNCG dataset containing a large amount of
360° videos with ground-truth depth and camera motion. Our approach
achieves state-of-the-art depth prediction and camera motion estimation on
PanoSUNCG with faster inference than equirectangular-based processing. In
real-world indoor videos, our approach can also achieve qualitatively
reasonable depth prediction using a model pre-trained on PanoSUNCG. Comment: ACCV 2018 Oral.
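A minimal sketch of the pose-consistency idea described above: the six per-face motion estimates are pulled toward their mean so the cubemap faces agree on a single camera motion. The (batch, face, 6-DoF) layout is an assumption, not the paper's exact formulation.

```python
import torch

def pose_consistency_loss(face_poses):
    """face_poses: (B, 6, 6) tensor -- batch, cube face, 6-DoF motion (3 rot + 3 trans)."""
    consensus = face_poses.mean(dim=1, keepdim=True)   # average motion over the six faces
    return ((face_poses - consensus) ** 2).mean()      # penalize per-face deviation
```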
Improved Person Detection on Omnidirectional Images with Non-maxima Suppression
We propose a person detector for omnidirectional images, an accurate method to
generate minimal enclosing rectangles around persons. The basic idea is to adapt
the detection performance of a convolutional neural network based
method, namely YOLOv2, to fish-eye images. Our approach combines
a state-of-the-art object detector with highly overlapping image areas as its
regions of interest. This overlap reduces the number of
false negatives. Based on the detector's raw bounding boxes, we refine
overlapping bounding boxes with three approaches: non-maximum suppression, soft
non-maximum suppression, and soft non-maximum suppression with Gaussian
smoothing. The evaluation was done on the PIROPO database and our own annotated
Flat dataset, supplemented with bounding boxes on omnidirectional images. We
achieve an average precision of 64.4% with YOLOv2 for the class person on
PIROPO and 77.6% on Flat, with the soft non-maximum suppression with Gaussian
smoothing tuned for this purpose. Comment: 8 pages, VISAPP 2019.
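A minimal sketch of the Gaussian-smoothed soft non-maximum suppression variant named above, following the standard formulation in which overlapping scores decay by exp(-IoU²/σ) instead of being discarded; box layout and thresholds here are assumptions, not the paper's settings.

```python
import numpy as np

def iou(box, boxes):
    """box: (4,), boxes: (N, 4), both in [x1, y1, x2, y2]."""
    x1 = np.maximum(box[0], boxes[:, 0]); y1 = np.maximum(box[1], boxes[:, 1])
    x2 = np.minimum(box[2], boxes[:, 2]); y2 = np.minimum(box[3], boxes[:, 3])
    inter = np.clip(x2 - x1, 0, None) * np.clip(y2 - y1, 0, None)
    area = lambda b: (b[..., 2] - b[..., 0]) * (b[..., 3] - b[..., 1])
    return inter / (area(box) + area(boxes) - inter + 1e-9)

def soft_nms_gaussian(boxes, scores, sigma=0.5, score_thresh=0.001):
    """Return indices of kept boxes; scores of overlapping boxes are decayed."""
    boxes, scores = boxes.copy(), scores.copy()
    keep, idx = [], np.arange(len(scores))
    while idx.size > 0:
        best = idx[np.argmax(scores[idx])]
        keep.append(best)
        idx = idx[idx != best]
        # Gaussian decay of the remaining scores instead of hard suppression.
        ious = iou(boxes[best], boxes[idx])
        scores[idx] *= np.exp(-(ious ** 2) / sigma)
        idx = idx[scores[idx] > score_thresh]
    return keep
```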
Proceedings of the 1st Workshop on Robotics Challenges and Vision (RCV2013)
Comment: http://compbio.cs.wayne.edu/robotics/rcv2013/proceedings-emb.pd
Learning Compressible 360° Video Isomers
Standard video encoders developed for conventional narrow field-of-view video
are widely applied to 360° video as well, with reasonable results.
However, while this approach commits arbitrarily to a projection of the
spherical frames, we observe that some orientations of a 360° video, once
projected, are more compressible than others. We introduce an approach to
predict the sphere rotation that will yield the maximal compression rate. Given
video clips in their original encoding, a convolutional neural network learns
the association between a clip's visual content and its compressibility at
different rotations of a cubemap projection. Given a novel video, our
learning-based approach efficiently infers the most compressible direction in
one shot, without repeated rendering and compression of the source video. We
validate our idea on thousands of video clips and multiple popular video
codecs. The results show that this untapped dimension of 360° compression
has substantial potential: "good" rotations are typically 8-10% more
compressible than bad ones, and our learning approach can predict them reliably
82% of the time.
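For intuition, here is a brute-force sketch of the search the learned predictor replaces: a pure yaw of an equirectangular frame is just a horizontal circular shift, so one can roll the frame, compress each candidate, and keep the smallest. JPEG size stands in for a real video codec, and the paper's cubemap projection is simplified away; this is not the authors' method.

```python
import numpy as np
import cv2

def most_compressible_yaw(equi_frame, num_rotations=16, quality=90):
    """Try several yaw rotations of one equirectangular frame and return the
    yaw (degrees) whose compressed representation is smallest."""
    H, W = equi_frame.shape[:2]
    best_yaw, best_size = 0.0, float("inf")
    for k in range(num_rotations):
        shift = int(round(k * W / num_rotations))
        rotated = np.roll(equi_frame, shift, axis=1)   # yaw = shift * 360 / W degrees
        ok, buf = cv2.imencode(".jpg", rotated, [cv2.IMWRITE_JPEG_QUALITY, quality])
        if ok and buf.size < best_size:
            best_yaw, best_size = 360.0 * shift / W, buf.size
    return best_yaw, best_size
```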
A dataset of annotated omnidirectional videos for distancing applications
Omnidirectional (or 360°) cameras are acquisition devices that, in the next few years, could have a big impact on video surveillance applications, research, and industry, as they can record a spherical view of a whole environment from every perspective. This paper presents two new contributions to the research community: the CVIP360 dataset, an annotated dataset of 360° videos for distancing applications, and a new method to estimate the distances of objects in a scene from a single 360° image. The CVIP360 dataset includes 16 videos acquired outdoors and indoors, annotated by adding information about the pedestrians in the scene (bounding boxes) and the distances to the camera of some points in the 3D world, using markers placed at fixed and known intervals. The proposed distance estimation algorithm is based on the geometry of the omnidirectional acquisition process and is uncalibrated in practice: the only required parameter is the camera height. The algorithm was tested on the CVIP360 dataset, and empirical results demonstrate that the estimation error is negligible for distancing applications.
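As a rough illustration of the uncalibrated geometry described above, a ground point imaged at angle phi below the horizon lies at distance h / tan(phi), with h the camera height. The conventions below (horizon at the middle image row, linear row-to-elevation mapping) are my assumptions, not the paper's exact derivation.

```python
import math

def ground_distance(row, image_height, camera_height_m):
    """Distance to a ground point imaged at `row` (0 = top) in an
    equirectangular image, assuming the horizon sits at image_height / 2."""
    phi = math.pi * (row / image_height - 0.5)   # angle below the horizon
    if phi <= 0:
        raise ValueError("row is at or above the horizon; not a ground point")
    return camera_height_m / math.tan(phi)

# Example: a 1920x960 panorama from a camera 1.6 m above the ground.
# ground_distance(row=680, image_height=960, camera_height_m=1.6) ~= 2.1 m
```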
Long-term experiments with an adaptive spherical view representation for navigation in changing environments
Real-world environments such as houses and offices change over time, meaning that a mobile robot’s map will become out of date. In this work, we introduce a method to update the reference views in a hybrid metric-topological map so that a mobile robot can continue to localize itself in a changing environment. The updating mechanism, based on the multi-store model of human memory, incorporates a spherical metric representation of the observed visual features for each node in the map, which enables the robot to estimate its heading and navigate using multi-view geometry, as well as representing the local 3D geometry of the environment. A series of experiments demonstrates the persistence performance of the proposed system in real changing environments, including an analysis of its long-term stability.
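A very rough sketch of a multi-store update rule in the spirit of the memory model mentioned above: features that are re-observed are promoted from a short-term to a long-term store, while features that go unseen decay and are eventually forgotten. All thresholds, counters, and names are assumptions, not the paper's mechanism.

```python
def update_reference_view(short_term, long_term, observed_ids,
                          promote_after=3, forget_after=5):
    """short_term: feature id -> consecutive observation count.
    long_term: feature id -> consecutive miss count (hypothetical bookkeeping)."""
    for fid in observed_ids:
        if fid in long_term:
            long_term[fid] = 0                       # re-observed: reset miss counter
        else:
            short_term[fid] = short_term.get(fid, 0) + 1
            if short_term[fid] >= promote_after:     # seen often enough: promote
                long_term[fid] = 0
                del short_term[fid]
    for fid in list(long_term):
        if fid not in observed_ids:
            long_term[fid] += 1                      # count consecutive misses
            if long_term[fid] >= forget_after:       # stale: forget the feature
                del long_term[fid]
    return short_term, long_term
```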
WoodScape: A multi-task, multi-camera fisheye dataset for autonomous driving
Fisheye cameras are commonly employed for obtaining a large field of view in
surveillance, augmented reality and in particular automotive applications. In
spite of their prevalence, there are few public datasets for detailed
evaluation of computer vision algorithms on fisheye images. We release the
first extensive fisheye automotive dataset, WoodScape, named after Robert Wood
who invented the fisheye camera in 1906. WoodScape comprises four surround-view
cameras and nine tasks including segmentation, depth estimation, 3D
bounding box detection, and soiling detection. Semantic annotation of 40 classes
at the instance level is provided for over 10,000 images, and annotations for
the other tasks are provided for over 100,000 images. With WoodScape, we would like
to encourage the community to adapt computer vision models for fisheye cameras
instead of using naive rectification. Comment: Accepted for Oral Presentation at the IEEE International Conference on
Computer Vision (ICCV) 2019. Please refer to our website
https://woodscape.valeo.com and https://github.com/valeoai/woodscape for
release status and updates.