Deep Learning for Vanishing Point Detection Using an Inverse Gnomonic Projection
We present a novel approach for vanishing point detection from uncalibrated
monocular images. In contrast to state-of-the-art methods, we make no a priori
assumptions about the observed scene. Our method is based on a convolutional
neural network (CNN) which does not use natural images, but a Gaussian sphere
representation arising from an inverse gnomonic projection of lines detected in
an image. This allows us to rely on synthetic data for training, eliminating
the need for labelled images. Our method achieves competitive performance on
three horizon estimation benchmark datasets. We further highlight some
additional use cases for which our vanishing point detection algorithm can be
used.
Comment: Accepted for publication at the German Conference on Pattern Recognition
(GCPR) 2017. This research was supported by German Research Foundation DFG
within Priority Research Programme 1894 "Volunteered Geographic Information:
Interpretation, Visualisation and Social Computing
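The geometry behind the Gaussian sphere representation mentioned in this abstract can be illustrated with a minimal sketch. It is not the authors' pipeline (which feeds a sphere image to a CNN); it only shows how an image line maps to a great circle under an inverse gnomonic projection, and how two such circles intersect at a vanishing direction. The focal length `f`, the coordinate convention (points given relative to the principal point) and all function names are assumptions made for illustration.

```python
import math

def cross(a, b):
    """Cross product of two 3-vectors."""
    return (a[1] * b[2] - a[2] * b[1],
            a[2] * b[0] - a[0] * b[2],
            a[0] * b[1] - a[1] * b[0])

def normalize(v):
    """Scale a 3-vector to unit length."""
    n = math.sqrt(sum(c * c for c in v))
    return tuple(c / n for c in v)

def to_sphere(x, y, f=1.0):
    """Inverse gnomonic projection: image point on the plane z = f -> unit sphere.
    f is an assumed focal length; the images in the abstract are uncalibrated."""
    return normalize((x, y, f))

def great_circle_normal(p, q, f=1.0):
    """An image line through p and q maps to a great circle; return its unit normal."""
    return normalize(cross(to_sphere(p[0], p[1], f), to_sphere(q[0], q[1], f)))

def vanishing_direction(line_a, line_b, f=1.0):
    """Intersect the great circles of two image lines (defined up to sign)."""
    n1 = great_circle_normal(line_a[0], line_a[1], f)
    n2 = great_circle_normal(line_b[0], line_b[1], f)
    return normalize(cross(n1, n2))

# Two image lines that meet at the image point (2, 0).
line_a = ((0.0, 1.0), (1.0, 0.5))
line_b = ((0.0, -1.0), (1.0, -0.5))
d = vanishing_direction(line_a, line_b)
# Gnomonic projection back onto the image plane z = f = 1.
vp = (d[0] / d[2], d[1] / d[2])
```

Dividing by `d[2]` removes the sign ambiguity of the direction; a direction with `d[2]` close to zero corresponds to a vanishing point at infinity, which the sphere represents without difficulty while the image plane cannot.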
3D Scene Geometry Estimation from 360 Imagery: A Survey
This paper provides a comprehensive survey of pioneering and state-of-the-art 3D
scene geometry estimation methodologies based on single, two, or multiple
images captured with omnidirectional optics. We first revisit the basic
concepts of the spherical camera model, and review the most common acquisition
technologies and representation formats suitable for omnidirectional (also
called 360, spherical or panoramic) images and videos. We then survey
monocular layout and depth inference approaches, highlighting the recent
advances in learning-based solutions suited for spherical data. Classical
stereo matching is then revisited in the spherical domain, where methodologies
for detecting and describing sparse and dense features become crucial. The
stereo matching concepts are then extrapolated for multiple view camera setups,
categorizing them among light fields, multi-view stereo, and structure from
motion (or visual simultaneous localization and mapping). We also compile and
discuss commonly adopted datasets and figures of merit indicated for each
purpose and list recent results for completeness. We conclude this paper by
pointing out current and future trends.
Comment: Published in ACM Computing Survey
The Visual Social Distancing Problem
One of the main and most effective measures to contain the recent viral
outbreak is the maintenance of the so-called Social Distancing (SD). To comply
with this constraint, workplaces, public institutions, transport and schools
will likely adopt restrictions over the minimum inter-personal distance between
people. Given this current scenario, it is crucial to measure compliance with
this physical constraint at scale, in order to identify the reasons for
possible breaches of the distance limits and to understand whether they pose
a threat given the scene context. All of this must comply with privacy
policies and keep the measurement acceptable. To this end, we
introduce the Visual Social Distancing (VSD) problem, defined as the automatic
estimation of the inter-personal distance from an image, and the
characterization of the related people aggregations. VSD is pivotal for a
non-invasive analysis of whether people comply with the SD restriction, and to
provide statistics about the level of safety of specific areas whenever this
constraint is violated. We then discuss how VSD relates to previous
literature in Social Signal Processing and indicate which existing Computer
Vision methods can be used to address this problem. We conclude with future
challenges related to the effectiveness of VSD systems, ethical implications
and future application scenarios.
Comment: 9 pages, 5 figures. All the authors contributed equally to this
manuscript and are listed in alphabetical order. Under submissio
Single Image Human Proxemics Estimation for Visual Social Distancing
In this work, we address the problem of estimating the so-called "Social
Distancing" given a single uncalibrated image in unconstrained scenarios. Our
approach proposes a semi-automatic solution to approximate the homography
matrix between the scene ground and image plane. With the estimated homography,
we then leverage an off-the-shelf pose detector to detect body poses on the
image and to reason upon their inter-personal distances using the length of
their body-parts. Inter-personal distances are further locally inspected to
detect possible violations of the social distancing rules. We validate our
proposed method quantitatively and qualitatively against baselines on public
domain datasets for which we provided ground truth on inter-personal distances.
In addition, we demonstrate our method deployed in a real
testing scenario where statistics on the inter-personal distances are currently
used to improve the safety in a critical environment.
Comment: Paper accepted at WACV 2021 conferenc
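The role of the ground-plane homography in this abstract can be sketched as follows. This is an illustration under stated assumptions, not the paper's method: it omits the semi-automatic homography estimation and the body-part-length reasoning, and only shows how image points are mapped to the ground plane and compared against a distance threshold. The matrix `H`, the 2.0 threshold, the foot positions and the function names are all hypothetical.

```python
import math
from itertools import combinations

def apply_homography(H, pt):
    """Map an image point (x, y) to ground-plane coordinates with a 3x3 homography."""
    x, y = pt
    u = H[0][0] * x + H[0][1] * y + H[0][2]
    v = H[1][0] * x + H[1][1] * y + H[1][2]
    w = H[2][0] * x + H[2][1] * y + H[2][2]
    return (u / w, v / w)

def ground_distance(H, a, b):
    """Euclidean distance on the ground plane between two image points."""
    return math.dist(apply_homography(H, a), apply_homography(H, b))

def violations(H, feet, min_dist=2.0):
    """Return index pairs of people standing closer than min_dist on the ground."""
    return [(i, j)
            for (i, a), (j, b) in combinations(enumerate(feet), 2)
            if ground_distance(H, a, b) < min_dist]

# Hypothetical homography: a pure scaling of 1 pixel = 0.01 ground units.
H = [[0.01, 0.0, 0.0],
     [0.0, 0.01, 0.0],
     [0.0, 0.0, 1.0]]
# Hypothetical foot positions (pixels) of three detected people.
feet = [(100, 100), (200, 100), (500, 100)]
close_pairs = violations(H, feet, min_dist=2.0)
```

With a real perspective homography the bottom row of `H` is not `[0, 0, 1]`, so the same pixel gap corresponds to different ground distances depending on where in the image it occurs; the division by `w` accounts for that.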
Exploiting Structural Regularities and Beyond: Vision-based Localization and Mapping in Man-Made Environments
Image-based estimation of camera motion, known as visual odometry
(VO), plays a very important role in many robotic applications
such as control and navigation of unmanned mobile robots,
especially when no external navigation reference signal is
available. The core problem of VO is the estimation of the
camera’s ego-motion (i.e. tracking) either between successive
frames, namely relative pose estimation, or with respect to a
global map, namely absolute pose estimation. This thesis aims to
develop efficient, accurate and robust VO solutions by taking
advantage of structural regularities in man-made environments,
such as piece-wise planar structures, Manhattan World and more
generally, contours and edges. Furthermore, to handle challenging
scenarios that are beyond the limits of classical sensor-based VO
solutions, we investigate a recently emerging sensor, the event
camera, and study event-based mapping, one of the key
problems in event-based VO/SLAM. The main achievements are
summarized as follows.
First, we revisit an old topic on relative pose estimation:
accurately and robustly estimating the fundamental matrix given a
collection of independently estimated homographies. Three
classical methods are reviewed, and we then show that a simple but
nontrivial two-step normalization within the direct linear method
achieves performance similar to the less attractive and more
computationally intensive hallucinated-points-based method.
Second, an efficient 3D rotation estimation algorithm for depth
cameras in piece-wise planar environments is presented. It is shown
that, using surface normal vectors as input, planar modes in
the corresponding density distribution function can be discovered
and continuously tracked using efficient non-parametric estimation
techniques. The relative rotation can then be estimated by registering
entire bundles of planar modes using robust L1-norm minimization.
Third, an efficient alternative to the iterative closest point
algorithm for real-time tracking of modern depth cameras in
Manhattan Worlds is developed. We exploit the common orthogonal
structure of man-made environments in order to decouple the
estimation of the rotation and the three degrees of freedom of
the translation. The derived camera orientation is absolute and
thus free of long-term drift, which in turn benefits the accuracy
of the translation estimation as well.
Fourth, we look into a more general structural
regularity: edges. A real-time VO system that uses Canny edges
is proposed for RGB-D cameras. Two novel alternatives to
classical distance transforms are developed, with properties
that significantly improve on classical Euclidean distance field
based methods in terms of efficiency, accuracy and robustness.
Finally, to deal with challenging scenarios that go beyond what
standard RGB/RGB-D cameras can handle, we investigate the
recently emerging event camera and focus on the problem of 3D
reconstruction from data captured by a stereo event-camera rig
moving in a static scene, such as in the context of stereo
Simultaneous Localization and Mapping
The Toulouse Vanishing Points Dataset
In this paper we present the Toulouse Vanishing Points Dataset, a public photograph database of Manhattan scenes taken with an iPad Air 1. The purpose of this dataset is the evaluation of vanishing point estimation algorithms. Its originality is the addition of Inertial Measurement Unit (IMU) data, synchronized with the camera, in the form of rotation matrices. Moreover, contrary to existing works, which provide reference vanishing points in the form of single points, we computed uncertainty regions. The Toulouse Vanishing Points Dataset is publicly available at http://ubee.enseeiht.fr/tvp