Footprints and Free Space from a Single Color Image
Understanding the shape of a scene from a single color image is a formidable
computer vision task. However, most methods aim to predict the geometry of
surfaces that are visible to the camera, which is of limited use when planning
paths for robots or augmented reality agents. Such agents can only move when
grounded on a traversable surface, which we define as the set of classes which
humans can also walk over, such as grass, footpaths and pavement. Models which
predict beyond the line of sight often parameterize the scene with voxels or
meshes, which can be expensive to use in machine learning frameworks.
We introduce a model to predict the geometry of both visible and occluded
traversable surfaces, given a single RGB image as input. We learn from stereo
video sequences, using camera poses, per-frame depth and semantic segmentation
to form training data, which is used to supervise an image-to-image network. We
train models from the KITTI driving dataset, the indoor Matterport dataset, and
from our own casually captured stereo footage. We find that a surprisingly low
bar for spatial coverage of training scenes is required. We validate our
algorithm against a range of strong baselines, and include an assessment of our
predictions for a path-planning task.
Comment: Accepted to CVPR 2020 as an oral presentation
The Visual Social Distancing Problem
One of the main and most effective measures to contain the recent viral
outbreak is the maintenance of the so-called Social Distancing (SD). To comply
with this constraint, workplaces, public institutions, transports and schools
will likely adopt restrictions over the minimum inter-personal distance between
people. In this scenario, it is crucial to measure compliance with this
physical constraint at scale, in order to identify the reasons behind
violations of the distance limit and to determine whether such violations pose
a genuine threat given the scene context, all while complying with privacy
policies and keeping the measurement socially acceptable. To this end, we
introduce the Visual Social Distancing (VSD) problem, defined as the automatic
estimation of the inter-personal distance from an image, and the
characterization of the related people aggregations. VSD is pivotal for a
non-invasive analysis of whether people comply with the SD restriction, and to
provide statistics about the level of safety of specific areas whenever this
constraint is violated. We then discuss how VSD relates with previous
literature in Social Signal Processing and indicate which existing Computer
Vision methods can be used to address this problem. We conclude with future
challenges related to the effectiveness of VSD systems, ethical implications
and future application scenarios.
Comment: 9 pages, 5 figures. All authors contributed equally to this
manuscript and are listed in alphabetical order. Under submission.
A depth-based hybrid approach for safe flight corridor generation in memoryless planning
This paper presents a depth-based hybrid method to generate safe flight corridors for a memoryless local navigation planner. We first propose using raw depth images directly as inputs to a learning-based object-detection engine, with no requirement for map fusion. We then employ an object-detection network to predict the base of polyhedral safe corridors directly in a new raw depth image. Furthermore, we apply a verification procedure to eliminate any false predictions, so that the resulting corridors are guaranteed collision-free. More importantly, the proposed mechanism produces separate safe corridors with minimal overlap, which makes them suitable as space boundaries for path planning: the average intersection over union (IoU) of corridors obtained by the proposed algorithm is less than 2%. To evaluate the effectiveness of our method, we incorporated it into a memoryless planner with a straight-line path-planning algorithm and tested the entire system in both synthetic and real-world obstacle-dense environments. The very high success rates obtained demonstrate that the proposed approach is highly capable of producing safe corridors for memoryless local planning. © 2023 by the authors.
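The corridor-overlap metric above can be illustrated with a minimal sketch (not the paper's code): pairwise IoU between axis-aligned 2D boxes as a stand-in for the polyhedral corridor bases. The box format `(x1, y1, x2, y2)` and function names are our own assumptions.

```python
def box_iou(a, b):
    """Intersection over union of two axis-aligned boxes (x1, y1, x2, y2)."""
    ix1, iy1 = max(a[0], b[0]), max(a[1], b[1])
    ix2, iy2 = min(a[2], b[2]), min(a[3], b[3])
    inter = max(0.0, ix2 - ix1) * max(0.0, iy2 - iy1)
    area_a = (a[2] - a[0]) * (a[3] - a[1])
    area_b = (b[2] - b[0]) * (b[3] - b[1])
    union = area_a + area_b - inter
    return inter / union if union > 0 else 0.0

def mean_pairwise_iou(boxes):
    """Average IoU over all corridor pairs; values near zero mean
    the corridors are largely disjoint, as the method targets."""
    pairs = [(i, j) for i in range(len(boxes)) for j in range(i + 1, len(boxes))]
    if not pairs:
        return 0.0
    return sum(box_iou(boxes[i], boxes[j]) for i, j in pairs) / len(pairs)
```

A set of near-disjoint corridors would then yield a mean pairwise IoU close to the sub-2% figure reported above.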
Helping the Blind to Get through COVID-19: Social Distancing Assistant Using Real-Time Semantic Segmentation on RGB-D Video
The current COVID-19 pandemic is having a major impact on our daily lives. Social distancing is one of the measures implemented to slow the spread of the disease, but it is difficult for blind people to comply with. In this paper, we present a system that helps blind people maintain physical distance from other persons using a combination of RGB and depth cameras. We run a real-time semantic segmentation algorithm on the RGB camera to detect where persons are, use the depth camera to measure the distance to them, and provide audio feedback through bone-conducting headphones if a person is closer than 1.5 m. Our system warns the user only when persons are nearby and does not react to non-person objects such as walls, trees or doors; it is therefore not intrusive and can be used in combination with other assistive devices. We have tested our prototype system on one blind and four blindfolded persons, and found that the system is precise, easy to use, and imposes a low cognitive load.
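The warning logic described above can be sketched as follows. This is an illustrative reconstruction, not the authors' implementation: it assumes a boolean person mask from the segmentation network and an aligned per-pixel depth map in metres, and all names are ours.

```python
import numpy as np

WARN_DISTANCE_M = 1.5  # warning threshold stated in the abstract

def nearest_person_distance(person_mask, depth_m):
    """Smallest valid depth over pixels labelled 'person'.

    person_mask : bool array (H, W), True where segmentation says person
    depth_m     : float array (H, W); 0 or NaN marks invalid depth
    Returns inf when no person pixel with valid depth is present.
    """
    d = depth_m[person_mask]
    d = d[np.isfinite(d) & (d > 0)]
    return float(d.min()) if d.size else float("inf")

def should_warn(person_mask, depth_m):
    """True if the closest detected person is nearer than the threshold."""
    return nearest_person_distance(person_mask, depth_m) < WARN_DISTANCE_M
```

Restricting the depth lookup to person-labelled pixels is what makes the system ignore walls, trees and doors: their depth values never enter the comparison.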
Single Image Human Proxemics Estimation for Visual Social Distancing
In this work, we address the problem of estimating the so-called "Social
Distancing" given a single uncalibrated image in unconstrained scenarios. Our
approach proposes a semi-automatic solution to approximate the homography
matrix between the scene ground and image plane. With the estimated homography,
we then leverage an off-the-shelf pose detector to detect body poses on the
image and to reason upon their inter-personal distances using the length of
their body-parts. Inter-personal distances are further locally inspected to
detect possible violations of the social distancing rules. We validate our
proposed method quantitatively and qualitatively against baselines on public
domain datasets for which we provided groundtruth on inter-personal distances.
Besides, we demonstrate the application of our method deployed in a real
testing scenario where statistics on the inter-personal distances are currently
used to improve safety in a critical environment.
Comment: Paper accepted at the WACV 2021 conference
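The geometric core of the approach above, once the homography is estimated, can be sketched in a few lines. This is a hedged illustration under our own assumptions (a 3x3 homography H mapping image points to metric ground coordinates, and one foot point per detected person), not the paper's code.

```python
import numpy as np

def to_ground(H, pts_img):
    """Apply homography H (3x3) to Nx2 image points -> Nx2 ground points."""
    pts_h = np.hstack([pts_img, np.ones((len(pts_img), 1))])  # homogeneous coords
    g = pts_h @ H.T
    return g[:, :2] / g[:, 2:3]                               # dehomogenise

def distancing_violations(H, feet_img, min_dist_m=1.0):
    """Return index pairs of people closer than min_dist_m on the ground plane."""
    g = to_ground(H, np.asarray(feet_img, dtype=float))
    pairs = []
    for i in range(len(g)):
        for j in range(i + 1, len(g)):
            if np.linalg.norm(g[i] - g[j]) < min_dist_m:
                pairs.append((i, j))
    return pairs
```

In the paper, body-part lengths from the pose detector provide the metric scale that calibrates this homography; the sketch assumes that calibration has already been done.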
Minimizing Supervision for Vision-Based Perception and Control in Autonomous Driving
The research presented in this dissertation focuses on reducing the need for supervision in two tasks related to autonomous driving: end-to-end steering and free space segmentation.
For end-to-end steering, we devise a new regularization technique which relies on pixel-relevance heatmaps to force the steering model to focus on lane markings. This improves performance across a variety of offline metrics. In relation to this work, we publicly release the RoboBus dataset, which consists of extensive driving data recorded using a commercial bus on a cross-border public transport route on the Luxembourgish-French border.
We also tackle pseudo-supervised free space segmentation from three different angles: (1) we propose a Stochastic Co-Teaching training scheme that explicitly attempts to filter out the noise in pseudo-labels, (2) we study the impact of self-training and of different data augmentation techniques, (3) we devise a novel pseudo-label generation method based on road plane distance estimation from approximate depth maps.
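The noise-filtering idea behind the co-teaching scheme above can be sketched with the standard small-loss criterion: each of two networks keeps only the fraction of pseudo-labelled samples on which its peer incurs the smallest loss, on the assumption that noisy pseudo-labels tend to produce large losses. This is a generic illustration of that criterion, not the dissertation's training loop; names and the keep ratio are ours.

```python
import numpy as np

def small_loss_selection(peer_losses, keep_ratio=0.7):
    """Indices of the keep_ratio fraction of samples with the smallest
    peer loss, treated as likely-clean pseudo-labels for training."""
    losses = np.asarray(peer_losses, dtype=float)
    n_keep = max(1, int(round(keep_ratio * len(losses))))
    return np.argsort(losses)[:n_keep]
```

In a full co-teaching loop each network would be updated only on the subset its peer selects, which keeps the two networks from reinforcing the same label noise.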
Finally, we investigate semi-supervised free space estimation and find that combining our techniques with a restricted subset of labeled samples yields substantial improvements in IoU, Precision and Recall.