Generating Light Estimation for Mixed-reality Devices through Collaborative Visual Sensing
Mixed reality mobile platforms co-locate virtual objects with physical spaces, creating immersive user experiences. To create visual harmony between virtual and physical spaces, the virtual scene must be accurately illuminated with realistic physical lighting. To this end, a system was designed that Generates Light Estimation Across Mixed-reality (GLEAM) devices to continually sense the realistic lighting of a physical scene in all directions. GLEAM can optionally operate across multiple mobile mixed-reality devices, leveraging collaborative multi-viewpoint sensing for improved estimation. The system implements policies that prioritize the resolution, coverage, or update interval of the illumination estimate depending on the situational needs of the virtual scene and physical environment.
To evaluate the runtime performance and perceptual efficacy of the system, GLEAM was implemented on the Unity 3D game engine and deployed on Android and iOS devices. On these implementations, GLEAM can prioritize dynamic estimation with update intervals as low as 15 ms, or prioritize high spatial quality with update intervals of 200 ms. User studies across 99 participants and 26 scene comparisons reported a preference for GLEAM over other lighting techniques in 66.67% of the presented augmented scenes and indifference in 12.57% of the scenes. A controlled-lighting user study with 18 participants revealed a general preference for policies that strike a balance between resolution and update rate.
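The resolution-versus-update-interval trade-off described in this abstract can be illustrated with a minimal sketch. The class, field names, and resolution values below are hypothetical, not GLEAM's actual API; only the two timing figures come from the abstract.

```python
from dataclasses import dataclass

@dataclass
class EstimationPolicy:
    """Hypothetical knobs for a GLEAM-style light estimator."""
    cubemap_face_resolution: int  # spatial detail of the estimated environment map
    update_interval_ms: int       # how often a fresh estimate is published

# Two extremes of the trade-off reported above; the resolutions are
# illustrative assumptions, only the timings come from the abstract.
DYNAMIC      = EstimationPolicy(cubemap_face_resolution=16,  update_interval_ms=15)
HIGH_QUALITY = EstimationPolicy(cubemap_face_resolution=128, update_interval_ms=200)

def choose_policy(scene_has_moving_lights: bool) -> EstimationPolicy:
    """Prefer temporal freshness for dynamic scenes, spatial detail for static ones."""
    return DYNAMIC if scene_has_moving_lights else HIGH_QUALITY
```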
High-Dynamic-Range Lighting Estimation From Face Portraits
We present a CNN-based method for outdoor high-dynamic-range (HDR) environment map prediction from low-dynamic-range (LDR) portrait images. Our method relies on two different CNN architectures, one for light encoding and another for face-to-light prediction. Outdoor lighting is characterised by an extremely high dynamic range, and thus our encoding splits the environment map data between low- and high-intensity components and encodes them using tailored representations. The combination of both network architectures constitutes an end-to-end method for accurate HDR light prediction from faces at real-time rates, which was inaccessible to previous methods that focused on low-dynamic-range lighting or relied on non-linear optimisation schemes. We train our networks using both real and synthetic images, compare our light encoding with other methods for light representation, and analyse our results for light prediction on real images. We show that our predicted HDR environment maps can be used as accurate illumination sources for scene renderings, with potential applications in 3D object insertion for augmented reality.
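A minimal sketch of the low/high-intensity split described above, assuming the environment map is a linear float32 HDR image and using a simple fixed threshold (the paper's actual encoding is learned and more elaborate):

```python
import numpy as np

def split_hdr(env_map: np.ndarray, threshold: float = 1.0):
    """Split an HDR environment map into an LDR base and a high-intensity residual.

    env_map: float32 array (H, W, 3) with linear, non-negative radiance values.
    threshold: radiance level treated as the top of the LDR range (assumption).
    """
    low = np.clip(env_map, 0.0, threshold)       # bounded, LDR-like component
    high = np.maximum(env_map - threshold, 0.0)  # sun/sky peaks above the LDR range
    return low, high

def merge_hdr(low: np.ndarray, high: np.ndarray) -> np.ndarray:
    # The split is exactly invertible for non-negative input: low + high
    # reconstructs the original map.
    return low + high
```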
Estimation of environmental lighting from human face for illumination of augmented reality scenes
In this thesis, we propose a method to solve a common problem in the augmented reality domain: estimating the light sources in an outdoor scene and lighting virtual objects accordingly. As the basis of our method, we developed a framework that estimates environmental lighting from well-defined objects, specifically human faces. The method is tuned for outdoor use, and the algorithm is further enhanced to illuminate virtual objects exposed to direct sunlight. In the first part of this thesis, we propose a novel lighting estimation technique in which we assume the user is looking directly at the mobile device's camera. This technique extracts information from input images to calculate possible light sources to pass to the rendering stage. In the second part of this thesis, we propose a lighting model that uses the output of our lighting estimation to make objects appear as if they are correctly lit by sunlight. This model uses a mathematical technique called spherical harmonics lighting for real-time realistic rendering.
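The spherical harmonics lighting mentioned above represents low-frequency environment lighting with a handful of coefficients. Below is a minimal sketch of second-order (9-coefficient) diffuse irradiance evaluation using the standard real SH basis; the thesis's own pipeline details may differ.

```python
import numpy as np

# Normalization constants of the first three real SH bands.
C = [0.282095, 0.488603, 1.092548, 0.315392, 0.546274]

def sh_basis(n: np.ndarray) -> np.ndarray:
    """Evaluate the 9 second-order SH basis functions at unit normal n."""
    x, y, z = n
    return np.array([
        C[0],
        C[1] * y, C[1] * z, C[1] * x,
        C[2] * x * y, C[2] * y * z,
        C[3] * (3.0 * z * z - 1.0),
        C[2] * x * z,
        C[4] * (x * x - y * y),
    ])

def irradiance(coeffs: np.ndarray, normal: np.ndarray) -> np.ndarray:
    """Diffuse irradiance for a surface normal, given per-channel SH
    lighting coefficients of shape (9, 3) already convolved with the
    cosine lobe."""
    return sh_basis(normal) @ coeffs  # (9,) @ (9, 3) -> (3,) RGB irradiance
```

Because the basis is fixed, shading a virtual object under the estimated sunlight reduces to one small dot product per pixel, which is what makes this representation attractive for real-time rendering.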
Footprints and Free Space from a Single Color Image
Understanding the shape of a scene from a single color image is a formidable
computer vision task. However, most methods aim to predict the geometry of
surfaces that are visible to the camera, which is of limited use when planning
paths for robots or augmented reality agents. Such agents can only move when
grounded on a traversable surface, which we define as the set of classes which
humans can also walk over, such as grass, footpaths and pavement. Models which
predict beyond the line of sight often parameterize the scene with voxels or
meshes, which can be expensive to use in machine learning frameworks.
We introduce a model to predict the geometry of both visible and occluded
traversable surfaces, given a single RGB image as input. We learn from stereo
video sequences, using camera poses, per-frame depth and semantic segmentation
to form training data, which is used to supervise an image-to-image network. We
train models from the KITTI driving dataset, the indoor Matterport dataset, and
from our own casually captured stereo footage. We find that a surprisingly low
bar for spatial coverage of training scenes is required. We validate our
algorithm against a range of strong baselines, and include an assessment of our
predictions for a path-planning task. Comment: Accepted to CVPR 2020 as an oral presentation.
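One ingredient described above is deriving walkability supervision from per-frame semantic segmentation. A minimal sketch, assuming hypothetical class IDs (the paper's exact label set and training pipeline are more involved):

```python
import numpy as np

# Hypothetical semantic class IDs for walkable categories such as grass,
# footpaths, and pavement; the paper defines its own label set.
TRAVERSABLE_CLASSES = [7, 8, 21, 22]

def traversable_mask(segmentation: np.ndarray) -> np.ndarray:
    """Map an (H, W) integer segmentation to a binary walkability mask,
    usable as supervision for an image-to-image network."""
    return np.isin(segmentation, TRAVERSABLE_CLASSES)
```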
Object-based Illumination Estimation with Rendering-aware Neural Networks
We present a scheme for fast environment light estimation from the RGBD
appearance of individual objects and their local image areas. Conventional
inverse rendering is too computationally demanding for real-time applications,
and the performance of purely learning-based techniques may be limited by the
meager input data available from individual objects. To address these issues,
we propose an approach that takes advantage of physical principles from inverse
rendering to constrain the solution, while also utilizing neural networks to
expedite the more computationally expensive portions of its processing, to
increase robustness to noisy input data as well as to improve temporal and
spatial stability. This results in a rendering-aware system that estimates the
local illumination distribution at an object with high accuracy and in real
time. With the estimated lighting, virtual objects can be rendered in AR
scenarios with shading that is consistent with the real scene, leading to
improved realism. Comment: ECCV 2020
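One way to read the "physical principles from inverse rendering" constraint: under a Lambertian assumption, observed shading is linear in the lighting parameters, so lighting can be recovered by least squares from RGBD-derived normals and intensities. The sketch below is an illustrative simplification (first-order lighting, constant albedo), not the paper's rendering-aware network:

```python
import numpy as np

def fit_first_order_lighting(normals: np.ndarray, intensities: np.ndarray):
    """Recover ambient + directional lighting from Lambertian shading.

    normals: (N, 3) unit surface normals derived from the RGBD input.
    intensities: (N,) observed grayscale diffuse intensities.
    Model (constant albedo folded into the unknowns): I ~= a + l . n
    """
    A = np.hstack([np.ones((len(normals), 1)), normals])  # (N, 4) design matrix
    x, *_ = np.linalg.lstsq(A, intensities, rcond=None)
    ambient, light = x[0], x[1:]
    return ambient, light
```

The point of such a physics-based constraint is that it regularizes the otherwise under-determined learning problem, which is what the abstract credits for the method's robustness and temporal stability.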
Past, Present, and Future of Simultaneous Localization And Mapping: Towards the Robust-Perception Age
Simultaneous Localization and Mapping (SLAM) consists of the concurrent
construction of a model of the environment (the map), and the estimation of the
state of the robot moving within it. The SLAM community has made astonishing
progress over the last 30 years, enabling large-scale real-world applications,
and witnessing a steady transition of this technology to industry. We survey
the current state of SLAM. We start by presenting what is now the de-facto
standard formulation for SLAM. We then review related work, covering a broad
set of topics including robustness and scalability in long-term mapping, metric
and semantic representations for mapping, theoretical performance guarantees,
active SLAM and exploration, and other new frontiers. This paper simultaneously serves as a position paper and a tutorial for users of SLAM. By looking at the published research with a critical eye, we delineate open challenges and new research issues that still deserve careful scientific investigation. The paper also contains the authors' take on two questions that often animate discussions during robotics conferences: Do robots need SLAM? and Is SLAM solved?
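For readers unfamiliar with the formulation the survey calls the de-facto standard: SLAM is commonly posed as maximum a posteriori (MAP) estimation over a factor graph, which under Gaussian measurement noise reduces to nonlinear least squares (notation here is generic, not quoted from the paper):

```latex
\mathcal{X}^{\star}
  = \arg\max_{\mathcal{X}} \; p(\mathcal{X} \mid \mathcal{Z})
  = \arg\min_{\mathcal{X}} \sum_{k} \left\| h_k(\mathcal{X}_k) - z_k \right\|_{\Sigma_k}^{2}
```

where the variables X stack the robot trajectory and the map, each measurement z_k has model h_k over the variable subset X_k, and the norm is the squared Mahalanobis distance with covariance Sigma_k.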