Ultimate SLAM? Combining Events, Images, and IMU for Robust Visual SLAM in HDR and High Speed Scenarios
Event cameras are bio-inspired vision sensors that output pixel-level
brightness changes instead of standard intensity frames. These cameras do not
suffer from motion blur and have a very high dynamic range, which enables them
to provide reliable visual information during high speed motions or in scenes
characterized by high dynamic range. However, event cameras output little
information when the amount of motion is limited, such as when the camera is
almost still. Conversely, standard cameras provide instant and rich information
about the environment most of the time (in low-speed and good lighting
scenarios), but they fail severely in case of fast motions, or difficult
lighting such as high dynamic range or low light scenes. In this paper, we
present the first state estimation pipeline that leverages the complementary
advantages of these two sensors by fusing in a tightly-coupled manner events,
standard frames, and inertial measurements. We show on the publicly available
Event Camera Dataset that our hybrid pipeline leads to an accuracy improvement
of 130% over event-only pipelines, and 85% over standard-frames-only
visual-inertial systems, while still being computationally tractable.
Furthermore, we use our pipeline to demonstrate - to the best of our knowledge
- the first autonomous quadrotor flight using an event camera for state
estimation, unlocking flight scenarios that were not reachable with traditional
visual-inertial odometry, such as low-light environments and high-dynamic range
scenes.
Comment: 8 pages, 9 figures, 2 tables
A Robust Head Tracking System Based on Monocular Vision and Planar Templates
This paper details the implementation of a head tracking system suitable for use in teleoperation stations or control centers, taking into account the limitations and constraints usually associated with those environments. The paper discusses and justifies the selection of the different methods and sensors used to build the head tracking system, and details the processing steps of the system in operation. A prototype to validate the proposed approach is also presented, along with several tests in a real environment with promising results.
Low-latency Cloud-based Volumetric Video Streaming Using Head Motion Prediction
Volumetric video is an emerging key technology for immersive representation
of 3D spaces and objects. Rendering volumetric video requires lots of
computational power which is challenging especially for mobile devices. To
mitigate this, we developed a streaming system that renders a 2D view from the
volumetric video at a cloud server and streams a 2D video stream to the client.
However, such network-based processing increases the motion-to-photon (M2P)
latency due to the additional network and processing delays. In order to
compensate for the added latency, prediction of the future user pose is necessary.
We developed a head motion prediction model and investigated its potential to
reduce the M2P latency for different look-ahead times. Our results show that
the presented model reduces the rendering errors caused by the M2P latency
compared to a baseline system in which no prediction is performed.
Comment: 7 pages, 4 figures
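To make the idea of a look-ahead time concrete, the following is a minimal sketch of pose prediction by constant-velocity extrapolation. It is not the prediction model from the paper, which the abstract does not describe; the `HeadSample` type, the `predict_pose` function, and the restriction to yaw and pitch angles are assumptions made purely for illustration.

```python
from dataclasses import dataclass


@dataclass
class HeadSample:
    """Hypothetical head-orientation sample (illustrative, not from the paper)."""
    t: float      # timestamp in seconds
    yaw: float    # degrees
    pitch: float  # degrees


def predict_pose(prev: HeadSample, curr: HeadSample, look_ahead: float) -> HeadSample:
    """Extrapolate the pose `look_ahead` seconds past `curr`.

    Assumes constant angular velocity between the two most recent samples.
    Angle wrap-around (e.g. 359 -> 1 degrees) is deliberately not handled.
    """
    dt = curr.t - prev.t
    if dt <= 0:
        raise ValueError("samples must be strictly increasing in time")
    yaw_rate = (curr.yaw - prev.yaw) / dt        # degrees per second
    pitch_rate = (curr.pitch - prev.pitch) / dt  # degrees per second
    return HeadSample(
        t=curr.t + look_ahead,
        yaw=curr.yaw + yaw_rate * look_ahead,
        pitch=curr.pitch + pitch_rate * look_ahead,
    )
```

In a setup like this, the server would render the view for the predicted pose rather than the last reported one, with the look-ahead time chosen to roughly match the expected M2P latency.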
Autonomous Sweet Pepper Harvesting for Protected Cropping Systems
In this letter, we present a new robotic harvester (Harvey) that can
autonomously harvest sweet pepper in protected cropping environments. Our
approach combines effective vision algorithms with a novel end-effector design
to enable successful harvesting of sweet peppers. Initial field trials in
protected cropping environments, with two cultivars, demonstrate the efficacy of
this approach, achieving a 46% success rate for unmodified crop, and 58% for
modified crop. Furthermore, for the more favourable cultivar we were also able
to detach 90% of sweet peppers, indicating that improvements in the grasping
success rate would result in greatly improved harvesting performance.
SILVR: A Synthetic Immersive Large-Volume Plenoptic Dataset
In six-degrees-of-freedom light-field (LF) experiences, the viewer's freedom
is limited by the extent to which the plenoptic function was sampled. Existing
LF datasets represent only small portions of the plenoptic function, such that
they either cover a small volume, or they have limited field of view.
Therefore, we propose a new LF image dataset "SILVR" that allows for
six-degrees-of-freedom navigation in much larger volumes while maintaining full
panoramic field of view. We rendered three different virtual scenes in various
configurations, where the number of views ranges from 642 to 2226. One of these
scenes (called Zen Garden) is a novel scene, and is made publicly available. We
chose to position the virtual cameras closely together in large cuboid and
spherical organisations ( to ), equipped with 180° fish-eye
lenses. Every view is rendered to a color image and depth map of 2048px ×
2048px. Additionally, we present the software used to automate the
multi-view rendering process, as well as a lens-reprojection tool that converts
images with panoramic or fish-eye projection to a standard rectilinear
(i.e., perspective) projection. Finally, we demonstrate how the proposed
dataset and software can be used to evaluate LF coding/rendering techniques (in
this case for training NeRFs with instant-ngp). As such, we provide the first
publicly-available LF dataset for large volumes of light with full panoramic
field of view.
Comment: In 13th ACM Multimedia Systems Conference (MMSys '22), June 14-17,
2022, Athlone, Ireland. ACM, New York, NY, USA, 6 pages
Realnav: Exploring Natural User Interfaces For Locomotion In Video Games
We present an exploration into realistic locomotion interfaces in video games using spatially convenient input hardware. In particular, we use Nintendo Wii Remotes to create natural mappings between user actions and their representation in a video game. Targeting American Football video games, we used the role of the quarterback as an exemplar since the game player needs to maneuver effectively in a small area, run down the field, and perform evasive gestures such as spinning, jumping, or the juke. In our study, we developed three locomotion techniques. The first technique used a single Wii Remote, placed anywhere on the user's body, using only the acceleration data. The second technique just used the Wii Remote's infrared sensor and had to be placed on the user's head. The third technique combined a Wii Remote's acceleration and infrared data using a Kalman filter. The Wii Motion Plus was also integrated to add the orientation of the user into the video game. To evaluate the different techniques, we compared them with a cost-effective six-degree-of-freedom (6DOF) optical tracker and two Wii Remotes placed on the user's feet. Experiments were performed comparing each to this technique. Finally, a user study was performed to determine if a preference existed among these techniques. The results showed that the second and third techniques had the same location accuracy as the cost-effective 6DOF tracker, but the first was too inaccurate for video game players. Furthermore, the range of the Wii Remote infrared and Motion Plus exceeded that of the optical tracker of the comparison technique. Finally, the user study showed that video game players preferred the third method over the second, but were split on the use of the Motion Plus when the tasks did not require it.
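As an illustration of how acceleration and position-like infrared data can be combined with a Kalman filter, here is a minimal one-dimensional sketch. It is not the filter used in the study: the class name `WiiFusionKalman1D`, the position/velocity state, the use of acceleration as a control input, the treatment of the infrared reading as a direct position measurement, and all noise values are assumptions chosen for illustration.

```python
class WiiFusionKalman1D:
    """Hypothetical 1D position/velocity Kalman filter (illustrative only)."""

    def __init__(self, accel_var: float = 1.0):
        self.pos = 0.0                      # estimated position
        self.vel = 0.0                      # estimated velocity
        self.P = [[1.0, 0.0], [0.0, 1.0]]   # state covariance
        self.accel_var = accel_var          # assumed accelerometer noise variance

    def predict(self, accel: float, dt: float) -> None:
        """Propagate the state using an accelerometer reading as control input."""
        self.pos += self.vel * dt + 0.5 * accel * dt * dt
        self.vel += accel * dt
        (p00, p01), (p10, p11) = self.P
        q = self.accel_var
        # P <- F P F^T + Q, with F = [[1, dt], [0, 1]] and Q from accel noise.
        n00 = p00 + dt * (p10 + p01) + dt * dt * p11 + q * dt**4 / 4
        n01 = p01 + dt * p11 + q * dt**3 / 2
        n10 = p10 + dt * p11 + q * dt**3 / 2
        n11 = p11 + q * dt * dt
        self.P = [[n00, n01], [n10, n11]]

    def update(self, measured_pos: float, meas_var: float) -> None:
        """Correct the state with a position measurement (e.g. derived from IR)."""
        (p00, p01), (p10, p11) = self.P
        s = p00 + meas_var                  # innovation variance, H = [1, 0]
        k0, k1 = p00 / s, p10 / s           # Kalman gain
        residual = measured_pos - self.pos
        self.pos += k0 * residual
        self.vel += k1 * residual
        # P <- (I - K H) P
        self.P = [[(1 - k0) * p00, (1 - k0) * p01],
                  [p10 - k1 * p00, p11 - k1 * p01]]
```

A typical loop would call `predict` at the accelerometer rate and `update` whenever an infrared position reading is available, so the estimate keeps evolving between infrared readings.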
DeepFactors: Real-time probabilistic dense monocular SLAM
The ability to estimate rich geometry and camera motion from monocular imagery is fundamental to future interactive robotics and augmented reality applications. Different approaches have been proposed that vary in scene geometry representation (sparse landmarks, dense maps), the consistency metric used for optimising the multi-view problem, and the use of learned priors. We present a SLAM system that unifies these methods in a probabilistic framework while still maintaining real-time performance. This is achieved through the use of a learned compact depth map representation and reformulating three different types of errors: photometric, reprojection and geometric, which we make use of within standard factor graph software. We evaluate our system on trajectory estimation and depth reconstruction on real-world sequences and present various examples of estimated dense geometry.