Resource-Constrained Adaptive Search and Tracking for Sparse Dynamic Targets
This paper considers the problem of resource-constrained and noise-limited
localization and estimation of dynamic targets that are sparsely distributed
over a large area. We generalize an existing framework [Bashan et al., 2008] for
adaptive allocation of sensing resources to the dynamic case, accounting for
time-varying target behavior such as transitions to neighboring cells and
varying amplitudes over a potentially long time horizon. The proposed adaptive
sensing policy is driven by minimization of a modified version of the
previously introduced ARAP objective function, which is a surrogate function
for mean squared error within locations containing targets. We provide
theoretical upper bounds on the performance of adaptive sensing policies by
analyzing solutions with oracle knowledge of target locations, gaining insight
into the effect of target motion and amplitude variation as well as sparsity.
Exact minimization of the multi-stage objective function is infeasible, but
myopic optimization yields a closed-form solution. We propose a simple
non-myopic extension, the Dynamic Adaptive Resource Allocation Policy (D-ARAP),
that allocates a fraction of resources for exploring all locations rather than
solely exploiting the current belief state. Our numerical studies indicate that
D-ARAP has the following advantages: (a) it is more robust than the myopic
policy to noise, missing data, and model mismatch; (b) it performs comparably
to well-known approximate dynamic programming solutions but at significantly
lower computational complexity; and (c) it improves greatly upon non-adaptive
uniform resource allocation in terms of estimation error and probability of
detection.
Comment: 49 pages, 1 table, 11 figures.
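The exploration/exploitation split described above is easy to illustrate. Below is a minimal Python sketch of that idea: a fixed fraction of the sensing budget is spread uniformly over all cells, and the remainder is concentrated according to the current belief. The square-root exploitation rule is an illustrative stand-in, not the paper's closed-form ARAP minimizer.

```python
import numpy as np

def allocate_effort(belief, total_effort=1.0, explore_frac=0.1):
    """Split a sensing budget between exploration and exploitation.

    Minimal sketch of the split described in the abstract; the square-root
    exploitation rule is an illustrative assumption, not the paper's
    closed-form solution.
    """
    belief = np.asarray(belief, dtype=float)
    n = belief.size

    # Exploration: spread a fixed fraction uniformly so that low-belief
    # cells keep being revisited (the non-myopic ingredient).
    explore = np.full(n, explore_frac * total_effort / n)

    # Exploitation: concentrate the remainder on likely target locations.
    w = np.sqrt(belief)
    s = w.sum()
    w = w / s if s > 0 else np.full(n, 1.0 / n)
    exploit = (1.0 - explore_frac) * total_effort * w

    return explore + exploit

# Example: 10 cells with strong belief that cell 3 holds a target.
belief = np.full(10, 0.02)
belief[3] = 0.8
print(allocate_effort(belief))
```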
Vehicle-Pedestrian Dynamic Interaction through Tractography of Relative Movements and Articulated Pedestrian Pose Estimation
To design robust Pre-Collision Systems (PCS), we must develop new techniques that allow a better understanding of the vehicle-pedestrian dynamic relationship and that can predict pedestrians' future movements. This paper focuses on potential-conflict situations in which a collision may happen if no avoidance action is taken by the driver or the pedestrian. We used 1,000 15-second videos to find vehicle-pedestrian relative dynamic trajectories and the pose of pedestrians. Adaptive structural local appearance models and particle filter methods were implemented to track the pedestrians, yielding accurate tractography results for over 82% of the videos. For pose estimation, we used a flexible mixture model to capture co-occurrence between pedestrian body segments. Building on an existing single-frame human pose estimation model, we implemented Kalman filtering, among other techniques, to produce stable stick-figure videos of pedestrian dynamic motion. These tractography and pose estimation data were used as features to train a neural network to classify 'potential conflict' and 'no potential conflict' situations. Training achieved a true-label accuracy of 91.2% and a false-label rate of 8.8%. Finally, the trained network was used to assess the probability of collision over time for the 15-second videos, generating a spike when a 'potential conflict' situation occurs. The paper enables new analysis of potential-conflict pedestrian cases with 2D tractography data and stick-figure pose representations of pedestrians, providing significant insight into the vehicle-pedestrian dynamics that are critical for safe autonomous driving and transportation safety innovations.
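The per-joint stabilization step can be illustrated with a constant-velocity Kalman filter applied to a single 2D keypoint track. This is a minimal sketch; the motion model, noise levels, and single-keypoint scope are assumptions, not the authors' implementation.

```python
import numpy as np

def smooth_keypoint(observations, dt=1.0, q=1e-2, r=1.0):
    """Constant-velocity Kalman filter over one 2D keypoint track."""
    # State: [x, y, vx, vy]; measurement: [x, y].
    F = np.array([[1, 0, dt, 0],
                  [0, 1, 0, dt],
                  [0, 0, 1,  0],
                  [0, 0, 0,  1]], dtype=float)
    H = np.array([[1, 0, 0, 0],
                  [0, 1, 0, 0]], dtype=float)
    Q = q * np.eye(4)          # process noise (assumed)
    R = r * np.eye(2)          # measurement noise (assumed)

    x = np.array([*observations[0], 0.0, 0.0])
    P = np.eye(4)
    smoothed = [x[:2].copy()]
    for z in observations[1:]:
        # Predict.
        x = F @ x
        P = F @ P @ F.T + Q
        # Update.
        S = H @ P @ H.T + R
        K = P @ H.T @ np.linalg.inv(S)
        x = x + K @ (np.asarray(z) - H @ x)
        P = (np.eye(4) - K @ H) @ P
        smoothed.append(x[:2].copy())
    return np.array(smoothed)

# Example: a noisy, roughly linear keypoint track.
track = [(t + np.random.randn() * 0.5, 2 * t + np.random.randn() * 0.5)
         for t in range(20)]
print(smooth_keypoint(track)[:3])
```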
Segmenting Foreground Objects from a Dynamic Textured Background via a Robust Kalman Filter
The algorithm presented in this paper aims to segment the foreground objects in video (e.g., people) given time-varying, textured backgrounds. Examples of time-varying backgrounds include waves on water, moving clouds, trees waving in the wind, automobile traffic, moving crowds, escalators, etc. We have developed a novel foreground-background segmentation algorithm that explicitly accounts for the non-stationary nature and clutter-like appearance of many dynamic textures. The dynamic texture is modeled by an Autoregressive Moving Average (ARMA) model. A robust Kalman filter algorithm iteratively estimates the intrinsic appearance of the dynamic texture, as well as the regions of the foreground objects. Preliminary experiments with this method have demonstrated promising results.
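The robust-update idea, excluding pixels whose residual against the predicted background is too large, can be sketched per pixel. The real method filters the state of an ARMA dynamic-texture model; this toy version maintains only a per-pixel background mean and variance.

```python
import numpy as np

def robust_update(pred, frame, var, k=2.5):
    """One robust per-pixel update step: pixels with large residuals are
    flagged as foreground and excluded from the background update.

    Toy sketch with a scalar per-pixel background model; the k-sigma
    outlier test and unit measurement noise are assumptions.
    """
    residual = frame - pred
    foreground = np.abs(residual) > k * np.sqrt(var)    # outlier test

    # Kalman-style blend, applied only to inlier (background) pixels.
    gain = var / (var + 1.0)                            # unit meas. noise
    new_pred = np.where(foreground, pred, pred + gain * residual)
    new_var = np.where(foreground, var, (1.0 - gain) * var + 1e-3)
    return new_pred, new_var, foreground

# Example: static background with one bright moving blob.
bg = np.zeros((4, 4)); var = np.full((4, 4), 0.1)
frame = bg.copy(); frame[1, 1] = 5.0
pred, var, fg = robust_update(bg, frame, var)
print(fg[1, 1], fg[0, 0])   # True False
```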
Multi-view dynamic scene modeling
Modeling dynamic scenes/events from multiple fixed-location vision sensors, such as video camcorders, infrared cameras, and Time-of-Flight sensors, is of broad interest in the computer vision community, with many applications including 3D TV, virtual reality, medical surgery, markerless motion capture, video games, and security surveillance. However, most existing multi-view systems are set up in a strictly controlled indoor environment, with fixed lighting conditions and simple background views. Many challenges prevent extending the technology to outdoor natural environments, including varying sunlight, shadows, reflections, background motion, and visual occlusion. In this thesis, I address these difficulties so as to reduce human preparation and manipulation and to make a robust outdoor system as automatic as possible. In particular, the main novel technical contributions of this thesis are as follows: a generic heterogeneous sensor fusion framework for robust 3D shape estimation; a way to automatically recover the 3D shapes of static occluders from dynamic-object silhouette cues, which explicitly models the static visual occlusion events along the viewing rays; a system to model the shapes of multiple dynamic objects and track their identities simultaneously, which explicitly models inter-occlusion events between dynamic objects; and a scheme to recover an object's dense 3D motion flow over time, without assuming any prior knowledge of the underlying structure of the dynamic object being modeled, which helps enforce the temporal consistency of natural motions and initializes more advanced shape learning and motion analysis. A unified automatic calibration algorithm for the heterogeneous network of conventional cameras/camcorders and new Time-of-Flight sensors is also proposed.
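Silhouette-based shape estimation of the kind this line of work builds on can be sketched as voxel carving: keep only voxels that project inside every camera's silhouette. The 3x4 projection matrices and binary masks below are hypothetical inputs, not the thesis's formulation.

```python
import numpy as np

def carve_voxels(voxels, silhouettes, projections):
    """Keep only voxels whose projection lies inside every silhouette."""
    keep = np.ones(len(voxels), dtype=bool)
    homog = np.hstack([voxels, np.ones((len(voxels), 1))])  # Nx4
    for P, sil in zip(projections, silhouettes):
        uvw = homog @ P.T                                   # Nx3
        u = (uvw[:, 0] / uvw[:, 2]).round().astype(int)
        v = (uvw[:, 1] / uvw[:, 2]).round().astype(int)
        h, w = sil.shape
        inside = (u >= 0) & (u < w) & (v >= 0) & (v < h)
        hit = np.zeros(len(voxels), dtype=bool)
        hit[inside] = sil[v[inside], u[inside]]
        keep &= hit
    return voxels[keep]

# Example: one orthographic-like camera looking down z, 4x4 mask.
P = np.array([[1.0, 0, 0, 1.5], [0, 1.0, 0, 1.5], [0, 0, 0, 1.0]])
sil = np.zeros((4, 4), dtype=bool); sil[1:3, 1:3] = True
vox = np.array([[0.0, 0.0, 0.0], [2.0, 2.0, 0.0]])
print(carve_voxels(vox, [sil], [P]))   # only the origin voxel survives
```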
Driven to Distraction: Self-Supervised Distractor Learning for Robust Monocular Visual Odometry in Urban Environments
We present a self-supervised approach to ignoring "distractors" in camera
images for the purposes of robustly estimating vehicle motion in cluttered
urban environments. We leverage offline multi-session mapping approaches to
automatically generate a per-pixel ephemerality mask and depth map for each
input image, which we use to train a deep convolutional network. At run-time we
use the predicted ephemerality and depth as an input to a monocular visual
odometry (VO) pipeline, using either sparse features or dense photometric
matching. Our approach yields metric-scale VO using only a single camera and
can recover the correct egomotion even when 90% of the image is obscured by
dynamic, independently moving objects. We evaluate our robust VO methods on
more than 400km of driving from the Oxford RobotCar Dataset and demonstrate
reduced odometry drift and significantly improved egomotion estimation in the
presence of large moving vehicles in urban traffic.
Comment: International Conference on Robotics and Automation (ICRA), 2018. Video summary: http://youtu.be/ebIrBn_nc-
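One way a predicted per-pixel ephemerality mask could gate a sparse VO front end is simply to reject features that land on likely-dynamic pixels before matching. The hard threshold below is an assumption; a soft weighting scheme is equally plausible.

```python
import numpy as np

def filter_static_features(keypoints, ephemerality, threshold=0.5):
    """Discard feature points on likely-dynamic ('ephemeral') pixels.

    Sketch only: hard rejection at an assumed threshold, not the
    paper's integration of ephemerality into the VO pipeline.
    """
    kp = np.asarray(keypoints)                  # Nx2 (u, v) pixel coords
    u = kp[:, 0].round().astype(int)
    v = kp[:, 1].round().astype(int)
    h, w = ephemerality.shape
    valid = (u >= 0) & (u < w) & (v >= 0) & (v < h)
    static = np.zeros(len(kp), dtype=bool)
    static[valid] = ephemerality[v[valid], u[valid]] < threshold
    return kp[static]

# Example: a 4x4 mask where the right half is predicted dynamic.
mask = np.zeros((4, 4)); mask[:, 2:] = 0.9
kps = np.array([[0.0, 1.0], [3.0, 1.0]])
print(filter_static_features(kps, mask))   # keeps only the left feature
```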
Occlusion-Robust MVO: Multimotion Estimation Through Occlusion Via Motion Closure
Visual motion estimation is an integral and well-studied challenge in
autonomous navigation. Recent work has focused on addressing multimotion
estimation, which is especially challenging in highly dynamic environments.
Such environments not only comprise multiple, complex motions but also tend to
exhibit significant occlusion.
Previous work in object tracking focuses on maintaining the integrity of
object tracks but usually relies on specific appearance-based descriptors or
constrained motion models. These approaches are very effective in specific
applications but do not generalize to the full multimotion estimation problem.
This paper presents a pipeline for estimating multiple motions, including the
camera egomotion, in the presence of occlusions. This approach uses an
expressive motion prior to estimate the SE(3) trajectory of every motion in
the scene, even during temporary occlusions, and identify the reappearance of
motions through motion closure. The performance of this occlusion-robust
multimotion visual odometry (MVO) pipeline is evaluated on real-world data and
the Oxford Multimotion Dataset.
Comment: To appear at the 2020 IEEE/RSJ International Conference on Intelligent Robots and Systems (IROS). An earlier version of this work first appeared at the Long-term Human Motion Planning Workshop (ICRA 2019). 8 pages, 5 figures. Video available at https://www.youtube.com/watch?v=o_N71AA6FR
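Carrying a trajectory through a temporary occlusion with a constant-velocity motion prior can be sketched by repeating the last observed twist. This two-pose version (using scipy's matrix log/exp) is a toy; MVO estimates its prior within a batch optimization rather than from a single pose pair.

```python
import numpy as np
from scipy.linalg import expm, logm

def extrapolate_pose(T_prev, T_curr, steps):
    """Extrapolate an SE(3) trajectory under a constant-velocity prior.

    Toy sketch: the relative transform between the last two observed
    poses is treated as a constant twist and applied repeatedly.
    """
    xi = logm(np.linalg.inv(T_prev) @ T_curr).real  # one-step twist, se(3)
    poses, T = [], T_curr
    for _ in range(steps):
        T = T @ expm(xi)        # propagate the same motion forward
        poses.append(T)
    return poses

# Example: pure translation of 0.1 m per step along x.
T0 = np.eye(4)
T1 = np.eye(4); T1[0, 3] = 0.1
print(extrapolate_pose(T0, T1, 3)[-1][0, 3])   # ~0.4
```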
Ultimate SLAM? Combining Events, Images, and IMU for Robust Visual SLAM in HDR and High Speed Scenarios
Event cameras are bio-inspired vision sensors that output pixel-level
brightness changes instead of standard intensity frames. These cameras do not
suffer from motion blur and have a very high dynamic range, which enables them
to provide reliable visual information during high speed motions or in scenes
characterized by high dynamic range. However, event cameras output little
information when motion is limited, such as when the camera is nearly still.
Conversely, standard cameras provide instant and rich information about the
environment most of the time (in low-speed and good-lighting scenarios), but
they fail severely in the case of fast motion or difficult lighting, such as
high-dynamic-range or low-light scenes. In this paper, we
present the first state estimation pipeline that leverages the complementary
advantages of these two sensors by fusing in a tightly-coupled manner events,
standard frames, and inertial measurements. We show on the publicly available
Event Camera Dataset that our hybrid pipeline leads to an accuracy improvement
of 130% over event-only pipelines, and 85% over standard-frames-only
visual-inertial systems, while still being computationally tractable.
Furthermore, we use our pipeline to demonstrate - to the best of our knowledge
- the first autonomous quadrotor flight using an event camera for state
estimation, unlocking flight scenarios that were not reachable with traditional
visual-inertial odometry, such as low-light environments and high-dynamic range
scenes.
Comment: 8 pages, 9 figures, 2 tables.
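The benefit of tightly coupled fusion, every measurement entering one joint optimization rather than each sensor being filtered separately, can be shown in a 1D toy problem: two position sensors (one of which drops out, as frames do under motion blur) plus an acceleration constraint, solved jointly by linear least squares. Everything here is an illustrative assumption, not the paper's estimator.

```python
import numpy as np

def fuse_trajectory(z_events, z_frames, accel, dt=1.0,
                    w_e=1.0, w_f=1.0, w_a=4.0):
    """Jointly estimate a 1D trajectory from two position sensors and an
    acceleration (inertial-like) constraint via linear least squares."""
    n = len(z_events)
    rows, rhs = [], []
    # Position measurements from each sensor (either may drop out).
    for k in range(n):
        for z, w in ((z_events[k], w_e), (z_frames[k], w_f)):
            if z is not None:
                row = np.zeros(n); row[k] = w
                rows.append(row); rhs.append(w * z)
    # Inertial constraint: x[k-1] - 2*x[k] + x[k+1] = a[k] * dt**2.
    for k in range(1, n - 1):
        row = np.zeros(n)
        row[k - 1], row[k], row[k + 1] = w_a, -2 * w_a, w_a
        rows.append(row); rhs.append(w_a * accel[k] * dt ** 2)
    x, *_ = np.linalg.lstsq(np.array(rows), np.array(rhs), rcond=None)
    return x

# Example: frames drop out mid-sequence (motion blur); events carry on.
t = np.arange(10, dtype=float)
truth = 0.5 * t ** 2                      # constant acceleration of 1
z_e = list(truth + np.random.randn(10) * 0.3)
z_f = [v + np.random.randn() * 0.1 if (i < 3 or i > 6) else None
       for i, v in enumerate(truth)]
print(fuse_trajectory(z_e, z_f, np.ones(10)))
```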
Robust Legged Robot State Estimation Using Factor Graph Optimization
Legged robots, specifically quadrupeds, are becoming increasingly attractive
for industrial applications such as inspection. However, to leave the
laboratory and to become useful to an end user requires reliability in harsh
conditions. From the perspective of state estimation, it is essential to be
able to accurately estimate the robot's state despite challenges such as uneven
or slippery terrain, textureless and reflective scenes, as well as dynamic
camera occlusions. We are motivated to reduce the dependency on foot contact
classifications, which fail when slipping, and to reduce position drift during
dynamic motions such as trotting. To this end, we present a factor graph
optimization method for state estimation which tightly fuses and smooths
inertial navigation, leg odometry and visual odometry. The effectiveness of the
approach is demonstrated using the ANYmal quadruped robot navigating in a
realistic outdoor industrial environment. This experiment included trotting,
walking, crossing obstacles and ascending a staircase. The proposed approach
decreased the relative position error by up to 55% and absolute position error
by 76% compared to kinematic-inertial odometry.
Comment: 8 pages, 12 figures. Accepted to RA-L + IROS 2019, July 2019.
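A minimal pose-chain sketch, assuming the GTSAM Python bindings, of the kind of factor graph the paper describes: leg odometry and visual odometry each contribute between-factors on consecutive poses, with a prior anchoring the first pose. IMU preintegration and the smoothing-window machinery of the actual system are omitted, and all noise values and relative motions below are hypothetical.

```python
import numpy as np
import gtsam

graph = gtsam.NonlinearFactorGraph()
initial = gtsam.Values()

prior_noise = gtsam.noiseModel.Diagonal.Sigmas(np.full(6, 1e-3))
leg_noise = gtsam.noiseModel.Diagonal.Sigmas(np.full(6, 0.05))  # slip-prone
vo_noise = gtsam.noiseModel.Diagonal.Sigmas(np.full(6, 0.02))

# Anchor the first pose at the origin.
graph.add(gtsam.PriorFactorPose3(0, gtsam.Pose3(), prior_noise))
initial.insert(0, gtsam.Pose3())

# Hypothetical relative motions: both odometry sources report ~10 cm forward.
step = gtsam.Pose3(gtsam.Rot3(), gtsam.Point3(0.1, 0.0, 0.0))
for k in range(1, 5):
    # Both odometry sources constrain the same pair of poses.
    graph.add(gtsam.BetweenFactorPose3(k - 1, k, step, leg_noise))
    graph.add(gtsam.BetweenFactorPose3(k - 1, k, step, vo_noise))
    initial.insert(k, gtsam.Pose3())   # coarse initial guess

result = gtsam.LevenbergMarquardtOptimizer(graph, initial).optimize()
print(result.atPose3(4).translation())  # ~[0.4, 0, 0]
```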