Multi-Action Recognition via Stochastic Modelling of Optical Flow and Gradients
In this paper we propose a novel approach to multi-action recognition that
performs joint segmentation and classification. The approach models each
action with a Gaussian mixture model over robust low-dimensional action features.
Segmentation is achieved by performing classification on overlapping temporal
windows, which are then merged to produce the final result. This approach is
considerably less complicated than previous methods which use dynamic
programming or computationally expensive hidden Markov models (HMMs). Initial
experiments on a stitched version of the KTH dataset show that the proposed
approach achieves an accuracy of 78.3%, outperforming a recent HMM-based
approach which obtained 71.2%.
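The abstract does not include an implementation, but the window-level scheme it describes is simple enough to sketch. The following illustration (hypothetical names throughout; feature extraction for the stitched KTH sequences is assumed to happen elsewhere) fits one Gaussian mixture per action, labels overlapping temporal windows by average log-likelihood, and merges them with per-frame voting:

```python
# Minimal sketch of per-window GMM classification with overlap merging.
# `train_feats` maps each action label to an (N, D) array of
# low-dimensional action features; all names here are illustrative.
import numpy as np
from sklearn.mixture import GaussianMixture

def fit_action_models(train_feats, n_components=8):
    """Fit one Gaussian mixture per action class."""
    return {action: GaussianMixture(n_components).fit(X)
            for action, X in train_feats.items()}

def classify_sequence(models, frame_feats, win=32, stride=8):
    """Label overlapping windows, then merge by per-frame voting."""
    T = len(frame_feats)
    starts = list(range(0, max(T - win, 0) + 1, stride))
    if T > win and starts[-1] + win < T:
        starts.append(T - win)          # cover the sequence tail
    votes = {a: np.zeros(T) for a in models}
    for s in starts:
        window = frame_feats[s:s + win]
        # Average per-frame log-likelihood under each action's GMM.
        scores = {a: m.score(window) for a, m in models.items()}
        votes[max(scores, key=scores.get)][s:s + win] += 1
    actions = list(models)
    per_frame = np.stack([votes[a] for a in actions]).argmax(axis=0)
    return [actions[i] for i in per_frame]
```

The merging step here is plain majority voting over window labels; the paper's exact merging rule may differ.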
Object-Oriented Dynamics Learning through Multi-Level Abstraction
Object-based approaches for learning action-conditioned dynamics have
demonstrated promise for generalization and interpretability. However, existing
approaches suffer from structural limitations and optimization difficulties for
common environments with multiple dynamic objects. In this paper, we present a
novel self-supervised learning framework, called Multi-level Abstraction
Object-oriented Predictor (MAOP), which employs a three-level learning
architecture that enables efficient object-based dynamics learning from raw
visual observations. We also design a spatial-temporal relational reasoning
mechanism for MAOP to support instance-level dynamics learning and handle
partial observability. Our results show that MAOP significantly outperforms
previous methods in terms of sample efficiency and generalization over novel
environments for learning environment models. We also demonstrate that learned
dynamics models enable efficient planning in unseen environments, comparable to
true environment models. In addition, MAOP learns semantically and visually
interpretable disentangled representations.
Comment: Accepted to the Thirty-Fourth AAAI Conference on Artificial
Intelligence (AAAI), 2020
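The three-level MAOP architecture is only summarised above, so the sketch below does not reproduce it. Instead, it illustrates the general idea of instance-level relational dynamics with a minimal interaction-network-style module; all names, dimensions, and the residual update are hypothetical, not the MAOP design:

```python
# Illustrative sketch of instance-level relational dynamics prediction.
# This is NOT the MAOP architecture, whose three-level design is only
# summarised in the abstract; it shows one common way to model pairwise
# object interactions conditioned on an action.
import torch
import torch.nn as nn

class RelationalDynamics(nn.Module):
    def __init__(self, state_dim=4, hidden=64):
        super().__init__()
        # Pairwise relation encoder: (sender, receiver) -> effect.
        self.relation = nn.Sequential(
            nn.Linear(2 * state_dim, hidden), nn.ReLU(),
            nn.Linear(hidden, hidden))
        # Per-object update: (state, aggregated effects, action) -> delta.
        self.update = nn.Sequential(
            nn.Linear(state_dim + hidden + 1, hidden), nn.ReLU(),
            nn.Linear(hidden, state_dim))

    def forward(self, states, action):
        # states: (B, N, state_dim); action: (B, 1) scalar action code.
        B, N, D = states.shape
        send = states.unsqueeze(2).expand(B, N, N, D)
        recv = states.unsqueeze(1).expand(B, N, N, D)
        effects = self.relation(torch.cat([send, recv], dim=-1)).sum(dim=2)
        act = action.unsqueeze(1).expand(B, N, 1)
        return states + self.update(torch.cat([states, effects, act], dim=-1))
```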
Robust dense visual SLAM using sensor fusion and motion segmentation
Visual simultaneous localisation and mapping (SLAM) is an important technique for
enabling mobile robots to navigate autonomously within their environments. Using
cameras, robots reconstruct a representation of their environment and simultaneously
localise themselves within it. A dense visual SLAM system produces a high-resolution
and detailed reconstruction of the environment which can be used for obstacle avoidance or semantic reasoning.
State-of-the-art dense visual SLAM systems demonstrate robust performance and
impressive accuracy in ideal conditions. However, these techniques rely on assumptions which limit the extent to which they can be deployed in real applications.
Fundamentally, they require constant scene illumination, smooth camera motion and
no moving objects in the scene. Relaxing these assumptions is not
trivial and significant effort is needed to make dense visual SLAM approaches more
robust to real-world conditions.
The objective of this thesis is to develop dense visual SLAM systems which are
more robust to real-world visually challenging conditions. For this, we leverage sensor
fusion and motion segmentation for situations where camera data is unsuitable.
The first contribution is a visual SLAM system for the NASA Valkyrie humanoid
robot which is robust to the visually challenging conditions encountered during the
robot's operation. It is based on a sensor fusion approach that combines visual SLAM
with leg odometry, and demonstrates increased robustness to illumination changes
and fast camera motion.
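The fusion rule itself is not given in the abstract, so the sketch below shows one minimal policy consistent with the description: trust the visual pose increment while tracking is healthy, and fall back to leg odometry otherwise. The inlier-count quality signal and threshold are illustrative assumptions, not the thesis's method.

```python
# Hedged sketch of a fallback fusion policy between visual SLAM and leg
# odometry. Poses are 4x4 SE(3) matrices; `n_inliers` stands in for a
# tracking-quality signal (the actual metric used may differ).
import numpy as np

def fuse_increment(T_visual, T_legs, n_inliers, min_inliers=100):
    """Pick a per-frame relative-pose increment from the healthier source."""
    return T_visual if n_inliers >= min_inliers else T_legs

def integrate(T_init, increments):
    """Compose per-frame increments into a world-frame trajectory."""
    poses = [T_init]
    for T_rel in increments:
        poses.append(poses[-1] @ T_rel)
    return poses
```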
Second, we research methods for robust visual odometry in the presence of moving
objects. We propose a formulation for joint visual odometry and motion segmentation
that demonstrates increased robustness in scenes with moving objects compared to
state-of-the-art approaches.
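One common way to realise such a joint formulation is to let a robust kernel both down-weight residuals during pose estimation and act as a soft motion segmentation, as in the sketch below. The Tukey kernel and threshold are illustrative assumptions, not necessarily the formulation used in the thesis:

```python
# Sketch: robust weights doubling as a soft motion segmentation. Points
# whose residuals stay small under the current camera-motion estimate are
# treated as static; large-residual points are likely moving objects.
import numpy as np

def tukey_weights(residuals, c=4.685):
    """Tukey biweight: zero weight beyond c, smooth down-weighting inside."""
    r = residuals / c
    w = (1.0 - r**2) ** 2
    w[np.abs(r) > 1.0] = 0.0
    return w

def static_mask(residuals, thresh=0.5):
    """Static-scene mask: keep points the robust kernel still trusts."""
    return tukey_weights(residuals) > thresh
```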
We then extend this method using inertial information from a gyroscope to compare
the contributions of motion segmentation and motion prior integration for robustness
to scene dynamics. As part of this study, we provide a dataset recorded in
scenes with different numbers of moving objects.
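To illustrate how a gyroscope prior can enter such an objective, here is a minimal sketch; the weighting scheme and quadratic prior form are assumptions for illustration, not the thesis's formulation:

```python
# Sketch: robust visual-odometry data term plus a gyroscope motion prior.
# The prior penalises deviation of the estimated angular velocity from the
# gyro measurement; `lam` trades data term against prior (all values
# illustrative).
import numpy as np

def vo_objective(residuals, weights, omega_est, omega_gyro, lam=1.0):
    """Weighted robust data term + quadratic gyro prior on angular velocity."""
    data = np.sum(weights * residuals**2)
    prior = lam * np.sum((omega_est - omega_gyro)**2)
    return data + prior
```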
In conclusion, we find that both motion segmentation and motion prior integration
are necessary for achieving significantly better results in real-world conditions. While
motion priors increase robustness, motion segmentation increases the accuracy of the
reconstruction results through filtering of moving objects.
Edinburgh Centre for Robotics; Engineering and Physical Sciences Research Council (EPSRC)