7,744 research outputs found
Deep Thermal Imaging: Proximate Material Type Recognition in the Wild through Deep Learning of Spatial Surface Temperature Patterns
We introduce Deep Thermal Imaging, a new approach for close-range automatic
recognition of materials to enhance the understanding of people and ubiquitous
technologies of their proximal environment. Our approach uses a low-cost mobile
thermal camera integrated into a smartphone to capture thermal textures. A deep
neural network classifies these textures into material types. This approach
works effectively without the need for ambient light sources or direct contact
with materials. Furthermore, the use of a deep learning network removes the
need to handcraft the set of features for different materials. We evaluated the
performance of the system by training it to recognise 32 material types in both
indoor and outdoor environments. Our approach produced recognition accuracies
above 98% in 14,860 images of 15 indoor materials and above 89% in 26,584
images of 17 outdoor materials. We conclude by discussing its potentials for
real-time use in HCI applications and future directions.Comment: Proceedings of the 2018 CHI Conference on Human Factors in Computing
System
Cross-View Visual Geo-Localization for Outdoor Augmented Reality
Precise estimation of global orientation and location is critical to ensure a
compelling outdoor Augmented Reality (AR) experience. We address the problem of
geo-pose estimation by cross-view matching of query ground images to a
geo-referenced aerial satellite image database. Recently, neural network-based
methods have shown state-of-the-art performance in cross-view matching.
However, most of the prior works focus only on location estimation, ignoring
orientation, which cannot meet the requirements in outdoor AR applications. We
propose a new transformer neural network-based model and a modified triplet
ranking loss for joint location and orientation estimation. Experiments on
several benchmark cross-view geo-localization datasets show that our model
achieves state-of-the-art performance. Furthermore, we present an approach to
extend the single image query-based geo-localization approach by utilizing
temporal information from a navigation pipeline for robust continuous
geo-localization. Experimentation on several large-scale real-world video
sequences demonstrates that our approach enables high-precision and stable AR
insertion.Comment: IEEE VR 202
GANVO: Unsupervised Deep Monocular Visual Odometry and Depth Estimation with Generative Adversarial Networks
In the last decade, supervised deep learning approaches have been extensively
employed in visual odometry (VO) applications, which is not feasible in
environments where labelled data is not abundant. On the other hand,
unsupervised deep learning approaches for localization and mapping in unknown
environments from unlabelled data have received comparatively less attention in
VO research. In this study, we propose a generative unsupervised learning
framework that predicts 6-DoF pose camera motion and monocular depth map of the
scene from unlabelled RGB image sequences, using deep convolutional Generative
Adversarial Networks (GANs). We create a supervisory signal by warping view
sequences and assigning the re-projection minimization to the objective loss
function that is adopted in multi-view pose estimation and single-view depth
generation network. Detailed quantitative and qualitative evaluations of the
proposed framework on the KITTI and Cityscapes datasets show that the proposed
method outperforms both existing traditional and unsupervised deep VO methods
providing better results for both pose estimation and depth recovery.Comment: ICRA 2019 - accepte
Explainable shared control in assistive robotics
Shared control plays a pivotal role in designing assistive robots to complement human capabilities during everyday tasks. However, traditional shared control relies on users forming an accurate mental model of expected robot behaviour. Without this accurate mental image, users may encounter confusion or frustration whenever their actions do not elicit the intended system response, forming a misalignment between the respective internal models of the robot and human. The Explainable Shared Control paradigm introduced in this thesis attempts to resolve such model misalignment by jointly considering assistance and transparency.
There are two perspectives of transparency to Explainable Shared Control: the human's and the robot's. Augmented reality is presented as an integral component that addresses the human viewpoint by visually unveiling the robot's internal mechanisms. Whilst the robot perspective requires an awareness of human "intent", and so a clustering framework composed of a deep generative model is developed for human intention inference.
Both transparency constructs are implemented atop a real assistive robotic wheelchair and tested with human users. An augmented reality headset is incorporated into the robotic wheelchair and different interface options are evaluated across two user studies to explore their influence on mental model accuracy. Experimental results indicate that this setup facilitates transparent assistance by improving recovery times from adverse events associated with model misalignment. As for human intention inference, the clustering framework is applied to a dataset collected from users operating the robotic wheelchair. Findings from this experiment demonstrate that the learnt clusters are interpretable and meaningful representations of human intent.
This thesis serves as a first step in the interdisciplinary area of Explainable Shared Control. The contributions to shared control, augmented reality and representation learning contained within this thesis are likely to help future research advance the proposed paradigm, and thus bolster the prevalence of assistive robots.Open Acces
Multiple Trajectory Prediction of Moving Agents with Memory Augmented Networks
Pedestrians and drivers are expected to safely navigate complex urban environments along with several non cooperating agents. Autonomous vehicles will soon replicate this capability. Each agent acquires a representation of the world from an egocentric perspective and must make decisions ensuring safety for itself and others. This requires to predict motion patterns of observed agents for a far enough future. In this paper we propose MANTRA, a model that exploits memory augmented networks to effectively predict multiple trajectories of other agents, observed from an egocentric perspective. Our model stores observations in memory and uses trained controllers to write meaningful pattern encodings and read trajectories that are most likely to occur in future. We show that our method is able to natively perform multi-modal trajectory prediction obtaining state-of-the art results on four datasets. Moreover, thanks to the non-parametric nature of the memory module, we show how once trained our system can continuously improve by ingesting novel patterns
- …