Search CORE

7,744 research outputs found

Deep Thermal Imaging: Proximate Material Type Recognition in the Wild through Deep Learning of Spatial Surface Temperature Patterns

Author: Anind
Callister William D.
Cho Youngjun
Cho Youngjun
Hendriks Antonius
Holone H.
Jaderberg Max
Krizhevsky Alex
Kurz Daniel
Leonard John J.
Liu C.
Liu H.
Lloyd J. M.
Marks
Nair Vinod
Ren Shaoqing
Romera-Paredes Bernardino
Victor
Vollmer Michael
Zhu Xiangxin
Publication venue: 'Association for Computing Machinery (ACM)'
Publication date: 06/03/2018
Field of study

We introduce Deep Thermal Imaging, a new approach for close-range automatic recognition of materials to enhance the understanding of people and ubiquitous technologies of their proximal environment. Our approach uses a low-cost mobile thermal camera integrated into a smartphone to capture thermal textures. A deep neural network classifies these textures into material types. This approach works effectively without the need for ambient light sources or direct contact with materials. Furthermore, the use of a deep learning network removes the need to handcraft the set of features for different materials. We evaluated the performance of the system by training it to recognise 32 material types in both indoor and outdoor environments. Our approach produced recognition accuracies above 98% in 14,860 images of 15 indoor materials and above 89% in 26,584 images of 17 outdoor materials. We conclude by discussing its potentials for real-time use in HCI applications and future directions.Comment: Proceedings of the 2018 CHI Conference on Human Factors in Computing System

arXiv.org e-Print Archive

Crossref

UCL Discovery

Cross-View Visual Geo-Localization for Outdoor Augmented Reality

Author: Chiu Han-Pang
Kumar Rakesh
Minhas Kshitij
Mithun Niluthpol Chowdhury
Oskiper Taragay
Samarasekera Supun
Sizintsev Mikhail
Publication venue: 'Institute of Electrical and Electronics Engineers (IEEE)'
Publication date: 27/03/2023
Field of study

Precise estimation of global orientation and location is critical to ensure a compelling outdoor Augmented Reality (AR) experience. We address the problem of geo-pose estimation by cross-view matching of query ground images to a geo-referenced aerial satellite image database. Recently, neural network-based methods have shown state-of-the-art performance in cross-view matching. However, most of the prior works focus only on location estimation, ignoring orientation, which cannot meet the requirements in outdoor AR applications. We propose a new transformer neural network-based model and a modified triplet ranking loss for joint location and orientation estimation. Experiments on several benchmark cross-view geo-localization datasets show that our model achieves state-of-the-art performance. Furthermore, we present an approach to extend the single image query-based geo-localization approach by utilizing temporal information from a navigation pipeline for robust continuous geo-localization. Experimentation on several large-scale real-world video sequences demonstrates that our approach enables high-precision and stable AR insertion.Comment: IEEE VR 202

arXiv.org e-Print Archive

GANVO: Unsupervised Deep Monocular Visual Odometry and Depth Estimation with Generative Adversarial Networks

Author: Almalioglu Yasin
de Gusmao Pedro P. B.
Markham Andrew
Saputra Muhamad Risqi U.
Trigoni Niki
Publication venue: 'Institute of Electrical and Electronics Engineers (IEEE)'
Publication date: 01/01/2019
Field of study

In the last decade, supervised deep learning approaches have been extensively employed in visual odometry (VO) applications, which is not feasible in environments where labelled data is not abundant. On the other hand, unsupervised deep learning approaches for localization and mapping in unknown environments from unlabelled data have received comparatively less attention in VO research. In this study, we propose a generative unsupervised learning framework that predicts 6-DoF pose camera motion and monocular depth map of the scene from unlabelled RGB image sequences, using deep convolutional Generative Adversarial Networks (GANs). We create a supervisory signal by warping view sequences and assigning the re-projection minimization to the objective loss function that is adopted in multi-view pose estimation and single-view depth generation network. Detailed quantitative and qualitative evaluations of the proposed framework on the KITTI and Cityscapes datasets show that the proposed method outperforms both existing traditional and unsupervised deep VO methods providing better results for both pose estimation and depth recovery.Comment: ICRA 2019 - accepte

arXiv.org e-Print Archive

Crossref

Oxford University Research Archive

Explainable shared control in assistive robotics

Author: Zolotas Mark
Publication venue: Electrical and Electronic Engineering, Imperial College London
Publication date: 01/02/2021
Field of study

Shared control plays a pivotal role in designing assistive robots to complement human capabilities during everyday tasks. However, traditional shared control relies on users forming an accurate mental model of expected robot behaviour. Without this accurate mental image, users may encounter confusion or frustration whenever their actions do not elicit the intended system response, forming a misalignment between the respective internal models of the robot and human. The Explainable Shared Control paradigm introduced in this thesis attempts to resolve such model misalignment by jointly considering assistance and transparency. There are two perspectives of transparency to Explainable Shared Control: the human's and the robot's. Augmented reality is presented as an integral component that addresses the human viewpoint by visually unveiling the robot's internal mechanisms. Whilst the robot perspective requires an awareness of human "intent", and so a clustering framework composed of a deep generative model is developed for human intention inference. Both transparency constructs are implemented atop a real assistive robotic wheelchair and tested with human users. An augmented reality headset is incorporated into the robotic wheelchair and different interface options are evaluated across two user studies to explore their influence on mental model accuracy. Experimental results indicate that this setup facilitates transparent assistance by improving recovery times from adverse events associated with model misalignment. As for human intention inference, the clustering framework is applied to a dataset collected from users operating the robotic wheelchair. Findings from this experiment demonstrate that the learnt clusters are interpretable and meaningful representations of human intent. This thesis serves as a first step in the interdisciplinary area of Explainable Shared Control. The contributions to shared control, augmented reality and representation learning contained within this thesis are likely to help future research advance the proposed paradigm, and thus bolster the prevalence of assistive robots.Open Acces

Spiral - Imperial College Digital Repository

Multiple Trajectory Prediction of Moving Agents with Memory Augmented Networks

Author: Becattini Federico
Del Bimbo Alberto
Marchetti Francesco
Seidenari Lorenzo
Publication venue: 'Institute of Electrical and Electronics Engineers (IEEE)'
Publication date: 01/01/2020
Field of study

Pedestrians and drivers are expected to safely navigate complex urban environments along with several non cooperating agents. Autonomous vehicles will soon replicate this capability. Each agent acquires a representation of the world from an egocentric perspective and must make decisions ensuring safety for itself and others. This requires to predict motion patterns of observed agents for a far enough future. In this paper we propose MANTRA, a model that exploits memory augmented networks to effectively predict multiple trajectories of other agents, observed from an egocentric perspective. Our model stores observations in memory and uses trained controllers to write meaningful pattern encodings and read trajectories that are most likely to occur in future. We show that our method is able to natively perform multi-modal trajectory prediction obtaining state-of-the art results on four datasets. Moreover, thanks to the non-parametric nature of the memory module, we show how once trained our system can continuously improve by ingesting novel patterns

Archivio della Ricerca - Università degli Studi di Siena

Florence Research