Joint localization of pursuit quadcopters and target using monocular cues
Pursuit robots (autonomous robots tasked with tracking and pursuing a moving target) require accurate tracking of the target's position over time. One potentially effective pursuit platform is a quadcopter equipped with basic sensors and a monocular camera. However, the combined noise of the quadcopter's sensors causes large disturbances in the target's estimated 3D position. To solve this problem, we propose a novel method for the joint localization of a quadcopter pursuer with a monocular camera and an arbitrary target. Our method localizes both the pursuer and the target with respect to a common reference frame. The joint localization method fuses the quadcopter's kinematics and
the target's dynamics in a joint state space model. We show that predicting and correcting pursuer and target trajectories simultaneously produces better results than standard approaches that estimate relative target trajectories in a 3D coordinate system. Our method also includes a computationally efficient visual tracker capable of re-detecting a temporarily lost target. The effectiveness of the proposed method is demonstrated by a series of experiments with a real quadcopter pursuing a human. The results show that the visual tracker can deal effectively with target
occlusions and that joint localization outperforms standard localization methods
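The joint state-space idea can be sketched as a single Kalman filter over a stacked pursuer/target state. The state layout, noise values, and linear camera measurement below are illustrative assumptions for the sketch, not the paper's actual model (which fuses quadcopter kinematics with target dynamics under monocular cues):

```python
import numpy as np

# Joint state: [pursuer x, y, z, target x, y, z] in a common world frame.
# Dynamics here are a random walk (identity transition); the measurements
# are (a) the pursuer position from onboard sensors and (b) the target
# position relative to the pursuer from the camera. Both are linear in the
# joint state, so a plain Kalman filter suffices for this sketch.
F = np.eye(6)
Q = np.eye(6) * 0.01                            # process noise (assumed)
H = np.vstack([
    np.hstack([np.eye(3), np.zeros((3, 3))]),   # pursuer position
    np.hstack([-np.eye(3), np.eye(3)]),         # target minus pursuer
])
R = np.eye(6) * 0.1                             # measurement noise (assumed)

def predict(x, P):
    return F @ x, F @ P @ F.T + Q

def correct(x, P, z):
    S = H @ P @ H.T + R
    K = P @ H.T @ np.linalg.inv(S)
    return x + K @ (z - H @ x), (np.eye(6) - K @ H) @ P

z = np.array([1.0, 0.0, 2.0,    # pursuer sensed at (1, 0, 2)
              3.0, 0.0, 0.0])   # target seen 3 m ahead of the pursuer
x, P = np.zeros(6), np.eye(6)
for _ in range(50):             # iterate to (near) steady state
    x, P = predict(x, P)
    x, P = correct(x, P, z)
```

Because both trajectories live in one state vector, each correction updates the pursuer and target estimates jointly; here `x[3:]` converges to the target's world-frame position (4, 0, 2).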
Encoderless Gimbal Calibration of Dynamic Multi-Camera Clusters
Dynamic Camera Clusters (DCCs) are multi-camera systems where one or more
cameras are mounted on actuated mechanisms such as a gimbal. Existing methods
for DCC calibration rely on joint angle measurements to resolve the
time-varying transformation between the dynamic and static camera. This
information is usually provided by motor encoders, however, joint angle
measurements are not always readily available on off-the-shelf mechanisms. In
this paper, we present an encoderless approach for DCC calibration which
simultaneously estimates the kinematic parameters of the transformation chain
as well as the unknown joint angles. We also demonstrate the integration of an
encoderless gimbal mechanism with a state-of-the-art VIO algorithm, and show
the extensions required in order to perform simultaneous online estimation of
the joint angles and vehicle localization state. The proposed calibration
approach is validated both in simulation and on a physical DCC composed of a
2-DOF gimbal mounted on a UAV. Finally, we show the experimental results of the
calibrated mechanism integrated into the OKVIS VIO package, and demonstrate
successful online joint angle estimation while maintaining localization
accuracy that is comparable to a standard static multi-camera configuration.
Comment: ICRA 201
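The encoderless idea of jointly estimating kinematic parameters and unknown joint angles can be illustrated with a planar toy problem. The single-axis gimbal model, known landmark, and noise level below are assumptions for the sketch, not the paper's full transformation chain:

```python
import numpy as np
from scipy.optimize import least_squares

def rot(th):
    c, s = np.cos(th), np.sin(th)
    return np.array([[c, -s], [s, c]])

# Planar "gimbal" toy: a known landmark p is observed as
#   m_i = R(theta_i) @ p + t
# for unknown joint angles theta_i and an unknown static offset t
# (the kinematic parameter to calibrate). Without encoders, both the
# angles and the offset enter the same least-squares problem.
rng = np.random.default_rng(0)
p = np.array([1.0, 0.0])
t_true = np.array([0.3, -0.2])
thetas_true = np.array([0.1, 0.7, 1.3, 2.0])
meas = np.stack([rot(th) @ p + t_true for th in thetas_true])
meas += rng.normal(scale=1e-3, size=meas.shape)    # sensor noise

def residuals(params):
    t, thetas = params[:2], params[2:]
    pred = np.stack([rot(th) @ p + t for th in thetas])
    return (pred - meas).ravel()

# Jointly solve for the offset and all joint angles from zero init.
sol = least_squares(residuals, np.zeros(2 + len(thetas_true)))
t_est, thetas_est = sol.x[:2], sol.x[2:]
```

The same pattern scales up to a full calibration: stack the unknown chain parameters and per-frame joint angles into one parameter vector and minimize reprojection residuals over all frames.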
GANVO: Unsupervised Deep Monocular Visual Odometry and Depth Estimation with Generative Adversarial Networks
In the last decade, supervised deep learning approaches have been extensively
employed in visual odometry (VO) applications, but they are not feasible in
environments where labelled data is scarce. On the other hand,
unsupervised deep learning approaches for localization and mapping in unknown
environments from unlabelled data have received comparatively less attention in
VO research. In this study, we propose a generative unsupervised learning
framework that predicts the 6-DoF camera motion and a monocular depth map of the
scene from unlabelled RGB image sequences, using deep convolutional Generative
Adversarial Networks (GANs). We create a supervisory signal by warping view
sequences and adopting the re-projection error minimization as the objective
loss function shared by the multi-view pose estimation and single-view depth
generation networks. Detailed quantitative and qualitative evaluations of the
proposed framework on the KITTI and Cityscapes datasets show that the proposed
method outperforms both existing traditional and unsupervised deep VO methods,
providing better results for both pose estimation and depth recovery.
Comment: ICRA 2019 - accepted
XNect: Real-time Multi-Person 3D Motion Capture with a Single RGB Camera
We present a real-time approach for multi-person 3D motion capture at over 30
fps using a single RGB camera. It operates successfully in generic scenes which
may contain occlusions by objects and by other people. Our method operates in
subsequent stages. The first stage is a convolutional neural network (CNN) that
estimates 2D and 3D pose features along with identity assignments for all
visible joints of all individuals. We contribute a new architecture for this
CNN, called SelecSLS Net, that uses novel selective long and short range skip
connections to improve the information flow allowing for a drastically faster
network without compromising accuracy. In the second stage, a fully connected
neural network turns the possibly partial (on account of occlusion) 2D pose and
3D pose features for each subject into a complete 3D pose estimate per
individual. The third stage applies space-time skeletal model fitting to the
predicted 2D and 3D pose per subject to further reconcile the 2D and 3D pose,
and enforce temporal coherence. Our method returns the full skeletal pose in
joint angles for each subject. This is a further key distinction from previous
work, which does not produce joint angle results of a coherent skeleton in real
time for multi-person scenes. The proposed system runs on consumer hardware at
a previously unseen speed of more than 30 fps given 512x320 images as input
while achieving state-of-the-art accuracy, which we will demonstrate on a range
of challenging real-world scenes.
Comment: To appear in ACM Transactions on Graphics (SIGGRAPH) 2020
XNect: Real-time Multi-person 3D Human Pose Estimation with a Single RGB Camera
We present a real-time approach for multi-person 3D motion capture at over 30 fps using a single RGB camera. It operates in generic scenes and is robust to difficult occlusions both by other people and objects. Our method operates in subsequent stages. The first stage is a convolutional neural network (CNN) that estimates 2D and 3D pose features along with identity assignments for all visible joints of all individuals. We contribute a new architecture for this CNN, called SelecSLS Net, that uses novel selective long and short range skip connections to improve the information flow allowing for a drastically faster network without compromising accuracy. In the second stage, a fully-connected neural network turns the possibly partial (on account of occlusion) 2D pose and 3D pose features for each subject into a complete 3D pose estimate per individual. The third stage applies space-time skeletal model fitting to the predicted 2D and 3D pose per subject to further reconcile the 2D and 3D pose, and enforce temporal coherence. Our method returns the full skeletal pose in joint angles for each subject. This is a further key distinction from previous work that neither extracted global body positions nor joint angle results of a coherent skeleton in real time for multi-person scenes. The proposed system runs on consumer hardware at a previously unseen speed of more than 30 fps given 512x320 images as input while achieving state-of-the-art accuracy, which we will demonstrate on a range of challenging real-world scenes
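The three-stage design above can be caricatured as a tiny pipeline. The mean-pose completion and exponential smoothing below are deliberately simple stand-ins for the paper's second-stage network and space-time skeletal fitting, and all names and fallbacks here are illustrative assumptions:

```python
import numpy as np

# Toy sketch of the staged design: stage 1 (the CNN) yields possibly
# partial per-joint 3D estimates with a visibility mask; stage 2
# completes missing joints (here: from a prior mean pose); stage 3
# enforces temporal coherence (here: exponential smoothing).
N_JOINTS = 17

def complete_pose(partial, visible, mean_pose):
    """Stage 2 stand-in: fill occluded joints from a prior pose."""
    full = partial.copy()
    full[~visible] = mean_pose[~visible]
    return full

def smooth(prev, current, alpha=0.8):
    """Stage 3 stand-in: blend with the previous frame's pose."""
    return alpha * current + (1 - alpha) * prev

mean_pose = np.zeros((N_JOINTS, 3))     # prior resting pose (assumed)
stage1 = np.ones((N_JOINTS, 3))         # toy per-joint 3D estimates
visible = np.ones(N_JOINTS, dtype=bool)
visible[0] = False                      # joint 0 occluded this frame
pose = complete_pose(stage1, visible, mean_pose)
smoothed = smooth(np.zeros((N_JOINTS, 3)), pose)
```

The point of the staging is that each stage has a bounded, per-person cost, which is what makes real-time multi-person operation possible.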
Keyframe-based monocular SLAM: design, survey, and future directions
Extensive research in the field of monocular SLAM for the past fifteen years
has yielded workable systems that found their way into various applications in
robotics and augmented reality. Although filter-based monocular SLAM systems
were once common, the more efficient keyframe-based solutions are becoming the
de facto methodology for building a monocular SLAM system. The
objective of this paper is threefold: first, the paper serves as a guideline
for people seeking to design their own monocular SLAM according to specific
environmental constraints. Second, it presents a survey that covers the various
keyframe-based monocular SLAM systems in the literature, detailing the
components of their implementation, and critically assessing the specific
strategies adopted in each proposed solution. Third, the paper provides insight
into the direction of future research in this field, to address the major
limitations still facing monocular SLAM, namely the issues of illumination
changes, initialization, highly dynamic motion, poorly textured scenes,
repetitive textures, map maintenance, and failure recovery
Optical techniques for 3D surface reconstruction in computer-assisted laparoscopic surgery
One of the main challenges for computer-assisted surgery (CAS) is to determine the intra-operative morphology and motion of soft tissues. This information is a prerequisite for the registration of multi-modal patient-specific data, enhancing the surgeon's navigation capabilities by observing beyond exposed tissue surfaces and providing intelligent control of robotic-assisted instruments. In minimally invasive surgery (MIS), optical techniques are an increasingly attractive approach for in vivo 3D reconstruction of the soft-tissue surface geometry. This paper reviews the state-of-the-art methods for optical intra-operative 3D reconstruction in laparoscopic surgery and discusses the technical challenges and future perspectives towards clinical translation. With the recent paradigm shift of surgical practice towards MIS and new developments in 3D optical imaging, this is a timely discussion about technologies that could facilitate complex CAS procedures in dynamic and deformable anatomical regions