6,800 research outputs found

Hybrid One-Shot 3D Hand Pose Estimation by Exploiting Uncertainties

Author: Argyros Antonis A.
Bischof Horst
Michel Damien
Poier Georg
Roditakis Konstantinos
Schulter Samuel
Publication venue: 'British Machine Vision Association and Society for Pattern Recognition'
Publication date: 01/01/2015

Model-based approaches to 3D hand tracking have been shown to perform well in a wide range of scenarios. However, they require initialisation and cannot recover easily from tracking failures that occur due to fast hand motions. Data-driven approaches, on the other hand, can quickly deliver a solution, but the results often suffer from lower accuracy or missing anatomical validity compared to those obtained from model-based approaches. In this work we propose a hybrid approach for hand pose estimation from a single depth image. First, a learned regressor is employed to deliver multiple initial hypotheses for the 3D position of each hand joint. Subsequently, the kinematic parameters of a 3D hand model are found by deliberately exploiting the inherent uncertainty of the inferred joint proposals. This way, the method provides anatomically valid and accurate solutions without requiring manual initialisation or suffering from track losses. Quantitative results on several standard datasets demonstrate that the proposed method outperforms state-of-the-art representatives of the model-based, data-driven and hybrid paradigms.

Comment: BMVC 2015 (oral); see also http://lrs.icg.tugraz.at/research/hybridhape

arXiv.org e-Print Archive

Crossref

DeepKey: Towards End-to-End Physical Key Replication From a Single Photograph

Author: Alex Krizhevsky
EH Adelson
Jürgen Schmidhuber
Kaiming He
S Ren
Publication date: 04/11/2018

This paper describes DeepKey, an end-to-end deep neural architecture capable of taking a digital RGB image of an 'everyday' scene containing a pin tumbler key (e.g. lying on a table or carpet) and fully automatically inferring a printable 3D key model. We report on the key detection performance and describe how candidates can be transformed into physical prints. We show an example opening a real-world lock. Our system is described in detail, providing a breakdown of all components including key detection, pose normalisation, bitting segmentation and 3D model inference. We provide an in-depth evaluation and conclude by reflecting on limitations, applications, potential security risks and societal impact. We contribute the DeepKey Datasets of 5,300+ images covering a few test keys with bounding boxes, pose and unaligned mask data.

Comment: 14 pages, 12 figure

arXiv.org e-Print Archive

Crossref

Explore Bristol Research

Enhancing RGB-D SLAM Using Deep Learning

Author: Ming Yuhang
Publication date: 24/01/2023

Explore Bristol Research

Exploiting Spatio-Temporal Coherence for Video Object Detection in Robotics

Author: Fernandez-Chaves David
Gonzalez-Jimenez Javier
Matez-Bandera Jose Luis
Monroy Javier
Petkov Nicolai
Ruiz-Sarmiento Jose Raul
Publication venue: 'Springer Science and Business Media LLC'
Publication date: 01/01/2021

This paper proposes a method to enhance video object detection for indoor environments in robotics. Concretely, it exploits knowledge about the camera motion between frames to propagate previously detected objects to successive frames. The proposal is rooted in the concepts of planar homography, to propose regions of interest in which to find objects, and recursive Bayesian filtering, to integrate observations over time. The proposal is evaluated on six virtual indoor environments, accounting for the detection of nine object classes over a total of ∼7k frames. Results show that our proposal improves the recall and the F1-score by factors of 1.41 and 1.27, respectively, and achieves a significant reduction of the object categorization entropy (58.8%) when compared to a two-stage video object detection method used as baseline, at the cost of small time overheads (120 ms) and precision loss (0.92).
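The recursive Bayesian filtering idea mentioned in this abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation: the class names, the per-frame detector scores, and the helper names `bayes_update` and `entropy` are all hypothetical, and the sketch assumes that each frame yields a per-class likelihood for an object that has already been associated across frames.

```python
import math


def bayes_update(prior, likelihood):
    """One recursive Bayesian step: posterior is proportional to likelihood * prior."""
    unnormalised = [p * l for p, l in zip(prior, likelihood)]
    total = sum(unnormalised)
    if total == 0:
        # Degenerate observation: keep the previous belief unchanged.
        return list(prior)
    return [u / total for u in unnormalised]


def entropy(dist):
    """Shannon entropy (in bits) of a categorical distribution."""
    return -sum(p * math.log2(p) for p in dist if p > 0)


# Hypothetical object classes and per-frame detector scores (illustrative only).
classes = ["chair", "table", "sofa"]
observations = [
    [0.6, 0.3, 0.1],
    [0.5, 0.4, 0.1],
    [0.7, 0.2, 0.1],
]

# Start from a uniform belief and integrate one observation per frame.
belief = [1.0 / len(classes)] * len(classes)
history = [entropy(belief)]
for scores in observations:
    belief = bayes_update(belief, scores)
    history.append(entropy(belief))

best = classes[belief.index(max(belief))]
```

With consistent observations, the belief concentrates on one class and its entropy falls from frame to frame, which is the kind of categorization-entropy reduction the abstract reports.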

Proceedings - University of Groningen

University of Groningen

ARTS repository - University of Groningen

Dissertations of the University of Groningen

Generic 3D Representation via Pose Estimation and Matching

Author: B Caprile
B Li
C Xu
D Tell
DG Lowe
EJ Gibson
H Bay
J Matas
J Weston
JM Morel
K Köser
K Mikolajczyk
Karen Simonyan
L Smith
L Van der Maaten
M Brown
MJ Tarr
N Silberman
Nancy Rader
P Denis
P Moreels
R Hartley
R Held
R Kümmerle
S Agarwal
Z Zhang
Publication venue: 'Springer Science and Business Media LLC'
Publication date: 23/10/2017

Though a large body of computer vision research has investigated developing generic semantic representations, efforts towards developing a similar representation for 3D have been limited. In this paper, we learn a generic 3D representation through solving a set of foundational proxy 3D tasks: object-centric camera pose estimation and wide baseline feature matching. Our method is based upon the premise that by providing supervision over a set of carefully selected foundational tasks, generalization to novel tasks and abstraction capabilities can be achieved. We empirically show that the internal representation of a multi-task ConvNet trained to solve the above core problems generalizes to novel 3D tasks (e.g., scene layout estimation, object pose estimation, surface normal estimation) without the need for fine-tuning and shows traits of abstraction abilities (e.g., cross-modality pose estimation). In the context of the core supervised tasks, we demonstrate that our representation achieves state-of-the-art wide baseline feature matching results without requiring a priori rectification (unlike SIFT and the majority of learned features). We also show 6DOF camera pose estimation given a pair of local image patches. The accuracy on both supervised tasks is comparable to that of humans. Finally, we contribute a large-scale dataset composed of object-centric street view scenes along with point correspondences and camera pose information, and conclude with a discussion on the learned representation and open research questions.

Comment: Published in ECCV16. See the project website http://3drepresentation.stanford.edu/ and dataset website https://github.com/amir32002/3D_Street_Vie

arXiv.org e-Print Archive

Crossref

Towards Advanced Robotic Manipulations for Nuclear Decommissioning

Author: Bekiroglu Yasemin
Kuo Jeffrey
Marturi Naresh
Ortenzi Valerio
Rajasekaran Vijaykumar
Rastegarpanah Alireza
Stolkin Rustam
Publication venue: 'IntechOpen'
Publication date: 01/01/2017

Despite enormous remote handling requirements, remarkably few robots are used by the nuclear industry. Most remote handling tasks are still performed manually, using conventional mechanical master‐slave devices. The few robotic manipulators deployed are directly tele‐operated in rudimentary ways, with almost no autonomy or even pre‐programmed motion. In addition, the majority of these robots are under‐sensored (i.e. with no proprioception), which prevents their use for automatic tasks. In this context, this chapter primarily discusses human operator performance in accomplishing heavy‐duty remote handling tasks in hazardous environments such as nuclear decommissioning. Multiple factors are evaluated to analyse the human operators’ performance and workload. Also, direct human tele‐operation is compared against human‐supervised semi‐autonomous control exploiting computer vision. Secondarily, a vision‐guided solution towards enabling advanced control and automating the under‐sensored robots is presented. To maintain coherence with real nuclear scenarios, the experiments are conducted in a lab environment and the results are discussed.

IntechOpen

University of Birmingham Research Portal

Chalmers Research

Egocentric Perception of Hands and Its Applications

Author: Lu Yao
Publication date: 22/03/2022

Explore Bristol Research