6,800 research outputs found
Hybrid One-Shot 3D Hand Pose Estimation by Exploiting Uncertainties
Model-based approaches to 3D hand tracking have been shown to perform well in
a wide range of scenarios. However, they require initialisation and cannot
recover easily from tracking failures that occur due to fast hand motions.
Data-driven approaches, on the other hand, can quickly deliver a solution, but
the results often suffer from lower accuracy or missing anatomical validity
compared to those obtained from model-based approaches. In this work we propose
a hybrid approach for hand pose estimation from a single depth image. First, a
learned regressor is employed to deliver multiple initial hypotheses for the 3D
position of each hand joint. Subsequently, the kinematic parameters of a 3D
hand model are found by deliberately exploiting the inherent uncertainty of the
inferred joint proposals. This way, the method provides anatomically valid and
accurate solutions without requiring manual initialisation or suffering from
track losses. Quantitative results on several standard datasets demonstrate
that the proposed method outperforms state-of-the-art representatives of the
model-based, data-driven and hybrid paradigms.Comment: BMVC 2015 (oral); see also
http://lrs.icg.tugraz.at/research/hybridhape
DeepKey: Towards End-to-End Physical Key Replication From a Single Photograph
This paper describes DeepKey, an end-to-end deep neural architecture capable
of taking a digital RGB image of an 'everyday' scene containing a pin tumbler
key (e.g. lying on a table or carpet) and fully automatically inferring a
printable 3D key model. We report on the key detection performance and describe
how candidates can be transformed into physical prints. We show an example
opening a real-world lock. Our system is described in detail, providing a
breakdown of all components including key detection, pose normalisation,
bitting segmentation and 3D model inference. We provide an in-depth evaluation
and conclude by reflecting on limitations, applications, potential security
risks and societal impact. We contribute the DeepKey Datasets of 5, 300+ images
covering a few test keys with bounding boxes, pose and unaligned mask data.Comment: 14 pages, 12 figure
Exploiting Spatio-Temporal Coherence for Video Object Detection in Robotics
This paper proposes a method to enhance video object detection for indoor environments in robotics. Concretely, it exploits knowledge about the camera motion between frames to propagate previously detected objects to successive frames. The proposal is rooted in the concepts of planar homography to propose regions of interest where to find objects, and recursive Bayesian filtering to integrate observations over time. The proposal is evaluated on six virtual, indoor environments, accounting for the detection of nine object classes over a total of ∼ 7k frames. Results show that our proposal improves the recall and the F1-score by a factor of 1.41 and 1.27, respectively, as well as it achieves a significant reduction of the object categorization entropy (58.8%) when compared to a two-stage video object detection method used as baseline, at the cost of small time overheads (120 ms) and precision loss (0.92).</p
Generic 3D Representation via Pose Estimation and Matching
Though a large body of computer vision research has investigated developing
generic semantic representations, efforts towards developing a similar
representation for 3D has been limited. In this paper, we learn a generic 3D
representation through solving a set of foundational proxy 3D tasks:
object-centric camera pose estimation and wide baseline feature matching. Our
method is based upon the premise that by providing supervision over a set of
carefully selected foundational tasks, generalization to novel tasks and
abstraction capabilities can be achieved. We empirically show that the internal
representation of a multi-task ConvNet trained to solve the above core problems
generalizes to novel 3D tasks (e.g., scene layout estimation, object pose
estimation, surface normal estimation) without the need for fine-tuning and
shows traits of abstraction abilities (e.g., cross-modality pose estimation).
In the context of the core supervised tasks, we demonstrate our representation
achieves state-of-the-art wide baseline feature matching results without
requiring apriori rectification (unlike SIFT and the majority of learned
features). We also show 6DOF camera pose estimation given a pair local image
patches. The accuracy of both supervised tasks come comparable to humans.
Finally, we contribute a large-scale dataset composed of object-centric street
view scenes along with point correspondences and camera pose information, and
conclude with a discussion on the learned representation and open research
questions.Comment: Published in ECCV16. See the project website
http://3drepresentation.stanford.edu/ and dataset website
https://github.com/amir32002/3D_Street_Vie
Towards Advanced Robotic Manipulations for Nuclear Decommissioning
Despite enormous remote handling requirements, remarkably very few robots are being used by the nuclear industry. Most of the remote handling tasks are still performed manually, using conventional mechanical master‐slave devices. The few robotic manipulators deployed are directly tele‐operated in rudimentary ways, with almost no autonomy or even a pre‐programmed motion. In addition, majority of these robots are under‐sensored (i.e. with no proprioception), which prevents them to use for automatic tasks. In this context, primarily this chapter discusses the human operator performance in accomplishing heavy‐duty remote handling tasks in hazardous environments such as nuclear decommissioning. Multiple factors are evaluated to analyse the human operators’ performance and workload. Also, direct human tele‐operation is compared against human‐supervised semi‐autonomous control exploiting computer vision. Secondarily, a vision‐guided solution towards enabling advanced control and automating the under‐sensored robots is presented. Maintaining the coherence with real nuclear scenario, the experiments are conducted in the lab environment and results are discussed
- …