3-D Hand Pose Estimation from Kinect's Point Cloud Using Appearance Matching
We present a novel appearance-based approach for pose estimation of a human
hand using the point clouds provided by the low-cost Microsoft Kinect sensor.
Both the free-hand case, in which the hand is isolated from the surrounding
environment, and the hand-object case, in which the different types of
interactions are classified, have been considered. The hand-object case is
clearly the more challenging of the two, since it has to deal with multiple tracks. The
approach proposed here belongs to the class of partial pose estimation where
the estimated pose in a frame is used for the initialization of the next one.
The pose estimation is obtained by applying a modified version of the Iterative
Closest Point (ICP) algorithm to synthetic models to obtain the rigid
transformation that aligns each model with respect to the input data. The
proposed framework uses a "pure" point cloud as provided by the Kinect sensor
without any other information such as RGB values or normal vector components.
For this reason, the proposed method can also be applied to data obtained from
other types of depth sensors or RGB-D cameras.
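The alignment step described in this abstract can be illustrated with a minimal sketch of standard point-to-point ICP. This is not the authors' modified algorithm: the function names, the brute-force nearest-neighbour search, and the `R0`/`t0` arguments (standing in for the previous frame's pose used as initialization) are assumptions made for illustration, and only XYZ coordinates are used, as in the "pure" point cloud setting.

```python
import numpy as np

def best_rigid_transform(src, dst):
    """Least-squares R, t such that R @ src[i] + t ~= dst[i] (Kabsch / SVD)."""
    src_c = src.mean(axis=0)
    dst_c = dst.mean(axis=0)
    H = (src - src_c).T @ (dst - dst_c)        # 3x3 cross-covariance matrix
    U, _, Vt = np.linalg.svd(H)
    # Flip the last axis if needed so that R is a rotation, not a reflection.
    d = 1.0 if np.linalg.det(Vt.T @ U.T) >= 0 else -1.0
    R = Vt.T @ np.diag([1.0, 1.0, d]) @ U.T
    t = dst_c - R @ src_c
    return R, t

def icp(model, scene, R0=None, t0=None, iters=30):
    """Align `model` (N x 3) to `scene` (M x 3), starting from pose (R0, t0).

    Passing the pose estimated in the previous frame as (R0, t0) mimics the
    frame-to-frame initialization described in the abstract (an assumption
    about how that step could look, not the paper's code).
    """
    R = np.eye(3) if R0 is None else R0
    t = np.zeros(3) if t0 is None else t0
    for _ in range(iters):
        moved = model @ R.T + t
        # Brute-force closest scene point for every transformed model point.
        d2 = ((moved[:, None, :] - scene[None, :, :]) ** 2).sum(axis=2)
        nearest = scene[d2.argmin(axis=1)]
        # Re-estimate the full pose from the current correspondences.
        R, t = best_rigid_transform(model, nearest)
    return R, t
```

Brute-force matching costs O(N·M) per iteration; a k-d tree would be the usual replacement for larger clouds. Plain ICP only converges reliably when the starting pose is already close, which is why a good initialization matters.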
Optical techniques for 3D surface reconstruction in computer-assisted laparoscopic surgery
One of the main challenges for computer-assisted surgery (CAS) is to determine the intra-operative morphology and motion of soft tissues. This information is a prerequisite to the registration of multi-modal patient-specific data for enhancing the surgeon’s navigation capabilities by observing beyond exposed tissue surfaces and for providing intelligent control of robotic-assisted instruments. In minimally invasive surgery (MIS), optical techniques are an increasingly attractive approach for in vivo 3D reconstruction of the soft-tissue surface geometry. This paper reviews the state-of-the-art methods for optical intra-operative 3D reconstruction in laparoscopic surgery and discusses the technical challenges and future perspectives towards clinical translation. With the recent paradigm shift of surgical practice towards MIS and new developments in 3D optical imaging, this is a timely discussion about technologies that could facilitate complex CAS procedures in dynamic and deformable anatomical regions.
LiveSketch: Query Perturbations for Guided Sketch-based Visual Search
LiveSketch is a novel algorithm for searching large image collections using
hand-sketched queries. LiveSketch tackles the inherent ambiguity of sketch
search by creating visual suggestions that augment the query as it is drawn,
making query specification an iterative rather than one-shot process that helps
disambiguate users' search intent. Our technical contributions are: a triplet
convnet architecture that incorporates an RNN-based variational autoencoder to
search for images using vector (stroke-based) queries; real-time clustering to
identify likely search intents (and so, targets within the search embedding);
and the use of backpropagation from those targets to perturb the input stroke
sequence, so suggesting alterations to the query in order to guide the search.
We show improvements in accuracy and time-to-task over contemporary baselines
using a 67M image corpus. Comment: Accepted to CVPR 201
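Two ideas in this abstract, the triplet objective and the use of backpropagation to perturb the query itself, can be sketched in a few lines. This is a toy illustration, not LiveSketch: a fixed linear map `W` stands in for the paper's network (an assumption), and `triplet_loss`, `perturb_query`, `step`, and `n_steps` are hypothetical names. The point is that the gradient is taken with respect to the input, so the query moves while the embedding stays fixed.

```python
import numpy as np

def triplet_loss(anchor, positive, negative, margin=0.2):
    """Hinge-style triplet loss on squared Euclidean distances."""
    d_pos = np.sum((anchor - positive) ** 2)
    d_neg = np.sum((anchor - negative) ** 2)
    return max(0.0, d_pos - d_neg + margin)

def perturb_query(query, W, target, step=0.1, n_steps=50):
    """Gradient descent on the query so that W @ query approaches `target`.

    `W` is a stand-in linear embedding (assumption); with a real network the
    same gradient would be obtained by backpropagating to the input.
    """
    q = np.array(query, dtype=float)
    for _ in range(n_steps):
        residual = W @ q - target        # error in embedding space
        grad = 2.0 * W.T @ residual      # d/dq of ||W q - target||^2
        q -= step * grad
    return q
```

In the setting the abstract describes, `target` would correspond to a cluster of likely search intents in the embedding space, and the perturbed query would be shown to the user as a suggested alteration.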
AutoSynth: Learning to Generate 3D Training Data for Object Point Cloud Registration
In the current deep learning paradigm, the amount and quality of training
data are as critical as the network architecture and its training details.
However, collecting, processing, and annotating real data at scale is
difficult, expensive, and time-consuming, particularly for tasks such as 3D
object registration. While synthetic datasets can be created, they require
expertise to design and include a limited number of categories. In this paper,
we introduce a new approach called AutoSynth, which automatically generates 3D
training data for point cloud registration. Specifically, AutoSynth
automatically curates an optimal dataset by exploring a search space
encompassing millions of potential datasets with diverse 3D shapes at a low
cost. To achieve this, we generate synthetic 3D datasets by assembling shape
primitives, and develop a meta-learning strategy to search for the best
training data for 3D registration on real point clouds. For this search to
remain tractable, we replace the point cloud registration network with a much
smaller surrogate network, leading to a times speedup. We demonstrate
the generality of our approach by implementing it with two different point
cloud registration networks, BPNet and IDAM. Our results on TUD-L, LINEMOD and
Occluded-LINEMOD evidence that a neural network trained on our searched dataset
yields consistently better performance than the same one trained on the widely
used ModelNet40 dataset. Comment: accepted by ICCV202
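The data-generation idea, assembling shape primitives into synthetic objects and deriving registration pairs from them, can be sketched as follows. This is a rough illustration under stated assumptions: the choice of primitives (spheres and boxes), the sampling ranges, and all function names are hypothetical, and the meta-learning search and surrogate network described in the abstract are not shown.

```python
import numpy as np

def sample_sphere(rng, n, radius=1.0):
    """n points on a sphere surface (normalized Gaussian samples)."""
    v = rng.normal(size=(n, 3))
    return radius * v / np.linalg.norm(v, axis=1, keepdims=True)

def sample_box(rng, n, half_extents=(1.0, 1.0, 1.0)):
    """n points on the faces of an axis-aligned box (not area-uniform)."""
    pts = rng.uniform(-1.0, 1.0, size=(n, 3))
    axis = rng.integers(0, 3, size=n)             # which coordinate to clamp
    pts[np.arange(n), axis] = rng.choice([-1.0, 1.0], size=n)
    return pts * np.asarray(half_extents)

def random_rotation(rng):
    """A random proper rotation from the QR factorization of a Gaussian matrix."""
    q, _ = np.linalg.qr(rng.normal(size=(3, 3)))
    if np.linalg.det(q) < 0:                      # ensure det = +1
        q[:, 0] *= -1.0
    return q

def assemble_shape(rng, n_primitives=3, pts_per_primitive=200):
    """Build one synthetic object from randomly posed, randomly scaled primitives."""
    parts = []
    for _ in range(n_primitives):
        sampler = sample_sphere if rng.random() < 0.5 else sample_box
        pts = sampler(rng, pts_per_primitive)
        scale = rng.uniform(0.2, 1.0)
        offset = rng.uniform(-1.0, 1.0, size=3)
        parts.append(scale * pts @ random_rotation(rng).T + offset)
    return np.concatenate(parts, axis=0)

def make_registration_pair(rng, shape):
    """Return (source, target, R, t) with target = source @ R.T + t."""
    R = random_rotation(rng)
    t = rng.uniform(-0.5, 0.5, size=3)
    return shape, shape @ R.T + t, R, t
```

Because the ground-truth `R` and `t` are known by construction, each pair comes with a free label. A dataset search of the kind the abstract describes would then vary parameters such as the primitive mix and scales and score each candidate dataset by downstream registration accuracy.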
Model-Based Environmental Visual Perception for Humanoid Robots
The visual perception of a robot should answer two fundamental questions: What? and Where? In order to properly and efficiently reply to these questions, it is essential to establish a bidirectional coupling between the external stimuli and the internal representations. This coupling links the physical world with the inner abstraction models by sensor transformation, recognition, matching and optimization algorithms. The objective of this PhD is to establish this sensor-model coupling.