4 research outputs found

    3D keypoint detectors and descriptors for 3D objects recognition with TOF camera

    The goal of this work is to evaluate 3D keypoint detectors and descriptors that could be used for quasi real-time 3D object recognition. The work presented has three main objectives: extracting descriptors from real depth images, achieving a good degree of invariance and robustness to scale and viewpoint changes, and keeping the computation time as low as possible. Using a 3D time-of-flight (ToF) depth camera, we record a sequence for several objects at 3 different distances and from 5 viewpoints. 3D salient points are then extracted using 2 different curvature-based detectors. For each point, two local surface descriptors are computed by combining the shape index histogram with the normalized histogram of angles between the normal of the reference feature point and the normals of its neighbours. A comparison of the two detectors and descriptors was conducted on 4 different objects. Experiments show that both detectors and descriptors are rather invariant to variations of scale and viewpoint. We also find that the new 3D keypoint detector we propose is more stable than a previously proposed Shape Index based detector.
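
    A minimal Python sketch of such a local descriptor, under our own assumptions (precomputed principal curvatures k1, k2 and unit normals for the neighbourhood, 16 bins per histogram); it is an illustration of the idea, not the authors' implementation:

    import numpy as np

    def shape_index(k1, k2):
        """Koenderink-style shape index mapped to [0, 1] from principal curvatures."""
        kmax, kmin = np.maximum(k1, k2), np.minimum(k1, k2)
        return 0.5 - (1.0 / np.pi) * np.arctan2(kmax + kmin, kmax - kmin)

    def local_descriptor(k1, k2, normals, ref_normal, n_bins=16):
        """Concatenate a shape-index histogram with a histogram of angles between
        the reference normal and the neighbours' normals, then L2-normalise."""
        si_hist, _ = np.histogram(shape_index(k1, k2), bins=n_bins, range=(0.0, 1.0))
        cosines = np.clip(normals @ ref_normal, -1.0, 1.0)
        ang_hist, _ = np.histogram(np.arccos(cosines), bins=n_bins, range=(0.0, np.pi))
        desc = np.concatenate([si_hist, ang_hist]).astype(float)
        return desc / (np.linalg.norm(desc) + 1e-9)

    # Toy usage with a random neighbourhood of 50 points.
    rng = np.random.default_rng(0)
    k1, k2 = rng.normal(size=(2, 50))
    normals = rng.normal(size=(50, 3))
    normals /= np.linalg.norm(normals, axis=1, keepdims=True)
    print(local_descriptor(k1, k2, normals, np.array([0.0, 0.0, 1.0])).shape)  # (32,)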

    Fast 3D keypoints detector and descriptor for view-based 3D objects recognition

    In this paper, we propose a new 3D object recognition method that employs a set of 3D keypoints extracted from the point cloud representation of 3D views. The method makes use of the 2D organization of the range data produced by the 3D sensor. Our novel 3D interest point approach relies on surface type classification and combines the Shape Index (SI)-Curvedness (C) map with the Gaussian (K)-Mean (H) curvature map. For each extracted keypoint, a local description using the point and its neighbors is computed by joining the Shape Index histogram and the normalized histogram of angles between normals. The proposed descriptor, IndSHOT, stems from the CSHOT descriptor (Color Signature of Histograms of OrienTations), which is based on the definition of a local, robust and invariant Reference Frame (RF). This surface patch descriptor is used to find the correspondences between query-model view pairs in an effective and robust way. Experimental results on Kinect-based datasets are presented to validate the proposed approach in view-based 3D object recognition.
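
    For illustration, a minimal sketch of the kind of HK-style surface-type labelling and curvedness test a detector of this family can build on; the function names, the disagreement rule and all thresholds are our own assumptions, not the paper's detector:

    import numpy as np

    def hk_surface_type(H, K, tol=1e-4):
        """Label each pixel by the signs of its mean (H) and Gaussian (K) curvature."""
        Hs = np.where(np.abs(H) < tol, 0.0, np.sign(H))
        Ks = np.where(np.abs(K) < tol, 0.0, np.sign(K))
        return ((Hs + 1) * 3 + (Ks + 1)).astype(int)   # encode the sign pair as 0..8

    def keypoint_mask(H, K, curvedness, c_min=0.05):
        """Mark pixels whose surface-type label disagrees with at least half of their
        8-neighbourhood and whose curvedness is high enough (a crude saliency test)."""
        labels = hk_surface_type(H, K)
        disagree = np.zeros_like(labels)
        for dy in (-1, 0, 1):
            for dx in (-1, 0, 1):
                if dy == 0 and dx == 0:
                    continue
                shifted = np.roll(np.roll(labels, dy, axis=0), dx, axis=1)
                disagree += (shifted != labels)
        return (disagree >= 4) & (curvedness > c_min)

    # Toy usage on synthetic 64x64 curvature maps.
    rng = np.random.default_rng(0)
    H_map, K_map, C_map = rng.normal(size=(3, 64, 64))
    print(keypoint_mask(H_map, K_map, np.abs(C_map)).sum(), "candidate keypoints")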

    Combining shape and color. A bottom-up approach to evaluate object similarities

    The objective of the present work is to develop a bottom-up approach to estimate the similarity between two unknown objects. Given a set of digital images, we want to identify the main objects and to determine whether they are similar or not. In the last decades many object recognition and classification strategies, driven by higher-level activities, have been successfully developed. The peculiarity of this work, instead, is the attempt to work without any training phase or a priori knowledge about the objects or their context. Indeed, if we suppose to be in an unstructured and completely unknown environment, we usually have to deal with novel objects never seen before; under these hypotheses, it would be very useful to define some kind of similarity among the instances under analysis (even if we do not know which category they belong to). To obtain this result, we start by observing that human beings use a lot of information and analyze very different aspects to achieve object recognition: shape, position, color and so on. Hence we try to reproduce part of this process, combining different methodologies (each working on a specific characteristic) to obtain a more meaningful idea of similarity. Mainly inspired by the human conception of representation, we identify two main characteristics, which we call the implicit and explicit models. The term "explicit" is used to account for the main traits of what, in the human representation, connotes a principal source of information regarding a category, a sort of visual synecdoche (corresponding to the shape); the term "implicit", on the other hand, accounts for the object rendered by shadows and lights, colors and volumetric impression, a sort of visual metonymy (corresponding to the chromatic characteristics). During the work, we had to face several problems and we tried to define specific solutions. In particular, our contributions concern:
    - defining a bottom-up approach to image segmentation (which does not rely on any a priori knowledge);
    - combining different features to evaluate object similarity (focusing particularly on shape and color);
    - defining a generic distance (similarity) measure between objects (without any attempt to identify the possible category they belong to);
    - analyzing the consequences of using the number of modes as an estimate of the number of mixture components (in the Expectation-Maximization algorithm), as sketched below.
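
    The last contribution can be illustrated with a small Python example under our own assumptions (1-D features, a KDE-based mode count, and scikit-learn's GaussianMixture as the EM implementation); this is a sketch of the idea, not the thesis code:

    import numpy as np
    from scipy.stats import gaussian_kde
    from scipy.signal import find_peaks
    from sklearn.mixture import GaussianMixture

    def modes_count(samples, grid_size=512):
        """Count the local maxima of a 1-D kernel density estimate."""
        grid = np.linspace(samples.min(), samples.max(), grid_size)
        density = gaussian_kde(samples)(grid)
        peaks, _ = find_peaks(density)
        return max(len(peaks), 1)

    def fit_mixture_by_modes(samples):
        """Fit a Gaussian mixture whose component count equals the mode count."""
        gmm = GaussianMixture(n_components=modes_count(samples), random_state=0)
        return gmm.fit(samples.reshape(-1, 1))

    # Toy usage: a bimodal sample should yield roughly two components.
    rng = np.random.default_rng(0)
    data = np.concatenate([rng.normal(0.0, 1.0, 500), rng.normal(6.0, 1.0, 500)])
    print(fit_mixture_by_modes(data).n_components)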

    Exploiting Spatio-Temporal Coherence for Video Object Detection in Robotics

    This paper proposes a method to enhance video object detection for indoor environments in robotics. Concretely, it exploits knowledge about the camera motion between frames to propagate previously detected objects to successive frames. The proposal is rooted in the concepts of planar homography, used to propose regions of interest in which to find objects, and recursive Bayesian filtering, used to integrate observations over time. The proposal is evaluated on six virtual indoor environments, accounting for the detection of nine object classes over a total of ∼7k frames. Results show that our proposal improves the recall and the F1-score by factors of 1.41 and 1.27, respectively, and achieves a significant reduction of the object categorization entropy (58.8%) when compared to a two-stage video object detection method used as baseline, at the cost of small time overheads (120 ms) and precision loss (0.92).
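
    The two ingredients named above, homography-based propagation of a previous detection and recursive Bayesian fusion of class scores, can be illustrated with the following Python sketch; the helper functions and the toy homography are our own assumptions, not the authors' pipeline:

    import numpy as np

    def warp_box(box, H):
        """Warp an axis-aligned box (x1, y1, x2, y2) with homography H and return
        the axis-aligned bounding box of the warped corners."""
        x1, y1, x2, y2 = box
        corners = np.array([[x1, y1, 1], [x2, y1, 1], [x2, y2, 1], [x1, y2, 1]], float).T
        warped = H @ corners
        warped = warped[:2] / warped[2]                # back to inhomogeneous coordinates
        return (*warped.min(axis=1), *warped.max(axis=1))

    def bayes_update(prior, likelihood):
        """Recursive Bayesian fusion of per-class probabilities over frames."""
        posterior = prior * likelihood
        return posterior / posterior.sum()

    # Toy usage: propagate a box with a small translation homography and fuse two
    # noisy class-score vectors for a 3-class problem.
    H = np.array([[1.0, 0.0, 5.0], [0.0, 1.0, -3.0], [0.0, 0.0, 1.0]])
    print(warp_box((10, 20, 50, 60), H))
    belief = np.full(3, 1.0 / 3.0)
    for observation in ([0.6, 0.3, 0.1], [0.7, 0.2, 0.1]):
        belief = bayes_update(belief, np.array(observation))
    print(belief)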