13 research outputs found

    A systems engineering approach to robotic bin picking

    In recent times the presence of vision and robotic systems in industry has become commonplace, but in spite of many achievements a large range of industrial tasks remains unsolved due to the lack of flexibility of vision systems when dealing with highly adaptive manufacturing environments. An important task found across a broad range of modern flexible manufacturing environments is the need to present parts to automated machinery from a supply bin. In order to carry out grasping and manipulation operations safely and efficiently, we need to know the identity, location and spatial orientation of the objects that lie in an unstructured heap in a bin. Historically, the bin picking problem was tackled using mechanical vibratory feeders, where vision feedback was unavailable. This solution suffers from parts jamming and, more importantly, such feeders are highly dedicated: if a change in the manufacturing process is required, the changeover may involve extensive re-tooling and a total revision of the system control strategy (Kelley et al., 1982). Due to these disadvantages, modern bin picking systems perform grasping and manipulation operations using vision feedback (Yoshimi & Allen, 1994). Vision-based robotic bin picking has been the subject of research since the introduction of automated vision-controlled processes in industry, and a review of existing systems indicates that none of the proposed solutions is able to solve this classic vision problem in its generality. One of the main challenges facing such a bin picking system is its ability to deal with overlapping objects. Object recognition in cluttered scenes is the main objective of these systems, and early approaches attempted to perform bin picking operations for similar objects jumbled together in an unstructured heap using no knowledge about the pose or geometry of the parts (Birk et al., 1981). While these assumptions may be acceptable for a restricted number of applications, in most practical cases a flexible system must deal with more than one type of object with a wide range of shapes. A flexible bin picking system has to address three difficult problems: scene interpretation, object recognition and pose estimation. Initial approaches to these tasks were based on modeling parts using 2D surface representations. Typical 2D representations include invariant shape descriptors (Zisserman et al., 1994), algebraic curves (Tarel & Cooper, 2000), conics (Bolles & Horaud, 1986; Forsyth et al., 1991) and appearance-based models (Murase & Nayar, 1995; Ohba & Ikeuchi, 1997). These systems are generally better suited to planar object recognition and are not able to deal with severe viewpoint distortions or objects with complex shapes/textures. Also, the spatial orientation cannot be robustly estimated for objects with free-form contours. To address this limitation, most bin picking systems attempt to recognize the scene objects and estimate their spatial orientation using 3D information (Fan et al., 1989; Faugeras & Hebert, 1986). Notable approaches include the use of 3D local descriptors (Ansar & Daniilidis, 2003; Campbell & Flynn, 2001; Kim & Kak, 1991), polyhedra (Rothwell & Stern, 1996), generalized cylinders (Ponce et al., 1989; Zerroug & Nevatia, 1996), super-quadrics (Blane et al., 2000) and visual learning methods (Johnson & Hebert, 1999; Mittrapiyanuruk et al., 2004). 
The most difficult problem for 3D bin picking systems that are based on a structural description of the objects (local descriptors or 3D primitives) is the complex procedure required to perform the scene-to-model feature matching. This procedure is usually based on complex graph-searching techniques and becomes increasingly difficult when dealing with object occlusions, a situation in which the structural description of the scene objects is incomplete. Visual learning methods based on eigenimage analysis have been proposed as an alternative solution to object recognition and pose estimation for objects with complex appearances. In this regard, Johnson and Hebert (Johnson & Hebert, 1999) developed an object recognition scheme that is able to identify multiple 3D objects in scenes affected by clutter and occlusion. They proposed an eigenimage analysis approach that is applied to match surface points using the spin image representation. The main attraction of this approach resides in the use of spin images, which are local surface descriptors; hence they can be easily identified in real scenes that contain clutter and occlusions. This approach returns accurate results, but the pose estimation cannot be inferred, as the spin images are local descriptors and are not robust enough to capture the object orientation. In general, pose sampling for visual learning methods is a difficult problem to solve, as the number of views required to sample the full six degrees of freedom of object pose is prohibitive. This issue was addressed by Edwards (Edwards, 1996), who applied eigenimage analysis to a one-object scene; his approach was able to estimate the pose only in cases where the tilt angle was limited to 30 degrees with respect to the optical axis of the sensor. In this chapter we describe the implementation of a vision sensor for robotic bin picking in which we attempt to eliminate the main problem faced by visual learning methods, namely the pose sampling problem. The chapter is organized as follows. Section 2 outlines the overall system. Section 3 describes the implementation of the range sensor, while Section 4 details the edge-based segmentation algorithm. Section 5 presents the viewpoint correction algorithm that is applied to align the detected object surfaces perpendicular to the optical axis of the sensor. Section 6 describes the object recognition algorithm. This is followed in Section 7 by an outline of the pose estimation algorithm. Section 8 presents a number of experimental results illustrating the benefits of the approach outlined in this chapter.
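
    At the core of these visual learning methods is eigenimage (PCA) analysis: normalized training views are projected into a low-dimensional eigenspace and a query is matched by proximity in that space. The sketch below illustrates only this generic idea; the image normalization, number of components and nearest-neighbour matching rule are assumptions made for the example and not the configuration used in the chapter.

```python
# Minimal eigenimage (PCA) matching sketch, assuming flattened, intensity-normalized views.
import numpy as np

def build_eigenspace(train_images, n_components=20):
    """train_images: (N, H*W) array of flattened training views."""
    mean = train_images.mean(axis=0)
    centered = train_images - mean
    # SVD of the centered data yields the principal eigenimages as rows of vt.
    _, _, vt = np.linalg.svd(centered, full_matrices=False)
    basis = vt[:n_components]            # (n_components, H*W) eigenimages
    coords = centered @ basis.T          # training coordinates in eigenspace
    return mean, basis, coords

def recognize(query_image, mean, basis, coords, labels):
    """Return the label of the training view closest to the query in eigenspace."""
    q = (query_image - mean) @ basis.T
    dists = np.linalg.norm(coords - q, axis=1)
    return labels[int(np.argmin(dists))]
```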

    Two image comparison methods for object identification from prospective data

    This study addresses the problem of identifying moving objects from data delivered by a prospective sensor whose design is currently in progress. The goal is to assess the feasibility of such identification using the pattern recognition tools available today. This paper presents the complete realization of a simulation chain, comprising both the generation of the (not yet available) data and the implementation of processes capable of exploiting them for identification purposes. Variable parameters control the nature of the images (richness, noise level) throughout the simulation, so that data of varying quality can be taken into account.
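
    As a rough illustration of the kind of controllable data generation described above, the sketch below renders a synthetic image whose content richness and noise level are set by two parameters. The parameter names and the blob-based scene model are assumptions made for the example; the abstract does not specify how the simulated sensor images are actually produced.

```python
# Hypothetical simulation step: "richness" = number of rendered features, plus Gaussian noise.
import numpy as np

def simulate_image(shape=(128, 128), n_features=10, noise_sigma=0.05, rng=None):
    """Render n_features Gaussian blobs at random positions and add sensor-like noise."""
    rng = np.random.default_rng() if rng is None else rng
    img = np.zeros(shape)
    ys, xs = np.mgrid[0:shape[0], 0:shape[1]]
    for _ in range(n_features):
        cy, cx = rng.uniform(0, shape[0]), rng.uniform(0, shape[1])
        img += np.exp(-((ys - cy) ** 2 + (xs - cx) ** 2) / (2 * 5.0 ** 2))
    img /= img.max() + 1e-9                       # normalize the clean image to [0, 1]
    return img + rng.normal(0.0, noise_sigma, size=shape)
```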

    Shape description and matching using integral invariants on eccentricity transformed images

    Matching occluded and noisy shapes is a problem frequently encountered in medical image analysis and, more generally, in computer vision. To keep track of changes inside the breast, for example, it is important for a computer-aided detection system to establish correspondences between regions of interest. Shape transformations, computed both with integral invariants (II) and with geodesic distance, yield signatures that are invariant to isometric deformations such as bending and articulations. Integral invariants describe the boundaries of planar shapes. However, they provide no information about where a particular feature lies on the boundary with regard to the overall shape structure. Conversely, eccentricity transforms (Ecc) can match shapes by signatures of geodesic distance histograms based on information from inside the shape, but they ignore the boundary information. We describe a method that combines the boundary signature of a shape obtained from II with structural information from the Ecc to yield results that improve on either used separately.
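
    A much-simplified sketch of the combined signature idea is given below: the area integral invariant at each boundary point is approximated by the fraction of a small disc that falls inside the shape, and the eccentricity-style component by a histogram of farthest-point distances between interior pixels, with Euclidean distance standing in for the geodesic distance used by the Ecc transform. The radius, bin count and distance approximation are illustrative assumptions, not the paper's actual formulation.

```python
import numpy as np

def integral_invariant_signature(mask, boundary_pts, r=5):
    """mask: 2D boolean array (True inside the shape); boundary_pts: (K, 2) row/col points."""
    h, w = mask.shape
    dy, dx = np.mgrid[-r:r + 1, -r:r + 1]
    disc = (dy ** 2 + dx ** 2) <= r ** 2
    sig = []
    for y, x in boundary_pts:
        ys = np.clip(dy + y, 0, h - 1)
        xs = np.clip(dx + x, 0, w - 1)
        sig.append(mask[ys, xs][disc].mean())     # fraction of the disc inside the shape
    return np.asarray(sig)

def eccentricity_histogram(mask, n_bins=16, max_pts=400):
    pts = np.argwhere(mask).astype(float)
    if len(pts) > max_pts:                        # subsample so the pairwise matrix stays small
        pts = pts[np.linspace(0, len(pts) - 1, max_pts).astype(int)]
    d = np.linalg.norm(pts[:, None, :] - pts[None, :, :], axis=-1)
    ecc = d.max(axis=1)                           # farthest-point ("eccentricity") distance per pixel
    hist, _ = np.histogram(ecc, bins=n_bins, density=True)
    return hist

def combined_signature(mask, boundary_pts):
    return np.concatenate([integral_invariant_signature(mask, boundary_pts),
                           eccentricity_histogram(mask)])
```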

    Generalization to Novel Views: Universal, Class-based, and Model-based Processing

    A major problem in object recognition is that a novel image of a given object can be different from all previously seen images. Images can vary considerably due to changes in viewing conditions such as viewing position and illumination. In this paper we distinguish between three types of recognition schemes by the level at which generalization to novel images takes place: universal, class, and model-based. The first is applicable equally to all objects, the second to a class of objects, and the third uses known properties of individual objects. We derive theoretical limitations on each of the three generalization levels. For the universal level, previous results have shown that no invariance can be obtained. Here we show that this limitation holds even when the assumptions made on the objects and the recognition functions are relaxed. We also extend the results to changes of illumination direction. For the class level, previous studies presented specific examples of classes of objects for which functions invariant to viewpoint exist. Here, we distinguish between classes that admit such invariance and classes that do not. We demonstrate that there is a tradeoff between the set of objects that can be discriminated by a given recognition function and the set of images from which the recognition function can recognize these objects. Furthermore, we demonstrate that although functions that are invariant to illumination direction do not exist at the universal level, when the objects are restricted to belong to a given class, a function invariant to illumination direction can be defined. A general conclusion of this study is that class-based processing, which has not been used extensively in the past, is often advantageous for dealing with variations due to viewpoint and illuminant changes. Keywords: object recognition, invariance.

    Recognizing Large Isolated 3-D Objects Through Next View Planning Using Inner Camera Invariants


    Lunar Crater Identification in Digital Images

    It is often necessary to identify a pattern of observed craters in a single image of the lunar surface, without any prior knowledge of the camera's location. This so-called "lost-in-space" crater identification problem is common both in crater-based terrain relative navigation (TRN) and in automatic registration of scientific imagery. Past work on crater identification has largely been based on heuristic schemes, with poor performance outside of a narrowly defined operating regime (e.g., nadir-pointing images, small search areas). This work provides the first mathematically rigorous treatment of the general crater identification problem. It is shown when it is (and when it is not) possible to recognize a pattern of elliptical crater rims in an image formed by perspective projection. For the cases when it is possible to recognize a pattern, descriptors are developed using invariant theory that provably capture all of the viewpoint-invariant information. These descriptors may be pre-computed for known crater patterns and placed in a searchable index for fast recognition. New techniques are also developed for computing pose from crater rim observations and for evaluating crater rim correspondences. These techniques are demonstrated on both synthetic and real images.
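
    The "searchable index of pre-computed descriptors" step can be pictured as below. The projective-invariant descriptor of a crater-rim triple is the paper's contribution and is not reproduced here; `triad_descriptor` is a hypothetical placeholder for any function mapping a triple of ellipses to a fixed-length, viewpoint-invariant vector, and the k-d tree lookup is only one possible indexing choice.

```python
import numpy as np
from itertools import combinations
from scipy.spatial import cKDTree

def build_index(catalog_ellipses, triad_descriptor):
    """catalog_ellipses: list of ellipse parameter tuples for catalogued crater rims.
    Real systems would restrict triads to local neighbourhoods; here all triples are used."""
    triads = list(combinations(range(len(catalog_ellipses)), 3))
    descriptors = np.array([triad_descriptor([catalog_ellipses[i] for i in t])
                            for t in triads])
    return cKDTree(descriptors), triads

def identify(observed_triple, tree, triads, triad_descriptor, max_dist=1e-2):
    """Return the catalogued crater triple whose descriptor is nearest the observed one."""
    d, idx = tree.query(triad_descriptor(observed_triple))
    return triads[idx] if d < max_dist else None
```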

    Active recognition through next view planning: a survey


    Bottom-up Object Segmentation for Visual Recognition

    Automatic recognition and segmentation of objects in images is a central open problem in computer vision. Most previous approaches have pursued either sliding-window object detection or dense classification of overlapping local image patches. In contrast, the framework introduced in this thesis attempts to identify the spatial extent of objects prior to recognition, using bottom-up computational processes and mid-level selection cues. After a set of plausible object hypotheses is identified, a sequential recognition process is executed, based on continuous estimates of the spatial overlap between the image segment hypotheses and each putative class. The object hypotheses are represented as figure-ground segmentations, and are extracted automatically, without prior knowledge of the properties of individual object classes, by solving a sequence of constrained parametric min-cut problems (CPMC) on a regular image grid. It is shown that CPMC significantly outperforms the state of the art for low-level segmentation in the PASCAL VOC 2009 and 2010 datasets. Results beyond the current state of the art for image classification, object detection and semantic segmentation are also demonstrated on a number of challenging datasets, including Caltech-101, ETHZ-Shape and PASCAL VOC 2009-11. These results suggest that a greater emphasis on grouping and image organization may be valuable for making progress in high-level tasks such as object recognition and scene understanding.
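
    A toy version of the parametric min-cut idea behind CPMC is sketched below: a foreground seed is tied to the source, the image border to the sink, and a per-pixel foreground bias is swept over several values of lambda to enumerate a pool of figure-ground hypotheses. The unary and pairwise terms, the use of networkx for the cut, and the reduction of the regular grid of seeds to a single seed are simplifying assumptions rather than the thesis implementation.

```python
import numpy as np
import networkx as nx

def cpmc_like_hypotheses(image, seed, lambdas=(0.1, 0.3, 0.6, 1.0), beta=10.0):
    """image: small 2D float array in [0, 1]; seed: (row, col) foreground seed pixel."""
    h, w = image.shape
    hypotheses = []
    for lam in lambdas:
        g = nx.DiGraph()
        for y in range(h):
            for x in range(w):
                p = (y, x)
                # Unary terms: seed tied to the source, image border tied to the sink,
                # every pixel gets a lambda-dependent bias towards the foreground.
                g.add_edge('s', p, capacity=1e6 if p == seed else lam)
                on_border = y in (0, h - 1) or x in (0, w - 1)
                g.add_edge(p, 't', capacity=1e6 if on_border else 0.0)
                # Pairwise terms: contrast-sensitive links to right/down neighbours.
                for q in ((y, x + 1), (y + 1, x)):
                    if q[0] < h and q[1] < w:
                        wgt = float(np.exp(-beta * abs(image[p] - image[q])))
                        g.add_edge(p, q, capacity=wgt)
                        g.add_edge(q, p, capacity=wgt)
        _, (fg, _) = nx.minimum_cut(g, 's', 't')   # source-side nodes form the figure
        hypotheses.append(sorted(p for p in fg if p != 's'))
    return hypotheses
```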

    Author index—Volumes 1–89


    3D compositional hierarchies for object categorization

    Deep learning methods have become the default tool for image classification. However, the application of deep learning to surface shape classification is burdened by the limitations of existing methods, in particular by a lack of invariance to geometric transformations of the input data. This thesis proposes two novel frameworks for learning a multi-layer representation of surface shape features, namely the view-based and the surface-based compositional hierarchical frameworks. The proposed representation is a hierarchical vocabulary of shape features, termed parts. Parts of the first layer are pre-defined, while parts of the subsequent layers, describing spatial relations of subparts, are learned. The view-based framework describes spatial relations between subparts using a camera-based reference frame. The key stage of the learning algorithm is part selection, which forms the vocabulary based on multi-objective optimization that considers different importance measures of parts. Our experiments show that this framework enables efficient category recognition on a large-scale dataset. The surface-based framework exploits part-based intrinsic reference frames, which are computed for lower-layer parts and inherited by parts of the subsequent layers. During learning, spatial relations between subparts are described in these reference frames. During inference, a part is detected in the input data when its subparts are detected at certain positions and orientations in each other's reference frames. Since rigid body transformations do not change the positions and orientations of parts in intrinsic reference frames, this approach enables efficient recognition from unseen poses. Experiments show that this framework exhibits a large discriminative power and greater robustness to rigid body transformations than advanced CNN-based methods.
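
    The surface-based composition rule can be illustrated with a deliberately simplified 2D sketch: a composite part fires when one subpart is observed at roughly the learned position and orientation inside the other subpart's intrinsic reference frame. The real framework operates on 3D surface parts; the data structures, pose representation and tolerances below are assumptions made for the illustration.

```python
import numpy as np
from dataclasses import dataclass

@dataclass
class Detection:
    part_id: str
    position: np.ndarray      # (2,) location in the camera/image frame
    angle: float              # orientation of the part's intrinsic frame, radians

@dataclass
class Composition:
    part_id: str              # id of the composite part
    sub_a: str                # reference subpart
    sub_b: str                # related subpart
    rel_pos: np.ndarray       # learned position of sub_b in sub_a's frame
    rel_angle: float          # learned orientation offset
    pos_tol: float = 2.0
    ang_tol: float = 0.2

def detect_composites(detections, compositions):
    found = []
    for comp in compositions:
        for a in (d for d in detections if d.part_id == comp.sub_a):
            # Express every candidate sub_b in a's intrinsic reference frame.
            c, s = np.cos(-a.angle), np.sin(-a.angle)
            rot = np.array([[c, -s], [s, c]])
            for b in (d for d in detections if d.part_id == comp.sub_b):
                rel = rot @ (b.position - a.position)
                dang = (b.angle - a.angle - comp.rel_angle + np.pi) % (2 * np.pi) - np.pi
                if np.linalg.norm(rel - comp.rel_pos) < comp.pos_tol and abs(dang) < comp.ang_tol:
                    found.append(Detection(comp.part_id,
                                           (a.position + b.position) / 2,
                                           a.angle))
    return found
```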