17 research outputs found

    Parallel Evidence-Based Indexing of Complex Three-Dimensional Models Using Prototypical Parts and Relations (Dissertation Proposal)

    This proposal is concerned with three-dimensional object recognition from range data using superquadric primitives. Superquadrics are a family of parametric shape models which represent objects at the part level and can account for a wide variety of natural and man-made forms. An integrated framework for segmenting dense range data of complex 3-D objects into their constituent parts in terms of bi-quadric surface patches and superquadric shape primitives is described in [29]. We propose a vision architecture that scales well as the size of its model database grows. Following the recovery of superquadric primitives from the input depth map, we split the computation into two concurrent processing streams. One is concerned with the classification of individual parts using viewpoint-invariant shape information, while the other classifies pairwise part relationships using their relative size, orientation and type of joint. The major contribution of this proposal lies in a principled solution to the very difficult problems of superquadric part classification and model indexing. The problem is how to retrieve the best-matched models without exploring all possible object matches. Our approach is to cluster together similar model parts to create a reasonable number of prototypical part classes (protoparts). Each superquadric part recovered from the input is paired with the best-matching protopart using precomputed class statistics. A parallel, theoretically well-grounded evidential recognition algorithm quickly selects models consistent with the classified parts. Classified part relations (protorelations) are used to further reduce the number of consistent models, and remaining ambiguities are resolved using sequential top-down search.
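    As an illustration of the indexing idea described above, the sketch below classifies each recovered part to its nearest protopart by Mahalanobis distance computed from precomputed class statistics, then accumulates support for the models that contain those protoparts. The names and the plain voting scheme are assumptions for illustration; the proposal's actual evidential combination is not reproduced here.

```python
import numpy as np

def classify_part(features, protoparts):
    """Assign a recovered part to the closest protopart class using the
    Mahalanobis distance computed from precomputed class statistics
    (protoparts maps label -> (mean vector, covariance matrix))."""
    best_label, best_dist = None, np.inf
    for label, (mean, cov) in protoparts.items():
        diff = features - mean
        dist = diff @ np.linalg.inv(cov) @ diff
        if dist < best_dist:
            best_label, best_dist = label, dist
    return best_label

def index_models(part_features, protoparts, model_index):
    """Accumulate evidence for models containing the classified protoparts;
    model_index maps a protopart label to the models that use it."""
    votes = {}
    for features in part_features:
        label = classify_part(np.asarray(features, dtype=float), protoparts)
        for model in model_index.get(label, ()):
            votes[model] = votes.get(model, 0) + 1
    # Candidate models ranked by accumulated part evidence.
    return sorted(votes.items(), key=lambda kv: kv[1], reverse=True)
```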

    Integration of Quantitative and Qualitative Techniques for Deformable Model Fitting from Orthographic, Perspective, and Stereo Projections

    In this paper, we synthesize a new approach to 3-D object shape recovery by integrating qualitative shape recovery techniques and quantitative physics-based shape estimation techniques. Specifically, we first use qualitative shape recovery and recognition techniques to provide strong fitting constraints on physics-based deformable model recovery techniques. Second, we extend our previously developed technique of fitting deformable models to occluding image contours to the case of image data captured under general orthographic, perspective, and stereo projections.
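    A minimal sketch of the kind of physics-based fitting step referred to above, assuming a 2-D point-set model attracted to an occluding contour by simple damped external forces. It is illustrative only and does not reproduce the paper's deformable-model formulation or its handling of orthographic, perspective, or stereo projection.

```python
import numpy as np

def fit_to_contour(model_pts, contour_pts, steps=200, dt=0.1, damping=0.9):
    """First-order sketch of physics-based fitting: each model point is pulled
    toward its nearest contour point, with velocities decayed by damping."""
    pts = np.asarray(model_pts, float).copy()
    contour = np.asarray(contour_pts, float)
    vel = np.zeros_like(pts)
    for _ in range(steps):
        # External force: vector from each model point to its nearest contour point.
        diffs = contour[None, :, :] - pts[:, None, :]          # shape (M, N, 2)
        nearest = contour[np.argmin((diffs ** 2).sum(-1), axis=1)]
        vel = damping * vel + dt * (nearest - pts)
        pts += dt * vel
    return pts
```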

    Superquadric representation of scenes from multi-view range data

    Object representation denotes representing three-dimensional (3D) real-world objects with known graphic or mathematical primitives recognizable to computers. This research has numerous applications for object-related tasks in areas including computer vision, computer graphics, and reverse engineering. Superquadrics, as volumetric and parametric models, have been selected as the representation primitives throughout this research. Superquadrics are able to represent a large family of solid shapes by a single equation with only a few parameters. This dissertation addresses superquadric representation of multi-part objects and multi-object scenes. Two issues motivate this research. First, superquadric representation of multi-part objects or multi-object scenes has been an unsolved problem due to the complex geometry of objects. Second, superquadrics recovered from single-view range data tend to have low confidence and accuracy due to partially scanned object surfaces caused by inherent occlusions. To address these two problems, this dissertation proposes a multi-view superquadric representation algorithm. By incorporating both part decomposition and multi-view range data, the proposed algorithm is able not only to represent multi-part objects or multi-object scenes, but also to achieve high confidence and accuracy in the recovered superquadrics. The multi-view superquadric representation algorithm consists of (i) initial superquadric model recovery from single-view range data, (ii) pairwise view registration based on recovered superquadric models, (iii) view integration, (iv) part decomposition, and (v) final superquadric fitting for each decomposed part. Within the multi-view superquadric representation framework, this dissertation proposes a 3D part decomposition algorithm to automatically decompose multi-part objects or multi-object scenes into their constituent single parts in a manner consistent with human visual perception. Superquadrics can then be recovered for each decomposed single-part object. The proposed part decomposition algorithm is based on curvature analysis, and includes (i) Gaussian curvature estimation, (ii) boundary labeling, (iii) part growing and labeling, and (iv) post-processing. In addition, this dissertation proposes an extended view registration algorithm based on superquadrics. The proposed view registration algorithm is able to handle deformable superquadrics as well as 3D unstructured data sets. For superquadric fitting, the two objective functions primarily used in the literature have been comprehensively investigated with respect to noise, viewpoints, sample resolutions, etc. The objective function that proved to have better performance has been used throughout this dissertation. In summary, the three algorithms (contributions) proposed in this dissertation are generic and flexible in the sense that they handle triangle meshes, which are standard surface primitives in computer vision and graphics. For each proposed algorithm, the dissertation presents both theory and experimental results. The results demonstrate the efficiency of the algorithms using both synthetic and real range data of a large variety of objects and scenes. In addition, the experimental results include comparisons with previous methods from the literature. Finally, the dissertation concludes with a summary of the contributions to the state of the art in superquadric representation, and presents possible future extensions to this research.
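    For reference, the sketch below shows the superquadric inside-outside function (the "single equation with only a few parameters") and one objective function commonly used for superquadric recovery, scaled by the square root of the size product to discourage inflated models. The abstract does not name the two objectives it compares, so this is only an illustration of the general form; the parameter names, fixed pose, and use of scipy are assumptions.

```python
import numpy as np
from scipy.optimize import least_squares

def inside_outside(p, pts):
    """Superquadric inside-outside function F for points given in the
    superquadric's local frame; p = (a1, a2, a3, eps1, eps2)."""
    a1, a2, a3, e1, e2 = p
    x, y, z = np.abs(pts).T + 1e-9          # avoid zero raised to a negative power
    return ((x / a1) ** (2 / e2) + (y / a2) ** (2 / e2)) ** (e2 / e1) \
        + (z / a3) ** (2 / e1)

def residuals(p, pts):
    """One commonly used form: sqrt(a1*a2*a3) * (F**eps1 - 1), penalizing
    points off the surface while discouraging oversized models."""
    a1, a2, a3, e1, _ = p
    return np.sqrt(a1 * a2 * a3) * (inside_outside(p, pts) ** e1 - 1.0)

def fit_superquadric(pts, p0=(1.0, 1.0, 1.0, 1.0, 1.0)):
    """Recover size and shape parameters from range points (pose assumed known)."""
    bounds = ([1e-3] * 3 + [0.1, 0.1], [np.inf] * 3 + [2.0, 2.0])
    return least_squares(residuals, p0, bounds=bounds, args=(pts,)).x
```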

    Part-based Grouping and Recognition: A Model-Guided Approach

    Institute of Perception, Action and Behaviour. The recovery of generic solid parts is a fundamental step towards the realization of general-purpose vision systems. This thesis investigates issues in grouping, segmentation and recognition of parts from two-dimensional edge images. A new paradigm of part-based grouping of features is introduced that bridges the classical grouping and model-based approaches with the purpose of directly recovering parts from real images, and part-like models are used that both yield low theoretical complexity and reliably recover part-plausible groups of features. The part-like models used are statistical point distribution models whose training set is built using random deformable superellipses. The computational approach that is proposed to perform model-guided part-based grouping consists of four distinct stages. In the first stage, codons, contour portions of similar curvature, are extracted from the raw edge image. They are considered to be indivisible image features because they have the desirable property of belonging either to single parts or to joints. In the second stage, small seed groups of codons (currently pairs, but further extensions are proposed) are found that give enough structural information for part hypotheses to be created. The third stage consists of initialising and pre-shaping the models to all the seed groups and then performing a full fitting to a large neighbourhood of the pre-shaped model. Pre-shaping to a few significant features is a relatively new concept in deformable model fitting that has helped to dramatically increase robustness. The initialisation of the part models to the seed groups is performed by the first direct least-squares ellipse fitting algorithm, which was jointly discovered during this research; a full theoretical proof of the method is provided. The last stage pertains to the global filtering of all the hypotheses generated by the previous stages according to the Minimum Description Length criterion: the small number of grouping hypotheses that survive this filtering stage are the most economical representation of the image in terms of the part-like models. The filtering is performed by maximising a Boolean quadratic function with a genetic algorithm, which has given the best trade-off between speed and robustness. Finally, images of parts can have a pronounced 3D structure, with ends or sides clearly visible. In order to recover this important information, the part-based grouping method is extended by employing parametrically deformable aspect models which, starting from the initial position provided by the previous stages, are fitted to the raw image by simulated annealing. These models are inspired by deformable superquadrics but are built by geometric construction, which renders them two orders of magnitude faster to generate than in previous work. A large number of experiments are provided that validate the approach and, since it opens several new issues, some future work is proposed.
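    Direct least-squares ellipse fitting of the kind mentioned above is commonly formulated as a generalized eigenvalue problem under the ellipse-specific constraint 4ac - b^2 = 1. The sketch below is a minimal, scipy-based illustration of that formulation with numerical details simplified; it is not the thesis's exact implementation.

```python
import numpy as np
from scipy.linalg import eig

def fit_ellipse_direct(x, y):
    """Direct least-squares ellipse fit: minimize ||D a||^2 subject to
    4*a*c - b^2 = 1, solved as a generalized eigenvalue problem.
    Returns conic coefficients (a, b, c, d, e, f)."""
    x, y = np.asarray(x, float), np.asarray(y, float)
    D = np.column_stack([x * x, x * y, y * y, x, y, np.ones_like(x)])
    S = D.T @ D                      # scatter matrix
    C = np.zeros((6, 6))             # constraint matrix encoding 4ac - b^2 = 1
    C[0, 2] = C[2, 0] = 2.0
    C[1, 1] = -1.0
    evals, evecs = eig(S, C)
    evals = np.real(evals)
    # The ellipse solution corresponds to the single positive, finite
    # generalized eigenvalue (assumed to exist for non-degenerate data).
    idx = np.flatnonzero(np.isfinite(evals) & (evals > 0))[0]
    return np.real(evecs[:, idx])
```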

    A highly adaptable model-based method for colour image interpretation

    This thesis presents a model-based method for interpreting images of objects that can vary greatly in appearance. Rather than seeking characteristic landmarks, we model objects with a smooth boundary by sampling points at regular intervals along the boundary. A statistical model of form is created in the exponent domain of an extended superellipse using the sampled points, and a model of appearance is created by sampling inside the objects. A colour Maximum Likelihood Ratio (MLR) criterion was used to detect cues to the location of potential pedestrians. The adaptability and specificity of this cue detector were evaluated using over 700 images. A True Positive Rate (TPR) of 0.95 and a False Positive Rate (FPR) of 0.20 were obtained. To detect objects with axes at various orientations, a variant method using an interpolated colour MLR has been developed. This had a TPR of 0.94 and an FPR of 0.21 when tested over 700 images of pedestrians. Interpretation was evaluated using over 220 video sequences (640 x 480 pixels per frame) and 1000 images of people alone and people associated with other objects. The objective was not so much to evaluate pedestrian detection as the precision and reliability of object delineation. More than 94% of pedestrians were correctly interpreted.
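    As an illustration of a colour maximum-likelihood-ratio cue detector of the kind described, the sketch below compares per-pixel colour likelihoods under object and background histograms and thresholds the ratio. The histogram binning, smoothing, and threshold are assumptions for illustration, not the thesis's actual parameters.

```python
import numpy as np

def colour_histogram(pixels, bins=16):
    """Normalized 3-D colour histogram from an (N, 3) array of RGB samples,
    with Laplace smoothing so no bin has zero probability."""
    hist, _ = np.histogramdd(pixels, bins=bins, range=[(0, 256)] * 3)
    return (hist + 1.0) / (hist.sum() + bins ** 3)

def mlr_cue_map(image, obj_hist, bg_hist, bins=16, threshold=1.0):
    """Per-pixel likelihood ratio p(colour | object) / p(colour | background);
    pixels whose ratio exceeds the threshold are flagged as cues."""
    idx = np.minimum((image // (256 // bins)).astype(int), bins - 1)
    r, g, b = idx[..., 0], idx[..., 1], idx[..., 2]
    ratio = obj_hist[r, g, b] / bg_hist[r, g, b]
    return ratio > threshold
```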

    Reconstruction of moving surfaces of revolution from sparse 3-D measurements using a stereo camera and structured light

    The goal of this work is the development and analysis of an algorithmic methodology for reconstructing a parametric surface model of a rotationally symmetric object from a sequence of sparse 3D point clouds. A novel measurement system with a large field of view is used that can also be operated under difficult conditions. During acquisition of the sequence, the object to be measured may undergo a motion that can be formulated as an analytical model. The method is developed and analysed using a practical application: recovering the surface of a wheel. It is shown that the accuracy achievable by fitting a simple model to each individual measurement can be improved considerably by fitting a global model that incorporates all individual measurements simultaneously and takes a suitable motion model into account. The three-dimensional point data are acquired with a stereo camera system combined with active illumination in the form of a dot pattern. A relatively high point density over the entire field of view of the stereo camera system is achieved by combining several laser projectors into a single projection unit. Through precise calibration of the camera system and the projection unit, and by exploiting trifocal geometric constraints, high accuracy of the three-dimensional point data is achieved despite the large scatter of the laser points in the camera image.
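    A toy sketch of the central idea: fit a global parametric model jointly to all sparse frames under a simple analytical motion model, instead of fitting each frame separately. The polynomial radius profile, constant-velocity translation, and scipy optimizer are illustrative assumptions; the abstract does not specify the actual surface or motion models.

```python
import numpy as np
from scipy.optimize import least_squares

def residuals(params, frames, times, degree=3):
    """Joint residuals for a surface of revolution about the z-axis (radius
    profile r(z) as a polynomial) observed in several frames while translating
    with constant velocity v along x, y, z (a simple analytical motion model)."""
    coeffs, v = params[:degree + 1], params[degree + 1:]
    res = []
    for pts, t in zip(frames, times):
        p = pts - t * np.asarray(v)              # undo the frame's motion
        radius = np.hypot(p[:, 0], p[:, 1])      # distance from the symmetry axis
        res.append(radius - np.polyval(coeffs, p[:, 2]))
    return np.concatenate(res)

def fit_global(frames, times, degree=3):
    """Fit profile coefficients and velocity jointly over all sparse frames."""
    x0 = np.zeros(degree + 1 + 3)
    x0[degree] = 1.0                             # start from a unit-radius profile
    return least_squares(residuals, x0, args=(frames, times, degree)).x
```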

    Combining shape and color. A bottom-up approach to evaluate object similarities

    The objective of the present work is to develop a bottom-up approach to estimate the similarity between two unknown objects. Given a set of digital images, we want to identify the main objects and to determine whether they are similar or not. In the last decades many object recognition and classification strategies, driven by higher-level activities, have been successfully developed. The peculiarity of this work, instead, is the attempt to work without any training phase or a priori knowledge about the objects or their context. Indeed, if we assume we are in an unstructured and completely unknown environment, we usually have to deal with novel objects never seen before; under these hypotheses, it would be very useful to define some kind of similarity among the instances under analysis (even if we do not know which category they belong to). To obtain this result, we start by observing that human beings use a lot of information and analyze very different aspects to achieve object recognition: shape, position, color and so on. Hence we try to reproduce part of this process, combining different methodologies (each working on a specific characteristic) to obtain a more meaningful idea of similarity. Mainly inspired by the human conception of representation, we identify two main characteristics, which we call the implicit and explicit models. The term "explicit" is used to account for the main traits of what, in the human representation, connotes a principal source of information regarding a category, a sort of visual synecdoche (corresponding to the shape); the term "implicit", on the other hand, accounts for the object rendered by shadows and lights, colors and volumetric impression, a sort of visual metonymy (corresponding to the chromatic characteristics). During the work, we had to face several problems and we tried to define specific solutions. In particular, our contributions concern: (i) defining a bottom-up approach for image segmentation (which does not rely on any a priori knowledge); (ii) combining different features to evaluate object similarity (focusing particularly on shape and color); (iii) defining a generic distance (similarity) measure between objects (without any attempt to identify the category they belong to); and (iv) analyzing the consequences of using the number of modes as an estimate of the number of mixture components (in the Expectation-Maximization algorithm).
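    The last contribution can be illustrated with a small sketch: estimate the number of modes of the data with a kernel density estimate and use that count as the number of mixture components for EM. The 1-D setting and the scipy/scikit-learn helpers are assumptions for illustration; the work itself analyzes the consequences of this choice rather than prescribing these tools.

```python
import numpy as np
from scipy.signal import argrelextrema
from scipy.stats import gaussian_kde
from sklearn.mixture import GaussianMixture

def count_modes(samples, grid_size=256):
    """Estimate the number of modes of a 1-D sample by counting the local
    maxima of a kernel density estimate evaluated on a regular grid."""
    samples = np.asarray(samples, float)
    grid = np.linspace(samples.min(), samples.max(), grid_size)
    density = gaussian_kde(samples)(grid)
    return max(1, len(argrelextrema(density, np.greater)[0]))

def fit_mixture(samples):
    """Use the mode count as the number of mixture components, then run EM."""
    samples = np.asarray(samples, float)
    k = count_modes(samples)
    return GaussianMixture(n_components=k).fit(samples.reshape(-1, 1))
```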