14 research outputs found

    Investigations on skeleton completeness for skeleton-based shape matching

    Get PDF
    Skeleton is an important shape descriptor for deformable shape matching, because it integrates both geometrical and topological features of a shape. As the skeletonisation process often generates redundant skeleton branches that may seriously disturb the skeleton matching and cause high computational complexity, skeleton pruning is required to remove the inaccurate or redundant branches while preserving the essential topology of the original skeleton. However, pruning approaches normally require manual intervention to produce visually complete skeletons. As different people may have different perceptions for identifying visually complete skeletons, it is unclear how much the accuracy of skeleton-based shape matching is influenced by human selection. Moreover, it is also unclear how skeleton completeness impacts the accuracy of skeleton-based shapematching. We investigate here these two questions in a structured way. In addition, we present experimental evidence to show that it is possible to do automatic skeleton pruning while maintaining the matching accuracy by estimating the approximate pruning power of each shape

    Matching Disparate Image Pairs Using Shape-Aware ConvNets

    Full text link
    An end-to-end trainable ConvNet architecture, that learns to harness the power of shape representation for matching disparate image pairs, is proposed. Disparate image pairs are deemed those that exhibit strong affine variations in scale, viewpoint and projection parameters accompanied by the presence of partial or complete occlusion of objects and extreme variations in ambient illumination. Under these challenging conditions, neither local nor global feature-based image matching methods, when used in isolation, have been observed to be effective. The proposed correspondence determination scheme for matching disparate images exploits high-level shape cues that are derived from low-level local feature descriptors, thus combining the best of both worlds. A graph-based representation for the disparate image pair is generated by constructing an affinity matrix that embeds the distances between feature points in two images, thus modeling the correspondence determination problem as one of graph matching. The eigenspectrum of the affinity matrix, i.e., the learned global shape representation, is then used to further regress the transformation or homography that defines the correspondence between the source image and target image. The proposed scheme is shown to yield state-of-the-art results for both, coarse-level shape matching as well as fine point-wise correspondence determination.Comment: First two authors contributed equally, to Appear in the IEEE Winter Conference on Applications of Computer Vision (WACV) 201

    A Multicamera System for Gesture Tracking With Three Dimensional Hand Pose Estimation

    Get PDF
    The goal of any visual tracking system is to successfully detect then follow an object of interest through a sequence of images. The difficulty of tracking an object depends on the dynamics, the motion and the characteristics of the object as well as on the environ ment. For example, tracking an articulated, self-occluding object such as a signing hand has proven to be a very difficult problem. The focus of this work is on tracking and pose estimation with applications to hand gesture interpretation. An approach that attempts to integrate the simplicity of a region tracker with single hand 3D pose estimation methods is presented. Additionally, this work delves into the pose estimation problem. This is ac complished by both analyzing hand templates composed of their morphological skeleton, and addressing the skeleton\u27s inherent instability. Ligature points along the skeleton are flagged in order to determine their effect on skeletal instabilities. Tested on real data, the analysis finds the flagging of ligature points to proportionally increase the match strength of high similarity image-template pairs by about 6%. The effectiveness of this approach is further demonstrated in a real-time multicamera hand tracking system that tracks hand gestures through three-dimensional space as well as estimate the three-dimensional pose of the hand

    Representation and Detection of Shapes in Images

    Get PDF
    We present a set of techniques that can be used to represent and detect shapes in images. Our methods revolve around a particular shape representation based on the description of objects using triangulated polygons. This representation is similar to the medial axis transform and has important properties from a computational perspective. The first problem we consider is the detection of non-rigid objects in images using deformable models. We present an efficient algorithm to solve this problem in a wide range of situations, and show examples in both natural and medical images. We also consider the problem of learning an accurate non-rigid shape model for a class of objects from examples. We show how to learn good models while constraining them to the form required by the detection algorithm. Finally, we consider the problem of low-level image segmentation and grouping. We describe a stochastic grammar that generates arbitrary triangulated polygons while capturing Gestalt principles of shape regularity. This grammar is used as a prior model over random shapes in a low level algorithm that detects objects in images

    Shape Representation in Primate Visual Area 4 and Inferotemporal Cortex

    Get PDF
    The representation of contour shape is an essential component of object recognition, but the cortical mechanisms underlying it are incompletely understood, leaving it a fundamental open question in neuroscience. Such an understanding would be useful theoretically as well as in developing computer vision and Brain-Computer Interface applications. We ask two fundamental questions: “How is contour shape represented in cortex and how can neural models and computer vision algorithms more closely approximate this?” We begin by analyzing the statistics of contour curvature variation and develop a measure of salience based upon the arc length over which it remains within a constrained range. We create a population of V4-like cells – responsive to a particular local contour conformation located at a specific position on an object’s boundary – and demonstrate high recognition accuracies classifying handwritten digits in the MNIST database and objects in the MPEG-7 Shape Silhouette database. We compare the performance of the cells to the “shape-context” representation (Belongie et al., 2002) and achieve roughly comparable recognition accuracies using a small test set. We analyze the relative contributions of various feature sensitivities to recognition accuracy and robustness to noise. Local curvature appears to be the most informative for shape recognition. We create a population of IT-like cells, which integrate specific information about the 2-D boundary shapes of multiple contour fragments, and evaluate its performance on a set of real images as a function of the V4 cell inputs. We determine the sub-population of cells that are most effective at identifying a particular category. We classify based upon cell population response and obtain very good results. We use the Morris-Lecar neuronal model to more realistically illustrate the previously explored shape representation pathway in V4 – IT. We demonstrate recognition using spatiotemporal patterns within a winnerless competition network with FitzHugh-Nagumo model neurons. Finally, we use the Izhikevich neuronal model to produce an enhanced response in IT, correlated with recognition, via gamma synchronization in V4. Our results support the hypothesis that the response properties of V4 and IT cells, as well as our computer models of them, function as robust shape descriptors in the object recognition process

    A Pluralist Perspective on Shape Constancy

    Get PDF
    The ability to perceive the shapes of things as enduring through changes in how they stimulate our sense organs is vital to our sense of stability in the world. But what sort of capacity is shape constancy, and how is it reflected in perceptual experience? This paper defends a pluralist account of shape constancy: There are multiple kinds of shape constancy centered on geometrical properties at various levels of abstraction, and properties at these various levels feature in the content of perceptual experience, governing patterns of apparent shape similarity. I propose that the varieties of shape constancy are subserved by the syntactic complexity of perceptual shape representations. By assigning discrete constituents to various abstract shape parameters, these representations attune us to the preservation of certain abstract shape properties through changes in more determinate shape properties. Finally, I draw broader lessons concerning the nature and function of perceptual constancy

    Probabilistic procrustean models for shape recognition with an application to robotic grasping

    Get PDF
    Thesis (S.M.)--Massachusetts Institute of Technology, Dept. of Electrical Engineering and Computer Science, 2008.Includes bibliographical references (p. 92-98).Robot manipulators largely rely on complete knowledge of object geometry in order to plan their motion and compute successful grasps. If an object is fully in view, the object geometry can be inferred from sensor data and a grasp computed directly. If the object is occluded by other entities in the scene, manipulations based on the visible part of the object may fail; to compensate, object recognition is often used to identify the location of the object and compute the grasp from a prior model. However, new instances of a known class of objects may vary from the prior model, and known objects may appear in novel configurations if they are not perfectly rigid. As a result, manipulation can pose a substantial modeling challenge when objects are not fully in view. In this thesis, we will attempt to model the shapes of objects in a way that is robust to both deformations and occlusions. In addition, we will develop a model that allows us to recover the hidden parts of occluded objects (shape completion), and which maintains information about the object boundary for use in robotic grasp planning. Our approach will be data-driven and generative, and we will base our probabilistic models on Kendall's Procrustean theory of shape.by Jared Marshall Glover.S.M

    Dominant points detection for shape analysis

    Get PDF
    The growing interest in recent years towards the multimedia and the large amount of information exchanged across the network involves the various fields of research towards the study of methods for automatic identification. One of the main objectives is to associate the information content of images, using techniques for identifying composing objects. Among image descriptors, contours reveal are very important because most of the information can be extracted from them and the contour analysis offers a lower computational complexity also. The contour analysis can be restricted to the study of some salient points with high curvature from which it is possible to reconstruct the original contour. The thesis is focused on the polygonal approximation of closed digital curves. After an overview of the most common shape descriptors, distinguished between simple descriptors and external methods, that focus on the analysis of boundary points of objects, and internal methods, which use the pixels inside the object also, a description of the major methods regarding the extraction of dominant points studied so far and the metrics typically used to evaluate the goodness of the polygonal approximation found is given. Three novel approaches to the problem are then discussed in detail: a fast iterative method (DPIL), more suitable for realtime processing, and two metaheuristics methods (GAPA, ACOPA) based on genetic algorithms and Ant Colony Optimization (ACO), more com- plex from the point of view of the calculation, but more precise. Such techniques are then compared with the other main methods cited in literature, in order to assess the performance in terms of computational complexity and polygonal approximation error, and measured between them, in order to evaluate the robustness with respect to affine transformations and conditions of noise. Two new techniques of shape matching, i.e. identification of objects belonging to the same class in a database of images, are then described. The first one is based on the shape alignment and the second is based on a correspondence by ACO, which puts in evidence the excellent results, both in terms of computational time and recognition accuracy, obtained through the use of dominant points. In the first matching algorithm the results are compared with a selection of dominant points generated by a human operator while in the second the dominant points are used instead of a constant sampling of the outline typically used for this kind of approach

    Contributions to the content-based image retrieval using pictorial queries

    Get PDF
    Descripció del recurs: el 02 de novembre de 2010L'accés massiu a les càmeres digitals, els ordinadors personals i a Internet, ha propiciat la creació de grans volums de dades en format digital. En aquest context, cada vegada adquireixen major rellevància totes aquelles eines dissenyades per organitzar la informació i facilitar la seva cerca. Les imatges són un cas particular de dades que requereixen tècniques específiques de descripció i indexació. L'àrea de la visió per computador encarregada de l'estudi d'aquestes tècniques rep el nom de Recuperació d'Imatges per Contingut, en anglès Content-Based Image Retrieval (CBIR). Els sistemes de CBIR no utilitzen descripcions basades en text sinó que es basen en característiques extretes de les pròpies imatges. En contrast a les més de 6000 llengües parlades en el món, les descripcions basades en característiques visuals representen una via d'expressió universal. La intensa recerca en el camp dels sistemes de CBIR s'ha aplicat en àrees de coneixement molt diverses. Així doncs s'han desenvolupat aplicacions de CBIR relacionades amb la medicina, la protecció de la propietat intel·lectual, el periodisme, el disseny gràfic, la cerca d'informació en Internet, la preservació dels patrimoni cultural, etc. Un dels punts importants d'una aplicació de CBIR resideix en el disseny de les funcions de l'usuari. L'usuari és l'encarregat de formular les consultes a partir de les quals es fa la cerca de les imatges. Nosaltres hem centrat l'atenció en aquells sistemes en què la consulta es formula a partir d'una representació pictòrica. Hem plantejat una taxonomia dels sistemes de consulta en composada per quatre paradigmes diferents: Consulta-segons-Selecció, Consulta-segons-Composició-Icònica, Consulta-segons-Esboç i Consulta-segons-Il·lustració. Cada paradigma incorpora un nivell diferent en el potencial expressiu de l'usuari. Des de la simple selecció d'una imatge, fins a la creació d'una il·lustració en color, l'usuari és qui pren el control de les dades d'entrada del sistema. Al llarg dels capítols d'aquesta tesi hem analitzat la influència que cada paradigma de consulta exerceix en els processos interns d'un sistema de CBIR. D'aquesta manera també hem proposat un conjunt de contribucions que hem exemplificat des d'un punt de vista pràctic mitjançant una aplicació final
    corecore