    Medial/skeletal linking structures for multi-region configurations

    We consider a generic configuration of regions, consisting of a collection of distinct compact regions $\{\Omega_i\}$ in $\mathbb{R}^{n+1}$ which may be either smooth regions disjoint from the others or regions which meet on their piecewise smooth boundaries $\mathcal{B}_i$ in a generic way. We introduce a skeletal linking structure for the collection of regions which simultaneously captures the regions' individual shapes and geometric properties as well as the "positional geometry" of the collection. The linking structure extends in a minimal way the individual "skeletal structures" on each of the regions, allowing us to significantly extend the mathematical methods introduced for single regions to the configuration. We prove for a generic configuration of regions the existence of a special type of Blum linking structure which builds upon the Blum medial axes of the individual regions. This requires proving several transversality theorems for certain associated "multi-distance" and "height-distance" functions for such configurations. We show that by relaxing the conditions on the Blum linking structures we obtain the more general class of skeletal linking structures which still capture the geometric properties. In addition to yielding geometric invariants which capture the shapes and geometry of individual regions, the linking structures are used to define invariants which measure positional properties of the configuration such as: measures of relative closeness of neighboring regions and relative significance of the individual regions for the configuration. These invariants, which are computed by formulas involving "skeletal linking integrals" on the internal skeletal structures, are then used to construct a "tiered linking graph," which identifies subconfigurations and provides a hierarchical ordering of the regions. Comment: 135 pages, 36 figures. Version to appear in Memoirs of the Amer. Math. So

    Modeling Multi-object Configurations via Medial/Skeletal Linking Structures

    We introduce a method for modeling a configuration of objects in 2D or 3D images using a mathematical "skeletal linking structure" which will simultaneously capture the individual shape features of the objects and their positional information relative to one another. The objects may either have smooth boundaries and be disjoint from the others or share common portions of their boundaries with other objects in a piecewise smooth manner. These structures include a special class of "Blum medial linking structures," which are intrinsically associated to the configuration and build upon the Blum medial axes of the individual objects. We give a classification of the properties of Blum linking structures for generic configurations. The skeletal linking structures add increased flexibility for modeling configurations of objects by relaxing the Blum conditions and they extend in a minimal way the individual "skeletal structures" which have been previously used for modeling individual objects and capturing their geometric properties. This allows for the mathematical methods introduced for single objects to be significantly extended to the entire configuration of objects. These methods not only capture the internal shape structures of the individual objects but also the external structure of the neighboring regions of the objects.

    Enhanced spatial pyramid matching using log-polar-based image subdivision and representation

    This paper presents a new model for capturing spatial information for object categorization with bag-of-words (BOW). BOW models have recently become popular for the task of object recognition, owing to their good performance and simplicity. Many improvements to the BOW model have been proposed over the years, of which the Spatial Pyramid Matching (SPM) technique is the most notable. We propose a new method to exploit spatial relationships between image features, based on binned log-polar grids. Our model works by partitioning the image into grids of different scales and orientations and computing a histogram of local features within each grid cell. Experimental results show that our approach improves on the SPM technique on three diverse datasets.
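
The log-polar binning idea in this abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the abstract does not give the grid parameters, so the doubling ring radii, the function names `log_polar_bin` and `log_polar_histogram`, and every parameter below are assumptions made for illustration. Each local feature is taken to be an `(x, y, word)` triple, where `word` is a visual-word index from a BOW codebook.

```python
import math


def log_polar_bin(x, y, cx, cy, r_max, n_rings=3, n_sectors=8, rotation=0.0):
    """Return the (ring, sector) cell of point (x, y) in a log-polar grid.

    Assumption: ring radii double from one ring to the next, so the outer
    edges are r_max / 2**(n_rings - 1), ..., r_max / 2, r_max.
    """
    dx, dy = x - cx, y - cy
    r = math.hypot(dx, dy)
    # Outer edge of each ring, innermost first.
    edges = [r_max / 2 ** (n_rings - 1 - k) for k in range(n_rings)]
    ring = n_rings - 1  # points beyond r_max are clamped to the outer ring
    for k, edge in enumerate(edges):
        if r <= edge:
            ring = k
            break
    # Angle measured from the grid's orientation, wrapped into [0, 2*pi).
    angle = (math.atan2(dy, dx) - rotation) % (2 * math.pi)
    sector = int(angle / (2 * math.pi / n_sectors)) % n_sectors
    return ring, sector


def log_polar_histogram(features, cx, cy, r_max, n_words,
                        n_rings=3, n_sectors=8, rotation=0.0):
    """Concatenate one visual-word histogram per log-polar cell."""
    hist = [0] * (n_rings * n_sectors * n_words)
    for x, y, word in features:
        ring, sector = log_polar_bin(x, y, cx, cy, r_max,
                                     n_rings, n_sectors, rotation)
        cell = ring * n_sectors + sector
        hist[cell * n_words + word] += 1
    return hist


# Three features in a 100x100 image, grid centred at (50, 50).
features = [(60, 50, 0), (40, 90, 1), (50, 50, 1)]
hist = log_polar_histogram(features, cx=50, cy=50, r_max=50,
                           n_words=2, n_rings=3, n_sectors=4)
```

A multi-scale, multi-orientation descriptor in the spirit of the abstract could then be formed by calling `log_polar_histogram` with several `r_max` and `rotation` values and concatenating the results; how the paper weights or combines the grids is not stated here.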

    F-formation Detection: Individuating Free-standing Conversational Groups in Images

    Detecting groups of interacting people is a useful task in many modern technologies, with applications ranging from video surveillance to social robotics. In this paper we first give a rigorous definition of a group, drawing on the social sciences: this allows us to specify many kinds of group so far neglected in the computer vision literature. Building on this taxonomy, we present a detailed review of the state of the art in group detection algorithms. Then, as our main contribution, we present a new method for the automatic detection of groups in still images, based on a graph-cuts framework for clustering individuals. In particular, we give a computational formulation of the sociological definition of an F-formation, which is very useful for encoding a group using only proxemic information: the position and orientation of people. We call the proposed method Graph-Cuts for F-formation (GCFF). We show that GCFF outperforms all state-of-the-art methods on several accuracy measures (some of them new), and that it is robust to noise and versatile in recognizing groups of various cardinality. Comment: 32 pages, submitted to PLOS On

    Towards Automatic Speech Identification from Vocal Tract Shape Dynamics in Real-time MRI

    Vocal tract configurations play a vital role in generating distinguishable speech sounds, by modulating the airflow and creating different resonant cavities in speech production. They contain abundant information that can be utilized to better understand the underlying speech production mechanism. As a step towards automatic mapping of vocal tract shape geometry to acoustics, this paper employs effective video action recognition techniques, such as Long-term Recurrent Convolutional Network (LRCN) models, to identify different vowel-consonant-vowel (VCV) sequences from dynamic shaping of the vocal tract. Such a model typically combines a CNN-based deep hierarchical visual feature extractor with recurrent networks, which ideally makes the network spatio-temporally deep enough to learn the sequential dynamics of a short video clip for video classification tasks. We use a database consisting of 2D real-time MRI of vocal tract shaping during VCV utterances by 17 speakers. The comparative performances of this class of algorithms under various parameter settings and for various classification tasks are discussed. Interestingly, the results show a marked difference in model performance on speech classification compared with generic sequence or video classification tasks. Comment: To appear in the INTERSPEECH 2018 Proceeding

    Few-Shot In-Context Imitation Learning via Implicit Graph Alignment

    Consider the following problem: given a few demonstrations of a task across a few different objects, how can a robot learn to perform that same task on new, previously unseen objects? This is challenging because the large variety of objects within a class makes it difficult to infer the task-relevant relationship between the new objects and the objects in the demonstrations. We address this by formulating imitation learning as a conditional alignment problem between graph representations of objects. Consequently, we show that this conditioning allows for in-context learning, where a robot can perform a task on a set of new objects immediately after the demonstrations, without any prior knowledge about the object class or any further training. In our experiments, we explore and validate our design choices, and we show that our method is highly effective for few-shot learning of several real-world, everyday tasks, whilst outperforming baselines. Videos are available on our project webpage at https://www.robot-learning.uk/implicit-graph-alignment. Comment: Published at CoRL 2023.