Medial/skeletal linking structures for multi-region configurations
We consider a generic configuration of regions, consisting of a collection of
distinct compact regions, which may be
either smooth regions disjoint from the others or regions which meet on their
piecewise smooth boundaries in a generic way. We introduce a
skeletal linking structure for the collection of regions which simultaneously
captures the regions' individual shapes and geometric properties as well as the
"positional geometry" of the collection. The linking structure extends in a
minimal way the individual "skeletal structures" on each of the regions,
allowing us to significantly extend the mathematical methods introduced for
single regions to the configuration.
We prove for a generic configuration of regions the existence of a special
type of Blum linking structure which builds upon the Blum medial axes of the
individual regions. This requires proving several transversality theorems for
certain associated "multi-distance" and "height-distance" functions for such
configurations. We show that by relaxing the conditions on the Blum linking
structures we obtain the more general class of skeletal linking structures
which still capture the geometric properties.
In addition to yielding geometric invariants which capture the shapes and
geometry of individual regions, the linking structures are used to define
invariants which measure positional properties of the configuration such as:
measures of relative closeness of neighboring regions and relative significance
of the individual regions for the configuration. These invariants, which are
computed by formulas involving "skeletal linking integrals" on the internal
skeletal structures, are then used to construct a "tiered linking graph," which
identifies subconfigurations and provides a hierarchical ordering of the
regions.
Comment: 135 pages, 36 figures. Version to appear in Memoirs of the Amer.
Math. So
Modeling Multi-object Configurations via Medial/Skeletal Linking Structures
We introduce a method for modeling a configuration of objects in 2D or 3D images using a mathematical "skeletal linking structure" that simultaneously captures the individual shape features of the objects and their positional information relative to one another. The objects may either have smooth boundaries and be disjoint from the others or share common portions of their boundaries with other objects in a piecewise smooth manner. These structures include a special class of "Blum medial linking structures," which are intrinsically associated with the configuration and build upon the Blum medial axes of the individual objects. We give a classification of the properties of Blum linking structures for generic configurations. The skeletal linking structures add flexibility for modeling configurations of objects by relaxing the Blum conditions, and they extend in a minimal way the individual "skeletal structures" which have been previously used for modeling individual objects and capturing their geometric properties. This allows for the mathematical methods introduced for single objects to be significantly extended to the entire configuration of objects. These methods not only capture the internal shape structures of the individual objects but also the external structure of the neighboring regions of the objects.
Enhanced spatial pyramid matching using log-polar-based image subdivision and representation
This paper presents a new model for capturing spatial information for object categorization with bag-of-words (BOW). BOW models have recently become popular for the task of object recognition, owing to their good performance and simplicity. Many improvements to the BOW model have been proposed over the years, of which the Spatial Pyramid Matching (SPM) technique is the most notable. We propose a new method to exploit spatial relationships between image features, based on binned log-polar grids. Our model works by partitioning the image into grids of different scales and orientations and computing a histogram of local features within each grid cell. Experimental results show that our approach improves on the SPM technique on three diverse datasets.
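The log-polar binning described in this abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the function names, the logarithmic ring spacing via `log1p`, the default of 3 rings and 8 sectors, and the choice to clamp points beyond `r_max` into the outer ring are all assumptions made for illustration. Each local feature is taken to be a hypothetical `(x, y, word_id)` triple, where `word_id` indexes a visual vocabulary.

```python
import math


def log_polar_bin(x, y, cx, cy, r_max, n_rings=3, n_sectors=8, rotation=0.0):
    """Return the (ring, sector) cell of point (x, y) in a log-polar grid.

    The grid is centred at (cx, cy). Ring edges are spaced logarithmically
    in radius (an assumed spacing), and `rotation` (radians) turns the grid.
    """
    dx, dy = x - cx, y - cy
    r = math.hypot(dx, dy)
    # Logarithmic radial index; points at or beyond r_max fall in the outer ring.
    ring = min(n_rings - 1, int(n_rings * math.log1p(r) / math.log1p(r_max)))
    # Angle measured from the grid's rotated reference direction, in [0, 2*pi).
    theta = (math.atan2(dy, dx) - rotation) % (2.0 * math.pi)
    sector = int(theta / (2.0 * math.pi) * n_sectors) % n_sectors
    return ring, sector


def log_polar_histogram(features, cx, cy, r_max, vocab_size,
                        n_rings=3, n_sectors=8, rotation=0.0):
    """Count visual words per grid cell and return one flat histogram.

    `features` is an iterable of (x, y, word_id) triples. The result has
    n_rings * n_sectors * vocab_size entries, ordered cell by cell.
    """
    hist = [0] * (n_rings * n_sectors * vocab_size)
    for x, y, word_id in features:
        ring, sector = log_polar_bin(x, y, cx, cy, r_max,
                                     n_rings, n_sectors, rotation)
        cell = ring * n_sectors + sector
        hist[cell * vocab_size + word_id] += 1
    return hist


if __name__ == "__main__":
    feats = [(56, 58, 0), (56, 58, 0), (80, 90, 1)]
    h = log_polar_histogram(feats, cx=50, cy=50, r_max=100, vocab_size=2)
    print(len(h), sum(h))
```

Under these assumptions, grids of different scales and orientations would be obtained by calling `log_polar_histogram` with different `n_rings`, `n_sectors`, and `rotation` values and concatenating the resulting histograms.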
F-formation Detection: Individuating Free-standing Conversational Groups in Images
Detecting groups of interacting people is a useful task in many modern
technologies, with applications ranging from video surveillance to social
robotics. In this paper we first give a
rigorous definition of group considering the background of the social sciences:
this allows us to specify many kinds of group, so far neglected in the Computer
Vision literature. Building on this taxonomy, we review in detail the state of
the art in group detection algorithms. Then, as our main contribution, we present
a new method for the automatic detection of groups in still images, which
is based on a graph-cuts framework for clustering individuals; in particular, we
encode in computational terms the sociological definition of
F-formation, which is well suited to describing a group using only proxemic
information: the position and orientation of people. We call the proposed method
Graph-Cuts for F-formation (GCFF). We show that GCFF outperforms all
the state-of-the-art methods on several accuracy measures (some of
them new), while also demonstrating strong robustness to noise and
versatility in recognizing groups of various cardinality.
Comment: 32 pages, submitted to PLOS On
Towards Automatic Speech Identification from Vocal Tract Shape Dynamics in Real-time MRI
Vocal tract configurations play a vital role in generating distinguishable
speech sounds, by modulating the airflow and creating different resonant
cavities in speech production. They contain abundant information that can be
utilized to better understand the underlying speech production mechanism. As a
step towards automatic mapping of vocal tract shape geometry to acoustics, this
paper employs effective video action recognition techniques, like Long-term
Recurrent Convolutional Networks (LRCN) models, to identify different
vowel-consonant-vowel (VCV) sequences from dynamic shaping of the vocal tract.
Such a model typically combines a CNN-based deep hierarchical visual feature
extractor with recurrent networks, which ideally makes the network
spatio-temporally deep enough to learn the sequential dynamics of a short video
clip for video classification tasks. We use a database consisting of 2D
real-time MRI of vocal tract shaping during VCV utterances by 17 speakers. The
comparative performances of this class of algorithms under various parameter
settings and for various classification tasks are discussed. Interestingly, the
results show a marked difference in the model performance in the context of
speech classification with respect to generic sequence or video classification
tasks.
Comment: To appear in the INTERSPEECH 2018 Proceeding
Few-Shot In-Context Imitation Learning via Implicit Graph Alignment
Consider the following problem: given a few demonstrations of a task across a
few different objects, how can a robot learn to perform that same task on new,
previously unseen objects? This is challenging because the large variety of
objects within a class makes it difficult to infer the task-relevant
relationship between the new objects and the objects in the demonstrations. We
address this by formulating imitation learning as a conditional alignment
problem between graph representations of objects. Consequently, we show that
this conditioning allows for in-context learning, where a robot can perform a
task on a set of new objects immediately after the demonstrations, without any
prior knowledge about the object class or any further training. In our
experiments, we explore and validate our design choices, and we show that our
method is highly effective for few-shot learning of several real-world,
everyday tasks, whilst outperforming baselines. Videos are available on our
project webpage at https://www.robot-learning.uk/implicit-graph-alignment.
Comment: Published at CoRL 2023.