A study into annotation ranking metrics in geo-tagged image corpora
Community-contributed datasets are becoming increasingly common in automated image annotation systems. One important issue with community image data is that there is no guarantee that the associated metadata is relevant. A method is required that can accurately rank the semantic relevance of community annotations. This should enable the extraction of relevant subsets from potentially noisy collections of these annotations. Having relevant, non-heterogeneous tags assigned to images should improve community image retrieval systems, such as Flickr, which are based on text retrieval methods. In the literature, the current state-of-the-art approach to ranking the semantic relevance of Flickr tags is based on the widely used tf-idf metric. In the case of datasets containing landmark images, however, this metric is inefficient due to the high frequency of common landmark tags within the dataset and can be improved upon. In this paper, we present a landmark recognition framework that provides end-to-end automated recognition and annotation. In our study into automated annotation, we evaluate five alternative approaches to tf-idf for ranking tag relevance in community-contributed landmark image corpora. We carry out a thorough evaluation of each of these ranking metrics, and the results demonstrate that four of the proposed techniques outperform the commonly used tf-idf approach for this task.
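The tf-idf baseline that this abstract refers to can be sketched as follows. This is a minimal illustration only, assuming each image is represented as a list of tags and using one common tf-idf variant (relative term frequency times log(N/df)); the function name `rank_tags_tfidf` and the toy corpus are hypothetical and are not taken from the paper.

```python
import math
from collections import Counter

def rank_tags_tfidf(image_tags, corpus):
    """Rank the tags of one image by tf-idf, highest score first.

    image_tags: list of tags attached to the image (hypothetical input format).
    corpus: list of tag lists, one per image in the collection.
    """
    n_docs = len(corpus)
    # Document frequency: number of images in which each tag occurs.
    df = Counter()
    for tags in corpus:
        df.update(set(tags))
    tf = Counter(image_tags)
    scores = {}
    for tag, count in tf.items():
        # Tags never seen in the corpus get a score of 0 in this sketch.
        idf = math.log(n_docs / df[tag]) if df[tag] else 0.0
        scores[tag] = (count / len(image_tags)) * idf
    # Sort by descending score, breaking ties alphabetically.
    return sorted(scores, key=lambda t: (-scores[t], t))

# Toy corpus in which the landmark-related tag "paris" occurs in every image.
corpus = [
    ["eiffel", "paris", "tower"],
    ["paris", "holiday"],
    ["paris", "eiffel", "night"],
    ["paris", "food"],
]
print(rank_tags_tfidf(corpus[0], corpus))  # ['tower', 'eiffel', 'paris']
```

In this toy corpus the tag that appears in every image receives an idf of zero and is ranked last, which is one way the high frequency of common landmark tags can affect a tf-idf ranking.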
Semantic Part Segmentation using Compositional Model combining Shape and Appearance
In this paper, we study the problem of semantic part segmentation for
animals. This is more challenging than standard object detection, object
segmentation and pose estimation tasks because semantic parts of animals often
have similar appearance and highly varying shapes. To tackle these challenges,
we build a mixture of compositional models to represent the object boundary and
the boundaries of semantic parts. We also incorporate edge, appearance, and
semantic part cues into the compositional model. Given part-level segmentation
annotation, we develop a novel algorithm to learn a mixture of compositional
models under various poses and viewpoints for certain animal classes.
Furthermore, a linear complexity algorithm is offered for efficient inference
of the compositional model using dynamic programming. We evaluate our method
on horses and cows using a newly annotated dataset on Pascal VOC 2010 which has
pixelwise part labels. Experimental results demonstrate the effectiveness of
our method.
Occlusion Coherence: Detecting and Localizing Occluded Faces
The presence of occluders significantly impacts object recognition accuracy.
However, occlusion is typically treated as an unstructured source of noise and
explicit models for occluders have lagged behind those for object appearance
and shape. In this paper we describe a hierarchical deformable part model for
face detection and landmark localization that explicitly models part occlusion.
The proposed model structure makes it possible to augment positive training
data with large numbers of synthetically occluded instances. This allows us to
easily incorporate the statistics of occlusion patterns in a discriminatively
trained model. We test the model on several benchmarks for landmark
localization and detection including challenging new data sets featuring
significant occlusion. We find that the addition of an explicit occlusion model
yields a detection system that outperforms existing approaches for occluded
instances while maintaining competitive accuracy in detection and landmark
localization for unoccluded instances.
Fast Approximate k-Means via Cluster Closures
k-means, a simple and effective clustering algorithm, is one of the most
widely used algorithms in the multimedia and computer vision communities.
Traditional k-means is an iterative algorithm: in each iteration, new cluster
centers are computed and each data point is re-assigned to its nearest center.
The cluster re-assignment step becomes prohibitively expensive when the numbers
of data points and cluster centers are large.
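The iteration described above can be sketched in plain Python. This is a minimal illustration of standard (Lloyd-style) k-means, assuming points are tuples of numbers and initial centers are supplied by the caller; the function names are hypothetical. The exhaustive `assign` step computes a distance from every point to every center, which is the cost the abstract identifies as the bottleneck.

```python
def squared_distance(a, b):
    """Squared Euclidean distance between two equal-length tuples."""
    return sum((x - y) ** 2 for x, y in zip(a, b))

def assign(points, centers):
    """Exhaustive assignment: n * k distance computations per iteration."""
    return [
        min(range(len(centers)), key=lambda j: squared_distance(p, centers[j]))
        for p in points
    ]

def update(points, labels, centers):
    """Recompute each center as the mean of the points assigned to it."""
    new_centers = []
    for j, old in enumerate(centers):
        members = [p for p, label in zip(points, labels) if label == j]
        if not members:
            # Keep the previous center if a cluster ends up empty.
            new_centers.append(old)
            continue
        new_centers.append(
            tuple(sum(m[d] for m in members) / len(members) for d in range(len(old)))
        )
    return new_centers

def kmeans(points, centers, max_iter=100):
    """Alternate assignment and update until the labels stop changing."""
    centers = [tuple(c) for c in centers]
    labels = None
    for _ in range(max_iter):
        new_labels = assign(points, centers)
        if new_labels == labels:
            break
        labels = new_labels
        centers = update(points, labels, centers)
    return centers, labels
```

For example, `kmeans([(0, 0), (0, 1), (10, 10), (10, 11)], [(0, 0), (10, 10)])` separates the two pairs of points into two clusters.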
In this paper, we propose a novel approximate k-means algorithm to greatly
reduce the computational complexity in the assignment step. Our approach is
motivated by the observation that most active points changing their cluster
assignments at each iteration are located on or near cluster boundaries. The
idea is to efficiently identify those active points by pre-assembling the data
into groups of neighboring points using multiple random spatial partition
trees, and to use the neighborhood information to construct a closure for each
cluster, in such a way that only a small number of cluster candidates need to be
considered when assigning a data point to its nearest cluster. Using complexity
analysis, image data clustering, and applications to image retrieval, we show
that our approach outperforms state-of-the-art approximate k-means
algorithms in terms of clustering quality and efficiency.
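The idea of restricting each point to a few candidate clusters can be illustrated with a simplified sketch. This is not the authors' algorithm: it assumes the groups of neighboring points are already given as lists of point indices (the abstract builds them with multiple random spatial partition trees), and the function names are hypothetical. A point's candidates are the clusters currently assigned to any point that shares a group with it.

```python
def squared_distance(a, b):
    """Squared Euclidean distance between two equal-length tuples."""
    return sum((x - y) ** 2 for x, y in zip(a, b))

def candidate_clusters(groups, labels, n_points):
    """For each point, collect the clusters present among its group neighbors."""
    candidates = [set() for _ in range(n_points)]
    for group in groups:
        present = {labels[i] for i in group}
        for i in group:
            candidates[i] |= present
    return candidates

def assign_with_candidates(points, centers, labels, groups):
    """Re-assign each point, comparing only against its candidate clusters."""
    candidates = candidate_clusters(groups, labels, len(points))
    new_labels = []
    for i, p in enumerate(points):
        # A point that belongs to no group keeps its current cluster as the only option.
        options = candidates[i] or {labels[i]}
        new_labels.append(
            min(sorted(options), key=lambda j: squared_distance(p, centers[j]))
        )
    return new_labels
```

In this sketch, points whose neighbors all share one cluster are never compared against other centers, so distance computations concentrate on points near cluster boundaries, which is the observation the abstract describes.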