7,527 research outputs found

    Adaptive Tag Selection for Image Annotation

    Full text link
    Not all tags are relevant to an image, and the number of relevant tags is image-dependent. Although many methods have been proposed for image auto-annotation, the question of how to determine the number of tags to be selected per image remains open. The main challenge is that for a large tag vocabulary, there is often a lack of ground truth data for acquiring optimal cutoff thresholds per tag. In contrast to previous works that pre-specify the number of tags to be selected, we propose in this paper adaptive tag selection. The key insight is to divide the vocabulary into two disjoint subsets, namely a seen set consisting of tags having ground truth available for optimizing their thresholds and a novel set consisting of tags without any ground truth. Such a division allows us to estimate how many tags shall be selected from the novel set according to the tags that have been selected from the seen set. The effectiveness of the proposed method is justified by our participation in the ImageCLEF 2014 image annotation task. On a set of 2,065 test images with ground truth available for 207 tags, the benchmark evaluation shows that compared to the popular top-kk strategy which obtains an F-score of 0.122, adaptive tag selection achieves a higher F-score of 0.223. Moreover, by treating the underlying image annotation system as a black box, the new method can be used as an easy plug-in to boost the performance of existing systems

    An Empirical Study and Analysis of Generalized Zero-Shot Learning for Object Recognition in the Wild

    Full text link
    Zero-shot learning (ZSL) methods have been studied in the unrealistic setting where test data are assumed to come from unseen classes only. In this paper, we advocate studying the problem of generalized zero-shot learning (GZSL) where the test data's class memberships are unconstrained. We show empirically that naively using the classifiers constructed by ZSL approaches does not perform well in the generalized setting. Motivated by this, we propose a simple but effective calibration method that can be used to balance two conflicting forces: recognizing data from seen classes versus those from unseen ones. We develop a performance metric to characterize such a trade-off and examine the utility of this metric in evaluating various ZSL approaches. Our analysis further shows that there is a large gap between the performance of existing approaches and an upper bound established via idealized semantic embeddings, suggesting that improving class semantic embeddings is vital to GZSL.Comment: ECCV2016 camera-read

    Face Identification and Clustering

    Full text link
    In this thesis, we study two problems based on clustering algorithms. In the first problem, we study the role of visual attributes using an agglomerative clustering algorithm to whittle down the search area where the number of classes is high to improve the performance of clustering. We observe that as we add more attributes, the clustering performance increases overall. In the second problem, we study the role of clustering in aggregating templates in a 1:N open set protocol using multi-shot video as a probe. We observe that by increasing the number of clusters, the performance increases with respect to the baseline and reaches a peak, after which increasing the number of clusters causes the performance to degrade. Experiments are conducted using recently introduced unconstrained IARPA Janus IJB-A, CS2, and CS3 face recognition datasets

    Review of Person Re-identification Techniques

    Full text link
    Person re-identification across different surveillance cameras with disjoint fields of view has become one of the most interesting and challenging subjects in the area of intelligent video surveillance. Although several methods have been developed and proposed, certain limitations and unresolved issues remain. In all of the existing re-identification approaches, feature vectors are extracted from segmented still images or video frames. Different similarity or dissimilarity measures have been applied to these vectors. Some methods have used simple constant metrics, whereas others have utilised models to obtain optimised metrics. Some have created models based on local colour or texture information, and others have built models based on the gait of people. In general, the main objective of all these approaches is to achieve a higher-accuracy rate and lowercomputational costs. This study summarises several developments in recent literature and discusses the various available methods used in person re-identification. Specifically, their advantages and disadvantages are mentioned and compared.Comment: Published 201
    corecore