
    Video retrieval using objects and ostensive relevance feedback

    The thesis discusses and evaluates a model of video information retrieval that incorporates a variation of Relevance Feedback and facilitates object-based interaction and ranking. Video and image retrieval systems suffer from poor retrieval performance compared to text-based information retrieval systems, mainly because the visual features that provide the search index have poor discrimination power. Relevance Feedback is an iterative approach in which the user gives the system relevant and non-relevant judgements of the results, and the system re-ranks the results based on those judgements. Relevance feedback for video retrieval can help overcome the poor discrimination power of the features, with the user essentially pointing the system in the right direction through their judgements. The ostensive relevance feedback approach discussed in this work weights user judgements based on the order in which they are made, with newer judgements weighted higher than older judgements. The main aim of the thesis is to explore the benefit of ostensive relevance feedback for video retrieval, with a secondary aim of exploring the effectiveness of object retrieval. A user experiment has been developed in which three video retrieval system variants are evaluated on a corpus of video content. The first system applies standard relevance feedback weighting, while the second and third apply ostensive relevance feedback with variations in the decay weight. In order to evaluate effective object retrieval, animated video content provides the corpus for the evaluation experiment, as animated content offers the highest performance for object detection and extraction.
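
The order-based weighting described in this abstract can be illustrated with a minimal sketch. The abstract does not give the actual weighting formula, so the exponential decay below, the function names `ostensive_weights` and `feedback_query`, and the averaging of feature vectors are all assumptions made for illustration, not the thesis's method. Setting `decay` to 1.0 reduces to uniform weighting, which stands in here for standard relevance feedback.

```python
def ostensive_weights(num_judgements, decay=0.5):
    """Return one weight per judgement, oldest first, summing to 1.

    Assumed scheme: the newest judgement gets raw weight 1, and each
    step back in time multiplies the raw weight by `decay`.
    """
    if not 0.0 < decay <= 1.0:
        raise ValueError("decay must be in (0, 1]")
    if num_judgements <= 0:
        return []
    raw = [decay ** (num_judgements - 1 - i) for i in range(num_judgements)]
    total = sum(raw)
    return [w / total for w in raw]


def feedback_query(judged_vectors, decay=0.5):
    """Build a query vector as the weighted mean of the feature vectors
    of items judged relevant, listed oldest first (hypothetical helper)."""
    if not judged_vectors:
        return []
    weights = ostensive_weights(len(judged_vectors), decay)
    dims = len(judged_vectors[0])
    return [
        sum(w * vec[d] for w, vec in zip(weights, judged_vectors))
        for d in range(dims)
    ]


if __name__ == "__main__":
    # Two judgements: the second (newer) one pulls the query harder.
    print(feedback_query([[1.0, 0.0], [0.0, 1.0]], decay=0.5))
```

Varying `decay` between system variants is one way the "variations in the decay weight" mentioned above could be realised; smaller values make the system forget older judgements faster.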

    A study into annotation ranking metrics in geo-tagged image corpora

    Community-contributed datasets are becoming increasingly common in automated image annotation systems. One important issue with community image data is that there is no guarantee that the associated metadata is relevant. A method is required that can accurately rank the semantic relevance of community annotations. This should enable the extraction of relevant subsets from potentially noisy collections of these annotations. Having relevant, non-heterogeneous tags assigned to images should improve community image retrieval systems, such as Flickr, which are based on text retrieval methods. In the literature, the current state-of-the-art approach to ranking the semantic relevance of Flickr tags is based on the widely used tf-idf metric. In the case of datasets containing landmark images, however, this metric is inefficient due to the high frequency of common landmark tags within the data set, and it can be improved upon. In this paper, we present a landmark recognition framework that provides end-to-end automated recognition and annotation. In our study into automated annotation, we evaluate five alternative approaches to tf-idf for ranking tag relevance in community-contributed landmark image corpora. We carry out a thorough evaluation of each of these ranking metrics, and the results demonstrate that four of the proposed techniques outperform the commonly used tf-idf approach for this task.
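
As a rough illustration of the tf-idf baseline mentioned in this abstract, the sketch below ranks the tags of one image against a small corpus of tag lists. The function name `tfidf_rank`, the toy tags, and the exact tf and idf variants (relative frequency and unsmoothed log ratio) are assumptions; the paper's own formulation and its five alternative metrics are not given in the abstract and are not reproduced here.

```python
import math
from collections import Counter


def tfidf_rank(tags, corpus):
    """Rank the tags of one image by tf-idf against a corpus of tag lists.

    tf  = occurrences of the tag in `tags` / number of tags in `tags`
    idf = log(number of tag lists / number of tag lists containing the tag)
    Returns (tag, score) pairs, highest score first, ties broken by name.
    """
    n_docs = len(corpus)
    doc_freq = Counter()
    for doc in corpus:
        doc_freq.update(set(doc))  # count each tag once per tag list

    scores = {}
    for tag, count in Counter(tags).items():
        tf = count / len(tags)
        df = doc_freq.get(tag, 0)
        idf = math.log(n_docs / df) if df else 0.0
        scores[tag] = tf * idf
    return sorted(scores.items(), key=lambda kv: (-kv[1], kv[0]))


if __name__ == "__main__":
    corpus = [
        ["eiffel", "paris", "tower"],
        ["paris", "louvre"],
        ["paris", "seine", "bridge"],
    ]
    for tag, score in tfidf_rank(corpus[0], corpus):
        print(f"{tag}: {score:.3f}")
```

In this toy corpus the tag that appears in every list receives an idf of zero, which hints at how very frequent tags behave under this metric in a collection dominated by a few landmarks.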

    Diversity, Assortment, Dissimilarity, Variety: A Study of Diversity Measures Using Low Level Features for Video Retrieval

    In this paper we present a number of methods for re-ranking video search results in order to introduce diversity into the set of search results. The usefulness of these approaches is evaluated in comparison with similarity-based measures on the TRECVID 2007 collection and tasks [11]. In terms of the MAP of the search results, we find that some of our approaches perform as well as similarity-based methods. We also find that some of these approaches can improve the P@N values for some of the lower values of N. The most successful of these approaches was then implemented in an interactive search system for the TRECVID 2008 interactive search tasks. The responses from the users indicate that they find the more diverse search results extremely useful.
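
For readers unfamiliar with the two measures named in this abstract, the following sketch computes P@N and MAP using their textbook definitions. The function names and the toy rankings are illustrative assumptions, and the official TRECVID evaluation may differ in detail (for example in how unjudged items or truncated result lists are handled).

```python
def precision_at_n(ranked, relevant, n):
    """Fraction of the top-n ranked items that are relevant."""
    if n <= 0:
        return 0.0
    return sum(1 for item in ranked[:n] if item in relevant) / n


def average_precision(ranked, relevant):
    """Mean of the precision values at each rank where a relevant item
    appears, divided by the total number of relevant items."""
    if not relevant:
        return 0.0
    hits = 0
    total = 0.0
    for rank, item in enumerate(ranked, start=1):
        if item in relevant:
            hits += 1
            total += hits / rank
    return total / len(relevant)


def mean_average_precision(runs):
    """Mean of average precision over (ranked, relevant) pairs, one per query."""
    if not runs:
        return 0.0
    return sum(average_precision(r, rel) for r, rel in runs) / len(runs)


if __name__ == "__main__":
    runs = [
        (["a", "b", "c", "d"], {"a", "c"}),
        (["x", "y"], {"y"}),
    ]
    print(precision_at_n(runs[0][0], runs[0][1], 2))
    print(mean_average_precision(runs))
```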

    From Query-By-Keyword to Query-By-Example: LinkedIn Talent Search Approach

    One key challenge in talent search is to translate the complex criteria of a hiring position into a search query, while it is relatively easy for a searcher to list examples of suitable candidates for a given position. To improve search efficiency, we propose the next generation of talent search at LinkedIn, also referred to as Search By Ideal Candidates. In this system, a searcher provides one or several ideal candidates as the input to hire for a given position. The system then generates a query based on the ideal candidates and uses it to retrieve and rank results. Shifting from the traditional Query-By-Keyword to this new Query-By-Example system poses a number of challenges: How to generate a query that best describes the candidates? When moving to a completely different paradigm, how does one leverage previous product logs to learn ranking models and/or evaluate the new system with no existing usage logs? Finally, given the differing nature of the two search paradigms, the ranking features typically used for Query-By-Keyword systems might not be optimal for Query-By-Example. This paper describes our approach to solving these challenges. We present experimental results confirming the effectiveness of the proposed solution, particularly on query building and search ranking tasks. As of the writing of this paper, the new system has been available to all LinkedIn members.

    Representativeness and Diversity in Photos via Crowd-Sourced Media Analysis

    In this paper we present a hybrid three-step mechanism for automated-human media analysis, used to select a small number of representative and diverse images from a noisy set of images. The first step consists of the automatic retrieval from the web of a large database of candidate images. In the second step, a proposed image analysis method is employed with the goal of reducing the time, pay and cognitive load, and implicitly people’s work. This is done by automatically selecting a set of potentially relevant and diverse images. Considering the semantic gap between low-level features and high-level semantics in images, the last step is necessary and consists of the images being annotated and assessed by the crowd. The aim is to evaluate the level of representativeness and diversity of the selected set of images and to provide images of the highest quality. The method was validated in the context of the retrieval of images of monuments, using more than 30,000 images retrieved from various social image search platforms.

    Simple to Complex Cross-modal Learning to Rank

    The heterogeneity gap between different modalities poses a significant challenge to multimedia information retrieval. Some studies formalize the cross-modal retrieval tasks as a ranking problem and learn a shared multi-modal embedding space to measure the cross-modality similarity. However, previous methods often establish the shared embedding space based on linear mapping functions, which might not be sophisticated enough to reveal more complicated inter-modal correspondences. Additionally, current studies assume that the rankings are of equal importance, and thus all rankings are used simultaneously, or a small number of rankings are selected randomly to train the embedding space at each iteration. Such strategies, however, always suffer from outliers as well as reduced generalization capability due to their lack of insight into the procedure of human cognition. In this paper, we incorporate self-paced learning theory with diversity into cross-modal learning to rank and learn an optimal multi-modal embedding space based on non-linear mapping functions. This strategy enhances the model's robustness to outliers and achieves better generalization by training the model gradually, from easy rankings by diverse queries to more complex ones. An efficient alternative algorithm is exploited to solve the proposed challenging problem with fast convergence in practice. Extensive experimental results on several benchmark datasets indicate that the proposed method achieves significant improvements over the state of the art in the literature. Comment: 14 pages; Accepted by Computer Vision and Image Understandin
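
The easy-to-complex training idea in this abstract can be pictured with a very small sketch of the generic hard-threshold form of self-paced selection: a training item is used only while its current loss is below a threshold, and the threshold grows over time. The function names, the geometric schedule, and the toy losses are assumptions for illustration. The diversity term and the non-linear embedding model described in the abstract are not modelled here.

```python
def self_paced_select(losses, threshold):
    """Return a 0/1 mask: 1 for items whose loss is below the threshold
    (treated as 'easy' at this stage), 0 for the rest."""
    return [1 if loss < threshold else 0 for loss in losses]


def pace_schedule(start, growth, steps):
    """Hypothetical schedule: thresholds that grow geometrically, so that
    harder items are admitted in later training rounds."""
    return [start * (growth ** step) for step in range(steps)]


if __name__ == "__main__":
    losses = [0.1, 0.9, 0.4]  # toy per-ranking losses
    for threshold in pace_schedule(start=0.2, growth=2.5, steps=3):
        print(threshold, self_paced_select(losses, threshold))
```

In a real training loop the losses would be recomputed after each model update, so the selected set changes as the model improves.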