5 research outputs found

    Dynamic Two-Stage Image Retrieval from Large Multimodal Databases

    No full text
    Abstract. Content-based image retrieval (CBIR) with global features is notoriously noisy, especially for image queries with low percentages of relevant images in a collection. Moreover, CBIR typically ranks the whole collection, which is inefficient for large databases. We experiment with a method for image retrieval from multimodal databases, which improves both the effectiveness and efficiency of traditional CBIR by exploring secondary modalities. We perform retrieval in a two-stage fashion: first rank by a secondary modality, and then perform CBIR only on the top-K items. Thus, effectiveness is improved by performing CBIR on a 'better' subset. Using a relatively 'cheap' first stage, efficiency is also improved via the fewer CBIR operations performed. Our main novelty is that K is dynamic, i.e. estimated per query to optimize a predefined effectiveness measure. We show that such dynamic two-stage setups can be significantly more effective and robust than similar setups with static thresholds previously proposed.

    Dynamic two-stage image retrieval from large multimodal databases

    No full text
    Abstract. Content-based image retrieval (CBIR) with global features is notoriously noisy, especially for image queries with low percentages of relevant images in a collection. Moreover, CBIR typically ranks the whole collection, which is inefficient for large databases. We experiment with a method for image retrieval from multimedia databases, which improves both the effectiveness and efficiency of traditional CBIR by exploring secondary media. We perform retrieval in a two-stage fashion: first rank by a secondary medium, and then perform CBIR only on the top-K items. Thus, effectiveness is improved by performing CBIR on a 'better' subset. Using a relatively 'cheap' first stage, efficiency is also improved via the fewer CBIR operations performed. Our main novelty is that K is dynamic, i.e. estimated per query to optimize a predefined effectiveness measure. We show that our dynamic two-stage method can be significantly more effective and robust than similar setups with static thresholds previously proposed. In additional experiments using local feature derivatives in the visual stage instead of global features, such as the emerging visual codebook approach, we find that two-stage retrieval does not work very well. We attribute the weaker performance of the visual codebook to the enhanced visual diversity produced by the textual stage, which diminishes the codebook's advantage over global features. Furthermore, we compare dynamic two-stage retrieval to traditional score-based fusion of results retrieved visually and textually. We find that fusion is also significantly more effective than single-medium baselines. Although there is no clear winner between two-stage retrieval and fusion, the methods exhibit different robustness characteristics; nevertheless, two-stage retrieval provides efficiency benefits over fusion.
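    The two-stage procedure this abstract describes can be sketched in a few lines. This is an illustrative sketch, not the authors' implementation: rank_by_text, cbir_score, and estimate_k are hypothetical stand-ins for the secondary-medium ranker, the CBIR scorer, and the per-query K estimator.

```python
def two_stage_retrieval(query, collection, rank_by_text, cbir_score, estimate_k):
    """Sketch of dynamic two-stage retrieval.

    Stage 1: cheap ranking of the whole collection by a secondary medium
    (e.g. text). Stage 2: expensive CBIR re-ranking of only the top-K items,
    where K is estimated per query rather than fixed in advance.
    """
    # Stage 1: rank everything by the secondary medium (descending score).
    first_stage = sorted(collection,
                         key=lambda item: rank_by_text(query, item),
                         reverse=True)
    # Dynamic cut-off: K chosen per query to optimize some effectiveness measure.
    k = estimate_k(query, first_stage)
    # Stage 2: CBIR re-ranks only the top-K subset; the tail keeps its
    # first-stage order, so only K expensive visual comparisons are made.
    reranked = sorted(first_stage[:k],
                      key=lambda item: cbir_score(query, item),
                      reverse=True)
    return reranked + first_stage[k:]
```

    A static-threshold baseline is the special case where estimate_k ignores the query and always returns the same K.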

    Contribuições para a localização e mapeamento em robótica através da identificação visual de lugares [Contributions to localization and mapping in robotics through visual place identification]

    Get PDF
    Doctoral thesis, Informatics (Computer Engineering), Universidade de Lisboa, Faculdade de Ciências, 2015. In the mobile robotics field, appearance-based methods are at the core of several attractive systems for localization and mapping. To be successful, however, these methods require features with good descriptive power. This is a necessary condition to ensure place recognition in the presence of disturbing factors, such as high similarity between places or lighting variations. This thesis addresses the localization and mapping problems, seeking representations that are more discriminative or more efficient. To this end, two broad types of visual features are used: local and global features. Appearance representations based on local features have been dominated by the BoW (Bag-of-Words) model, which prescribes the quantization of descriptors and their labelling with visual words. This thesis challenges that approach through the study of the alternative, non-quantized representation (NQ). As an outcome of this study, we contribute a novel global localization method, the NQ classifier. Besides offering higher precision than the BoW model, this classifier admits significant simplifications that make it competitive with the quantized representation in terms of efficiency as well. 
    The thesis also addresses the problem posed prior to localization, the mapping of the environment, focusing specifically on loop closure detection. To support loop closing, a new global feature, LBP-Gist, is proposed. As the name suggests, this feature combines texture analysis, provided by the LBP method, with the encoding of global image structure underlying the Gist feature. Evaluation on several datasets demonstrates the validity of the proposed detector: its precision and efficiency are shown to be superior to the state of the art in outdoor environments.
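    The texture half of LBP-Gist builds on the basic LBP (Local Binary Pattern) operator, which can be sketched as follows. This is a minimal pure-Python illustration of the standard 3x3 LBP, not the thesis's LBP-Gist feature, which additionally encodes global image structure in the manner of Gist.

```python
def lbp_code(img, y, x):
    """8-bit LBP code for pixel (y, x) of a grayscale image (list of rows):
    each of the 8 neighbours sets one bit if it is >= the centre value."""
    centre = img[y][x]
    # Clockwise neighbour offsets starting at the top-left corner.
    offsets = [(-1, -1), (-1, 0), (-1, 1), (0, 1),
               (1, 1), (1, 0), (1, -1), (0, -1)]
    code = 0
    for bit, (dy, dx) in enumerate(offsets):
        if img[y + dy][x + dx] >= centre:
            code |= 1 << bit
    return code

def lbp_histogram(img):
    """256-bin histogram of LBP codes over all interior pixels.
    The histogram, not the per-pixel codes, serves as the texture descriptor."""
    hist = [0] * 256
    for y in range(1, len(img) - 1):
        for x in range(1, len(img[0]) - 1):
            hist[lbp_code(img, y, x)] += 1
    return hist
```

    Because each code depends only on the sign of local intensity differences, the descriptor is invariant to monotonic lighting changes, which is one reason LBP suits the illumination-variation problem the thesis targets.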

    Understanding search

    Get PDF
    This thesis provides a framework for information retrieval based on a set of models which together illustrate how users of search engines come to express their needs in a particular way. With such insights, we may be able to improve systems' ability to understand users' requests and, through that, eventually their ability to satisfy users' needs. Developing the framework necessitates discussion of context, relevance, need development, and the cybernetics of search, all of which are controversial topics. Transaction log data from two enterprise search engines are analysed using a specially developed method which classifies queries according to the aspect of the need they refer to.