
    Automatic Query Image Disambiguation for Content-Based Image Retrieval

    Query images presented to content-based image retrieval systems often admit several interpretations, making it difficult to identify the search objective pursued by the user. We propose a technique for overcoming this ambiguity while keeping the amount of required user interaction to a minimum. To achieve this, the neighborhood of the query image is divided into coherent clusters from which the user may choose the relevant ones. A novel feedback integration technique is then employed to re-rank the entire database with regard to both the user feedback and the original query. We evaluate our approach on the publicly available MIRFLICKR-25K dataset, where it yields a relative improvement in average precision of 23% over the baseline retrieval, which does not distinguish between different image senses. Comment: VISAPP 2018 paper, 8 pages, 5 figures. Source code: https://github.com/cvjena/ai
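    The two steps described in this abstract (cluster the query's neighborhood, then re-rank the database using both the query and the chosen clusters) can be sketched as below. This is only an illustration under stated assumptions and is not the method from the paper: the plain k-means clustering, the Euclidean distance, the weight `alpha`, and the function names `kmeans` and `rerank` are all hypothetical choices made here.

```python
import math

def kmeans(points, k, iters=10):
    """Tiny k-means (assumed clustering step; the paper may differ).

    Centroids are initialised from the first k points so the result
    is deterministic.
    """
    centroids = [list(p) for p in points[:k]]
    for _ in range(iters):
        groups = [[] for _ in range(k)]
        for p in points:
            nearest = min(range(k), key=lambda c: math.dist(p, centroids[c]))
            groups[nearest].append(p)
        for j, group in enumerate(groups):
            if group:  # keep the old centroid if a cluster is empty
                centroids[j] = [sum(xs) / len(group) for xs in zip(*group)]
    return centroids

def rerank(db, query, chosen, alpha=0.5):
    """Return database indices sorted by a combined distance.

    The score mixes the distance to the original query with the distance
    to the nearest user-selected cluster centroid. The linear mix and
    `alpha` are assumptions for illustration only.
    """
    def score(item):
        d_query = math.dist(item, query)
        d_feedback = min(math.dist(item, c) for c in chosen)
        return alpha * d_query + (1 - alpha) * d_feedback
    return sorted(range(len(db)), key=lambda i: score(db[i]))
```

    In use, one would take the k nearest neighbors of the query, pass them to `kmeans`, show one representative per cluster to the user, and call `rerank` with the centroids of the clusters the user selected.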

    Relevant clouds: leveraging relevance feedback to build tag clouds for image search

    The final publication is available at Springer via http://dx.doi.org/10.1007/978-3-642-40802-1_18. Previous work has explored tag clouds as a way to improve image search and potentially increase retrieval performance. However, to date none has considered building tag clouds derived from relevance feedback. We propose a simple approach to this idea, in which the tag cloud gives more importance to the words from the relevant images than to those from the non-relevant ones. A preliminary study with 164 queries inspected by 14 participants over a 30M dataset of automatically annotated images showed that 1) tag clouds derived this way are found to be informative: users considered roughly 20% of the presented tags to be relevant for any query at any time; and 2) the importance given to the tags correlates with user judgments: tags ranked in the first positions tended to be perceived more often as relevant to the topic that users had in mind. Work supported by EU FP7/2007-2013 under grant agreements 600707 (tranScriptorium) and 287576 (CasMaCat), and by the STraDA project (TIN2012-37475-C02-01). Leiva Torres, LA.; Villegas Santamaría, M.; Paredes Palacios, R. (2013). Relevant clouds: leveraging relevance feedback to build tag clouds for image search. In Information Access Evaluation. Multilinguality, Multimodality, and Visualization. Springer Verlag (Germany). 143-149. https://doi.org/10.1007/978-3-642-40802-1_18
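    The idea of weighting tags from relevant images above those from non-relevant ones can be sketched as follows. This is a minimal illustration, not the authors' formula: the scoring rule (fraction of relevant images carrying a tag minus the fraction of non-relevant images carrying it) and the function name `tag_weights` are assumptions made here.

```python
from collections import Counter

def tag_weights(relevant, non_relevant):
    """Rank tags for a tag cloud from relevance feedback.

    `relevant` and `non_relevant` are lists of tag lists, one per image.
    A tag's weight is the share of relevant images carrying it minus the
    share of non-relevant images carrying it (an assumed scoring rule).
    Only tags with a positive weight are kept.
    """
    pos = Counter(t for tags in relevant for t in set(tags))
    neg = Counter(t for tags in non_relevant for t in set(tags))
    n_pos = max(len(relevant), 1)      # avoid division by zero
    n_neg = max(len(non_relevant), 1)
    weights = {}
    for tag, count in pos.items():
        score = count / n_pos - neg[tag] / n_neg
        if score > 0:
            weights[tag] = score
    # Highest weight first; ties broken alphabetically for stable output.
    return sorted(weights.items(), key=lambda kv: (-kv[1], kv[0]))
```

    The resulting weights could drive font size in the rendered cloud, so that tags seen mostly in images the user marked as relevant appear most prominently.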

    Diverse Contributions to Implicit Human-Computer Interaction

    When people interact with computers, a great deal of information is provided unintentionally. By studying these implicit interactions it is possible to understand which features of the user interface are beneficial (or not), which in turn yields implications for the design of future interactive systems. The main advantage of exploiting implicit user data in computer applications is that any interaction with the system can contribute to improving its usefulness. Moreover, such data remove the cost of interrupting the user to explicitly submit information about a topic that need not be related to the reason for using the system. On the other hand, implicit interactions sometimes do not provide clear and concrete data, so special attention must be paid to how this source of information is managed. The purpose of this research is twofold: 1) to apply a new perspective to both the design and the development of applications that can react accordingly to the user's implicit interactions, and 2) to provide a set of methodologies for evaluating such interactive systems. Five scenarios illustrate the feasibility and suitability of the thesis framework. Empirical results with real users show that exploiting implicit interaction is both a suitable and a convenient means of improving interactive systems in multiple ways. Leiva Torres, LA. (2012). Diverse Contributions to Implicit Human-Computer Interaction [Unpublished doctoral thesis]. Universitat Politècnica de València. https://doi.org/10.4995/Thesis/10251/17803 Palanci

    CHORUS Deliverable 2.2: Second report - identification of multi-disciplinary key issues for gap analysis toward EU multimedia search engines roadmap

    After surveying the state of the art during the first year of Chorus and establishing the existing landscape in multimedia search engines, we identified and analyzed gaps within the European research effort during our second year. In this period we focused on three directions: technological issues; user-centred issues and use cases; and socio-economic and legal aspects. These were assessed by two central studies: first, a concerted vision of the functional breakdown of a generic multimedia search engine, and second, a representative set of use-case descriptions with a related discussion of the requirements for technological challenges. Both studies were carried out in cooperation and consultation with the community at large through EC concertation meetings (multimedia search engines cluster), several meetings with our Think-Tank, presentations at international conferences, and surveys addressed to coordinators of EU projects and of national initiatives. Based on the feedback obtained, we identified two types of gaps: core technological gaps that involve research challenges, and "enablers", which are not necessarily technical research challenges but have an impact on innovation progress. New socio-economic trends are presented, as well as emerging legal challenges.

    Recuperação de informação multimodal em repositórios de imagem médica

    The proliferation of digital medical imaging modalities in hospitals and other diagnostic facilities has created huge repositories of valuable data, often not fully explored. Moreover, the past few years show a growing trend in data production. As such, studying new ways to index, process and retrieve medical images becomes an important subject to be addressed by the wider community of radiologists, scientists and engineers. Content-based image retrieval, which encompasses various methods, can exploit the visual information of a medical imaging archive, and is known to be beneficial to practitioners and researchers. However, the integration of the latest systems for medical image retrieval into clinical workflows is still rare, and their effectiveness still shows room for improvement. This thesis proposes solutions and methods for multimodal information retrieval in the context of medical imaging repositories. The major contributions are a search engine for medical imaging studies supporting multimodal queries in an extensible archive; a framework for automated labeling of medical images for content discovery; and an assessment and proposal of feature learning techniques for concept detection from medical images, exhibiting greater potential than the feature extraction algorithms previously used in similar tasks. These contributions, each in its own dimension, seek to narrow the scientific and technical gap towards the development and adoption of novel multimodal medical image retrieval systems, so that they ultimately become part of the workflows of medical practitioners, teachers, and researchers in healthcare. Programa Doutoral em Informátic

    The Search as Learning Spaceship: Toward a Comprehensive Model of Psychological and Technological Facets of Search as Learning

    Using a Web search engine is one of today's most frequent activities. Exploratory search activities which are carried out in order to gain knowledge are conceptualized and denoted as Search as Learning (SAL). In this paper, we introduce a novel framework model which incorporates the perspectives of both psychology and computer science to describe the search as learning process by reviewing recent literature. The main entities of the model are the learner, who is surrounded by a specific learning context; the interface that mediates between the learner and the information environment; and the information retrieval (IR) backend, which manages the processes between the interface and the set of Web resources, that is, the collective Web knowledge represented in resources of different modalities. We first provide an overview of the current state of the art with regard to the five main entities of our model, and then outline areas of future research to improve our understanding of search as learning processes. Copyright © 2022 von Hoyer, Hoppe, Kammerer, Otto, Pardi, Rokicki, Yu, Dietze, Ewerth and Holtz

    The Infinite Index: Information Retrieval on Generative Text-To-Image Models

    Conditional generative models such as DALL-E and Stable Diffusion generate images based on a user-defined text, the prompt. Finding and refining prompts that produce a desired image has become the art of prompt engineering. Generative models do not provide a built-in retrieval model for a user's information need expressed through prompts. In light of an extensive literature review, we reframe prompt engineering for generative models as interactive text-based retrieval on a novel kind of "infinite index". We apply these insights for the first time in a case study on image generation for game design with an expert. Finally, we envision how active learning may help to guide the retrieval of generated images. Comment: Final version for CHIIR 202