Automatic Query Image Disambiguation for Content-Based Image Retrieval
Query images presented to content-based image retrieval systems often have
various different interpretations, making it difficult to identify the search
objective pursued by the user. We propose a technique for overcoming this
ambiguity, while keeping the amount of required user interaction at a minimum.
To achieve this, the neighborhood of the query image is divided into coherent
clusters from which the user may choose the relevant ones. A novel feedback
integration technique is then employed to re-rank the entire database with
regard to both the user feedback and the original query. We evaluate our
approach on the publicly available MIRFLICKR-25K dataset, where it leads to a
relative improvement of average precision by 23% over the baseline retrieval,
which does not distinguish between different image senses.
Comment: VISAPP 2018 paper, 8 pages, 5 figures. Source code:
https://github.com/cvjena/ai
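The cluster-then-re-rank idea described above can be sketched in a few lines. This is a minimal illustration, not the paper's exact method: the function names, the plain NumPy k-means, the Euclidean distances, and the equal-weight score combination are all assumptions (the authors' actual implementation is in the linked repository).

```python
import numpy as np

def _kmeans(X, k, iters=20, seed=0):
    """Minimal k-means in plain NumPy, used to cluster the query's neighborhood."""
    rng = np.random.RandomState(seed)
    centers = X[rng.choice(len(X), size=k, replace=False)].copy()
    for _ in range(iters):
        # Assign each point to its nearest center, then update the centers.
        labels = np.argmin(np.linalg.norm(X[:, None] - centers[None], axis=2), axis=1)
        for j in range(k):
            if np.any(labels == j):
                centers[j] = X[labels == j].mean(axis=0)
    return labels, centers

def disambiguate_and_rerank(query, db, k_neighbors=50, n_clusters=4, chosen=(0,), w=0.5):
    """Cluster the query's nearest neighbors into candidate "senses", then
    re-rank the whole database by a mix of distance to the original query
    and distance to the clusters the user marked as relevant."""
    d_query = np.linalg.norm(db - query, axis=1)
    neighbors = np.argsort(d_query)[:k_neighbors]
    _, centers = _kmeans(db[neighbors], n_clusters)
    relevant = centers[list(chosen)]
    # Distance of every database image to its closest relevant cluster center.
    d_feedback = np.min(np.linalg.norm(db[:, None] - relevant[None], axis=2), axis=1)
    return np.argsort(w * d_query + (1 - w) * d_feedback)
```

The single round of feedback (choosing clusters rather than rating individual images) is what keeps user interaction minimal in this scheme.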
Relevant clouds: leveraging relevance feedback to build tag clouds for image search
The final publication is available at Springer via http://dx.doi.org/10.1007/978-3-642-40802-1_18
Previous work in the literature has aimed to exploit tag clouds to improve image search and potentially increase retrieval performance. However, to date none has considered the idea of building tag clouds derived from relevance feedback. We propose a simple approach to this idea, where the tag cloud gives more importance to the words from relevant images than to those from non-relevant ones. A preliminary study with 164 queries inspected by 14 participants over a 30M dataset of automatically annotated images showed that 1) tag clouds derived this way are informative: users considered roughly 20% of the presented tags to be relevant for any query at any time; and 2) the importance given to the tags correlates with user judgments: tags ranked in the first positions tended to be perceived more often as relevant to the topic that users had in mind.
Work supported by EU FP7/2007-2013 under grant agreements 600707 (tranScriptorium) and 287576 (CasMaCat), and by the STraDA project (TIN2012-37475-C02-01).
Leiva Torres, LA.; Villegas Santamaría, M.; Paredes Palacios, R. (2013). Relevant clouds: leveraging relevance feedback to build tag clouds for image search. In Information Access Evaluation. Multilinguality, Multimodality, and Visualization. Springer Verlag (Germany). 143-149. https://doi.org/10.1007/978-3-642-40802-1_18
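Weighting tags by relevance feedback, as the abstract describes, can be illustrated with a small sketch. The Rocchio-like scoring below (positive count minus a discounted negative count) is an assumption for illustration; the paper's exact weighting may differ, and the function name and parameters are hypothetical.

```python
from collections import Counter

def tag_cloud_weights(relevant_tags, nonrelevant_tags, alpha=1.0, beta=0.5):
    """Weight each tag by its frequency among relevant images, penalized by
    its frequency among non-relevant ones; return tags sorted by weight."""
    pos = Counter(t for tags in relevant_tags for t in tags)
    neg = Counter(t for tags in nonrelevant_tags for t in tags)
    weights = {t: alpha * pos[t] - beta * neg.get(t, 0) for t in pos}
    # Keep only positively weighted tags, largest weight first.
    return sorted(((t, w) for t, w in weights.items() if w > 0),
                  key=lambda x: -x[1])
```

Ranking tags this way is what lets the cloud's visual prominence track the user's relevance judgments, as the study's second finding suggests.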
Diverse Contributions to Implicit Human-Computer Interaction
When people interact with computers, there is much
information that is not provided on purpose. By studying these
implicit interactions it is possible to understand which characteristics of the
user interface are beneficial (or not), thus deriving implications for the
design of future interactive systems.
The main advantage of leveraging implicit user data in
computer applications is that any interaction with the system can
contribute to improving its usefulness. Moreover, such data remove the cost of
having to interrupt the user to submit explicit information
about a topic that, in principle, need not be related to the
intention of using the system. On the other hand, implicit
interactions sometimes do not provide clear and concrete data. For this reason,
special attention must be paid to how this source of
information is managed.
The purpose of this research is twofold: 1) to apply a new vision to both
the design and the development of applications that can react
accordingly to the user's implicit interactions, and 2)
to provide a series of methodologies for the evaluation of such
interactive systems. Five scenarios illustrate the feasibility and
suitability of the thesis framework. Empirical results with
real users demonstrate that leveraging implicit interaction is a
means both suitable and convenient for improving interactive systems
in multiple ways.
Leiva Torres, LA. (2012). Diverse Contributions to Implicit Human-Computer Interaction [Unpublished doctoral thesis]. Universitat Politècnica de València. https://doi.org/10.4995/Thesis/10251/17803
CHORUS Deliverable 2.2: Second report - identification of multi-disciplinary key issues for gap analysis toward EU multimedia search engines roadmap
After addressing the state of the art during the first year of CHORUS and establishing the existing landscape in
multimedia search engines, we identified and analyzed gaps within the European research effort during our second year.
In this period we focused on three directions, namely technological issues, user-centred issues and use cases, and socio-
economic and legal aspects. These were assessed through two central studies: firstly, a concerted vision of the functional breakdown
of a generic multimedia search engine, and secondly, representative use-case descriptions with a related discussion of the
requirements for technological challenges. Both studies were carried out in cooperation and consultation with the
community at large through EC concertation meetings (multimedia search engines cluster), several meetings with our
Think-Tank, presentations at international conferences, and surveys addressed to coordinators of EU projects as well as of
national initiatives. Based on the feedback obtained, we identified two types of gaps, namely core
technological gaps that involve research challenges, and "enablers", which are not necessarily technical research
challenges but have an impact on innovation progress. New socio-economic trends are presented, as well as emerging legal
challenges.
Multimodal information retrieval in medical imaging repositories
The proliferation of digital medical imaging modalities in hospitals and other
diagnostic facilities has created huge repositories of valuable data, often
not fully explored. Moreover, the past few years show a growing trend
of data production. As such, studying new ways to index, process and
retrieve medical images becomes an important subject to be addressed by
the wider community of radiologists, scientists and engineers. Content-based
image retrieval, which encompasses various methods, can exploit the visual
information of a medical imaging archive, and is known to be beneficial to
practitioners and researchers. However, the integration of the latest systems
for medical image retrieval into clinical workflows is still rare, and their
effectiveness still shows room for improvement.
This thesis proposes solutions and methods for multimodal information
retrieval, in the context of medical imaging repositories. The major
contributions are a search engine for medical imaging studies supporting
multimodal queries in an extensible archive; a framework for automated
labeling of medical images for content discovery; and an assessment and
proposal of feature learning techniques for concept detection from medical
images, exhibiting greater potential than feature extraction algorithms that
were pertinently used in similar tasks. These contributions, each in their
own dimension, seek to narrow the scientific and technical gap towards
the development and adoption of novel multimodal medical image retrieval
systems, to ultimately become part of the workflows of medical practitioners,
teachers, and researchers in healthcare.
Doctoral Programme in Informatics
The Search as Learning Spaceship: Toward a Comprehensive Model of Psychological and Technological Facets of Search as Learning
Using a Web search engine is one of today’s most frequent activities. Exploratory search activities carried out in order to gain knowledge are conceptualized and denoted as Search as Learning (SAL). In this paper, we introduce a novel framework model that incorporates the perspectives of both psychology and computer science to describe the search-as-learning process by reviewing recent literature. The main entities of the model are the learner, who is surrounded by a specific learning context; the interface, which mediates between the learner and the information environment; the information retrieval (IR) backend, which manages the processes between the interface and the set of Web resources, that is, the collective Web knowledge represented in resources of different modalities. First, we provide an overview of the current state of the art with regard to the five main entities of our model, before we outline areas of future research to improve our understanding of search-as-learning processes.
The Infinite Index: Information Retrieval on Generative Text-To-Image Models
Conditional generative models such as DALL-E and Stable Diffusion generate
images based on a user-defined text, the prompt. Finding and refining prompts
that produce a desired image has become the art of prompt engineering.
Generative models do not provide a built-in retrieval model for a user's
information need expressed through prompts. In light of an extensive literature
review, we reframe prompt engineering for generative models as interactive
text-based retrieval on a novel kind of "infinite index". We apply these
insights for the first time in a case study on image generation for game design
with an expert. Finally, we envision how active learning may help to guide the
retrieval of generated images.
Comment: Final version for CHIIR 202