7,663 research outputs found

    Evaluating a workspace's usefulness for image retrieval

    Image searching is a creative process. We have proposed a novel image retrieval system that supports creative search sessions by allowing the user to organise their search results on a workspace. The workspace’s usefulness is evaluated in a task-oriented and user-centred comparative experiment, involving design professionals and several types of realistic search tasks. In particular, we focus on its effect on task conceptualisation and query formulation. A traditional relevance feedback system serves as a baseline. The results of this study show that the workspace is more useful in terms of both of the above aspects and that the proposed approach leads to a more effective and enjoyable search experience. This paper also highlights the influence of tasks on the users’ search and organisation strategies.

    Automated illustration of multimedia stories

    Submitted in partial fulfillment of the requirements for the degree of Master in Computer Science.
    We have all had the problem of forgetting what we read just a few sentences before. This stems from the problem of attention and is more common among children and the elderly: people feel either bored or distracted by something more interesting. The challenge is how multimedia systems can assist users in reading and remembering stories. One solution is to use pictures to illustrate stories as a means to captivate one’s interest, since a picture either tells a story or makes the viewer imagine one. This thesis researches the problem of automated story illustration as a method to increase readers’ interest and attention. We formulate the hypothesis that an automated multimedia system can help users read a story by stimulating their reading memory with adequate visual illustrations. We propose a framework that tells a story and attempts to capture readers’ attention by providing illustrations that spark their imagination. The framework automatically creates a multimedia presentation of a news story by (1) rendering the news text in a sentence-by-sentence fashion, (2) providing mechanisms to select the best illustration for each sentence, and (3) selecting the set of illustrations that guarantees the best sequence. These mechanisms are rooted in image and text retrieval techniques. To further improve attention, users may also activate a text-to-speech functionality according to their preferences or reading difficulties. First experiments show how Flickr images can illustrate BBC news articles and provide a better experience to news readers. On top of the illustration methods, a user feedback feature was implemented to refine the illustration selection; with this feature, users can help the framework select more accurate results.
    Finally, empirical evaluations were performed to test the user interface, the image/sentence association algorithms, and the user feedback functionality. The respective results are discussed.
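    The per-sentence illustration step described above can be sketched with a toy text-retrieval scorer. This is a minimal illustration only, assuming cosine similarity over word overlap between a sentence and image tags; the function names and sample data are invented here and are not the thesis’ actual implementation.

```python
import re
from collections import Counter
from math import sqrt

def tokenize(text):
    """Lowercase word tokens; a stand-in for real text preprocessing."""
    return re.findall(r"[a-z]+", text.lower())

def cosine(a, b):
    """Cosine similarity between two token lists via term counts."""
    ca, cb = Counter(a), Counter(b)
    num = sum(ca[t] * cb[t] for t in set(ca) & set(cb))
    den = sqrt(sum(v * v for v in ca.values())) * sqrt(sum(v * v for v in cb.values()))
    return num / den if den else 0.0

def illustrate(sentences, images):
    """For each sentence, pick the image whose tags score highest."""
    story = []
    for s in sentences:
        best = max(images, key=lambda img: cosine(tokenize(s), tokenize(img["tags"])))
        story.append((s, best["id"]))
    return story

sentences = ["A storm hit the coastal town.", "Rescue boats searched the harbour."]
images = [
    {"id": "img1", "tags": "storm coast town waves"},
    {"id": "img2", "tags": "rescue boat harbour search"},
]
print(illustrate(sentences, images))
# [('A storm hit the coastal town.', 'img1'), ('Rescue boats searched the harbour.', 'img2')]
```

    A real system would replace the tag-overlap scorer with proper image/text retrieval models, and step (3) of the framework would additionally optimise coherence across the whole illustration sequence rather than scoring sentences independently.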

    The Infinite Index: Information Retrieval on Generative Text-To-Image Models

    Conditional generative models such as DALL-E and Stable Diffusion generate images based on a user-defined text, the prompt. Finding and refining prompts that produce a desired image has become the art of prompt engineering. Generative models do not provide a built-in retrieval model for a user's information need expressed through prompts. In light of an extensive literature review, we reframe prompt engineering for generative models as interactive text-based retrieval on a novel kind of "infinite index". We apply these insights for the first time in a case study on image generation for game design with an expert. Finally, we envision how active learning may help to guide the retrieval of generated images.
    Comment: Final version for CHIIR 202
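    Viewing prompt refinement as interactive retrieval suggests a relevance-feedback-style loop over prompt terms. The sketch below is purely illustrative and assumed, not the retrieval model from the paper: it naively adds terms from captions the user liked and drops terms the user rejected.

```python
def refine_prompt(prompt, liked_captions, disliked_terms):
    """Naive term-level relevance feedback for a text prompt.
    Keeps the base terms, appends unseen terms from liked captions,
    and filters out explicitly disliked terms. Illustrative only."""
    terms = prompt.split()
    for caption in liked_captions:
        for t in caption.split():
            if t not in terms:
                terms.append(t)
    return " ".join(t for t in terms if t not in disliked_terms)

print(refine_prompt("castle at dusk", ["misty castle"], {"dusk"}))
# castle at misty
```

    In a real interactive setting, each refined prompt is a new "query" against the infinite index of generatable images, and user judgments of the generated results drive the next refinement step.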

    Eye of the Beholder: Improved Relation Generalization for Text-based Reinforcement Learning Agents

    Text-based games (TBGs) have become a popular proving ground for the demonstration of learning-based agents that make decisions in quasi real-world settings. The crux of the problem for a reinforcement learning agent in such TBGs is identifying the objects in the world and those objects' relations with that world. While the recent use of text-based resources for increasing an agent's knowledge and improving its generalization has shown promise, we posit in this paper that there is much yet to be learned from visual representations of these same worlds. Specifically, we propose to retrieve images that represent specific instances of text observations from the world and train our agents on such images. This improves the agent's overall understanding of the game 'scene' and of objects' relationships to the world around them, and the variety of visual representations on offer allows the agent to generalize relationships better. We show that incorporating such images improves the performance of agents in various TBG settings.
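    The core retrieval step here, matching a text observation to a representative image, can be sketched as a nearest-neighbour lookup over embeddings. The toy vectors and function below are assumptions for illustration; in practice a joint text-image model would produce the embeddings, and the retrieved images would feed the agent's training.

```python
import math

def nearest_image(obs_vec, image_db):
    """Return the id of the image whose embedding is closest (by cosine
    similarity) to the text-observation embedding. Toy sketch only."""
    def cos(a, b):
        num = sum(x * y for x, y in zip(a, b))
        den = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
        return num / den if den else 0.0
    return max(image_db, key=lambda img_id: cos(obs_vec, image_db[img_id]))

# Hypothetical 2-d embeddings for two cached scene images.
db = {"kitchen": [1.0, 0.1], "garden": [0.1, 1.0]}
print(nearest_image([0.9, 0.2], db))
# kitchen
```

    The agent would then condition on (or be trained against) the retrieved image's features alongside the text observation, which is what gives it the extra visual grounding the abstract describes.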

    CHORUS Deliverable 3.3: Vision Document - Intermediate version

    The goal of the CHORUS vision document is to create a high-level vision on audio-visual search engines in order to give guidance to the future R&D work in this area (in line with the mandate of CHORUS as a Coordination Action). This current intermediate draft of the CHORUS vision document (D3.3) is based on the previous CHORUS vision documents D3.1 to D3.2 and on the results of the six CHORUS Think-Tank meetings held in March, September and November 2007 as well as in April, July and October 2008, and on the feedback from other CHORUS events. The outcome of the six Think-Tank meetings will not just be to the benefit of the participants, which are stakeholders and experts from academia and industry – CHORUS, as a coordination action of the EC, will feed back the findings (see Summary) to the projects under its purview and, via its website, to the whole community working in the domain of AV content search. A few subsections of this deliverable are to be completed after the eighth (and presumably last) Think-Tank meeting in spring 2009.