
    Colour appearance descriptors for image browsing and retrieval

    In this paper, we focus on the development of whole-scene colour appearance descriptors for classification in browsing applications. The descriptors classify a whole-scene image into semantically based categories of colour appearance. Colour appearance is an important feature that has been used extensively in image analysis, retrieval and classification. Using pre-existing global CIELAB colour histograms, we first develop metrics for whole-scene colour appearance: “colour strength”, “high/low lightness” and “multicoloured”. We then propose methods that use these metrics, either alone or in combination, to classify whole-scene images into five categories of appearance: strong, pastel, dark, pale and multicoloured. Experiments show positive results and demonstrate that the global colour histogram can indeed be used for whole-scene colour appearance classification. We have also conducted a small-scale human evaluation of whole-scene colour appearance; with suitable threshold settings, the proposed methods classify the whole-scene colour appearance of images close to human judgement. The descriptors were tested on thousands of images from a range of sources: paintings, natural scenes, objects, photographs and documents. The colour appearance classifications are being integrated into an image browsing system, where they can also be used to refine browsing.
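    The abstract does not give the exact metric definitions, so the following Python sketch is only an illustrative reading of the approach: it assumes “colour strength” behaves like a histogram-weighted mean chroma, lightness like a weighted mean L*, and “multicoloured” like the number of hue sectors carrying noticeable mass; the thresholds and the decision order are placeholders, not the paper's values.

```python
import numpy as np

def classify_appearance(hist, l_centres, a_centres, b_centres,
                        chroma_hi=40.0, chroma_lo=20.0,
                        light_hi=70.0, light_lo=35.0,
                        hue_bins=12, hue_spread=6):
    """Classify a whole-scene image from its global CIELAB histogram.

    hist: 3-D array of normalised bin counts over (L*, a*, b*).
    l_centres, a_centres, b_centres: 1-D arrays of bin centres.
    All thresholds are illustrative placeholders.
    """
    L, A, B = np.meshgrid(l_centres, a_centres, b_centres, indexing="ij")
    w = hist / hist.sum()

    chroma = np.sqrt(A**2 + B**2)            # C*_ab of each bin centre
    strength = float((w * chroma).sum())     # "colour strength" ~ mean chroma
    lightness = float((w * L).sum())         # "high/low lightness" ~ mean L*

    # "multicoloured": how many coarse hue sectors carry noticeable mass
    hue = (np.degrees(np.arctan2(B, A)) + 360.0) % 360.0
    sector = (hue / (360.0 / hue_bins)).astype(int)
    chromatic = chroma > chroma_lo           # ignore near-neutral bins
    sector_mass = np.bincount(sector[chromatic],
                              weights=w[chromatic],
                              minlength=hue_bins)
    n_hues = int((sector_mass > 0.05).sum())

    if n_hues >= hue_spread:
        return "multicoloured"
    if strength >= chroma_hi:
        return "strong"
    if lightness <= light_lo:
        return "dark"
    if lightness >= light_hi and strength < chroma_lo:
        return "pale"
    return "pastel"
```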

    Indexing, browsing and searching of digital video

    Video is a communications medium that normally brings together moving pictures and a synchronised audio track into a discrete piece or pieces of information. The size of a “piece” of video can variously be referred to as a frame, a shot, a scene, a clip, a programme or an episode, and these are distinguished by their lengths and by their composition. We shall return to the definition of each of these in section 4 of this chapter. In modern society, video is ver…

    TRECVID 2004 experiments in Dublin City University

    In this paper, we describe our experiments for the TRECVID 2004 Search task. For the interactive search task, we developed two versions of a video search/browse system based on the Físchlár Digital Video System: one with text- and image-based searching (System A) and one with image-based searching only (System B). These two systems produced eight interactive runs. In addition, we submitted ten fully automatic supplemental runs and two manual runs.
    A.1, Submitted runs:
    • DCUTREC13a_{1,3,5,7} for System A: four interactive runs based on text and image evidence.
    • DCUTREC13b_{2,4,6,8} for System B: four interactive runs based on image evidence alone.
    • DCUTV2004_9: a manual run based on filtering faces from an underlying text search engine for certain queries.
    • DCUTV2004_10: a manual run based on manually generated queries processed automatically.
    • DCU_AUTOLM{1,2,3,4,5,6,7}: seven fully automatic runs based on language models operating over ASR text transcripts and visual features.
    • DCUauto_{01,02,03}: three fully automatic runs exploring the benefits of multiple sources of text evidence and automatic query expansion.
    A.2, The interactive experiment confirmed that text- and image-based retrieval outperforms an image-only system. In the fully automatic runs DCUauto_{01,02,03}, we found that integrating ASR, CC and OCR text into the text ranking outperforms using ASR text alone. Furthermore, applying automatic query expansion to the initial results of ASR, CC and OCR text further increases performance (MAP), though not at high rank positions. For the language-model-based fully automatic runs, DCU_AUTOLM{1,2,3,4,5,6,7}, we found that interpolated language models perform marginally better than the other language models tested, and that combining image and textual (ASR) evidence marginally increases performance (MAP) over textual models alone. For our two manual runs, we found that employing a face filter reduced MAP compared with textual evidence alone, and that manually generated textual queries improved MAP over fully automatic runs, though the improvement was marginal.
    A.3, Our conclusions from the fully automatic text-based runs are that integrating ASR, CC and OCR text into the retrieval mechanism boosts retrieval performance over ASR alone, and that a text-only language modelling approach such as DCU_AUTOLM1 outperforms our best conventional text search system. From the interactive runs we conclude that textual evidence is an important lever for locating relevant content quickly, but that image evidence, when used by experienced users, can aid retrieval performance.
    A.4, We learned that incorporating multiple text sources improves over ASR alone and that a language-model approach which integrates shot text, neighbouring shots and entire video contents provides even better retrieval performance. These findings will influence how we integrate textual evidence into future video IR systems. We also found that a system based on image evidence alone can perform reasonably and, given good query images, can aid retrieval performance.
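    The abstract mentions interpolated language models that combine shot text, neighbouring shots and entire video contents. The sketch below shows one plausible unigram formulation of such a mixture; the function name, the mixture weights and the back-off to a collection model are assumptions for illustration, not the configuration of the DCU_AUTOLM runs.

```python
import math
from collections import Counter

def lm_shot_score(query_terms, shot_text, neighbour_text, video_text,
                  collection_counts, collection_size,
                  lambdas=(0.5, 0.2, 0.2, 0.1)):
    """Score one shot for a query with an interpolated unigram language model.

    shot_text, neighbour_text, video_text: token lists for the shot, its
    neighbouring shots and the whole video. The mixture weights `lambdas`
    are illustrative placeholders.
    """
    sources = [Counter(shot_text), Counter(neighbour_text), Counter(video_text)]
    sizes = [max(sum(c.values()), 1) for c in sources]

    score = 0.0
    for t in query_terms:
        p = 0.0
        for lam, counts, size in zip(lambdas[:3], sources, sizes):
            p += lam * counts[t] / size
        # collection statistics act as a smoothing back-off
        p += lambdas[3] * collection_counts.get(t, 0) / max(collection_size, 1)
        if p > 0.0:
            score += math.log(p)
        else:
            return float("-inf")   # query term unseen everywhere
    return score
```
    Shots would then be ranked for each query by this score in descending order, and MAP computed over the ranked lists as in the standard TRECVID evaluation.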

    Towards automatic extraction of expressive elements from motion pictures : tempo

    This paper proposes a computational approach to extracting expressive elements of motion pictures in order to derive high-level semantics of the stories portrayed, thus enabling better video annotation and interpretation systems. The approach is motivated and directed by the existing cinematic conventions known as film grammar and, as a first step towards demonstrating its effectiveness, uses the attributes of motion and shot length to define and compute a novel measure of the tempo of a movie. Tempo flow plots are defined and derived for four full-length movies, and edge analysis is performed, leading to the extraction of dramatic story sections and events signalled by their distinctive tempo. The results confirm tempo as a useful attribute in its own right and a promising component of semantic constructs such as the tone or mood of a film.
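    The abstract describes tempo in terms of motion and shot length without giving the formula, so the sketch below is only a hedged approximation of the idea: per-shot tempo rises with normalised motion and falls with normalised shot length, the resulting tempo flow is smoothed, and large jumps in it are flagged as candidate dramatic boundaries. The weighting, smoothing window and edge threshold are illustrative assumptions, not the paper's definitions.

```python
import numpy as np

def tempo_flow(motion, shot_length, alpha=0.5, smooth=5):
    """Per-shot tempo estimate from motion magnitude and shot length.

    motion: average motion magnitude of each shot.
    shot_length: duration of each shot in frames or seconds.
    Fast cutting (short shots) and high motion both raise the tempo value.
    """
    m = np.asarray(motion, dtype=float)
    s = np.asarray(shot_length, dtype=float)

    def z(x):  # per-film normalisation
        return (x - x.mean()) / (x.std() + 1e-9)

    raw = alpha * z(m) - (1.0 - alpha) * z(s)
    kernel = np.ones(smooth) / smooth
    return np.convolve(raw, kernel, mode="same")   # smoothed tempo flow plot

def tempo_edges(tempo, threshold=0.5):
    """Indices where the tempo flow changes sharply: candidate story boundaries."""
    return np.where(np.abs(np.diff(tempo)) > threshold)[0]
```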

    TRECVID 2003 - an overview


    A Study of the Role of Visual Information in Supporting Ideation in Graphic Design

    Existing computer technologies poorly support the ideation phase common to graphic design practice. Finding and indexing visual material to assist ideation often falls to the designer, leading to user experiences that are less than ideal. To inform the development of computer systems that assist graphic designers in the ideation phase of the design process, we conducted interviews with 15 professional graphic designers about their design process and visual information needs. Based on the study, we propose a set of requirements for an ideation-support system for graphic design.

    Tools for modelling support and construction of optimization applications

    We argue the case for an open-systems approach to modelling and application support. We discuss how the analysis of 'usability' and 'skills' naturally leads to a viable strategy for integrating application construction with modelling tools and optimizers. The role of the implementation environment is also seen to be critical, in that it is retained as a building block within the resulting system.

    Deliberate Multimodality of the Newspaper Text (Front Pages – Case Study)

    The focus of the present study is the notion that texts are multimodal and that language is realised through various semiotic modes. Kress and van Leeuwen's theory is tested by comparing two front pages of the British daily The Times with two front pages of the Bulgarian daily Trud.