    K-Space at TRECVid 2007

    In this paper we describe the K-Space participation in TRECVid 2007. K-Space participated in two tasks: high-level feature extraction and interactive search. We present our approaches for each of these activities and provide a brief analysis of our results. Our high-level feature submission utilized multi-modal low-level features, which included visual, audio and temporal elements. Specific concept detectors (such as face detectors) developed by K-Space partners were also used. We experimented with different machine learning approaches, including logistic regression and support vector machines (SVMs). Finally, we experimented with both early and late fusion for feature combination. This year we also participated in interactive search, submitting six runs. We developed two interfaces which both utilized the same retrieval functionality. Our objective was to measure the effect of context, which was supported to different degrees in each interface, on user performance. The first of the two systems was a ‘shot’-based interface, where the results of a query were presented as a ranked list of shots. The second interface was ‘broadcast’-based, where results were presented as a ranked list of broadcasts. Both systems made use of the outputs of our high-level feature submission as well as low-level visual features.
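
    As an illustration of the fusion strategies mentioned above, the following minimal sketch contrasts early fusion (concatenating modality features before training a single classifier) with late fusion (combining per-modality classifier scores). It uses scikit-learn on synthetic data; the feature dimensions, classifier choices and equal score weights are assumptions for illustration, not the submission's actual configuration.

```python
# Sketch: early vs. late fusion for multi-modal concept detection.
# All data and dimensions below are synthetic placeholders.
import numpy as np
from sklearn.svm import SVC
from sklearn.linear_model import LogisticRegression

rng = np.random.default_rng(0)
X_visual = rng.normal(size=(200, 64))  # e.g. visual descriptors per shot
X_audio = rng.normal(size=(200, 32))   # e.g. audio descriptors per shot
y = rng.integers(0, 2, size=200)       # concept present / absent

# Early fusion: concatenate modality features, train a single classifier.
X_early = np.hstack([X_visual, X_audio])
early_clf = SVC(probability=True).fit(X_early, y)
early_scores = early_clf.predict_proba(X_early)[:, 1]

# Late fusion: train one classifier per modality, then combine the scores
# (here a simple unweighted average of posterior probabilities).
vis_clf = LogisticRegression(max_iter=1000).fit(X_visual, y)
aud_clf = LogisticRegression(max_iter=1000).fit(X_audio, y)
late_scores = 0.5 * vis_clf.predict_proba(X_visual)[:, 1] \
            + 0.5 * aud_clf.predict_proba(X_audio)[:, 1]
```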

    Automatic detection and extraction of artificial text in video

    A significant challenge in large multimedia databases is the provision of efficient means for semantic indexing and retrieval of visual information. Artificial text in video is normally generated in order to supplement or summarise the visual content and is thus an important carrier of information that is highly relevant to the content of the video. As such, it is a potential ready-to-use source of semantic information. In this paper we present an algorithm for detection and localisation of artificial text in video using a horizontal difference magnitude measure and morphological processing. The result of character segmentation, based on a modified version of the Wolf-Jolion algorithm [1][2], is enhanced using smoothing and multiple binarisation. The output text is input to an “off-the-shelf” non-commercial OCR. Detection, localisation and recognition results for a 20-minute MPEG-1-encoded television programme are presented.
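
    A minimal sketch of the detection stage described above, assuming OpenCV: a horizontal difference magnitude map highlights the dense vertical edges typical of text, and morphological closing merges character strokes into text-line candidates. The thresholding method, kernel size and aspect-ratio filter are illustrative assumptions, not the paper's exact parameters.

```python
# Sketch: artificial-text candidate detection in a video frame.
import cv2
import numpy as np

frame = cv2.imread("frame.png", cv2.IMREAD_GRAYSCALE)  # placeholder input

# Horizontal difference magnitude: text regions show dense vertical edges.
diff = np.abs(np.diff(frame.astype(np.int16), axis=1)).astype(np.uint8)
_, mask = cv2.threshold(diff, 0, 255, cv2.THRESH_BINARY | cv2.THRESH_OTSU)

# Morphological closing with a wide kernel merges strokes along a text line.
kernel = cv2.getStructuringElement(cv2.MORPH_RECT, (15, 3))
closed = cv2.morphologyEx(mask, cv2.MORPH_CLOSE, kernel)

# Connected components with text-like shapes become candidate boxes.
contours, _ = cv2.findContours(closed, cv2.RETR_EXTERNAL,
                               cv2.CHAIN_APPROX_SIMPLE)
boxes = []
for c in contours:
    x, y, w, h = cv2.boundingRect(c)
    if w > 2 * h:  # text lines are typically wider than tall
        boxes.append((x, y, w, h))
```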

    Language-based multimedia information retrieval

    This paper describes various methods and approaches for language-based multimedia information retrieval, which have been developed in the projects POP-EYE and OLIVE and which will be developed further in the MUMIS project. All of these projects aim at supporting automated indexing of video material by use of human language technologies. Thus, in contrast to image- or sound-based retrieval methods, where both the query language and the indexing methods build on non-linguistic data, these methods attempt to exploit advanced text retrieval technologies for the retrieval of non-textual material. While POP-EYE builds on subtitles or captions as the prime language key for disclosing video fragments, OLIVE makes use of speech recognition to automatically derive transcriptions of the soundtracks, generating time-coded linguistic elements which then serve as the basis for text-based retrieval functionality.
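
    To make the OLIVE-style pipeline concrete, here is a minimal sketch of how time-coded speech-recognition output can support text-based retrieval of video fragments: an inverted index maps terms to (video, start, end) entries. The data layout and the `search` helper are hypothetical illustrations, not the projects' actual implementation.

```python
# Sketch: indexing time-coded transcript segments for text-based retrieval.
from collections import defaultdict

# One recognised segment per entry: (video id, start s, end s, transcript).
segments = [
    ("news_01", 12.0, 16.5, "the minister announced new funding"),
    ("news_01", 16.5, 21.0, "funding for public broadcasting"),
]

index = defaultdict(list)
for video, start, end, text in segments:
    for term in text.lower().split():
        index[term].append((video, start, end))

def search(query):
    """Return time-coded fragments whose transcript contains every query term."""
    postings = [set(index.get(t, [])) for t in query.lower().split()]
    return sorted(set.intersection(*postings)) if postings else []

print(search("funding"))  # -> both fragments where the term was spoken
```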

    TRECVid 2005 experiments at Dublin City University

    In this paper we describe our experiments in the automatic and interactive search tasks and the BBC rushes pilot task of TRECVid 2005. Our approach this year is somewhat different from previous submissions in that we have implemented a multi-user search system using a DiamondTouch tabletop device from Mitsubishi Electric Research Labs (MERL). We developed two versions of our system: one with emphasis on efficient completion of the search task (Físchlár-DT Efficiency) and the other with more emphasis on increasing awareness among searchers (Físchlár-DT Awareness). We supplemented these runs with a further two runs, one for each of the two systems, in which we augmented the initial results with results from an automatic run. In addition to these interactive submissions we also submitted three fully automatic runs. We also took part in the BBC rushes pilot task, where we indexed the video by semi-automatic segmentation of objects appearing in the video, and our search/browsing system allows full keyframe and/or object-based searching. In the interactive search experiments we found that the awareness system outperformed the efficiency system. We also found that supplementing the interactive results with results of an automatic run improves both the Mean Average Precision and Recall values for both system variants. Our results suggest that providing awareness cues in a collaborative search setting improves retrieval performance. We also learned that multi-user searching is a viable alternative to the traditional single-searcher paradigm, provided the system is designed to effectively support collaboration.
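
    The abstract does not specify how the interactive results were augmented with the automatic run, but one simple way to realize this kind of supplementation is to let interactively found shots keep their ranks and fill the remaining slots with unseen shots from the automatic list, as in this hypothetical sketch:

```python
# Sketch: supplementing an interactive result list with an automatic run.
# The merge rule here is an assumption, not the authors' documented method.
def supplement(interactive, automatic, depth=1000):
    """Merge two ranked lists of shot ids, interactive results first."""
    merged, seen = [], set()
    for shot in interactive + automatic:
        if shot not in seen:
            merged.append(shot)
            seen.add(shot)
        if len(merged) == depth:
            break
    return merged

print(supplement(["shot12_3", "shot45_1"], ["shot45_1", "shot9_7"]))
# -> ['shot12_3', 'shot45_1', 'shot9_7']
```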