
    TRECVID: benchmarking the effectiveness of information retrieval tasks on digital video

    Many research groups worldwide are now investigating techniques which can support information retrieval on archives of digital video, and as groups move on to implement these techniques they inevitably try to evaluate their performance in practical situations. The difficulty with doing this is that there is no test collection or environment in which the effectiveness of video IR, or of video IR sub-tasks, can be evaluated and compared. The annual series of TREC exercises has, for over a decade, been benchmarking the effectiveness of systems in carrying out various information retrieval tasks on text and audio, and has contributed to a huge improvement in many of these. Two years ago, a track was introduced which covers shot boundary detection, feature extraction and searching through archives of digital video. In this paper we present a summary of the activities in the TREC Video track in 2002, in which 17 teams from across the world took part.

    Sensor nets discover search

    In the world of information discovery there are several major trends emerging. These include the fact that the nature of search itself is changing, because our information needs are becoming more complex and the volume of data is increasing. Other trends are that information is increasingly being aggregated, and that search is becoming information discovery. In this presentation I address a different kind of information source to the media, scientific, leisure and entertainment information we usually consume, one whose availability is now upon us: data gathered from sensors. This covers both the physical sensors around us which monitor our environment, our wellbeing and our activities, and the online sensors which monitor and track things happening elsewhere in the world and to which we have access. These sensor information sources are noisy, error-prone, unpredictable and dynamic, exactly like both our real and our virtual worlds. Several wide-ranging sensor web applications are used to demonstrate the importance of event processing in managing information discovery from the sensor web.

    Managing millions of SenseCam images, events are key


    So what can we actually do with content-based video retrieval?

    In this talk I will give a roller-coaster survey of the state of the art in automatic video analysis, indexing, summarisation, search and browsing, as demonstrated in the annual TRECVid benchmarking evaluation campaign. I will concentrate on content-based techniques for video management, which form a complement to the dominant paradigm of metadata- or tag-based video management, and I will use example techniques to illustrate them.

    The Físchlár digital library: networked access to a video archive of TV news

    This paper presents an overview of the Físchlár digital library, a collection of over 300 hours of broadcast TV content which has been indexed to allow searching, browsing and playback of video. The system is in daily use by over 1,500 users on our University campus and is used for teaching and learning, for research, and for entertainment. It is shortly to be made available to University libraries elsewhere in Ireland. The infrastructure we use is a Gigabit Ethernet backbone and a conventional web browser for searching and browsing video content, with a browser plug-in for streaming video. As well as providing an overview of the system, the paper concentrates on the complementary navigation techniques of browsing and searching which are supported within Físchlár.

    Content-based access to digital video: the Físchlár system and the TREC video track

    This short paper presents an overview of the Físchlár system, an operational digital library of several hundred hours of video content at Dublin City University which is used by over 1,000 users daily for a variety of applications. The paper describes how Físchlár operates and the services it provides for users. The second part of the paper then outlines the TREC Video Retrieval track, a benchmarking exercise for information retrieval from video content, and summarises how that exercise currently operates.

    A comparison of score, rank and probability-based fusion methods for video shot retrieval

    It is now accepted that the most effective video shot retrieval is based on indexing and retrieving clips using multiple, parallel modalities such as text-matching, image-matching and feature-matching, and then combining or fusing these parallel retrieval streams in some way. In this paper we investigate a range of fusion methods for combining multiple visual features (colour, edge and texture), for combining multiple visual examples in the query, and for combining multiple modalities (text and visual). Using three TRECVid collections and the TRECVid search task, we specifically compare fusion methods based on normalised score and rank that use either the average, weighted average or maximum of retrieval results from a discrete Jelinek-Mercer smoothed language model. We also compare these results with a simple probability-based combination of the language model results that assumes all features and visual examples are fully independent.
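    The score-based fusion variants mentioned above (average, weighted average, maximum of normalised scores) can be sketched as follows. This is a minimal illustrative implementation, not the paper's own code: the function names, the min-max normalisation choice and the toy scores are all assumptions for the sake of the example.

    ```python
    def minmax_normalise(scores):
        """Min-max normalise a {doc: score} dict into [0, 1]."""
        lo, hi = min(scores.values()), max(scores.values())
        if hi == lo:
            return {d: 0.0 for d in scores}
        return {d: (s - lo) / (hi - lo) for d, s in scores.items()}

    def fuse(streams, method="average", weights=None):
        """Combine several retrieval streams into one ranking.

        streams : list of {doc: score} dicts (one per feature/modality)
        method  : 'average', 'weighted' or 'max'
        """
        streams = [minmax_normalise(s) for s in streams]
        if weights is None:
            weights = [1.0] * len(streams)
        docs = set().union(*streams)  # union of all retrieved doc ids
        fused = {}
        for d in docs:
            vals = [(s.get(d, 0.0), w) for s, w in zip(streams, weights)]
            if method == "max":
                fused[d] = max(v for v, _ in vals)
            elif method == "weighted":
                fused[d] = sum(v * w for v, w in vals) / sum(weights)
            else:  # unweighted average
                fused[d] = sum(v for v, _ in vals) / len(vals)
        return sorted(fused, key=fused.get, reverse=True)

    # Toy example: two feature streams scoring three shots.
    colour = {"shot1": 3.2, "shot2": 1.1, "shot3": 2.0}
    edge = {"shot1": 0.4, "shot2": 0.9, "shot3": 0.1}
    ranking = fuse([colour, edge], method="average")
    ```

    Normalising each stream before fusing matters because raw scores from different features (or a language model) live on incompatible scales; the rank-based variants in the paper sidestep this by fusing positions rather than scores.
    
    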

    Using Graphics Processor Units (GPUs) for automatic video structuring

    The rapid pace of development of Graphics Processor Units (GPUs) in recent years in terms of performance and programmability has attracted the attention of those seeking to leverage alternative architectures for better performance than commodity CPUs can provide. In this paper, the potential of the GPU in automatically structuring video is examined, specifically in shot boundary detection and representative keyframe selection techniques. We first introduce the programming model of the GPU and outline the implementation of techniques for shot boundary detection and representative keyframe selection on both the CPU and GPU, using histogram comparisons. We compare the approaches and present performance results for both the CPU and GPU. Overall these results demonstrate the significant potential of the GPU in this domain.
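    The histogram-comparison approach to shot boundary detection mentioned above can be sketched as a plain CPU reference implementation. This is an illustrative sketch only, assuming simple grey-level frames, L1 histogram distance and a fixed threshold; the paper's actual CPU and GPU implementations and parameter choices are not reproduced here.

    ```python
    def frame_histogram(frame, bins=16):
        """Flatten a frame (nested lists of 0-255 intensities) into a
        normalised intensity histogram."""
        counts = [0] * bins
        pixels = [p for row in frame for p in row]
        for p in pixels:
            counts[min(p * bins // 256, bins - 1)] += 1
        total = float(len(pixels))
        return [c / total for c in counts]

    def detect_shot_boundaries(frames, threshold=0.5):
        """Flag a boundary wherever the L1 distance between consecutive
        frame histograms exceeds the threshold."""
        boundaries = []
        prev = frame_histogram(frames[0])
        for i in range(1, len(frames)):
            cur = frame_histogram(frames[i])
            if sum(abs(a - b) for a, b in zip(cur, prev)) > threshold:
                boundaries.append(i)
            prev = cur
        return boundaries

    # Toy example: five dark frames followed by five bright frames,
    # so the only boundary falls at frame index 5.
    dark = [[[0] * 8 for _ in range(8)]] * 5
    light = [[[200] * 8 for _ in range(8)]] * 5
    cuts = detect_shot_boundaries(dark + light)  # → [5]
    ```

    The per-frame histograms are embarrassingly parallel, which is what makes this kind of structuring task a natural fit for GPU acceleration.
    
    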

    Biometric responses to music-rich segments in films: the CDVPlex

    Summarising or generating trailers for films or movies involves finding the highlights within those films, those segments where we become most afraid, happy, sad, annoyed, excited, etc. In this paper we explore three questions related to automatic detection of film highlights by measuring the physiological responses of viewers of those films: firstly, whether emotional highlights can be detected through viewer biometrics; secondly, whether individuals watching a film in a group experience similar emotional reactions to others in the group; and thirdly, whether the presence of music in a film correlates with the occurrence of emotional highlights. We analyse the results of an experiment known as the CDVPlex, where we monitored and recorded physiological reactions from people as they viewed films in a controlled cinema-like environment. A selection of films was manually annotated for the locations of their emotive content. We then studied the physiological peaks identified among participants while viewing the same film and how these correlated with emotion tags and with music. We conclude that these are highly correlated and that music-rich segments of a film do act as a catalyst in stimulating viewer response, though we do not know exactly which emotions the viewers were experiencing. The results of this work could affect the way in which we index movie content on PVRs, for example, attaching special significance to movie segments which are most likely to be highlights.

    Clustering-based analysis of semantic concept models for video shots

    In this paper we present a clustering-based method for representing semantic concepts on multimodal low-level feature spaces and study the evaluation of the goodness of such models with entropy-based methods. As different semantic concepts in video are most accurately represented with different features and modalities, we utilise the relative model-wise confidence values of the feature extraction techniques in weighting them automatically. The method also provides a natural way of measuring the similarity of different concepts in a multimedia lexicon. The experiments of the paper are conducted using the development set of the TRECVID 2005 corpus together with a common annotation for 39 semantic concepts.
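    One common entropy-based way to score how well a clustering separates a concept is the weighted average entropy of the concept labels within each cluster; lower is better. The sketch below illustrates that general idea only and is not the paper's specific evaluation method; the function name and toy data are assumptions.

    ```python
    from collections import Counter
    from math import log2

    def cluster_entropy(assignments, labels):
        """Weighted average entropy of the label distribution inside each
        cluster. assignments: cluster id per shot; labels: 1 if the shot
        carries the concept, else 0. Lower values mean the clusters
        separate the concept more cleanly."""
        by_cluster = {}
        for c, y in zip(assignments, labels):
            by_cluster.setdefault(c, []).append(y)
        n = len(labels)
        total = 0.0
        for members in by_cluster.values():
            counts = Counter(members)
            # Entropy of the label distribution within this cluster.
            h = -sum((k / len(members)) * log2(k / len(members))
                     for k in counts.values())
            total += (len(members) / n) * h
        return total

    # Perfectly separated clusters have zero entropy:
    score = cluster_entropy([0, 0, 1, 1], [1, 1, 0, 0])  # → 0.0
    ```

    A maximally mixed clustering (each cluster half positive, half negative) scores 1.0 with binary labels, giving a simple scale for comparing concept models.
    
    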