1,310 research outputs found

    Measuring concept similarities in multimedia ontologies: analysis and evaluations

    Get PDF
    The recent development of large-scale multimedia concept ontologies has provided a new momentum for research in the semantic analysis of multimedia repositories. Different methods for generic concept detection have been extensively studied, but the question of how to exploit the structure of a multimedia ontology and existing inter-concept relations has not received similar attention. In this paper, we present a clustering-based method for modeling semantic concepts on low-level feature spaces and study the evaluation of the quality of such models with entropy-based methods. We cover a variety of methods for assessing the similarity of different concepts in a multimedia ontology. We study three ontologies and apply the proposed techniques in experiments involving the visual and semantic similarities, manual annotation of video, and concept detection. The results show that modeling inter-concept relations can provide a promising resource for many different application areas in semantic multimedia processing

    Content vs. context for multimedia semantics: the case of SenseCam image structuring

    Get PDF
    Much of the current work on determining multimedia semantics from multimedia artifacts is based around using either context, or using content. When leveraged thoroughly these can independently provide content description which is used in building content-based applications. However, there are few cases where multimedia semantics are determined based on an integrated analysis of content and context. In this keynote talk we present one such example system in which we use an integrated combination of the two to automatically structure large collections of images taken by a SenseCam, a device from Microsoft Research which passively records a person’s daily activities. This paper describes the post-processing we perform on SenseCam images in order to present a structured, organised visualisation of the highlights of each of the wearer’s days

    Automatic summarization of rushes video using bipartite graphs

    Get PDF
    In this paper we present a new approach for automatic summarization of rushes, or unstructured video. Our approach is composed of three major steps. First, based on shot and sub-shot segmentations, we filter sub-shots with low information content not likely to be useful in a summary. Second, a method using maximal matching in a bipartite graph is adapted to measure similarity between the remaining shots and to minimize inter-shot redundancy by removing repetitive retake shots common in rushes video. Finally, the presence of faces and motion intensity are characterised in each sub-shot. A measure of how representative the sub-shot is in the context of the overall video is then proposed. Video summaries composed of keyframe slideshows are then generated. In order to evaluate the effectiveness of this approach we re-run the evaluation carried out by TRECVid, using the same dataset and evaluation metrics used in the TRECVid video summarization task in 2007 but with our own assessors. Results show that our approach leads to a significant improvement on our own work in terms of the fraction of the TRECVid summary ground truth included and is competitive with the best of other approaches in TRECVid 2007

    Synchronous collaborative information retrieval: techniques and evaluation

    Get PDF
    Synchronous Collaborative Information Retrieval refers to systems that support multiple users searching together at the same time in order to satisfy a shared information need. To date most SCIR systems have focussed on providing various awareness tools in order to enable collaborating users to coordinate the search task. However, requiring users to both search and coordinate the group activity may prove too demanding. On the other hand without effective coordination policies the group search may not be effective. In this paper we propose and evaluate novel system-mediated techniques for coordinating a group search. These techniques allow for an effective division of labour across the group whereby each group member can explore a subset of the search space.We also propose and evaluate techniques to support automated sharing of knowledge across searchers in SCIR, through novel collaborative and complementary relevance feedback techniques. In order to evaluate these techniques, we propose a framework for SCIR evaluation based on simulations. To populate these simulations we extract data from TREC interactive search logs. This work represent the first simulations of SCIR to date and the first such use of this TREC data

    Visitors to the West Coast of New Zealand's South Island : attractions, satisfations, and senses of place

    Get PDF
    Visitors to the West Coast of the South Island were interviewed (using a survey questionnaire and in-depth interview techniques) in order to determine their satisfactions with the attractions which they had visited. The results of this study indicate that country of origin, stage in family life cycle, and the level of their prior-experience with the area all influence the activities visitors choose, and from this their satisfactions. These satisfactions were largely based around the natural and historic features of the Coast. In-depth interviews with visitors to the Coast revealed a number of senses of place which were held by those visitors. The strongest of these were based around the natural and historic features of the area. Analysis of these senses of place pointed to the possibility that the psychological, aesthetic, and self-actualisation needs of visitors were being met by the experience, thus giving rise to feelings of satisfaction. Finally, the study recommended that the provision of access, careful development, retention, preservation, and signposting of natural and historic features of the Coast should be afforded a high priority if current levels of visitor satisfaction are to be maintained or increased

    Overview of VideoCLEF 2009: New perspectives on speech-based multimedia content enrichment

    Get PDF
    VideoCLEF 2009 offered three tasks related to enriching video content for improved multimedia access in a multilingual environment. For each task, video data (Dutch-language television, predominantly documentaries) accompanied by speech recognition transcripts were provided. The Subject Classification Task involved automatic tagging of videos with subject theme labels. The best performance was achieved by approaching subject tagging as an information retrieval task and using both speech recognition transcripts and archival metadata. Alternatively, classifiers were trained using either the training data provided or data collected from Wikipedia or via general Web search. The Affect Task involved detecting narrative peaks, defined as points where viewers perceive heightened dramatic tension. The task was carried out on the “Beeldenstorm” collection containing 45 short-form documentaries on the visual arts. The best runs exploited affective vocabulary and audience directed speech. Other approaches included using topic changes, elevated speaking pitch, increased speaking intensity and radical visual changes. The Linking Task, also called “Finding Related Resources Across Languages,” involved linking video to material on the same subject in a different language. Participants were provided with a list of multimedia anchors (short video segments) in the Dutch-language “Beeldenstorm” collection and were expected to return target pages drawn from English-language Wikipedia. The best performing methods used the transcript of the speech spoken during the multimedia anchor to build a query to search an index of the Dutch language Wikipedia. The Dutch Wikipedia pages returned were used to identify related English pages. Participants also experimented with pseudo-relevance feedback, query translation and methods that targeted proper names

    The Físchlár digital video system: a digital library of broadcast TV programmes

    Get PDF
    Físchlár is a system for recording, indexing, browsing and playback of broadcast TV programmes which has been operational on our University campus for almost 18 months. In this paper we give a brief overview of how the system operates, how TV programmes are organised for browse/playback and a short report on the system usage by over 900 users in our University

    Chinese Character Decomposition for Neural MT with Multi-Word Expressions

    Get PDF
    Chinese character decomposition has been used as a feature to enhance Machine Translation (MT) models, combining radicals into character and word level models. Recent work has investigated ideograph or stroke level embedding. However, questions remain about different decomposition levels of Chinese character representations, radical and strokes, best suited for MT. To investigate the impact of Chinese decomposition embedding in detail, i.e., radical, stroke, and intermediate levels, and how well these decompositions represent the meaning of the original character sequences, we carry out analysis with both automated and human evaluation of MT. Furthermore, we investigate if the combination of decomposed Multiword Expressions (MWEs) can enhance the model learning. MWE integration into MT has seen more than a decade of exploration. However, decomposed MWEs has not previously been explored.Comment: Accepted to publish in NoDaLiDa202

    The TREC2001 video track: information retrieval on digital video information

    Get PDF
    The development of techniques to support content-based access to archives of digital video information has recently started to receive much attention from the research community. During 2001, the annual TREC activity, which has been benchmarking the performance of information retrieval techniques on a range of media for 10 years, included a ”track“ or activity which allowed investigation into approaches to support searching through a video library. This paper is not intended to provide a comprehensive picture of the different approaches taken by the TREC2001 video track participants but instead we give an overview of the TREC video search task and a thumbnail sketch of the approaches taken by different groups. The reason for writing this paper is to highlight the message from the TREC video track that there are now a variety of approaches available for searching and browsing through digital video archives, that these approaches do work, are scalable to larger archives and can yield useful retrieval performance for users. This has important implications in making digital libraries of video information attainable

    An evaluation of alternative techniques for automatic detection of shot boundaries in digital video

    Get PDF
    The application of image processing techniques to achieve substantial compression in digital video is one of the reasons why computer-supported video processing and digital TV are now becoming commonplace. The encoding formats used for video, such as the MPEG family of standards, have been developed primarily to achieve high compression rates, but now that this has been achieved, effort is being concentrated on other, content-based activities. MPEG-7, for example is a standard intended to support such developments. In the work described here, we are developing and deploying techniques to support content-based navigation and browsing through digital video (broadcast TV) archives. Fundamental to this is being able to automatically structure video into shots and scenes. In this paper we report our progress on developing a variety of approaches to automatic shot boundary detection in MPEG-1 video, and their evaluation on a large test suite of 8 hours of broadcast TV. Our work to date indicates that different techniques work well for different shot transition types and that a combination of techniques may yield the most accurate segmentation
    corecore