Measuring concept similarities in multimedia ontologies: analysis and evaluations
The recent development of large-scale multimedia concept ontologies has provided new momentum for research in the semantic analysis of multimedia repositories. Different methods for generic concept detection have been extensively studied, but the question of how to exploit the structure of a multimedia ontology and existing inter-concept relations has not received similar attention. In this paper, we present a clustering-based method for modeling semantic concepts on low-level feature spaces and study the evaluation of the quality of such models with entropy-based methods. We cover a variety of methods for assessing the similarity of different concepts in a multimedia ontology. We study three ontologies and apply the proposed techniques in experiments involving visual and semantic similarities, manual annotation of video, and concept detection. The results show that modeling inter-concept relations can provide a promising resource for many different application areas in semantic multimedia processing.
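As a rough illustration of entropy-based evaluation of cluster quality (a toy sketch, not the paper's exact formulation; the concept labels below are invented), one can compute the label entropy of each cluster and average it weighted by cluster size:

```python
from collections import Counter
from math import log2

def cluster_entropy(clusters):
    """Weighted average label entropy over clusters.

    `clusters` is a list of clusters, each a list of concept labels for
    the items assigned to that cluster.  Lower entropy means the
    clustering aligns better with the semantic concepts."""
    total = sum(len(c) for c in clusters)
    score = 0.0
    for c in clusters:
        counts = Counter(c)
        # Entropy of the label distribution within this cluster.
        h = -sum((n / len(c)) * log2(n / len(c)) for n in counts.values())
        score += (len(c) / total) * h
    return score

# A pure cluster contributes 0; a maximally mixed k-label cluster log2(k).
print(cluster_entropy([["sky", "sky"], ["road", "car"]]))  # 0.5
```

A perfectly concept-pure clustering scores 0, so lower is better under this measure.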
Automatic summarization of rushes video using bipartite graphs
In this paper we present a new approach for automatic summarization of rushes, or unstructured video. Our approach is composed of three major steps. First, based on shot and sub-shot segmentations, we filter out sub-shots with low information content that are not likely to be useful in a summary. Second, a method using maximal matching in a bipartite graph is adapted to measure similarity between the remaining shots and to minimize inter-shot redundancy by removing the repetitive retake shots common in rushes video. Finally, the presence of faces and motion intensity are characterised in each sub-shot, and a measure of how representative the sub-shot is in the context of the overall video is proposed. Video summaries composed of keyframe slideshows are then generated. In order to evaluate the effectiveness of this approach we re-ran the evaluation carried out by TRECVid, using the same dataset and evaluation metrics used in the TRECVid video summarization task in 2007, but with our own assessors. Results show that our approach leads to a significant improvement on our own earlier work in terms of the fraction of the TRECVid summary ground truth included, and is competitive with the best of the other approaches in TRECVid 2007.
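The retake-removal step rests on maximum matching in a bipartite graph whose two sides are the keyframes of two shots. A minimal sketch of such a matcher, assuming the similar keyframe pairs have already been identified (Kuhn's augmenting-path algorithm; the edge data below is invented), where the size of the matching serves as a shot-similarity score:

```python
def max_bipartite_matching(adj, n_right):
    """Maximum bipartite matching via augmenting paths.

    `adj[i]` lists the right-side vertices compatible with left vertex i,
    e.g. keyframe pairs whose visual similarity exceeds a threshold.
    Returns the number of matched pairs."""
    match_right = [-1] * n_right  # match_right[v] = left vertex matched to v

    def try_match(u, seen):
        # Try to match u, re-routing earlier matches along augmenting paths.
        for v in adj[u]:
            if v not in seen:
                seen.add(v)
                if match_right[v] == -1 or try_match(match_right[v], seen):
                    match_right[v] = u
                    return True
        return False

    return sum(try_match(u, set()) for u in range(len(adj)))

# Two shots with 3 keyframes each; edges mark visually similar pairs.
print(max_bipartite_matching([[0], [0, 1], [2]], 3))  # 3
```

Two shots whose matching covers most keyframes are likely retakes of the same material, so one of them can be dropped from the summary.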
Synchronous collaborative information retrieval: techniques and evaluation
Synchronous Collaborative Information Retrieval (SCIR) refers to systems that support multiple users searching together at the same time in order to satisfy a shared information need. To date most SCIR systems have focussed on providing various awareness tools in order to enable collaborating users to coordinate the search task. However, requiring users to both search and coordinate the group activity may prove too demanding. On the other hand, without effective coordination policies the group search may not be effective. In this paper we propose and evaluate novel system-mediated techniques for coordinating a group search. These techniques allow for an effective division of labour across the group whereby each group member can explore a subset of the search space. We also propose and evaluate techniques to support automated sharing of knowledge across searchers in SCIR, through novel collaborative and complementary relevance feedback techniques. In order to evaluate these techniques, we propose a framework for SCIR evaluation based on simulations. To populate these simulations we extract data from TREC interactive search logs. This work represents the first simulations of SCIR to date and the first such use of this TREC data.
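One simple system-mediated division-of-labour policy is to split a ranked result list round-robin across the group, so that each member explores a disjoint subset of the search space. The sketch below is purely illustrative and not necessarily the policy evaluated in the paper; the document and searcher names are invented:

```python
def divide_labour(ranked_results, searchers):
    """Round-robin division of a ranked result list among collaborating
    searchers, giving each a disjoint subset of the search space."""
    assignment = {s: [] for s in searchers}
    for rank, doc in enumerate(ranked_results):
        # Interleave by rank so everyone sees some high-ranked documents.
        assignment[searchers[rank % len(searchers)]].append(doc)
    return assignment

print(divide_labour(["d1", "d2", "d3", "d4", "d5"], ["alice", "bob"]))
# {'alice': ['d1', 'd3', 'd5'], 'bob': ['d2', 'd4']}
```

Interleaving by rank, rather than splitting the list into contiguous halves, keeps the expected relevance of each searcher's subset roughly balanced.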
Content vs. context for multimedia semantics: the case of SenseCam image structuring
Much of the current work on determining multimedia semantics from multimedia artifacts is based on using either context or content. When leveraged thoroughly, each can independently provide the content descriptions used in building content-based applications. However, there are few cases where multimedia semantics are determined from an integrated analysis of content and context. In this keynote talk we present one such example system, in which we use an integrated combination of the two to automatically structure large collections of images taken by a SenseCam, a device from Microsoft Research which passively records a person’s daily activities. This paper describes the post-processing we perform on SenseCam images in order to present a structured, organised visualisation of the highlights of each of the wearer’s days.
The Físchlár digital video system: a digital library of broadcast TV programmes
Físchlár is a system for recording, indexing, browsing and playback of broadcast TV programmes, which has been operational on our University campus for almost 18 months. In this paper we give a brief overview of how the system operates, how TV programmes are organised for browse/playback, and a short report on the system usage by over 900 users in our University.
The TREC2001 video track: information retrieval on digital video information
The development of techniques to support content-based access to archives of digital video information has recently started to receive much attention from the research community. During 2001, the annual TREC activity, which has been benchmarking the performance of information retrieval techniques on a range of media for 10 years, included a “track” or activity which allowed investigation into approaches to support searching through a video library. This paper is not intended to provide a comprehensive picture of the different approaches taken by the TREC2001 video track participants but instead we give an overview of the TREC video search task and a thumbnail sketch of the approaches taken by different groups. The reason for writing this paper is to highlight the message from the TREC video track that there are now a variety of approaches available for searching and browsing through digital video archives, that these approaches do work, are scalable to larger archives and can yield useful retrieval performance for users. This has important implications in making digital libraries of video information attainable.
An evaluation of alternative techniques for automatic detection of shot boundaries in digital video
The application of image processing techniques to achieve substantial compression in digital video is one of the reasons why computer-supported video processing and digital TV are now becoming commonplace. The encoding formats used for video, such as the MPEG family of standards, have been developed primarily to achieve high compression rates, but now that this has been achieved, effort is being concentrated on other, content-based activities. MPEG-7, for example, is a standard intended to support such developments. In the work described here, we are developing and deploying techniques to support content-based navigation and browsing through digital video (broadcast TV) archives. Fundamental to this is being able to automatically structure video into shots and scenes. In this paper we report our progress on developing a variety of approaches to automatic shot boundary detection in MPEG-1 video, and their evaluation on a large test suite of 8 hours of broadcast TV. Our work to date indicates that different techniques work well for different shot transition types and that a combination of techniques may yield the most accurate segmentation.
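One of the simpler shot boundary detection techniques compares the colour histograms of adjacent frames and flags a hard cut when the difference is large. A minimal sketch (the threshold and histogram values below are invented; gradual transitions such as dissolves need additional handling):

```python
def shot_boundaries(histograms, threshold=0.35):
    """Detect hard cuts from per-frame normalised colour histograms.

    Flags frame i as a boundary when the normalised L1 distance between
    the histograms of frames i-1 and i exceeds `threshold`."""
    cuts = []
    for i in range(1, len(histograms)):
        prev, cur = histograms[i - 1], histograms[i]
        # Halved L1 distance: 0 for identical histograms, 1 for disjoint.
        dist = sum(abs(a - b) for a, b in zip(prev, cur)) / 2.0
        if dist > threshold:
            cuts.append(i)
    return cuts

# Normalised 4-bin histograms for 4 frames: one cut, between frames 1 and 2.
frames = [[0.7, 0.3, 0, 0], [0.68, 0.32, 0, 0],
          [0, 0, 0.5, 0.5], [0, 0.1, 0.45, 0.45]]
print(shot_boundaries(frames))  # [2]
```

Thresholded histogram differencing is exactly the kind of single technique whose failure modes (e.g. on dissolves and flashes) motivate combining several detectors.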
Overview of VideoCLEF 2009: New perspectives on speech-based multimedia content enrichment
VideoCLEF 2009 offered three tasks related to enriching video content for improved multimedia access in a multilingual environment. For each task, video data (Dutch-language television, predominantly documentaries) accompanied by speech recognition transcripts were provided.
The Subject Classification Task involved automatic tagging of videos with subject theme labels. The best performance was achieved by approaching subject tagging as an information retrieval task and using both speech recognition transcripts and archival metadata. Alternatively, classifiers were trained using either the training data provided or data collected from Wikipedia or via general Web search. The Affect Task involved detecting narrative peaks, defined as points where viewers perceive heightened dramatic tension. The task was carried out on the “Beeldenstorm” collection containing 45 short-form documentaries on the visual arts. The best runs exploited affective vocabulary and audience-directed speech. Other approaches included using topic changes, elevated speaking pitch, increased speaking intensity and radical visual changes. The Linking Task, also called “Finding Related Resources Across Languages,” involved linking video to material on the same subject in a different language.
Participants were provided with a list of multimedia anchors (short video segments) in the Dutch-language “Beeldenstorm” collection and were expected to return target pages drawn from English-language Wikipedia. The best-performing methods used the transcript of the speech spoken during the multimedia anchor to build a query to search an index of the Dutch-language Wikipedia. The Dutch Wikipedia pages returned were used to identify related English pages. Participants also experimented with pseudo-relevance feedback, query translation and methods that targeted proper names.
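Casting subject tagging as retrieval can be sketched as follows: score each labelled training transcript by word overlap with the new transcript and adopt the best match's label. This is a toy stand-in for the weighted retrieval engines participants actually used; the transcripts and labels below are invented:

```python
from collections import Counter

def tag_by_retrieval(transcript, labelled_transcripts):
    """Assign a subject label to `transcript` by retrieving the most
    similar labelled training transcript (bag-of-words overlap score)."""
    query = Counter(transcript.lower().split())

    def overlap(text):
        # Number of word occurrences shared with the query transcript.
        doc = Counter(text.lower().split())
        return sum(min(n, query[w]) for w, n in doc.items())

    best_text, best_label = max(labelled_transcripts,
                                key=lambda item: overlap(item[0]))
    return best_label

docs = [("the painter used oil on canvas", "visual arts"),
        ("the match ended two nil", "sport")]
print(tag_by_retrieval("a canvas painter at work", docs))  # visual arts
```

A production system would weight terms (e.g. TF-IDF) and back off to archival metadata, as the best submitted runs did.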
Substantial stores of sedimentary carbon held in mid-latitude fjords
This work was supported by the Natural Environment Research Council [Grant Number: NE/L501852/1]. Quantifying marine sedimentary carbon stocks is key to improving our understanding of long-term storage of carbon in the coastal ocean and to further constraining the global carbon cycle. Here we present a methodological approach which combines seismic geophysics and geochemical measurements to quantitatively estimate the total stock of carbon held within marine sediment. Through the application of this methodology to Loch Sunart, a fjord on the west coast of Scotland, we have generated the first full sedimentary carbon inventory for a fjordic system. The sediments of Loch Sunart hold 26.9 ± 0.5 Mt of carbon, split between 11.5 ± 0.2 Mt and 15.0 ± 0.4 Mt of organic and inorganic carbon respectively. These new quantitative estimates of carbon stored in coastal sediments are significantly higher than previous estimates. Through an area-normalised comparison to adjacent Scottish peatland carbon stocks we have determined that these mid-latitude fjords are significantly more effective as carbon stores than their terrestrial counterparts. This initial work supports the concept that fjords are important environments for the burial and long-term storage of carbon and therefore should be considered and treated as unique environments within the global carbon cycle.
Designing novel applications for emerging multimedia technology
Current R&D in media technologies such as Multimedia, Semantic Web and Sensor Web technologies is advancing at a fierce rate, and these technologies are sure to become regular items in a 'conventional' technology inventory in the near future. While their R&D nature means their accuracy, reliability and robustness are not yet sufficient for real-world use, we want to envision now the near future in which these technologies will have matured and be used in real applications, in order to explore and start shaping the many possible new ways they could be utilised.
In this talk, some of this effort in designing novel applications that incorporate various media technologies as their backend will be presented. Examples include a novel LifeLogging application that incorporates automatic structuring of millions of photos passively captured by a SenseCam (a wearable digital camera that automatically takes photos triggered by environmental sensors), and an interactive TV application incorporating a number of multimedia tools yet extremely simple and easy to use with a remote control in a lean-back position. The talk will conclude with remarks on how designing novel applications that have no precedent or existing user base requires a somewhat different approach from those suggested and practised in conventional usability engineering methodology.
