Extracting semantic video objects
Dagan Feng, 2000-2001. Academic research: refereed; publication in refereed journal; Version of Record.
The K-Space segmentation tool set
In this paper we describe two applications, created as part of the K-Space Network of Excellence, designed to allow researchers to use and experiment with state-of-the-art methods for spatial segmentation of images and video sequences. The first of these tools is an _Interactive Segmentation Tool_, developed to allow accurate human-guided segmentation of semantic objects from images using different segmentation algorithms. The tool is particularly useful for generating ground-truth segmentations, extracting objects for further processing, and as a general image processing application. The second tool is designed for fully automatic spatial region segmentation of images and video. The tool is web-based; usage only requires a browser.
Both the automatic and interactive segmentation tools have been made available online; we anticipate they will be a valuable resource for other researchers.
Practical Uses of A Semi-automatic Video Object Extraction System
Object-based technology is important for computer vision applications including gesture understanding, image recognition, and augmented reality. However, extracting the shape information of semantic objects from video sequences is a very difficult task, since this information is not explicitly provided within the video data. Therefore, an application for extracting semantic video objects is indispensable for many advanced applications.
An algorithm for a semi-automatic video object extraction system has been developed. The performance measures of the video object extraction system, including evaluation using ground truth and an error metric, are shown, followed by some practical uses of the system.
The principle underlying the semi-automatic object extraction technique is user interaction during some stages of the segmentation process, whereby the semantic information is provided directly by the user. After the user provides the initial segmentation of the semantic video objects, a tracking mechanism follows their temporal transformation in the subsequent frames, thus propagating the semantic information.
Since the tracking tends to introduce boundary errors, the semantic information can be refreshed by the user at certain key frame locations in the video sequence. The tracking mechanism can also operate in either the forward or the backward direction of the video sequence.
The performance analysis of the results is described using single and multiple key frames, Mean Error and “Last_Error”, and forward and backward extraction. To achieve the best performance, results from forward and backward extraction can be merged.
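The evaluation and merging steps described in this abstract can be sketched in Python. This is only an illustration under assumptions the abstract does not confirm: masks are binary grids, the per-frame error is the fraction of pixels that disagree with the ground truth, Mean Error is the average of those errors, Last_Error is the error of the final frame, and merging keeps, for each frame, the result tracked from the nearer key frame. All function names are hypothetical.

```python
from typing import List, Sequence

Mask = List[List[int]]  # binary mask: 1 = object pixel, 0 = background


def frame_error(extracted: Mask, truth: Mask) -> float:
    """Fraction of pixels that differ from the ground truth (assumed metric)."""
    total = 0
    wrong = 0
    for row_e, row_t in zip(extracted, truth):
        for e, t in zip(row_e, row_t):
            total += 1
            wrong += int(e != t)
    return wrong / total if total else 0.0


def mean_error(errors: Sequence[float]) -> float:
    """Average per-frame error over a sequence (assumed reading of Mean Error)."""
    return sum(errors) / len(errors) if errors else 0.0


def last_error(errors: Sequence[float]) -> float:
    """Error of the final frame (assumed reading of Last_Error)."""
    return errors[-1]


def merge_forward_backward(forward: list, backward: list) -> list:
    """Merge two results covering the same frames, in display order.

    forward[i] was tracked from the key frame at the start of the span,
    backward[i] from the key frame at the end. For each frame, keep the
    result whose key frame is nearer (an assumed merging rule).
    """
    n = len(forward)
    merged = []
    for i in range(n):
        dist_to_start = i
        dist_to_end = n - 1 - i
        merged.append(forward[i] if dist_to_start <= dist_to_end else backward[i])
    return merged
```

The nearest-key-frame rule reflects the observation that tracking accumulates boundary errors with distance from the user-provided segmentation; other merging rules are equally plausible.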
Measuring concept similarities in multimedia ontologies: analysis and evaluations
The recent development of large-scale multimedia concept ontologies has provided new momentum for research in the semantic analysis of multimedia repositories. Different methods for generic concept detection have been extensively studied, but the question of how to exploit the structure of a multimedia ontology and existing inter-concept relations has not received similar attention. In this paper, we present a clustering-based method for modeling semantic concepts on low-level feature spaces and study the evaluation of the quality of such models with entropy-based methods. We cover a variety of methods for assessing the similarity of different concepts in a multimedia ontology. We study three ontologies and apply the proposed techniques in experiments involving visual and semantic similarities, manual annotation of video, and concept detection. The results show that modeling inter-concept relations can provide a promising resource for many different application areas in semantic multimedia processing.
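One common way to score a clustering-based concept model with entropy is sketched below. It is an assumption, not the measure used in the paper: each cluster is scored by the Shannon entropy of the concept labels it contains, and the scores are averaged with weights proportional to cluster size. Lower values mean clusters that align more cleanly with single concepts. The function names are hypothetical.

```python
import math
from collections import Counter, defaultdict
from typing import Hashable, Sequence


def label_entropy(labels: Sequence[Hashable]) -> float:
    """Shannon entropy (in bits) of the label distribution in one cluster."""
    n = len(labels)
    if n == 0:
        return 0.0
    counts = Counter(labels)
    return -sum((c / n) * math.log2(c / n) for c in counts.values())


def weighted_cluster_entropy(
    assignments: Sequence[Hashable], concepts: Sequence[Hashable]
) -> float:
    """Size-weighted average of per-cluster label entropy.

    assignments[i] is the cluster id of sample i and concepts[i] is its
    annotated concept label.
    """
    n = len(concepts)
    if n == 0:
        return 0.0
    clusters = defaultdict(list)
    for cluster_id, concept in zip(assignments, concepts):
        clusters[cluster_id].append(concept)
    return sum(len(m) / n * label_entropy(m) for m in clusters.values())
```

For example, two clusters that each contain a single concept score 0 bits, while two clusters that each mix two concepts evenly score 1 bit.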
Linking Data Across Universities: An Integrated Video Lectures Dataset
This paper presents our work and experience interlinking educational information across universities through the use of Linked Data principles and technologies. More specifically, this paper focuses on selecting, extracting, structuring and interlinking information about video lectures produced by 27 different educational institutions. For this purpose, selected information from several websites and YouTube channels has been scraped and structured according to well-known vocabularies, such as FOAF or the W3C Ontology for Media Resources. To integrate this information, the extracted videos have been categorized under a common classification space, the taxonomy defined by the Open Directory Project. An evaluation of this categorization process has been conducted, obtaining a 98% degree of coverage and an 89% degree of correctness. As a result of this process, a new Linked Data dataset has been released containing more than 14,000 video lectures from 27 different institutions, categorized under a common classification scheme.
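The two evaluation figures can be illustrated with a short Python sketch. The definitions here are assumptions, since the abstract does not spell them out: coverage is taken as the share of videos that received any category, and correctness as the share of categorized videos whose category a human judge accepted. The function names and sample data are hypothetical.

```python
from typing import Dict, Optional


def coverage(assigned: Dict[str, Optional[str]]) -> float:
    """Share of videos that received a category (assumed definition)."""
    if not assigned:
        return 0.0
    categorized = sum(1 for category in assigned.values() if category is not None)
    return categorized / len(assigned)


def correctness(
    assigned: Dict[str, Optional[str]], judged_correct: Dict[str, bool]
) -> float:
    """Share of categorized videos judged correct (assumed definition)."""
    categorized = [video for video, category in assigned.items() if category is not None]
    if not categorized:
        return 0.0
    correct = sum(1 for video in categorized if judged_correct.get(video, False))
    return correct / len(categorized)


# Hypothetical sample: four videos, one left uncategorized.
sample = {"v1": "Science", "v2": "Arts", "v3": None, "v4": "Science"}
judgements = {"v1": True, "v2": False, "v4": True}
sample_coverage = coverage(sample)
sample_correctness = correctness(sample, judgements)
```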
Semantic levels of domain-independent commonsense knowledgebase for visual indexing and retrieval applications
Building intelligent tools for searching, indexing and retrieval applications is needed to manage the rapidly increasing amount of visual data. This has raised the need for building and maintaining ontologies and knowledgebases to support textual semantic representation of visual content, which is an important building block in these applications. This paper proposes a commonsense knowledgebase that forms the link between the visual world and its semantic textual representation. This domain-independent knowledge is provided at different levels of semantics by a fully automated engine that analyses, fuses and integrates previous commonsense knowledgebases. The knowledgebase extends the levels of semantics by adding two new levels: temporal event scenarios and psycholinguistic understanding. Statistical properties and an experimental evaluation show the coherency and effectiveness of the proposed knowledgebase in providing the knowledge needed for wide-domain visual applications.