Advanced content-based semantic scene analysis and information retrieval: the SCHEMA project
The aim of the SCHEMA Network of Excellence is to bring together a critical mass of universities, research centres, industrial partners and end users in order to design a reference system for content-based semantic scene analysis, interpretation and understanding. Relevant research areas include: content-based multimedia analysis and automatic annotation of semantic multimedia content, combined textual and multimedia information retrieval, the Semantic Web, the MPEG-7 and MPEG-21 standards, user interfaces and human factors. In this paper, recent advances in content-based analysis, indexing and retrieval of digital media within the SCHEMA Network are presented. These advances will be integrated into the SCHEMA module-based, expandable reference system.
Enhanced image annotations based on spatial information extraction and ontologies
Current research on image annotation often represents images in terms of labelled regions or objects, but pays little attention to the spatial positions of, or relationships between, those regions or objects. To be effective, general-purpose image retrieval systems require images with comprehensive annotations describing the content of the image fully. Much research is being done on automatic image annotation schemes, but few authors address the issue of spatial annotations directly. This paper begins with a brief analysis of real picture queries put to librarians, showing how spatial terms are used to formulate queries. The paper is then concerned with the development of an enhanced automatic image annotation system which extracts spatial information about objects in the image. The approach uses region boundaries and region labels to generate annotations describing absolute object positions and also relative positions between pairs of objects. A domain ontology and a spatial information ontology are also used to extract more complex information, such as the relative closeness of objects to the viewer.
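The kind of spatial annotation the abstract describes can be sketched in a few lines: given labelled regions with bounding boxes, derive absolute-position and relative-position phrases. The `Region` class, the thirds-based grid, and the relation vocabulary below are illustrative assumptions, not the paper's actual implementation.

```python
# Hypothetical sketch of spatial-annotation generation from labelled regions.
from dataclasses import dataclass

@dataclass
class Region:
    label: str
    x0: float  # bounding box, normalised to [0, 1]; y grows downward
    y0: float
    x1: float
    y1: float

    @property
    def cx(self) -> float:  # centroid x
        return (self.x0 + self.x1) / 2

    @property
    def cy(self) -> float:  # centroid y
        return (self.y0 + self.y1) / 2

def absolute_position(r: Region) -> str:
    """Describe where a region sits in the image frame (thirds grid)."""
    horiz = "left" if r.cx < 1/3 else "right" if r.cx > 2/3 else "centre"
    vert = "top" if r.cy < 1/3 else "bottom" if r.cy > 2/3 else "middle"
    return f"{r.label} is in the {vert}-{horiz} of the image"

def relative_position(a: Region, b: Region) -> str:
    """Describe the spatial relation between a pair of regions."""
    parts = []
    if a.x1 < b.x0:
        parts.append("left of")
    elif b.x1 < a.x0:
        parts.append("right of")
    if a.y1 < b.y0:
        parts.append("above")
    elif b.y1 < a.y0:
        parts.append("below")
    rel = " and ".join(parts) or "overlapping"
    return f"{a.label} is {rel} {b.label}"

sky = Region("sky", 0.0, 0.0, 1.0, 0.3)
boat = Region("boat", 0.4, 0.5, 0.6, 0.7)
print(absolute_position(boat))       # boat is in the middle-centre of the image
print(relative_position(sky, boat))  # sky is above boat
```

A real system would feed these generated phrases, together with the domain and spatial ontologies, into the annotation index rather than printing them.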
Two-dimensional string notation for representing video sequences
Most current work on video indexing concentrates on queries which operate over high-level semantic information that must be entirely composed and entered manually. We propose an indexing system based on spatial information about key objects in a scene. These key objects may be detected automatically, with manual supervision, and tracked through a sequence using one of a number of recently developed techniques. This representation is highly compact and allows rapid resolution of queries specified by iconic example. A number of systems have been produced which use 2D string notations to index digital image libraries. Just as 2D strings provide a compact and tractable indexing notation for digital pictures, a sequence of 2D strings might provide an index for a video or image sequence. To improve further upon this, we reduce the representation to the 2D string pair representing the initial frame, plus a sequence of edits to these strings. This takes advantage of the continuity between frames to further reduce the size of the notation. By representing video sequences using string edits, a notation has been developed which is compact and allows querying on the spatial relationships of objects without rebuilding the majority of the scene. Calculating the ranks of objects directly from the edit sequence allows matching with minimal calculation, greatly reducing search time. This paper presents the edit-sequence notation and algorithms for evaluating queries over image sequences. A number of optimizations which represent a considerable saving in search time are demonstrated in the paper.
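The 2D-string idea the abstract builds on (symbolic projection, after Chang et al.) reduces a frame to the order of its key objects along the x and y axes, so spatial queries become rank comparisons on two strings. The object names, coordinates and query predicate below are illustrative assumptions, and the sketch covers only a single frame, not the paper's edit-sequence extension.

```python
# Minimal sketch of 2D-string indexing via symbolic projection.

def two_d_string(objects):
    """objects: dict mapping label -> (x, y) centroid.
    Returns the (u, v) string pair: labels ordered by x, then by y."""
    u = [lbl for lbl, _ in sorted(objects.items(), key=lambda kv: kv[1][0])]
    v = [lbl for lbl, _ in sorted(objects.items(), key=lambda kv: kv[1][1])]
    return u, v

def rank(s, label):
    """Rank of an object = its position in a projection string."""
    return s.index(label)

def left_of(frame, a, b):
    """Spatial query answered directly from ranks in the u string,
    without reconstructing any scene geometry."""
    u, _ = frame
    return rank(u, a) < rank(u, b)

frame0 = two_d_string({"car": (10, 40), "tree": (55, 30), "dog": (80, 42)})
print(frame0)                         # (['car', 'tree', 'dog'], ['tree', 'car', 'dog'])
print(left_of(frame0, "car", "dog"))  # True
```

In the edit-sequence scheme the abstract proposes, only `frame0`'s string pair would be stored explicitly; each later frame would be recorded as a short list of insertions, deletions and transpositions applied to these strings, and ranks would be updated from those edits directly.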
Structured Knowledge Representation for Image Retrieval
We propose a structured approach to the problem of retrieval of images by content, and present a description logic devised for the semantic indexing and retrieval of images containing complex objects. As other approaches do, we start from low-level features extracted with image analysis to detect and characterize regions in an image. However, in contrast with feature-based approaches, we provide a syntax to describe segmented regions as basic objects and complex objects as compositions of basic ones. We then introduce a companion extensional semantics for defining reasoning services, such as retrieval, classification, and subsumption. These services can be used for both exact and approximate matching, using similarity measures. Using our logical approach as a formal specification, we implemented a complete client-server image retrieval system, which allows a user to pose both queries by sketch and queries by example. A set of experiments has been carried out on a testbed of images to assess the retrieval capabilities of the system in comparison with expert users' rankings. Results are presented using a well-established measure of quality borrowed from textual information retrieval.
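The subsumption service the abstract mentions can be illustrated with a toy encoding: complex objects as compositions of basic region descriptions, where one description subsumes another when its constraints are a subset of the other's. The frozenset encoding and the vehicle/car descriptions below are assumptions for illustration, not the paper's actual description logic.

```python
# Toy structural subsumption over composed object descriptions.

def describe(*parts):
    """A complex-object description as a set of (region, property) pairs."""
    return frozenset(parts)

def subsumes(general, specific):
    """'general' subsumes 'specific' iff every constraint of the more
    general description also appears in the more specific one."""
    return general <= specific

# A generic 'vehicle' description and a more specific 'car' description.
vehicle = describe(("body", "metallic"), ("wheel", "round"))
car = describe(("body", "metallic"), ("wheel", "round"),
               ("window", "transparent"))

print(subsumes(vehicle, car))  # True: 'car' satisfies every 'vehicle' constraint
print(subsumes(car, vehicle))  # False
```

Classification then amounts to placing a new description into the partial order induced by `subsumes`, and retrieval to finding stored descriptions subsumed by the query; approximate matching would relax the subset test with a similarity measure.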
Symbol grounding and its implications for artificial intelligence
In response to Searle's well-known Chinese room argument against Strong AI (and more generally, computationalism), Harnad proposed that if the symbols manipulated by a robot were sufficiently grounded in the real world, then the robot could be said to literally understand. In this article, I expand on the notion of symbol groundedness in three ways. Firstly, I show how a robot might select the best set of categories describing the world, given that fundamentally continuous sensory data can be categorised in an almost infinite number of ways. Secondly, I discuss the notion of grounded abstract (as opposed to concrete) concepts. Thirdly, I give an objective criterion for deciding when a robot's symbols become sufficiently grounded for "understanding" to be attributed to it. This deeper analysis of what symbol groundedness actually is weakens Searle's position in significant ways; in particular, whilst Searle may be able to refute Strong AI in the specific context of present-day digital computers, he cannot refute computationalism in general
Indexing: Philosophy of
Critical editions of great thinkers and writers need excellent, comprehensive indexes. The denser the text, the deeper the index. Having indexed several volumes of the writings of the polymathic, dense-and-deep philosopher Charles S. Peirce, I have had occasion to reflect numerous times both upon the art of indexing and upon its logic. This essay will discuss less the art of it (or its mechanics) than its logic-and, by the same token, its ethics. I have good reasons to do so: first, Peirce is the American founder of the logic of signs (also known as semiotics), and one of the major types of signs that Peirce identified and analyzed is called index; second, Peirce is the first thinker to have demonstrated that logic rests on ethics: the desire for truth rests on the desire for the good, not the reverse
The Virtual Image in Streaming Video Indexing
Multimedia technology has been applied to many types of application, and the great amount of multimedia data produced needs to be indexed. The use of digital video data in particular is very popular today, and video browsing is a necessary activity in many knowledge domains. Effective and interactive exploration of large digital video archives requires indexing the videos by their visual, audio and textual data. In this paper, we focus on the visual and textual content of video for indexing: for the former we use the Virtual Image, and for the latter the Dublin Core Metadata, suitably extended and layered for video browsing and indexing. Before concentrating on the visual content, we review the main methods for video segmentation and annotation, in order to introduce the steps of video key-feature extraction and video description generation.
Mesh and Pyramid Algorithms for Iconic Indexing
In this paper, parallel algorithms on meshes and pyramids for iconic indexing are presented. Our algorithms are asymptotically superior to previously known parallel algorithms.