7,594 research outputs found
Mind the Gap: Another look at the problem of the semantic gap in image retrieval
This paper attempts to review and characterise the problem of the semantic gap in image retrieval and the attempts being made to bridge it. In particular, we draw from our own experience in user queries, automatic annotation and ontological techniques. The first section of the paper describes a characterisation of the semantic gap as a hierarchy between the raw media and full semantic understanding of the media's content. The second section discusses real users' queries with respect to the semantic gap. The final sections of the paper describe our own experience in attempting to bridge the semantic gap. In particular we discuss our work on auto-annotation and semantic-space models of image retrieval in order to bridge the gap from the bottom up, and the use of ontologies, which capture more semantics than keyword object labels alone, as a technique for bridging the gap from the top down
Semantic Retrieval and Automatic Annotation: Linear Transformations, Correlation and Semantic Spaces
This paper proposes a new technique for auto-annotation and semantic retrieval based upon the idea of linearly mapping an image feature space to a keyword space. The new technique is compared to several related techniques, and a number of salient points about each of the techniques are discussed and contrasted. The paper also discusses how these techniques might actually scale to a real-world retrieval problem, and demonstrates this though a case study of a semantic retrieval technique being used on a real-world data-set (with a mix of annotated and unannotated images) from a picture library
Recommended from our members
Recognition by directed attention to recursively partitioned images
A learning/recognition model (and instantiating program) is described which recursively combines the learning paradigms of conceptual clustering (Michalski, 1980) and learning-from-examples to resolve the ambiguities of real-world recognition. The model is based on neuropsychological and psychological evidence that the visual system is analytic, hierarchical, and composed of a parallel/serial dichotomy (many, see conclusions by Crick, 1984). Emulating the experimental evidence, parallel processes in the model decompose the image into components and cluster the constituents in much the same way as the image processing technique known as moment analysis (Alt, 1962). Serial, attentive mechanisms then reassemble the decompositions by investigating spatial relationships between components. The use of attentive mechanisms extends the moment analysis technique to handle alterations in structure and solves the contention problem created by combining the two learning paradigms. The contention results from a disagreement between the teacher and the model on what constitutes the salient features at the highest level of the symbol. There are four cases ZBT must handle, two of which result from the disagreement with the teacher. The parallel/serial dichotomy represents a vertical/horizontal tradeoff between the invariant and variant features of a domain. The resultant learned hierarchy allows ZBT to recognize structural differences while avoiding problems of exponential growth
Al Hybrid Content-Based Retrieval Approach For Video Data
Increasing use of multimedia data makes it crucial to develop intelligent search mec:hanisms for retrieving multimedia data by content. Traditional text-based methods clearly do not suffice to describe the rich content of images, voice or video. Digital vidseo requires the incorporation of temporal information for any effective contentbased retrieval scheme. We present a novel technique which integrates object motion ancl temporal relationship information in order to characterize the events for subsequent search for similar clips. We propose a hybrid mechanism based on object motion trails similarity match and interval-based temporal modeling that leads to a unique framework for spatio-temporal content based access in digital video. We implemented the proposed methods and demonstrated that high-level query formulation can be achieved for the aforementioned purpose. Development of such technology will enable true multimedia search engines that will accomplish what current Internet search engines like Infoseek or Excite do today for textual data
Multi-Level Visual Alphabets
A central debate in visual perception theory is the argument for indirect versus direct perception; i.e., the use of intermediate, abstract, and hierarchical representations versus direct semantic interpretation of images through interaction with the outside world. We present a content-based representation that combines both approaches. The previously developed Visual Alphabet method is extended with a hierarchy of representations, each level feeding into the next one, but based on features that are not abstract but directly relevant to the task at hand. Explorative benchmark experiments are carried out on face images to investigate and explain the impact of the key parameters such as pattern size, number of prototypes, and distance measures used. Results show that adding an additional middle layer improves results, by encoding the spatial co-occurrence of lower-level pattern prototypes
Digital Image Access & Retrieval
The 33th Annual Clinic on Library Applications of Data Processing, held at the University of Illinois at Urbana-Champaign in March of 1996, addressed the theme of "Digital Image Access & Retrieval." The papers from this conference cover a wide range of topics concerning digital imaging technology for visual resource collections. Papers covered three general areas: (1) systems, planning, and implementation; (2) automatic and semi-automatic indexing; and (3) preservation with the bulk of the conference focusing on indexing and retrieval.published or submitted for publicatio
- …