Search CORE

19,330 research outputs found

Hybrid image representation methods for automatic image annotation: a survey

Author: Bechkoum Kamal
Benblidia Nadjia
Bouyerbou Hafidha
Oukid Saliha
Publication venue: 'Institute of Electrical and Electronics Engineers (IEEE)'
Publication date: 01/09/2012
Field of study

In most automatic image annotation systems, images are represented with low level features using either global methods or local methods. In global methods, the entire image is used as a unit. Local methods divide images into blocks where fixed-size sub-image blocks are adopted as sub-units; or into regions by using segmented regions as sub-units in images. In contrast to typical automatic image annotation methods that use either global or local features exclusively, several recent methods have considered incorporating the two kinds of information, and believe that the combination of the two levels of features is beneficial in annotating images. In this paper, we provide a survey on automatic image annotation techniques according to one aspect: feature extraction, and, in order to complement existing surveys in literature, we focus on the emerging image annotation methods: hybrid methods that combine both global and local features for image representation

Crossref

NECTAR

Multimedia information technology and the annotation of video

Author: Jong F.M.G. de
Smeulders A.
Worring M.
Publication venue: Stichting Archiefpublicaties
Publication date: 01/01/2006
Field of study

The state of the art in multimedia information technology has not progressed to the point where a single solution is available to meet all reasonable needs of documentalists and users of video archives. In general, we do not have an optimistic view of the usability of new technology in this domain, but digitization and digital power can be expected to cause a small revolution in the area of video archiving. The volume of data leads to two views of the future: on the pessimistic side, overload of data will cause lack of annotation capacity, and on the optimistic side, there will be enough data from which to learn selected concepts that can be deployed to support automatic annotation. At the threshold of this interesting era, we make an attempt to describe the state of the art in technology. We sample the progress in text, sound, and image processing, as well as in machine learning

University of Twente Research Information

A framework for automatic semantic video annotation

Author: A Amir
A Basharat
A Gupta
A Kapoor
A Ulges
AF Smeaton
Amjad Altadmri
Amr Ahmed
B Chandrasekaran
C Fellbaum
DB Lenat
H Liu
H Motulsky
J Sivic
ML Shyu
N Haering
R Fergus
T Brox
WL Zhao
XY Wei
YG Jiang
Publication venue: 'Springer Science and Business Media LLC'
Publication date: 28/03/2013
Field of study

The rapidly increasing quantity of publicly available videos has driven research into developing automatic tools for indexing, rating, searching and retrieval. Textual semantic representations, such as tagging, labelling and annotation, are often important factors in the process of indexing any video, because of their user-friendly way of representing the semantics appropriate for search and retrieval. Ideally, this annotation should be inspired by the human cognitive way of perceiving and of describing videos. The difference between the low-level visual contents and the corresponding human perception is referred to as the ‘semantic gap’. Tackling this gap is even harder in the case of unconstrained videos, mainly due to the lack of any previous information about the analyzed video on the one hand, and the huge amount of generic knowledge required on the other. This paper introduces a framework for the Automatic Semantic Annotation of unconstrained videos. The proposed framework utilizes two non-domain-specific layers: low-level visual similarity matching, and an annotation analysis that employs commonsense knowledgebases. Commonsense ontology is created by incorporating multiple-structured semantic relationships. Experiments and black-box tests are carried out on standard video databases for action recognition and video information retrieval. White-box tests examine the performance of the individual intermediate layers of the framework, and the evaluation of the results and the statistical analysis show that integrating visual similarity matching with commonsense semantic relationships provides an effective approach to automated video annotation

University of Lincoln Institutional Repository

Crossref

Edge Hill University Research Information Repository

Kent Academic Repository

Mind the Gap: Another look at the problem of the semantic gap in image retrieval

Author: Enser Peter G. B.
Hare Jonathon S.
Lewis Paul H.
Sandom Christine J.
Publication venue
Publication date: 01/01/2006
Field of study

This paper attempts to review and characterise the problem of the semantic gap in image retrieval and the attempts being made to bridge it. In particular, we draw from our own experience in user queries, automatic annotation and ontological techniques. The first section of the paper describes a characterisation of the semantic gap as a hierarchy between the raw media and full semantic understanding of the media's content. The second section discusses real users' queries with respect to the semantic gap. The final sections of the paper describe our own experience in attempting to bridge the semantic gap. In particular we discuss our work on auto-annotation and semantic-space models of image retrieval in order to bridge the gap from the bottom up, and the use of ontologies, which capture more semantics than keyword object labels alone, as a technique for bridging the gap from the top down

Southampton (e-Prints Soton)

Recommended from our members

Semantic Concept Co-Occurrence Patterns for Image Annotation and Retrieval.

Author: Bhanu Bir
Feng Linan
Publication venue: eScholarship, University of California
Publication date: 01/04/2016
Field of study

Describing visual image contents by semantic concepts is an effective and straightforward way to facilitate various high level applications. Inferring semantic concepts from low-level pictorial feature analysis is challenging due to the semantic gap problem, while manually labeling concepts is unwise because of a large number of images in both online and offline collections. In this paper, we present a novel approach to automatically generate intermediate image descriptors by exploiting concept co-occurrence patterns in the pre-labeled training set that renders it possible to depict complex scene images semantically. Our work is motivated by the fact that multiple concepts that frequently co-occur across images form patterns which could provide contextual cues for individual concept inference. We discover the co-occurrence patterns as hierarchical communities by graph modularity maximization in a network with nodes and edges representing concepts and co-occurrence relationships separately. A random walk process working on the inferred concept probabilities with the discovered co-occurrence patterns is applied to acquire the refined concept signature representation. Through experiments in automatic image annotation and semantic image retrieval on several challenging datasets, we demonstrate the effectiveness of the proposed concept co-occurrence patterns as well as the concept signature representation in comparison with state-of-the-art approaches

eScholarship - University of California

Bridging the Semantic Gap in Multimedia Information Retrieval: Top-down and Bottom-up approaches

Author: Enser Peter G.B.
Hare Jonathon S.
Lewis Paul H.
Martinez Kirk
Sandom Christine J.
Sinclair Patrick A. S.
Publication venue
Publication date: 01/01/2006
Field of study

Semantic representation of multimedia information is vital for enabling the kind of multimedia search capabilities that professional searchers require. Manual annotation is often not possible because of the shear scale of the multimedia information that needs indexing. This paper explores the ways in which we are using both top-down, ontologically driven approaches and bottom-up, automatic-annotation approaches to provide retrieval facilities to users. We also discuss many of the current techniques that we are investigating to combine these top-down and bottom-up approaches

CiteSeerX

Southampton (e-Prints Soton)

Fixation prediction with a combined model of bottom-up saliency and vanishing point

Author: Borji Ali
Feng Mengyang
Lu Huchuan
Publication venue
Publication date: 06/12/2015
Field of study

By predicting where humans look in natural scenes, we can understand how they perceive complex natural scenes and prioritize information for further high-level visual processing. Several models have been proposed for this purpose, yet there is a gap between best existing saliency models and human performance. While many researchers have developed purely computational models for fixation prediction, less attempts have been made to discover cognitive factors that guide gaze. Here, we study the effect of a particular type of scene structural information, known as the vanishing point, and show that human gaze is attracted to the vanishing point regions. We record eye movements of 10 observers over 532 images, out of which 319 have vanishing points. We then construct a combined model of traditional saliency and a vanishing point channel and show that our model outperforms state of the art saliency models using three scores on our dataset.Comment: arXiv admin note: text overlap with arXiv:1512.0172

arXiv.org e-Print Archive

Crossref

University of Central Florida (UCF): STARS (Showcase of Text, Archives, Research & Scholarship)