Search CORE

12,179 research outputs found

Measuring concept similarities in multimedia ontologies: analysis and evaluations

Author: Koskela Markus
Laaksonen J.
Smeaton Alan F.
Publication venue: 'Institute of Electrical and Electronics Engineers (IEEE)'
Publication date: 01/08/2007
Field of study

The recent development of large-scale multimedia concept ontologies has provided a new momentum for research in the semantic analysis of multimedia repositories. Different methods for generic concept detection have been extensively studied, but the question of how to exploit the structure of a multimedia ontology and existing inter-concept relations has not received similar attention. In this paper, we present a clustering-based method for modeling semantic concepts on low-level feature spaces and study the evaluation of the quality of such models with entropy-based methods. We cover a variety of methods for assessing the similarity of different concepts in a multimedia ontology. We study three ontologies and apply the proposed techniques in experiments involving the visual and semantic similarities, manual annotation of video, and concept detection. The results show that modeling inter-concept relations can provide a promising resource for many different application areas in semantic multimedia processing

Irish Universities

DCU Online Research Access Service

Semantic spaces revisited: investigating the performance of auto-annotation and semantic retrieval using semantic spaces

Author: Hare Jonathan
Lewis Paul
Nixon Mark
Samangooei Sina
Publication venue
Publication date: 07/07/2008
Field of study

Semantic spaces encode similarity relationships between objects as a function of position in a mathematical space. This paper discusses three different formulations for building semantic spaces which allow the automatic-annotation and semantic retrieval of images. The models discussed in this paper require that the image content be described in the form of a series of visual-terms, rather than as a continuous feature-vector. The paper also discusses how these term-based models compare to the latest state-of-the-art continuous feature models for auto-annotation and retrieval

Southampton (e-Prints Soton)

Bridging the Semantic Gap in Multimedia Information Retrieval: Top-down and Bottom-up approaches

Author: Enser Peter G.B.
Hare Jonathon S.
Lewis Paul H.
Martinez Kirk
Sandom Christine J.
Sinclair Patrick A. S.
Publication venue
Publication date: 01/01/2006
Field of study

Semantic representation of multimedia information is vital for enabling the kind of multimedia search capabilities that professional searchers require. Manual annotation is often not possible because of the shear scale of the multimedia information that needs indexing. This paper explores the ways in which we are using both top-down, ontologically driven approaches and bottom-up, automatic-annotation approaches to provide retrieval facilities to users. We also discuss many of the current techniques that we are investigating to combine these top-down and bottom-up approaches

CiteSeerX

Southampton (e-Prints Soton)

A2-RL: Aesthetics Aware Reinforcement Learning for Image Cropping

Author: Huang Kaiqi
Li Debang
Wu Huikai
Zhang Junge
Publication venue
Publication date: 12/03/2018
Field of study

Image cropping aims at improving the aesthetic quality of images by adjusting their composition. Most weakly supervised cropping methods (without bounding box supervision) rely on the sliding window mechanism. The sliding window mechanism requires fixed aspect ratios and limits the cropping region with arbitrary size. Moreover, the sliding window method usually produces tens of thousands of windows on the input image which is very time-consuming. Motivated by these challenges, we firstly formulate the aesthetic image cropping as a sequential decision-making process and propose a weakly supervised Aesthetics Aware Reinforcement Learning (A2-RL) framework to address this problem. Particularly, the proposed method develops an aesthetics aware reward function which especially benefits image cropping. Similar to human's decision making, we use a comprehensive state representation including both the current observation and the historical experience. We train the agent using the actor-critic architecture in an end-to-end manner. The agent is evaluated on several popular unseen cropping datasets. Experiment results show that our method achieves the state-of-the-art performance with much fewer candidate windows and much less time compared with previous weakly supervised methods.Comment: Accepted by CVPR 201

arXiv.org e-Print Archive

Crossref

Mind the Gap: Another look at the problem of the semantic gap in image retrieval

Author: Enser Peter G. B.
Hare Jonathon S.
Lewis Paul H.
Sandom Christine J.
Publication venue
Publication date: 01/01/2006
Field of study

This paper attempts to review and characterise the problem of the semantic gap in image retrieval and the attempts being made to bridge it. In particular, we draw from our own experience in user queries, automatic annotation and ontological techniques. The first section of the paper describes a characterisation of the semantic gap as a hierarchy between the raw media and full semantic understanding of the media's content. The second section discusses real users' queries with respect to the semantic gap. The final sections of the paper describe our own experience in attempting to bridge the semantic gap. In particular we discuss our work on auto-annotation and semantic-space models of image retrieval in order to bridge the gap from the bottom up, and the use of ontologies, which capture more semantics than keyword object labels alone, as a technique for bridging the gap from the top down

Southampton (e-Prints Soton)

Clustering-based analysis of semantic concept models for video shots

Author: Koskela Markus
Smeaton Alan F.
Publication venue: 'Institute of Electrical and Electronics Engineers (IEEE)'
Publication date: 01/07/2006
Field of study

In this paper we present a clustering-based method for representing semantic concepts on multimodal low-level feature spaces and study the evaluation of the goodness of such models with entropy-based methods. As different semantic concepts in video are most accurately represented with different features and modalities, we utilize the relative model-wise confidence values of the feature extraction techniques in weighting them automatically. The method also provides a natural way of measuring the similarity of different concepts in a multimedia lexicon. The experiments of the paper are conducted using the development set of the TRECVID 2005 corpus together with a common annotation for 39 semantic concept

Irish Universities

DCU Online Research Access Service

Recommended from our members

Active learning of an action detector on untrimmed videos

Author: Bandla Sunil
Publication venue
Publication date: 22/07/2014
Field of study

textCollecting and annotating videos of realistic human actions is tedious, yet critical for training action recognition systems. We propose a method to actively request the most useful video annotations among a large set of unlabeled videos. Predicting the utility of annotating unlabeled video is not trivial, since any given clip may contain multiple actions of interest, and it need not be trimmed to temporal regions of interest. To deal with this problem, we propose a detection-based active learner to train action category models. We develop a voting-based framework to localize likely intervals of interest in an unlabeled clip, and use them to estimate the total reduction in uncertainty that annotating that clip would yield. On three datasets, we show our approach can learn accurate action detectors more efficiently than alternative active learning strategies that fail to accommodate the "untrimmed" nature of real video data.Computer Science

Texas ScholarWorks