13,228 research outputs found

    Context-aware person identification in personal photo collections

    Identifying the people in photos is an important need for users of photo management systems. We present MediAssist, one such system which facilitates browsing, searching and semi-automatic annotation of personal photos, using analysis of both image content and the context in which the photo is captured. This semi-automatic annotation includes annotation of the identity of people in photos. In this paper, we focus on such person annotation, and propose person identification techniques based on a combination of context and content. We propose language modelling and nearest neighbor approaches to context-based person identification, in addition to novel face color and image color content-based features (used alongside face recognition and body patch features). We conduct a comprehensive empirical study of these techniques using the real private photo collections of a number of users, and show that combining context- and content-based analysis improves performance over content or context alone.
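
    The fusion idea above can be illustrated with a small sketch. The Python below is not the MediAssist implementation; the function names, the 1/(1+distance) normalisation and the weight alpha are all assumptions, and the language-modelling component is omitted. It only shows how a context-based nearest-neighbour score and a content-based face-similarity score might be combined to rank candidate identities.

```python
# Hypothetical fusion of context- and content-based evidence for person
# identification; not the MediAssist code, just an illustration.
from collections import defaultdict

def rank_identities(photo_context, face_descriptor, annotated_people,
                    context_distance, face_distance, alpha=0.5):
    """annotated_people: iterable of (identity, context_vector, face_vector)
    taken from previously annotated photos; the two distance functions are
    supplied by the caller."""
    scores = defaultdict(float)
    for identity, ctx_vec, face_vec in annotated_people:
        # Turn distances into similarities in (0, 1]; the real system's
        # normalisation and its language-modelling scores are not shown.
        ctx_sim = 1.0 / (1.0 + context_distance(photo_context, ctx_vec))
        face_sim = 1.0 / (1.0 + face_distance(face_descriptor, face_vec))
        fused = alpha * ctx_sim + (1.0 - alpha) * face_sim
        scores[identity] = max(scores[identity], fused)
    # Highest fused score first: the top entry is the suggested identity.
    return sorted(scores.items(), key=lambda kv: kv[1], reverse=True)
```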

    Identifying person re-occurrences for personal photo management applications

    Automatic identification of "who" is present in individual digital images within a photo management system using only content-based analysis is an extremely difficult problem. The authors present a system which enables identification of person re-occurrences within a personal photo management application by combining image content-based analysis tools with context data from image capture. This combined system employs automatic face detection and body-patch matching techniques, which collectively facilitate identifying person re-occurrences within images grouped into events based on context data. The authors introduce a face detection approach combining a histogram-based skin detection model and a modified BDF face detection method to detect multiple frontal faces in colour images. Corresponding body patches are then automatically segmented relative to the size, location and orientation of the detected faces in the image. The authors investigate the suitability of different colour descriptors, including MPEG-7 colour descriptors, colour coherence vectors (CCV) and colour correlograms, for effective body-patch matching. The system has been successfully integrated into the MediAssist platform, a prototype Web-based system for personal photo management, and runs on over 13,000 personal photos.
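
    As a rough illustration of the body-patch idea described above (the authors' exact patch geometry and the MPEG-7 descriptors are not reproduced here), the sketch below derives a clothing patch from a detected face box and compares patches with a plain colour-histogram intersection; the patch proportions and the 8-bin histogram are assumptions.

```python
# Illustrative sketch only: body patch placement relative to a face box and a
# simple colour-histogram match, standing in for the descriptors in the paper.
import numpy as np

def body_patch_box(face_box, image_shape):
    """face_box = (x, y, w, h); place a patch roughly 2 faces wide and
    3 faces tall directly below the detected face, clipped to the image."""
    x, y, w, h = face_box
    img_h, img_w = image_shape[:2]
    px = max(0, int(x - 0.5 * w))
    py = min(img_h, y + h)
    pw = min(img_w - px, int(2.0 * w))
    ph = min(img_h - py, int(3.0 * h))
    return px, py, pw, ph

def colour_histogram(patch, bins=8):
    """Per-channel histogram over an H x W x 3 patch, concatenated and L1-normalised."""
    hists = [np.histogram(patch[..., c], bins=bins, range=(0, 255))[0]
             for c in range(3)]
    hist = np.concatenate(hists).astype(float)
    return hist / max(hist.sum(), 1.0)

def patch_similarity(patch_a, patch_b):
    """Histogram intersection in [0, 1]; higher means more similar clothing."""
    return float(np.minimum(colour_histogram(patch_a),
                            colour_histogram(patch_b)).sum())
```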

    Ontology-based Information Extraction with SOBA

    In this paper we describe SOBA, a sub-component of the SmartWeb multi-modal dialog system. SOBA is a component for ontology-based information extraction from soccer web pages for automatic population of a knowledge base that can be used for domain-specific question answering. SOBA realizes a tight connection between the ontology, knowledge base and the information extraction component. The originality of SOBA is in the fact that it extracts information from heterogeneous sources such as tabular structures, text and image captions in a semantically integrated way. In particular, it stores extracted information in a knowledge base, and in turn uses the knowledge base to interpret and link newly extracted information with respect to already existing entities.
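
    The store-then-link behaviour described above can be sketched with a toy knowledge base. This is not SOBA's data model or its soccer ontology; the class, method names and example mentions below are invented for illustration only.

```python
# Minimal sketch of the store-then-link idea: extracted facts go into a
# knowledge base, and later extractions are linked to already-known entities
# instead of creating duplicates. Not SOBA's actual data model.
class KnowledgeBase:
    def __init__(self):
        self.entities = {}   # (type, normalised name) -> attribute dict
        self.facts = []      # (subject key, predicate, object) triples

    def link_entity(self, mention, entity_type):
        """Return the key of an existing entity for this mention, or create one."""
        key = (entity_type, mention.strip().lower())
        if key not in self.entities:
            self.entities[key] = {"type": entity_type, "name": mention.strip()}
        return key

    def add_fact(self, subject_key, predicate, value):
        self.facts.append((subject_key, predicate, value))

kb = KnowledgeBase()
player = kb.link_entity("Ronaldo", "Player")        # e.g. from a match table
kb.add_fact(player, "scoredIn", "final")
same_player = kb.link_entity("ronaldo ", "Player")  # e.g. from an image caption
assert player == same_player                        # linked, not duplicated
```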

    Learning Multimodal Latent Attributes

    The rapid development of social media sharing has created a huge demand for automatic media classification and annotation techniques. Attribute learning has emerged as a promising paradigm for bridging the semantic gap and addressing data sparsity by transferring attribute knowledge in object recognition and relatively simple action classification. In this paper, we address the task of attribute learning for understanding multimedia data with sparse and incomplete labels. In particular we focus on videos of social group activities, which are particularly challenging and topical examples of this task because of their multi-modal content and their complex and unstructured nature relative to the density of annotations. To solve this problem, we (1) introduce the concept of a semi-latent attribute space, expressing user-defined and latent attributes in a unified framework, and (2) propose a novel scalable probabilistic topic model for learning multi-modal semi-latent attributes, which dramatically reduces requirements for an exhaustive, accurate attribute ontology and expensive annotation effort. We show that our framework is able to exploit latent attributes to outperform contemporary approaches on a variety of realistic multimedia sparse data learning tasks, including multi-task learning, learning with label noise, N-shot transfer learning and, importantly, zero-shot learning.
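
    The authors' semi-latent topic model is not reproduced here, but the attribute idea it builds on can be sketched in a few lines: an unseen class is recognised by comparing a predicted attribute vector against per-class attribute signatures (zero-shot style). The attribute names and signatures below are invented examples, not from the paper.

```python
# Toy zero-shot classification via attribute signatures; this stands in for
# (and greatly simplifies) the attribute-based recognition the paper extends.
import numpy as np

class_signatures = {                     # hypothetical user-defined attributes
    "wedding":  np.array([1, 1, 0, 1]),  # [indoor, group, sport, music]
    "football": np.array([0, 1, 1, 0]),
}

def zero_shot_classify(predicted_attributes):
    """Pick the class whose attribute signature is closest to the prediction."""
    best = min(class_signatures.items(),
               key=lambda kv: np.linalg.norm(kv[1] - predicted_attributes))
    return best[0]

print(zero_shot_classify(np.array([0.9, 0.8, 0.1, 0.7])))  # -> "wedding"
```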

    Image annotation with Photocopain

    Photo annotation is a resource-intensive task, yet it is increasingly essential as image archives and personal photo collections grow in size. Describing and archiving personal experiences involves an inherent conflict: casual users are generally unwilling to expend the effort needed to create the annotations that would let them organise their collections and make the best use of them. This paper describes Photocopain, a semi-automatic image annotation system which combines information about the context in which a photograph was captured with information from other readily available sources in order to generate outline annotations for that photograph that the user may further extend or amend.
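
    A minimal sketch of turning capture context into draft annotations follows, assuming a few hand-written rules; Photocopain's actual rules and its additional information sources are not shown, and the thresholds below are arbitrary.

```python
# Hedged sketch: capture context (time, flash, place) -> outline tags that a
# user can later extend or amend. Not Photocopain's rule set.
from datetime import datetime
from typing import List, Optional

def outline_annotations(capture_time: datetime, flash_fired: bool,
                        gps_place: Optional[str]) -> List[str]:
    tags = []
    tags.append("evening" if capture_time.hour >= 18 else "daytime")
    tags.append("weekend" if capture_time.weekday() >= 5 else "weekday")
    if flash_fired:
        tags.append("indoor?")   # flash is only weak evidence, hence the "?"
    if gps_place:
        tags.append(gps_place)   # e.g. from a reverse-geocoding lookup
    return tags

print(outline_annotations(datetime(2006, 7, 15, 20, 30), True, "Southampton"))
```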

    A Dataset for Movie Description

    Descriptive video service (DVS) provides linguistic descriptions of movies and allows visually impaired people to follow a movie along with their peers. Such descriptions are by design mainly visual and thus naturally form an interesting data source for computer vision and computational linguistics. In this work we propose a novel dataset containing transcribed DVS that is temporally aligned to full-length HD movies. In addition, we also collected the aligned movie scripts which have been used in prior work and compare the two different sources of descriptions. In total, the Movie Description dataset contains a parallel corpus of over 54,000 sentences and video snippets from 72 HD movies. We characterize the dataset by benchmarking different approaches for generating video descriptions. Comparing DVS to scripts, we find that DVS is far more visual and describes precisely what is shown rather than what should happen according to the scripts created prior to movie production.
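
    A simple record layout for such a parallel corpus of aligned clips and sentences might look as follows; the dataset's actual distribution format is not shown here, and the tab-separated layout is an assumption for illustration.

```python
# Hypothetical record layout for sentence/clip alignment, not the dataset's
# real file format.
from dataclasses import dataclass

@dataclass
class AlignedSnippet:
    movie: str
    start_sec: float   # start of the clip within the full movie
    end_sec: float     # end of the clip
    sentence: str      # transcribed DVS (or script) description

def parse_line(tsv_line: str) -> AlignedSnippet:
    movie, start, end, sentence = tsv_line.rstrip("\n").split("\t", 3)
    return AlignedSnippet(movie, float(start), float(end), sentence)

snip = parse_line("SomeMovie\t120.5\t124.0\tShe walks into the rain.")
print(snip.movie, snip.end_sec - snip.start_sec, "seconds:", snip.sentence)
```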

    Click Carving: Segmenting Objects in Video with Point Clicks

    We present a novel form of interactive video object segmentation where a few clicks by the user help the system produce a full spatio-temporal segmentation of the object of interest. Whereas conventional interactive pipelines take the user's initialization as a starting point, we show the value in the system taking the lead even in initialization. In particular, for a given video frame, the system precomputes a ranked list of thousands of possible segmentation hypotheses (also referred to as object region proposals) using image and motion cues. Then, the user looks at the top ranked proposals and clicks on the object boundary to carve away erroneous ones. This process iterates (typically 2-3 times), and each time the system revises the top ranked proposal set, until the user is satisfied with a resulting segmentation mask. Finally, the mask is propagated across the video to produce a spatio-temporal object tube. On three challenging datasets, we provide extensive comparisons with both existing work and simpler alternative methods. In all, the proposed Click Carving approach strikes an excellent balance of accuracy and human effort. It outperforms all similarly fast methods, and is competitive with or better than those requiring 2 to 12 times the effort.
    Comment: A preliminary version of the material in this document was filed as University of Texas technical report no. UT AI16-0
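
    The carving step can be sketched as follows. This is not the paper's exact criterion; the boundary test, the pixel radius and the re-ranking below are assumptions, meant only to show how a boundary click can prune proposals whose outlines do not pass near it.

```python
# Hedged sketch of "carving" region proposals with a boundary click: keep only
# proposals whose mask boundary passes near the clicked point, then re-rank
# the survivors by their original scores.
import numpy as np

def boundary_points(mask: np.ndarray) -> np.ndarray:
    """Pixels inside the mask that have at least one 4-neighbour outside it
    (image-border pixels are not treated specially in this sketch)."""
    m = mask.astype(bool)
    interior = m.copy()
    interior[1:, :] &= m[:-1, :]
    interior[:-1, :] &= m[1:, :]
    interior[:, 1:] &= m[:, :-1]
    interior[:, :-1] &= m[:, 1:]
    ys, xs = np.nonzero(m & ~interior)
    return np.stack([ys, xs], axis=1)

def carve(proposals, scores, click_yx, radius=10):
    """proposals: list of binary masks; scores: their ranking scores.
    Keep masks whose boundary comes within `radius` pixels of the click."""
    kept = []
    for mask, score in zip(proposals, scores):
        pts = boundary_points(mask)
        if len(pts) and np.min(
                np.linalg.norm(pts - np.array(click_yx), axis=1)) <= radius:
            kept.append((score, mask))
    kept.sort(key=lambda t: t[0], reverse=True)
    return [m for _, m in kept]
```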