Image tag completion by local learning
The problem of tag completion is to learn the missing tags of an image. In
this paper, we propose to learn a tag scoring vector for each image by local
linear learning. A local linear function is used in the neighborhood of each
image to predict the tag scoring vectors of its neighboring images. We
construct a unified objective function for the learning of both tag scoring
vectors and local linear function parameters. In the objective, we require the
learned tag scoring vectors to be consistent with the known associations to the
tags of each image, and also minimize the prediction error of each local linear
function, while reducing the complexity of each local function. The objective
function is optimized by an alternating optimization strategy and gradient
descent methods in an iterative algorithm. We compare the proposed algorithm
against different state-of-the-art tag completion methods, and the results show
its advantages.
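
To make the three terms of the objective concrete, one plausible form is sketched below. It is an illustration under stated assumptions, not the paper's exact formulation: the symbols $\mathbf{f}_i$ (tag scoring vector of image $i$), $\mathbf{t}_i$ (its known tag vector), $M_i$ (a diagonal mask selecting the known tags), $\mathbf{x}_j$ (visual feature of image $j$), $\mathcal{N}_i$ (neighborhood of image $i$), $(W_i, \mathbf{b}_i)$ (local linear parameters) and the weights $\alpha, \gamma$ are all introduced here.

```latex
\min_{\{\mathbf{f}_i\},\,\{W_i,\mathbf{b}_i\}}
\sum_{i=1}^{n}\Big[
  \underbrace{\alpha\,\big\|M_i(\mathbf{f}_i-\mathbf{t}_i)\big\|_2^2}_{\text{consistency with known tags}}
  +\underbrace{\sum_{j\in\mathcal{N}_i}\big\|\mathbf{f}_j-(W_i^{\top}\mathbf{x}_j+\mathbf{b}_i)\big\|_2^2}_{\text{local prediction error}}
  +\underbrace{\gamma\,\|W_i\|_F^2}_{\text{complexity of the local function}}
\Big]
```

Under this assumed form, an alternating scheme would fix the scoring vectors and update each $(W_i, \mathbf{b}_i)$, then fix the local functions and update the scoring vectors by gradient descent, repeating until the objective stops decreasing.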
Socializing the Semantic Gap: A Comparative Survey on Image Tag Assignment, Refinement and Retrieval
Where previous reviews on content-based image retrieval emphasize what can
be seen in an image to bridge the semantic gap, this survey considers what
people tag about an image. A comprehensive treatise of three closely linked
problems, i.e., image tag assignment, refinement, and tag-based image retrieval
is presented. While existing works vary in terms of their targeted tasks and
methodology, they rely on the key functionality of tag relevance, i.e.
estimating the relevance of a specific tag with respect to the visual content
of a given image and its social context. By analyzing what information a
specific method exploits to construct its tag relevance function and how such
information is exploited, this paper introduces a taxonomy to structure the
growing literature, understand the ingredients of the main works, clarify their
connections and differences, and recognize their merits and limitations. For a
head-to-head comparison of state-of-the-art methods, a new experimental
protocol is presented, with training sets containing 10k, 100k and 1m images
and an evaluation on three test sets, contributed by various research groups.
Eleven representative works are implemented and evaluated. Putting all this
together, the survey aims to provide an overview of the past and foster
progress for the near future.
Comment: to appear in ACM Computing Surveys
A Data-Driven Approach for Tag Refinement and Localization in Web Videos
Tagging of visual content is becoming more and more widespread as web-based
services and social networks have popularized tagging functionalities among
their users. These user-generated tags are used to ease browsing and
exploration of media collections, e.g. using tag clouds, or to retrieve
multimedia content. However, not all media are equally tagged by users. With
current systems it is easy to tag a single photo, and even tagging a part of a
photo, like a face, has become common in sites like Flickr and Facebook. On the
other hand, tagging a video sequence is more complicated and time consuming, so
that users just tag the overall content of a video. In this paper we present a
method for automatic video annotation that increases the number of tags
originally provided by users, and localizes them temporally, associating tags
to keyframes. Our approach exploits collective knowledge embedded in
user-generated tags and web sources, and visual similarity of keyframes and
images uploaded to social sites like YouTube and Flickr, as well as web sources
like Google and Bing. Given a keyframe, our method is able to select on the fly
from these visual sources the training exemplars that should be the most
relevant for this test sample, and proceeds to transfer labels across similar
images. Compared to existing video tagging approaches that require training
classifiers for each tag, our system has few parameters, is easy to implement
and can deal with an open vocabulary scenario. We demonstrate the approach on
tag refinement and localization on DUT-WEBV, a large dataset of web videos, and
show state-of-the-art results.
Comment: Preprint submitted to Computer Vision and Image Understanding (CVIU)
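
The label-transfer step described in this abstract (pick the most visually similar exemplars for a keyframe, then transfer their tags) can be illustrated with a minimal nearest-neighbor voting sketch. This is not the authors' implementation: the function names, the cosine similarity measure, and the parameters `k` and `min_votes` are assumptions made for illustration, and the feature vectors are toy values.

```python
import math
from collections import Counter


def cosine_similarity(a, b):
    """Cosine similarity between two equal-length feature vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


def transfer_tags(keyframe_feature, exemplars, k=3, min_votes=2):
    """Hypothetical helper: transfer tags to a keyframe from its k most
    similar exemplars, keeping tags voted for by at least min_votes of them.

    exemplars is a list of (feature_vector, tags) pairs.
    """
    ranked = sorted(
        exemplars,
        key=lambda ex: cosine_similarity(keyframe_feature, ex[0]),
        reverse=True,
    )
    votes = Counter()
    for _, tags in ranked[:k]:
        votes.update(set(tags))  # each exemplar votes at most once per tag
    return sorted(tag for tag, count in votes.items() if count >= min_votes)


# Toy exemplars: (visual feature, user tags)
exemplars = [
    ([1.0, 0.0], ["beach", "sea"]),
    ([0.9, 0.1], ["beach", "sun"]),
    ([0.0, 1.0], ["city"]),
    ([0.8, 0.2], ["sea", "beach"]),
]
print(transfer_tags([1.0, 0.0], exemplars))
```

Because the neighbors are selected per keyframe at query time, a scheme of this kind needs no per-tag classifier and can accept any tag that appears in the exemplar pool, which matches the open-vocabulary property the abstract claims.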
Towards Understanding User Preferences from User Tagging Behavior for Personalization
Personalizing image tags is a relatively new and growing area of research,
and in order to advance this research community, we must review and challenge
the de-facto standard of defining tag importance. We believe that for greater
progress to be made, we must go beyond tags that merely describe objects that
are visually represented in the image, towards more user-centric and subjective
notions such as emotion, sentiment, and preferences.
We focus on the notion of user preferences and show that the order in which users
list tags on images is correlated with the order of preference over the tags that
they provided for the image. While this observation is not completely
surprising, to our knowledge, we are the first to explore this aspect of user
tagging behavior systematically and report empirical results to support this
observation. We argue that this observation can be exploited to help advance
the image tagging (and related) communities.
Our contributions include: (1) conducting a user study demonstrating this
observation, and (2) collecting a dataset with explicitly collected user tag
preferences.
Comment: 6 page
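
One way to quantify the claimed correlation between the order in which a user lists tags and the user's stated preference order is a rank correlation coefficient. The sketch below uses Spearman's rho for rankings without ties; the choice of statistic, the function name, and the example tags are assumptions for illustration, since the abstract does not say which measure the authors used.

```python
def spearman_rho(listing_order, preference_order):
    """Spearman rank correlation between two orderings of the same tags.

    Both arguments are lists containing the same distinct tags. Returns a
    value in [-1, 1]: 1 for identical order, -1 for fully reversed order.
    """
    if len(set(listing_order)) != len(listing_order):
        raise ValueError("tags must be distinct")
    if set(listing_order) != set(preference_order):
        raise ValueError("both orderings must contain the same tags")
    n = len(listing_order)
    if n < 2:
        raise ValueError("need at least two tags")
    preference_rank = {tag: rank for rank, tag in enumerate(preference_order)}
    squared_diffs = sum(
        (rank - preference_rank[tag]) ** 2
        for rank, tag in enumerate(listing_order)
    )
    return 1 - 6 * squared_diffs / (n * (n * n - 1))


# Hypothetical example: tags as listed by a user vs. the same user's
# stated preference order.
listed = ["dog", "park", "sunny", "happy"]
preferred = ["dog", "sunny", "park", "happy"]
print(spearman_rho(listed, preferred))
```

A value near 1 averaged over many users and images would support the observation; a value near 0 would indicate that listing order carries little preference information.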