1,566 research outputs found
Recommended from our members
Improving tag recommendation using social networks
In this paper we address the task of recommending additional tags to partially annotated media objects, in our case images. We propose an extendable framework that can recommend tags using a combination of different personalised and collective contexts. We combine information from four contexts: (1) all the photos in the system, (2) a user's own photos, (3) the photos of a user's social contacts, and (4) the photos posted in the groups of which a user is a member. Variants of methods (1) and (2) have been proposed in previous work, but the use of (3) and (4) is novel.
For each of the contexts we use the same probabilistic model and Borda Count based aggregation approach to generate recommendations from different contexts into a unified ranking of recommended tags. We evaluate our system using a large set of real-world data from Flickr. We show that by using personalised contexts we can significantly improve tag recommendation compared to using collective knowledge alone. We also analyse our experimental results to explore the capabilities of our system with respect to a user's social behaviour
Multi-Modal Multi-Scale Deep Learning for Large-Scale Image Annotation
Image annotation aims to annotate a given image with a variable number of
class labels corresponding to diverse visual concepts. In this paper, we
address two main issues in large-scale image annotation: 1) how to learn a rich
feature representation suitable for predicting a diverse set of visual concepts
ranging from object, scene to abstract concept; 2) how to annotate an image
with the optimal number of class labels. To address the first issue, we propose
a novel multi-scale deep model for extracting rich and discriminative features
capable of representing a wide range of visual concepts. Specifically, a novel
two-branch deep neural network architecture is proposed which comprises a very
deep main network branch and a companion feature fusion network branch designed
for fusing the multi-scale features computed from the main branch. The deep
model is also made multi-modal by taking noisy user-provided tags as model
input to complement the image input. For tackling the second issue, we
introduce a label quantity prediction auxiliary task to the main label
prediction task to explicitly estimate the optimal label number for a given
image. Extensive experiments are carried out on two large-scale image
annotation benchmark datasets and the results show that our method
significantly outperforms the state-of-the-art.Comment: Submited to IEEE TI
Socializing the Semantic Gap: A Comparative Survey on Image Tag Assignment, Refinement and Retrieval
Where previous reviews on content-based image retrieval emphasize on what can
be seen in an image to bridge the semantic gap, this survey considers what
people tag about an image. A comprehensive treatise of three closely linked
problems, i.e., image tag assignment, refinement, and tag-based image retrieval
is presented. While existing works vary in terms of their targeted tasks and
methodology, they rely on the key functionality of tag relevance, i.e.
estimating the relevance of a specific tag with respect to the visual content
of a given image and its social context. By analyzing what information a
specific method exploits to construct its tag relevance function and how such
information is exploited, this paper introduces a taxonomy to structure the
growing literature, understand the ingredients of the main works, clarify their
connections and difference, and recognize their merits and limitations. For a
head-to-head comparison between the state-of-the-art, a new experimental
protocol is presented, with training sets containing 10k, 100k and 1m images
and an evaluation on three test sets, contributed by various research groups.
Eleven representative works are implemented and evaluated. Putting all this
together, the survey aims to provide an overview of the past and foster
progress for the near future.Comment: to appear in ACM Computing Survey
Exploiting multimedia in creating and analysing multimedia Web archives
The data contained on the web and the social web are inherently multimedia and consist of a mixture of textual, visual and audio modalities. Community memories embodied on the web and social web contain a rich mixture of data from these modalities. In many ways, the web is the greatest resource ever created by human-kind. However, due to the dynamic and distributed nature of the web, its content changes, appears and disappears on a daily basis. Web archiving provides a way of capturing snapshots of (parts of) the web for preservation and future analysis. This paper provides an overview of techniques we have developed within the context of the EU funded ARCOMEM (ARchiving COmmunity MEMories) project to allow multimedia web content to be leveraged during the archival process and for post-archival analysis. Through a set of use cases, we explore several practical applications of multimedia analytics within the realm of web archiving, web archive analysis and multimedia data on the web in general
Recommended from our members
Exploiting Social Networks for Recommendation in Online Image Sharing Systems
This thesis aims to demonstrate the distinct and so far little explored value of knowledge derived from social interaction data within large web-scale image sharing systems like Flickr, Picasa Web, Facebook and others for image recommendation. I have shown how such systems can be significantly improved through personalisation that takes into account the social context of users by modelling their interactions by mining data, building and evaluating systems that incorporate this information. These improvements allow users to search and browse large online image collections more quickly and to find results that more accurately match their personal information needs when compared to existing methods.
Traditional information retrieval and recommendation datasets are contrived to provide stable baselines for researchers to compare against but they rarely accurately reflect the media systems users tend to encounter online. The online photo sharing site Flickr provides rich and varied data that can be used by researchers to analyse and understand usersâ interactions with images and with each other. I analyse such data by modelling the connections between users as multigraphs and exploiting the resultant topologies to produce features that can be used to train recommender systems based on machine learnt classifiers.
The core contributions of this work include insight into the nature of very large-scale on- line photo collections and the communities that form around them, as well as the dynamic nature of the interactions users have with their media. I do this through the rigorous evaluation of both a probabilistic tag recommendation system and a machine learnt classifier trained to mimic user decisions regarding image preference. These implementations focus on treating the user as both a unique individual and as a member of potentially many explicit and implicit communities. I also explore the validity of the Flickr âFavouriteâ feedback label as proxy for user preference, which is particularly important when considering other analogous media systems to which my findings transfer. My conclusions highlight how vital both
social context information and the understanding of user behaviour are for online image sharing systems.
In the field of information retrieval the diverse nature of users is often forgotten in the hunt for increases in esoteric performance metrics. This thesis places them back at the centre of the problem of multimedia information retrieval and shows how their variety and uniqueness are valuable traits that can be exploited to augment and improve the experience of browsing and searching shared online image collections
Beautiful and damned. Combined effect of content quality and social ties on user engagement
User participation in online communities is driven by the intertwinement of
the social network structure with the crowd-generated content that flows along
its links. These aspects are rarely explored jointly and at scale. By looking
at how users generate and access pictures of varying beauty on Flickr, we
investigate how the production of quality impacts the dynamics of online social
systems. We develop a deep learning computer vision model to score images
according to their aesthetic value and we validate its output through
crowdsourcing. By applying it to over 15B Flickr photos, we study for the first
time how image beauty is distributed over a large-scale social system.
Beautiful images are evenly distributed in the network, although only a small
core of people get social recognition for them. To study the impact of exposure
to quality on user engagement, we set up matching experiments aimed at
detecting causality from observational data. Exposure to beauty is
double-edged: following people who produce high-quality content increases one's
probability of uploading better photos; however, an excessive imbalance between
the quality generated by a user and the user's neighbors leads to a decline in
engagement. Our analysis has practical implications for improving link
recommender systems.Comment: 13 pages, 12 figures, final version published in IEEE Transactions on
Knowledge and Data Engineering (Volume: PP, Issue: 99
- âŠ