Search CORE

461,896 research outputs found

Object-Centric Unsupervised Image Captioning

Author: Cao Xuefei
Lim Ser-Nam
Meng Zihang
Shah Ashish
Yang David
Publication venue
Publication date: 19/07/2022
Field of study

Image captioning is a longstanding problem in the field of computer vision and natural language processing. To date, researchers have produced impressive state-of-the-art performance in the age of deep learning. Most of these state-of-the-art, however, requires large volume of annotated image-caption pairs in order to train their models. When given an image dataset of interests, practitioner needs to annotate the caption for each image in the training set and this process needs to happen for each newly collected image dataset. In this paper, we explore the task of unsupervised image captioning which utilizes unpaired images and texts to train the model so that the texts can come from different sources than the images. A main school of research on this topic that has been shown to be effective is to construct pairs from the images and texts in the training set according to their overlap of objects. Unlike in the supervised setting, these constructed pairings are however not guaranteed to have fully overlapping set of objects. Our work in this paper overcomes this by harvesting objects corresponding to a given sentence from the training set, even if they don't belong to the same image. When used as input to a transformer, such mixture of objects enables larger if not full object coverage, and when supervised by the corresponding sentence, produced results that outperform current state of the art unsupervised methods by a significant margin. Building upon this finding, we further show that (1) additional information on relationship between objects and attributes of objects also helps in boosting performance; and (2) our method also extends well to non-English image captioning, which usually suffers from a scarcer level of annotations. Our findings are supported by strong empirical results. Our code is available at https://github.com/zihangm/obj-centric-unsup-caption.Comment: ECCV 202

arXiv.org e-Print Archive

A New Geometric Approach to Latent Topic Modeling and Discovery

Author: Ding Weicong
Ishwar Prakash
Rohban Mohammad H.
Saligrama Venkatesh
Publication venue: 'Institute of Electrical and Electronics Engineers (IEEE)'
Publication date: 04/01/2013
Field of study

A new geometrically-motivated algorithm for nonnegative matrix factorization is developed and applied to the discovery of latent "topics" for text and image "document" corpora. The algorithm is based on robustly finding and clustering extreme points of empirical cross-document word-frequencies that correspond to novel "words" unique to each topic. In contrast to related approaches that are based on solving non-convex optimization problems using suboptimal approximations, locally-optimal methods, or heuristics, the new algorithm is convex, has polynomial complexity, and has competitive qualitative and quantitative performance compared to the current state-of-the-art approaches on synthetic and real-world datasets.Comment: This paper was submitted to the IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP) 2013 on November 30, 201

arXiv.org e-Print Archive

Crossref

Evaluating a workspace's usefulness for image retrieval

Author: C.J. van Rijsbergen
D.G. Hendry
G. Miller
G. Salton
Jana Urban
Joemon M. Jose
M. Beaulieu
M. Nakazato
P. Ingwersen
X.S. Zhou
Publication venue: 'Springer Science and Business Media LLC'
Publication date: 01/03/2006
Field of study

Image searching is a creative process. We have proposed a novel image retrieval system that supports creative search sessions by allowing the user to organise their search results on a workspace. The workspace’s usefulness is evaluated in a task-oriented and user-centred comparative experiment, involving design professionals and several types of realistic search tasks. In particular, we focus on its effect on task conceptualisation and query formulation. A traditional relevance feedback system serves as a baseline. The results of this study show that the workspace is more useful in terms of both of the above aspects and that the proposed approach leads to a more effective and enjoyable search experience. This paper also highlights the influence of tasks on the users’ search and organisation strategy

Crossref

Enlighten

Automatic tagging and geotagging in video collections and communities

Author: Jones Gareth J.F.
Larson Martha
Serdyukov Pavel
Soleymani Mohammad
Publication venue: 'Association for Computing Machinery (ACM)'
Publication date: 01/04/2011
Field of study

Automatically generated tags and geotags hold great promise to improve access to video collections and online communi- ties. We overview three tasks offered in the MediaEval 2010 benchmarking initiative, for each, describing its use scenario, definition and the data set released. For each task, a reference algorithm is presented that was used within MediaEval 2010 and comments are included on lessons learned. The Tagging Task, Professional involves automatically matching episodes in a collection of Dutch television with subject labels drawn from the keyword thesaurus used by the archive staff. The Tagging Task, Wild Wild Web involves automatically predicting the tags that are assigned by users to their online videos. Finally, the Placing Task requires automatically assigning geo-coordinates to videos. The specification of each task admits the use of the full range of available information including user-generated metadata, speech recognition transcripts, audio, and visual features

Irish Universities

DCU Online Research Access Service

BCS SGAI SMA 2013: the BCS SGAI workshop on social media analysis

Author
Publication venue: M. Jeusfeld
Publication date: 01/01/2013
Field of study

Portsmouth University Research Portal (Pure)

Dorset Coast Digital Archive Project - Audience research and monitoring survey.

Author: Calver S.J.
Johnston Nicky
Publication venue: Bournemouth University
Publication date: 01/10/2004
Field of study

Bournemouth University Research Online