Image Captioning is a task that requires models to acquire a multi-modal
understanding of the world and to express this understanding in natural
language text. While the state-of-the-art for this task has rapidly improved in
terms of n-gram metrics, these models tend to output the same generic captions
for similar images. In this work, we address this limitation and train a model
that generates more diverse and specific captions through an unsupervised
training approach that incorporates a learning signal from an Image Retrieval
model. We summarize previous results and improve the state-of-the-art on
caption diversity and novelty. We make our source code publicly available
online.

Comment: Accepted for presentation at The 27th International Conference on
Artificial Neural Networks (ICANN 2018)