Generating Diverse and Meaningful Captions: Unsupervised Specificity Optimization for Image Captioning

Authors: Kelleher John D.; Lindh Annika; Mahalunkar Abhijit; Ross Robert; Salton Giancarlo
Publication venue: Dublin Institute of Technology
Publication date: 01/01/2018

Image Captioning is a task that requires models to acquire a multi-modal understanding of the world and to express this understanding in natural language text. While the state-of-the-art for this task has rapidly improved in terms of n-gram metrics, these models tend to output the same generic captions for similar images. In this work, we address this limitation and train a model that generates more diverse and specific captions through an unsupervised training approach that incorporates a learning signal from an Image Retrieval model. We summarize previous results and improve the state-of-the-art on caption diversity and novelty. We make our source code publicly available online: https://github.com/AnnikaLindh/Diverse_and_Specific_Image_Captionin

Arrow@TUDublin

Generating Diverse and Meaningful Captions

Authors: A Karpathy; I Goodfellow; O Russakovsky; O Vinyals; P Anderson; R Bernardi; S Hochreiter
Publication venue: 'Springer Science and Business Media LLC'
Publication date: 01/01/2018

Image Captioning is a task that requires models to acquire a multi-modal understanding of the world and to express this understanding in natural language text. While the state-of-the-art for this task has rapidly improved in terms of n-gram metrics, these models tend to output the same generic captions for similar images. In this work, we address this limitation and train a model that generates more diverse and specific captions through an unsupervised training approach that incorporates a learning signal from an Image Retrieval model. We summarize previous results and improve the state-of-the-art on caption diversity and novelty. We make our source code publicly available online. Comment: Accepted for presentation at The 27th International Conference on Artificial Neural Networks (ICANN 2018)

arXiv.org e-Print Archive

Crossref

Arrow@TUDublin