Large Scale Retrieval and Generation of Image Descriptions

Berg, Alexander C.; Berg, Tamara L.; Choi, Yejin; Daumé, Hal; Dodge, Jesse; Goyal, Amit; Han, Xufeng; Kulkarni, Girish; Kuznetsova, Polina; Mensch, Alyssa; Mitchell, Margaret; Ordonez, Vicente; Stratos, Karl; Yamaguchi, Kota

Publication date: 1 January 2016

Abstract

What is the story of an image? What is the relationship between pictures, language, and the information we can extract using state-of-the-art computational recognition systems? In an attempt to address both of these questions, we explore methods for retrieving and generating natural language descriptions for images. Ideally, we would like our generated textual descriptions (captions) both to sound like a person wrote them and to remain true to the image content. To do this, we develop data-driven approaches for image description generation, using retrieval-based techniques to gather either (a) whole captions associated with a visually similar image, or (b) relevant bits of text (phrases) from a large collection of image + description pairs. In the case of (b), we develop optimization algorithms to merge the retrieved phrases into valid natural language sentences. The end result is two simple but effective methods for harnessing the power of big data to produce image captions that are altogether more general, relevant, and human-like than previous attempts.
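As a rough illustration of approach (a), the sketch below transfers the caption of the most visually similar image in a captioned collection. It is a toy example under stated assumptions, not the authors' implementation: the feature vectors, the cosine similarity measure, and the names `cosine_similarity` and `retrieve_caption` are all hypothetical choices made for this illustration, and the abstract does not specify which image features or similarity measure are used.

```python
import math


def cosine_similarity(a, b):
    """Cosine similarity between two equal-length feature vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0  # treat an all-zero vector as dissimilar to everything
    return dot / (norm_a * norm_b)


def retrieve_caption(query_features, collection):
    """Return the caption paired with the most similar image.

    `collection` is a list of (features, caption) pairs. The features are
    assumed to come from some image descriptor; that choice is hypothetical.
    """
    if not collection:
        raise ValueError("collection must not be empty")
    _, best_caption = max(
        collection,
        key=lambda pair: cosine_similarity(query_features, pair[0]),
    )
    return best_caption


# Toy collection with made-up 3-dimensional features and captions.
collection = [
    ([0.9, 0.1, 0.0], "A dog runs across a grassy field."),
    ([0.1, 0.8, 0.2], "A sailboat on a calm lake."),
    ([0.0, 0.2, 0.9], "A city street at night."),
]

print(retrieve_caption([0.8, 0.2, 0.1], collection))
```

With a large enough collection, the nearest neighbor is more likely to show similar content, which is why this style of retrieval depends on the scale of the data. Approach (b) would instead retrieve phrases and compose them, a step this sketch does not attempt.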

Available Versions

Carolina Digital Repository

cdr.lib.unc.edu:w3763c49w

Last time updated on 23/04/2020