2 research outputs found

The Steep Road to Happily Ever After: An Analysis of Current Visual Storytelling Models

Authors: Modi Yatri; Parde Natalie
Publication date: 06/04/2019

Visual storytelling is an intriguing and complex task that only recently entered the research arena. In this work, we survey relevant work to date, and conduct a thorough error analysis of three very recent approaches to visual storytelling. We categorize and provide examples of common types of errors, and identify key shortcomings in current work. Finally, we make recommendations for addressing these limitations in the future. Comment: Accepted to the NAACL 2019 Workshop on Shortcomings in Vision and Language (SiVL).

arXiv.org e-Print Archive

FigShare

University of Illinois at Chicago: UIC INDIGO (INtellectual property in DIGital form available online in an Open environment)

Towards an Improved Model for Visual Storytelling

Author: Yatri Manoj Modi (9907812)
Publication date: 03/04/2020

Visual storytelling is an intriguing and complex task that only recently entered the language and vision research arena. The task focuses on generating human-like, coherent, and visually grounded stories from a sequence of images while maintaining the context over these images. In this study I survey recent advances in the field and conduct a thorough error analysis of three approaches to visual storytelling. I categorize and provide examples of common types of errors, and identify key shortcomings in prior work. I then make recommendations for addressing these limitations, and propose an improved model for visual storytelling: a hierarchical encoder-decoder network, with co-attention over the images and their natural language literal descriptions. I assess the performance of this model at generating visual stories. Finally, I experiment with a novel metric, BertScore (Zhang et al., 2019), as an alternative to human evaluation.

University of Illinois at Chicago: UIC INDIGO (INtellectual property in DIGital form available online in an Open environment)