Generating natural language tags for video information management

BZ Yao; J Pustejovsky; JF Allen; MP Marcus; MUG Khan; Muhammad Usman Ghani Khan; P Baiget; RR Vallacher; W Kim; WC Hu; Y Yang; Yoshihiko Gotoh

Publication date: 14 February 2017
Publisher: Springer Science and Business Media LLC
Abstract

This exploratory work is concerned with the generation of natural language descriptions that can be used in video retrieval applications. It goes a step beyond keyword-based tagging because it captures the relations between keywords associated with videos. First, we prepare hand annotations consisting of descriptions of video segments drawn from a TREC Video dataset. Analysis of this data offers insights into what humans find interesting in video content. Second, we develop a framework for creating smooth and coherent descriptions of video streams. It builds on conventional image processing techniques that extract high-level features from individual video frames. A natural language description is then produced from these high-level features. Although the feature extraction processes are error-prone at various levels, we explore approaches to combining their outputs into a coherent, smooth and well-phrased description by incorporating spatial and temporal information. Evaluation is carried out by calculating ROUGE scores between human-annotated and machine-generated descriptions. Further, we introduce a task-based evaluation by human subjects, which provides a qualitative evaluation of the generated descriptions.
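The ROUGE comparison mentioned in the abstract can be illustrated with a minimal sketch of ROUGE-N, which measures n-gram overlap between a human reference and a machine-generated description. This is not the authors' evaluation code: the whitespace tokenisation, the function names `ngrams` and `rouge_n`, and the two example sentences are assumptions made for illustration only, and the abstract does not say which ROUGE variants were used.

```python
from collections import Counter


def ngrams(tokens, n):
    """Count the n-grams (as tuples) in a list of tokens."""
    return Counter(tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1))


def rouge_n(reference, candidate, n=1):
    """Return ROUGE-N recall, precision and F1 for two plain-text strings.

    Assumption: lowercased whitespace tokenisation, with no stemming or
    stop-word removal.
    """
    ref = ngrams(reference.lower().split(), n)
    cand = ngrams(candidate.lower().split(), n)

    # Clipped overlap: each n-gram counts at most as often as it
    # appears in both the reference and the candidate.
    overlap = sum((ref & cand).values())
    ref_total = sum(ref.values())
    cand_total = sum(cand.values())

    recall = overlap / ref_total if ref_total else 0.0
    precision = overlap / cand_total if cand_total else 0.0
    denom = precision + recall
    f1 = 2 * precision * recall / denom if denom else 0.0
    return {"recall": recall, "precision": precision, "f1": f1}


# Hypothetical descriptions, not taken from the paper's dataset.
human = "a man is sitting at a table"
machine = "a man is standing near a table"

unigram_scores = rouge_n(human, machine, n=1)
bigram_scores = rouge_n(human, machine, n=2)
```

Here `unigram_scores` reflects shared words and `bigram_scores` reflects shared word pairs, so the second is stricter about word order. Established ROUGE toolkits add options such as stemming and longest-common-subsequence variants that this sketch omits.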

Available Versions

Crossref

info:doi/10.1007%2Fs00138-017-...

Last time updated on 23/03/2019