Dilated Temporal Relational Adversarial Network for Generic Video Summarization
The large number of videos appearing every day makes it increasingly
critical that key information within videos can be extracted and understood in
a very short time. Video summarization, the task of finding the smallest subset
of frames that still conveys the whole story of a given video, is thus of
great significance for improving the efficiency of video understanding. We propose a
novel Dilated Temporal Relational Generative Adversarial Network (DTR-GAN) to
achieve frame-level video summarization. Given a video, DTR-GAN selects the set of
key frames that contain the most meaningful and compact information.
Specifically, DTR-GAN learns a dilated temporal relational generator and a
discriminator with a three-player loss in an adversarial manner. A new dilated
temporal relation (DTR) unit is introduced to enhance the capture of temporal
representations. The generator uses this unit to effectively exploit global
multi-scale temporal context to select key frames and to complement the
commonly used Bi-LSTM. To ensure that summaries capture sufficient key video
representation from a global perspective, rather than a trivially shortened
random sequence, we present a discriminator that learns to enforce both the
information completeness and the compactness of summaries via a three-player loss.
The loss includes the generated summary loss, the random summary loss, and the
real summary (ground-truth) loss, which play important roles in better
regularizing the learned model to obtain useful summaries. Comprehensive
experiments on three public datasets show the effectiveness of the proposed
approach.
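The two core ideas of the abstract — multi-scale dilated temporal aggregation and a discriminator loss over real, generated, and random summaries — can be sketched roughly as follows. This is a minimal illustrative NumPy sketch with fixed (unlearned) aggregation and hypothetical function names, not the paper's actual implementation:

```python
import numpy as np

def dtr_unit(features, dilations=(1, 2, 4)):
    """Illustrative dilated temporal relation: for each frame t, average
    its features with the neighbours at offsets +/-d for several dilation
    rates d, then fuse the scales by averaging.  The paper's unit is
    learned; the fixed averaging here only conveys the multi-scale idea."""
    T, D = features.shape
    scales = []
    for d in dilations:
        ctx = np.zeros_like(features)
        for t in range(T):
            left = features[max(t - d, 0)]
            right = features[min(t + d, T - 1)]
            ctx[t] = (left + features[t] + right) / 3.0
        scales.append(ctx)
    return np.mean(scales, axis=0)  # fuse multi-scale contexts

def three_player_loss(d_real, d_generated, d_random):
    """Discriminator objective over the three summary types: score the
    ground-truth summary as real and both the generated and the random
    summaries as fake (standard GAN cross-entropy form, assumed here)."""
    eps = 1e-8
    return -(np.log(d_real + eps)
             + np.log(1.0 - d_generated + eps)
             + np.log(1.0 - d_random + eps))
```

A perfect discriminator (scoring the real summary 1 and both fakes 0) drives this loss to zero; the random-summary term is what penalizes trivially shortened sequences.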
Effective video summarization approach based on visual attention
Video summarization is applied to reduce redundancy and develop a concise representation of the key frames in a video. More recently, video summaries have been generated through visual attention modeling: the frames that stand out visually are extracted as key frames based on theories of human attention. Such visual attention schemes have proven effective for video summarization; nevertheless, their high computational cost restricts their usability in everyday situations. In this context, we propose a key frame extraction (KFE) method built on an efficient and accurate visual attention model. The computational effort is minimized by computing dynamic visual saliency from the temporal gradient instead of traditional optical flow techniques. In addition, an efficient technique based on the discrete cosine transform is utilized for static visual salience. The dynamic and static visual attention measures are merged by means of a non-linear weighted fusion technique. Results of the system are compared with several existing state-of-the-art techniques in terms of accuracy. The experimental results of our proposed model indicate its efficiency and high quality in terms of the extracted key frames.

Qatar University - No. IRCC-2021-010
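The pipeline the abstract describes — temporal-gradient dynamic saliency, DCT-based static saliency, non-linear weighted fusion, and key-frame selection — might look roughly like the NumPy sketch below. The concrete saliency models (an image-signature-style DCT sign map, absolute frame differencing), the quadratic fusion, and the top-k selection rule are all assumptions for illustration, not the authors' implementation:

```python
import numpy as np

def dct_matrix(n):
    """Orthonormal DCT-II basis matrix (rows are basis vectors)."""
    k = np.arange(n)[:, None]
    i = np.arange(n)[None, :]
    m = np.sqrt(2.0 / n) * np.cos(np.pi * (2 * i + 1) * k / (2 * n))
    m[0, :] /= np.sqrt(2.0)
    return m

def static_saliency(frame):
    """DCT-signature-style static saliency: reconstruct the frame from
    the sign of its DCT coefficients and square the result (one known
    DCT-based saliency model; the paper's exact model may differ)."""
    mr, mc = dct_matrix(frame.shape[0]), dct_matrix(frame.shape[1])
    coeffs = mr @ frame @ mc.T            # forward 2-D DCT
    recon = mr.T @ np.sign(coeffs) @ mc   # inverse DCT of the sign map
    return recon ** 2

def dynamic_saliency(prev_frame, frame):
    """Temporal-gradient motion cue: absolute frame difference, a cheap
    stand-in for optical flow."""
    return np.abs(frame - prev_frame)

def fused_attention(prev_frame, frame, w=0.5):
    """Non-linear weighted fusion of the normalized saliency maps
    (quadratic weighting assumed); returns a per-frame attention score."""
    s = static_saliency(frame)
    d = dynamic_saliency(prev_frame, frame)
    s = s / (s.max() + 1e-8)
    d = d / (d.max() + 1e-8)
    return float((w * s ** 2 + (1 - w) * d ** 2).mean())

def key_frames(frames, top_k=2):
    """Score every frame against its predecessor and return the indices
    of the top-k attention scores (simple selection rule, assumed)."""
    scores = [fused_attention(frames[t - 1], frames[t])
              for t in range(1, len(frames))]
    order = np.argsort(scores)[::-1][:top_k]
    return sorted(int(i) + 1 for i in order)
```

The frame-difference dynamic term is what keeps the cost low relative to optical flow: it needs one subtraction per pixel rather than an iterative flow estimate.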