Improving Sequential Determinantal Point Processes for Supervised Video Summarization
It is now much easier than ever to produce videos. While the
ubiquitous video data is a great source for information discovery and
extraction, the computational challenges are unparalleled. Automatically
summarizing the videos has become a substantial need for browsing, searching,
and indexing visual content. This paper is in the vein of supervised video
summarization using sequential determinantal point process (SeqDPP), which
models diversity by a probabilistic distribution. We improve this model in two
respects. In terms of learning, we propose a large-margin algorithm to address the
exposure bias problem in SeqDPP. In terms of modeling, we design a new
probabilistic distribution such that, when it is integrated into SeqDPP, the
resulting model accepts user input about the expected length of the summary.
Moreover, we also significantly extend a popular video summarization dataset by
1) more egocentric videos, 2) dense user annotations, and 3) a refined
evaluation scheme. We conduct extensive experiments on this dataset (about 60
hours of videos in total) and compare our approach to several competitive
baselines.
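The diversity modeling behind (Seq)DPP can be illustrated with a minimal sketch (toy frame features and a plain Gram-matrix kernel of my own choosing, not the paper's learned model): the unnormalized probability of selecting a subset of frames is the determinant of the similarity kernel restricted to that subset, which drives the score of redundant selections toward zero.

```python
import numpy as np

# Toy frame features (rows = frames); similar frames get similar vectors.
feats = np.array([
    [1.0, 0.0],
    [0.99, 0.1],   # near-duplicate of frame 0
    [0.0, 1.0],    # visually distinct frame
], dtype=float)

# DPP kernel L: here simply the Gram (similarity) matrix over frames.
L = feats @ feats.T

def dpp_score(subset):
    """Unnormalized DPP probability: det of the kernel restricted to subset."""
    idx = np.ix_(subset, subset)
    return np.linalg.det(L[idx])

# A diverse pair scores much higher than a redundant pair.
print(dpp_score([0, 1]))  # near 0: frames 0 and 1 are near-duplicates
print(dpp_score([0, 2]))  # 1.0: frames 0 and 2 are orthogonal
```

Geometrically, the determinant is the squared volume spanned by the selected feature vectors, so near-parallel (redundant) frames collapse the volume, which is exactly the diversity pressure a summary needs.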
Summarizing Videos with Attention
In this work we propose a novel method for supervised, keyshots based video
summarization by applying a conceptually simple and computationally efficient
soft self-attention mechanism. Current state-of-the-art methods leverage
bi-directional recurrent networks such as BiLSTM combined with attention. These
networks are complex to implement and computationally demanding compared to
fully connected networks. To that end we propose a simple, self-attention based
network for video summarization which performs the entire sequence to sequence
transformation in a single feed forward pass and single backward pass during
training. Our method sets new state-of-the-art results on two benchmarks,
TvSum and SumMe, commonly used in this domain. Comment: Presented at the ACCV2018 AIU2018 workshop.
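The single-pass soft self-attention the abstract describes can be sketched in a few lines of NumPy (a generic scaled dot-product formulation with randomly initialized weights, not the authors' trained network): each output frame representation is a softmax-weighted mix of all input frames, computed in one feed-forward pass with no recurrence.

```python
import numpy as np

def softmax(x, axis=-1):
    e = np.exp(x - x.max(axis=axis, keepdims=True))  # stable softmax
    return e / e.sum(axis=axis, keepdims=True)

def self_attention(X, Wq, Wk, Wv):
    """One soft self-attention pass over a frame-feature sequence X (T x d)."""
    Q, K, V = X @ Wq, X @ Wk, X @ Wv
    A = softmax(Q @ K.T / np.sqrt(K.shape[-1]))  # T x T attention weights
    return A @ V  # each output row attends over the entire sequence at once

rng = np.random.default_rng(0)
T, d = 5, 8                       # 5 frames, 8-dim features (toy sizes)
X = rng.normal(size=(T, d))
Wq, Wk, Wv = (rng.normal(size=(d, d)) for _ in range(3))
out = self_attention(X, Wq, Wk, Wv)
print(out.shape)  # (5, 8)
```

Compared with a BiLSTM, there is no sequential dependency between time steps here, which is what makes the whole sequence-to-sequence transformation a single matrix-multiply pipeline in both the forward and backward pass.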
Towards key-frame extraction methods for 3D video: a review
The increasing rate of creation and use of 3D video content leads to a pressing need for methods capable of lowering
the cost of 3D video searching, browsing and indexing operations, with improved content selection performance.
Video summarisation methods specifically tailored for 3D video content fulfil these requirements. This paper presents
a review of the state-of-the-art of a crucial component of 3D video summarisation algorithms: the key-frame
extraction methods. The methods reviewed cover 3D video key-frame extraction as well as shot boundary detection
methods specific for use in 3D video. The performance metrics used to evaluate the key-frame extraction methods
and the summaries derived from those key-frames are presented and discussed. The applications of these methods
are also presented and discussed, followed by an exposition about current research challenges on 3D video
summarisation methods.
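As a point of reference for the methods the review surveys, a generic key-frame extraction baseline can be sketched as follows (a simple feature-difference heuristic of my own, not any specific 3D method from the review): greedily keep a frame whenever its feature distance from the last retained key frame exceeds a threshold, which also implicitly reacts to shot boundaries.

```python
import numpy as np

def extract_key_frames(features, threshold=0.5):
    """Greedy key-frame selection: retain frame t when its distance from the
    last key frame exceeds `threshold` (a generic baseline heuristic)."""
    keys = [0]  # always keep the first frame
    for t in range(1, len(features)):
        if np.linalg.norm(features[t] - features[keys[-1]]) > threshold:
            keys.append(t)
    return keys

# Toy 1-D features: two "shots" with a jump between frames 2 and 3.
feats = np.array([[0.0], [0.05], [0.1], [2.0], [2.05]])
print(extract_key_frames(feats))  # [0, 3]
```

Real 3D-video methods replace the per-frame feature and distance here with depth- or disparity-aware representations, which is precisely the design space the review maps out.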