4,629 research outputs found

    Requirements analysis of the VoD application using the tools in TRADE

    This report contains a specification of requirements for a video-on-demand (VoD) application developed at Belgacom, used as a trial application in the 2RARE project. The specification contains three parts: an informal specification in natural language; a semiformal specification consisting of a number of diagrams intended to illustrate the informal specification; and a formal specification that makes the requirements on the desired software system precise. The informal specification is structured so that it resembles official specification documents conforming to standards such as those of the IEEE or ESA. The semiformal specification uses some of the tools from a requirements engineering toolkit called TRADE (Toolkit for Requirements And Design Engineering). The purpose of TRADE is to combine the best ideas in current structured and object-oriented analysis and design methods within a traditional systems engineering framework. In the case of the VoD system, the systems engineering framework is useful because it provides techniques for allocation and flowdown of system functions to components. TRADE consists of semiformal techniques taken from structured and object-oriented analysis as well as a formal specification language that provides constructs corresponding to the semiformal constructs. The formal specification language used in TRADE is LCM (Language for Conceptual Modeling), a syntactically sugared version of order-sorted dynamic logic with equality. The purpose of this report is to illustrate and validate the TRADE/LCM approach in the specification of distributed, communication-intensive systems.
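
    As a rough illustration of this style of formal specification (not drawn from the report itself, and using hypothetical predicate and action names), a dynamic-logic requirement for a VoD system could read $\mathit{pending}(r) \rightarrow [\mathit{start\_stream}(r)]\,\mathit{playing}(r)$: whenever a request $r$ is pending, every terminating execution of the action $\mathit{start\_stream}(r)$ leaves the system in a state where $\mathit{playing}(r)$ holds. In LCM, per the description above, such formulas would additionally carry sorts on their terms and could use equality between them.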

    The DICEMAN description schemes for still images and video sequences

    To address the problem of visual content description, two Description Schemes (DSs) developed within the context of a European ACTS project known as DICEMAN are presented. The DSs, designed by analogy with well-known tools for document description, describe both the structure and semantics of still images and video sequences. The overall structure of both DSs, including the various sub-DSs and descriptors (Ds) of which they are composed, is described. In each case, the hierarchical sub-DS for describing structure can be constructed using automatic (or semi-automatic) image/video analysis tools. The hierarchical sub-DSs for describing semantics, however, are constructed by a user. The integration of the two DSs into a video indexing application currently under development in DICEMAN is also briefly described.
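
    Below is a minimal sketch, in Python, of the kind of hierarchical structure such a description scheme implies: a tree of segment descriptions, each carrying low-level descriptors (Ds) and optional sub-segments. The class and field names are illustrative assumptions, not the actual DICEMAN (or MPEG-7) schema.

    from dataclasses import dataclass, field
    from typing import Dict, List

    @dataclass
    class SegmentDescription:
        """One node of a hypothetical hierarchical description scheme."""
        segment_id: str
        descriptors: Dict[str, object] = field(default_factory=dict)   # e.g. colour, motion descriptors
        children: List["SegmentDescription"] = field(default_factory=list)  # sub-segments

        def add_child(self, child: "SegmentDescription") -> None:
            self.children.append(child)

    # A video description: shots nested under the video, regions nested under a shot.
    video = SegmentDescription("video_0")
    shot = SegmentDescription("shot_3", descriptors={"dominant_colour": (34, 80, 120)})
    region = SegmentDescription("shot_3_region_1", descriptors={"label": "player"})
    shot.add_child(region)
    video.add_child(shot)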

    Audio-visual football video analysis, from structure detection to attention analysis

    Sport video is an important video genre. Content-based sports video analysis attracts great interest from both industry and academia. A sports video is characterised by repetitive temporal structures, relatively plain content, and strong spatio-temporal variations, such as quick camera switches and swift local motions. Specific techniques are therefore needed for content-based sports video analysis that exploit these characteristics. For an efficient and effective sports video analysis system, there are three fundamental questions: (1) what are the key stories in sports videos; (2) what arouses viewers' interest; and (3) how can game highlights be identified. This thesis is developed around these questions. We approach them from two different perspectives, and in turn three research contributions are presented, namely, replay detection, attack temporal structure decomposition, and attention-based highlight identification. Replay segments convey the most important content in sports videos, so detecting them is an efficient way to collect game highlights. However, a replay is an artefact of editing, and its production improves with advances in video editing tools. The composition of a replay is complex, including logo transitions, slow motion, viewpoint switches and normal-speed video clips. Since logo transition clips are pervasive in game collections of FIFA World Cup 2002, FIFA World Cup 2006 and UEFA Championship 2006, we take logo transition detection as an effective replacement for replay detection. A two-pass system was developed, consisting of a five-layer AdaBoost classifier and logo template matching over the entire video. The five-layer AdaBoost classifier uses shot duration, average game pitch ratio, average motion, sequential colour histogram and shot frequency between two neighbouring logo transitions to filter logo transition candidates. Subsequently, a logo template is constructed and employed to find all logo transition sequences. The precision and recall of this system for replay detection are both 100% on a five-game evaluation collection. An attack structure is a team competition for a score; it is therefore a conceptually fundamental unit of a football video as well as of other sports videos. We review the literature on content-based temporal structures, such as the play-break structure, and develop a three-step system for automatic attack structure decomposition. Four content-based shot classes, namely play, focus, replay and break, were identified from low-level visual features. A four-state hidden Markov model was trained to model the transition process among these shot classes. Since attack structures are the longest repetitive temporal unit in a sports video, a suffix tree is proposed to find the longest repeated substring in the label sequence of shot-class transitions. The occurrences of this substring are regarded as the kernel of an attack hidden Markov process, so the decomposition of attack structures becomes a boundary-likelihood comparison between two Markov chains.
    Highlights are what attract notice, and attention is a psychological measure of "notice". A brief survey of the psychological background of attention, attention estimation from visual and auditory cues, and multi-modal attention fusion is presented. We propose two attention models for sports video analysis, namely, the role-based attention model and the multiresolution autoregressive framework. The role-based attention model is based on the perception structure during video watching; it removes reflection bias among modality salience signals and combines these signals by reflectors. The multiresolution autoregressive (MAR) framework treats salience signals as a group of smooth random processes that follow a similar trend but are corrupted by noise, and estimates a noise-free signal from these coarse, noisy observations by multiresolution analysis. Related algorithms are developed, such as event segmentation on a MAR tree and real-time event detection. Experiments show that these attention-based approaches can find goal events with high precision. Moreover, the results of MAR-based highlight detection on the final games of the 2002 and 2006 FIFA World Cups are highly similar to the highlights professionally labelled by the BBC and FIFA.
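
    As a minimal sketch of the repetition-finding step described above: given a shot-class label sequence, find its longest repeated substring. A naive suffix-sorting approach stands in here for the thesis's suffix tree, and the single-letter label alphabet is an illustrative assumption.

    # Shot classes encoded as single letters, e.g. P=play, F=focus, R=replay, B=break.
    def longest_repeated_substring(labels: str) -> str:
        suffixes = sorted(labels[i:] for i in range(len(labels)))
        best = ""
        for a, b in zip(suffixes, suffixes[1:]):
            # common-prefix length of two lexicographically adjacent suffixes
            k = 0
            while k < min(len(a), len(b)) and a[k] == b[k]:
                k += 1
            if k > len(best):
                best = a[:k]
        return best

    print(longest_repeated_substring("PFPRBPFPRBPB"))  # -> 'PFPRBP'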

    Automatic Recognition of Film Genres

    Film genres in digital video can be detected automatically. In a three-step approach, we first analyze the syntactic properties of digital films: color statistics, cut detection, camera motion, object motion and audio. In a second step, we use these statistics to derive, at a more abstract level, film-style attributes such as camera panning and zooming, speech and music. These are distinguishing properties for film genres, e.g. newscasts vs. sports vs. commercials. In the third and final step, we map the detected style attributes to film genres. Algorithms for the three steps are presented in detail, and we report on initial experience with real videos. Our goal is to automatically classify the large body of existing video for easier access in digital video-on-demand databases.
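
    As a hedged sketch of one "syntactic" property from the first step, the snippet below flags hard cuts from colour-histogram differences between consecutive frames; the bin count and threshold are illustrative assumptions, not the paper's actual detector.

    import numpy as np

    def colour_histogram(frame: np.ndarray, bins: int = 16) -> np.ndarray:
        # frame: H x W x 3 uint8 image; concatenated, normalised per-channel histograms
        hists = [np.histogram(frame[..., c], bins=bins, range=(0, 256))[0] for c in range(3)]
        h = np.concatenate(hists).astype(float)
        return h / h.sum()

    def detect_cuts(frames: list, threshold: float = 0.4) -> list:
        cuts = []
        prev = colour_histogram(frames[0])
        for i, frame in enumerate(frames[1:], start=1):
            cur = colour_histogram(frame)
            # a large L1 jump between normalised histograms suggests a hard cut
            if np.abs(cur - prev).sum() > threshold:
                cuts.append(i)
            prev = cur
        return cuts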

    Symbiosis between the TRECVid benchmark and video libraries at the Netherlands Institute for Sound and Vision

    Audiovisual archives are investing in large-scale digitisation efforts of their analogue holdings and, in parallel, ingesting an ever-increasing amount of born-digital files into their digital storage facilities. Digitisation opens up new access paradigms and boosts re-use of audiovisual content. Query-log analyses show the shortcomings of manual annotation; archives are therefore complementing these annotations by developing novel search engines that automatically extract information from both the audio and visual tracks. Over the past few years, the TRECVid benchmark has developed a novel relationship with the Netherlands Institute for Sound and Vision (NISV) which goes beyond NISV simply providing data and use cases to TRECVid. Prototype and demonstrator systems developed as part of TRECVid are set to become a key driver in improving the quality of search engines at NISV and will ultimately help other audiovisual archives to offer more efficient and more fine-grained access to their collections. This paper reports the experiences of NISV in leveraging the activities of the TRECVid benchmark.

    An Overview of Multimodal Techniques for the Characterization of Sport Programmes

    The problem of content characterization of sports videos is of great interest because sports video appeals to large audiences and its efficient distribution over various networks should contribute to widespread use of multimedia services. In this paper we analyze several techniques proposed in the literature for content characterization of sports videos. We focus this analysis on the typology of the signal (audio, video, text captions, ...) from which the low-level features are extracted. First we consider the techniques based on visual information, then the methods based on audio information, and finally the algorithms based on audio-visual cues used in a multi-modal fashion. This analysis shows that each type of signal carries some peculiar information, and that a multi-modal approach can fully exploit the multimedia information associated with the sports video. Moreover, we observe that the characterization is performed either by considering what happens in a specific time segment, thereby observing the features in a "static" way, or by trying to capture their "dynamic" evolution over time. The effectiveness of each approach depends mainly on the kind of sport it relates to and the type of highlights we are focusing on.
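
    A minimal sketch of the multi-modal idea surveyed here: late fusion of per-segment audio and visual cues into a single highlight score, with temporal smoothing to reflect the "dynamic" view mentioned above. The cue names, weights and window size are illustrative assumptions rather than a method from any of the surveyed papers.

    import numpy as np

    def fuse_highlight_scores(audio_energy, motion_activity,
                              w_audio: float = 0.5, w_visual: float = 0.5,
                              window: int = 5) -> np.ndarray:
        def normalise(x):
            x = np.asarray(x, dtype=float)
            span = x.max() - x.min()
            return (x - x.min()) / span if span > 0 else np.zeros_like(x)

        # late fusion: weighted sum of modality scores on a common [0, 1] scale
        fused = w_audio * normalise(audio_energy) + w_visual * normalise(motion_activity)
        # moving-average smoothing captures the temporal ("dynamic") evolution of the cues
        return np.convolve(fused, np.ones(window) / window, mode="same")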