Search CORE

1,270 research outputs found

Event detection in field sports video using audio-visual features and a support vector machine

Author: O'Connor Noel E.
Sadlier David A.
Publication venue: 'Institute of Electrical and Electronics Engineers (IEEE)'
Publication date: 01/10/2005
Field of study

In this paper, we propose a novel audio-visual feature-based framework for event detection in broadcast video of multiple different field sports. Features indicating significant events are selected and robust detectors built. These features are rooted in characteristics common to all genres of field sports. The evidence gathered by the feature detectors is combined by means of a support vector machine, which infers the occurrence of an event based on a model generated during a training phase. The system is tested generically across multiple genres of field sports including soccer, rugby, hockey, and Gaelic football and the results suggest that high event retrieval and content rejection statistics are achievable

Audio-visual football video analysis, from structure detection to attention analysis

Author: Ren Reede
Publication venue
Publication date: 01/01/2008
Field of study

Sport video is an important video genre. Content-based sports video analysis attracts great interest from both industry and academic ﬁelds. A sports video is characterised by repetitive temporal structures, relatively plain contents, and strong spatio-temporal variations, such as quick camera switches and swift local motions. It is necessary to develop speciﬁc techniques for content-based sports video analysis to utilise these characteristics. For an efﬁcient and effective sports video analysis system, there are three fundamental questions: (1) what are key stories for sports videos; (2) what incurs viewer’s interest; and (3) how to identify game highlights. This thesis is developed around these questions. We approached these questions from two different perspectives and in turn three research contributions are presented, namely, replay detection, attack temporal structure decomposition, and attention-based highlight identiﬁcation. Replay segments convey the most important contents in sports videos. It is an efﬁcient approach to collect game highlights by detecting replay segments. However, replay is an artefact of editing, which improves with advances in video editing tools. The composition of replay is complex, which includes logo transitions, slow motions, viewpoint switches and normal speed video clips. Since logo transition clips are pervasive in game collections of FIFA World Cup 2002, FIFA World Cup 2006 and UEFA Championship 2006, we take logo transition detection as an effective replacement of replay detection. A two-pass system was developed, including a ﬁve-layer adaboost classiﬁer and a logo template matching throughout an entire video. The ﬁve-layer adaboost utilises shot duration, average game pitch ratio, average motion, sequential colour histogram and shot frequency between two neighbouring logo transitions, to ﬁlter out logo transition candidates. Subsequently, a logo template is constructed and employed to ﬁnd all transition logo sequences. The precision and recall of this system in replay detection is 100% in a ﬁve-game evaluation collection. An attack structure is a team competition for a score. Hence, this structure is a conceptually fundamental unit of a football video as well as other sports videos. We review the literature of content-based temporal structures, such as play-break structure, and develop a three-step system for automatic attack structure decomposition. Four content-based shot classes, namely, play, focus, replay and break were identiﬁed by low level visual features. A four-state hidden Markov model was trained to simulate transition processes among these shot classes. Since attack structures are the longest repetitive temporal unit in a sports video, a sufﬁx tree is proposed to ﬁnd the longest repetitive substring in the label sequence of shot class transitions. These occurrences of this substring are regarded as a kernel of an attack hidden Markov process. Therefore, the decomposition of attack structure becomes a boundary likelihood comparison between two Markov chains. Highlights are what attract notice. Attention is a psychological measurement of “notice ”. A brief survey of attention psychological background, attention estimation from vision and auditory, and multiple modality attention fusion is presented. We propose two attention models for sports video analysis, namely, the role-based attention model and the multiresolution autoregressive framework. The role-based attention model is based on the perception structure during watching video. This model removes reﬂection bias among modality salient signals and combines these signals by reﬂectors. The multiresolution autoregressive framework (MAR) treats salient signals as a group of smooth random processes, which follow a similar trend but are ﬁlled with noise. This framework tries to estimate a noise-less signal from these coarse noisy observations by a multiple resolution analysis. Related algorithms are developed, such as event segmentation on a MAR tree and real time event detection. The experiment shows that these attention-based approach can ﬁnd goal events at a high precision. Moreover, results of MAR-based highlight detection on the ﬁnal game of FIFA 2002 and 2006 are highly similar to professionally labelled highlights by BBC and FIFA

Glasgow Theses Service

CiteSeerX

Event detection based on generic characteristics of field-sports

Author: O'Connor Noel E.
Sadlier David A.
Publication venue: 'Institute of Electrical and Electronics Engineers (IEEE)'
Publication date: 01/01/2005
Field of study

In this paper, we propose a generic framework for event detection in broadcast video of multiple different field-sports. Features indicating significant events are selected, and robust detectors built. These features are rooted in generic characteristics common to all genres of field-sports. The evidence gathered by the feature detectors is combined by means of a support vector machine, which infers the occurrence of an event based on a model generated during a training phase. The system is tested across multiple genres of field-sports including soccer, rugby, hockey and Gaelic football and the results suggest that high event retrieval and content rejection statistics are achievable

Transition effect detection for extracting highlights in baseball videos

Author: A Hanjalic
A Kokaram
B Satterwhite
C Jung
C Snoek
CC Cheng
Chi-Heng Lan
Chin-Song Wu
D Tjondronegoro
D Tjondronegoro
D Xu
D Zhang
D Zhang
D Zhang
DD Giusto
E Kijak
EJ Farn
F Zhao
G Zhu
G Zhu
GG Lee
GT Papadopoulos
H Pan
H Pan
HC Shih
HC Shih
HG Kim
HT Chen
HT Chen
J Assfalg
J Assfalg
J Liu
J Shen
J Wang
J Wang
JC Boulton
K Namuduri
L Lu
LC Chan
LY Duan
M Bertini
M Petkovic
NH Bach
P Chang
PC Su
Po-Chyi Su
R Lienhart
R Ren
RA Roberts
RE Ouazzani
V Kobla
V Kwatra
W Li
W Xu
Wei-Yu Chen
X Ruan
X Tong
X Wang
X Zhou
Y Ding
Y Rui
Y Song
Y Zhong
Z Dang
Z Niu
Z Xiong
Z Zhao
Zi-Xin Zeng
Publication venue: 'Springer Science and Business Media LLC'
Publication date
Field of study

Leveraging Contextual Cues for Generating Basketball Highlights

Author: R.
Shi J.
Smith R.
Xiong Z.
Publication venue
Publication date: 29/06/2016
Field of study

The massive growth of sports videos has resulted in a need for automatic generation of sports highlights that are comparable in quality to the hand-edited highlights produced by broadcasters such as ESPN. Unlike previous works that mostly use audio-visual cues derived from the video, we propose an approach that additionally leverages contextual cues derived from the environment that the game is being played in. The contextual cues provide information about the excitement levels in the game, which can be ranked and selected to automatically produce high-quality basketball highlights. We introduce a new dataset of 25 NCAA games along with their play-by-play stats and the ground-truth excitement data for each basket. We explore the informativeness of five different cues derived from the video and from the environment through user studies. Our experiments show that for our study participants, the highlights produced by our system are comparable to the ones produced by ESPN for the same games.Comment: Proceedings of ACM Multimedia 201

arXiv.org e-Print Archive

A COMPUTATION METHOD/FRAMEWORK FOR HIGH LEVEL VIDEO CONTENT ANALYSIS AND SEGMENTATION USING AFFECTIVE LEVEL INFORMATION

Author: Arifin Sutjipoto
Arifin Sutjipoto
Publication venue: Electrical & Electronic Engineering, Imperial College London
Publication date: 01/10/2008
Field of study

VIDEO segmentation facilitates e±cient video indexing and navigation in large digital video archives. It is an important process in a content-based video indexing and retrieval (CBVIR) system. Many automated solutions performed seg- mentation by utilizing information about the \facts" of the video. These \facts" come in the form of labels that describe the objects which are captured by the cam- era. This type of solutions was able to achieve good and consistent results for some video genres such as news programs and informational presentations. The content format of this type of videos is generally quite standard, and automated solutions were designed to follow these format rules. For example in [1], the presence of news anchor persons was used as a cue to determine the start and end of a meaningful news segment. The same cannot be said for video genres such as movies and feature films. This is because makers of this type of videos utilized different filming techniques to design their videos in order to elicit certain affective response from their targeted audience. Humans usually perform manual video segmentation by trying to relate changes in time and locale to discontinuities in meaning [2]. As a result, viewers usually have doubts about the boundary locations of a meaningful video segment due to their different affective responses. This thesis presents an entirely new view to the problem of high level video segmentation. We developed a novel probabilistic method for affective level video content analysis and segmentation. Our method had two stages. In the first stage, a®ective content labels were assigned to video shots by means of a dynamic bayesian 0. Abstract 3 network (DBN). A novel hierarchical-coupled dynamic bayesian network (HCDBN) topology was proposed for this stage. The topology was based on the pleasure- arousal-dominance (P-A-D) model of a®ect representation [3]. In principle, this model can represent a large number of emotions. In the second stage, the visual, audio and a®ective information of the video was used to compute a statistical feature vector to represent the content of each shot. Affective level video segmentation was achieved by applying spectral clustering to the feature vectors. We evaluated the first stage of our proposal by comparing its emotion detec- tion ability with all the existing works which are related to the field of a®ective video content analysis. To evaluate the second stage, we used the time adaptive clustering (TAC) algorithm as our performance benchmark. The TAC algorithm was the best high level video segmentation method [2]. However, it is a very computationally intensive algorithm. To accelerate its computation speed, we developed a modified TAC (modTAC) algorithm which was designed to be mapped easily onto a field programmable gate array (FPGA) device. Both the TAC and modTAC algorithms were used as performance benchmarks for our proposed method. Since affective video content is a perceptual concept, the segmentation per- formance and human agreement rates were used as our evaluation criteria. To obtain our ground truth data and viewer agreement rates, a pilot panel study which was based on the work of Gross et al. [4] was conducted. Experiment results will show the feasibility of our proposed method. For the first stage of our proposal, our experiment results will show that an average improvement of as high as 38% was achieved over previous works. As for the second stage, an improvement of as high as 37% was achieved over the TAC algorithm

Spiral - Imperial College Digital Repository

Audiovisual processing for sports-video summarisation technology

Author: Sadlier David A.
Publication venue: Dublin City University. School of Electronic Engineering
Publication date: 01/01/2006
Field of study

In this thesis a novel audiovisual feature-based scheme is proposed for the automatic summarization of sports-video content The scope of operability of the scheme is designed to encompass the wide variety o f sports genres that come under the description ‘field-sports’. Given the assumption that, in terms of conveying the narrative of a field-sports-video, score-update events constitute the most significant moments, it is proposed that their detection should thus yield a favourable summarisation solution. To this end, a generic methodology is proposed for the automatic identification of score-update events in field-sports-video content. The scheme is based on the development of robust extractors for a set of critical features, which are shown to reliably indicate their locations. The evidence gathered by the feature extractors is combined and analysed using a Support Vector Machine (SVM), which performs the event detection process. An SVM is chosen on the basis that its underlying technology represents an implementation of the latest generation of machine learning algorithms, based on the recent advances in statistical learning. Effectively, an SVM offers a solution to optimising the classification performance of a decision hypothesis, inferred from a given set of training data. Via a learning phase that utilizes a 90-hour field-sports-video trainmg-corpus, the SVM infers a score-update event model by observing patterns in the extracted feature evidence. Using a similar but distinct 90-hour evaluation corpus, the effectiveness of this model is then tested genencally across multiple genres of fieldsports- video including soccer, rugby, field hockey, hurling, and Gaelic football. The results suggest that in terms o f the summarization task, both high event retrieval and content rejection statistics are achievable

SportsAnno: what do you think?

Author: Lanagan James
Smeaton Alan F.
Publication venue: CID Paris
Publication date: 01/01/2007
Field of study

The automatic summarisation of sports video is of growing importance with the increased availability of on-demand content. Consumers who are unable to view events live often have a desire to watch a summary which allows then to quickly come to terms with all that has happened during a sporting event. Sports forums show that it is not only summaries that are desirable but also the opportunity to share one’s own point of view and discuss the opinions with a community of similar users. In this paper we give an overview of the ways in which annotations have been used to augment existing visual media. We present SportsAnno, a system developed to summarise World Cup 2006 matches and provide a means for open discussion of events within these matches

CiteSeerX

An HMM-Based Framework for Video Semantic Analysis

Author: Ma Yu-Fei
Xu Gu
Yang Shiqiang
Zhang Hong-Jiang
Publication venue: Scholars\u27 Mine
Publication date: 01/01/2005
Field of study

Video semantic analysis is essential in video indexing and structuring. However, due to the lack of robust and generic algorithms, most of the existing works on semantic analysis are limited to specific domains. In this paper, we present a novel hidden Markove model (HMM)-based framework as a general solution to video semantic analysis. In the proposed framework, semantics in different granularities are mapped to a hierarchical model space, which is composed of detectors and connectors. In this manner, our model decomposes a complex analysis problem into simpler subproblems during the training process and automatically integrates those subproblems for recognition. The proposed framework is not only suitable for a broad range of applications, but also capable of modeling semantics in different semantic granularities. Additionally, we also present a new motion representation scheme, which is robust to different motion vector sources. The applications of the proposed framework in basketball event detection, soccer shot classification, and volleyball sequence analysis have demonstrated the effectiveness of the proposed framework on video semantic analysis

Missouri University of Science and Technology (Missouri S&T): Scholars' Mine

Anomaly Detection, Rule Adaptation and Rule Induction Methodologies in the Context of Automated Sports Video Annotation.

Author: Khan A.
Publication venue: Guildford
Publication date: 06/05/2020
Field of study

Automated video annotation is a topic of considerable interest in computer vision due to its applications in video search, object based video encoding and enhanced broadcast content. The domain of sport broadcasting is, in particular, the subject of current research attention due to its fixed, rule governed, content. This research work aims to develop, analyze and demonstrate novel methodologies that can be useful in the context of adaptive and automated video annotation systems. In this thesis, we present methodologies for addressing the problems of anomaly detection, rule adaptation and rule induction for court based sports such as tennis and badminton. We first introduce an HMM induction strategy for a court-model based method that uses the court structure in the form of a lattice for two related modalities of singles and doubles tennis to tackle the problems of anomaly detection and rectification. We also introduce another anomaly detection methodology that is based on the disparity between the low-level vision based classifiers and the high-level contextual classifier. Another approach to address the problem of rule adaptation is also proposed that employs Convex hulling of the anomalous states. We also investigate a number of novel hierarchical HMM generating methods for stochastic induction of game rules. These methodologies include, Cartesian product Label-based Hierarchical Bottom-up Clustering (CLHBC) that employs prior information within the label structures. A new constrained variant of the classical Chinese Restaurant Process (CRP) is also introduced that is relevant to sports games. We also propose two hybrid methodologies in this context and a comparative analysis is made against the flat Markov model. We also show that these methods are also generalizable to other rule based environments