21 research outputs found
Extraction and Classification of Self-consumable Sport Video Highlights
This paper aims to automatically extract and classify self-consumable sport video highlights. For this purpose, we will emphasize the benefits of using play-break sequences as the effective inputs for HMM-based classifier. HMM is used to model the stochastic pattern of high-level states during specific sport highlights which correspond to the sequence of generic audio-visual measurements extracted from raw video data. This paper uses soccer as the domain study, focusing on the extraction and classification of goal, shot and foul highlights. The experiment work which uses183 play-break sequences from 6 soccer matches will be presented to demonstrate the performance of our proposed scheme
Event detection based on generic characteristics of field-sports
In this paper, we propose a generic framework for event detection in broadcast video of multiple different field-sports. Features indicating significant events are selected, and robust detectors built. These features are rooted in generic characteristics common to all genres of field-sports. The evidence gathered by the feature detectors is combined by means of a support vector machine, which infers the occurrence of an event based on a model generated during a training phase. The system is tested across multiple genres of field-sports including soccer, rugby, hockey and Gaelic football and the results suggest that high event retrieval and content rejection statistics are achievable
Extraction and classification of self-consumable sport video highlights using generic HMM
This paper aims to automatically extract and classify self-consumable sport video highlights. For this purpose, we will emphasize the benefits of using play-break sequences as the effective inputs for HMMbased classifier. HMM is used to model the stochastic pattern of high-level states during specific sport highlights which correspond to the sequence of generic audio-visual measurements extracted from raw video data. This paper uses soccer as the domain study, focusing on the extraction and classification of goal, shot and foul highlights. The experiment work which uses183 play-break sequences from 6 soccer matches will be presented to demonstrate the performance of our proposed scheme.<br /
Video Logo Retrieval based on local Features
Estimation of the frequency and duration of logos in videos is important and
challenging in the advertisement industry as a way of estimating the impact of
ad purchases. Since logos occupy only a small area in the videos, the popular
methods of image retrieval could fail. This paper develops an algorithm called
Video Logo Retrieval (VLR), which is an image-to-video retrieval algorithm
based on the spatial distribution of local image descriptors that measure the
distance between the query image (the logo) and a collection of video images.
VLR uses local features to overcome the weakness of global feature-based models
such as convolutional neural networks (CNN). Meanwhile, VLR is flexible and
does not require training after setting some hyper-parameters. The performance
of VLR is evaluated on two challenging open benchmark tasks (SoccerNet and
Standford I2V), and compared with other state-of-the-art logo retrieval or
detection algorithms. Overall, VLR shows significantly higher accuracy compared
with the existing methods.Comment: Accepted by ICIP 20. Contact author: Bochen Guan ([email protected]
Saying What You're Looking For: Linguistics Meets Video Search
We present an approach to searching large video corpora for video clips which
depict a natural-language query in the form of a sentence. This approach uses
compositional semantics to encode subtle meaning that is lost in other systems,
such as the difference between two sentences which have identical words but
entirely different meaning: "The person rode the horse} vs. \emph{The horse
rode the person". Given a video-sentence pair and a natural-language parser,
along with a grammar that describes the space of sentential queries, we produce
a score which indicates how well the video depicts the sentence. We produce
such a score for each video clip in a corpus and return a ranked list of clips.
Furthermore, this approach addresses two fundamental problems simultaneously:
detecting and tracking objects, and recognizing whether those tracks depict the
query. Because both tracking and object detection are unreliable, this uses
knowledge about the intended sentential query to focus the tracker on the
relevant participants and ensures that the resulting tracks are described by
the sentential query. While earlier work was limited to single-word queries
which correspond to either verbs or nouns, we show how one can search for
complex queries which contain multiple phrases, such as prepositional phrases,
and modifiers, such as adverbs. We demonstrate this approach by searching for
141 queries involving people and horses interacting with each other in 10
full-length Hollywood movies.Comment: 13 pages, 8 figure
Spatio-Temporal Pyramid Matching for Sports Videos
In this paper, we address the problem of querying video shots based on content-based matching. Our proposed system automatically partitions a video stream into video shots that maintain continuous movements of objects. Finding video shots of the same category is not an easy task because objects in a video shot change their locations over time. Our spatio-temporal pyramid matching (STPM) is the modified spatial pyramid matching (SPM) [15], which considers temporal information in conjunction with spatial locations to match objects in video shots. In addition, we model the mathematical condition in which temporal information contributes to match video shots. In order to improve the matching performance, dynamic features including movements of objects are considered in addition to static features such as edges of objects. In our experiments, several methods based on different feature sets and matching methods are compared, and our spatio-temporal pyramid matching performed better than existing methods in video matching for sports videos. 1
Event detection in field sports video using audio-visual features and a support vector machine
In this paper, we propose a novel audio-visual feature-based framework for event detection in broadcast video of multiple different field sports. Features indicating significant events are selected and robust detectors built. These features are rooted in characteristics common to all genres of field sports. The evidence gathered by the feature detectors is combined by means of a support vector machine, which infers the occurrence of an event based on a model generated during a training phase. The system is tested generically across multiple genres of field sports including soccer, rugby, hockey, and Gaelic football and the results suggest that high event retrieval and content rejection statistics are achievable
An Overview of Video Shot Clustering and Summarization Techniques for Mobile Applications
The problem of content characterization of video programmes is of great interest because video appeals to large audiences and its efficient distribution over various networks should contribute to widespread usage of multimedia services. In this paper we analyze several techniques proposed in literature for content characterization of video programmes, including movies and sports, that could be helpful for mobile media consumption. In particular we focus our analysis on shot clustering methods and effective video summarization techniques since, in the current video analysis scenario, they facilitate the access to the content and help in quick understanding of the associated semantics. First we consider the shot clustering techniques based on low-level features, using visual, audio and motion information, even combined in a multi-modal fashion. Then we concentrate on summarization techniques, such as static storyboards, dynamic video
skimming and the extraction of sport highlights. Discussed summarization methods can be employed in the development of tools that would be greatly useful to most mobile users: in fact these algorithms automatically shorten the original video while preserving most events by highlighting only the important content. The effectiveness of each approach has been analyzed, showing that it mainly depends on the kind of video programme it relates to, and the type of summary or highlights we are focusing on