Semantic analysis of field sports video using a petri-net of audio-visual concepts
The most common approach to automatic summarisation and highlight detection in sports video is to train an automatic classifier to detect semantic highlights based on occurrences of low-level features such as action replays, excited commentators or changes in a scoreboard. We propose an alternative approach based on the detection of perception concepts (PCs) and the construction of Petri-Nets which can be used for both semantic description and event detection within sports videos. Low-level algorithms for the detection of perception concepts using visual, aural and motion characteristics are proposed, and a series of Petri-Nets composed of perception concepts is formally defined to describe video content. We call this a Perception Concept Network-Petri Net (PCN-PN) model. Using PCN-PNs, personalized high-level semantic descriptions of video highlights can be facilitated and queries on high-level semantics can be achieved. A particular strength of this framework is that we can easily build semantic detectors based on PCN-PNs to search within sports videos and locate interesting events. Experimental results based on recorded sports video data across three types of sports games (soccer, basketball and rugby), each from multiple broadcasters, are used to illustrate the potential of this framework.
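As a rough illustration of how a Petri net built over perception concepts could act as an event detector, the following minimal sketch treats each detected concept as a token placed in a place of the same name, and lets a transition fire once all of its input places hold a token. The class, the concept names (`excited_audio`, `replay`, `crowd_shot`) and the `highlight` transition are hypothetical and chosen only for this example; this is not the PCN-PN model as defined in the paper.

```python
from dataclasses import dataclass, field


@dataclass
class PetriNet:
    # marking: number of tokens currently held by each place
    # (in this sketch, one place per perception concept)
    marking: dict = field(default_factory=dict)
    # transitions: name -> (input places, output places)
    transitions: dict = field(default_factory=dict)

    def add_transition(self, name, inputs, outputs):
        self.transitions[name] = (tuple(inputs), tuple(outputs))

    def put_token(self, place):
        self.marking[place] = self.marking.get(place, 0) + 1

    def enabled(self, name):
        # A transition is enabled when every input place holds a token.
        inputs, _ = self.transitions[name]
        return all(self.marking.get(p, 0) > 0 for p in inputs)

    def fire(self, name):
        # Consume one token per input place, produce one per output place.
        if not self.enabled(name):
            return False
        inputs, outputs = self.transitions[name]
        for p in inputs:
            self.marking[p] -= 1
        for p in outputs:
            self.put_token(p)
        return True


def detect_events(net, detections, event_transition):
    """Feed detected concepts in temporal order and return the indices
    at which the event transition fires."""
    hits = []
    for i, concept in enumerate(detections):
        net.put_token(concept)
        if net.fire(event_transition):
            hits.append(i)
    return hits


# Hypothetical example: a highlight needs both excited audio and a replay.
net = PetriNet()
net.add_transition("highlight", ["excited_audio", "replay"], ["highlight_event"])
hits = detect_events(
    net, ["excited_audio", "crowd_shot", "replay", "replay"], "highlight"
)
```

In this sketch the second `replay` does not trigger another highlight, because the `excited_audio` token was consumed by the first firing.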
Video summarisation: A conceptual framework and survey of the state of the art
This is the post-print (final draft post-refereeing) version of the article. Copyright © 2007 Elsevier Inc. Video summaries provide condensed and succinct representations of the content of a video stream through a combination of still images, video segments, graphical representations and textual descriptors. This paper presents a conceptual framework for video summarisation derived from the research literature and used as a means for surveying the research literature. The framework distinguishes between video summarisation techniques (the methods used to process content from a source video stream to achieve a summarisation of that stream) and video summaries (outputs of video summarisation techniques). Video summarisation techniques are considered within three broad categories: internal (analyse information sourced directly from the video stream), external (analyse information not sourced directly from the video stream) and hybrid (analyse a combination of internal and external information). Video summaries are considered as a function of the type of content they are derived from (object, event, perception or feature-based) and the functionality offered to the user for their consumption (interactive or static, personalised or generic). It is argued that video summarisation would benefit from greater incorporation of external information, particularly user-based information that is unobtrusively sourced, in order to overcome longstanding challenges such as the semantic gap and to provide video summaries that have greater relevance to individual users.
Real-time event detection in field sport videos
This chapter describes a real-time system for event detection in sports broadcasts. The approach presented is applicable to a wide range of field sports. Using two independent event detection approaches that work simultaneously, the system is capable of accurately detecting scores, near misses, and other exciting parts of a game that do not result in a score. The results obtained across a diverse dataset of different field sports are promising, demonstrating over 90% accuracy for a feature-based event detector and 100% accuracy for a scoreboard-based detector that detects only scores.
An Overview of Multimodal Techniques for the Characterization of Sport Programmes
The problem of content characterization of sports videos is of great interest because sports video appeals to large audiences and its efficient distribution over various networks should contribute to widespread usage of multimedia services. In this paper we analyze several techniques proposed in the literature for content characterization of sports videos. We focus this analysis on the typology of the signal (audio, video, text captions, ...) from which the low-level features are extracted. First we consider the techniques based on visual information, then the methods based on audio information, and finally the algorithms based on audio-visual cues, used in a multi-modal fashion. This analysis shows that each type of signal carries some peculiar information, and the multi-modal approach can fully exploit the multimedia information associated with the sports video. Moreover, we observe that the characterization is performed either by considering what happens in a specific time segment, thus observing the features in a "static" way, or by trying to capture their "dynamic" evolution in time. The effectiveness of each approach depends mainly on the kind of sport it relates to and the type of highlights in focus.
Video Stream Retrieval of Unseen Queries using Semantic Memory
Retrieval of live, user-broadcast video streams is an under-addressed and
increasingly relevant challenge. The on-line nature of the problem requires
temporal evaluation and the unforeseeable scope of potential queries motivates
an approach which can accommodate arbitrary search queries. To account for the
breadth of possible queries, we adopt a no-example approach to query retrieval,
which uses a query's semantic relatedness to pre-trained concept classifiers.
To adapt to shifting video content, we propose memory pooling and memory
welling methods that favor recent information over long past content. We
identify two stream retrieval tasks, instantaneous retrieval at any particular
time and continuous retrieval over a prolonged duration, and propose means for
evaluating them. Three large scale video datasets are adapted to the challenge
of stream retrieval. We report results for our search methods on the new stream
retrieval tasks, as well as demonstrate their efficacy in a traditional,
non-streaming video task.
Comment: Presented at BMVC 2016 (British Machine Vision Conference).
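The idea of scoring a stream against an unseen query while favouring recent content can be sketched as follows. This is only an illustrative approximation: it uses a plain exponentially decayed average of per-frame concept scores and a dot product with query-to-concept relatedness weights. The function names, the `decay` parameter and the toy numbers are assumptions made for this example and are not the memory pooling or memory welling formulations from the paper.

```python
def decayed_memory(frame_scores, decay=0.8):
    """Return, for every frame, an exponentially decayed average of the
    concept scores seen so far (recent frames weigh more than old ones)."""
    memory = [0.0] * len(frame_scores[0])
    history = []
    for scores in frame_scores:
        memory = [decay * m + (1.0 - decay) * s for m, s in zip(memory, scores)]
        history.append(memory)
    return history


def query_score(memory, relatedness):
    """Score a query with no training examples: weight each concept's
    remembered score by its assumed semantic relatedness to the query."""
    return sum(m * r for m, r in zip(memory, relatedness))


# Toy stream with two concepts; the content shifts from concept 0 to concept 1.
history = decayed_memory([[1.0, 0.0], [0.0, 1.0], [0.0, 1.0]], decay=0.5)
# Hypothetical query related only to concept 1.
score_now = query_score(history[-1], [0.0, 1.0])
```

A smaller `decay` makes the memory forget old content faster, which suits streams whose content changes quickly.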
A Survey of Deep Learning in Sports Applications: Perception, Comprehension, and Decision
Deep learning has the potential to revolutionize sports performance, with
applications ranging from perception and comprehension to decision. This paper
presents a comprehensive survey of deep learning in sports performance,
focusing on three main aspects: algorithms, datasets and virtual environments,
and challenges. Firstly, we discuss the hierarchical structure of deep learning
algorithms in sports performance which includes perception, comprehension and
decision while comparing their strengths and weaknesses. Secondly, we list
widely used existing datasets in sports and highlight their characteristics and
limitations. Finally, we summarize current challenges and point out future
trends of deep learning in sports. Our survey provides valuable reference
material for researchers interested in deep learning in sports applications.
SoccerNet-Tracking: Multiple Object Tracking Dataset and Benchmark in Soccer Videos
Tracking objects in soccer videos is extremely important to gather both
player and team statistics, whether it is to estimate the total distance run,
the ball possession or the team formation. Video processing can help automate
the extraction of this information without the need for any invasive sensor,
and is hence applicable to any team in any stadium. Yet, the availability of datasets
to train learnable models and benchmarks to evaluate methods on a common
testbed is very limited. In this work, we propose a novel dataset for multiple
object tracking composed of 200 sequences of 30s each, representative of
challenging soccer scenarios, and a complete 45-minute half-time for long-term
tracking. The dataset is fully annotated with bounding boxes and tracklet IDs,
enabling the training of MOT baselines in the soccer domain and a full
benchmarking of those methods on our segregated challenge sets. Our analysis
shows that multiple player, referee and ball tracking in soccer videos is far
from being solved, with several improvements required in cases of fast motion or
in scenarios of severe occlusion.
Comment: Paper accepted for the CVsports workshop at CVPR2022. This document
contains 8 pages + references.