
    Semantic analysis of field sports video using a petri-net of audio-visual concepts

    The most common approach to automatic summarisation and highlight detection in sports video is to train an automatic classifier to detect semantic highlights based on occurrences of low-level features such as action replays, excited commentators or changes in a scoreboard. We propose an alternative approach based on the detection of perception concepts (PCs) and the construction of Petri-Nets which can be used for both semantic description and event detection within sports videos. Low-level algorithms for the detection of perception concepts using visual, aural and motion characteristics are proposed, and a series of Petri-Nets composed of perception concepts is formally defined to describe video content. We call this a Perception Concept Network-Petri Net (PCN-PN) model. Using PCN-PNs, personalized high-level semantic descriptions of video highlights can be facilitated and queries on high-level semantics can be achieved. A particular strength of this framework is that we can easily build semantic detectors based on PCN-PNs to search within sports videos and locate interesting events. Experimental results based on recorded sports video data across three types of sports games (soccer, basketball and rugby), each from multiple broadcasters, are used to illustrate the potential of this framework.
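    To make the idea of a Petri-Net over perception concepts concrete, here is a minimal sketch, not the authors' PCN-PN definition. It assumes that each detected concept labels one transition and that an event counts as found once a token reaches a designated place. The class names, the place names and the concept labels (`goal_area_view`, `excited_commentary`, `action_replay`) are all hypothetical and chosen only for illustration.

```python
from dataclasses import dataclass


@dataclass
class Transition:
    name: str       # perception concept that triggers this transition (hypothetical label)
    inputs: tuple   # places that must each hold a token for the transition to fire
    outputs: tuple  # places that receive a token when the transition fires


class PetriNet:
    """A tiny place/transition net driven by a stream of detected concepts."""

    def __init__(self, transitions, initial_marking):
        self.transitions = list(transitions)
        self.initial = dict(initial_marking)
        self.marking = dict(initial_marking)

    def reset(self):
        # Restore the initial token distribution.
        self.marking = dict(self.initial)

    def observe(self, concept):
        # Fire the first enabled transition labelled with `concept`.
        # Returns True if a transition fired, False otherwise.
        for t in self.transitions:
            enabled = all(self.marking.get(p, 0) > 0 for p in t.inputs)
            if t.name == concept and enabled:
                for p in t.inputs:
                    self.marking[p] -= 1
                for p in t.outputs:
                    self.marking[p] = self.marking.get(p, 0) + 1
                return True
        return False


def detect_event(net, concepts, goal_place):
    # Replay a sequence of detected concepts; report whether a token
    # ever reaches `goal_place`. Concepts with no enabled transition are ignored.
    net.reset()
    for concept in concepts:
        net.observe(concept)
        if net.marking.get(goal_place, 0) > 0:
            return True
    return False


# Hypothetical "highlight" net: the three concepts must occur in this order.
HIGHLIGHT_NET = PetriNet(
    transitions=[
        Transition("goal_area_view", ("play",), ("attack",)),
        Transition("excited_commentary", ("attack",), ("excitement",)),
        Transition("action_replay", ("excitement",), ("highlight",)),
    ],
    initial_marking={"play": 1},
)

if __name__ == "__main__":
    stream = ["midfield_view", "goal_area_view", "excited_commentary", "action_replay"]
    print(detect_event(HIGHLIGHT_NET, stream, "highlight"))
```

    In this sketch the ordering constraint comes entirely from the token flow: an `action_replay` observed before the earlier concepts finds no token in its input place and is ignored, so the same concepts in a different order do not trigger a detection.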

    Video summarisation: A conceptual framework and survey of the state of the art

    This is the post-print (final draft post-refereeing) version of the article. Copyright @ 2007 Elsevier Inc. Video summaries provide condensed and succinct representations of the content of a video stream through a combination of still images, video segments, graphical representations and textual descriptors. This paper presents a conceptual framework for video summarisation derived from the research literature and used as a means for surveying the research literature. The framework distinguishes between video summarisation techniques (the methods used to process content from a source video stream to achieve a summarisation of that stream) and video summaries (outputs of video summarisation techniques). Video summarisation techniques are considered within three broad categories: internal (analyse information sourced directly from the video stream), external (analyse information not sourced directly from the video stream) and hybrid (analyse a combination of internal and external information). Video summaries are considered as a function of the type of content they are derived from (object, event, perception or feature based) and the functionality offered to the user for their consumption (interactive or static, personalised or generic). It is argued that video summarisation would benefit from greater incorporation of external information, particularly user based information that is unobtrusively sourced, in order to overcome longstanding challenges such as the semantic gap and providing video summaries that have greater relevance to individual users.

    Content-based Video Retrieval by Integrating Spatio-Temporal and Stochastic Recognition of Events

    As amounts of publicly available video data grow, the need to query this data efficiently becomes significant. Consequently, content-based retrieval of video data turns out to be a challenging and important problem. We address the specific aspect of inferring semantics automatically from raw video data. In particular, we introduce a new video data model that supports the integrated use of two different approaches for mapping low-level features to high-level concepts. Firstly, the model is extended with a rule-based approach that supports spatio-temporal formalization of high-level concepts, and then with a stochastic approach. Furthermore, results on real tennis video data are presented, demonstrating the validity of both approaches, as well as advantages of their integrated use.

    Deceptive body movements reverse spatial cueing in soccer

    This article has been made available through the Brunel Open Access Publishing Fund. The purpose of the experiments was to analyse the spatial cueing effects of the movements of soccer players executing normal and deceptive (step-over) turns with the ball. Stimuli comprised normal resolution or point-light video clips of soccer players dribbling a football towards the observer then turning right or left with the ball. Clips were curtailed before or on the turn (-160, -80, 0 or +80 ms) to examine the time course of direction prediction and spatial cueing effects. Participants were divided into higher-skilled (HS) and lower-skilled (LS) groups according to soccer experience. In experiment 1, accuracy on full video clips was higher than on point-light but results followed the same overall pattern. Both HS and LS groups correctly identified direction on normal moves at all occlusion levels. For deceptive moves, LS participants were significantly worse than chance and HS participants were somewhat more accurate but nevertheless substantially impaired. In experiment 2, point-light clips were used to cue a lateral target. HS and LS groups showed faster reaction times to targets that were congruent with the direction of normal turns, and to targets incongruent with the direction of deceptive turns. The reversed cueing by deceptive moves coincided with earlier kinematic events than cueing by normal moves. It is concluded that the body kinematics of soccer players generate spatial cueing effects when viewed from an opponent's perspective. This could create a reaction time advantage when anticipating the direction of a normal move. A deceptive move is designed to turn this cueing advantage into a disadvantage. Acting on the basis of advance information, the presence of deceptive moves primes responses in the wrong direction, which may be only partly mitigated by delaying a response until veridical cues emerge.

    SoccerNet-Tracking: Multiple Object Tracking Dataset and Benchmark in Soccer Videos

    Tracking objects in soccer videos is extremely important to gather both player and team statistics, whether it is to estimate the total distance run, the ball possession or the team formation. Video processing can help automate the extraction of this information without the need for any invasive sensor, and is hence applicable to any team in any stadium. Yet the availability of datasets to train learnable models and benchmarks to evaluate methods on a common testbed is very limited. In this work, we propose a novel dataset for multiple object tracking composed of 200 sequences of 30s each, representative of challenging soccer scenarios, and a complete 45-minute half-time for long-term tracking. The dataset is fully annotated with bounding boxes and tracklet IDs, enabling the training of MOT baselines in the soccer domain and a full benchmarking of those methods on our segregated challenge sets. Our analysis shows that multiple player, referee and ball tracking in soccer videos is far from being solved, with several improvements required in case of fast motion or in scenarios of severe occlusion. Comment: Paper accepted for the CVsports workshop at CVPR2022. This document contains 8 pages + reference