    A technology platform for automatic high-level tennis game analysis

    Sports video research is a popular topic that has been applied to many prominent sports for a large spectrum of applications. In this paper we introduce a technology platform which has been developed for the tennis context, able to extract action sequences and provide support to coaches for players' performance analysis during training and official matches. The system consists of a hardware architecture, devised to acquire data in the tennis context and for the specific domain requirements, and a number of processing modules which are able to track both the ball and the players, to extract semantic information from their interactions and automatically annotate video sequences. The aim of this paper is to demonstrate that the proposed combination of hardware and software modules is able to extract 3D ball trajectories robust enough to evaluate ball changes of direction, recognizing serves, strokes and bounces. Starting from this information, a finite state machine based decision process can be employed to evaluate the score of each action of the game. The entire platform has been tested in real experiments during both training sessions and matches, and results show that automatic annotation of key events along with 3D positions and scores can be used to support coaches in the extraction of valuable information about players' intentions and behaviours.
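    The finite state machine based scoring mentioned in this abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the event format, the state names and the simplified rules (no faults, lets or second serves) are assumptions made only for illustration.

```python
from enum import Enum, auto


class State(Enum):
    WAIT_SERVE = auto()
    IN_PLAY = auto()
    POINT_OVER = auto()


def opponent(player):
    """Return the other player ("A" or "B")."""
    return "B" if player == "A" else "A"


def score_rally(events):
    """Return the rally winner ("A" or "B"), or None if the rally is undecided.

    Hypothetical event format: (kind, side, in_court)
      kind     -- "serve", "stroke" or "bounce"
      side     -- hitting player for serve/stroke, court half for bounce
      in_court -- for bounces, whether the ball landed inside the lines
    """
    state = State.WAIT_SERVE
    last_hitter = None
    bounces = 0  # bounces since the last hit
    winner = None
    for kind, side, in_court in events:
        if state is State.WAIT_SERVE:
            if kind == "serve":
                last_hitter, bounces = side, 0
                state = State.IN_PLAY
        elif state is State.IN_PLAY:
            if kind == "stroke":
                last_hitter, bounces = side, 0
            elif kind == "bounce":
                if not in_court or side == last_hitter:
                    # Ball out, or landed on the hitter's own half.
                    winner = opponent(last_hitter)
                    state = State.POINT_OVER
                else:
                    bounces += 1
                    if bounces == 2:
                        # Second bounce before the opponent's return.
                        winner = last_hitter
                        state = State.POINT_OVER
        if state is State.POINT_OVER:
            break
    return winner
```

    Given a list of detected serves, strokes and bounces, `score_rally` walks the states in order and stops at the first event that decides the point.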

    Towards Structured Analysis of Broadcast Badminton Videos

    Sports video data is recorded for nearly every major tournament but remains archived and inaccessible to large scale data mining and analytics. It can only be viewed sequentially or manually tagged with higher-level labels, which is time consuming and prone to errors. In this work, we propose an end-to-end framework for automatic attribute tagging and analysis of sport videos. We use commonly available broadcast videos of matches and, unlike previous approaches, do not rely on special camera setups or additional sensors. Our focus is on Badminton as the sport of interest. We propose a method to analyze a large corpus of badminton broadcast videos by segmenting the points played, tracking and recognizing the players in each point and annotating their respective badminton strokes. We evaluate the performance on 10 Olympic matches with 20 players and achieved 95.44% point segmentation accuracy, 97.38% player detection score (mAP@0.5), 97.98% player identification accuracy, and stroke segmentation edit scores of 80.48%. We further show that the automatically annotated videos alone could enable the gameplay analysis and inference by computing understandable metrics such as player's reaction time, speed, and footwork around the court.

    Detection of Ball Hits in a Tennis Game Using Audio and Visual Information

    In this paper we describe a framework to improve the detection of ball hit events in tennis games by combining audio and visual information. Detection of the presence and timing of these events is crucial for the understanding of the game. However, neither modality on its own gives satisfactory results: audio information is often corrupted by noise and also suffers from acoustic mismatch between the training and test data, and visual information is corrupted by complex backgrounds, camera calibration, and the presence of multiple moving objects. Our approach is to first attempt to track the ball visually and hence estimate a sequence of candidate positions for the ball, and to then locate putative ball hits by analysing the ball's position in this trajectory. To handle the severe interferences caused by false ball candidates, we smooth the trajectory by using linear regression and removing the frames where there are no candidates. We use Gaussian mixture models to generate estimates of the times of hits using the audio information, and then integrate these two sources of information in a probabilistic framework. Testing our approach on three complete tennis games shows significant improvements in detection over a range of conditions when compared with using a single modality.
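    The visual half of this approach (dropping frames without candidates, smoothing with linear regression, then looking for changes of direction) can be sketched as follows. This is only an illustration under stated assumptions: the sliding-window fit, the window size and the sign-change rule are not taken from the paper, and the audio Gaussian mixture models and the probabilistic fusion are left out.

```python
def local_slopes(frames, ys, half_window=2):
    """Least-squares slope of y against frame index in a sliding window."""
    slopes = []
    n = len(frames)
    for i in range(n):
        lo, hi = max(0, i - half_window), min(n, i + half_window + 1)
        t, y = frames[lo:hi], ys[lo:hi]
        t_mean = sum(t) / len(t)
        y_mean = sum(y) / len(y)
        denom = sum((a - t_mean) ** 2 for a in t)
        num = sum((a - t_mean) * (b - y_mean) for a, b in zip(t, y))
        slopes.append(num / denom if denom else 0.0)
    return slopes


def candidate_hits(track, half_window=2):
    """Return frames where the smoothed vertical motion reverses.

    track: list of (frame, y) pairs; y is None when no ball candidate
    was found in that frame (hypothetical input format).
    """
    # Remove the frames where there are no candidates.
    kept = [(f, y) for f, y in track if y is not None]
    frames = [f for f, _ in kept]
    ys = [y for _, y in kept]
    slopes = local_slopes(frames, ys, half_window)
    hits = []
    for i in range(1, len(slopes)):
        # A sign change of the fitted slope marks a putative hit.
        if slopes[i - 1] * slopes[i] < 0:
            hits.append(frames[i])
    return hits
```

    Because the slope is fitted over a window, a reported frame can lag the true reversal by a frame or so; a real system would refine the timing, for example with the audio estimates described in the abstract.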

    Fast human behavior analysis for scene understanding

    Human behavior analysis has become an active topic of great interest and relevance for a number of applications and areas of research. The research in recent years has been considerably driven by the growing level of criminal behavior in large urban areas and an increase in terrorist actions. Also, accurate behavior studies have been applied to sports analysis systems and are emerging in healthcare. When compared to conventional action recognition used in security applications, human behavior analysis techniques designed for embedded applications should satisfy the following technical requirements: (1) Behavior analysis should provide scalable and robust results; (2) High-processing efficiency to achieve (near) real-time operation with low-cost hardware; (3) Extensibility for multiple-camera setup including 3-D modeling to facilitate human behavior understanding and description in various events. The key to our problem statement is that we intend to improve behavior analysis performance while preserving the efficiency of the designed techniques, to allow implementation in embedded environments. More specifically, we look into (1) fast multi-level algorithms incorporating specific domain knowledge, and (2) 3-D configuration techniques for overall enhanced performance. If possible, we explore the performance of the current behavior-analysis techniques for improving accuracy and scalability. To fulfill the above technical requirements and tackle the research problems, we propose a flexible behavior-analysis framework consisting of three processing-layers: (1) pixel-based processing (background modeling with pixel labeling), (2) object-based modeling (human detection, tracking and posture analysis), and (3) event-based analysis (semantic event understanding). In Chapter 3, we specifically contribute to the analysis of individual human behavior. A novel body representation is proposed for posture classification based on a silhouette feature. 
Only pure binary-shape information is used for posture classification without texture/color or any explicit body models. To this end, we have studied an efficient HV-PCA shape-based descriptor with temporal modeling, which achieves a posture-recognition accuracy rate of about 86% and outperforms other existing proposals. As our human motion scheme is efficient and achieves a fast performance (6-8 frames/second), it enables a fast surveillance system or further analysis of human behavior. In addition, a body-part detection approach is presented. The color and body ratio are combined to provide clues for human body detection and classification. The conventional assumption of up-right body posture is not required. Afterwards, we design and construct a specific framework for fast algorithms and apply them in two applications: tennis sports analysis and surveillance. Chapter 4 deals with tennis sports analysis and presents an automatic real-time system for multi-level analysis of tennis video sequences. First, we employ a 3-D camera model to bridge the pixel-level, object-level and scene-level of tennis sports analysis. Second, a weighted linear model combining the visual cues in the real-world domain is proposed to identify various events. The experimentally found event extraction rate of the system is about 90%. Also, audio signals are combined to enhance the scene analysis performance. The complete proposed application is efficient enough to obtain a real-time or near real-time performance (2-3 frames/second for 720×576 resolution, and 5-7 frames/second for 320×240 resolution, with a P-IV PC running at 3GHz). Chapter 5 addresses surveillance and presents a full real-time behavior-analysis framework, featuring layers at pixel, object, event and visualization level. More specifically, this framework captures the human motion, classifies its posture, infers the semantic event exploiting interaction modeling, and performs the 3-D scene reconstruction. 
We have introduced our system design based on a specific software architecture, by employing the well-known "4+1" view model. In addition, human behavior analysis algorithms are directly designed for real-time operation and embedded in an experimental runtime AV content-analysis architecture. This executable system is designed to be generic for multiple streaming applications with component-based architectures. To evaluate the performance, we have applied this networked system in a single-camera setup. The experimental platform operates with two Pentium Quadcore engines (2.33 GHz) and 4-GB memory. Performance evaluations have shown that this networked framework is efficient and achieves a fast performance (13-15 frames/second) for monocular video sequences. Moreover, a dual-camera setup is tested within the behavior-analysis framework. After automatic camera calibration is conducted, the 3-D reconstruction and communication among different cameras are achieved. The extra view in the multi-camera setup improves the human tracking and event detection in case of occlusion. This extension of multiple-view fusion improves the event-based semantic analysis by 8.3-16.7% in accuracy rate. The detailed studies of two experimental intelligent applications, i.e., tennis sports analysis and surveillance, have proven their value in several extensive tests in the framework of the European Candela and Cantata ITEA research programs, where our proposed system has demonstrated competitive performance with respect to accuracy and efficiency.

    Player tracking and identification in broadcast ice hockey video

    Tracking and identifying players is a fundamental step in computer vision-based ice hockey analytics. The data generated by tracking is used in many other downstream tasks, such as game event detection and game strategy analysis. Player tracking and identification is a challenging problem since the motion of players in hockey is fast-paced and non-linear when compared to pedestrians. There is also significant player-player and player-board occlusion, camera panning and zooming in hockey broadcast video. Identifying players in ice hockey is a difficult task since the players of the same team appear almost identical, with the jersey number the only consistent discriminating factor between players. In this thesis, an automated system to track and identify players in broadcast NHL hockey videos is introduced. The system is composed of player tracking, team identification and player identification models. In addition, the game roster and player shift data is incorporated to further increase the accuracy of player identification in the overall system. Due to the absence of publicly available datasets, new datasets for player tracking, team identification and player identification in ice-hockey are also introduced. Remarking that there is a lack of publicly available research for tracking ice hockey players making use of recent advancements in deep learning, we test five state-of-the-art tracking algorithms on an ice-hockey dataset and analyze the performance and failure cases. We introduce a multi-task loss based network to identify player jersey numbers from static images. The network uses multi-task learning to simultaneously predict and learn from two different representations of a player jersey number. Through various experiments and ablation studies it was demonstrated that the multi-task learning based network performed better than the constituent single-task settings. 
We take the temporal dimension into account for jersey number identification by inferring the jersey number from sequences of player images, called player tracklets. To do so, we tested two popular deep temporal networks: (1) a temporal 1D convolutional neural network (CNN) and (2) a Transformer network. The network trained using the multi-task loss served as a backbone for these two networks. In addition, we also introduce a weakly-supervised learning strategy to improve training speed and convergence for the transformer network. Experimental results demonstrate that the proposed networks outperform the state of the art. Finally, we describe in detail how the player tracking and identification models are put together to form the holistic pipeline starting from raw broadcast NHL video to obtain uniquely identified player tracklets. The process of incorporating the game roster and player shifts to improve player identification is explained. An overall accuracy of 88% is obtained on the test set. An off-the-shelf automatic homography registration model and a puck localization model are also incorporated into the pipeline to obtain the tracks of both player and puck on the ice rink.

    Occupancy Analysis of the Outdoor Football Fields

    Video-based step measurement in sport and daily living.

    Current knowledge of tennis player-surface interactions with different court surfaces is limited. The measurement of player step and movement strategy would aid the understanding of tennis player-surface interaction. However, this has not yet been performed: no readily available motion analysis tool is capable of measuring spatio-temporal parameters of gait during match-play tennis. The purpose of this project was to develop, validate and use a motion analysis tool designed to measure player location and foot-surface contacts during match-play tennis. Single camera video footage, obtained from the 2011 Roland Garros Qualifying Tournament, was manually digitised to characterise step and movement strategy during men's and women's forehand groundstrokes. Player movements were consistent with previous notational analyses; however, gender differences were highlighted for step frequency. Initial findings were limited by manual analysis, e.g. manual digitising subjectivity and low sample size: an objective and automated system was required. A markerless, view-independent, foot-surface contact identification (FSCi) algorithm was developed. The FSCi algorithm identifies foot-surface contacts in image sequences of gait by quantifying the motion of each foot. The algorithm was validated using standard colour image sequences of walking and running obtained from four unique camera perspectives: output data were compared to three-dimensional motion analysis. The FSCi algorithm identified data for 1243 of 1248 foot-surface contacts; root-mean-square error (RMSE) was 52.2 and 103.4 mm for shod walking and running respectively (all camera perspectives). Findings demonstrated that the FSCi algorithm measured basic, spatio-temporal parameters of walking and running, e.g. step length and step time, without interfering with the activity being observed. 
Furthermore, analyses were independent of camera view. Video footage obtained from the 2011 ATP World Tour Finals was used to develop a combined player tracking and foot-surface contact identification (PT-FSCi) algorithm. Furthermore, a graphical user interface was developed. The PT-FSCi algorithm was used to analyse twenty match-play tennis rallies: output data were compared to manual digitising. The PT-FSCi algorithm tracked player position and identified data for 832 of 890 foot-surface contacts during match-play tennis. RMSE for player position and foot-surface contacts was 232.9 and 121.9 mm respectively. The calculation of step parameters required manual intervention: this reflected the multi-directional nature of tennis. This represents a limitation of the current algorithm; however, the segmentation of player movement phases could allow the automatic calculation of step parameters. The analysis of this data indicated that top ranked tennis players can win rallies using movement strategies previously considered to be defensive. Furthermore, step length data indicated that shorter step lengths formed the majority of step strategy. The largest 25% of steps were observed behind the baseline, aligned with deuce and advantage court sidelines. This reflected lunging and turning manoeuvres at lateral extremes of player movement. The single camera system that has resulted from this project will enable the International Tennis Federation to characterise player step and movement strategy during match-play tennis. This will allow a more informed approach to player-surface interaction research. Furthermore, the system has potential to be used for different applications, ranging from sport to surveillance.
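    The root-mean-square error figures quoted in this abstract compare algorithm output with reference measurements. As a minimal sketch of how such a figure can be computed for 2D positions, assuming matched lists of (x, y) points in millimetres (the function name and input format are illustrative, not taken from the thesis):

```python
import math


def rmse_mm(estimated, reference):
    """Root-mean-square error between matched 2D points, in input units.

    estimated, reference: equal-length sequences of (x, y) tuples,
    e.g. identified and reference foot-surface contact positions.
    """
    if len(estimated) != len(reference) or not estimated:
        raise ValueError("inputs must be non-empty and of equal length")
    squared = [math.dist(e, r) ** 2 for e, r in zip(estimated, reference)]
    return math.sqrt(sum(squared) / len(squared))
```

    Contacts that the algorithm fails to identify have no matched pair, so they would be reported separately, as the abstract does with counts such as 832 of 890.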