546 research outputs found
A Survey of Deep Learning in Sports Applications: Perception, Comprehension, and Decision
Deep learning has the potential to revolutionize sports performance, with
applications ranging from perception and comprehension to decision. This paper
presents a comprehensive survey of deep learning in sports performance,
focusing on three main aspects: algorithms, datasets and virtual environments,
and challenges. Firstly, we discuss the hierarchical structure of deep learning
algorithms in sports performance which includes perception, comprehension and
decision while comparing their strengths and weaknesses. Secondly, we list
widely used existing datasets in sports and highlight their characteristics and
limitations. Finally, we summarize current challenges and point out future
trends of deep learning in sports. Our survey provides valuable reference
material for researchers interested in deep learning in sports applications
Training Algorithms for Multiple Object Tracking
Multiple object tracking is a crucial Computer Vision Task. It aims at locating objects of interest in the image sequences, maintaining their identities, and identifying their trajectories over time. A large portion of current research focuses on tracking pedestrians, and other types of objects, that often exhibit predictable behaviours, that allow us, as humans, to track those objects. Nevertheless, most existing approaches rely solely on simple affinity or appearance cues to maintain the identities of the tracked objects, ignoring their behaviour. This presents a challenge when objects of interest are invisible or indistinguishable for a long period of time.
In this thesis, we focus on enhancing the quality of multiple object trackers by learning and exploiting the long ranging models of object behaviour. Such behaviours come in different forms, be it a physical model of the ball motion, model of interaction between the ball and the players in sports or motion patterns of pedestrians or cars, that is specific to a particular scene.
In the first part of the thesis, we begin with the task of tracking the ball and the players in team sports. We propose a model that tracks both types of objects simultaneously, while respecting the physical laws of ball motion when in free fall, and interaction constraints that appear when players are in the possession of the ball. We show that both the presence of the behaviour models and the simultaneous solution of both tasks aids the performance of tracking, in basketball, volleyball, and soccer.
In the second part of the thesis, we focus on motion models of pedestrian and car behaviour that emerge in the outdoor scenes. Such motion models are inherently global, as they determine where people starting from one location tend to end up much later in time. Imposing such global constraints while keeping the tracking problem tractable presents a challenge, which is why many approaches rely on local affinity measures. We formulate a problem of simultaneously tracking the objects and learning their behaviour patterns. We show that our approach, when applied in conjunction with a number of state-of-the-art trackers, improves their performance, by forcing their output to follow the learned motion patterns of the scene.
In the last part of the thesis, we study a new emerging class of models for multiple object tracking, that appeared recently due to availability of large scale datasets - sequence models for multiple object tracking. While such models could potentially learn arbitrarily long ranging behaviours, training them presents several challenges. We propose a training scheme and a loss function that allows to significantly improve the quality of training of such models. We demonstrate that simply using our training scheme and loss allows to learn scoring function for trajectories, which enables us to outperform state-of-the-art methods on several tracking benchmarks
Automatic visual detection of human behavior: a review from 2000 to 2014
Due to advances in information technology (e.g., digital video cameras, ubiquitous sensors), the automatic detection of human behaviors from video is a very recent research topic. In this paper, we perform a systematic and recent literature review on this topic, from 2000 to 2014, covering a selection of 193 papers that were searched from six major scientific publishers. The selected papers were classified into three main subjects: detection techniques, datasets and applications. The detection techniques were divided into four categories (initialization, tracking, pose estimation and recognition). The list of datasets includes eight examples (e.g., Hollywood action). Finally, several application areas were identified, including human detection, abnormal activity detection, action recognition, player modeling and pedestrian detection. Our analysis provides a road map to guide future research for designing automatic visual human behavior detection systems.This work is funded by the Portuguese Foundation for Science and Technology (FCT - Fundacao para a Ciencia e a Tecnologia) under research Grant SFRH/BD/84939/2012
Fast human behavior analysis for scene understanding
Human behavior analysis has become an active topic of great interest and relevance for a number of applications and areas of research. The research in recent years has been considerably driven by the growing level of criminal behavior in large urban areas and increase of terroristic actions. Also, accurate behavior studies have been applied to sports analysis systems and are emerging in healthcare. When compared to conventional action recognition used in security applications, human behavior analysis techniques designed for embedded applications should satisfy the following technical requirements: (1) Behavior analysis should provide scalable and robust results; (2) High-processing efficiency to achieve (near) real-time operation with low-cost hardware; (3) Extensibility for multiple-camera setup including 3-D modeling to facilitate human behavior understanding and description in various events. The key to our problem statement is that we intend to improve behavior analysis performance while preserving the efficiency of the designed techniques, to allow implementation in embedded environments. More specifically, we look into (1) fast multi-level algorithms incorporating specific domain knowledge, and (2) 3-D configuration techniques for overall enhanced performance. If possible, we explore the performance of the current behavior-analysis techniques for improving accuracy and scalability. To fulfill the above technical requirements and tackle the research problems, we propose a flexible behavior-analysis framework consisting of three processing-layers: (1) pixel-based processing (background modeling with pixel labeling), (2) object-based modeling (human detection, tracking and posture analysis), and (3) event-based analysis (semantic event understanding). In Chapter 3, we specifically contribute to the analysis of individual human behavior. A novel body representation is proposed for posture classification based on a silhouette feature. Only pure binary-shape information is used for posture classification without texture/color or any explicit body models. To this end, we have studied an efficient HV-PCA shape-based descriptor with temporal modeling, which achieves a posture-recognition accuracy rate of about 86% and outperforms other existing proposals. As our human motion scheme is efficient and achieves a fast performance (6-8 frames/second), it enables a fast surveillance system or further analysis of human behavior. In addition, a body-part detection approach is presented. The color and body ratio are combined to provide clues for human body detection and classification. The conventional assumption of up-right body posture is not required. Afterwards, we design and construct a specific framework for fast algorithms and apply them in two applications: tennis sports analysis and surveillance. Chapter 4 deals with tennis sports analysis and presents an automatic real-time system for multi-level analysis of tennis video sequences. First, we employ a 3-D camera model to bridge the pixel-level, object-level and scene-level of tennis sports analysis. Second, a weighted linear model combining the visual cues in the real-world domain is proposed to identify various events. The experimentally found event extraction rate of the system is about 90%. Also, audio signals are combined to enhance the scene analysis performance. The complete proposed application is efficient enough to obtain a real-time or near real-time performance (2-3 frames/second for 720Ă—576 resolution, and 5-7 frames/second for 320Ă—240 resolution, with a P-IV PC running at 3GHz). Chapter 5 addresses surveillance and presents a full real-time behavior-analysis framework, featuring layers at pixel, object, event and visualization level. More specifically, this framework captures the human motion, classifies its posture, infers the semantic event exploiting interaction modeling, and performs the 3-D scene reconstruction. We have introduced our system design based on a specific software architecture, by employing the well-known "4+1" view model. In addition, human behavior analysis algorithms are directly designed for real-time operation and embedded in an experimental runtime AV content-analysis architecture. This executable system is designed to be generic for multiple streaming applications with component-based architectures. To evaluate the performance, we have applied this networked system in a single-camera setup. The experimental platform operates with two Pentium Quadcore engines (2.33 GHz) and 4-GB memory. Performance evaluations have shown that this networked framework is efficient and achieves a fast performance (13-15 frames/second) for monocular video sequences. Moreover, a dual-camera setup is tested within the behavior-analysis framework. After automatic camera calibration is conducted, the 3-D reconstruction and communication among different cameras are achieved. The extra view in the multi-camera setup improves the human tracking and event detection in case of occlusion. This extension of multiple-view fusion improves the event-based semantic analysis by 8.3-16.7% in accuracy rate. The detailed studies of two experimental intelligent applications, i.e., tennis sports analysis and surveillance, have proven their value in several extensive tests in the framework of the European Candela and Cantata ITEA research programs, where our proposed system has demonstrated competitive performance with respect to accuracy and efficiency
Recommended from our members
Movement recognition from wearable sensors data: power-aware evolutionary training for template matching and data annotation recovery methods
Human activities recognition finds numerous applications for example in sport training, patient rehabilitation, gait analysis and surgical skills evaluation. Wearable sensing and Template Matching Methods (TMMs) offer significant advantages compared to manual assessment methods as well as to more cumbersome camera-based setups and other machine learning (ML) algorithms.
TMMs require less data for training than other ML methods, they are low-power and therefore suitable for integration on wearable sensor. They compute a sample-by-sample distance between two time series. When applied to gestures sensors data, this even enables a richer and more movement-specific assessment and feedback. However, TMMs lack of a standard training procedure.
In this thesis, we introduce an innovative evolutionary training algorithm for TMMthat not only can maximize recognition performance, but it can also prefer power-minimisation by reducing the TMM’s computational cost with a configurable trade-off. We exhibit that a reduction is even possible without sacrificing recognition performance by exploiting the long-established concept of “time warping”. We demonstrate that our method is suitable for a wide variety of raw data as well as processed, fused and encoded sensor data.
We present a new original multi-modal, multi-user dataset of beach volleyball movements that allowed to evaluate our training methods on a real-case of sport training actions. Moreover, the collection of this dataset helped to generate a set of guidelines for the collection of movement data in the wild, using wearable sensors.
We introduce a 3D human model that can be animated through inertial wearable sensors data for troubleshooting, movement analysis and privacy-safe annotation of human activities. Finally, through a case study on a dataset of drinking actions, we demonstrate how TMM can improve the quality of a badly annotated but highly valuable dataset
Spatio-temporal human action detection and instance segmentation in videos
With an exponential growth in the number of video capturing devices and digital video content, automatic video understanding is now at the forefront of computer vision research. This thesis presents a series of models for automatic human action detection in videos and also addresses the space-time action instance segmentation problem. Both action detection and instance segmentation play vital roles in video understanding.
Firstly, we propose a novel human action detection approach based on a frame-level deep feature representation combined with a two-pass dynamic programming approach. The method obtains a frame-level action representation by leveraging recent advances in deep learning based action recognition and object detection methods. To combine the the complementary appearance and motion cues, we introduce a new fusion technique which signicantly improves the detection performance. Further, we cast the temporal action detection as two energy optimisation problems which are solved using Viterbi algorithm.
Exploiting a video-level representation further allows the network to learn the inter-frame temporal correspondence between action regions and it is bound to be a more optimal solution to the action detection problem than a frame-level representation. Secondly, we propose a novel deep network architecture which learns a video-level action representation by classifying and regressing 3D region proposals spanning two successive video frames. The proposed model is end-to-end trainable and can be jointly optimised for both proposal generation and action detection objectives in a single training step. We name our new network as \AMTnet" (Action Micro-Tube regression Network). We further extend the AMTnet model by incorporating optical ow features to encode motion patterns of actions.
Finally, we address the problem of action instance segmentation in which multiple concurrent actions of the same class may be segmented out of an image sequence. By taking advantage of recent work on action foreground-background segmentation, we are able to associate each action tube with class-specic segmentations.
We demonstrate the performance of our proposed models on challenging action detection benchmarks achieving new state-of-the-art results across the board and signicantly increasing detection speed at test time
Application of Artificial Intelligence in Basketball Sport
Basketball is among the most popular sports in the world, and its related industries have also produced huge economic benefits. In recent years, the application of artificial intelligence (AI) technology in basketball has attracted a large amount of attention. We conducted a comprehensive review of the application research of AI in basketball through literature retrieval. Current research focuses on the AI analysis of basketball team and player performance, prediction of competition results, analysis and prediction of shooting, AI coaching system, intelligent training machine and arena, and sports injury prevention. Most studies have shown that AI technology can improve the training level of basketball players, help coaches formulate suitable game strategies, prevent sports injuries, and improve the enjoyment of games. At the same time, it is also found that the number and level of published papers are relatively limited. We believe that the application of AI in basketball is still in its infancy. We call on relevant industries to increase their research investment in this area, and promote the improvement of the level of basketball, making the game increasingly exciting as its worldwide popularity continues to increase
Recommended from our members
A Novel Multi-View Table Tennis Umpiring Framework
This research investigates the development of a low-cost multi-view umpiring framework, as an alternative to the current expensive systems that are almost exclusively restricted to elite professional sports. Table tennis has been selected as the testbed because, while automating the process is challenging, it has many different complex match elements including the service, return and rallies, which are governed by a strict set of regulations. The focus is mainly on the rally element rather than the whole match. Ball detection and tracking in video frames are undertaken to determine reliably the ball position relative to key reference objects like the table surface and net, and the ball’s flight path is used to determine the rally’s status.
While a low-cost option has benefits, it is technically challenging due to the limited number of cameras and generally low video resolution used. This thesis presents a portable multi-view umpiring framework that identifies each state change in a rally. It makes three significant contributions to knowledge: i) a reliable ball detection strategy that accurately detects the location of the ball in low-resolution sequences; ii) a novel framework for ball tracking using a multi-view system, and iii) a new state-machine based evaluation system for analysing table tennis rallies.
In a series of ten different test scenarios, the system achieved an average of 94% system detection rate and 100% accurate decisions. A test sequence of duration 1 s can be processed in 8 s, leading to a delay of only 7 s, which is considered acceptable for practical purposes. This solution has the potential to reform the way matches are umpired, providing objectivity in resolving disputed decisions. It affords an economic technology for amateur players, while the multi-view facility is extendible to other relevant ball-based sports. Finally, the ball flight path analysis mechanism can be a valuable training tool for skills development
- …