Stochastic Prediction of Multi-Agent Interactions from Partial Observations
We present a method that learns to integrate temporal information, from a
learned dynamics model, with ambiguous visual information, from a learned
vision model, in the context of interacting agents. Our method is based on a
graph-structured variational recurrent neural network (Graph-VRNN), which is
trained end-to-end to infer the current state of the (partially observed)
world, as well as to forecast future states. We show that our method
outperforms various baselines on two sports datasets, one based on real
basketball trajectories, and one generated by a soccer game engine.
Comment: ICLR 2019 camera read
Online Visual Robot Tracking and Identification using Deep LSTM Networks
Collaborative robots working on a common task are necessary for many
applications. One of the challenges for achieving collaboration in a team of
robots is mutual tracking and identification. We present a novel pipeline for
online vision-based detection, tracking and identification of robots with a
known and identical appearance. Our method runs in real time on the limited
hardware of the observer robot. Unlike previous works addressing robot tracking
and identification, we use a data-driven approach based on recurrent neural
networks to learn relations between sequential inputs and outputs. We formulate
the data association problem as multiple classification problems. A deep LSTM
network was trained on a simulated dataset and fine-tuned on a small set of real
data. Experiments on two challenging datasets, one synthetic and one real,
which include long-term occlusions, show promising results.
Comment: IEEE/RSJ International Conference on Intelligent Robots and Systems
(IROS), Vancouver, Canada, 2017. IROS RoboCup Best Paper Awar
Grad-CAM++: Improved Visual Explanations for Deep Convolutional Networks
Over the last decade, Convolutional Neural Network (CNN) models have been
highly successful in solving complex vision problems. However, these deep
models are perceived as "black box" methods considering the lack of
understanding of their internal functioning. There has been a significant
recent interest in developing explainable deep learning models, and this paper
is an effort in this direction. Building on a recently proposed method called
Grad-CAM, we propose a generalized method called Grad-CAM++ that can provide
better visual explanations of CNN model predictions, in terms of better object
localization as well as explaining occurrences of multiple object instances in
a single image, when compared to the state of the art. We provide a mathematical
derivation for the proposed method, which uses a weighted combination of the
positive partial derivatives of the last convolutional layer feature maps with
respect to a specific class score as weights to generate a visual explanation
for the corresponding class label. Our extensive experiments and evaluations,
both subjective and objective, on standard datasets showed that Grad-CAM++
provides promising human-interpretable visual explanations for a given CNN
architecture across multiple tasks including classification, image caption
generation and 3D action recognition; as well as in new settings such as
knowledge distillation.
Comment: 17 Pages, 15 Figures, 11 Tables. Accepted in the proceedings of IEEE
Winter Conf. on Applications of Computer Vision (WACV2018). Extended version
is under review at IEEE Transactions on Pattern Analysis and Machine
Intelligenc
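The weighted combination described in the Grad-CAM++ abstract can be illustrated with a minimal sketch. It assumes the per-pixel weighting coefficients (called `alphas` here) are already available; the abstract does not give their formula, so this example simply takes them as input and uses uniform values. The function names and toy data are hypothetical and are not taken from the paper.

```python
def relu(x):
    """Keep positive values, zero out the rest."""
    return x if x > 0.0 else 0.0


def channel_weights(grads, alphas):
    """One weight per feature map: sum over pixels of alpha * positive gradient.

    grads[k][i][j]  : partial derivative of the class score w.r.t. feature map k at (i, j)
    alphas[k][i][j] : per-pixel weighting coefficient (ASSUMED given, not derived here)
    """
    return [
        sum(a * relu(g)
            for a_row, g_row in zip(alpha_k, grad_k)
            for a, g in zip(a_row, g_row))
        for alpha_k, grad_k in zip(alphas, grads)
    ]


def saliency_map(feature_maps, grads, alphas):
    """Weighted sum of the feature maps, passed through a ReLU at every pixel."""
    weights = channel_weights(grads, alphas)
    height = len(feature_maps[0])
    width = len(feature_maps[0][0])
    return [
        [relu(sum(w * fmap[i][j] for w, fmap in zip(weights, feature_maps)))
         for j in range(width)]
        for i in range(height)
    ]


# Toy example: two 2x2 feature maps with uniform alphas (an illustrative choice).
feature_maps = [[[1.0, 0.0], [0.0, 2.0]],
                [[0.0, 1.0], [1.0, 0.0]]]
grads = [[[0.5, -1.0], [0.0, 0.5]],
         [[-2.0, -1.0], [0.0, -3.0]]]
alphas = [[[0.25, 0.25], [0.25, 0.25]],
          [[0.25, 0.25], [0.25, 0.25]]]

weights = channel_weights(grads, alphas)
heatmap = saliency_map(feature_maps, grads, alphas)
```

In this toy case the second feature map has no positive gradients, so it receives zero weight and does not contribute to the map.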
Data-driven action-value functions for evaluating players in professional team sports
As more and larger event stream datasets for professional sports become available, there is growing interest in modeling the complex play dynamics to evaluate player performance. Among these models, a common player evaluation method is assigning values to player actions. Traditional action-value metrics, however, consider very limited game context and player information. Furthermore, they assign values only to actions directly related to goals (e.g., shots), not to all actions. Recent work has shown that reinforcement learning provides powerful methods for quantifying the value of player actions in sports. This dissertation develops deep reinforcement learning (DRL) methods for estimating action values in sports. We make several contributions to DRL for sports. First, we develop neural network architectures that learn an action-value Q-function from sports event logs to estimate each team's expected success given the current match context. Specifically, our architecture models the game history with a recurrent network and predicts the probability that a team scores the next goal. From the learned Q-values, we derive a Goal Impact Metric (GIM) for evaluating a player's performance over a game season. We show that the resulting player rankings are consistent with standard player metrics and temporally consistent within and across seasons. Second, we address the interpretability of the learned Q-values. While neural networks provide accurate estimates, their black-box structure prevents understanding the influence of different game features on the action values. To interpret the Q-function and understand the influence of game features on action values, we design an interpretable mimic learning framework for DRL. The framework is based on a Linear Model U-Tree (LMUT) as a transparent mimic model, which facilitates extracting the function rules and computing the feature importance for action values.
Third, we incorporate information about specific players into the action values, by introducing a deep player representation framework. In this framework, each player is assigned a latent feature vector called an embedding, with the property that statistically similar players are mapped to nearby embeddings. To compute embeddings that summarize the statistical information about players, we implement a Variational Recurrent Ladder Agent Encoder (VaRLAE) to learn a contextualized representation for when and how players are likely to act. We learn and evaluate deep Q-functions from event data for both ice hockey and soccer. These are challenging continuous-flow games where game context and medium-term consequences are crucial for properly assessing the impact of a player's actions.
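As a rough illustration of how a per-player metric might be derived from learned Q-values, the sketch below sums, for each player, the change in the acting team's estimated probability of scoring the next goal across that player's actions. The abstract does not define GIM precisely, so this difference-of-Q-values aggregation is an assumption, and the function name, event layout, and numbers are all hypothetical.

```python
from collections import defaultdict


def goal_impact(events):
    """Sum the per-action change in Q-value for each player.

    Each event is (player, q_before, q_after), where both Q-values are the
    acting team's estimated probability of scoring the next goal, just before
    and just after the action. This layout is an ASSUMPTION for illustration.
    """
    totals = defaultdict(float)
    for player, q_before, q_after in events:
        totals[player] += q_after - q_before
    return dict(totals)


# Hypothetical Q-values for three actions by two players on the same team.
events = [
    ("A", 0.30, 0.45),  # A's action raises the scoring probability
    ("B", 0.45, 0.40),  # B's action lowers it
    ("A", 0.40, 0.50),
]
impact = goal_impact(events)
```

Under this assumed definition, a player accumulates credit for every action that raises the team's scoring chances, not only for shots.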
Tracking by Prediction: A Deep Generative Model for Mutli-Person localisation and Tracking
Current multi-person localisation and tracking systems have an over-reliance
on the use of appearance models for target re-identification and almost no
approaches employ a complete deep learning solution for both objectives. We
present a novel, complete deep learning framework for multi-person localisation
and tracking. In this context we first introduce a lightweight sequential
Generative Adversarial Network architecture for person localisation, which
overcomes issues related to occlusions and noisy detections, typically found in
a multi-person environment. In the proposed tracking framework we build upon
recent advances in pedestrian trajectory prediction approaches and propose a
novel data association scheme based on predicted trajectories. This removes the
need for computationally expensive person re-identification systems based on
appearance features and generates human-like trajectories with minimal
fragmentation. The proposed method is evaluated on multiple public benchmarks
including both static and dynamic cameras and is capable of generating
outstanding performance, especially among other recently proposed deep neural
network-based approaches.
Comment: To appear in IEEE Winter Conference on Applications of Computer
Vision (WACV), 201
Using Player's Body-Orientation to Model Pass Feasibility in Soccer
Given a monocular video of a soccer match, this paper presents a
computational model to estimate the most feasible pass at any given time. The
method leverages offensive players' orientation (plus their location) and
opponents' spatial configuration to compute the feasibility of pass events
between players of the same team. Orientation data is gathered from body pose
estimations that are properly projected onto the 2D game field; moreover, a
geometrical solution is provided, through the definition of a feasibility
measure, to determine which players are better oriented towards each other.
After analyzing more than 6000 pass events, results show that, by including
orientation as a feasibility measure, a robust computational model can be
built, reaching more than 0.7 Top-3 accuracy. Finally, the combination of the
orientation feasibility measure with the recently introduced Expected
Possession Value metric is studied; promising results are obtained, thus
showing that existing models can be refined by using orientation as a key
feature. These models could help both coaches and analysts to have a better
understanding of the game and to improve the players' decision-making process.
Comment: Accepted at the Computer Vision in Sports Workshop at CVPR 202
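The Top-3 accuracy reported in this abstract can be read as follows: for each pass event, candidate receivers are ranked by feasibility score, and the event counts as a hit if the actual receiver is among the three highest-ranked candidates. The sketch below computes that metric on made-up scores; the function name, player labels, and values are hypothetical and do not come from the paper.

```python
def top_k_accuracy(events, k=3):
    """Fraction of pass events whose true receiver is among the k best-scored candidates.

    events: list of (scores, true_receiver), where scores maps each candidate
    teammate to a feasibility score (higher means more feasible).
    """
    if not events:
        raise ValueError("at least one pass event is required")
    hits = 0
    for scores, true_receiver in events:
        # Rank candidates from most to least feasible and keep the first k.
        ranked = sorted(scores, key=scores.get, reverse=True)
        if true_receiver in ranked[:k]:
            hits += 1
    return hits / len(events)


# Four hypothetical pass events with four candidate receivers each.
events = [
    ({"p2": 0.9, "p3": 0.5, "p4": 0.1, "p5": 0.3}, "p3"),  # top 3: p2, p3, p5
    ({"p2": 0.2, "p3": 0.8, "p4": 0.6, "p5": 0.7}, "p2"),  # top 3: p3, p5, p4
    ({"p2": 0.4, "p3": 0.1, "p4": 0.9, "p5": 0.2}, "p4"),  # top 3: p4, p2, p5
    ({"p2": 0.4, "p3": 0.1, "p4": 0.9, "p5": 0.2}, "p5"),  # top 3: p4, p2, p5
]
accuracy = top_k_accuracy(events, k=3)
```

With these made-up scores, three of the four true receivers fall inside the top three, so the metric is 0.75.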
Using player's body-orientation to model pass feasibility in soccer
© 2020 IEEE. Personal use of this material is permitted. Permission from IEEE must be obtained for all other uses, in any current or future media, including reprinting/republishing this material for advertising or promotional purposes, creating new collective works, for resale or redistribution to servers or lists, or reuse of any copyrighted component of this work in other works.
Given a monocular video of a soccer match, this paper presents a computational model to estimate the most feasible pass at any given time. The method leverages offensive players' orientation (plus their location) and opponents' spatial configuration to compute the feasibility of pass events between players of the same team. Orientation data is gathered from body pose estimations that are properly projected onto the 2D game field; moreover, a geometrical solution is provided, through the definition of a feasibility measure, to determine which players are better oriented towards each other. After analyzing more than 6000 pass events, results show that, by including orientation as a feasibility measure, a robust computational model can be built, reaching more than 0.7 Top-3 accuracy. Finally, the combination of the orientation feasibility measure with the recently introduced Expected Possession Value metric is studied; promising results are obtained, thus showing that existing models can be refined by using orientation as a key feature. These models could help both coaches and analysts to have a better understanding of the game and to improve the players' decision-making process.
Peer Reviewed. Postprint (author's final draft)
A Survey of Deep Learning in Sports Applications: Perception, Comprehension, and Decision
Deep learning has the potential to revolutionize sports performance, with
applications ranging from perception and comprehension to decision. This paper
presents a comprehensive survey of deep learning in sports performance,
focusing on three main aspects: algorithms, datasets and virtual environments,
and challenges. Firstly, we discuss the hierarchical structure of deep learning
algorithms in sports performance which includes perception, comprehension and
decision while comparing their strengths and weaknesses. Secondly, we list
widely used existing datasets in sports and highlight their characteristics and
limitations. Finally, we summarize current challenges and point out future
trends of deep learning in sports. Our survey provides valuable reference
material for researchers interested in deep learning in sports applications.