Automated Top View Registration of Broadcast Football Videos
In this paper, we propose a novel method to register football broadcast video
frames on the static top view model of the playing surface. The proposed method
is fully automatic in contrast to the current state of the art which requires
manual initialization of point correspondences between the image and the static
model. Automatic registration using existing approaches has been difficult due
to the lack of sufficient point correspondences. We investigate an alternate
approach exploiting the edge information from the line markings on the field.
We formulate the registration problem as a nearest neighbour search over a
synthetically generated dictionary of edge map and homography pairs. The
synthetic dictionary generation allows us to exhaustively cover a wide variety
of camera angles and positions and reduce this problem to a minimal per-frame
edge map matching procedure. We show that the per-frame results can be improved
in videos using an optimization framework for temporal camera stabilization. We
demonstrate the efficacy of our approach by presenting extensive results on a
dataset collected from matches of football World Cup 2014
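The dictionary lookup described in this abstract can be illustrated with a minimal sketch. It assumes each dictionary entry is a binary edge map stored alongside its 3x3 homography, and that the closest entry is chosen by Euclidean distance on raw pixels. The function name nearest_homography, the toy 8x8 edge maps, and the distance measure are assumptions made for illustration; the abstract does not state which features or distance the authors use.

```python
import numpy as np

def nearest_homography(query_edges, dict_edges, dict_homographies):
    """Return (index, homography) of the dictionary edge map closest to the query.

    Hypothetical helper: uses plain Euclidean distance on flattened pixels.
    """
    q = query_edges.reshape(-1).astype(np.float32)
    d = dict_edges.reshape(len(dict_edges), -1).astype(np.float32)
    dists = np.linalg.norm(d - q, axis=1)  # one distance per dictionary entry
    best = int(np.argmin(dists))
    return best, dict_homographies[best]

# Toy synthetic dictionary: entry i has row i of an 8x8 edge map switched on,
# and is paired with a made-up homography (identity scaled by i + 1).
dict_edges = np.zeros((5, 8, 8), dtype=np.uint8)
for i in range(5):
    dict_edges[i, i, :] = 1
dict_homographies = np.stack([np.eye(3) * (i + 1) for i in range(5)])

# A query that matches entry 3 except for one noisy pixel.
query = dict_edges[3].copy()
query[0, 0] ^= 1

best_index, H = nearest_homography(query, dict_edges, dict_homographies)
print(best_index)
```

A brute-force scan like this is only practical for small dictionaries; a larger one would likely need an approximate nearest-neighbour index, which is outside what the abstract specifies.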
Building Scalable Video Understanding Benchmarks through Sports
Existing benchmarks for evaluating long video understanding fall short on
two critical aspects, either lacking in scale or quality of annotations. These
limitations arise from the difficulty in collecting dense annotations for long
videos, which often require manually labeling each frame. In this work, we
introduce an automated Annotation and Video Stream Alignment Pipeline
(abbreviated ASAP). We demonstrate the generality of ASAP by aligning unlabeled
videos of four different sports with corresponding freely available dense web
annotations (i.e. commentary). We then leverage ASAP's scalability to create
LCric, a large-scale long video understanding benchmark, with over 1000 hours
of densely annotated long Cricket videos (with an average sample length of ~50
mins) collected at virtually zero annotation cost. We benchmark and analyze
state-of-the-art video understanding models on LCric through a large set of
compositional multi-choice and regression queries. We establish a human
baseline that indicates significant room for new research to explore. Our human
studies indicate that ASAP can align videos and annotations with high fidelity,
precision, and speed. The dataset along with the code for ASAP and baselines
can be accessed here: https://asap-benchmark.github.io/
Multimedia search without visual analysis: the value of linguistic and contextual information
This paper addresses the focus of this special issue by analyzing the potential contribution of linguistic content and other non-image aspects to the processing of audiovisual data. It summarizes the various ways in which linguistic content analysis contributes to enhancing the semantic annotation of multimedia content, and, as a consequence, to improving the effectiveness of conceptual media access tools. A number of techniques are presented, including the time-alignment of textual resources, audio and speech processing, content reduction and reasoning tools, and the exploitation of surface features
Game Plan: What AI can do for Football, and What Football can do for AI
The rapid progress in artificial intelligence (AI) and machine learning has opened unprecedented
analytics possibilities in various team and individual sports, including baseball, basketball, and
tennis. More recently, AI techniques have been applied to football, due to a huge increase in
data collection by professional teams, increased computational power, and advances in machine
learning, with the goal of better addressing new scientific challenges involved in the analysis of
both individual players' and coordinated teams' behaviors. The research challenges associated
with predictive and prescriptive football analytics require new developments and progress at the
intersection of statistical learning, game theory, and computer vision. In this paper, we provide
an overarching perspective highlighting how the combination of these fields, in particular, forms a
unique microcosm for AI research, while offering mutual benefits for professional teams, spectators,
and broadcasters in the years to come. We illustrate that this duality makes football analytics
a game changer of tremendous value, in terms of not only changing the game of football itself,
but also in terms of what this domain can mean for the field of AI. We review the state-of-the-art
and exemplify the types of analysis enabled by combining the aforementioned fields, including
illustrative examples of counterfactual analysis using predictive models, and the combination of
game-theoretic analysis of penalty kicks with statistical learning of player attributes. We conclude
by highlighting envisioned downstream impacts, including possibilities for extensions to other sports
(real and virtual)
A Survey of Deep Learning in Sports Applications: Perception, Comprehension, and Decision
Deep learning has the potential to revolutionize sports performance, with
applications ranging from perception and comprehension to decision. This paper
presents a comprehensive survey of deep learning in sports performance,
focusing on three main aspects: algorithms, datasets and virtual environments,
and challenges. Firstly, we discuss the hierarchical structure of deep learning
algorithms in sports performance which includes perception, comprehension and
decision while comparing their strengths and weaknesses. Secondly, we list
widely used existing datasets in sports and highlight their characteristics and
limitations. Finally, we summarize current challenges and point out future
trends of deep learning in sports. Our survey provides valuable reference
material for researchers interested in deep learning in sports applications
Recognizing Teamwork Activity In Observations Of Embodied Agents
This thesis presents contributions to the theory and practice of team activity recognition. A particular focus of our work was to improve our ability to collect and label representative samples, thus making team activity recognition more efficient. A second focus of our work was improving the robustness of the recognition process in the presence of noisy and distorted data. The main contributions of this thesis are as follows: We developed a software tool, the Teamwork Scenario Editor (TSE), for the acquisition, segmentation and labeling of teamwork data. Using the TSE we acquired a corpus of labeled team actions both from synthetic and real world sources. We developed an approach through which representations of idealized team actions can be acquired in the form of Hidden Markov Models, which are trained using a small set of representative examples segmented and labeled with the TSE. We developed a set of team-oriented feature functions, which extract discrete features from the high-dimensional continuous data. The features were chosen such that they mimic the features used by humans when recognizing teamwork actions. We developed a technique to recognize the likely roles played by agents in teams even before the team action was recognized. Through experimental studies we show that the feature functions and role recognition module significantly increase the recognition accuracy, while allowing arbitrarily shuffled inputs and noisy data
Precise video feedback through live annotation of football
The domain of sports analysis is a huge field in sports science. Several different computer systems are available for doing analysis, both expensive and less expensive. Some specialize in specific sports such as football or ice hockey, while others are sports agnostic. However, a common property of most of these systems is that they try to give in-depth and detailed analysis of the sport in question.
This thesis proposes and describes a system that provides the user with the ability to annotate interesting happenings during a live sporting event, through a non-invasive mobile device interface. The device permits focus on important happenings by filtering out unnecessary detail. Our system provides corresponding video of the annotations on the same mobile device, thereby facilitating the process of giving video feedback to the involved coaches and players.
We have implemented a prototype of the system that enables evaluation of this idea, and through case studies with Tromsø Idrettslag, a Norwegian Premier League football club, we show its usefulness and applicability
The role of terminology and local grammar in video annotation
The linguistic annotation of video sequences is an intellectually challenging task involving the investigation of how images and words are linked together, a task that is ultimately financially rewarding in that the eventual automatic retrieval of video (sequences) can be much less time consuming, subjective and expensive than when retrieved manually. Much effort has been focused on automatic or semi-automatic annotation. Computational linguistic methods of video annotation rely on collections of collateral text in the form of keywords and proper nouns. Keywords are often used in a particular order indicating an identifiable pattern which is often limited and can subsequently be used to annotate the portion of a video where such a pattern occurred. Once the relevant keywords and patterns have been stored, they can then be used to annotate the remainder of the video, excluding all collateral text which does not match the keywords or patterns. A new method of video annotation is presented in this thesis. The method facilitates a) annotation extraction of specialist terms within a corpus of collateral text; b) annotation identification of frequently used linguistic patterns to use in repeating key events within the data-set. The use of the method has led to the development of a system that can automatically assign key words and key patterns to a number of frames that are found in the commentary text approximately contemporaneous to the selected number of frames. The system does not perform video analysis; it only analyses the collateral text. The method is based on corpus linguistics and is mainly frequency based - frequency of occurrence of a key word or key pattern is taken as the basis of its representation. No assumptions are made about the grammatical structure of the language used in the collateral text, neither is a lexicon of key words refined.
Our system has been designed to annotate videos of football matches in English and Arabic, and also cricket videos in English. The system has also been designed to retrieve annotated clips. The system not only provides a simple search method for retrieving annotated clips, it also provides complex, more advanced search methods.
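The frequency-based idea in this abstract can be sketched in a few lines. The example below counts word frequencies in timestamped commentary, keeps the most frequent words as key words, and attaches them to the commentary lines (and hence the approximate video times) where they occur. The function names key_words and annotate, the tiny stop-word list, and the sample commentary are all hypothetical; the thesis's actual term extraction and pattern identification are not specified in the abstract and are likely more elaborate.

```python
from collections import Counter

# Hypothetical stop-word list, kept tiny for illustration only.
STOP_WORDS = frozenset({"the", "a", "to", "and", "in", "is", "it", "by", "then"})

def key_words(commentary, top_n=2):
    """Pick the top_n most frequent non-stop words across all commentary lines."""
    counts = Counter(
        word
        for _, text in commentary
        for word in text.lower().split()
        if word not in STOP_WORDS
    )
    return [word for word, _ in counts.most_common(top_n)]

def annotate(commentary, keys):
    """Attach to each timestamp the key words found in its commentary line."""
    return [
        (minute, [k for k in keys if k in text.lower().split()])
        for minute, text in commentary
    ]

# Made-up commentary: (match minute, text).
commentary = [
    (12, "goal by the striker"),
    (47, "corner then goal"),
    (63, "goal in the corner"),
    (90, "a save"),
]

keys = key_words(commentary)
annotations = annotate(commentary, keys)
print(keys)
print(annotations)
```

Only the collateral text is analysed here, mirroring the abstract's point that no video analysis is performed.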
SoccerNet-Caption: Dense Video Captioning for Soccer Broadcasts Commentaries
Soccer is more than just a game - it is a passion that transcends borders and
unites people worldwide. From the roar of the crowds to the excitement of the
commentators, every moment of a soccer match is a thrill. Yet, with so many
games happening simultaneously, fans cannot watch them all live. Notifications
for main actions can help, but lack the engagement of live commentary, leaving
fans feeling disconnected. To fulfill this need, we propose in this paper a
novel task of dense video captioning focusing on the generation of textual
commentaries anchored with single timestamps. To support this task, we
additionally present a challenging dataset consisting of almost 37k timestamped
commentaries across 715.9 hours of soccer broadcast videos. Additionally, we
propose a first benchmark and baseline for this task, highlighting the
difficulty of temporally anchoring commentaries yet showing the capacity to
generate meaningful commentaries. By providing broadcasters with a tool to
summarize the content of their video with the same level of engagement as a
live game, our method could help satisfy the needs of the numerous fans who
follow their team but cannot necessarily watch the live game. We believe our
method has the potential to enhance the accessibility and understanding of
soccer content for a wider audience, bringing the excitement of the game to
more people