Indirect Match Highlights Detection with Deep Convolutional Neural Networks
Highlights in a sports video are usually understood as actions that stimulate excitement or attract the attention of the audience. Considerable effort has been devoted to designing techniques that find highlights automatically, in order to automate the otherwise manual editing process. Most state-of-the-art approaches tackle the problem by training a classifier on information extracted from the TV-style framing of players on the game pitch, learning to detect game actions labeled by human observers according to their perception of a highlight. This is obviously a long and expensive process. In this paper, we reverse the paradigm: instead of looking at the gameplay and inferring what could be exciting for the audience, we directly analyze the audience behavior, which we assume is triggered by events happening during the game. We apply a deep 3D Convolutional Neural Network (3D-CNN) to extract visual features from cropped video recordings of the supporters attending the event. Outputs of the crops belonging to the same frame are then accumulated to produce a value indicating the Highlight Likelihood (HL), which is then used to discriminate between positive samples (i.e. when a highlight occurs) and negative samples (i.e. standard play or time-outs). Experimental results on a public dataset of ice-hockey matches demonstrate the effectiveness of our method and promote further research in this new and exciting direction.
Comment: "Social Signal Processing and Beyond" workshop, in conjunction with ICIAP 201
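The per-frame accumulation step described in the abstract can be sketched in a few lines. This is an illustrative assumption, not the paper's actual pipeline: each spectator crop is assumed to already yield a scalar excitement score from the 3D-CNN (stubbed out here), and the function names, averaging rule, and threshold are hypothetical.

```python
# Hypothetical sketch of per-frame accumulation: scores of all audience
# crops belonging to one frame are combined into a Highlight Likelihood
# (HL), then thresholded into positive (highlight) / negative (standard
# play). All names and the 0.5 threshold are illustrative assumptions.

def highlight_likelihood(crop_scores):
    """Accumulate the per-crop 3D-CNN outputs of one frame into an HL value."""
    return sum(crop_scores) / len(crop_scores)

def classify_frame(crop_scores, threshold=0.5):
    """Label a frame positive (highlight) or negative (standard play)."""
    return highlight_likelihood(crop_scores) >= threshold

# Example: scores for four audience crops of a single frame.
frame_scores = [0.9, 0.7, 0.8, 0.6]
print(classify_frame(frame_scores))  # True -> highlight
```

A sum or learned pooling over crops would work equally well here; the mean simply keeps the HL value in the same range as the individual scores.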
A Fuzzy Logic-Based System for Soccer Video Scenes Classification
Massive video surveillance worldwide captures data but lacks the detailed activity information needed to flag events of interest, while the human burden of monitoring video footage is untenable. Artificial intelligence (AI) can be applied to raw video footage to identify and extract the required information and summarize it in linguistic formats. Automated video summarization usually relies on text-based data such as subtitles, segmenting text and semantics, with little attention paid to summarization from the video footage alone. Classification problems in recorded videos are often very complex and uncertain due to the dynamic nature of the video sequence and to lighting conditions, background, camera angle, occlusions, indistinguishable scene features, etc.
Video scene classification forms the basis of linguistic video summarization, an open research problem of major commercial importance. Soccer video scenes present added challenges due to specific objects and events with similar features (e.g. "people" includes audiences, coaches, and players), as well as being constituted from a series of quickly changing and dynamic frames with small inter-frame variations. There is an added difficulty associated with the need for lightweight video classification systems that work in real time on massive data sizes.
In this thesis, we introduce a novel system based on an Interval Type-2 Fuzzy Logic Classification System (IT2FLCS) whose parameters are optimized by the Big Bang–Big Crunch (BB-BC) algorithm, allowing automatic scene classification using optimized rules in broadcast soccer match video. Type-2 fuzzy logic systems present a highly interpretable and transparent model, well suited to handling the uncertainties encountered in video footage and to converting the accumulated data into linguistic formats that can be easily stored and analysed. In contrast, traditional black-box techniques, such as support vector machines (SVMs) and neural networks, do not provide models that can be easily analysed and understood by human users. BB-BC optimization is a heuristic, population-based evolutionary approach characterized by ease of implementation, fast convergence, and low computational cost. We employed BB-BC to optimize the parameters of our system's fuzzy membership functions and fuzzy rules. Using BB-BC we are able to balance system transparency (through generating a small rule set) with increased scene-classification accuracy. The proposed fuzzy-based system thus achieves relatively high classification accuracy with a small number of rules, increasing the system's interpretability and allowing real-time processing. The Type-2 Fuzzy Logic Classification System (T2FLCS) obtained 87.57% prediction accuracy in the scene classification of our testing group data, better than its type-1 fuzzy classification system and neural network counterparts.
The BB-BC optimization algorithm decreases the size of the rule bases in both the T1FLCS and the T2FLCS; with the reduced rules the T2FLCS still achieved 85.716%, outperforming the T1FLCS and neural network counterparts, especially on the "out-of-range" data, which validates the T2FLCS's capability to handle the high level of uncertainty faced.
We also present a novel approach that combines the scene classification system with the dynamic time warping algorithm to implement video event detection for real-world processing. The proposed system can run on recorded or live video clips and outputs a label describing the event, providing a high-level summarization of the video to the user.
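The dynamic time warping (DTW) component mentioned above can be illustrated with a minimal, self-contained sketch. The 1-D sequences and absolute-difference cost used here are illustrative assumptions, not the thesis's actual feature representation.

```python
# Minimal classic DTW sketch: aligns two sequences of scene scores and
# returns the accumulated alignment cost, tolerating time stretching.
# The 1-D inputs and cost function are illustrative assumptions.

def dtw_distance(a, b):
    """O(len(a)*len(b)) dynamic time warping with |x - y| step cost."""
    n, m = len(a), len(b)
    INF = float("inf")
    d = [[INF] * (m + 1) for _ in range(n + 1)]
    d[0][0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            cost = abs(a[i - 1] - b[j - 1])
            d[i][j] = cost + min(d[i - 1][j],      # insertion
                                 d[i][j - 1],      # deletion
                                 d[i - 1][j - 1])  # match
    return d[n][m]

# A time-stretched copy of a pattern still aligns with zero cost.
print(dtw_distance([1, 2, 3], [1, 1, 2, 2, 3]))  # 0.0
```

This tolerance to non-uniform speed is what makes DTW a natural fit for matching event templates against video whose scenes unfold at varying pace.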
A Novel and Effective Short Track Speed Skating Tracking System
This dissertation proposes a novel and effective system for tracking high-speed skaters. A novel registration method is employed to automatically discover key frames to build a panorama, from which the homography between a frame and the real-world rink can be generated. Aimed at several challenging tracking problems in short track skating, a novel multiple-object tracking approach is proposed that combines Gaussian mixture models (GMMs), evolving templates, a constrained dynamical model, a fuzzy model, and multiple-template initialization and evolution. The outputs of the system include spatio-temporal trajectories, velocity analysis, and 2D reconstruction animations. The tracking accuracy is about 10 cm (2 pixels). Such information is invaluable for sports experts. Experimental results demonstrate the effectiveness and robustness of the proposed system.
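The homography step in the abstract above, mapping a tracked skater's pixel position to rink coordinates, can be sketched as a plain projective transform. The 3x3 matrix values and function name here are illustrative assumptions; in the dissertation the homography is derived from the automatically built panorama.

```python
# Illustrative sketch: once a 3x3 homography H between a video frame and
# the rink plane is known (the matrices below are made up), a tracked
# pixel position maps to rink-plane coordinates in homogeneous form.

def apply_homography(H, x, y):
    """Map pixel (x, y) to plane coordinates via the 3x3 homography H."""
    xh = H[0][0] * x + H[0][1] * y + H[0][2]
    yh = H[1][0] * x + H[1][1] * y + H[1][2]
    w  = H[2][0] * x + H[2][1] * y + H[2][2]
    return xh / w, yh / w  # divide out the homogeneous scale

# Identity homography leaves points unchanged.
H = [[1, 0, 0], [0, 1, 0], [0, 0, 1]]
print(apply_homography(H, 320, 240))  # (320.0, 240.0)
```

Applying such a transform per frame is what turns image-space tracks into the spatio-temporal rink trajectories and velocity estimates the system outputs.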
Deep Decision Trees for Discriminative Dictionary Learning with Adversarial Multi-Agent Trajectories
With the explosion in the availability of spatio-temporal tracking data in
modern sports, there is an enormous opportunity to better analyse, learn and
predict important events in adversarial group environments. In this paper, we
propose a deep decision tree architecture for discriminative dictionary
learning from adversarial multi-agent trajectories. We first build up a
hierarchy for the tree structure by adding layers and performing
feature-weight-based clustering in the forward pass. We then fine-tune the
player-role weights using backpropagation. The hierarchical architecture ensures the
interpretability and the integrity of the group representation. The resulting
architecture is a decision tree, with leaf-nodes capturing a dictionary of
multi-agent group interactions. Due to the ample volume of data available, we
focus on soccer tracking data, although our approach can be used in any
adversarial multi-agent domain. We present applications of the proposed method for simulating soccer games as well as for evaluating and quantifying team strategies.
Comment: To appear in the 4th International Workshop on Computer Vision in Sports (CVsports) at CVPR 201
Real-time event classification in field sport videos
The paper presents a novel approach to real-time event detection in sports broadcasts. We show how the same underlying audio-visual feature extraction algorithm, based on new global image descriptors, is robust across a range of different sports, alleviating the need to tailor it to a particular sport. In addition, we propose and evaluate three different classifiers for detecting events from these features: a feed-forward neural network, an Elman neural network, and a decision tree. Each is investigated and evaluated in terms of its usefulness for real-time event classification. We also propose a ground-truth dataset, together with an annotation technique for performance evaluation of each classifier, useful to others interested in this problem.
Multimodal Prediction based on Graph Representations
This paper proposes a learning model, based on rank-fusion graphs, for
general applicability in multimodal prediction tasks, such as multimodal
regression and image classification. Rank-fusion graphs encode information from
multiple descriptors and retrieval models, thus being able to capture
underlying relationships between modalities, samples, and the collection
itself. The solution is based on the encoding of multiple ranks for a query (or
test sample), defined according to different criteria, into a graph. Later, we
project the generated graph into an induced vector space, creating fusion
vectors, targeting broader generality and efficiency. A fusion vector estimator
is then built to infer whether a multimodal input object refers to a class or
not. Our method produces a fusion model that outperforms early-fusion and late-fusion alternatives. Experiments performed on multiple multimodal and visual datasets, with several descriptors and retrieval models, demonstrate that our learning model is highly effective for different prediction scenarios involving visual, textual, and multimodal features, yielding better effectiveness than state-of-the-art methods.
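The paper's rank-fusion graphs encode multiple ranked lists into a graph before projecting them into fusion vectors. As a simpler, standard illustration of the kind of input being fused, the sketch below uses reciprocal rank fusion (RRF), which is not the paper's method; the descriptor names and the RRF constant k=60 are illustrative assumptions.

```python
# Standard reciprocal rank fusion (RRF) sketch: combines ranked lists
# produced by different descriptors/retrieval models into one ranking.
# NOT the paper's rank-fusion-graph method; shown only to illustrate
# fusing ranks from multiple modalities.

def reciprocal_rank_fusion(rankings, k=60):
    """Score each item as sum over lists of 1 / (k + rank), rank from 1."""
    scores = {}
    for ranking in rankings:
        for rank, item in enumerate(ranking, start=1):
            scores[item] = scores.get(item, 0.0) + 1.0 / (k + rank)
    return sorted(scores, key=scores.get, reverse=True)

# Three hypothetical descriptors rank the same samples differently;
# fusion reconciles their disagreements into a single consensus order.
visual = ["a", "b", "c"]
audio = ["b", "a", "c"]
textual = ["a", "c", "b"]
print(reciprocal_rank_fusion([visual, audio, textual]))
```

Where RRF collapses the rank lists directly into scores, the paper's approach keeps them as a graph first, so relationships between modalities and samples can be exploited before the projection into fusion vectors.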
A Literature Study On Video Retrieval Approaches
A detailed survey was carried out to identify the research articles available in the literature across all categories of video retrieval, and to analyse their major contributions and advantages; these works form the basis for this assessment of the state of the art in video retrieval. A large number of papers have been studied.