7 research outputs found

    Extraction and Classification of Diving Clips from Continuous Video Footage

    Full text link
    Due to recent advances in technology, the recording and analysis of video data has become an increasingly common component of athlete training programmes. Today it is incredibly easy and affordable to set up a fixed camera and record athletes in a wide range of sports, such as diving, gymnastics, golf, tennis, etc. However, the manual analysis of the obtained footage is a time-consuming task which involves isolating actions of interest and categorizing them using domain-specific knowledge. In order to automate this kind of task, three challenging sub-problems are often encountered: 1) temporally cropping events/actions of interest from continuous video; 2) tracking the object of interest; and 3) classifying the events/actions of interest. Most previous work has focused on solving just one of the above sub-problems in isolation. In contrast, this paper provides a complete solution to the overall action monitoring task in the context of a challenging real-world exemplar. Specifically, we address the problem of diving classification. This is a challenging problem since the person (diver) of interest typically occupies fewer than 1% of the pixels in each frame. The model is required to learn the temporal boundaries of a dive, even though other divers and bystanders may be in view. Finally, the model must be sensitive to subtle changes in body pose over a large number of frames to determine the classification code. We provide effective solutions to each of the sub-problems which combine to provide a highly functional solution to the task as a whole. The techniques proposed can be easily generalized to video footage recorded from other sports.Comment: To appear at CVsports 201

    A systematic review of the use of Deep Learning in Satellite Imagery for Agriculture

    Full text link
    Agricultural research is essential for increasing food production to meet the requirements of an increasing population in the coming decades. Recently, satellite technology has been improving rapidly and deep learning has seen much success in generic computer vision tasks and many application areas which presents an important opportunity to improve analysis of agricultural land. Here we present a systematic review of 150 studies to find the current uses of deep learning on satellite imagery for agricultural research. Although we identify 5 categories of agricultural monitoring tasks, the majority of the research interest is in crop segmentation and yield prediction. We found that, when used, modern deep learning methods consistently outperformed traditional machine learning across most tasks; the only exception was that Long Short-Term Memory (LSTM) Recurrent Neural Networks did not consistently outperform Random Forests (RF) for yield prediction. The reviewed studies have largely adopted methodologies from generic computer vision, except for one major omission: benchmark datasets are not utilised to evaluate models across studies, making it difficult to compare results. Additionally, some studies have specifically utilised the extra spectral resolution available in satellite imagery, but other divergent properties of satellite images - such as the hugely different scales of spatial patterns - are not being taken advantage of in the reviewed studies.Comment: 25 pages, 2 figures and lots of large tables. Supplementary materials section included here in main pd

    Pose is all you need: The pose only group activity recognition system (POGARS)

    Full text link
    We introduce a novel deep learning based group activity recognition approach called the Pose Only Group Activity Recognition System (POGARS), designed to use only tracked poses of people to predict the performed group activity. In contrast to existing approaches for group activity recognition, POGARS uses 1D CNNs to learn spatiotemporal dynamics of individuals involved in a group activity and forgo learning features from pixel data. The proposed model uses a spatial and temporal attention mechanism to infer person-wise importance and multi-task learning for simultaneously performing group and individual action classification. Experimental results confirm that POGARS achieves highly competitive results compared to state-of-the-art methods on a widely used public volleyball dataset despite only using tracked pose as input. Further our experiments show by using pose only as input, POGARS has better generalization capabilities compared to methods that use RGB as input.Comment: 12 pages, 7 figure

    Trajic: An Effective Compression System for Trajectory Data

    No full text
    corecore