7 research outputs found
Extraction and Classification of Diving Clips from Continuous Video Footage
Due to recent advances in technology, the recording and analysis of video
data has become an increasingly common component of athlete training
programmes. Today it is incredibly easy and affordable to set up a fixed camera
and record athletes in a wide range of sports, such as diving, gymnastics,
golf, tennis, etc. However, the manual analysis of the obtained footage is a
time-consuming task which involves isolating actions of interest and
categorizing them using domain-specific knowledge. In order to automate this
kind of task, three challenging sub-problems are often encountered: 1)
temporally cropping events/actions of interest from continuous video; 2)
tracking the object of interest; and 3) classifying the events/actions of
interest.
Most previous work has focused on solving just one of the above sub-problems
in isolation. In contrast, this paper provides a complete solution to the
overall action monitoring task in the context of a challenging real-world
exemplar. Specifically, we address the problem of diving classification. This
is a challenging problem since the person (diver) of interest typically
occupies fewer than 1% of the pixels in each frame. The model is required to
learn the temporal boundaries of a dive, even though other divers and
bystanders may be in view. Finally, the model must be sensitive to subtle
changes in body pose over a large number of frames to determine the
classification code. We provide effective solutions to each of the sub-problems
which combine to provide a highly functional solution to the task as a whole.
The techniques proposed can be easily generalized to video footage recorded
from other sports.Comment: To appear at CVsports 201
A systematic review of the use of Deep Learning in Satellite Imagery for Agriculture
Agricultural research is essential for increasing food production to meet the
requirements of an increasing population in the coming decades. Recently,
satellite technology has been improving rapidly and deep learning has seen much
success in generic computer vision tasks and many application areas which
presents an important opportunity to improve analysis of agricultural land.
Here we present a systematic review of 150 studies to find the current uses of
deep learning on satellite imagery for agricultural research. Although we
identify 5 categories of agricultural monitoring tasks, the majority of the
research interest is in crop segmentation and yield prediction. We found that,
when used, modern deep learning methods consistently outperformed traditional
machine learning across most tasks; the only exception was that Long Short-Term
Memory (LSTM) Recurrent Neural Networks did not consistently outperform Random
Forests (RF) for yield prediction. The reviewed studies have largely adopted
methodologies from generic computer vision, except for one major omission:
benchmark datasets are not utilised to evaluate models across studies, making
it difficult to compare results. Additionally, some studies have specifically
utilised the extra spectral resolution available in satellite imagery, but
other divergent properties of satellite images - such as the hugely different
scales of spatial patterns - are not being taken advantage of in the reviewed
studies.Comment: 25 pages, 2 figures and lots of large tables. Supplementary materials
section included here in main pd
Pose is all you need: The pose only group activity recognition system (POGARS)
We introduce a novel deep learning based group activity recognition approach
called the Pose Only Group Activity Recognition System (POGARS), designed to
use only tracked poses of people to predict the performed group activity. In
contrast to existing approaches for group activity recognition, POGARS uses 1D
CNNs to learn spatiotemporal dynamics of individuals involved in a group
activity and forgo learning features from pixel data. The proposed model uses a
spatial and temporal attention mechanism to infer person-wise importance and
multi-task learning for simultaneously performing group and individual action
classification. Experimental results confirm that POGARS achieves highly
competitive results compared to state-of-the-art methods on a widely used
public volleyball dataset despite only using tracked pose as input. Further our
experiments show by using pose only as input, POGARS has better generalization
capabilities compared to methods that use RGB as input.Comment: 12 pages, 7 figure