2,645 research outputs found
Multi-camera analysis of soccer sequences
The automatic detection of meaningful phases in a soccer game depends on the accurate localization of players and the ball at each moment. However, the automatic analysis of soccer sequences is a challenging task due to the presence of fast moving multiple objects. For this purpose, we present a multi-camera analysis system that yields the position of the ball and players on a common ground plane. The detection in each camera is based on a code-book algorithm and different features are used to classify the detected blobs. The detection results of each camera are transformed using homography to a virtual top-view of the playing field. Within this virtual top-view we merge trajectory information of the different cameras allowing to refine the found positions. In this paper, we evaluate the system on a public SOCCER dataset and end with a discussion of possible improvements of the dataset
Large-Scale Mapping of Human Activity using Geo-Tagged Videos
This paper is the first work to perform spatio-temporal mapping of human
activity using the visual content of geo-tagged videos. We utilize a recent
deep-learning based video analysis framework, termed hidden two-stream
networks, to recognize a range of activities in YouTube videos. This framework
is efficient and can run in real time or faster which is important for
recognizing events as they occur in streaming video or for reducing latency in
analyzing already captured video. This is, in turn, important for using video
in smart-city applications. We perform a series of experiments to show our
approach is able to accurately map activities both spatially and temporally. We
also demonstrate the advantages of using the visual content over the
tags/titles.Comment: Accepted at ACM SIGSPATIAL 201
Soccer on Your Tabletop
We present a system that transforms a monocular video of a soccer game into a
moving 3D reconstruction, in which the players and field can be rendered
interactively with a 3D viewer or through an Augmented Reality device. At the
heart of our paper is an approach to estimate the depth map of each player,
using a CNN that is trained on 3D player data extracted from soccer video
games. We compare with state of the art body pose and depth estimation
techniques, and show results on both synthetic ground truth benchmarks, and
real YouTube soccer footage.Comment: CVPR'18. Project: http://grail.cs.washington.edu/projects/soccer
Capturing the sporting heroes of our past by extracting 3D movements from legacy video content
Sports are a key part of cultural identity, and it is necessary to preserve them as important intangible Cultural Heritage, especially the human motion techniques specific to individual sports. In this paper we present a method for extracting 3D athlete motion from video broadcast sources, providing an important tool for preserving the heritage represented by these movements. Broadcast videos include camera motion, multiple player interaction, occlusions and noise, presenting significant challenges to solve the reconstruction. The approach requires initial definition of some key-frames and setting of 2D key-points in those frames manually. Thereafter an automatic process estimates the poses and positions of the players in the key-frames, and in the frames between key-frames, taking into account collisions with the environment and human kinematic constraints. Initial results are extremely promising and this data could be used to analyze the sport's evolution over time, or even to generate animations for interactive applications
Action Recognition in Videos: from Motion Capture Labs to the Web
This paper presents a survey of human action recognition approaches based on
visual data recorded from a single video camera. We propose an organizing
framework which puts in evidence the evolution of the area, with techniques
moving from heavily constrained motion capture scenarios towards more
challenging, realistic, "in the wild" videos. The proposed organization is
based on the representation used as input for the recognition task, emphasizing
the hypothesis assumed and thus, the constraints imposed on the type of video
that each technique is able to address. Expliciting the hypothesis and
constraints makes the framework particularly useful to select a method, given
an application. Another advantage of the proposed organization is that it
allows categorizing newest approaches seamlessly with traditional ones, while
providing an insightful perspective of the evolution of the action recognition
task up to now. That perspective is the basis for the discussion in the end of
the paper, where we also present the main open issues in the area.Comment: Preprint submitted to CVIU, survey paper, 46 pages, 2 figures, 4
table
- …