471 research outputs found
Indoor Activity Detection and Recognition for Sport Games Analysis
Activity recognition in sport is an attractive field for computer vision
research. Game, player and team analysis are of great interest and research
topics within this field emerge with the goal of automated analysis. The very
specific underlying rules of sports can be used as prior knowledge for the
recognition task and present a constrained environment for evaluation. This
paper describes recognition of single player activities in sport with special
emphasis on volleyball. Starting from a per-frame player-centered activity
recognition, we incorporate geometry and contextual information via an activity
context descriptor that collects information about all players' activities over
a certain timespan relative to the investigated player. The benefit of this
context information on single player activity recognition is evaluated on our
new real-life dataset presenting a total amount of almost 36k annotated frames
containing 7 activity classes within 6 videos of professional volleyball games.
Our incorporation of the contextual information improves the average
player-centered classification performance of 77.56% by up to 18.35% on
specific classes, proving that spatio-temporal context is an important clue for
activity recognition.
Comment: Part of the OAGM 2014 proceedings (arXiv:1404.3538)
Building with Drones: Accurate 3D Facade Reconstruction using MAVs
Automatic reconstruction of 3D models from images using multi-view
Structure-from-Motion methods has been one of the most fruitful outcomes of
computer vision. These advances, combined with the growing popularity of Micro
Aerial Vehicles as an autonomous imaging platform, have made 3D vision tools
ubiquitous for a large number of Architecture, Engineering and Construction
applications among audiences, mostly unskilled in computer vision. However, to
obtain high-resolution and accurate reconstructions from a large-scale object
using SfM, there are many critical constraints on the quality of image data,
which often become sources of inaccuracy because current 3D reconstruction
pipelines do not let users assess the fidelity of the input data
during image acquisition. In this paper, we present and advocate a
closed-loop interactive approach that performs incremental reconstruction in
real-time and gives users online feedback on quality parameters such as
Ground Sampling Distance (GSD) and image redundancy on a surface mesh. We
also propose a novel multi-scale camera network design to prevent scene drift
caused by incremental map building, and release the first multi-scale image
sequence dataset as a benchmark. Further, we evaluate our system on real
outdoor scenes, and show that our interactive pipeline combined with a
multi-scale camera network approach provides compelling accuracy in multi-view
reconstruction tasks when compared against the state-of-the-art methods.
Comment: 8 Pages, 2015 IEEE International Conference on Robotics and
Automation (ICRA '15), Seattle, WA, US
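The Ground Sampling Distance named in the abstract above can be approximated, under a simple pinhole model with the camera looking straight at a planar surface, as the footprint of one pixel on that surface. The following is a minimal sketch under that assumption; the function name, parameters and example numbers are illustrative and are not taken from the paper.

```python
def ground_sampling_distance(distance_m, focal_length_mm, sensor_width_mm, image_width_px):
    """Approximate surface footprint of one pixel, in metres per pixel.

    Assumes a pinhole camera looking straight at a planar surface
    (an illustrative simplification, not the paper's pipeline).
    """
    if min(distance_m, focal_length_mm, sensor_width_mm, image_width_px) <= 0:
        raise ValueError("all inputs must be positive")
    pixel_pitch_mm = sensor_width_mm / image_width_px  # physical size of one pixel
    # Similar triangles: footprint / distance = pixel pitch / focal length.
    return distance_m * pixel_pitch_mm / focal_length_mm


# Hypothetical numbers: 10 m stand-off, 10 mm lens, 10 mm wide sensor, 1000 px wide image.
gsd = ground_sampling_distance(10.0, 10.0, 10.0, 1000)
```

Under this model the GSD grows linearly with the distance to the surface, which is why feedback during acquisition can tell an operator to move closer when finer detail is needed.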
Using Self-Contradiction to Learn Confidence Measures in Stereo Vision
Learned confidence measures gain increasing importance for outlier removal
and quality improvement in stereo vision. However, acquiring the necessary
training data is typically a tedious and time-consuming task that involves
manual interaction, active sensing devices and/or synthetic scenes. To overcome
this problem, we propose a new, flexible, and scalable way for generating
training data that only requires a set of stereo images as input. The key idea
of our approach is to use different viewpoints for reasoning about
contradictions and consistencies between multiple depth maps generated with the
same stereo algorithm. This enables us to generate a huge amount of training
data in a fully automated manner. Among other experiments, we demonstrate the
potential of our approach by boosting the performance of three learned
confidence measures on the KITTI2012 dataset by simply training them on a vast
amount of automatically generated training data rather than a limited amount of
laser ground truth data.
Comment: This paper was accepted to the IEEE Conference on Computer Vision and
Pattern Recognition (CVPR), 2016. The copyright was transferred to IEEE
(https://www.ieee.org). The official version of the paper will be made
available on IEEE Xplore (R) (http://ieeexplore.ieee.org). This version of
the paper also contains the supplementary material, which will not appear in
IEEE Xplore (R
A Deep Primal-Dual Network for Guided Depth Super-Resolution
In this paper we present a novel method to increase the spatial resolution of
depth images. We combine a deep fully convolutional network with a non-local
variational method in a deep primal-dual network. The joint network computes a
noise-free, high-resolution estimate from a noisy, low-resolution input depth
map. Additionally, a high-resolution intensity image is used to guide the
reconstruction in the network. By unrolling the optimization steps of a
first-order primal-dual algorithm and formulating it as a network, we can train
our joint method end-to-end. This not only enables us to learn the weights of
the fully convolutional network, but also to optimize all parameters of the
variational method and its optimization procedure. The training of such a deep
network requires a large dataset for supervision. Therefore, we generate
high-quality depth maps and corresponding color images with a physically based
renderer. In an exhaustive evaluation we show that our method outperforms the
state-of-the-art on multiple benchmarks.
Comment: BMVC 201
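The idea of unrolling a first-order primal-dual algorithm, as described in the abstract above, can be illustrated with a fixed number of Chambolle-Pock style iterations. The sketch below is only an illustration under strong simplifications: it uses a plain 1D total-variation denoising model instead of the paper's non-local variational method, it has no convolutional network or intensity guidance, and its step sizes are fixed by hand rather than learned. All names are hypothetical.

```python
import numpy as np


def grad(u):
    """Forward differences: (Du)_i = u[i+1] - u[i]."""
    return np.diff(u)


def grad_T(p):
    """Adjoint of grad, mapping n-1 dual values back to n samples."""
    out = np.zeros(len(p) + 1)
    out[:-1] -= p
    out[1:] += p
    return out


def unrolled_primal_dual(f, lam=1.0, steps=300, tau=0.45, sigma=0.45):
    """Run a fixed number of primal-dual steps for
    min_u  lam * ||Du||_1 + 0.5 * ||u - f||^2.
    """
    f = np.asarray(f, dtype=float)
    u = f.copy()
    u_bar = f.copy()
    p = np.zeros(len(f) - 1)
    for _ in range(steps):
        # Dual ascent followed by projection onto [-lam, lam].
        p = np.clip(p + sigma * grad(u_bar), -lam, lam)
        # Primal descent with the closed-form proximal step of the data term.
        u_new = (u - tau * grad_T(p) + tau * f) / (1.0 + tau)
        # Over-relaxation (theta = 1).
        u_bar = 2.0 * u_new - u
        u = u_new
    return u


# Illustrative input: a strongly oscillating signal that the model should flatten.
signal = np.array([0.0, 1.0, 0.0, 1.0, 0.0, 1.0, 0.0, 1.0])
smoothed = unrolled_primal_dual(signal)
```

Because every step is a differentiable array operation (apart from the clipping kinks), a loop like this can in principle be written as network layers, so that quantities such as `lam`, `tau` and `sigma` become trainable; here they are plain constants chosen so that `tau * sigma * 4 <= 1`.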
Scalable Surface Reconstruction from Point Clouds with Extreme Scale and Density Diversity
In this paper we present a scalable approach for robustly computing a 3D
surface mesh from multi-scale multi-view stereo point clouds that can handle
extreme jumps of point density (in our experiments three orders of magnitude).
The backbone of our approach is a combination of octree data partitioning,
local Delaunay tetrahedralization and graph cut optimization. Graph cut
optimization is used twice, once to extract surface hypotheses from local
Delaunay tetrahedralizations and once to merge overlapping surface hypotheses
even when the local tetrahedralizations do not share the same topology. This
formulation allows us to obtain a constant memory consumption per sub-problem
while at the same time retaining the density-independent interpolation
properties of the Delaunay-based optimization. On multiple public datasets, we
demonstrate that our approach is highly competitive with the state-of-the-art
in terms of accuracy, completeness and outlier resilience. Further, we
demonstrate the multi-scale potential of our approach by processing a newly
recorded dataset with 2 billion points and a point density variation of more
than four orders of magnitude - requiring less than 9GB of RAM per process.
Comment: This paper was accepted to the IEEE Conference on Computer Vision and
Pattern Recognition (CVPR), 2017. The copyright was transferred to IEEE
(ieee.org). The official version of the paper will be made available on IEEE
Xplore (R) (ieeexplore.ieee.org). This version of the paper also contains the
supplementary material, which will not appear in IEEE Xplore (R