11,730 research outputs found
PoseTrack: A Benchmark for Human Pose Estimation and Tracking
Human poses and motions are important cues for analysis of videos with people
and there is strong evidence that representations based on body pose are highly
effective for a variety of tasks such as activity recognition, content
retrieval and social signal processing. In this work, we aim to further advance
the state of the art by establishing "PoseTrack", a new large-scale benchmark
for video-based human pose estimation and articulated tracking, and bringing
together the community of researchers working on visual human analysis. The
benchmark encompasses three competition tracks focusing on i) single-frame
multi-person pose estimation, ii) multi-person pose estimation in videos, and
iii) multi-person articulated tracking. To facilitate the benchmark and
challenge we collect, annotate and release a new %large-scale benchmark dataset
that features videos with multiple people labeled with person tracks and
articulated pose. A centralized evaluation server is provided to allow
participants to evaluate on a held-out test set. We envision that the proposed
benchmark will stimulate productive research both by providing a large and
representative training dataset as well as providing a platform to objectively
evaluate and compare the proposed methods. The benchmark is freely accessible
at https://posetrack.net.Comment: www.posetrack.ne
Learning to Transform Time Series with a Few Examples
We describe a semi-supervised regression algorithm that learns to transform one time series into another time series given examples of the transformation. This algorithm is applied to tracking, where a time series of observations from sensors is transformed to a time series describing the pose of a target. Instead of defining and implementing such transformations for each tracking task separately, our algorithm learns a memoryless transformation of time series from a few example input-output mappings. The algorithm searches for a smooth function that fits the training examples and, when applied to the input time series, produces a time series that evolves according to assumed dynamics. The learning procedure is fast and lends itself to a closed-form solution. It is closely related to nonlinear system identification and manifold learning techniques. We demonstrate our algorithm on the tasks of tracking RFID tags from signal strength measurements, recovering the pose of rigid objects, deformable bodies, and articulated bodies from video sequences. For these tasks, this algorithm requires significantly fewer examples compared to fully-supervised regression algorithms or semi-supervised learning algorithms that do not take the dynamics of the output time series into account
Concurrent Segmentation and Localization for Tracking of Surgical Instruments
Real-time instrument tracking is a crucial requirement for various
computer-assisted interventions. In order to overcome problems such as specular
reflections and motion blur, we propose a novel method that takes advantage of
the interdependency between localization and segmentation of the surgical tool.
In particular, we reformulate the 2D instrument pose estimation as heatmap
regression and thereby enable a concurrent, robust and near real-time
regression of both tasks via deep learning. As demonstrated by our experimental
results, this modeling leads to a significantly improved performance than
directly regressing the tool position and allows our method to outperform the
state of the art on a Retinal Microsurgery benchmark and the MICCAI EndoVis
Challenge 2015.Comment: I. Laina and N. Rieke contributed equally to this work. Accepted to
MICCAI 201
A Comprehensive Performance Evaluation of Deformable Face Tracking "In-the-Wild"
Recently, technologies such as face detection, facial landmark localisation
and face recognition and verification have matured enough to provide effective
and efficient solutions for imagery captured under arbitrary conditions
(referred to as "in-the-wild"). This is partially attributed to the fact that
comprehensive "in-the-wild" benchmarks have been developed for face detection,
landmark localisation and recognition/verification. A very important technology
that has not been thoroughly evaluated yet is deformable face tracking
"in-the-wild". Until now, the performance has mainly been assessed
qualitatively by visually assessing the result of a deformable face tracking
technology on short videos. In this paper, we perform the first, to the best of
our knowledge, thorough evaluation of state-of-the-art deformable face tracking
pipelines using the recently introduced 300VW benchmark. We evaluate many
different architectures focusing mainly on the task of on-line deformable face
tracking. In particular, we compare the following general strategies: (a)
generic face detection plus generic facial landmark localisation, (b) generic
model free tracking plus generic facial landmark localisation, as well as (c)
hybrid approaches using state-of-the-art face detection, model free tracking
and facial landmark localisation technologies. Our evaluation reveals future
avenues for further research on the topic.Comment: E. Antonakos and P. Snape contributed equally and have joint second
authorshi
- …