11,137 research outputs found
Learning Behavioural Context
The original publication is available at www.springerlink.co
Fusion of Head and Full-Body Detectors for Multi-Object Tracking
In order to track all persons in a scene, the tracking-by-detection paradigm
has proven to be a very effective approach. Yet, relying solely on a single
detector is also a major limitation, as useful image information might be
ignored. Consequently, this work demonstrates how to fuse two detectors into a
tracking system. To obtain the trajectories, we propose to formulate tracking
as a weighted graph labeling problem, resulting in a binary quadratic program.
As such problems are NP-hard, the solution can only be approximated. Based on
the Frank-Wolfe algorithm, we present a new solver that is crucial to handle
such difficult problems. Evaluation on pedestrian tracking is provided for
multiple scenarios, showing superior results over single detector tracking and
standard QP-solvers. Finally, our tracker ranks 2nd on the MOT16 benchmark and
1st on the new MOT17 benchmark, outperforming over 90 trackers.Comment: 10 pages, 4 figures; Winner of the MOT17 challenge; CVPRW 201
Multiresolution hierarchy co-clustering for semantic segmentation in sequences with small variations
This paper presents a co-clustering technique that, given a collection of
images and their hierarchies, clusters nodes from these hierarchies to obtain a
coherent multiresolution representation of the image collection. We formalize
the co-clustering as a Quadratic Semi-Assignment Problem and solve it with a
linear programming relaxation approach that makes effective use of information
from hierarchies. Initially, we address the problem of generating an optimal,
coherent partition per image and, afterwards, we extend this method to a
multiresolution framework. Finally, we particularize this framework to an
iterative multiresolution video segmentation algorithm in sequences with small
variations. We evaluate the algorithm on the Video Occlusion/Object Boundary
Detection Dataset, showing that it produces state-of-the-art results in these
scenarios.Comment: International Conference on Computer Vision (ICCV) 201
Accurate video object tracking using a region-based particle filter
Usually, in particle filters applied to video tracking, a simple geometrical shape, typically an ellipse, is used in order to bound the object being tracked. Although it is a good tracker, it tends to a bad object representation, as most of the world objects are not simple geometrical shapes. A better way to represent the object is by using a region-based approach, such as the Region Based Particle Filter (RBPF). This method exploits a hierarchical region based representation associated with images to tackle both problems at the same time: tracking and video object segmentation. By means of RBPF the object segmentation is resolved with high accuracy, but new problems arise. The object representation is now based on image partitions instead of pixels. This means that the amount of possible combinations has now decreased, which is computationally good, but an error on the regions taken for the object representation leads to a higher estimation error than methods working at pixel level. On the other hand, if the level of regions detail in the partition is high, the estimation of the object turns to be very noisy, making it hard to accurately propagate the object segmentation. In this thesis we present new tools to the existing RBPF. These tools are focused on increasing the RBPF performance by means of guiding the particles towards a good solution while maintaining a particle filter approach. The concept of hierarchical flow is presented and exploited, a Bayesian estimation is used in order to assign probabilities of being object or background to each region, and the reduction, in an intelligent way, of the solution space , to increase the RBPF robustness while reducing computational effort. Also changes on the already proposed co-clustering in the RBPF approach are proposed. Finally, we present results on the recently presented DAVIS database. This database comprises 50 High Definition video sequences representing several challenging situations. By using this dataset, we compare the RBPF with other state-ofthe- art methods
Action Recognition in Videos: from Motion Capture Labs to the Web
This paper presents a survey of human action recognition approaches based on
visual data recorded from a single video camera. We propose an organizing
framework which puts in evidence the evolution of the area, with techniques
moving from heavily constrained motion capture scenarios towards more
challenging, realistic, "in the wild" videos. The proposed organization is
based on the representation used as input for the recognition task, emphasizing
the hypothesis assumed and thus, the constraints imposed on the type of video
that each technique is able to address. Expliciting the hypothesis and
constraints makes the framework particularly useful to select a method, given
an application. Another advantage of the proposed organization is that it
allows categorizing newest approaches seamlessly with traditional ones, while
providing an insightful perspective of the evolution of the action recognition
task up to now. That perspective is the basis for the discussion in the end of
the paper, where we also present the main open issues in the area.Comment: Preprint submitted to CVIU, survey paper, 46 pages, 2 figures, 4
table
- âŠ