Tracking-Reconstruction or Reconstruction-Tracking?
We developed two methods for tracking multiple objects using several camera views. The methods use the Multiple Hypothesis Tracking (MHT) framework to solve both the across-view data association problem (i.e., finding object correspondences across several views) and the across-time data association problem (i.e., the assignment of current object measurements to previously established object tracks). The "tracking-reconstruction method" establishes two-dimensional (2D) object tracks for each view and then reconstructs their three-dimensional (3D) motion trajectories. The "reconstruction-tracking method" assembles 2D object measurements from all views, reconstructs 3D object positions, and then matches these 3D positions to previously established 3D object tracks to compute 3D motion trajectories. For both methods, we propose techniques for pruning the number of association hypotheses and for gathering track fragments. We tested and compared the performance of our methods on thermal infrared video of bats using several performance measures. Our analysis of video sequences with different densities of flying bats reveals that the reconstruction-tracking method produces fewer track fragments than the tracking-reconstruction method but creates more false positive 3D tracks.
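The 3D reconstruction step in both methods amounts to triangulating an object's 3D position from its 2D measurements in multiple calibrated views. A minimal sketch of linear (DLT) triangulation from two views follows; the toy projection matrices and point are assumed data for illustration, not the authors' setup:

```python
import numpy as np

def triangulate(P1, P2, x1, x2):
    """Linear (DLT) triangulation of one 3D point from two views.

    P1, P2: 3x4 camera projection matrices.
    x1, x2: 2D pixel measurements (u, v) of the same object in each view.
    Returns the 3D point in inhomogeneous coordinates.
    """
    # Each view contributes two linear constraints on the homogeneous point.
    A = np.array([
        x1[0] * P1[2] - P1[0],
        x1[1] * P1[2] - P1[1],
        x2[0] * P2[2] - P2[0],
        x2[1] * P2[2] - P2[1],
    ])
    # The point is the right singular vector with the smallest singular value.
    _, _, Vt = np.linalg.svd(A)
    X = Vt[-1]
    return X[:3] / X[3]

# Two toy cameras: a canonical view and a camera shifted 1 unit along x.
P1 = np.hstack([np.eye(3), np.zeros((3, 1))])
P2 = np.hstack([np.eye(3), np.array([[-1.0], [0.0], [0.0]])])
X_true = np.array([0.5, 0.2, 4.0])
x1 = (P1 @ np.append(X_true, 1.0))[:2] / (P1 @ np.append(X_true, 1.0))[2]
x2 = (P2 @ np.append(X_true, 1.0))[:2] / (P2 @ np.append(X_true, 1.0))[2]
X_hat = triangulate(P1, P2, x1, x2)
```

With noise-free measurements the DLT solution recovers the point exactly; with real detections the SVD gives the least-squares compromise between the views.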
Detecting and tracking multiple interacting objects without class-specific models
We propose a framework for detecting and tracking multiple interacting objects from a single, static, uncalibrated camera. The number of objects is variable and unknown, and object-class-specific models are not available. We use background subtraction results as measurements for object detection and tracking. Given these constraints, the main challenge is to associate pixel measurements with (possibly interacting) object targets. We first track clusters of pixels, and note when they merge or split. We then build an inference graph, representing relations between the tracked clusters. Using this graph and a generic object model based on spatial connectedness and coherent motion, we label the tracked clusters as whole objects, fragments of objects, or groups of interacting objects. The outputs of our algorithm are entire tracks of objects, which may include corresponding tracks from groups of objects during interactions. Experimental results on multiple video sequences are shown.
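The merge/split bookkeeping that feeds the inference graph can be illustrated with a simple overlap-based association between pixel-cluster bounding boxes in consecutive frames. This is a simplified stand-in for the paper's method; the boxes and the overlap threshold are assumptions:

```python
def iou(a, b):
    """Intersection-over-union of two axis-aligned boxes (x1, y1, x2, y2)."""
    ix1, iy1 = max(a[0], b[0]), max(a[1], b[1])
    ix2, iy2 = min(a[2], b[2]), min(a[3], b[3])
    inter = max(0, ix2 - ix1) * max(0, iy2 - iy1)
    area_a = (a[2] - a[0]) * (a[3] - a[1])
    area_b = (b[2] - b[0]) * (b[3] - b[1])
    union = area_a + area_b - inter
    return inter / union if union else 0.0

def associate(prev, curr, thr=0.1):
    """Link previous and current clusters by box overlap and flag events.

    Returns (links, merges, splits): links maps each current cluster index
    to its overlapping previous clusters; a current cluster with more than
    one parent is a merge, and a previous cluster claimed by more than one
    current cluster is a split.
    """
    links = {j: [i for i, p in enumerate(prev) if iou(p, c) > thr]
             for j, c in enumerate(curr)}
    merges = [j for j, parents in links.items() if len(parents) > 1]
    children = {}
    for j, parents in links.items():
        for i in parents:
            children.setdefault(i, []).append(j)
    splits = [i for i, kids in children.items() if len(kids) > 1]
    return links, merges, splits

# Two blobs in the previous frame merge into one in the current frame.
prev = [(0, 0, 10, 10), (8, 0, 18, 10)]
curr = [(0, 0, 18, 10)]
links, merges, splits = associate(prev, curr)
```

Recording these events over time is what lets a later labeling pass decide whether a tracked cluster is a whole object, a fragment, or a group.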
Search Tracker: Human-derived object tracking in-the-wild through large-scale search and retrieval
Humans use context and scene knowledge to easily localize moving objects in
conditions of complex illumination changes, scene clutter and occlusions. In
this paper, we present a method to leverage human knowledge in the form of
annotated video libraries in a novel search and retrieval based setting to
track objects in unseen video sequences. For every video sequence, a document
that represents motion information is generated. Documents of the unseen video
are queried against the library at multiple scales to find videos with similar
motion characteristics. This provides us with coarse localization of objects in
the unseen video. We further adapt these retrieved object locations to the new
video using an efficient warping scheme. The proposed method is validated on
in-the-wild video surveillance datasets where we outperform state-of-the-art
appearance-based trackers. We also introduce a new challenging dataset with
complex object appearance changes.
Comment: Under review with the IEEE Transactions on Circuits and Systems for Video Technology
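The retrieval setting described above can be sketched as a bag-of-motion-words "document" matched against a library by cosine similarity. This is an illustrative simplification; the direction-bin vocabulary and the toy library are assumptions, not the paper's representation:

```python
import math
from collections import Counter

def motion_document(flows, bins=8):
    """Quantize (dx, dy) motion vectors into direction 'words' and
    return a bag-of-words histogram for the whole clip."""
    words = []
    for dx, dy in flows:
        ang = math.atan2(dy, dx) % (2 * math.pi)
        words.append(int(ang / (2 * math.pi) * bins) % bins)
    return Counter(words)

def cosine(a, b):
    """Cosine similarity between two sparse histograms."""
    keys = set(a) | set(b)
    dot = sum(a[k] * b[k] for k in keys)
    na = math.sqrt(sum(v * v for v in a.values()))
    nb = math.sqrt(sum(v * v for v in b.values()))
    return dot / (na * nb) if na and nb else 0.0

def retrieve(query_doc, library):
    """Return the library key whose motion document best matches the query."""
    return max(library, key=lambda k: cosine(query_doc, library[k]))

# Tiny annotated library: one clip moving right, one moving up (toy data).
library = {
    "rightward": motion_document([(1, 0)] * 10),
    "upward": motion_document([(0, 1)] * 10),
}
query = motion_document([(2, 0.1)] * 5)  # unseen clip, mostly rightward
best = retrieve(query, library)
```

The retrieved clip's annotated object locations would then be warped onto the unseen video, which is the step the paper handles with its efficient warping scheme.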
Accurate tracking of objects using level sets
Our current work presents an approach to tackle the challenging task of tracking objects in Internet videos taken from large web repositories such as YouTube. Such videos are more often than not captured by users with their personal hand-held cameras and cellphones, and hence suffer from problems such as poor quality, camera jitter, and unconstrained lighting and environmental settings. It has also been observed that events recorded in such videos usually contain objects moving in an unconstrained fashion. Tracking objects in Internet videos is therefore a very challenging task in the field of computer vision, since there is no a priori information about the types of objects we might encounter, their velocities while in motion, or intrinsic camera parameters to estimate the location of the object in each frame. In this setting it is clearly not possible to model objects as single homogeneous distributions in feature space. The feature space itself cannot be fixed, since different objects might be discriminable in different sub-spaces. Keeping these challenges in mind, in the proposed technique each object is divided into multiple fragments or regions, and each fragment is represented by a Gaussian Mixture Model (GMM) in a joint feature-spatial space. Each fragment is automatically selected from the image data by adapting to image statistics using a segmentation technique. We introduce the concept of a strength map, which represents a probability distribution of the image statistics and is used to detect the object. We extend our goal from tracking objects to tracking them with accurate boundaries, thereby making the task more challenging. We solve this problem by modeling the object in a level sets framework, which helps in preserving accurate object boundaries as well as in modeling the target object and background. These extracted object boundaries are learned dynamically over time, enabling object tracking even during occlusion.
Our proposed algorithm performs significantly better than existing object modeling techniques; experimental results are shown in support of this claim. Apart from tracking, the present algorithm can also be applied to other scenarios, one such application being contour-based object detection. The idea of the strength map was also successfully applied to track objects such as vessels and vehicles on a wide range of videos, as part of a summer internship program.
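The strength-map idea, a per-pixel probability derived from fragment models in a joint feature-spatial space, can be sketched with a single diagonal-covariance Gaussian fragment over (x, y, intensity). The image and fragment parameters below are toy data; the actual method fits a GMM per automatically segmented fragment:

```python
import math

def gauss(v, mean, var):
    """Diagonal-covariance Gaussian density in the joint (x, y, intensity) space."""
    d = len(v)
    norm = math.prod(2 * math.pi * var[i] for i in range(d)) ** -0.5
    expo = -0.5 * sum((v[i] - mean[i]) ** 2 / var[i] for i in range(d))
    return norm * math.exp(expo)

def strength_map(image, fragments):
    """Per-pixel strength: the mixture density over the fragment Gaussians,
    normalized so the strongest pixel has value 1."""
    h, w = len(image), len(image[0])
    smap = [[0.0] * w for _ in range(h)]
    for y in range(h):
        for x in range(w):
            v = (x, y, image[y][x])
            smap[y][x] = sum(wgt * gauss(v, m, s) for wgt, m, s in fragments)
    peak = max(max(row) for row in smap) or 1.0
    return [[p / peak for p in row] for row in smap]

# A 5x5 grayscale image with a bright 'object' patch at the center (toy data).
image = [[0.9 if 1 <= x <= 3 and 1 <= y <= 3 else 0.1 for x in range(5)]
         for y in range(5)]
# One fragment: (weight, mean over (x, y, intensity), diagonal variances).
fragments = [(1.0, (2.0, 2.0, 0.9), (1.0, 1.0, 0.05))]
smap = strength_map(image, fragments)
```

Thresholding such a map localizes the object; the level-set evolution would then refine that coarse region into an accurate boundary.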
Real-time detection and tracking of multiple objects with partial decoding in H.264/AVC bitstream domain
In this paper, we show that we can apply probabilistic spatiotemporal macroblock filtering (PSMF) and partial decoding processes to effectively detect and track multiple objects in real time in H.264/AVC bitstreams with stationary background. Our contribution is that our method can not only achieve fast processing times but also handle multiple moving objects that are articulated, change in size, or have internally monotonous color, even though they contain a chaotic set of non-homogeneous motion vectors. In addition, our partial decoding process for H.264/AVC bitstreams makes it possible to improve the accuracy of object trajectories and overcome long occlusions by using extracted color information.
Comment: SPIE Real-Time Image and Video Processing Conference 200
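A crude illustration of filtering macroblock motion vectors spatially and temporally follows. This is a deterministic stand-in for the probabilistic PSMF; the grid, magnitude threshold, and neighbour rule are assumptions made for the sketch:

```python
def moving_mask(mvs, thr=1.0):
    """Binary mask of macroblocks whose motion-vector magnitude exceeds thr."""
    return [[(dx * dx + dy * dy) ** 0.5 > thr for dx, dy in row] for row in mvs]

def spatiotemporal_filter(prev_mvs, curr_mvs, thr=1.0):
    """Keep a macroblock as 'object' only if it moves now, also moved in the
    previous frame (temporal support), and has at least one moving
    4-neighbour (spatial support)."""
    prev_m, curr_m = moving_mask(prev_mvs, thr), moving_mask(curr_mvs, thr)
    h, w = len(curr_m), len(curr_m[0])
    out = [[False] * w for _ in range(h)]
    for y in range(h):
        for x in range(w):
            if not (curr_m[y][x] and prev_m[y][x]):
                continue
            neigh = [(y - 1, x), (y + 1, x), (y, x - 1), (y, x + 1)]
            out[y][x] = any(0 <= ny < h and 0 <= nx < w and curr_m[ny][nx]
                            for ny, nx in neigh)
    return out

# 4x4 macroblock grid: a 2x2 moving object plus one isolated noisy vector.
still, move = (0.0, 0.0), (4.0, 0.0)
curr = [[still] * 4 for _ in range(4)]
prev = [[still] * 4 for _ in range(4)]
for y in (1, 2):
    for x in (1, 2):
        curr[y][x] = prev[y][x] = move
curr[0][3] = move  # isolated noise: no temporal or spatial support
mask = spatiotemporal_filter(prev, curr)
```

Working directly on the motion vectors parsed from the bitstream is what makes this kind of filtering cheap enough for real time; full pixel decoding is only needed where color is required.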
Bags of Affine Subspaces for Robust Object Tracking
We propose an adaptive tracking algorithm where the object is modelled as a
continuously updated bag of affine subspaces, with each subspace constructed
from the object's appearance over several consecutive frames. In contrast to
linear subspaces, affine subspaces explicitly model the origin of subspaces.
Furthermore, instead of using a brittle point-to-subspace distance during the
search for the object in a new frame, we propose to use a subspace-to-subspace
distance by representing candidate image areas also as affine subspaces.
Distances between subspaces are then obtained by exploiting the non-Euclidean
geometry of Grassmann manifolds. Experiments on challenging videos (containing
object occlusions, deformations, as well as variations in pose and
illumination) indicate that the proposed method achieves higher tracking
accuracy than several recent discriminative trackers.
Comment: in International Conference on Digital Image Computing: Techniques and Applications, 201
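The subspace-to-subspace distance exploiting Grassmann geometry can be sketched for linear subspaces via principal angles, computed from the SVD of the product of the two orthonormal bases (the paper additionally models the subspace origin to obtain affine subspaces; the bases below are toy data):

```python
import numpy as np

def grassmann_dist(U1, U2):
    """Geodesic distance between two subspaces with orthonormal basis
    columns U1, U2: the norm of the principal angles, obtained from the
    singular values of U1^T U2."""
    s = np.linalg.svd(U1.T @ U2, compute_uv=False)
    angles = np.arccos(np.clip(s, -1.0, 1.0))
    return float(np.linalg.norm(angles))

def orthobasis(A):
    """Orthonormal basis for the column space of A (thin QR)."""
    Q, _ = np.linalg.qr(A)
    return Q

# Two 2D subspaces of R^3: the xy-plane and the xz-plane share one
# direction (x), so exactly one principal angle is pi/2.
U1 = orthobasis(np.array([[1.0, 0.0], [0.0, 1.0], [0.0, 0.0]]))
U2 = orthobasis(np.array([[1.0, 0.0], [0.0, 0.0], [0.0, 1.0]]))
d = grassmann_dist(U1, U2)
```

Because the distance depends only on the spans (not the particular bases), it is stable under the basis updates that occur as the bag of subspaces is refreshed frame to frame.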