196 research outputs found
Recommended from our members
A Visual Tracking Study and A Proposal of Modifications
On-line visual tracking of a specified target in motion throughout frames of video clips faces challenges in robust identification of the target in the current frame based on the past frames. Three approaches for tracking the target image patch are described and compared. These approaches utilize particle filtering and principal component analysis (PCA) to identify the most likely location of the target in the current frame and a low dimensional subspace representation of the patches of images to be kept as the templates in the dictionary for the identification. By using a combination of methods and compare the result of each, a new model based is proposed. The goal is to achieve a more robust and accurate tracking of a target throughout the video and continue updating the identification templates to adapt the target changes, such as apparences in lighting, angle, scale and occlusions. The challenges in tracking are to introduction of the "right" templates into the identification templates in the dictionary and identify the most accurate particle image patch while tracking the target with the right tracking patch scaling. The first approach considered and on which the structure of the visual tracker is based is the "Incremental Learning for Robust Visual Tracking" by D. Ross et al., which is a computationally fast tracker that utilizes a method of low dimensional subspace for the identification template dictionary and incremental PCA for its tracking. The tracker has a simple rule in accepting the patches of images to be in the identification template dictionary after the image patch has gone through a singular value decomposition (SVD), where it eliminates singular values are smaller than of the sum of squared sinuglar values and the corresponding bases are also eliminated. This elimination scheme has very limited robustness in tracking, therefore, more selective processes in accepting identification templates in the dictionary are explored and introduced on top of the existing method in comparison and to address the challenges in on-line video tracking. The second approach is the "Least Soft-Threshold Squares Tracking" proposed by D. Wang et al. solves the least soft-threshold squares distance problem to identify the distances of the particles to the templates in the dictionary, which greatly improves the tracking accuracy. This method is also computationally cheap in comparison to the first approach, and its accuracy is also better than the first approach, but it would sometimes fail to track in some applications. Finally, the third approach reviewed is the "Robust Visual Tracking and Vehicle Classification via Sparse Representation" by X. Mei et al. is to weight each particles when selecting the most likely target patch so the best patch has a highest weighted probability which ensures it being selected and introduced to the template dictionary. This approach performs well in comparison to the first and the second approaches in tracking accuracy and robustness, but this approach is extremely computationally expensive. Three new components are proposed in an effort to mitigate some of the limitations that the three approaches exhibit. One such component is to simply reject the image patches that exhibit too great of difference to the current template dictionary, which resulted in improved tracking robustness. This method is computationally cheap and easy to implement. Another component introduced is a second set of dictionary that is composed of admitted image patches, which is used for tracking when the image patches appears to be too dissimilar to the dictionary with low dimensional representation. It is expected that with more well defined and stronger features, it forces the tracking to identify the target. Finally, the third component introduced is the to prevent shrinkage of the target boundary box by weighting the particles drawn with the ratio of area change so that more weight is placed on particles with less arial change. This increases the likelihood of recovering the target again if tracking loses the target, and instead of shrinking the boundary box, the tracking is biased to staying with the image patch of the same size. The resulting performance of the proposed tracking scheme has not been noticeably improved, part of the reason is because the metrics available to identify a noisy image patch from the good image patches are not always indicative of the noisy-good image patch divide
Non-Rigid Puzzles
Shape correspondence is a fundamental problem in computer graphics and vision, with applications in various problems including animation, texture mapping, robotic vision, medical imaging, archaeology and many more. In settings where the shapes are allowed to undergo non-rigid deformations and only partial views are available, the problem becomes very challenging. To this end, we present a non-rigid multi-part shape matching algorithm. We assume to be given a reference shape and its multiple parts undergoing a non-rigid deformation. Each of these query parts can be additionally contaminated by clutter, may overlap with other parts, and there might be missing parts or redundant ones. Our method simultaneously solves for the segmentation of the reference model, and for a dense correspondence to (subsets of) the parts. Experimental results on synthetic as well as real scans demonstrate the effectiveness of our method in dealing with this challenging matching scenario
Comparison of Infrared and Visible Imagery for Object Tracking: Toward Trackers with Superior IR Performance
The subject of this paper is the visual object tracking in infrared (IR) videos. Our contribution is twofold. First, the performance behaviour of the state-of-the-art trackers is investigated via a comparative study using IR-visible band video conjugates, i.e., video pairs captured observing the same scene simultaneously, to identify the IR specific challenges. Second, we propose a novel ensemble based tracking method that is tuned to IR data. The proposed algorithm sequentially constructs and maintains a dynamical ensemble of simple correlators and produces tracking decisions by switching among the ensemble correlators depending on the target appearance in a computationally highly efficient manner We empirically show that our algorithm significantly outperforms the state-of-the-art trackers in our extensive set of experiments with IR imagery
Texture-based Tracking in mm-wave Images
Current tracking methods rely on color-, intensity-, and edge-based features to compute a description of an image region. These approaches are not well-suited for low-quality images such as mm-wave data from full-body scanners. In order to perform tracking in such challenging grayscale images, we propose several enhancements and extensions to the Visual Tracking Decomposition (VTD) by Kwon and Lee. A novel region descriptor, which uses texture-based features, is presented and integrated into VTD. We improve VTD by adding a sophisticated weighting scheme for observations, better motion models, and a more realistic way for sampling and interaction. Our method not only outperforms VTD on mm-wave data but also has comparable results on normal-quality images. We are confident that our region descriptor can easily be extended to other kinds of features and applications such that tracking can be performed in a large variety of image data, especially low-resolution, low-illumination and noisy images
Planar Object Tracking in the Wild: A Benchmark
Planar object tracking is an actively studied problem in vision-based robotic
applications. While several benchmarks have been constructed for evaluating
state-of-the-art algorithms, there is a lack of video sequences captured in the
wild rather than in constrained laboratory environment. In this paper, we
present a carefully designed planar object tracking benchmark containing 210
videos of 30 planar objects sampled in the natural environment. In particular,
for each object, we shoot seven videos involving various challenging factors,
namely scale change, rotation, perspective distortion, motion blur, occlusion,
out-of-view, and unconstrained. The ground truth is carefully annotated
semi-manually to ensure the quality. Moreover, eleven state-of-the-art
algorithms are evaluated on the benchmark using two evaluation metrics, with
detailed analysis provided for the evaluation results. We expect the proposed
benchmark to benefit future studies on planar object tracking.Comment: Accepted by ICRA 201
Discriminative tracking using tensor pooling
How to effectively organize local descriptors to build a global representation has a critical impact on the performance of vision tasks. Recently, local sparse representation has been successfully applied to visual tracking, owing to its discriminative nature and robustness against local noise and partial occlusions. Local sparse codes computed with a template actually form a three-order tensor according to their original layout, although most existing pooling operators convert the codes to a vector by concatenating or computing statistics on them. We argue that, compared to pooling vectors, the tensor form could deliver more intrinsic structural information for the target appearance, and can also avoid high dimensionality learning problems suffered in concatenation-based pooling methods. Therefore, in this paper, we propose to represent target templates and candidates directly with sparse coding tensors, and build the appearance model by incrementally learning on these tensors. We propose a discriminative framework to further improve robustness of our method against drifting and environmental noise. Experiments on a recent comprehensive benchmark indicate that our method performs better than state-of-the-art trackers
Learning quadrangulated patches for 3D shape parameterization and completion
We propose a novel 3D shape parameterization by surface patches, that are
oriented by 3D mesh quadrangulation of the shape. By encoding 3D surface detail
on local patches, we learn a patch dictionary that identifies principal surface
features of the shape. Unlike previous methods, we are able to encode surface
patches of variable size as determined by the user. We propose novel methods
for dictionary learning and patch reconstruction based on the query of a noisy
input patch with holes. We evaluate the patch dictionary towards various
applications in 3D shape inpainting, denoising and compression. Our method is
able to predict missing vertices and inpaint moderately sized holes. We
demonstrate a complete pipeline for reconstructing the 3D mesh from the patch
encoding. We validate our shape parameterization and reconstruction methods on
both synthetic shapes and real world scans. We show that our patch dictionary
performs successful shape completion of complicated surface textures.Comment: To be presented at International Conference on 3D Vision 2017, 201
- …