9 research outputs found

    Visual Tracking Algorithms using Different Object Representation Schemes

    Get PDF
    Visual tracking, being one of the fundamental, most important and challenging areas in computer vision, has attracted much attention in the research community during the past decade due to its broad range of real-life applications. Even after three decades of research, it still remains a challenging problem in view of the complexities involved in the target searching due to intrinsic and extrinsic appearance variations of the object. The existing trackers fail to track the object when there are considerable amount of object appearance variations and when the object undergoes severe occlusion, scale change, out-of-plane rotation, motion blur, fast motion, in-plane rotation, out-of-view and illumination variation either individually or simultaneously. In order to have a reliable and improved tracking performance, the appearance variations should be handled carefully such that the appearance model should adapt to the intrinsic appearance variations and be robust enough for extrinsic appearance variations. The objective of this thesis is to develop visual object tracking algorithms by addressing the deficiencies of the existing algorithms to enhance the tracking performance by investigating the use of different object representation schemes to model the object appearance and then devising mechanisms to update the observation models. A tracking algorithm based on the global appearance model using robust coding and its collaboration with a local model is proposed. The global PCA subspace is used to model the global appearance of the object, and the optimum PCA basis coefficients and the global weight matrix are estimated by developing an iteratively reweighted robust coding (IRRC) technique. This global model is collaborated with the local model to exploit their individual merits. Global and local robust coding distances are introduced to find the candidate sample having similar appearance as that of the reconstructed sample from the subspace, and these distances are used to define the observation likelihood. A robust occlusion map generation scheme and a mechanism to update both the global and local observation models are developed. Quantitative and qualitative performance evaluations on OTB-50 and VOT2016, two popular benchmark datasets, demonstrate that the proposed algorithm with histogram of oriented gradient (HOG) features generally performs better than the state-of-the-art methods considered do. In spite of its good performance, there is a need to improve the tracking performance in some of the challenging attributes of OTB-50 and VOT2016. A second tracking algorithm is developed to provide an improved performance in situations for the above mentioned challenging attributes. The algorithms is designed based on a structural local 2DDCT sparse appearance model and an occlusion handling mechanism. In a structural local 2DDCT sparse appearance model, the energy compaction property of the transform is exploited to reduce the size of the dictionary as well as that of the candidate samples in the object representation so that the computational cost of the l_1-minimization used could be reduced. This strategy is in contrast to the existing models that use raw pixels. A holistic image reconstruction procedure is presented from the overlapped local patches that are obtained from the dictionary and the sparse codes, and then the reconstructed holistic image is used for robust occlusion detection and occlusion map generation. The occlusion map thus obtained is used for developing a novel observation model update mechanism to avoid the model degradation. A patch occlusion ratio is employed in the calculation of the confidence score to improve the tracking performance. Quantitative and qualitative performance evaluations on the two above mentioned benchmark datasets demonstrate that this second proposed tracking algorithm generally performs better than several state-of-the-art methods and the first proposed tracking method do. Despite the improved performance of this second proposed tracking algorithm, there are still some challenging attributes of OTB-50 and of VOT2016 for which the performance needs to be improved. Finally, a third tracking algorithm is proposed by developing a scheme for collaboration between the discriminative and generative appearance models. The discriminative model is explored to estimate the position of the target and a new generative model is used to find the remaining affine parameters of the target. In the generative model, robust coding is extended to two dimensions and employed in the bilateral two dimensional PCA (2DPCA) reconstruction procedure to handle the non-Gaussian or non-Laplacian residuals by developing an IRRC technique. A 2D robust coding distance is introduced to differentiate the candidate sample from the one reconstructed from the subspace and used to compute the observation likelihood in the generative model. A method of generating a robust occlusion map from the weights obtained during the IRRC technique and a novel update mechanism of the observation model for both the kernelized correlation filters and the bilateral 2DPCA subspace are developed. Quantitative and qualitative performance evaluations on the two datasets demonstrate that this algorithm with HOG features generally outperforms the state-of-the-art methods and the other two proposed algorithms for most of the challenging attributes

    Visual Tracking Based on Correlation Filter and Robust Coding in Bilateral 2DPCA Subspace

    Get PDF
    The success of correlation filters in visual tracking has attracted much attention in computer vision due to their high efficiency and performance. However, they are not equipped with a mechanism to cope with challenging situations like scale variations, out-of-view, and camera motion. With the aim of dealing with such situations, a collaborative scheme of tracking based on the discriminative and generative models is proposed. Instead of finding all the affine motion parameters of the target by the combined likelihood of these models, the correlation filters, based on discriminative model, are used to find the position of the target, whereas 2D robust coding in a bilateral 2DPCA subspace, based on generative model, is used to find the other affine motion parameters of the target. Further, a 2D robust coding distance is proposed to differentiate the candidate samples from the subspace and used to compute the observation likelihood in the generative model. In addition, it is proposed to generate a robust occlusion map from the weights obtained during the residual minimization and a novel update mechanism of the appearance model for both the correlation filters and bilateral 2DPCA subspace is proposed. The proposed method is evaluated on the challenging image sequences available in the OTB-50, VOT2016, and UAV20L benchmark datasets, and its performance is compared with that of the state-of-the-art tracking algorithms. In contrast to OTB-50 and VOT2016, the dataset UAV20L contains long duration sequences with additional challenges introduced by both the camera motion and the view points in three dimensions. Quantitative and qualitative performance evaluations on three benchmark datasets demonstrate that the proposed tracking algorithm outperforms the state-of-the-art methods

    Visual tracking using structural local DCT sparse appearance model with occlusion detection

    Get PDF
    In this paper, a structural local DCT sparse appearance model with occlusion detection is proposed for visual tracking in a particle filter framework. The energy compaction property of the 2D-DCT is exploited to reduce the size of the dictionary as well as that of the candidate samples so that the computational cost of l1-minimization can be lowered. Further, a holistic image reconstruction procedure is proposed for robust occlusion detection and used for appearance model update, thus avoiding the degradation of the appearance model in the presence of occlusion/outliers. Also, a patch occlusion ratio is introduced in the confidence score computation to enhance the tracking performance. Quantitative and qualitative performance evaluations on two popular benchmark datasets demonstrate that the proposed tracking algorithm generally outperforms several state-of-the-art methods

    Biometric face recognition using multilinear projection and artificial intelligence

    Get PDF
    PhD ThesisNumerous problems of automatic facial recognition in the linear and multilinear subspace learning have been addressed; nevertheless, many difficulties remain. This work focuses on two key problems for automatic facial recognition and feature extraction: object representation and high dimensionality. To address these problems, a bidirectional two-dimensional neighborhood preserving projection (B2DNPP) approach for human facial recognition has been developed. Compared with 2DNPP, the proposed method operates on 2-D facial images and performs reductions on the directions of both rows and columns of images. Furthermore, it has the ability to reveal variations between these directions. To further improve the performance of the B2DNPP method, a new B2DNPP based on the curvelet decomposition of human facial images is introduced. The curvelet multi- resolution tool enhances the edges representation and other singularities along curves, and thus improves directional features. In this method, an extreme learning machine (ELM) classifier is used which significantly improves classification rate. The proposed C-B2DNPP method decreases error rate from 5.9% to 3.5%, from 3.7% to 2.0% and from 19.7% to 14.2% using ORL, AR, and FERET databases compared with 2DNPP. Therefore, it achieves decreases in error rate more than 40%, 45%, and 27% respectively with the ORL, AR, and FERET databases. Facial images have particular natural structures in the form of two-, three-, or even higher-order tensors. Therefore, a novel method of supervised and unsupervised multilinear neighborhood preserving projection (MNPP) is proposed for face recognition. This allows the natural representation of multidimensional images 2-D, 3-D or higher-order tensors and extracts useful information directly from tensotial data rather than from matrices or vectors. As opposed to a B2DNPP which derives only two subspaces, in the MNPP method multiple interrelated subspaces are obtained over different tensor directions, so that the subspaces are learned iteratively by unfolding the tensor along the different directions. The performance of the MNPP has performed in terms of the two modes of facial recognition biometrics systems of identification and verification. The proposed supervised MNPP method achieved decrease over 50.8%, 75.6%, and 44.6% in error rate using ORL, AR, and FERET databases respectively, compared with 2DNPP. Therefore, the results demonstrate that the MNPP approach obtains the best overall performance in various learning scenarios

    Segmentation of images by color features: a survey

    Get PDF
    En este articulo se hace la revisión del estado del arte sobre la segmentación de imagenes de colorImage segmentation is an important stage for object recognition. Many methods have been proposed in the last few years for grayscale and color images. In this paper, we present a deep review of the state of the art on color image segmentation methods; through this paper, we explain the techniques based on edge detection, thresholding, histogram-thresholding, region, feature clustering and neural networks. Because color spaces play a key role in the methods reviewed, we also explain in detail the most commonly color spaces to represent and process colors. In addition, we present some important applications that use the methods of image segmentation reviewed. Finally, a set of metrics frequently used to evaluate quantitatively the segmented images is shown

    Advances in Image Processing, Analysis and Recognition Technology

    Get PDF
    For many decades, researchers have been trying to make computers’ analysis of images as effective as the system of human vision is. For this purpose, many algorithms and systems have previously been created. The whole process covers various stages, including image processing, representation and recognition. The results of this work can be applied to many computer-assisted areas of everyday life. They improve particular activities and provide handy tools, which are sometimes only for entertainment, but quite often, they significantly increase our safety. In fact, the practical implementation of image processing algorithms is particularly wide. Moreover, the rapid growth of computational complexity and computer efficiency has allowed for the development of more sophisticated and effective algorithms and tools. Although significant progress has been made so far, many issues still remain, resulting in the need for the development of novel approaches
    corecore