1,788 research outputs found

    Visual Tracking Algorithms using Different Object Representation Schemes

    Get PDF
    Visual tracking, being one of the fundamental, most important and challenging areas in computer vision, has attracted much attention in the research community during the past decade due to its broad range of real-life applications. Even after three decades of research, it still remains a challenging problem in view of the complexities involved in the target searching due to intrinsic and extrinsic appearance variations of the object. The existing trackers fail to track the object when there are considerable amount of object appearance variations and when the object undergoes severe occlusion, scale change, out-of-plane rotation, motion blur, fast motion, in-plane rotation, out-of-view and illumination variation either individually or simultaneously. In order to have a reliable and improved tracking performance, the appearance variations should be handled carefully such that the appearance model should adapt to the intrinsic appearance variations and be robust enough for extrinsic appearance variations. The objective of this thesis is to develop visual object tracking algorithms by addressing the deficiencies of the existing algorithms to enhance the tracking performance by investigating the use of different object representation schemes to model the object appearance and then devising mechanisms to update the observation models. A tracking algorithm based on the global appearance model using robust coding and its collaboration with a local model is proposed. The global PCA subspace is used to model the global appearance of the object, and the optimum PCA basis coefficients and the global weight matrix are estimated by developing an iteratively reweighted robust coding (IRRC) technique. This global model is collaborated with the local model to exploit their individual merits. Global and local robust coding distances are introduced to find the candidate sample having similar appearance as that of the reconstructed sample from the subspace, and these distances are used to define the observation likelihood. A robust occlusion map generation scheme and a mechanism to update both the global and local observation models are developed. Quantitative and qualitative performance evaluations on OTB-50 and VOT2016, two popular benchmark datasets, demonstrate that the proposed algorithm with histogram of oriented gradient (HOG) features generally performs better than the state-of-the-art methods considered do. In spite of its good performance, there is a need to improve the tracking performance in some of the challenging attributes of OTB-50 and VOT2016. A second tracking algorithm is developed to provide an improved performance in situations for the above mentioned challenging attributes. The algorithms is designed based on a structural local 2DDCT sparse appearance model and an occlusion handling mechanism. In a structural local 2DDCT sparse appearance model, the energy compaction property of the transform is exploited to reduce the size of the dictionary as well as that of the candidate samples in the object representation so that the computational cost of the l_1-minimization used could be reduced. This strategy is in contrast to the existing models that use raw pixels. A holistic image reconstruction procedure is presented from the overlapped local patches that are obtained from the dictionary and the sparse codes, and then the reconstructed holistic image is used for robust occlusion detection and occlusion map generation. The occlusion map thus obtained is used for developing a novel observation model update mechanism to avoid the model degradation. A patch occlusion ratio is employed in the calculation of the confidence score to improve the tracking performance. Quantitative and qualitative performance evaluations on the two above mentioned benchmark datasets demonstrate that this second proposed tracking algorithm generally performs better than several state-of-the-art methods and the first proposed tracking method do. Despite the improved performance of this second proposed tracking algorithm, there are still some challenging attributes of OTB-50 and of VOT2016 for which the performance needs to be improved. Finally, a third tracking algorithm is proposed by developing a scheme for collaboration between the discriminative and generative appearance models. The discriminative model is explored to estimate the position of the target and a new generative model is used to find the remaining affine parameters of the target. In the generative model, robust coding is extended to two dimensions and employed in the bilateral two dimensional PCA (2DPCA) reconstruction procedure to handle the non-Gaussian or non-Laplacian residuals by developing an IRRC technique. A 2D robust coding distance is introduced to differentiate the candidate sample from the one reconstructed from the subspace and used to compute the observation likelihood in the generative model. A method of generating a robust occlusion map from the weights obtained during the IRRC technique and a novel update mechanism of the observation model for both the kernelized correlation filters and the bilateral 2DPCA subspace are developed. Quantitative and qualitative performance evaluations on the two datasets demonstrate that this algorithm with HOG features generally outperforms the state-of-the-art methods and the other two proposed algorithms for most of the challenging attributes

    Recognising facial expressions in video sequences

    Full text link
    We introduce a system that processes a sequence of images of a front-facing human face and recognises a set of facial expressions. We use an efficient appearance-based face tracker to locate the face in the image sequence and estimate the deformation of its non-rigid components. The tracker works in real-time. It is robust to strong illumination changes and factors out changes in appearance caused by illumination from changes due to face deformation. We adopt a model-based approach for facial expression recognition. In our model, an image of a face is represented by a point in a deformation space. The variability of the classes of images associated to facial expressions are represented by a set of samples which model a low-dimensional manifold in the space of deformations. We introduce a probabilistic procedure based on a nearest-neighbour approach to combine the information provided by the incoming image sequence with the prior information stored in the expression manifold in order to compute a posterior probability associated to a facial expression. In the experiments conducted we show that this system is able to work in an unconstrained environment with strong changes in illumination and face location. It achieves an 89\% recognition rate in a set of 333 sequences from the Cohn-Kanade data base

    Towards an autonomous vision-based unmanned aerial system againstwildlife poachers

    Get PDF
    Poaching is an illegal activity that remains out of control in many countries. Based on the 2014 report of the United Nations and Interpol, the illegal trade of global wildlife and natural resources amounts to nearly $213 billion every year, which is even helping to fund armed conflicts. Poaching activities around the world are further pushing many animal species on the brink of extinction. Unfortunately, the traditional methods to fight against poachers are not enough, hence the new demands for more efficient approaches. In this context, the use of new technologies on sensors and algorithms, as well as aerial platforms is crucial to face the high increase of poaching activities in the last few years. Our work is focused on the use of vision sensors on UAVs for the detection and tracking of animals and poachers, as well as the use of such sensors to control quadrotors during autonomous vehicle following and autonomous landing.Peer Reviewe
    corecore