13 research outputs found

    Incremental Training of a Detector Using Online Sparse Eigen-decomposition

    Full text link
    The ability to efficiently and accurately detect objects plays a very crucial role for many computer vision tasks. Recently, offline object detectors have shown a tremendous success. However, one major drawback of offline techniques is that a complete set of training data has to be collected beforehand. In addition, once learned, an offline detector can not make use of newly arriving data. To alleviate these drawbacks, online learning has been adopted with the following objectives: (1) the technique should be computationally and storage efficient; (2) the updated classifier must maintain its high classification accuracy. In this paper, we propose an effective and efficient framework for learning an adaptive online greedy sparse linear discriminant analysis (GSLDA) model. Unlike many existing online boosting detectors, which usually apply exponential or logistic loss, our online algorithm makes use of LDA's learning criterion that not only aims to maximize the class-separation criterion but also incorporates the asymmetrical property of training data distributions. We provide a better alternative for online boosting algorithms in the context of training a visual object detector. We demonstrate the robustness and efficiency of our methods on handwriting digit and face data sets. Our results confirm that object detection tasks benefit significantly when trained in an online manner.Comment: 14 page

    Visual tracking with spatio-temporal Dempster-Shafer information fusion

    Get PDF
    A key problem in visual tracking is how to effectively combine spatio-temporal visual information from throughout a video to accurately estimate the state of an object. We address this problem by incorporating Dempster-Shafer information fusion into the tracking approach. To implement this fusion task, the entire image sequence is partitioned into spatially and temporally adjacent subsequences. A support vector machine (SVM) classifier is trained for object=non-object classification on each of these subsequences, the outputs of which act as separate data sources. To combine the discriminative information from these classifiers, we further present a spatio-temporal weighted Dempster-Shafer (STWDS) scheme. Moreover, temporally adjacent sources are likely to share discriminative information on object/non-object classification. In order to use such information, an adaptive SVM learning scheme is designed to transfer discriminative information across sources. Finally, the corresponding Dempster-Shafer belief function of the STWDS scheme is embedded into a Bayesian tracking model. Experimental results on challenging videos demonstrate the effectiveness and robustness of the proposed tracking approach.Xi Li, Anthony Dick, Chunhua Shen, Zhongfei Zhang, Anton van den Hengel, Hanzi Wan

    Object Tracking: Appearance Modeling And Feature Learning

    Get PDF
    Object tracking in real scenes is an important problem in computer vision due to increasing usage of tracking systems day in and day out in various applications such as surveillance, security, monitoring and robotic vision. Object tracking is the process of locating objects of interest in every frame of video frames. Many systems have been proposed to address the tracking problem where the major challenges come from handling appearance variation during tracking caused by changing scale, pose, rotation, illumination and occlusion. In this dissertation, we address these challenges by introducing several novel tracking techniques. First, we developed a multiple object tracking system that deals specially with occlusion issues. The system depends on our improved KLT tracker for accurate and robust tracking during partial occlusion. In full occlusion, we applied a Kalman filter to predict the object\u27s new location and connect the trajectory parts. Many tracking methods depend on a rectangle or an ellipse mask to segment and track objects. Typically, using a larger or smaller mask will lead to loss of tracked objects. Second, we present an object tracking system (SegTrack) that deals with partial and full occlusions by employing improved segmentation methods: mixture of Gaussians and a silhouette segmentation algorithm. For re-identification, one or more feature vectors for each tracked object are used after target reappearing. Third, we propose a novel Bayesian Hierarchical Appearance Model (BHAM) for robust object tracking. Our idea is to model the appearance of a target as combination of multiple appearance models, each covering the target appearance changes under a certain situation (e.g. view angle). In addition, we built an object tracking system by integrating BHAM with background subtraction and the KLT tracker for static camera videos. For moving camera videos, we applied BHAM to cluster negative and positive target instances. As tracking accuracy depends mainly on finding good discriminative features to estimate the target location, finally, we propose to learn good features for generic object tracking using online convolutional neural networks (OCNN). In order to learn discriminative and stable features for tracking, we propose a novel object function to train OCNN by penalizing the feature variations in consecutive frames, and the tracker is built by integrating OCNN with a color-based multi-appearance model. Our experimental results on real-world videos show that our tracking systems have superior performance when compared with several state-of-the-art trackers. In the feature, we plan to apply the Bayesian Hierarchical Appearance Model (BHAM) for multiple objects tracking

    Incremental learning of 3D-DCT compact representations for robust visual tracking

    Get PDF
    Visual tracking usually requires an object appearance model that is robust to changing illumination, pose and other factors encountered in video. Many recent trackers utilize appearance samples in previous frames to form the bases upon which the object appearance model is built. This approach has the following limitations: (a) the bases are data driven, so they can be easily corrupted; and (b) it is difficult to robustly update the bases in challenging situations. In this paper, we construct an appearance model using the 3D discrete cosine transform (3D-DCT). The 3D-DCT is based on a set of cosine basis functions, which are determined by the dimensions of the 3D signal and thus independent of the input video data. In addition, the 3D-DCT can generate a compact energy spectrum whose high-frequency coefficients are sparse if the appearance samples are similar. By discarding these high-frequency coefficients, we simultaneously obtain a compact 3D-DCT based object representation and a signal reconstruction-based similarity measure (reflecting the information loss from signal reconstruction). To efficiently update the object representation, we propose an incremental 3D-DCT algorithm, which decomposes the 3D-DCT into successive operations of the 2D discrete cosine transform (2D-DCT) and 1D discrete cosine transform (1D-DCT) on the input video data. As a result, the incremental 3D-DCT algorithm only needs to compute the 2D-DCT for newly added frames as well as the 1D-DCT along the third dimension, which significantly reduces the computational complexity. Based on this incremental 3D-DCT algorithm, we design a discriminative criterion to evaluate the likelihood of a test sample belonging to the foreground object. We then embed the discriminative criterion into a particle filtering framework for object state inference over time. Experimental results demonstrate the effectiveness and robustness of the proposed tracker.Xi Li, Anthony Dick, Chunhua Shen, Anton van den Hengel, and Hanzi Wan

    Online classification via self-organizing space partitioning

    Get PDF
    The authors study online supervised learning under the empirical zero-one loss and introduce a novel classification algorithm with strong theoretical guarantees. The proposed method is a highly dynamical self-organizing decision tree structure, which adaptively partitions the feature space into small regions and combines (takes the union of) the local simple classification models specialized in those regions. The authors' approach sequentially and directly minimizes the cumulative loss by jointly learning the optimal feature space partitioning and the corresponding individual partition-region classifiers. They mitigate overtraining issues by using basic linear classifiers at each region while providing a superior modeling power through hierarchical and data adaptive models. The computational complexity of the introduced algorithm scales linearly with the dimensionality of the feature space and the depth of the tree. Their algorithm can be applied to any streaming data without requiring a training phase or a priori information, hence processing data on-the-fly and then discarding it. Therefore, the introduced algorithm is especially suitable for the applications requiring sequential data processing at large scales/high rates. The authors present a comprehensive experimental study in stationary and nonstationary environments. In these experiments, their algorithm is compared with the state-of-the-art methods over the well-known benchmark datasets and shown to be computationally highly superior. The proposed algorithm significantly outperforms the competing methods in the stationary settings and demonstrates remarkable adaptation capabilities to nonstationarity in the presence of drifting concepts and abrupt/sudden concept changes. © 1991-2012 IEEE
    corecore