22 research outputs found

    Learning Background-Aware Correlation Filters for Visual Tracking

    Correlation Filters (CFs) have recently demonstrated excellent performance in terms of rapidly tracking objects under challenging photometric and geometric variations. The strength of the approach comes from its ability to efficiently learn - "on the fly" - how the object is changing over time. A fundamental drawback to CFs, however, is that the background of the object is not modelled over time, which can result in suboptimal results. In this paper we propose a Background-Aware CF that can model how both the foreground and background of the object vary over time. Our approach, like conventional CFs, is extremely computationally efficient - and extensive experiments over multiple tracking benchmarks demonstrate the superior accuracy and real-time performance of our method compared to state-of-the-art trackers, including those based on a deep learning paradigm.

    Correlation Filters with Limited Boundaries

    Correlation filters take advantage of specific properties in the Fourier domain allowing them to be estimated efficiently: O(ND log D) in the frequency domain, versus O(D^3 + ND^2) spatially, where D is the signal length and N is the number of signals. Recent extensions to correlation filters, such as MOSSE, have reignited interest in their use in the vision community due to their robustness and attractive computational properties. In this paper we demonstrate, however, that this computational efficiency comes at a cost. Specifically, we demonstrate that only a 1/D proportion of shifted examples are unaffected by boundary effects, which has a dramatic effect on detection/tracking performance. We propose a novel approach to correlation filter estimation that: (i) takes advantage of inherent computational redundancies in the frequency domain, and (ii) dramatically reduces boundary effects. Impressive object tracking and detection results are presented in terms of both accuracy and computational efficiency. Comment: 8 pages, 6 figures, 2 tables
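
The 1/D claim in the abstract above can be illustrated with a small sketch. This is not the authors' code: the names `circular_shift`, `true_shift`, and the toy `scene` are assumptions made for illustration. It compares the circularly shifted examples that a frequency-domain correlation filter implicitly trains on with the windows that would actually be observed if the content translated, and counts how many agree.

```python
def circular_shift(window, k):
    """Shift a 1-D window right by k samples with wrap-around,
    as implied by frequency-domain correlation."""
    k %= len(window)
    return window[-k:] + window[:-k]


def true_shift(scene, start, d, k):
    """Window of length d observed when the scene content moves right by k
    (no wrap-around: new samples enter from the left)."""
    return scene[start - k:start - k + d]


D = 8
scene = list(range(3 * D))        # hypothetical 1-D scene with distinct values
start = D                         # window position inside the scene
window = scene[start:start + D]

# A shift is "unaffected" when the wrapped example equals the real one.
unaffected = [
    k for k in range(D)
    if circular_shift(window, k) == true_shift(scene, start, D, k)
]
fraction = len(unaffected) / D
print(unaffected, fraction)       # [0] 0.125
```

Only the zero shift matches, so 1 of the D examples (a 1/D proportion) is free of boundary effects in this toy setting.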

    MULTI-CHANNEL CORRELATION FILTERS WITH LIMITED BOUNDARIES: THEORY AND APPLICATIONS

    Ph.D (DOCTOR OF PHILOSOPHY)

    Deep-LK for Efficient Adaptive Object Tracking

    In this paper we present a new approach for efficient regression-based object tracking which we refer to as Deep-LK. Our approach is closely related to the Generic Object Tracking Using Regression Networks (GOTURN) framework of Held et al. We make the following contributions. First, we demonstrate that there is a theoretical relationship between siamese regression networks like GOTURN and the classical Inverse-Compositional Lucas & Kanade (IC-LK) algorithm. Further, we demonstrate that, unlike GOTURN, IC-LK adapts its regressor to the appearance of the currently tracked frame. We argue that GOTURN's poor performance on unseen objects and/or viewpoints can be attributed to this missing property. Second, we propose a novel framework for object tracking - which we refer to as Deep-LK - that is inspired by the IC-LK framework. Finally, we show impressive results demonstrating that Deep-LK substantially outperforms GOTURN. Additionally, we demonstrate comparable tracking performance to current state-of-the-art deep trackers whilst being an order of magnitude (i.e. 100 FPS) more computationally efficient.

    Dense Feature Aggregation and Pruning for RGBT Tracking

    How to perform effective information fusion of different modalities is a core factor in boosting the performance of RGBT tracking. This paper presents a novel deep fusion algorithm based on the representations from an end-to-end trained convolutional neural network. To exploit the complementarity of features of all layers, we propose a recursive strategy to densely aggregate these features, yielding robust representations of target objects in each modality. Across modalities, we propose to prune the densely aggregated features of all modalities in a collaborative way. Specifically, we employ the operations of global average pooling and weighted random selection to perform channel scoring and selection, which can remove redundant and noisy features to achieve a more robust feature representation. Experimental results on two RGBT tracking benchmark datasets suggest that our tracker achieves clear state-of-the-art performance against other RGB and RGBT tracking methods. Comment: arXiv admin note: text overlap with arXiv:1811.0985
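
The channel scoring and selection step named in the abstract above might look roughly like the following sketch. The abstract does not give the exact scheme, so several things here are assumptions: channels are scored by their global average activation, selection probability is proportional to that score, and sampling is without replacement. The function names are hypothetical.

```python
import random


def global_average_pool(channel):
    """Score one channel (a 2-D list of activations) by its mean value."""
    values = [v for row in channel for v in row]
    return sum(values) / len(values)


def prune_channels(feature_maps, keep, seed=0):
    """Return the sorted indices of `keep` channels, sampled without
    replacement with probability proportional to their pooled score.

    Assumes non-negative activations (e.g. after a ReLU) and at least
    `keep` channels with a positive score.
    """
    rng = random.Random(seed)
    scores = [global_average_pool(ch) for ch in feature_maps]
    remaining = list(range(len(feature_maps)))
    chosen = []
    for _ in range(keep):
        weights = [scores[i] for i in remaining]
        pick = rng.choices(remaining, weights=weights, k=1)[0]
        chosen.append(pick)
        remaining.remove(pick)
    return sorted(chosen)


# Three toy 2x2 channels; the all-zero channel can never be selected.
maps = [
    [[2.0, 2.0], [2.0, 2.0]],
    [[0.0, 0.0], [0.0, 0.0]],
    [[1.0, 1.0], [1.0, 1.0]],
]
kept = prune_channels(maps, keep=2)
print(kept)  # [0, 2]
```

Randomised selection, as opposed to a hard top-k cut, gives lower-scoring channels a chance to survive; whether the paper applies it during training only is not stated in the abstract.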

    NOVA: rendering virtual worlds with humans for computer vision tasks

    Today, the cutting edge of computer vision research greatly depends on the availability of large datasets, which are critical for effectively training and testing new methods. Manually annotating visual data, however, is not only a labor-intensive process but also prone to errors. In this study, we present NOVA, a versatile framework to create realistic-looking 3D rendered worlds containing procedurally generated humans with rich pixel-level ground truth annotations. NOVA can simulate various environmental factors such as weather conditions or different times of day, and bring an exceptionally diverse set of humans to life, each having a distinct body shape, gender and age. To demonstrate NOVA's capabilities, we generate two synthetic datasets for person tracking. The first one includes 108 sequences, each with a different level of difficulty, such as tracking in crowded scenes or at nighttime, and aims to test the limits of current state-of-the-art trackers. A second dataset of 97 sequences with normal weather conditions is used to show how our synthetic sequences can be utilized to train and boost the performance of deep-learning based trackers. Our results indicate that the synthetic data generated by NOVA represents a good proxy of the real world and can be exploited for computer vision tasks.

    A survey on heterogeneous face recognition: Sketch, infra-red, 3D and low-resolution

    Heterogeneous face recognition (HFR) refers to matching face imagery across different domains. It has received much interest from the research community as a result of its profound implications in law enforcement. A wide variety of new invariant features, cross-modality matching models and heterogeneous datasets have been established in recent years. This survey provides a comprehensive review of established techniques and recent developments in HFR. Moreover, we offer a detailed account of datasets and benchmarks commonly used for evaluation. We finish by assessing the state of the field and discussing promising directions for future research.

    Tracking Groups of People in Presence of Occlusion

    Abstract: This paper addresses the problem of tracking groups of people in the presence of occlusion as people form groups, interact within groups or leave groups. Foreground objects (a person or a group of people) from two consecutive frames are matched based on appearance (RGB histogram) and object location (2D region) similarity. While tracking, this method detects and handles events such as objects merging and splitting using forward and backward matching matrices. The experimental results show that the proposed algorithm tracks groups of people efficiently in cluttered and complex environments, even when total or partial occlusion occurs.
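
A minimal sketch of the matching idea described above, under stated assumptions. The abstract does not say how appearance and location similarity are combined or how the matrices are built, so this example multiplies a histogram intersection by a box overlap (IoU) and stores forward and backward matches as index lists. All function names and the toy objects are hypothetical.

```python
def hist_intersection(h1, h2):
    """Similarity in [0, 1] of two normalised colour histograms."""
    return sum(min(a, b) for a, b in zip(h1, h2))


def iou(a, b):
    """Intersection over union of boxes given as (x1, y1, x2, y2)."""
    ix = max(0, min(a[2], b[2]) - max(a[0], b[0]))
    iy = max(0, min(a[3], b[3]) - max(a[1], b[1]))
    inter = ix * iy
    union = ((a[2] - a[0]) * (a[3] - a[1])
             + (b[2] - b[0]) * (b[3] - b[1]) - inter)
    return inter / union if union else 0.0


def best_matches(src, dst):
    """For each object in src, the index of its most similar object in dst,
    or None when nothing overlaps."""
    out = []
    for s in src:
        scores = [hist_intersection(s["hist"], d["hist"]) * iou(s["box"], d["box"])
                  for d in dst]
        best = max(range(len(dst)), key=scores.__getitem__, default=None)
        out.append(best if best is not None and scores[best] > 0 else None)
    return out


def detect_events(prev, curr):
    """Forward (prev -> curr) and backward (curr -> prev) matching,
    plus the merge and split events they imply."""
    forward = best_matches(prev, curr)
    backward = best_matches(curr, prev)
    merges = {j for j in forward if j is not None and forward.count(j) > 1}
    splits = {i for i in backward if i is not None and backward.count(i) > 1}
    return forward, backward, merges, splits


# Two people in the previous frame become one group blob in the current frame.
people = [{"hist": [0.5, 0.5], "box": (0, 0, 10, 20)},
          {"hist": [0.5, 0.5], "box": (12, 0, 22, 20)}]
group = [{"hist": [0.5, 0.5], "box": (0, 0, 22, 20)}]
forward, backward, merges, splits = detect_events(people, group)
print(forward, merges, splits)  # [0, 0] {0} set()
```

Swapping the arguments (`detect_events(group, people)`) reports the reverse event, a split of object 0.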

    2012b. Inter-modality Face Sketch Recognition

    Abstract: Automatic face sketch recognition plays an important role in law enforcement. Recently, various methods have been proposed to address the problem of face sketch recognition by matching face photos and sketches, which are of different modalities. However, their performance is strongly affected by the modality difference between sketches and photos. In this paper, we propose a new face descriptor based on gradient orientations, called Histogram of Averaged Oriented Gradients (HAOG), to reduce the modality difference in the feature extraction stage. Experiments on the CUFS database show that the new descriptor outperforms the state-of-the-art approaches.

    2012a. Face sketch recognition by Local Radon Binary Pattern: LRBP

    Abstract: In this paper, we propose a new face descriptor, called Local Radon Binary Pattern (LRBP), to directly match face photos and sketches of different modalities. LRBP is inspired by the fact that the shape of a face photo and its corresponding sketch is similar, even when the sketch is exaggerated by an artist. Therefore, the shape of the face can be exploited to compute features which are robust against modality differences between face photo and sketch. In the LRBP framework, the characteristics of face shape are captured by transforming the face image into Radon space. Then, micro-information of the face shape in the new space is encoded by Local Binary Pattern (LBP). Finally, LRBP is computed by concatenating histograms of local LBPs. In order to capture both local and global characteristics of face shape, LRBP is extracted in a spatial pyramid fashion. Experiments on the CUFS and CUFSF datasets indicate the efficiency of LRBP for face sketch recognition.
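
The LBP encoding and histogram concatenation steps mentioned above can be sketched as follows. This covers only the basic 8-neighbour LBP on a small 2-D array; the Radon transform and the spatial pyramid are omitted, and the neighbour ordering and the `>=` comparison are common conventions assumed here, not details taken from the paper.

```python
# Clockwise 8-neighbourhood starting at the top-left pixel (assumed ordering).
OFFSETS = [(-1, -1), (-1, 0), (-1, 1), (0, 1),
           (1, 1), (1, 0), (1, -1), (0, -1)]


def lbp_code(img, r, c):
    """8-bit LBP code: bit i is set when neighbour i is >= the centre pixel."""
    center = img[r][c]
    code = 0
    for bit, (dr, dc) in enumerate(OFFSETS):
        if img[r + dr][c + dc] >= center:
            code |= 1 << bit
    return code


def lbp_histogram(img):
    """256-bin histogram of LBP codes over all interior pixels of one region."""
    hist = [0] * 256
    for r in range(1, len(img) - 1):
        for c in range(1, len(img[0]) - 1):
            hist[lbp_code(img, r, c)] += 1
    return hist


# Two toy 3x3 regions standing in for local blocks of a Radon-space image.
flat = [[5, 5, 5], [5, 5, 5], [5, 5, 5]]   # every neighbour >= centre
peak = [[0, 0, 0], [0, 9, 0], [0, 0, 0]]   # every neighbour < centre

# The final descriptor concatenates the per-region histograms.
descriptor = lbp_histogram(flat) + lbp_histogram(peak)
print(len(descriptor))  # 512
```

In the full method each region would come from the Radon-transformed face image at several pyramid levels, which is where the local and global shape information mentioned in the abstract would enter.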