21,837 research outputs found

    Vision-based Real-Time Aerial Object Localization and Tracking for UAV Sensing System

    Get PDF
    The paper focuses on the problem of vision-based obstacle detection and tracking for unmanned aerial vehicle navigation. A real-time object localization and tracking strategy from monocular image sequences is developed by effectively integrating the object detection and tracking into a dynamic Kalman model. At the detection stage, the object of interest is automatically detected and localized from a saliency map computed via the image background connectivity cue at each frame; at the tracking stage, a Kalman filter is employed to provide a coarse prediction of the object state, which is further refined via a local detector incorporating the saliency map and the temporal information between two consecutive frames. Compared to existing methods, the proposed approach does not require any manual initialization for tracking, runs much faster than the state-of-the-art trackers of its kind, and achieves competitive tracking performance on a large number of image sequences. Extensive experiments demonstrate the effectiveness and superior performance of the proposed approach.Comment: 8 pages, 7 figure

    On the Relations of Correlation Filter Based Trackers and Struck

    Full text link
    In recent years, two types of trackers, namely correlation filter based tracker (CF tracker) and structured output tracker (Struck), have exhibited the state-of-the-art performance. However, there seems to be lack of analytic work on their relations in the computer vision community. In this paper, we investigate two state-of-the-art CF trackers, i.e., spatial regularization discriminative correlation filter (SRDCF) and correlation filter with limited boundaries (CFLB), and Struck, and reveal their relations. Specifically, after extending the CFLB to its multiple channel version we prove the relation between SRDCF and CFLB on the condition that the spatial regularization factor of SRDCF is replaced by the masking matrix of CFLB. We also prove the asymptotical approximate relation between SRDCF and Struck on the conditions that the spatial regularization factor of SRDCF is replaced by an indicator function of object bounding box, the weights of SRDCF in its loss item are replaced by those of Struck, the linear kernel is employed by Struck, and the search region tends to infinity. Extensive experiments on public benchmarks OTB50 and OTB100 are conducted to verify our theoretical results. Moreover, we explain how detailed differences among SRDCF, CFLB, and Struck would give rise to slightly different performances on visual sequence

    Spectral Filter Tracking

    Full text link
    Visual object tracking is a challenging computer vision task with numerous real-world applications. Here we propose a simple but efficient Spectral Filter Tracking (SFT)method. To characterize rotational and translation invariance of tracking targets, the candidate image region is models as a pixelwise grid graph. Instead of the conventional graph matching, we convert the tracking into a plain least square regression problem to estimate the best center coordinate of the target. But different from the holistic regression of correlation filter based methods, SFT can operate on localized surrounding regions of each pixel (i.e.,vertex) by using spectral graph filters, which thus is more robust to resist local variations and cluttered background.To bypass the eigenvalue decomposition problem of the graph Laplacian matrix L, we parameterize spectral graph filters as the polynomial of L by spectral graph theory, in which L k exactly encodes a k-hop local neighborhood of each vertex. Finally, the filter parameters (i.e., polynomial coefficients) as well as feature projecting functions are jointly integrated into the regression model.Comment: 11page

    Learning Spatial-Aware Regressions for Visual Tracking

    Full text link
    In this paper, we analyze the spatial information of deep features, and propose two complementary regressions for robust visual tracking. First, we propose a kernelized ridge regression model wherein the kernel value is defined as the weighted sum of similarity scores of all pairs of patches between two samples. We show that this model can be formulated as a neural network and thus can be efficiently solved. Second, we propose a fully convolutional neural network with spatially regularized kernels, through which the filter kernel corresponding to each output channel is forced to focus on a specific region of the target. Distance transform pooling is further exploited to determine the effectiveness of each output channel of the convolution layer. The outputs from the kernelized ridge regression model and the fully convolutional neural network are combined to obtain the ultimate response. Experimental results on two benchmark datasets validate the effectiveness of the proposed method.Comment: To appear in CVPR201

    Tracking Randomly Moving Objects on Edge Box Proposals

    Full text link
    Most tracking-by-detection methods employ a local search window around the predicted object location in the current frame assuming the previous location is accurate, the trajectory is smooth, and the computational capacity permits a search radius that can accommodate the maximum speed yet small enough to reduce mismatches. These, however, may not be valid always, in particular for fast and irregularly moving objects. Here, we present an object tracker that is not limited to a local search window and has ability to probe efficiently the entire frame. Our method generates a small number of "high-quality" proposals by a novel instance-specific objectness measure and evaluates them against the object model that can be adopted from an existing tracking-by-detection approach as a core tracker. During the tracking process, we update the object model concentrating on hard false-positives supplied by the proposals, which help suppressing distractors caused by difficult background clutters, and learn how to re-rank proposals according to the object model. Since we reduce significantly the number of hypotheses the core tracker evaluates, we can use richer object descriptors and stronger detector. Our method outperforms most recent state-of-the-art trackers on popular tracking benchmarks, and provides improved robustness for fast moving objects as well as for ultra low-frame-rate videos

    Effective Occlusion Handling for Fast Correlation Filter-based Trackers

    Full text link
    Correlation filter-based trackers heavily suffer from the problem of multiple peaks in their response maps incurred by occlusions. Moreover, the whole tracking pipeline may break down due to the uncertainties brought by shifting among peaks, which will further lead to the degraded correlation filter model. To alleviate the drift problem caused by occlusions, we propose a novel scheme to choose the specific filter model according to different scenarios. Specifically, an effective measurement function is designed to evaluate the quality of filter response. A sophisticated strategy is employed to judge whether occlusions occur, and then decide how to update the filter models. In addition, we take advantage of both log-polar method and pyramid-like approach to estimate the best scale of the target. We evaluate our proposed approach on VOT2018 challenge and OTB100 dataset, whose experimental result shows that the proposed tracker achieves the promising performance compared against the state-of-the-art trackers

    Robust Visual Tracking using Multi-Frame Multi-Feature Joint Modeling

    Full text link
    It remains a huge challenge to design effective and efficient trackers under complex scenarios, including occlusions, illumination changes and pose variations. To cope with this problem, a promising solution is to integrate the temporal consistency across consecutive frames and multiple feature cues in a unified model. Motivated by this idea, we propose a novel correlation filter-based tracker in this work, in which the temporal relatedness is reconciled under a multi-task learning framework and the multiple feature cues are modeled using a multi-view learning approach. We demonstrate the resulting regression model can be efficiently learned by exploiting the structure of blockwise diagonal matrix. A fast blockwise diagonal matrix inversion algorithm is developed thereafter for efficient online tracking. Meanwhile, we incorporate an adaptive scale estimation mechanism to strengthen the stability of scale variation tracking. We implement our tracker using two types of features and test it on two benchmark datasets. Experimental results demonstrate the superiority of our proposed approach when compared with other state-of-the-art trackers. project homepage http://bmal.hust.edu.cn/project/KMF2JMTtracking.htmlComment: This paper has been accepted by IEEE Transactions on Circuits and Systems for Video Technology. The MATLAB code of our method is available from our project homepage http://bmal.hust.edu.cn/project/KMF2JMTtracking.htm

    Hierarchical Spatial-aware Siamese Network for Thermal Infrared Object Tracking

    Full text link
    Most thermal infrared (TIR) tracking methods are discriminative, treating the tracking problem as a classification task. However, the objective of the classifier (label prediction) is not coupled to the objective of the tracker (location estimation). The classification task focuses on the between-class difference of the arbitrary objects, while the tracking task mainly deals with the within-class difference of the same objects. In this paper, we cast the TIR tracking problem as a similarity verification task, which is coupled well to the objective of the tracking task. We propose a TIR tracker via a Hierarchical Spatial-aware Siamese Convolutional Neural Network (CNN), named HSSNet. To obtain both spatial and semantic features of the TIR object, we design a Siamese CNN that coalesces the multiple hierarchical convolutional layers. Then, we propose a spatial-aware network to enhance the discriminative ability of the coalesced hierarchical feature. Subsequently, we train this network end to end on a large visible video detection dataset to learn the similarity between paired objects before we transfer the network into the TIR domain. Next, this pre-trained Siamese network is used to evaluate the similarity between the target template and target candidates. Finally, we locate the candidate that is most similar to the tracked target. Extensive experimental results on the benchmarks VOT-TIR 2015 and VOT-TIR 2016 show that our proposed method achieves favourable performance compared to the state-of-the-art methods.Comment: 20 pages, 7 figure

    Real-time quantitative Schlieren imaging by fast Fourier demodulation of a checkered backdrop

    Full text link
    A quantitative synthetic Schlieren imaging (SSI) method based on fast Fourier demodulation is presented. Instead of a random dot pattern (as usually employed in SSI), a 2D periodic pattern (such as a checkerboard) is used as a backdrop to the refractive object of interest. The range of validity and accuracy of this "Fast Checkerboard Demodulation" (FCD) method are assessed using both synthetic data and experimental recordings of patterns optically distorted by small waves on a water surface. It is found that the FCD method is at least as accurate as sophisticated, multi-stage, digital image correlation (DIC) or optical flow (OF) techniques used with random dot patterns, and it is significantly faster. Efficient, fully vectorized, implementations of both the FCD and DIC/OF schemes developed for this study are made available as Matlab scripts.Comment: 21 pages, 7 figures, 1 appendi

    Correlation Filters for Unmanned Aerial Vehicle-Based Aerial Tracking: A Review and Experimental Evaluation

    Full text link
    Aerial tracking, which has exhibited its omnipresent dedication and splendid performance, is one of the most active applications in the remote sensing field. Especially, unmanned aerial vehicle (UAV)-based remote sensing system, equipped with a visual tracking approach, has been widely used in aviation, navigation, agriculture,transportation, and public security, etc. As is mentioned above, the UAV-based aerial tracking platform has been gradually developed from research to practical application stage, reaching one of the main aerial remote sensing technologies in the future. However, due to the real-world onerous situations, e.g., harsh external challenges, the vibration of the UAV mechanical structure (especially under strong wind conditions), the maneuvering flight in complex environment, and the limited computation resources onboard, accuracy, robustness, and high efficiency are all crucial for the onboard tracking methods. Recently, the discriminative correlation filter (DCF)-based trackers have stood out for their high computational efficiency and appealing robustness on a single CPU, and have flourished in the UAV visual tracking community. In this work, the basic framework of the DCF-based trackers is firstly generalized, based on which, 23 state-of-the-art DCF-based trackers are orderly summarized according to their innovations for solving various issues. Besides, exhaustive and quantitative experiments have been extended on various prevailing UAV tracking benchmarks, i.e., UAV123, UAV123@10fps, UAV20L, UAVDT, DTB70, and VisDrone2019-SOT, which contain 371,903 frames in total. The experiments show the performance, verify the feasibility, and demonstrate the current challenges of DCF-based trackers onboard UAV tracking.Comment: 28 pages, 10 figures, submitted to GRS
    • …
    corecore