4,430 research outputs found

    Robust Visual Tracking Revisited: From Correlation Filter to Template Matching

    In this paper, we propose a novel matching-based tracker by investigating the relationship between template matching and the recently popular correlation filter based trackers (CFTs). In place of the correlation operation in CFTs, a sophisticated similarity metric termed "mutual buddies similarity" (MBS) is proposed to exploit the relationship of multiple reciprocal nearest neighbors for target matching. By doing so, our tracker obtains powerful discriminative ability in distinguishing target from background, as demonstrated by both empirical and theoretical analyses. Besides, instead of utilizing a single template with the improper updating scheme in CFTs, we design a novel online template updating strategy named "memory filtering" (MF), which aims to select a certain amount of representative and reliable tracking results in history to construct the current stable and expressive template set. This scheme helps the proposed tracker to comprehensively "understand" the target appearance variations and "recall" some stable results. Both qualitative and quantitative evaluations on two benchmarks suggest that the proposed tracking method performs favorably against some recently developed CFTs and other competitive trackers. Comment: has been published on IEEE TI
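
The abstract above builds its similarity on reciprocal nearest neighbors. As a minimal sketch of that building block only (not the MBS metric or the authors' tracker), the snippet below finds pairs of points that are each other's nearest neighbor across two sets. The function names, the Euclidean distance, and the normalization by the smaller set size are assumptions made for illustration.

```python
from math import dist


def nearest_index(point, points):
    """Index of the element of `points` closest to `point` (Euclidean distance)."""
    return min(range(len(points)), key=lambda j: dist(point, points[j]))


def reciprocal_pairs(template, candidate):
    """Pairs (i, j) where template[i] and candidate[j] are each other's nearest neighbor."""
    pairs = []
    for i, point in enumerate(template):
        j = nearest_index(point, candidate)
        # Keep the pair only if the match also holds in the reverse direction.
        if nearest_index(candidate[j], template) == i:
            pairs.append((i, j))
    return pairs


def reciprocal_similarity(template, candidate):
    """Illustrative score (an assumption): reciprocal pairs divided by the smaller set size."""
    return len(reciprocal_pairs(template, candidate)) / min(len(template), len(candidate))


template = [(0.0, 0.0), (1.0, 0.0), (5.0, 5.0)]
aligned = [(0.1, 0.0), (0.9, 0.0), (9.0, 9.0)]
clustered = [(0.1, 0.0), (0.2, 0.0), (0.3, 0.0)]
print(reciprocal_pairs(template, aligned))
print(reciprocal_pairs(template, clustered))
```

A candidate whose points all crowd around one template point yields few reciprocal pairs, which is the intuition for why mutual matching can separate target from background better than one-directional matching.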

    Efficient Diverse Ensemble for Discriminative Co-Tracking

    Ensemble discriminative tracking utilizes a committee of classifiers to label data samples, which are in turn used for retraining the tracker to localize the target using the collective knowledge of the committee. Committee members could vary in their features, memory update schemes, or training data; however, it is inevitable to have committee members that excessively agree because of large overlaps in their version space. To remove this redundancy and achieve effective ensemble learning, it is critical for the committee to include consistent hypotheses that differ from one another, covering the version space with minimum overlaps. In this study, we propose an online ensemble tracker that directly generates a diverse committee by generating an efficient set of artificial training data. The artificial data is sampled from the empirical distribution of the samples taken from both target and background, whereas the process is governed by query-by-committee to shrink the overlap between classifiers. The experimental results demonstrate that the proposed scheme outperforms conventional ensemble trackers on public benchmarks. Comment: CVPR 2018 Submissio

    RTrack: Accelerating Convergence for Visual Object Tracking via Pseudo-Boxes Exploration

    Single object tracking (SOT) heavily relies on the representation of the target object as a bounding box. However, due to the potential deformation and rotation experienced by the tracked targets, the genuine bounding box fails to capture the appearance information explicitly and introduces cluttered background. This paper proposes RTrack, a novel object representation baseline tracker that utilizes a set of sample points to get a pseudo bounding box. RTrack automatically arranges these points to define the spatial extents and highlight local areas. Building upon the baseline, we conducted an in-depth exploration of the training potential and introduced a one-to-many leading assignment strategy. It is worth noting that our approach achieves performance competitive with the state-of-the-art (SOTA) trackers on the GOT-10k dataset while reducing training time to just 10% of the previous SOTA trackers' training costs. The substantial reduction in training costs brings SOT closer to the object detection (OD) task. Extensive experiments demonstrate that our proposed RTrack achieves SOTA results with faster convergence.

    Classification of road users detected and tracked with LiDAR at intersections

    Data collection is a necessary component of transportation engineering. Manual data collection methods have proven to be inefficient and limited in terms of the data required for comprehensive traffic and safety studies. Automatic methods are being introduced to characterize the transportation system more accurately and are providing more information to better understand the dynamics between road users. Video data collection is an inexpensive and widely used automated method, but the accuracy of video-based algorithms is known to be affected by obstacles and shadows and the third dimension is lost with video camera data collection. The impressive progress in sensing technologies has encouraged development of new methods for measuring the movements of road users. The Center for Road Safety at Purdue University proposed application of a LiDAR-based algorithm for tracking vehicles at intersections from a roadside location. LiDAR provides a three-dimensional characterization of the sensed environment for better detection and tracking results. The feasibility of this system was analyzed in this thesis using an evaluation methodology to determine the accuracy of the algorithm when tracking vehicles at intersections. According to the implemented method, the LiDAR-based system provides successful detection and tracking of vehicles, and its accuracy is comparable to the results provided by frame-by-frame extraction of trajectory data using video images by human observers. After supporting the suitability of the system for tracking, the second component of this thesis focused on proposing a classification methodology to discriminate between vehicles, pedestrians, and two-wheelers. Four different methodologies were applied to identify the best method for implementation. The KNN algorithm, which is capable of creating adaptive decision boundaries based on the characteristics of similar observations, provided better performance when evaluating new locations. 
The multinomial logit model did not allow the inclusion of collinear variables. The classification tree and boosting methodologies showed signs of overfitting the training data and produced lower performance when the models were applied to the test data. Although the ANOVA analysis did not support superior performance by any one competitor, the objective of classifying movements at intersections under diverse conditions was achieved with the KNN algorithm, which was chosen as the method to implement with the existing algorithm.
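
To make the KNN step described in this abstract concrete, here is a minimal majority-vote k-nearest-neighbors sketch. It is not the thesis's implementation: the two features (object length in metres, speed in m/s), the sample values, and `k=3` are all hypothetical and chosen only for illustration.

```python
from collections import Counter
from math import dist


def knn_predict(train, query, k=3):
    """Majority label among the k training points closest to `query` (Euclidean distance)."""
    neighbors = sorted(train, key=lambda item: dist(item[0], query))[:k]
    votes = Counter(label for _, label in neighbors)
    return votes.most_common(1)[0][0]


# Hypothetical (length_m, speed_mps) features; not data from the thesis.
train = [
    ((4.5, 13.0), "vehicle"),
    ((4.8, 15.0), "vehicle"),
    ((5.0, 11.0), "vehicle"),
    ((0.5, 1.4), "pedestrian"),
    ((0.6, 1.2), "pedestrian"),
    ((0.5, 1.6), "pedestrian"),
    ((1.8, 5.0), "two-wheeler"),
    ((1.7, 6.0), "two-wheeler"),
    ((1.9, 4.5), "two-wheeler"),
]

print(knn_predict(train, (0.55, 1.3)))
print(knn_predict(train, (4.6, 12.0)))
print(knn_predict(train, (1.8, 5.5)))
```

Because the decision depends only on nearby labelled observations, the boundary adapts to local structure in the data, which matches the abstract's description of KNN's adaptive decision boundaries. In practice the features would usually be scaled before computing distances.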

    Generalized Relation Modeling for Transformer Tracking

    Compared with previous two-stream trackers, the recent one-stream tracking pipeline, which allows earlier interaction between the template and search region, has achieved a remarkable performance gain. However, existing one-stream trackers always let the template interact with all parts inside the search region throughout all the encoder layers. This could potentially lead to target-background confusion when the extracted feature representations are not sufficiently discriminative. To alleviate this issue, we propose a generalized relation modeling method based on adaptive token division. The proposed method is a generalized formulation of attention-based relation modeling for Transformer tracking, which inherits the merits of both previous two-stream and one-stream pipelines whilst enabling more flexible relation modeling by selecting appropriate search tokens to interact with template tokens. An attention masking strategy and the Gumbel-Softmax technique are introduced to facilitate the parallel computation and end-to-end learning of the token division module. Extensive experiments show that our method is superior to the two-stream and one-stream pipelines and achieves state-of-the-art performance on six challenging benchmarks with a real-time running speed. Comment: Accepted by CVPR 2023. Code and models are publicly available at https://github.com/Little-Podi/GR
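
The abstract mentions the Gumbel-Softmax technique for end-to-end learning of a discrete choice. The sketch below shows only the generic sampling step (add Gumbel noise to logits, then apply a temperature-scaled softmax) in plain Python. It is not the paper's token division module; the function name, the example logits, and the temperature are assumptions for illustration.

```python
import math
import random


def gumbel_softmax(logits, tau=1.0, rng=random):
    """Draw one relaxed categorical sample: softmax((logits + Gumbel noise) / tau)."""
    perturbed = []
    for logit in logits:
        u = max(rng.random(), 1e-12)        # guard against log(0)
        gumbel = -math.log(-math.log(u))    # standard Gumbel(0, 1) noise
        perturbed.append((logit + gumbel) / tau)
    peak = max(perturbed)                   # subtract the max for numerical stability
    weights = [math.exp(p - peak) for p in perturbed]
    total = sum(weights)
    return [w / total for w in weights]


rng = random.Random(0)
sample = gumbel_softmax([2.0, 0.5, -1.0], tau=0.5, rng=rng)
print(sample)
```

Lower values of `tau` push the sample toward a one-hot vector, while higher values make it smoother. In a deep learning framework the same expression stays differentiable with respect to the logits, which is what makes the technique usable for end-to-end training.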

    Errors and Truths from Transportation Data Aggregation: Some Implications for Research and Practice

    Data aggregation, which is a process to combine information by defined groups for statistical analysis, summary, data size reduction, or other purposes, has fundamental challenges, such as loss of the original information. Improper data aggregation, such as sampling bias or incorrect calculation of an average, may cause misreading of information. In the first chapter, it is revealed that the harmonic mean, which is used to calculate space mean speed for a fixed segment, has a sampling bias, i.e., overestimation with small samples. Several impact analyses show that the sampling bias is affected by sampling rate, time interval, segment length, and distribution type. If data aggregation is properly used, it can help us improve analytical efficiency, encounter some critical problems, or reveal its casualties and other relevant information. The second and third chapters utilize the aggregation of multi-source data to estimate error distributions of data sources and improve the accuracy of their measurements. This is a leaping point in evaluating data sources, as the proposed model does not require ground truth data. The second chapter focuses more on the methodology, i.e., a modified Approximate Bayesian Computation, incorporated to construct the error distribution with numerous simulations. In the simulated experiment, the proposed model outperformed the alternative approach, which is the conventional way of evaluating a data source by gathering error information through comparison with a ground truth data source. Several sensitivity analyses explore how the model performance is affected by sample size, number of data sources, and distribution types. The proposed model in chapter II is limited to a one-dimensional variable, and the application is then expanded to improving the position and distance measurement of the connected vehicle environment. 
The proposed model can be used to further improve the accuracy of vehicle positioning with other existing methods, such as simultaneous localization and mapping (SLAM). The estimation process can be conducted in real-time operation, and the learning process will try to keep improving the accuracy of estimation. The results show that the proposed model noticeably improves the accuracy of position and distance measurements
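
The small-sample overestimation noted for the harmonic mean in this abstract can be illustrated with a short simulation. This is a sketch under stated assumptions, not the thesis's analysis: speeds are drawn from a hypothetical Uniform(20, 100) distribution, and the sample sizes and trial counts are arbitrary.

```python
import random
from math import log


def space_mean_speed(speeds):
    """Harmonic mean of spot speeds over a fixed segment: n / sum(1 / v_i)."""
    return len(speeds) / sum(1.0 / v for v in speeds)


def average_estimate(sample_size, trials, rng, low=20.0, high=100.0):
    """Average harmonic-mean estimate over repeated samples of Uniform(low, high) speeds."""
    total = 0.0
    for _ in range(trials):
        sample = [rng.uniform(low, high) for _ in range(sample_size)]
        total += space_mean_speed(sample)
    return total / trials


# For Uniform(low, high), E[1/v] = ln(high/low) / (high - low),
# so the population harmonic mean is (high - low) / ln(high / low).
POPULATION_VALUE = (100.0 - 20.0) / log(100.0 / 20.0)

rng = random.Random(0)
small = average_estimate(sample_size=2, trials=20000, rng=rng)
large = average_estimate(sample_size=200, trials=2000, rng=rng)
print(round(POPULATION_VALUE, 2), small, large)
```

The average of the two-vehicle estimates sits above the population value, while the 200-vehicle estimates land close to it. This follows from Jensen's inequality: the harmonic mean is the reciprocal of a sample mean of 1/v, and the reciprocal is a convex function, so the estimate is biased upward for small samples.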

    Visual Tracking via Nonnegative Multiple Coding

    © 2017 IEEE. It has been extensively observed that an accurate appearance model is critical to achieving satisfactory performance for robust object tracking. Most existing top-ranked methods rely on linear representation over a single dictionary, which leads to an improper understanding of the target appearance. To address this problem, in this paper, we propose a novel appearance model named "nonnegative multiple coding" (NMC) to accurately represent a target. First, a series of local dictionaries are created with different predefined numbers of nearest neighbors, and then the contributions of these dictionaries are automatically learned. As a result, this ensemble of dictionaries can comprehensively exploit the appearance information carried by all the constituent dictionaries. Second, the existing methods explicitly impose the nonnegative constraint on coefficient vectors, but in the proposed model, we directly deploy an efficient l2 norm regularization to achieve a similar nonnegative effect with theoretical guarantees. Moreover, an efficient occlusion detection scheme is designed to alleviate tracking drifts, which investigates whether negative templates are selected to represent the severely occluded target. Experimental results on two benchmarks demonstrate that our NMC tracker is able to achieve superior performance to state-of-the-art methods.