Robust Visual Tracking Revisited: From Correlation Filter to Template Matching
In this paper, we propose a novel matching based tracker by investigating the
relationship between template matching and the recent popular correlation
filter based trackers (CFTs). Compared to the correlation operation in CFTs, a
sophisticated similarity metric termed "mutual buddies similarity" (MBS) is
proposed to exploit the relationship of multiple reciprocal nearest neighbors
for target matching. By doing so, our tracker obtains powerful discriminative
ability in distinguishing target from background, as demonstrated by both
empirical and theoretical analyses. Besides, instead of utilizing a single
template with the improper updating scheme in CFTs, we design a novel online
template updating strategy named "memory filtering" (MF), which aims to select
a certain amount of representative and reliable tracking results in history to
construct the current stable and expressive template set. This scheme is
beneficial for the proposed tracker to comprehensively "understand" the target
appearance variations and "recall" some stable results. Both qualitative and
quantitative evaluations on two benchmarks suggest that the proposed tracking
method performs favorably against some recently developed CFTs and other
competitive trackers. Comment: has been published on IEEE TI
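The abstract above does not define the MBS metric itself, so the following is only a minimal sketch of the underlying idea of reciprocal nearest neighbors, not the authors' method. The function names, the 2-D "patch" coordinates, and the use of Euclidean distance are all assumptions made for illustration.

```python
from math import dist


def nearest_index(point, candidates):
    """Index of the candidate closest to `point` (Euclidean distance)."""
    return min(range(len(candidates)), key=lambda j: dist(point, candidates[j]))


def reciprocal_nearest_neighbors(set_a, set_b):
    """Return (i, j) pairs where set_b[j] is the nearest neighbor of set_a[i]
    within set_b AND set_a[i] is the nearest neighbor of set_b[j] within set_a."""
    pairs = []
    for i, a in enumerate(set_a):
        j = nearest_index(a, set_b)
        if nearest_index(set_b[j], set_a) == i:
            pairs.append((i, j))
    return pairs


# Hypothetical feature points for a template and a candidate region.
template_patches = [(0.0, 0.0), (1.0, 1.0), (5.0, 5.0)]
candidate_patches = [(0.1, 0.0), (5.2, 5.1), (9.0, 9.0)]

pairs = reciprocal_nearest_neighbors(template_patches, candidate_patches)
print(pairs)  # [(0, 0), (2, 1)]
```

Here the template point (1.0, 1.0) picks the first candidate as its nearest neighbor, but that candidate prefers (0.0, 0.0), so the pair is not reciprocal and is dropped. A similarity score could then be built from the number of reciprocal pairs, though how MBS combines them is not stated in the abstract.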
Efficient Diverse Ensemble for Discriminative Co-Tracking
Ensemble discriminative tracking utilizes a committee of classifiers to
label data samples, which are in turn used for retraining the tracker to
localize the target using the collective knowledge of the committee. Committee
members could vary in their features, memory update schemes, or training data;
however, it is inevitable to have committee members that excessively agree
because of large overlaps in their version space. To remove this redundancy and
have an effective ensemble learning, it is critical for the committee to
include consistent hypotheses that differ from one-another, covering the
version space with minimum overlaps. In this study, we propose an online
ensemble tracker that directly generates a diverse committee by generating an
efficient set of artificial training data. The artificial data is sampled from the
empirical distribution of the samples taken from both target and background,
whereas the process is governed by query-by-committee to shrink the overlap
between classifiers. The experimental results demonstrate that the proposed
scheme outperforms conventional ensemble trackers on public benchmarks. Comment: CVPR 2018 Submissio
RTrack: Accelerating Convergence for Visual Object Tracking via Pseudo-Boxes Exploration
Single object tracking (SOT) heavily relies on the representation of the
target object as a bounding box. However, due to the potential deformation and
rotation experienced by the tracked targets, the genuine bounding box fails to
capture the appearance information explicitly and introduces cluttered
background. This paper proposes RTrack, a novel object representation baseline
tracker that utilizes a set of sample points to get a pseudo bounding box.
RTrack automatically arranges these points to define the spatial extents and
highlight local areas. Building upon the baseline, we conducted an in-depth
exploration of the training potential and introduced a one-to-many leading
assignment strategy. Notably, our approach achieves performance competitive
with state-of-the-art trackers on the GOT-10k dataset while
reducing training time to just 10% of the previous state-of-the-art (SOTA)
trackers' training costs. The substantial reduction in training costs brings
single-object tracking (SOT) closer to the object detection (OD) task.
Extensive experiments demonstrate that our proposed RTrack achieves SOTA
results with faster convergence.
Classification of road users detected and tracked with LiDAR at intersections
Data collection is a necessary component of transportation engineering. Manual data collection methods have proven to be inefficient and limited in terms of the data required for comprehensive traffic and safety studies. Automatic methods are being introduced to characterize the transportation system more accurately and are providing more information to better understand the dynamics between road users. Video data collection is an inexpensive and widely used automated method, but the accuracy of video-based algorithms is known to be affected by obstacles and shadows and the third dimension is lost with video camera data collection.
The impressive progress in sensing technologies has encouraged development of new methods for measuring the movements of road users. The Center for Road Safety at Purdue University proposed application of a LiDAR-based algorithm for tracking vehicles at intersections from a roadside location. LiDAR provides a three-dimensional characterization of the sensed environment for better detection and tracking results. The feasibility of this system was analyzed in this thesis using an evaluation methodology to determine the accuracy of the algorithm when tracking vehicles at intersections. According to the implemented method, the LiDAR-based system provides successful detection and tracking of vehicles, and its accuracy is comparable to the results provided by frame-by-frame extraction of trajectory data using video images by human observers.
After establishing the suitability of the system for tracking, the second component of this thesis focused on proposing a classification methodology to discriminate between vehicles, pedestrians, and two-wheelers. Four different methodologies were applied to identify the best method for implementation. The KNN algorithm, which is capable of creating adaptive decision boundaries based on the characteristics of similar observations, provided better performance when evaluating new locations. The multinomial logit model did not allow the inclusion of collinear variables. The classification tree and boosting methodologies showed signs of overfitting the training data, which lowered performance when the models were applied to the test data. Although the ANOVA did not indicate superior performance by any competing method, the objective of classifying movements at intersections under diverse conditions was achieved with the KNN algorithm, which was chosen as the method to implement with the existing algorithm.
Generalized Relation Modeling for Transformer Tracking
Compared with previous two-stream trackers, the recent one-stream tracking
pipeline, which allows earlier interaction between the template and search
region, has achieved a remarkable performance gain. However, existing
one-stream trackers always let the template interact with all parts inside the
search region throughout all the encoder layers. This could potentially lead to
target-background confusion when the extracted feature representations are not
sufficiently discriminative. To alleviate this issue, we propose a generalized
relation modeling method based on adaptive token division. The proposed method
is a generalized formulation of attention-based relation modeling for
Transformer tracking, which inherits the merits of both previous two-stream and
one-stream pipelines whilst enabling more flexible relation modeling by
selecting appropriate search tokens to interact with template tokens. An
attention masking strategy and the Gumbel-Softmax technique are introduced to
facilitate the parallel computation and end-to-end learning of the token
division module. Extensive experiments show that our method is superior to the
two-stream and one-stream pipelines and achieves state-of-the-art performance
on six challenging benchmarks with a real-time running speed. Comment: Accepted by CVPR 2023. Code and models are publicly available at
https://github.com/Little-Podi/GR
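The abstract names the Gumbel-Softmax technique but does not describe the token division module, so the following is only a generic sketch of a Gumbel-Softmax sample in plain Python, not the authors' implementation. The function name, the example logits, and the temperature value are assumptions for illustration.

```python
import math
import random


def gumbel_softmax(logits, temperature=1.0, rng=random):
    """Draw one relaxed (soft) one-hot sample from unnormalized logits."""
    noisy = []
    for logit in logits:
        # Clamp away from 0 so that log() stays finite.
        u = max(rng.random(), 1e-12)
        gumbel_noise = -math.log(-math.log(u))
        noisy.append((logit + gumbel_noise) / temperature)
    # Numerically stable softmax over the perturbed, scaled logits.
    peak = max(noisy)
    exps = [math.exp(x - peak) for x in noisy]
    total = sum(exps)
    return [e / total for e in exps]


random.seed(0)
# Hypothetical logits for assigning one search token to one of three groups.
token_logits = [2.0, 0.5, -1.0]
probs = gumbel_softmax(token_logits, temperature=0.5)
hard_choice = max(range(len(probs)), key=probs.__getitem__)
```

Lower temperatures push `probs` toward a one-hot vector, while the soft output stays differentiable in frameworks that track gradients, which is what makes this relaxation usable for end-to-end learning of discrete choices.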
Errors and Truths from Transportation Data Aggregation: Some Implications for Research and Practice
Data aggregation, which is a process to combine information by defined groups for statistical analysis, summary, data size reduction, or other purposes, has fundamental challenges, such as loss of the original information. Improper data aggregation, such as sampling bias or incorrect calculation of an average, may cause misreading of information. The first chapter reveals that the harmonic mean, which is used to calculate space mean speed for a fixed segment, has a sampling bias, i.e., overestimation with small samples. Several impact analyses show that the sampling bias is affected by sampling rate, time interval, segment length, and distribution type.
If data aggregation is used properly, it can help us improve analytical efficiency, address some critical problems, or reveal causalities and other relevant information. The second and third chapters utilize the aggregation of multi-source data to estimate error distributions of data sources and improve the accuracy of their measurements. This is a leap forward in evaluating data sources, as the proposed model does not require ground truth data. The second chapter focuses more on the methodology, i.e., a modified Approximate Bayesian Computation, which is used to construct the error distribution through numerous simulations. In the simulated experiment, the proposed model outperformed the alternative approach, the conventional way of evaluating a data source by gathering error information through comparison with a ground truth data source. Several sensitivity analyses explore how model performance is affected by sample size, number of data sources, and distribution types. The proposed model in chapter II is limited to a one-dimensional variable; the application is then expanded to improving position and distance measurements in a connected vehicle environment. The proposed model can be used to further improve the accuracy of vehicle positioning together with other existing methods, such as simultaneous localization and mapping (SLAM). The estimation process can be conducted in real-time operation, and the learning process will keep improving the accuracy of estimation. The results show that the proposed model noticeably improves the accuracy of position and distance measurements.
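As a small worked example of the harmonic-mean calculation of space mean speed mentioned in this abstract, the sketch below shows that the harmonic mean of individual speeds equals total distance divided by total travel time on a fixed segment. The function names and the two speeds are hypothetical, and the example does not reproduce the sampling-bias analysis of the thesis.

```python
from statistics import harmonic_mean, mean


def space_mean_speed(speeds):
    """Harmonic mean of individual vehicle speeds over a fixed segment."""
    return harmonic_mean(speeds)


def speed_from_travel_times(speeds, segment_length=1.0):
    """Total distance travelled divided by total travel time."""
    travel_times = [segment_length / v for v in speeds]
    return len(speeds) * segment_length / sum(travel_times)


# Hypothetical speeds (km/h) of two vehicles crossing the same segment.
speeds_kmh = [30.0, 60.0]

sms = space_mean_speed(speeds_kmh)         # 2 / (1/30 + 1/60) = 40 km/h
check = speed_from_travel_times(speeds_kmh)
arithmetic = mean(speeds_kmh)              # 45 km/h, the arithmetic mean
```

The harmonic mean (40 km/h) is lower than the arithmetic mean (45 km/h) because the slower vehicle spends more time on the segment and therefore weighs more heavily in the distance-over-time average.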
Visual Tracking via Nonnegative Multiple Coding
© 2017 IEEE. It has been extensively observed that an accurate appearance model is critical to achieving satisfactory performance for robust object tracking. Most existing top-ranked methods rely on linear representation over a single dictionary, which leads to an improper understanding of the target appearance. To address this problem, in this paper, we propose a novel appearance model named "nonnegative multiple coding" (NMC) to accurately represent a target. First, a series of local dictionaries are created with different predefined numbers of nearest neighbors, and then the contributions of these dictionaries are automatically learned. As a result, this ensemble of dictionaries can comprehensively exploit the appearance information carried by all the constituent dictionaries. Second, the existing methods explicitly impose the nonnegative constraint on coefficient vectors, but in the proposed model, we directly deploy an efficient l2 norm regularization to achieve a similar nonnegative effect with theoretical guarantees. Moreover, an efficient occlusion detection scheme is designed to alleviate tracking drifts, which investigates whether negative templates are selected to represent the severely occluded target. Experimental results on two benchmarks demonstrate that our NMC tracker is able to achieve performance superior to state-of-the-art methods.