Designing Network Design Strategies Through Gradient Path Analysis
Designing a high-efficiency and high-quality expressive network architecture
has always been the most important research topic in the field of deep
learning. Most of today's network design strategies focus on how to integrate
features extracted from different layers, and how to design computing units to
effectively extract these features, thereby enhancing the expressiveness of the
network. This paper proposes a new network design strategy, i.e., to design the
network architecture based on gradient path analysis. On the whole, most of
today's mainstream network design strategies are based on the feed-forward path,
that is, the network architecture is designed based on the data path. In this
paper, we hope to enhance the expressive ability of the trained model by
improving the network learning ability. Because the mechanism driving
network parameter learning is the backpropagation algorithm, we base our
network design strategies on the backpropagation path. We propose
gradient path design strategies for the layer-level, the stage-level, and the
network-level, and theoretical analysis and experiments show that these
design strategies are superior and feasible.
Comment: 12 pages, 9 figures
Fast Video Retrieval via the Statistics of Motion
Due to the popularity of the Internet and the powerful computing capability of computers, efficient processing and retrieval of multimedia data has become an important issue. In this paper, we propose a fast video retrieval algorithm whose search core is based on the statistics of object motion. The algorithm starts by extracting object motions from a shot and then transforming and quantizing them into probability distributions. The first-stage video search uses the shot that has the largest entropy value among the constituent shots of an unknown query video clip. To compare two shots of different lengths, their corresponding motion probability distributions are compared using a discrete Bhattacharyya distance, which is designed to measure the similarity between any two distribution functions. In the second stage, we add an adjacent shot (either preceding or subsequent) to perform a finer comparison. Experimental results demonstrate that our fast video retrieval algorithm is powerful in terms of accuracy and efficiency.
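The two statistics named in this abstract can be illustrated with a minimal sketch. It assumes the standard textbook definitions of Shannon entropy and the discrete Bhattacharyya distance, which may differ from the exact variants used in the paper. The function names and the toy histograms are hypothetical and do not come from the paper.

```python
import math

def normalize(counts):
    """Turn a histogram of quantized motion counts into a probability distribution."""
    total = sum(counts)
    if total <= 0:
        raise ValueError("histogram must contain at least one observation")
    return [c / total for c in counts]

def entropy(p):
    """Shannon entropy in bits; zero-probability bins contribute nothing."""
    return -sum(x * math.log2(x) for x in p if x > 0)

def bhattacharyya_distance(p, q):
    """Standard discrete form: -ln(sum_i sqrt(p_i * q_i)). Assumed, not taken from the paper."""
    if len(p) != len(q):
        raise ValueError("distributions must have the same number of bins")
    coefficient = sum(math.sqrt(a * b) for a, b in zip(p, q))
    if coefficient == 0:
        return math.inf  # no overlapping bins at all
    # Clamp against floating-point overshoot so the distance is never negative.
    return max(0.0, -math.log(min(coefficient, 1.0)))

# Hypothetical motion histograms for two shots of one query clip.
query_shots = {"shot_a": [8, 1, 1, 0], "shot_b": [3, 3, 2, 2]}

# First-stage idea: search with the query shot whose distribution has the largest entropy.
key_shot = max(query_shots, key=lambda name: entropy(normalize(query_shots[name])))

# Compare that shot with a hypothetical database shot of a different length.
database_shot = normalize([30, 28, 22, 20])
distance = bhattacharyya_distance(normalize(query_shots[key_shot]), database_shot)
```

Because both histograms are normalized before comparison, shots with different numbers of frames can be compared directly, which matches the motivation stated in the abstract.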
Automatic Key Posture Selection for Human Behavior Analysis
A novel human posture analysis framework that performs automatic key posture selection and template matching for human behavior analysis is proposed. Entropy, which is commonly adopted in thermodynamics to describe the degree of disorder, is used as the underlying feature for identifying key postures. First, we use cumulative entropy change as an indicator to select an appropriate set of key postures from a human behavior video sequence, and then conduct a cross-entropy check to remove redundant key postures. With the key postures detected and stored as human posture templates, the degree of similarity between a query posture and a database template is evaluated using a modified Hausdorff distance measure. The experimental results show that the proposed system is highly efficient and powerful.
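The template-matching step can be sketched as follows. The abstract does not say which modification of the Hausdorff distance is used, so this example assumes the widely used Dubuisson-Jain form (the larger of the two mean nearest-neighbor distances). The point sets, template names, and function names are hypothetical.

```python
import math

def directed_mean_distance(a_points, b_points):
    """Mean distance from each point in a_points to its nearest point in b_points."""
    return sum(min(math.dist(a, b) for b in b_points) for a in a_points) / len(a_points)

def modified_hausdorff(a_points, b_points):
    """Dubuisson-Jain modified Hausdorff distance (assumed variant, not confirmed by the abstract)."""
    if not a_points or not b_points:
        raise ValueError("both point sets must be non-empty")
    return max(
        directed_mean_distance(a_points, b_points),
        directed_mean_distance(b_points, a_points),
    )

# Hypothetical posture contours stored as key-posture templates.
templates = {
    "standing": [(0, 0), (0, 1), (0, 2), (0, 3)],
    "lying": [(0, 0), (1, 0), (2, 0), (3, 0)],
}

# Hypothetical query posture; the closest template is the one with the smallest distance.
query = [(0.1, 0), (0.1, 1), (0, 2.1), (0, 3)]
best_match = min(templates, key=lambda name: modified_hausdorff(query, templates[name]))
```

Averaging the nearest-neighbor distances, rather than taking their maximum as in the classical Hausdorff distance, makes the measure less sensitive to a single outlying contour point.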
NeighborTrack: Improving Single Object Tracking by Bipartite Matching with Neighbor Tracklets
We propose a post-processor, called NeighborTrack, that leverages neighbor
information of the tracking target to validate and improve single-object
tracking (SOT) results. It requires no additional data or retraining. Instead,
it uses the confidence score predicted by the backbone SOT network to
automatically derive neighbor information and then uses this information to
improve the tracking results. When tracking an occluded target, its appearance
features are untrustworthy. However, a general siamese network often cannot
tell whether the tracked object is occluded by reading the confidence score
alone, because it could be misled by neighbors with high confidence scores. Our
proposed NeighborTrack takes advantage of unoccluded neighbors' information to
reconfirm the tracking target and reduces false tracking when the target is
occluded. It not only reduces the impact caused by occlusion, but also fixes
tracking problems caused by object appearance changes. NeighborTrack is
agnostic to SOT networks and post-processing methods. For the VOT challenge
dataset commonly used in short-term object tracking, we improve the average EAO
and robustness of three well-known SOT networks: Ocean, TransT, and OSTrack.
For the mid- and long-term tracking experiments based on
OSTrack, we achieve state-of-the-art AUC on LaSOT and AO
on GOT-10K. Code is available at
https://github.com/franktpmvu/NeighborTrack.
Comment: This paper was accepted by the 9th International Workshop on Computer
Vision in Sports (CVsports) at the 2023 IEEE/CVF Conference on Computer Vision and
Pattern Recognition Workshops (CVPRW)
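
The general idea of bipartite matching between tracklets and candidate boxes can be illustrated with a toy sketch. This is not the NeighborTrack algorithm: it assumes a plain IoU score between last-known tracklet boxes and current-frame candidates and finds the best assignment by exhaustive search, which is only practical for a handful of boxes. All names and box coordinates are hypothetical.

```python
from itertools import permutations

def iou(a, b):
    """Intersection over union of two boxes given as (x1, y1, x2, y2)."""
    ix1, iy1 = max(a[0], b[0]), max(a[1], b[1])
    ix2, iy2 = min(a[2], b[2]), min(a[3], b[3])
    inter = max(0.0, ix2 - ix1) * max(0.0, iy2 - iy1)
    union = (a[2] - a[0]) * (a[3] - a[1]) + (b[2] - b[0]) * (b[3] - b[1]) - inter
    return inter / union if union > 0 else 0.0

def best_assignment(tracklets, candidates):
    """Exhaustive bipartite matching that maximizes total IoU.

    Returns a list where entry i is the index of the candidate matched to tracklet i.
    """
    if len(candidates) < len(tracklets):
        raise ValueError("need at least as many candidates as tracklets")
    best, best_score = None, -1.0
    for perm in permutations(range(len(candidates)), len(tracklets)):
        score = sum(iou(t, candidates[j]) for t, j in zip(tracklets, perm))
        if score > best_score:
            best, best_score = list(perm), score
    return best

# tracklets[0] is the target's last box; tracklets[1] is a neighbor's last box.
tracklets = [(0, 0, 10, 10), (20, 0, 30, 10)]

# Suppose candidate 0 has the highest confidence score but lies on the neighbor.
candidates = [(21, 0, 31, 10), (1, 0, 11, 10)]

assignment = best_assignment(tracklets, candidates)
target_candidate = assignment[0]  # candidate index assigned to the target
```

In this toy case the matching assigns the high-confidence candidate to the neighbor and keeps the target on the other candidate, which illustrates why neighbor information can help when the confidence score alone is misleading.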