1,653 research outputs found

    Designing Network Design Strategies Through Gradient Path Analysis

    Designing a high-efficiency, high-quality, and expressive network architecture has always been the most important research topic in the field of deep learning. Most of today's network design strategies focus on how to integrate features extracted from different layers and how to design computing units that extract these features effectively, thereby enhancing the expressiveness of the network. This paper proposes a new network design strategy: designing the network architecture based on gradient path analysis. Most of today's mainstream network design strategies are based on the feed-forward path; that is, the network architecture is designed around the data path. In this paper, we aim to enhance the expressive ability of the trained model by improving the network's learning ability. Because the mechanism driving network parameter learning is the backpropagation algorithm, we base our network design strategies on the backpropagation path. We propose gradient path design strategies at the layer level, the stage level, and the network level, and theoretical analysis and experiments show these strategies to be superior and feasible. Comment: 12 pages, 9 figures

    Fast Video Retrieval via the Statistics of Motion

    Due to the popularity of the Internet and the powerful computing capability of computers, efficient processing and retrieval of multimedia data has become an important issue. In this paper, we propose a fast video retrieval algorithm whose search core is based on the statistics of object motion. The algorithm starts by extracting object motions from a shot and then transforms and quantizes them into probability distributions. In the first search stage, we choose the shot with the largest entropy value among the constituent shots of an unknown query video clip and use it to search. To compare two shots of different lengths, their corresponding motion probability distributions are compared using a discrete Bhattacharyya distance, which is designed to measure the similarity between any two distribution functions. In the second stage, we add an adjacent shot (either preceding or subsequent) to perform a finer comparison. Experimental results demonstrate that our fast video retrieval algorithm is powerful in terms of accuracy and efficiency. File no.: 2030144030026. Department: Electrical Engineering
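The first-stage search described in this abstract can be illustrated with a minimal sketch. It assumes the standard definitions of Shannon entropy and the discrete Bhattacharyya distance, which may differ from the variant the authors designed; the histograms and the names `shannon_entropy`, `bhattacharyya_distance`, `query_shots`, and `database_shot` are hypothetical and not taken from the paper.

```python
import math

def shannon_entropy(p):
    """Shannon entropy (in bits) of a discrete probability distribution."""
    return -sum(x * math.log2(x) for x in p if x > 0)

def bhattacharyya_distance(p, q):
    """Standard discrete Bhattacharyya distance: -ln(sum_i sqrt(p_i * q_i))."""
    bc = sum(math.sqrt(a * b) for a, b in zip(p, q))
    if bc <= 0:
        return math.inf  # no overlapping support
    # Clamp at zero to guard against tiny negative values from rounding.
    return max(0.0, -math.log(bc))

# Hypothetical quantized motion histograms, one per shot of the query clip.
query_shots = [
    [0.7, 0.1, 0.1, 0.1],
    [0.25, 0.25, 0.25, 0.25],
]

# Assumed reading of the first stage: search with the highest-entropy shot.
key_shot = max(query_shots, key=shannon_entropy)

# Hypothetical database shot; a smaller distance means more similar motion.
database_shot = [0.2, 0.3, 0.3, 0.2]
distance = bhattacharyya_distance(key_shot, database_shot)
```

Because both shots are reduced to distributions over the same quantized motion bins, the comparison does not depend on shot length, which is consistent with the abstract's remark about comparing shots of different lengths.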

    Automatic Key Posture Selection for Human Behavior Analysis

    A novel human posture analysis framework that performs automatic key posture selection and template matching for human behavior analysis is proposed. The entropy measurement, which is commonly adopted as an important feature to describe the degree of disorder in thermodynamics, is used as the underlying feature for identifying key postures. First, we use cumulative entropy change as an indicator to select an appropriate set of key postures from a human behavior video sequence, and then conduct a cross-entropy check to remove redundant key postures. With the key postures detected and stored as human posture templates, the degree of similarity between a query posture and a database template is evaluated using a modified Hausdorff distance measure. The experimental results show that the proposed system is highly efficient and powerful. File no.: 2030144030013. Department: Electrical Engineering
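The template-matching step can be sketched as follows. This assumes the widely used Dubuisson and Jain form of the modified Hausdorff distance (the larger of the two directed mean nearest-neighbor distances); the abstract does not say which modification the authors use, and the point sets and function names here are hypothetical.

```python
import math

def directed_mean_distance(a_points, b_points):
    """Mean, over points in A, of the distance to the nearest point in B."""
    return sum(
        min(math.dist(a, b) for b in b_points) for a in a_points
    ) / len(a_points)

def modified_hausdorff(a_points, b_points):
    """Modified Hausdorff distance: the larger of the two directed means."""
    return max(
        directed_mean_distance(a_points, b_points),
        directed_mean_distance(b_points, a_points),
    )

# Hypothetical contour points of a query posture and a stored template.
query_posture = [(0.0, 0.0), (1.0, 0.0)]
template_posture = [(0.0, 1.0), (1.0, 1.0)]

score = modified_hausdorff(query_posture, template_posture)
```

Averaging the nearest-neighbor distances, rather than taking their maximum as the classical Hausdorff distance does, makes the score less sensitive to a few outlying contour points.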

    NeighborTrack: Improving Single Object Tracking by Bipartite Matching with Neighbor Tracklets

    We propose a post-processor, called NeighborTrack, that leverages neighbor information of the tracking target to validate and improve single-object tracking (SOT) results. It requires no additional data or retraining. Instead, it uses the confidence score predicted by the backbone SOT network to automatically derive neighbor information and then uses this information to improve the tracking results. When a tracked target is occluded, its appearance features are untrustworthy. However, a general Siamese network often cannot tell whether the tracked object is occluded by reading the confidence score alone, because it can be misled by neighbors with high confidence scores. NeighborTrack uses information from unoccluded neighbors to reconfirm the tracking target and reduce false tracking when the target is occluded. It not only reduces the impact of occlusion but also fixes tracking problems caused by changes in object appearance. NeighborTrack is agnostic to SOT networks and post-processing methods. On the VOT challenge dataset commonly used in short-term object tracking, we improve three well-known SOT networks, Ocean, TransT, and OSTrack, by an average of 1.92% EAO and 2.11% robustness. In the mid- and long-term tracking experiments based on OSTrack, we achieve state-of-the-art results of 72.25% AUC on LaSOT and 75.7% AO on GOT-10K. Code is available at https://github.com/franktpmvu/NeighborTrack. Comment: This paper was accepted by the 9th International Workshop on Computer Vision in Sports (CVsports), 2023 IEEE/CVF Conference on Computer Vision and Pattern Recognition Workshops (CVPRW)
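The bipartite matching named in the title can be illustrated generically. The sketch below is not NeighborTrack's procedure, which the abstract does not detail; it assumes a box-overlap (IoU) affinity and an exhaustive one-to-one assignment over a handful of boxes, and the functions `iou` and `best_matching` as well as the sample boxes are hypothetical.

```python
from itertools import permutations

def iou(a, b):
    """Intersection over union of two boxes given as (x1, y1, x2, y2)."""
    ix1, iy1 = max(a[0], b[0]), max(a[1], b[1])
    ix2, iy2 = min(a[2], b[2]), min(a[3], b[3])
    inter = max(0.0, ix2 - ix1) * max(0.0, iy2 - iy1)
    area_a = (a[2] - a[0]) * (a[3] - a[1])
    area_b = (b[2] - b[0]) * (b[3] - b[1])
    union = area_a + area_b - inter
    return inter / union if union > 0 else 0.0

def best_matching(candidates, references):
    """Exhaustive one-to-one matching that maximizes total IoU.

    Requires len(candidates) <= len(references). Returns a list where
    entry i is the index of the reference assigned to candidate i.
    """
    best, best_score = None, -1.0
    for perm in permutations(range(len(references)), len(candidates)):
        score = sum(iou(candidates[i], references[j]) for i, j in enumerate(perm))
        if score > best_score:
            best, best_score = perm, score
    return list(best)

# Hypothetical boxes: current candidates and boxes from previous tracklets.
candidates = [(0, 0, 10, 10), (20, 20, 30, 30)]
references = [(21, 21, 31, 31), (1, 1, 11, 11)]

assignment = best_matching(candidates, references)
```

Exhaustive search is only practical for a few boxes; for larger sets a polynomial-time assignment solver such as the Hungarian algorithm is the usual choice.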