13 research outputs found
Dynamic Prototype Mask for Occluded Person Re-Identification
Although person re-identification has achieved an impressive improvement in
recent years, occlusion caused by various obstacles remains an unresolved
issue in real-world application scenarios. Existing methods mainly
address this issue by employing body clues provided by an extra network to
distinguish the visible part. Nevertheless, the inevitable domain gap between
the auxiliary model and the ReID datasets greatly increases the difficulty of
obtaining an effective and efficient model. To dispense with extra
pre-trained networks and achieve automatic alignment in an end-to-end
trainable network, we propose a novel Dynamic Prototype Mask (DPM) based on two
pieces of self-evident prior knowledge. Specifically, we first devise a
Hierarchical Mask Generator that utilizes hierarchical semantics to select the visible
pattern space between the high-quality holistic prototype and the feature
representation of the occluded input image. In this way, the occluded
representation can be spontaneously aligned within the selected subspace.
Then, to enrich the feature representation of the high-quality holistic
prototype and provide a more complete feature space, we introduce a Head Enrich
Module to encourage different heads to aggregate representations of different
patterns across the whole image. Extensive experimental evaluations conducted
on occluded and holistic person re-identification benchmarks demonstrate the
superior performance of the DPM over the state-of-the-art methods. The code is
released at https://github.com/stone96123/DPM. Comment: Accepted by ACM MM 2022.
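The prototype-guided masking idea can be illustrated with a toy numpy sketch. This is a hypothetical simplification, not the paper's actual architecture: it assumes pattern-wise feature vectors and selects visible patterns by cosine similarity to a holistic prototype, whereas the actual Hierarchical Mask Generator is learned end-to-end.

```python
import numpy as np

def dynamic_prototype_mask(prototype, occluded_feat, keep_ratio=0.5):
    """Toy sketch of prototype-guided masking (not the paper's exact model).

    prototype:     (K, D) holistic prototype patterns
    occluded_feat: (K, D) pattern features of an occluded image
    Returns the features restricted to the visible-pattern subspace, plus the mask.
    """
    # cosine similarity between each prototype pattern and its feature
    num = (prototype * occluded_feat).sum(axis=1)
    den = (np.linalg.norm(prototype, axis=1)
           * np.linalg.norm(occluded_feat, axis=1) + 1e-8)
    sim = num / den
    # keep the top-k patterns most consistent with the holistic prototype,
    # treating the rest as occluded
    k = max(1, int(keep_ratio * len(sim)))
    mask = np.zeros_like(sim)
    mask[np.argsort(sim)[-k:]] = 1.0
    return occluded_feat * mask[:, None], mask
```

Patterns whose features were corrupted by an obstacle have low similarity to the prototype and are masked out, so matching happens only in the visible subspace.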
Approximated Prompt Tuning for Vision-Language Pre-trained Models
Prompt tuning is a parameter-efficient way to deploy large-scale pre-trained
models to downstream tasks by adding task-specific tokens. In terms of
vision-language pre-trained (VLP) models, prompt tuning often requires a large
number of learnable tokens to bridge the gap between the pre-training and
downstream tasks, which greatly exacerbates the already high computational
overhead. In this paper, we revisit the principle of prompt tuning for
Transformer-based VLP models and reveal that the impact of soft prompt tokens
can actually be approximated via independent information diffusion steps,
thereby avoiding the expensive global attention modeling and reducing the
computational complexity to a large extent. Based on this finding, we propose a
novel Approximated Prompt Tuning (APT) approach towards efficient VL transfer
learning. To validate APT, we apply it to two representative VLP models, namely
ViLT and METER, and conduct extensive experiments on a range of downstream
tasks. Meanwhile, the generalization of APT is also validated on CLIP for image
classification. The experimental results not only show the superior performance
gains and computational efficiency of APT over conventional prompt tuning
methods, e.g., +6.6% accuracy and -64.62% additional computation overhead on
METER, but also confirm its merits over other parameter-efficient transfer
learning approaches.
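The observation that prompt influence can be computed independently of the global attention can be illustrated with the exact softmax decomposition that motivates it. The toy numpy sketch below (function names invented for illustration) shows that attention over the concatenation [prompts; inputs] splits into an input-attention term plus a separately computed prompt term sharing one normalisation; APT's actual approximation simplifies this decomposition further.

```python
import numpy as np

def softmax(x, axis=-1):
    e = np.exp(x - x.max(axis=axis, keepdims=True))
    return e / e.sum(axis=axis, keepdims=True)

def attn_with_prompts_full(Q, K, V, Kp, Vp):
    """Standard attention over the concatenation [prompts; inputs]."""
    scores = Q @ np.concatenate([Kp, K], axis=0).T
    w = softmax(scores)
    return w @ np.concatenate([Vp, V], axis=0)

def attn_with_prompts_decomposed(Q, K, V, Kp, Vp):
    """Same output via two independent terms: input attention plus a
    prompt 'diffusion' term, merged under a shared normalisation."""
    sx = np.exp(Q @ K.T)    # input-token scores
    sp = np.exp(Q @ Kp.T)   # prompt scores, computed independently
    denom = sx.sum(-1, keepdims=True) + sp.sum(-1, keepdims=True)
    return (sx @ V + sp @ Vp) / denom
```

The decomposition means the prompt contribution never requires extending the quadratic attention map to the extra tokens, which is the source of APT's savings.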
Bag of Features with Dense Sampling for Visual Tracking
The bag-of-features model has become a state-of-the-art method for visual classification. Visual codebooks, built by quantizing robust appearance descriptors extracted from local image patches, capture image statistics for object detection and classification. In this paper, more information about target objects is captured by dense sampling rather than sparse sampling. A robust visual tracking method is then proposed based on dense sampling and the bag-of-features model. Firstly, local image patches are densely extracted by sliding windows and represented as invariant descriptors. Secondly, visual codebooks are generated by fast clustering algorithms such as hierarchical k-means. The object region and candidate regions are then represented by the bag-of-features model with the learnt codebooks. After that, tracking operates within a Bayesian inference framework. The bag-of-features tracking method with dense sampling is adaptive and flexible. It works independently in many situations without requiring complementary existing tracking algorithms. Experiments on various challenging videos demonstrate that the proposed tracker outperforms several state-of-the-art algorithms.
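The representation pipeline described above can be sketched in a few numpy functions. This is a minimal illustration under assumed simplifications (raw grayscale patches stand in for invariant descriptors, plain k-means for hierarchical k-means; all names are hypothetical):

```python
import numpy as np

def dense_patches(img, size=4, stride=2):
    """Densely extract flattened patches with a sliding window."""
    H, W = img.shape
    return np.array([img[i:i + size, j:j + size].ravel()
                     for i in range(0, H - size + 1, stride)
                     for j in range(0, W - size + 1, stride)])

def kmeans_codebook(feats, k, iters=10, seed=0):
    """Plain k-means as a stand-in for fast hierarchical k-means."""
    rng = np.random.default_rng(seed)
    centers = feats[rng.choice(len(feats), k, replace=False)]
    for _ in range(iters):
        d = ((feats[:, None] - centers[None]) ** 2).sum(-1)
        labels = d.argmin(1)
        for c in range(k):
            if (labels == c).any():
                centers[c] = feats[labels == c].mean(0)
    return centers

def bof_histogram(patches, centers):
    """Quantise patches against the codebook; L1-normalised histogram."""
    d = ((patches[:, None] - centers[None]) ** 2).sum(-1)
    hist = np.bincount(d.argmin(1), minlength=len(centers)).astype(float)
    return hist / hist.sum()
```

In a tracker, the object region and each candidate region would be reduced to such histograms, and candidate likelihoods for the Bayesian inference step would come from a histogram similarity.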
Robust visual tracking via part-based sparsity model
Conference Name: 2013 38th IEEE International Conference on Acoustics, Speech, and Signal Processing, ICASSP 2013. Conference Address: Vancouver, BC, Canada. Time: May 26, 2013 - May 31, 2013. IEEE Signal Processing Society.
The sparse representation has been widely used in many areas, including visual tracking. Part-based representation performs outstandingly by using non-holistic templates to resist occlusion. This paper combines the two and proposes a robust object tracking method using a part-based sparsity model for tracking an object in a video sequence. In the proposed model, an object is represented by image patches. The candidates for these patches are sparsely represented in the space spanned by the patch templates and trivial templates. The part-based method takes the spatial information of each patch into consideration, where the vote maps of multiple patches are used. Furthermore, the update scheme keeps the representative templates of each part current dynamically. Therefore, the tracker can effectively deal with appearance changes and heavy occlusion. Abundant experimental results on various public benchmark videos demonstrate that the proposed tracking method outperforms many existing state-of-the-art algorithms. © 2013 IEEE
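The core patch-level sparse coding step can be sketched with a minimal ISTA solver over a dictionary of patch templates plus trivial (identity) templates, the standard construction in sparse-representation tracking. This is an illustrative sketch, not the paper's solver; the vote maps and template update scheme are omitted, and all names are hypothetical.

```python
import numpy as np

def ista_sparse_code(D, y, lam=0.1, iters=200):
    """Minimal ISTA for  min_x 0.5*||Dx - y||^2 + lam*||x||_1."""
    L = np.linalg.norm(D, 2) ** 2  # Lipschitz constant of the smooth part
    x = np.zeros(D.shape[1])
    for _ in range(iters):
        g = D.T @ (D @ x - y)                 # gradient step
        z = x - g / L
        x = np.sign(z) * np.maximum(np.abs(z) - lam / L, 0.0)  # soft-threshold
    return x

def part_dictionary(templates):
    """Dictionary = [patch templates | trivial (identity) templates].
    Trivial templates absorb occluded pixels so real templates stay clean."""
    d = templates.shape[0]
    return np.concatenate([templates, np.eye(d)], axis=1)
```

A candidate patch that matches a template yields a sparse code concentrated on that template, while occlusion energy leaks into the trivial-template coefficients, which is what makes the reconstruction error a usable per-part vote.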
Dual Distribution Alignment Network for Generalizable Person Re-Identification
Domain generalization (DG) offers a preferable real-world setting for Person Re-Identification (Re-ID), in which a model is trained on multiple source domain datasets and expected to perform well on an unseen target domain without any model updating. Unfortunately, most DG approaches are designed explicitly for classification tasks, which fundamentally differ from Re-ID, a retrieval task. Moreover, existing applications of DG in Re-ID cannot correctly handle the massive variation among Re-ID datasets. In this paper, we identify two fundamental challenges in DG for Person Re-ID: domain-wise variations and identity-wise similarities. To this end, we propose an end-to-end Dual Distribution Alignment Network (DDAN) that learns domain-invariant features with dual-level constraints: domain-wise adversarial feature learning and identity-wise similarity enhancement. These constraints effectively reduce the domain shift among multiple source domains while remaining consistent with real-world scenarios. We evaluate our method on a large-scale DG Re-ID benchmark and compare it with various cutting-edge DG approaches. Quantitative results show that DDAN achieves state-of-the-art performance.
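Domain-wise adversarial feature learning is typically realised with a gradient-reversal mechanism. The numpy sketch below is a hypothetical minimal version with a one-layer logistic domain classifier (the paper's network is far richer): the classifier receives its ordinary gradient, while the feature side receives the sign-reversed gradient, pushing features toward domain invariance.

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def domain_adv_grads(feat, w, domain_label):
    """Gradients of a binary-cross-entropy domain loss w.r.t. the classifier
    weights and, with reversed sign, w.r.t. the features (gradient reversal)."""
    p = sigmoid(feat @ w)        # predicted probability of the source domain
    err = p - domain_label       # dL/dlogit for BCE
    grad_w = err * feat          # classifier learns to discriminate domains
    grad_feat = -(err * w)       # reversed: features learn to fool it
    return grad_w, grad_feat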