Decomposed Human Motion Prior for Video Pose Estimation via Adversarial Training
Estimating human pose from video receives considerable attention due to its
applicability in numerous 3D fields. The complexity of the prior knowledge
of human body movements poses a challenge to neural network models that
regress keypoints. In this paper, we address this problem by incorporating
a motion prior in an adversarial way. Unlike previous methods, we propose to
decompose the holistic motion prior into joint-level motion priors, making
it easier for neural networks to learn from prior knowledge and thereby
boosting performance on the task. We also introduce a novel regularization
loss to balance accuracy against the smoothness imposed by the motion
prior. Our method achieves 9% lower PA-MPJPE and 29% lower acceleration
error than previous methods on 3DPW. The estimator further demonstrates its
robustness by achieving strong performance on in-the-wild datasets.
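The acceleration error reported above is a standard smoothness metric for video pose estimation. A minimal sketch of how such a metric is typically computed via second-order finite differences (the function name and averaging scheme are illustrative, not taken from the paper):

```python
import numpy as np

def acceleration_error(pred, gt):
    """Mean difference between predicted and ground-truth joint
    accelerations, estimated by second-order finite differences.

    pred, gt: arrays of shape (T, J, 3) -- T frames, J joints, 3D coords.
    Returns the mean L2 acceleration difference over frames and joints.
    """
    # Second-order finite difference: a[t] = x[t+1] - 2*x[t] + x[t-1]
    accel_pred = pred[2:] - 2 * pred[1:-1] + pred[:-2]
    accel_gt = gt[2:] - 2 * gt[1:-1] + gt[:-2]
    return np.linalg.norm(accel_pred - accel_gt, axis=-1).mean()
```

A jittery prediction inflates this metric even when per-frame joint error is low, which is why a smoothness-accuracy trade-off (as the regularization loss above targets) arises in the first place.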
Shareable Driving Style Learning and Analysis with a Hierarchical Latent Model
Driving style is usually used to characterize driving behavior for a driver
or a group of drivers. However, it remains unclear how one individual's driving
style shares certain common grounds with other drivers. Our insight is that
driving behavior is a sequence of responses to the weighted mixture of latent
driving styles that are shareable within and between individuals. To this end,
this paper develops a hierarchical latent model to learn the relationship
between driving behavior and driving styles. We first propose a fragment-based
approach to represent complex sequential driving behavior, allowing for
sufficiently representing driving behavior in a low-dimension feature space.
Then, we provide an analytical formulation for the interaction of driving
behavior and shareable driving style with a hierarchical latent model by
introducing the mechanism of Dirichlet allocation. Our model is then
validated and verified with 100 drivers in naturalistic urban and highway
driving settings. Experimental results reveal that driving styles are
shared both within and between individuals. We also analyzed the influence
of driver attributes (e.g., age, gender, and driving experience) on driving
styles and found that a naturally aggressive driver does not always drive
aggressively (i.e., can sometimes behave calmly) but exhibits a higher
proportion of aggressiveness than other types of drivers.
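The Dirichlet-allocation mechanism described above can be illustrated with a toy generative sketch: each driver draws a personal mixture over a shared pool of latent styles, and each behavior fragment is generated from one sampled style. The style count, fragment features, and all parameter values below are illustrative assumptions, not values from the paper:

```python
import numpy as np

rng = np.random.default_rng(0)

N_STYLES = 3                       # shared latent driving styles (assumed)
ALPHA = np.ones(N_STYLES) * 0.5    # Dirichlet concentration (assumed)

# Each shared style is summarized here by a mean feature vector of a
# behavior fragment (e.g., speed and acceleration statistics) -- a
# stand-in for whatever fragment representation the model learns.
style_means = np.array([[30.0, 0.5],    # calm
                        [50.0, 1.5],    # moderate
                        [70.0, 3.0]])   # aggressive

def generate_driver(n_fragments):
    """Sample one driver: a personal style mixture, then fragments."""
    mixture = rng.dirichlet(ALPHA)                 # driver-specific weights
    styles = rng.choice(N_STYLES, size=n_fragments, p=mixture)
    fragments = style_means[styles] + rng.normal(0, 0.1, (n_fragments, 2))
    return mixture, styles, fragments
```

This toy process mirrors the finding above: a driver whose mixture leans aggressive still emits some calm fragments, just with lower probability.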
Modeling and Recognizing Driver Behavior Based on Driving Data: A Survey
In recent years, modeling and recognizing driver behavior have become crucial to understanding intelligent transport systems, human-vehicle systems, and intelligent vehicle systems. This paper presents, from a control point of view, a wide range of mathematical identification and modeling methods for driver behavior based on driving data such as brake/throttle pedal position and steering wheel angle, among others. Subsequently, the driver characteristics derived from the driver model are embedded into advanced driver assistance systems, and the evaluation and verification of vehicle systems based on the driver model are described.
TinySAM: Pushing the Envelope for Efficient Segment Anything Model
Recently, the segment anything model (SAM) has shown powerful segmentation
capability and has drawn great attention in computer vision. Numerous
follow-up works have developed various applications based on the pretrained
SAM and achieved impressive performance on downstream vision tasks.
However, SAM has a heavy architecture and requires massive computational
capacity, which hinders its further application on computation-constrained
edge devices. To this end, in this paper we propose a
framework to obtain a tiny segment anything model (TinySAM) while maintaining
the strong zero-shot performance. We first propose a full-stage knowledge
distillation method with hard prompt sampling and hard mask weighting strategy
to distill a lightweight student model. We also adapt the post-training
quantization to the promptable segmentation task and further reduce the
computational cost. Moreover, a hierarchical segmenting-everything strategy
is proposed to accelerate the everything-mode inference with almost no
performance degradation. With all these proposed methods, our TinySAM leads to
orders of magnitude computational reduction and pushes the envelope for
efficient segment anything task. Extensive experiments on various zero-shot
transfer tasks demonstrate the significantly advantageous performance of our
TinySAM against counterpart methods. Pre-trained models and codes are available
at https://github.com/xinghaochen/TinySAM and
https://gitee.com/mindspore/models/tree/master/research/cv/TinySAM.
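One way to read the "hard mask weighting" idea above is to up-weight the pixels where the student disagrees most with the teacher during distillation. A toy numpy sketch, where the weighting scheme, threshold fraction, and names are our illustrative assumptions rather than TinySAM's exact formulation:

```python
import numpy as np

def weighted_distill_loss(student_logits, teacher_logits, top_frac=0.25):
    """Toy distillation loss: squared error to the teacher's mask
    logits, with extra weight on the hardest pixels (largest
    student-teacher gap).

    student_logits, teacher_logits: arrays of shape (H, W).
    top_frac: fraction of pixels treated as 'hard' (assumed value).
    """
    err = (student_logits - teacher_logits) ** 2
    k = max(1, int(err.size * top_frac))
    # Threshold at the k-th largest squared error.
    threshold = np.partition(err.ravel(), -k)[-k]
    weights = np.where(err >= threshold, 2.0, 1.0)  # double-weight hard pixels
    return float((weights * err).mean())
```

The effect is that easy, already-matched regions contribute less to the gradient than ambiguous mask boundaries, which is the usual motivation for hard-example weighting in distillation.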
Gen-L-Video: Multi-Text to Long Video Generation via Temporal Co-Denoising
Leveraging large-scale image-text datasets and advancements in diffusion
models, text-driven generative models have made remarkable strides in the field
of image generation and editing. This study explores the potential of extending
the text-driven ability to the generation and editing of multi-text conditioned
long videos. Current methodologies for video generation and editing, while
innovative, are often confined to extremely short videos (typically less than
24 frames) and are limited to a single text condition. These constraints
significantly limit their applications given that real-world videos usually
consist of multiple segments, each bearing different semantic information. To
address this challenge, we introduce a novel paradigm dubbed Gen-L-Video,
capable of extending off-the-shelf short video diffusion models for generating
and editing videos comprising hundreds of frames with diverse semantic segments
without introducing additional training, all while preserving content
consistency. We have implemented three mainstream text-driven video generation
and editing methodologies and extended them to accommodate longer videos imbued
with a variety of semantic segments with our proposed paradigm. Our
experimental outcomes reveal that our approach significantly broadens the
generative and editing capabilities of video diffusion models, offering new
possibilities for future research and applications. The code is available at
https://github.com/G-U-N/Gen-L-Video.
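The temporal co-denoising idea above — covering a long sequence with overlapping short windows and reconciling their per-frame predictions — can be sketched as an overlap-average. The window size, stride, and the stand-in per-window "denoiser" callable are illustrative; the actual method operates on diffusion latents at each denoising step:

```python
import numpy as np

def co_denoise(frames, window, stride, denoise_window):
    """Blend per-window predictions into one long sequence.

    frames: array (T, D) of per-frame features.
    denoise_window: callable mapping a (window, D) clip to a prediction
    of the same shape (stand-in for a short-video diffusion model).
    Overlapping predictions are averaged frame by frame.
    """
    T, D = frames.shape
    acc = np.zeros((T, D))
    counts = np.zeros((T, 1))
    for start in range(0, T - window + 1, stride):
        clip = frames[start:start + window]
        acc[start:start + window] += denoise_window(clip)
        counts[start:start + window] += 1
    return acc / counts
```

Because every frame is constrained by several overlapping windows (each possibly conditioned on a different text prompt), the averaged result stays consistent across segment boundaries without any extra training — the property the abstract emphasizes.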
Real-Time Scalable Visual Tracking via Quadrangle Kernelized Correlation Filters
The correlation filter (CF) has been widely used in tracking tasks due to its simplicity and high efficiency. However, conventional CF-based trackers fail to handle the scale variation that occurs when the target object is moving, which is one of the most notable unsolved problems of visual object tracking. In this paper, we propose a scalable visual tracking algorithm based on kernelized correlation filters, referred to as quadrangle kernelized correlation filters (QKCF). Unlike existing complicated scalable trackers that either perform the correlation filtering operation multiple times or extract many candidate windows at various scales, our tracker estimates the scale of the object based on the positions of its four corners, which can be detected using a new Gaussian training output matrix within one filtering process. After obtaining four peak values corresponding to the four corners, we measure the detection confidence of each part response by evaluating its spatial and temporal smoothness. On top of this, a weighted Bayesian inference framework is employed to estimate the final location and size of the bounding box from the response matrix, where the weights are synchronized with the calculated detection likelihoods. Experiments are performed on the OTB-100 data set and 16 benchmark sequences with significant scale variations. The results demonstrate the superiority of the proposed method in terms of both effectiveness and robustness, compared with the state-of-the-art methods.
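The final fusion step described above — combining the four corner detections according to their confidences — can be sketched as a confidence-weighted estimate of the bounding box. This is a simplified stand-in for the paper's weighted Bayesian inference; the corner ordering and weighting rule are our assumptions:

```python
import numpy as np

def fuse_corners(corners, confidences):
    """Estimate box center and size from four detected corners.

    corners: array (4, 2), ordered [top-left, top-right,
             bottom-left, bottom-right], each an (x, y) position
             (image coordinates, y increasing downward).
    confidences: array (4,) of detection likelihoods.
    Returns (center, width, height) with per-corner weighting.
    """
    w = np.asarray(confidences, dtype=float)
    w = w / w.sum()
    center = (w[:, None] * corners).sum(axis=0)
    tl, tr, bl, br = corners
    # Two independent width/height estimates, each weighted by the
    # combined confidence of the corner pair that produced it.
    widths = np.array([tr[0] - tl[0], br[0] - bl[0]])
    heights = np.array([bl[1] - tl[1], br[1] - tr[1]])
    wt_w = np.array([w[0] + w[1], w[2] + w[3]])
    wt_h = np.array([w[0] + w[2], w[1] + w[3]])
    width = (wt_w * widths).sum() / wt_w.sum()
    height = (wt_h * heights).sum() / wt_h.sum()
    return center, width, height
```

Down-weighting a low-confidence corner keeps a single occluded or drifting part response from corrupting the scale estimate, which is the intuition behind the likelihood-synchronized weights in the abstract.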
A New Type of Quartz Smog Chamber: Design and Characterization
Since the 1960s, many indoor and outdoor smog chambers have been developed worldwide. However, most of them are made of Teflon films, which have relatively high background contamination due to the wall effect. We developed the world's first medium-size quartz chamber (10 m^3), assembled from 32 pieces of 5 mm thick polished quartz glass and a stainless-steel frame. Characterizations show that this chamber exhibits excellent performance in terms of relative humidity (RH) control (2-80%), temperature control (15-30 ± 1 °C), mixing efficiency of the reactants (6-8 min), light transmittance (>90% above 290 nm), and wall loss of pollutants. The wall loss rates of the gas-phase pollutants are on the order of 10^-4 min^-1 at 298 K under dry conditions; the rate is 0.08 h^-1 for 100-500 nm particles, significantly lower than those of Teflon chambers. The photolysis rate of NO2 (J(NO2)) is automatically adjustable from 0 to 0.40 min^-1 to simulate the diurnal variation of solar irradiation. The inner surface of the chamber can be repeatedly washed with deionized water, resulting in low background contamination. Both experiments (toluene-NOx and α-pinene-ozone systems) and a box model demonstrate that this new quartz chamber can provide high-quality data for investigating SOA and O3 formation in the atmosphere.
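Wall loss rates like the 10^-4 min^-1 figures quoted above are typically extracted by fitting a first-order exponential decay to a concentration time series. A minimal sketch with synthetic data (the function name and the synthetic rate are assumptions for illustration):

```python
import numpy as np

def wall_loss_rate(times, concentrations):
    """Fit first-order decay C(t) = C0 * exp(-k t) and return k.

    times: minutes; concentrations: same-length positive values.
    A least-squares line through ln(C) versus t has slope -k.
    """
    slope, _ = np.polyfit(times, np.log(concentrations), 1)
    return -slope

# Synthetic check: decay with k = 5e-4 min^-1 observed over 8 hours.
t = np.linspace(0, 480, 50)
c = 100.0 * np.exp(-5e-4 * t)
k = wall_loss_rate(t, c)
```

Fitting in log space keeps the estimate robust to the absolute concentration scale, which is why first-order wall loss is conventionally reported as a rate constant rather than an absolute flux.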