14 research outputs found
Robust Sensor Fusion for Indoor Wireless Localization
Location awareness in indoor environments, provided by Indoor Positioning
Systems (IPS), has become very useful and popular in recent years. Indoor
wireless localization, however, suffers from severe multi-path fading and
non-line-of-sight conditions. This paper presents a novel indoor localization
framework based on sensor fusion in Zigbee Wireless Sensor Networks (WSN) using
Received Signal Strength (RSS). The target at the unknown position is equipped
with two or more mobile nodes, and the range between the two mobile nodes is
fixed a priori. The attitude (roll, pitch, and yaw) of each mobile node is
measured by inertial sensors (ISs). The angle and the range between any two
nodes can then be obtained, so the path between the two nodes can be modeled as
a curve. Through efficient cooperation between two or more mobile nodes, this
framework effectively exploits RSS techniques, and the fixed-range constraint
helps improve the positioning accuracy. Theoretical analysis of localization
distortion and Monte Carlo simulations show that the proposed cooperative
strategy of multiple nodes with an extended Kalman filter (EKF) achieves
significantly higher positioning accuracy than existing systems, especially in
heavily obstructed scenarios.
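
As a rough illustration of the fusion step, the sketch below implements a
generic EKF measurement update for RSS-based 2-D positioning under a
log-distance path-loss model. The anchor layout, the path-loss constants P0 and
ETA, and all function names are illustrative assumptions, not the paper's
cooperative multi-node design.

```python
import numpy as np

P0, ETA = -40.0, 2.5  # assumed RSS (dBm) at 1 m and path-loss exponent

def rss_model(pos, anchor):
    """Predicted RSS (dBm) under the log-distance path-loss model."""
    d = np.linalg.norm(pos - anchor)
    return P0 - 10.0 * ETA * np.log10(max(d, 1e-6))

def rss_jacobian(pos, anchor):
    """Row of the measurement Jacobian: d(RSS)/d(pos)."""
    diff = pos - anchor
    d2 = max(float(diff @ diff), 1e-12)
    return (-10.0 * ETA / np.log(10.0)) * diff / d2

def ekf_update(x, P, z, anchors, R):
    """One EKF measurement update from RSS readings of all anchors."""
    h = np.array([rss_model(x, a) for a in anchors])
    H = np.vstack([rss_jacobian(x, a) for a in anchors])
    S = H @ P @ H.T + R                  # innovation covariance
    K = P @ H.T @ np.linalg.inv(S)       # Kalman gain
    x = x + K @ (z - h)
    P = (np.eye(len(x)) - K @ H) @ P
    return x, P
```
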
Quaternion MLP Neural Networks Based on the Maximum Correntropy Criterion
We propose a gradient ascent algorithm for quaternion multilayer perceptron
(MLP) networks based on the cost function of the maximum correntropy criterion
(MCC). The algorithm uses a split quaternion activation function together with
the generalized Hamilton-real (GHR) quaternion gradient. By introducing a new
quaternion operator, we first rewrite the earlier quaternion single-layer
perceptron algorithm. Second, we propose a gradient descent algorithm for the
quaternion multilayer perceptron based on the mean square error (MSE) cost
function. Finally, the MSE algorithm is extended to the MCC algorithm.
Simulations show the feasibility of the proposed method.
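
For intuition, the sketch below shows the MCC cost on real-valued errors and a
single gradient-ascent step for a linear unit. The kernel width sigma, the
learning rate, and the plain real-valued gradient are illustrative assumptions;
the paper's GHR quaternion calculus is not reproduced here.

```python
import numpy as np

def correntropy(err, sigma=1.0):
    """Empirical correntropy: mean Gaussian kernel of the errors."""
    return np.mean(np.exp(-err ** 2 / (2.0 * sigma ** 2)))

def correntropy_grad(err, sigma=1.0):
    """Gradient of the correntropy with respect to each error term."""
    k = np.exp(-err ** 2 / (2.0 * sigma ** 2))
    return -k * err / (sigma ** 2 * len(err))

def ascent_step(w, X, d, lr=0.1, sigma=1.0):
    """One gradient-ascent step on a linear unit y = X @ w.

    err = d - X @ w, so d(err)/d(w) = -X and the chain rule gives
    dV/dw = -X.T @ dV/d(err); ascent moves w along +dV/dw.
    """
    err = d - X @ w
    return w + lr * (-X.T @ correntropy_grad(err, sigma))
```
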
Variational Bayesian Approximations Kalman Filter Based on Threshold Judgment
The estimation of non-Gaussian measurement noise models is a significant
challenge across various fields, and in practical applications it is often
hampered by the large number of parameters and the high computational
complexity involved. This paper proposes a threshold-based Kalman filtering
approach for online estimation of noise parameters in non-Gaussian measurement
noise models. The method uses a certain amount of sample data to infer a
variance threshold for the observation parameters and employs variational
Bayesian estimation to obtain the corresponding noise variance estimates,
enabling subsequent iterations of the Kalman filtering algorithm. Finally, we
evaluate the performance of this algorithm through simulation experiments,
demonstrating its accurate and effective estimation of state and noise
parameters.
Comment: 5 pages, conference
DiffusionAD: Norm-guided One-step Denoising Diffusion for Anomaly Detection
Anomaly detection has garnered extensive applications in real industrial
manufacturing due to its remarkable effectiveness and efficiency. However,
previous generative models have been limited by suboptimal reconstruction
quality, hampering their overall performance. Our fundamental enhancement is to
reformulate the reconstruction process with a diffusion model into a
noise-to-norm paradigm: anomalous regions are perturbed with Gaussian noise and
reconstructed as normal, overcoming the limitations of previous
models by facilitating anomaly-free restoration. Additionally, we propose a
rapid one-step denoising paradigm, significantly faster than the traditional
iterative denoising in diffusion models. Furthermore, the introduction of the
norm-guided paradigm elevates the accuracy and fidelity of reconstructions. The
segmentation sub-network predicts pixel-level anomaly scores using the input
image and its anomaly-free restoration. Comprehensive evaluations on four
standard and challenging benchmarks reveal that DiffusionAD outperforms current
state-of-the-art approaches, demonstrating the effectiveness and broad
applicability of the proposed pipeline.
Comment: 14 pages, 12 figures
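
To make the noise-to-norm idea concrete, the sketch below perturbs an input
with Gaussian noise at timestep t and recovers an anomaly-free estimate in a
single step from the predicted noise, scoring pixels by the reconstruction gap.
The `denoiser` network, the DDPM schedule `alphas_cumprod`, and the simple
absolute-difference score (the paper instead uses a segmentation sub-network)
are illustrative assumptions.

```python
import torch

def one_step_noise_to_norm(x, denoiser, t, alphas_cumprod):
    """x: (B, C, H, W) images; returns reconstruction and pixel scores."""
    a_bar = alphas_cumprod[t]
    eps = torch.randn_like(x)
    # Forward perturbation: x_t = sqrt(a_bar) * x0 + sqrt(1 - a_bar) * eps.
    x_t = a_bar.sqrt() * x + (1.0 - a_bar).sqrt() * eps
    eps_hat = denoiser(x_t, t)               # predicted noise
    # One-step estimate of the anomaly-free image x0.
    x0_hat = (x_t - (1.0 - a_bar).sqrt() * eps_hat) / a_bar.sqrt()
    score = (x - x0_hat).abs().mean(dim=1, keepdim=True)  # anomaly map
    return x0_hat, score
```
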
Prototypical Residual Networks for Anomaly Detection and Localization
Anomaly detection and localization are widely used in industrial manufacturing
for their efficiency and effectiveness. Anomalies are rare and hard to collect,
and with only a handful of abnormal samples, supervised models easily overfit
to the seen anomalies, producing unsatisfactory performance. On the other hand,
anomalies are typically subtle, hard to discern, and varied in appearance,
making it difficult to detect anomalies, let alone locate anomalous regions. To
address these issues, we propose a framework called
Prototypical Residual Network (PRN), which learns feature residuals of varying
scales and sizes between anomalous and normal patterns to accurately
reconstruct the segmentation maps of anomalous regions. PRN mainly consists of
two parts: multi-scale prototypes that explicitly represent the residual
features of anomalies relative to normal patterns, and a multi-size
self-attention mechanism that enables variable-sized anomalous feature
learning. In addition, we present a
variety of anomaly generation strategies that consider both seen and unseen
appearance variance to enlarge and diversify anomalies. Extensive experiments
on the challenging and widely used MVTec AD benchmark show that PRN outperforms
current state-of-the-art unsupervised and supervised methods. We further report
SOTA results on three additional datasets to demonstrate the effectiveness and
generalizability of PRN.
Comment: Accepted by CVPR 2023
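
As a rough sketch of the residual idea at a single scale, the code below
subtracts the nearest learned "normal" prototype from each feature vector so
that anomalous patterns show up as large residuals. The shapes, the
nearest-prototype rule, and all names are illustrative assumptions rather than
PRN's exact design.

```python
import torch

def prototype_residual(feat, prototypes):
    """feat: (B, C, H, W) features; prototypes: (K, C) normal patterns."""
    B, C, H, W = feat.shape
    f = feat.permute(0, 2, 3, 1).reshape(-1, C)   # (B*H*W, C) vectors
    d = torch.cdist(f, prototypes)                # distances to prototypes
    nearest = prototypes[d.argmin(dim=1)]         # closest normal pattern
    res = (f - nearest).reshape(B, H, W, C).permute(0, 3, 1, 2)
    return res                                    # residual feature map
```
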
VIDiff: Translating Videos via Multi-Modal Instructions with Diffusion Models
Diffusion models have achieved significant success in image and video
generation. This motivates a growing interest in video editing tasks, where
videos are edited according to provided text descriptions. However, most
existing approaches only focus on video editing for short clips and rely on
time-consuming tuning or inference. We are the first to propose Video
Instruction Diffusion (VIDiff), a unified foundation model designed for a wide
range of video tasks. These tasks encompass both understanding tasks (such as
language-guided video object segmentation) and generative tasks (video editing
and enhancement). Our model can edit a video and translate it into the desired
result within seconds, based on user instructions. Moreover, we design an
iterative
auto-regressive method to ensure consistency in editing and enhancing long
videos. We provide convincing generative results for diverse input videos and
written instructions, both qualitatively and quantitatively. More examples can
be found at our website https://ChenHsing.github.io/VIDiff
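
A minimal sketch of the iterative auto-regressive idea for long videos: edit
the video clip by clip, conditioning each clip on the tail of the previous
edited clip so that long-range consistency is preserved. The `edit_clip`
callable, the clip length, and the overlap are illustrative assumptions.

```python
def edit_long_video(frames, instruction, edit_clip, clip_len=16, overlap=2):
    """Edit a long frame list clip by clip, auto-regressively."""
    edited, prev_tail = [], None
    for start in range(0, len(frames), clip_len - overlap):
        clip = frames[start:start + clip_len]
        if not clip:
            break
        out = edit_clip(clip, instruction, condition=prev_tail)
        # Drop frames re-edited in the overlap with the previous clip.
        edited.extend(out if prev_tail is None else out[overlap:])
        prev_tail = out[-overlap:]
    return edited
```
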
AdaDiff: Adaptive Step Selection for Fast Diffusion
Diffusion models, as a type of generative model, have achieved impressive
results in generating images and videos conditioned on text.
However, the generation process of diffusion models involves denoising for
dozens of steps to produce photorealistic images/videos, which is
computationally expensive. Unlike previous methods that design
"one-size-fits-all" approaches to speed this up, we argue that the number of
denoising steps should be sample-specific, conditioned on the richness of the
input text. To this end, we
introduce AdaDiff, a lightweight framework designed to learn instance-specific
step usage policies, which are then used by the diffusion model for generation.
AdaDiff is optimized using a policy gradient method to maximize a carefully
designed reward function, balancing inference time and generation quality. We
conduct experiments on three image generation and two video generation
benchmarks and demonstrate that our approach achieves visual quality similar to
a baseline that uses a fixed 50 denoising steps, while reducing inference time
by at least 33% and by as much as 40%.
Furthermore, our qualitative analysis shows that our method allocates more
steps to more informative text conditions and fewer steps to simpler text
conditions.
Comment: 10 pages, 5 figures
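
To illustrate the step-selection idea, the sketch below trains a tiny
categorical policy over candidate step counts with REINFORCE, using a reward
that trades generation quality against the number of steps. The candidate set,
the reward weight `lam`, and the external `quality` scorer are illustrative
assumptions, not AdaDiff's actual reward design.

```python
import torch
import torch.nn as nn

CANDIDATE_STEPS = [10, 20, 30, 40, 50]   # assumed menu of step budgets

class StepPolicy(nn.Module):
    """Scores candidate denoising-step counts from a text embedding."""
    def __init__(self, text_dim):
        super().__init__()
        self.head = nn.Linear(text_dim, len(CANDIDATE_STEPS))

    def forward(self, text_emb):
        return torch.distributions.Categorical(logits=self.head(text_emb))

def reinforce_loss(policy, text_emb, quality, lam=0.01):
    """quality: callable mapping a step count to a scalar quality score."""
    dist = policy(text_emb)
    action = dist.sample()
    steps = CANDIDATE_STEPS[action]
    reward = quality(steps) - lam * steps    # quality minus time penalty
    return -dist.log_prob(action) * reward   # minimizing this ascends reward
```
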
Fuse Your Latents: Video Editing with Multi-source Latent Diffusion Models
Latent Diffusion Models (LDMs) are renowned for their powerful capabilities
in image and video synthesis. Yet, video editing methods suffer from
insufficient pre-training data or high video-by-video re-training costs. To
address this gap, we propose FLDM (Fused Latent Diffusion Model), a
training-free framework that achieves text-guided video editing by applying
off-the-shelf image editing methods in video LDMs. Specifically, FLDM fuses
latents from an image LDM and a video LDM during the denoising process. In this
way, the temporal consistency of the video LDM is preserved while the high
fidelity of the image LDM is also exploited. Meanwhile, FLDM is highly
flexible, since both the image LDM and the video LDM can be replaced, so
advanced image editing methods such as InstructPix2Pix and ControlNet can be
exploited.
To the best of our knowledge, FLDM is the first method to adapt off-the-shelf
image editing methods into video LDMs for video editing. Extensive quantitative
and qualitative experiments demonstrate that FLDM can improve the textual
alignment and temporal consistency of edited videos.
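
A minimal sketch of the fusion step, assuming two pre-built denoising
callables: at every timestep, the latents proposed by the image LDM and the
video LDM are mixed with a weight `alpha`, trading per-frame fidelity against
temporal consistency. The callables, the fixed mixing weight, and the loop
structure are illustrative assumptions.

```python
def fused_denoising(z, timesteps, image_step, video_step, alpha=0.5):
    """z: initial noisy video latent of shape (frames, C, H, W)."""
    for t in timesteps:                # e.g. a reversed DDIM schedule
        z_img = image_step(z, t)       # frame-wise image-LDM update
        z_vid = video_step(z, t)       # temporally-aware video-LDM update
        z = alpha * z_img + (1.0 - alpha) * z_vid   # fuse the latents
    return z
```
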