14 research outputs found

    Robust Sensor Fusion for Indoor Wireless Localization

    Full text link
    Location knowledge in indoor environment using Indoor Positioning Systems (IPS) has become very useful and popular in recent years. Indoor wireless localization suffers from severe multi-path fading and non-line-of-sight conditions. This paper presents a novel indoor localization framework based on sensor fusion of Zigbee Wireless Sensor Networks (WSN) using Received Signal Strength (RSS). The unknown position is equipped with two or more mobile nodes. The range between two mobile nodes is fixed as priori. The attitude (roll, pitch, and yaw) of the mobile node are measured by inertial sensors (ISs). Then the angle and the range between any two nodes can be obtained, and thus the path between the two nodes can be modeled as a curve. Through an efficient cooperation between two or more mobile nodes, this framework effectively exploits the RSS techniques. This constraint help improve the positioning accuracy. Theoretical analysis on localization distortion and Monte Carlo simulations shows that the proposed cooperative strategy of multiple nodes with extended Kalman filter (EKF) achieves significantly higher positioning accuracy than the existing systems, especially in heavily obstructed scenarios

    Quaternion MLP Neural Networks Based on the Maximum Correntropy Criterion

    Full text link
    We propose a gradient ascent algorithm for quaternion multilayer perceptron (MLP) networks based on the cost function of the maximum correntropy criterion (MCC). In the algorithm, we use the split quaternion activation function based on the generalized Hamilton-real quaternion gradient. By introducing a new quaternion operator, we first rewrite the early quaternion single layer perceptron algorithm. Secondly, we propose a gradient descent algorithm for quaternion multilayer perceptron based on the cost function of the mean square error (MSE). Finally, the MSE algorithm is extended to the MCC algorithm. Simulations show the feasibility of the proposed method

    Variational Bayesian Approximations Kalman Filter Based on Threshold Judgment

    Full text link
    The estimation of non-Gaussian measurement noise models is a significant challenge across various fields. In practical applications, it often faces challenges due to the large number of parameters and high computational complexity. This paper proposes a threshold-based Kalman filtering approach for online estimation of noise parameters in non-Gaussian measurement noise models. This method uses a certain amount of sample data to infer the variance threshold of observation parameters and employs variational Bayesian estimation to obtain corresponding noise variance estimates, enabling subsequent iterations of the Kalman filtering algorithm. Finally, we evaluate the performance of this algorithm through simulation experiments, demonstrating its accurate and effective estimation of state and noise parameters.Comment: 5 pages, conferenc

    DiffusionAD: Norm-guided One-step Denoising Diffusion for Anomaly Detection

    Full text link
    Anomaly detection has garnered extensive applications in real industrial manufacturing due to its remarkable effectiveness and efficiency. However, previous generative-based models have been limited by suboptimal reconstruction quality, hampering their overall performance. A fundamental enhancement lies in our reformulation of the reconstruction process using a diffusion model into a noise-to-norm paradigm. Here, anomalous regions are perturbed with Gaussian noise and reconstructed as normal, overcoming the limitations of previous models by facilitating anomaly-free restoration. Additionally, we propose a rapid one-step denoising paradigm, significantly faster than the traditional iterative denoising in diffusion models. Furthermore, the introduction of the norm-guided paradigm elevates the accuracy and fidelity of reconstructions. The segmentation sub-network predicts pixel-level anomaly scores using the input image and its anomaly-free restoration. Comprehensive evaluations on four standard and challenging benchmarks reveal that DiffusionAD outperforms current state-of-the-art approaches, demonstrating the effectiveness and broad applicability of the proposed pipeline.Comment: 14 pages, 12 figure

    Prototypical Residual Networks for Anomaly Detection and Localization

    Full text link
    Anomaly detection and localization are widely used in industrial manufacturing for its efficiency and effectiveness. Anomalies are rare and hard to collect and supervised models easily over-fit to these seen anomalies with a handful of abnormal samples, producing unsatisfactory performance. On the other hand, anomalies are typically subtle, hard to discern, and of various appearance, making it difficult to detect anomalies and let alone locate anomalous regions. To address these issues, we propose a framework called Prototypical Residual Network (PRN), which learns feature residuals of varying scales and sizes between anomalous and normal patterns to accurately reconstruct the segmentation maps of anomalous regions. PRN mainly consists of two parts: multi-scale prototypes that explicitly represent the residual features of anomalies to normal patterns; a multisize self-attention mechanism that enables variable-sized anomalous feature learning. Besides, we present a variety of anomaly generation strategies that consider both seen and unseen appearance variance to enlarge and diversify anomalies. Extensive experiments on the challenging and widely used MVTec AD benchmark show that PRN outperforms current state-of-the-art unsupervised and supervised methods. We further report SOTA results on three additional datasets to demonstrate the effectiveness and generalizability of PRN.Comment: Accepted by CVPR 202

    VIDiff: Translating Videos via Multi-Modal Instructions with Diffusion Models

    Full text link
    Diffusion models have achieved significant success in image and video generation. This motivates a growing interest in video editing tasks, where videos are edited according to provided text descriptions. However, most existing approaches only focus on video editing for short clips and rely on time-consuming tuning or inference. We are the first to propose Video Instruction Diffusion (VIDiff), a unified foundation model designed for a wide range of video tasks. These tasks encompass both understanding tasks (such as language-guided video object segmentation) and generative tasks (video editing and enhancement). Our model can edit and translate the desired results within seconds based on user instructions. Moreover, we design an iterative auto-regressive method to ensure consistency in editing and enhancing long videos. We provide convincing generative results for diverse input videos and written instructions, both qualitatively and quantitatively. More examples can be found at our website https://ChenHsing.github.io/VIDiff

    AdaDiff: Adaptive Step Selection for Fast Diffusion

    Full text link
    Diffusion models, as a type of generative models, have achieved impressive results in generating images and videos conditioned on textual conditions. However, the generation process of diffusion models involves denoising for dozens of steps to produce photorealistic images/videos, which is computationally expensive. Unlike previous methods that design ``one-size-fits-all'' approaches for speed up, we argue denoising steps should be sample-specific conditioned on the richness of input texts. To this end, we introduce AdaDiff, a lightweight framework designed to learn instance-specific step usage policies, which are then used by the diffusion model for generation. AdaDiff is optimized using a policy gradient method to maximize a carefully designed reward function, balancing inference time and generation quality. We conduct experiments on three image generation and two video generation benchmarks and demonstrate that our approach achieves similar results in terms of visual quality compared to the baseline using a fixed 50 denoising steps while reducing inference time by at least 33%, going as high as 40%. Furthermore, our qualitative analysis shows that our method allocates more steps to more informative text conditions and fewer steps to simpler text conditions.Comment: 10 pages, 5 figure

    Fuse Your Latents: Video Editing with Multi-source Latent Diffusion Models

    Full text link
    Latent Diffusion Models (LDMs) are renowned for their powerful capabilities in image and video synthesis. Yet, video editing methods suffer from insufficient pre-training data or video-by-video re-training cost. In addressing this gap, we propose FLDM (Fused Latent Diffusion Model), a training-free framework to achieve text-guided video editing by applying off-the-shelf image editing methods in video LDMs. Specifically, FLDM fuses latents from an image LDM and an video LDM during the denoising process. In this way, temporal consistency can be kept with video LDM while high-fidelity from the image LDM can also be exploited. Meanwhile, FLDM possesses high flexibility since both image LDM and video LDM can be replaced so advanced image editing methods such as InstructPix2Pix and ControlNet can be exploited. To the best of our knowledge, FLDM is the first method to adapt off-the-shelf image editing methods into video LDMs for video editing. Extensive quantitative and qualitative experiments demonstrate that FLDM can improve the textual alignment and temporal consistency of edited videos
    corecore