1,342 research outputs found

    TENT: Connect Language Models with IoT Sensors for Zero-Shot Activity Recognition

    Recent achievements in language models have showcased their extraordinary capabilities in bridging visual information with semantic language understanding. This leads us to a novel question: can language models connect textual semantics with IoT sensory signals to perform recognition tasks, e.g., Human Activity Recognition (HAR)? If so, an intelligent HAR system with human-like cognition can be built, capable of adapting to new environments and unseen categories. This paper explores its feasibility with an innovative approach, IoT-sEnsors-language alignmEnt pre-Training (TENT), which jointly aligns textual embeddings with IoT sensor signals, including camera video, LiDAR, and mmWave. Through IoT-language contrastive learning, we derive a unified semantic feature space that aligns multi-modal features with language embeddings, so that IoT data correspond to the specific words that describe them. To enhance the connection between textual categories and their IoT data, we propose supplementary descriptions and learnable prompts that bring more semantic information into the joint feature space. TENT can not only recognize actions that have been seen but also "guess" an unseen action from the closest textual words in the feature space. We demonstrate that TENT achieves state-of-the-art performance on zero-shot HAR tasks using different modalities, improving on the best vision-language models by over 12%. Comment: Preprint manuscript in submission
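The "closest textual words" step described in this abstract can be illustrated with a nearest-neighbor lookup in a shared embedding space. The following is only a minimal sketch under stated assumptions: the three-dimensional vectors, the label names, and the function `zero_shot_label` are hypothetical placeholders, not TENT's actual encoders or data, and cosine similarity is assumed as the closeness measure.

```python
import math


def cosine(u, v):
    """Cosine similarity between two equal-length, non-zero vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)


def zero_shot_label(sensor_embedding, text_embeddings):
    """Return the label whose text embedding is closest to the sensor embedding."""
    return max(
        text_embeddings,
        key=lambda label: cosine(sensor_embedding, text_embeddings[label]),
    )


# Hypothetical embeddings; in practice these would come from trained
# text and sensor encoders that share one feature space.
text_embeddings = {
    "walking": [0.9, 0.1, 0.0],
    "sitting": [0.0, 1.0, 0.1],
    "waving": [0.1, 0.0, 1.0],
}
sensor_embedding = [0.2, 0.1, 0.8]

predicted = zero_shot_label(sensor_embedding, text_embeddings)
print(predicted)
```

Because the label set is just a dictionary of text embeddings, an unseen category can be added at inference time by embedding its description, without retraining the sensor encoder.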

    A novel method for computing the Hilbert transform with Haar multiresolution approximation

    In this paper, an algorithm for computing the Hilbert transform based on the Haar multiresolution approximation is proposed and the L2-error is estimated. Experimental results show that it outperforms the library function ‘hilbert’ in Matlab (The MathWorks, Inc. 1994–2007). Finally, it is applied to compute the instantaneous phase of signals approximately and is compared with three existing methods.
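For context on the instantaneous-phase application, the sketch below uses the conventional DFT-based analytic signal, i.e. the style of baseline the abstract compares against, and not the Haar multiresolution algorithm the paper proposes. The function names are hypothetical, and a naive O(N²) DFT is used only to keep the example free of dependencies.

```python
import cmath
import math


def dft(x, inverse=False):
    """Naive discrete Fourier transform (O(N^2)); adequate for short signals."""
    n = len(x)
    sign = 1 if inverse else -1
    out = [
        sum(x[t] * cmath.exp(sign * 2j * math.pi * k * t / n) for t in range(n))
        for k in range(n)
    ]
    return [v / n for v in out] if inverse else out


def analytic_signal(x):
    """Analytic signal x + i*H[x]: zero the negative frequencies, double the positive ones."""
    n = len(x)
    weights = [0.0] * n
    weights[0] = 1.0  # the DC term is kept once
    if n % 2 == 0:
        weights[n // 2] = 1.0  # the Nyquist term is kept once
        for k in range(1, n // 2):
            weights[k] = 2.0
    else:
        for k in range(1, (n + 1) // 2):
            weights[k] = 2.0
    spectrum = dft(x)
    return dft([s * w for s, w in zip(spectrum, weights)], inverse=True)


def instantaneous_phase(x):
    """Phase angle (radians, wrapped to (-pi, pi]) of the analytic signal."""
    return [cmath.phase(z) for z in analytic_signal(x)]


# A cosine with exactly 2 cycles over 16 samples: its phase advances by pi/4 per sample.
n = 16
signal = [math.cos(2 * math.pi * 2 * t / n) for t in range(n)]
phase = instantaneous_phase(signal)
```

The imaginary part of the analytic signal is the discrete Hilbert transform of the input, so the same routine can serve as a reference when checking another Hilbert transform implementation on short test signals.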

    Diffusion Glancing Transformer for Parallel Sequence to Sequence Learning

    Previously, non-autoregressive models were widely perceived as being superior in generation efficiency but inferior in generation quality due to the difficulties of modeling multiple target modalities. To enhance the multi-modality modeling ability, we propose the diffusion glancing transformer, which employs a modality diffusion process and residual glancing sampling. The modality diffusion process is a discrete process that interpolates the multi-modal distribution along the decoding steps, and the residual glancing sampling approach guides the model to continuously learn the remaining modalities across the layers. Experimental results on various machine translation and text generation benchmarks demonstrate that DIFFGLAT achieves better generation accuracy while maintaining fast decoding speed compared with both autoregressive and non-autoregressive models. Comment: 8 pages, 7 figures

    Fault identification technology for gear tooth surface wear based on MPE method by MI and improved FNN algorithm

    Multiscale Permutation Entropy (MPE) is a nonlinear dynamic technique for measuring the randomness of time sequences and detecting their nonlinear dynamic changes, and it can be used effectively to extract the nonlinear dynamic wear fault features of a gear tooth surface from the vibration signals of a gear set. To remove the subjectivity of the threshold parameter selection process in the MPE method, this article presents a joint calculation method, based on Mutual Information (MI) and an improved False Nearest Neighbor (FNN) principle, for calculating the threshold parameters. The influence of the threshold parameters on the identification accuracy of fault features with MPE was then studied by analyzing simulation data, and the simulation analysis validates the effectiveness of the proposed MPE method. Finally, a wear failure test of a spur gear was carried out, the proposed method was applied to the experimental fault signal data, and the vibration characteristics of the fault signal were acquired. The analysis results show that the proposed method can effectively realize fault diagnosis of a gearbox and has higher fault identification accuracy than the existing methods.
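As a rough illustration of the MPE computation this abstract builds on, the sketch below coarse-grains a series at several scales and computes the normalized permutation entropy at each one. It is a minimal sketch of the standard definition, not the article's method: the parameter names `m` (embedding dimension) and `tau` (time delay) are assumptions, they are supplied by hand here, and the MI and improved-FNN selection procedure is not implemented.

```python
import math
from collections import Counter


def coarse_grain(series, scale):
    """Average consecutive, non-overlapping windows of length `scale`."""
    n = len(series) // scale
    return [sum(series[i * scale:(i + 1) * scale]) / scale for i in range(n)]


def permutation_entropy(series, m, tau):
    """Normalized permutation entropy in [0, 1] for dimension m and delay tau."""
    n_vectors = len(series) - (m - 1) * tau
    if n_vectors <= 0:
        raise ValueError("series too short for the given m and tau")
    counts = Counter()
    for i in range(n_vectors):
        window = [series[i + j * tau] for j in range(m)]
        # The ordinal pattern is the index order that sorts the window.
        pattern = tuple(sorted(range(m), key=lambda j: window[j]))
        counts[pattern] += 1
    entropy = -sum((c / n_vectors) * math.log(c / n_vectors) for c in counts.values())
    return entropy / math.log(math.factorial(m))


def multiscale_permutation_entropy(series, m, tau, max_scale):
    """Permutation entropy of the coarse-grained series at scales 1..max_scale."""
    return [
        permutation_entropy(coarse_grain(series, s), m, tau)
        for s in range(1, max_scale + 1)
    ]


# Short toy series; m and tau are picked by hand for illustration only.
example = [4, 7, 9, 10, 6, 11, 3]
pe = permutation_entropy(example, m=3, tau=1)
```

A strictly increasing series produces a single ordinal pattern and therefore an entropy of zero at every scale, while an irregular signal moves toward one; this contrast is what makes the measure usable as a fault feature.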

    An Efficient Feature Extraction Scheme for Mobile Anti-Shake in Augmented Reality

    In recent years, augmented reality on mobile devices has become popular. Device shake is the most typical type of interference in mobile augmented reality, so anti-shake is an urgent requirement. To enhance anti-shake efficiency, we propose an efficient feature extraction scheme for mobile anti-shake in augmented reality. The scheme detects corners directly to avoid the non-extreme constraint, which improves the efficiency of feature extraction. Meanwhile, the scheme updates only the added corners during mobile shakes, which improves the accuracy of feature extraction. In the experiments, the memory consumption of existing methods is almost double that of our scheme, and the runtime of our scheme is only half that of the existing methods. The experimental results demonstrate that our scheme performs better than the existing classic methods on mobile anti-shake in terms of memory consumption, efficiency, and accuracy.