Contextualized End-to-End Speech Recognition with Contextual Phrase Prediction Network

Author: Guo Pengcheng
Huang Kaixun
Mu Bingshen
Xie Lei
Xu Tianyi
Yang Zhanheng
Zhang Ao
Publication date: 29/05/2023

Contextual information plays a crucial role in speech recognition technologies, and incorporating it into end-to-end speech recognition models has drawn immense interest recently. However, previous deep bias methods lacked explicit supervision for bias tasks. In this study, we introduce a contextual phrase prediction network for an attention-based deep bias method. This network predicts context phrases in utterances using contextual embeddings and calculates bias loss to assist in the training of the contextualized model. Our method achieved a significant word error rate (WER) reduction across various end-to-end speech recognition models. Experiments on the LibriSpeech corpus show that our proposed model obtains a 12.1% relative WER improvement over the baseline model, and the WER of the context phrases decreases relatively by 40.5%. Moreover, by applying a context phrase filtering strategy, we also effectively eliminate the WER degradation when using a larger biasing list.
Comment: Accepted by interspeech202

arXiv.org e-Print Archive

Adaptive Contextual Biasing for Transducer Based Streaming Speech Recognition

Author: Chen Changru
Guo Pengcheng
Huang Kaixun
Li Biao
Li Chao
Xie Lei
Xu Tianyi
Yang Zhanheng
Zhang Ao
Publication date: 15/08/2023

By incorporating additional contextual information, deep biasing methods have emerged as a promising solution for speech recognition of personalized words. However, for real-world voice assistants, always biasing on such personalized words with high prediction scores can significantly degrade the performance of recognizing common words. To address this issue, we propose an adaptive contextual biasing method based on Context-Aware Transformer Transducer (CATT) that utilizes the biased encoder and predictor embeddings to perform streaming prediction of contextual phrase occurrences. Such prediction is then used to dynamically switch the bias list on and off, enabling the model to adapt to both personalized and common scenarios. Experiments on LibriSpeech and internal voice assistant datasets show that our approach can achieve up to 6.7% and 20.7% relative reduction in WER and CER, respectively, compared to the baseline, mitigating up to 96.7% and 84.9% of the relative WER and CER increase for common cases. Furthermore, our approach has a minimal performance impact in personalized scenarios while maintaining a streaming inference pipeline with negligible RTF increase.

arXiv.org e-Print Archive

Active Metamaterial Antenna with Tunable Zeroth-Order Resonances for Narrowband Internet of Things

Author: Hongtao Liu
Leimeng Yang
Yong Luo
Zhanheng Liu
Publication venue: MDPI AG
Publication date: 01/09/2023

With unique electromagnetic properties, metamaterials (MTMs) provide more freedom for antenna design, particularly when combined with active devices that enable effective tuning. By integrating the active device and the periodic cells of MTMs, the electromagnetic characteristics of individual cells can be manipulated independently, thereby realizing multiple tunable states for MTM antennas consisting of several periodic cells. In this paper, we apply active devices such as PIN diodes to each periodic cell to tune each cell independently, thereby realizing 36 tunable zeroth-order resonances (ZORs) for the metamaterial antenna with three cells in a frequency range of 4.48–5.34 GHz. Moreover, each ZOR has a bandwidth as narrow as 0.09 GHz, indicating that the tunable ZOR antenna can potentially be applied to 5G Narrowband Internet of Things (NB-IoT).

Directory of Open Access Journals

A model-based method with geometric solutions for gaze correction in eye-tracking

Author: Kai Liu
Xiaomei Yang
Xinyi Chun
Xiujuan Zheng
Zhanheng Li
Publication venue: American Institute of Mathematical Sciences (AIMS)
Publication date: 01/01/2020

The eyeball distortions caused by eye diseases, such as myopia and strabismus, can lead to deviations in eye-tracking data. In this paper, a model-based method with geometric solutions is proposed for gaze correction. The deviations of estimated gaze points are geometrically analyzed based on individual eyeball models, taking into account the distortions caused by myopia and strabismus. A set of integrated geometric solutions is derived from the varied situations, including the case of strabismus and the case of myopia and strabismus, and then used for gaze correction in eye-tracking. The experimental results demonstrate that this model-based method is effective in reducing deviations in estimated gaze points, and can be used to correct the modeling error in eye-tracking. Moreover, the proposed method has the potential to provide a simple approach to correcting eye-tracking data for various populations with eye diseases.

Crossref

Directory of Open Access Journals

Mesh Generation and Flexible Shape Comparisons for Bio-Molecules

Author: Fu Zhicheng
Gao Zhanheng
Pang Xiaoli
Rostami Reihaneh
Yu Zeyun
Publication venue: De Gruyter

Novel approaches for generating and comparing flexible (non-rigid) molecular surface meshes are developed. The mesh-generating method is fast and memory-efficient. The resulting meshes are smooth and accurate, and possess high mesh quality. An isometric-invariant shape descriptor based on the Laplace-Beltrami operator is then explored for mesh comparison. The new shape descriptor is more powerful in discriminating different surface shapes but relies only on a small set of signature values. The shape descriptor is applied to shape comparison between molecules with deformed structures. The proposed methods are implemented in a program that can be used as a stand-alone software tool or as a plug-in to other existing molecular modeling tools. In particular, the code is encapsulated into a software toolkit with a user-friendly graphical interface developed by the authors.

Directory of Open Access Journals

Feature-Based Solid Model Reconstruction

Author: Changbai Tan
Dongxiao Gu
Jun Wang
Laishui Zhou
Zeyun Yu
Zhanheng Gao
Publication venue: ASME International

Crossref