30 research outputs found

    Contextualized End-to-End Speech Recognition with Contextual Phrase Prediction Network

    Contextual information plays a crucial role in speech recognition, and incorporating it into end-to-end speech recognition models has drawn immense interest recently. However, previous deep bias methods lacked explicit supervision for the bias task. In this study, we introduce a contextual phrase prediction network for an attention-based deep bias method. This network predicts the context phrases in an utterance using contextual embeddings and calculates a bias loss to assist in training the contextualized model. Our method achieved a significant word error rate (WER) reduction across various end-to-end speech recognition models. Experiments on the LibriSpeech corpus show that our proposed model obtains a 12.1% relative WER improvement over the baseline model, and the WER of the context phrases decreases by a relative 40.5%. Moreover, by applying a context phrase filtering strategy, we also effectively eliminate the WER degradation when using a larger biasing list. Comment: Accepted by interspeech202
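The relative WER figures quoted in the abstract above follow the usual definitions: WER is the word-level edit distance divided by the reference length, and a relative improvement is the drop in WER divided by the baseline WER. The sketch below illustrates both under those assumptions. The function names and the example numbers are hypothetical and are not taken from the paper.

```python
def word_error_rate(reference: str, hypothesis: str) -> float:
    """Word-level Levenshtein distance divided by the number of reference words."""
    ref = reference.split()
    hyp = hypothesis.split()
    if not ref:
        raise ValueError("reference must contain at least one word")
    # prev[j] holds the edit distance between the reference prefix seen so far
    # and the first j hypothesis words.
    prev = list(range(len(hyp) + 1))
    for i, ref_word in enumerate(ref, start=1):
        curr = [i] + [0] * len(hyp)
        for j, hyp_word in enumerate(hyp, start=1):
            substitution = prev[j - 1] + (0 if ref_word == hyp_word else 1)
            deletion = prev[j] + 1
            insertion = curr[j - 1] + 1
            curr[j] = min(substitution, deletion, insertion)
        prev = curr
    return prev[-1] / len(ref)


def relative_reduction(baseline: float, improved: float) -> float:
    """Relative improvement in percent: 100 * (baseline - improved) / baseline."""
    if baseline <= 0:
        raise ValueError("baseline must be positive")
    return 100.0 * (baseline - improved) / baseline


# Hypothetical numbers: a baseline WER of 10.00% falling to 8.79%
# corresponds to a 12.1% relative improvement.
print(round(relative_reduction(10.0, 8.79), 1))
```

A relative figure says nothing about the absolute WER on its own, which is why the two are usually reported together.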

    Adaptive Contextual Biasing for Transducer Based Streaming Speech Recognition

    By incorporating additional contextual information, deep biasing methods have emerged as a promising solution for speech recognition of personalized words. However, for real-world voice assistants, always biasing on such personalized words with high prediction scores can significantly degrade the performance of recognizing common words. To address this issue, we propose an adaptive contextual biasing method based on the Context-Aware Transformer Transducer (CATT) that utilizes the biased encoder and predictor embeddings to perform streaming prediction of contextual phrase occurrences. This prediction is then used to dynamically switch the bias list on and off, enabling the model to adapt to both personalized and common scenarios. Experiments on LibriSpeech and internal voice assistant datasets show that our approach can achieve up to 6.7% and 20.7% relative reduction in WER and CER, respectively, compared to the baseline, mitigating up to 96.7% and 84.9% of the relative WER and CER increase for common cases. Furthermore, our approach has a minimal performance impact in personalized scenarios while maintaining a streaming inference pipeline with negligible RTF increase.
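The switching idea in the abstract above can be sketched as a simple gate: a per-step estimate of whether a contextual phrase is occurring decides whether the bias list is passed to the recognizer at that step. This is only an illustration under stated assumptions. The function name `gate_bias_list`, the fixed threshold, and the precomputed probabilities are hypothetical; the paper derives its prediction from the biased encoder and predictor embeddings, which is not modeled here.

```python
from typing import Iterable, Iterator, List


def gate_bias_list(
    occurrence_probs: Iterable[float],
    bias_list: List[str],
    threshold: float = 0.5,  # hypothetical cut-off, not a value from the paper
) -> Iterator[List[str]]:
    """Yield the bias list to use at each streaming step.

    The full list is yielded when the predicted probability that a
    contextual phrase is occurring reaches the threshold; otherwise an
    empty list is yielded, which switches biasing off for that step.
    """
    for prob in occurrence_probs:
        yield list(bias_list) if prob >= threshold else []


# Hypothetical per-step probabilities from an occurrence predictor.
for step, active in enumerate(gate_bias_list([0.1, 0.8, 0.5], ["john smith"])):
    print(step, active)
```

Writing the gate as a generator keeps the decision per step, so it fits a streaming pipeline without buffering the whole utterance.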

    Active Metamaterial Antenna with Tunable Zeroth-Order Resonances for Narrowband Internet of Things

    With unique electromagnetic properties, metamaterials (MTMs) provide more freedom for antenna design, particularly when combined with active devices that enable effective tuning. By integrating active devices into the periodic cells of MTMs, the electromagnetic characteristics of individual cells can be manipulated independently, thereby realizing multiple tunable states for MTM antennas consisting of several periodic cells. In this paper, we add active devices such as PIN diodes to each periodic cell to tune each cell independently, thereby realizing 36 tunable zeroth-order resonances (ZORs) for the metamaterial antenna with three cells in a frequency range of 4.48–5.34 GHz. Moreover, each ZOR has a bandwidth as narrow as 0.09 GHz, indicating that the tunable ZOR antenna can potentially be applied to 5G Narrowband Internet of Things (NB-IoT).

    A model-based method with geometric solutions for gaze correction in eye-tracking

    Eyeball distortions caused by eye conditions such as myopia and strabismus can lead to deviations in eye-tracking data. In this paper, a model-based method with geometric solutions is proposed for gaze correction. The deviations of estimated gaze points are geometrically analyzed based on individual eyeball models that account for the distortions caused by myopia and strabismus. A set of integrated geometric solutions is derived for the different situations, including the case of strabismus alone and the case of myopia combined with strabismus, and then used for gaze correction in eye-tracking. The experimental results demonstrate that this model-based method is effective in reducing deviations in estimated gaze points, and can be used to correct the modeling error in eye-tracking. Moreover, the proposed method has the potential to provide a simple approach to correcting eye-tracking data for various populations with eye diseases.

    Mesh Generation and Flexible Shape Comparisons for Bio-Molecules

    Novel approaches for generating and comparing flexible (non-rigid) molecular surface meshes are developed. The mesh-generating method is fast and memory-efficient. The resulting meshes are smooth and accurate, and possess high mesh quality. An isometric-invariant shape descriptor based on the Laplace-Beltrami operator is then explored for mesh comparison. The new shape descriptor is more powerful in discriminating different surface shapes but relies only on a small set of signature values. The shape descriptor is applied to shape comparison between molecules with deformed structures. The proposed methods are implemented in a program that can be used as a stand-alone software tool or as a plug-in to other existing molecular modeling tools. In particular, the code is encapsulated in a software toolkit with a user-friendly graphical interface developed by the authors.

    Feature-Based Solid Model Reconstruction
