A Quantum Kernel Learning Approach to Acoustic Modeling for Spoken Command Recognition
We propose a quantum kernel learning (QKL) framework to address the inherent
data sparsity issues often encountered in training large-scale acoustic models
in low-resource scenarios. We project acoustic features based on
classical-to-quantum feature encoding. Different from existing quantum
convolution techniques, we utilize QKL with features in the quantum space to
design kernel-based classifiers. Experimental results on challenging spoken
command recognition tasks for a few low-resource languages, such as Arabic,
Georgian, Chuvash, and Lithuanian, show that the proposed QKL-based hybrid
approach attains good improvements over existing classical and quantum
solutions.
Comment: Submitted to ICASSP 202
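The abstract does not specify the encoding or classifier in detail; below is a minimal classical simulation sketch of the general idea, assuming angle encoding of features into product states, the resulting fidelity kernel K(x, y) = ∏ᵢ cos²((xᵢ − yᵢ)/2), and kernel ridge regression as a stand-in kernel-based classifier. All data and dimensions are illustrative, not from the paper.

```python
import numpy as np

def fidelity_kernel(X, Y):
    """Fidelity kernel of angle-encoded product states:
    K(x, y) = prod_i cos^2((x_i - y_i) / 2)."""
    diff = X[:, None, :] - Y[None, :, :]
    return np.prod(np.cos(diff / 2.0) ** 2, axis=-1)

def kernel_ridge_fit(K, y, lam=1e-3):
    # Solve (K + lam * I) alpha = y for the dual coefficients.
    return np.linalg.solve(K + lam * np.eye(len(K)), y)

def kernel_ridge_predict(K_test_train, alpha):
    return K_test_train @ alpha

# Toy example: two well-separated classes of 4-dim "acoustic features".
rng = np.random.default_rng(0)
X_train = np.concatenate([rng.normal(0.0, 0.3, (20, 4)),
                          rng.normal(1.5, 0.3, (20, 4))])
y_train = np.array([-1.0] * 20 + [1.0] * 20)

K = fidelity_kernel(X_train, X_train)
alpha = kernel_ridge_fit(K, y_train)

X_test = np.array([[0.0, 0.0, 0.0, 0.0],
                   [1.5, 1.5, 1.5, 1.5]])
scores = kernel_ridge_predict(fidelity_kernel(X_test, X_train), alpha)
print(np.sign(scores))
```

The point of the kernel-based route is that only pairwise state overlaps are needed, which is what a quantum device (or, as here, its classical simulation) would supply.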
Low Altitude Air-to-Ground Channel Characterization in LTE Network
Low-altitude unmanned aerial vehicle (UAV)-aided applications are promising for future-generation communication systems. In this paper, a recently conducted measurement campaign for characterizing the low-altitude air-to-ground (A2G) channel in a typical Long Term Evolution (LTE) network is introduced. Five horizontal flights were conducted at heights of 15, 30, 50, 75, and 100 m, respectively. The real-time LTE downlink signal was recorded using a Universal Software Radio Peripheral (USRP)-based channel sounder onboard the UAV. Channel impulse responses (CIRs) are extracted from the cell-specific signals in the recorded downlink data. To shed light on the physical propagation mechanisms, propagation graph simulation is exploited. Moreover, path loss at different heights is investigated and compared based on the empirical data. The simulated and empirical results provide valuable understanding of low-altitude A2G channels.
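The abstract does not state which path loss model the campaign uses; a common choice for empirical data of this kind is the log-distance model PL(d) = PL₀ + 10·n·log₁₀(d/d₀), fitted by least squares. The sketch below uses invented measurement numbers purely to illustrate the fitting step.

```python
import numpy as np

# Hypothetical CIR-derived measurements: link distance (m) and path loss (dB).
# These numbers are illustrative, not from the measurement campaign.
d0 = 1.0  # reference distance in metres
distances = np.array([50.0, 100.0, 200.0, 400.0, 800.0])
measured_pl = np.array([78.1, 84.3, 90.0, 96.2, 102.4])

# Least-squares fit of the intercept PL0 and the path loss exponent n in
# PL(d) = PL0 + 10 * n * log10(d / d0).
A = np.column_stack([np.ones_like(distances),
                     10.0 * np.log10(distances / d0)])
(pl0, n), *_ = np.linalg.lstsq(A, measured_pl, rcond=None)
print(f"PL0 = {pl0:.1f} dB, path loss exponent n = {n:.2f}")
```

Repeating the fit per flight height is one way to compare path loss behaviour across the 15–100 m altitudes mentioned in the abstract.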
How to Estimate Model Transferability of Pre-Trained Speech Models?
In this work, we introduce a ``score-based assessment'' framework for
estimating the transferability of pre-trained speech models (PSMs) for
fine-tuning target tasks. We leverage upon two representation theories,
Bayesian likelihood estimation and optimal transport, to generate rank scores
for the PSM candidates using the extracted representations. Our framework
efficiently computes transferability scores without actual fine-tuning of
candidate models or layers by making a temporal independent hypothesis. We
evaluate some popular supervised speech models (e.g., Conformer RNN-Transducer)
and self-supervised speech models (e.g., HuBERT) in cross-layer and cross-model
settings using public data. Experimental results show a high Spearman's rank
correlation and low p-value between our estimation framework and the fine-tuning
ground truth. Our proposed transferability framework requires less
computational time and resources, making it a resource-saving and
time-efficient approach for tuning speech foundation models.
Comment: Accepted to Interspeech. Code will be released.
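The reported agreement metric, Spearman's rank correlation, is just the Pearson correlation of the two rankings. A minimal sketch, with hypothetical transferability scores and fine-tuned accuracies (illustrative numbers, not from the paper):

```python
import numpy as np

def spearman_rho(a, b):
    """Spearman's rank correlation: Pearson correlation of the ranks.
    (No tie handling; scipy.stats.spearmanr covers the general case.)"""
    ra = np.argsort(np.argsort(a)).astype(float)  # rank of each element
    rb = np.argsort(np.argsort(b)).astype(float)
    ra -= ra.mean()
    rb -= rb.mean()
    return float((ra @ rb) / np.sqrt((ra @ ra) * (rb @ rb)))

# Hypothetical scores for five candidate PSMs vs. their fine-tuned accuracies.
scores = np.array([0.12, 0.55, 0.30, 0.80, 0.41])
accuracy = np.array([61.0, 74.5, 66.2, 79.3, 70.1])
rho = spearman_rho(scores, accuracy)
print(rho)  # 1.0 here: the two rankings agree exactly
```

A rho near 1 means the cheap score orders candidate models the same way full fine-tuning would, which is exactly the property the framework is evaluated on.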
C-reactive protein levels after 4 types of arthroplasty
Background and purpose: Postoperative C-reactive protein (CRP) levels in serum appear to reflect surgical trauma. We examined CRP levels after 4 types of arthroplasty.
A Comparative Study on Transformer vs RNN in Speech Applications
Sequence-to-sequence models have been widely used in end-to-end speech
processing, for example, automatic speech recognition (ASR), speech translation
(ST), and text-to-speech (TTS). This paper focuses on an emergent
sequence-to-sequence model called Transformer, which achieves state-of-the-art
performance in neural machine translation and other natural language processing
applications. We undertook intensive studies in which we experimentally
compared and analyzed Transformer and conventional recurrent neural networks
(RNN) in a total of 15 ASR, one multilingual ASR, one ST, and two TTS
benchmarks. Our experiments revealed various training tips and significant
performance benefits obtained with Transformer for each task including the
surprising superiority of Transformer in 13/15 ASR benchmarks in comparison
with RNN. We are preparing to release Kaldi-style reproducible recipes using
open-source and publicly available datasets for all the ASR, ST, and TTS tasks
so that the community can reproduce our results.
Comment: Accepted at ASRU 201
Google USM: Scaling Automatic Speech Recognition Beyond 100 Languages
We introduce the Universal Speech Model (USM), a single large model that
performs automatic speech recognition (ASR) across 100+ languages. This is
achieved by pre-training the encoder of the model on a large unlabeled
multilingual dataset of 12 million (M) hours spanning over 300 languages, and
fine-tuning on a smaller labeled dataset. We use multilingual pre-training with
random-projection quantization and speech-text modality matching to achieve
state-of-the-art performance on downstream multilingual ASR and speech-to-text
translation tasks. We also demonstrate that despite using a labeled training
set 1/7-th the size of that used for the Whisper model, our model exhibits
comparable or better performance on both in-domain and out-of-domain speech
recognition tasks across many languages.
Comment: 20 pages, 7 figures, 8 tables
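Random-projection quantization (as in BEST-RQ-style pre-training) turns continuous speech features into discrete pre-training targets using a frozen random projection and a frozen random codebook. The sketch below is a minimal illustration; the dimensions, codebook size, and distance metric are assumptions, not USM's actual configuration.

```python
import numpy as np

rng = np.random.default_rng(0)
feat_dim, proj_dim, codebook_size = 80, 16, 256

# Both the projection and the codebook are randomly initialized and frozen:
# they are never trained.
projection = rng.normal(size=(feat_dim, proj_dim))
codebook = rng.normal(size=(codebook_size, proj_dim))

def quantize(frames):
    """Map each feature frame to the index of its nearest codebook entry."""
    z = frames @ projection                                # (T, proj_dim)
    d2 = ((z[:, None, :] - codebook[None, :, :]) ** 2).sum(axis=-1)
    return d2.argmin(axis=1)                               # (T,) int labels

frames = rng.normal(size=(10, feat_dim))  # 10 hypothetical feature frames
labels = quantize(frames)
print(labels.shape)
```

The discrete labels then serve as prediction targets for the encoder on unlabeled audio, which is what lets pre-training scale to millions of hours without transcripts.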
The mood stabilizer lamotrigine produces antidepressant behavioral effects in rats: role of brain-derived neurotrophic factor
The anticonvulsant drug lamotrigine has been shown to produce strong antidepressant effects in the treatment of patients with bipolar disorder. However, to date there are few preclinical reports on its behavioral actions in animal models of depression or its underlying molecular mechanisms. The current study investigated the effects of lamotrigine in the forced swimming test and the learned helplessness test. The results demonstrate that both 15 and 30 mg/kg acute treatment of lamotrigine significantly reduced immobility in the forced swimming test without affecting locomotor activity. Sub-chronic twice-daily injections of 30 mg/kg lamotrigine robustly decreased escape failures in animals that had developed learned helplessness symptoms. In parallel, the sub-chronic lamotrigine treatment also up-regulated frontal and hippocampal brain-derived neurotrophic factor expression in both naive and stressed animals and restored the stress-induced down-regulation of brain-derived neurotrophic factor expression. This study provides further evidence for the use of lamotrigine as a novel antidepressant in the treatment of bipolar disorders.