A Quantum Kernel Learning Approach to Acoustic Modeling for Spoken Command Recognition
We propose a quantum kernel learning (QKL) framework to address the inherent
data sparsity issues often encountered in training large-scale acoustic models
in low-resource scenarios. We project acoustic features based on
classical-to-quantum feature encoding. Different from existing quantum
convolution techniques, we utilize QKL with features in the quantum space to
design kernel-based classifiers. Experimental results on challenging spoken
command recognition tasks for a few low-resource languages, such as Arabic,
Georgian, Chuvash, and Lithuanian, show that the proposed QKL-based hybrid
approach attains good improvements over existing classical and quantum
solutions.
Comment: Submitted to ICASSP 202
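The abstract does not specify the encoding or classifier in detail; below is a minimal classical simulation sketch of the general idea, assuming angle encoding of features into product states, the resulting fidelity kernel K(x, y) = ∏ᵢ cos²((xᵢ − yᵢ)/2), and kernel ridge regression as a stand-in kernel-based classifier. All data and dimensions are illustrative, not from the paper.

```python
import numpy as np

def fidelity_kernel(X, Y):
    """Fidelity kernel of angle-encoded product states:
    K(x, y) = prod_i cos^2((x_i - y_i) / 2)."""
    diff = X[:, None, :] - Y[None, :, :]
    return np.prod(np.cos(diff / 2.0) ** 2, axis=-1)

def kernel_ridge_fit(K, y, lam=1e-3):
    # Solve (K + lam * I) alpha = y for the dual coefficients.
    return np.linalg.solve(K + lam * np.eye(len(K)), y)

def kernel_ridge_predict(K_test_train, alpha):
    return K_test_train @ alpha

# Toy example: two well-separated classes of 4-dim "acoustic features".
rng = np.random.default_rng(0)
X_train = np.concatenate([rng.normal(0.0, 0.3, (20, 4)),
                          rng.normal(1.5, 0.3, (20, 4))])
y_train = np.array([-1.0] * 20 + [1.0] * 20)

K = fidelity_kernel(X_train, X_train)
alpha = kernel_ridge_fit(K, y_train)

X_test = np.array([[0.0, 0.0, 0.0, 0.0],
                   [1.5, 1.5, 1.5, 1.5]])
scores = kernel_ridge_predict(fidelity_kernel(X_test, X_train), alpha)
print(np.sign(scores))
```

The point of the kernel-based route is that only pairwise state overlaps are needed, which is what a quantum device (or, as here, its classical simulation) would supply.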
Low Altitude Air-to-Ground Channel Characterization in LTE Network
Low-altitude unmanned aerial vehicle (UAV)-aided applications are promising for future-generation communication systems. In this paper, a recently conducted measurement campaign for characterizing the low-altitude air-to-ground (A2G) channel in a typical Long Term Evolution (LTE) network is introduced. Five horizontal flights were conducted at heights of 15, 30, 50, 75, and 100 m, respectively. The real-time LTE downlink signal was recorded using a Universal Software Radio Peripheral (USRP)-based channel sounder onboard the UAV. Channel impulse responses (CIRs) are extracted from the cell-specific signals in the recorded downlink data. To shed light on the physical propagation mechanisms, propagation graph simulation is exploited. Moreover, path loss at different heights is investigated and compared based on the empirical data. The simulated and empirical results provide valuable understanding of low-altitude A2G channels.
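The abstract does not state which path loss model the campaign uses; a common choice for empirical data of this kind is the log-distance model PL(d) = PL₀ + 10·n·log₁₀(d/d₀), fitted by least squares. The sketch below uses invented measurement numbers purely to illustrate the fitting step.

```python
import numpy as np

# Hypothetical CIR-derived measurements: link distance (m) and path loss (dB).
# These numbers are illustrative, not from the measurement campaign.
d0 = 1.0  # reference distance in metres
distances = np.array([50.0, 100.0, 200.0, 400.0, 800.0])
measured_pl = np.array([78.1, 84.3, 90.0, 96.2, 102.4])

# Least-squares fit of the intercept PL0 and the path loss exponent n in
# PL(d) = PL0 + 10 * n * log10(d / d0).
A = np.column_stack([np.ones_like(distances),
                     10.0 * np.log10(distances / d0)])
(pl0, n), *_ = np.linalg.lstsq(A, measured_pl, rcond=None)
print(f"PL0 = {pl0:.1f} dB, path loss exponent n = {n:.2f}")
```

Repeating the fit per flight height is one way to compare path loss behaviour across the 15–100 m altitudes mentioned in the abstract.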
How to Estimate Model Transferability of Pre-Trained Speech Models?
In this work, we introduce a ``score-based assessment'' framework for
estimating the transferability of pre-trained speech models (PSMs) for
fine-tuning target tasks. We leverage upon two representation theories,
Bayesian likelihood estimation and optimal transport, to generate rank scores
for the PSM candidates using the extracted representations. Our framework
efficiently computes transferability scores without actual fine-tuning of
candidate models or layers by making a temporal independent hypothesis. We
evaluate some popular supervised speech models (e.g., Conformer RNN-Transducer)
and self-supervised speech models (e.g., HuBERT) in cross-layer and cross-model
settings using public data. Experimental results show a high Spearman's rank
correlation and low p-value between our estimation framework and the fine-tuning
ground truth. Our proposed transferability framework requires less
computational time and resources, making it a resource-saving and
time-efficient approach for tuning speech foundation models.
Comment: Accepted to Interspeech. Code will be released.
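The reported agreement metric, Spearman's rank correlation, is just the Pearson correlation of the two rankings. A minimal sketch, with hypothetical transferability scores and fine-tuned accuracies (illustrative numbers, not from the paper):

```python
import numpy as np

def spearman_rho(a, b):
    """Spearman's rank correlation: Pearson correlation of the ranks.
    (No tie handling; scipy.stats.spearmanr covers the general case.)"""
    ra = np.argsort(np.argsort(a)).astype(float)  # rank of each element
    rb = np.argsort(np.argsort(b)).astype(float)
    ra -= ra.mean()
    rb -= rb.mean()
    return float((ra @ rb) / np.sqrt((ra @ ra) * (rb @ rb)))

# Hypothetical scores for five candidate PSMs vs. their fine-tuned accuracies.
scores = np.array([0.12, 0.55, 0.30, 0.80, 0.41])
accuracy = np.array([61.0, 74.5, 66.2, 79.3, 70.1])
rho = spearman_rho(scores, accuracy)
print(rho)  # 1.0 here: the two rankings agree exactly
```

A rho near 1 means the cheap score orders candidate models the same way full fine-tuning would, which is exactly the property the framework is evaluated on.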
C-reactive protein levels after 4 types of arthroplasty
Background and purpose: Postoperative C-reactive protein (CRP) levels in serum appear to reflect surgical trauma. We examined CRP levels after 4 types of arthroplasty.
A Comparative Study on Transformer vs RNN in Speech Applications
Sequence-to-sequence models have been widely used in end-to-end speech
processing, for example, automatic speech recognition (ASR), speech translation
(ST), and text-to-speech (TTS). This paper focuses on an emergent
sequence-to-sequence model called Transformer, which achieves state-of-the-art
performance in neural machine translation and other natural language processing
applications. We undertook intensive studies in which we experimentally
compared and analyzed Transformer and conventional recurrent neural networks
(RNN) in a total of 15 ASR, one multilingual ASR, one ST, and two TTS
benchmarks. Our experiments revealed various training tips and significant
performance benefits obtained with Transformer for each task including the
surprising superiority of Transformer in 13/15 ASR benchmarks in comparison
with RNN. We are preparing to release Kaldi-style reproducible recipes using
open-source and publicly available datasets for all the ASR, ST, and TTS tasks
so that the community can reproduce our results.
Comment: Accepted at ASRU 201
Google USM: Scaling Automatic Speech Recognition Beyond 100 Languages
We introduce the Universal Speech Model (USM), a single large model that
performs automatic speech recognition (ASR) across 100+ languages. This is
achieved by pre-training the encoder of the model on a large unlabeled
multilingual dataset of 12 million (M) hours spanning over 300 languages, and
fine-tuning on a smaller labeled dataset. We use multilingual pre-training with
random-projection quantization and speech-text modality matching to achieve
state-of-the-art performance on downstream multilingual ASR and speech-to-text
translation tasks. We also demonstrate that despite using a labeled training
set 1/7-th the size of that used for the Whisper model, our model exhibits
comparable or better performance on both in-domain and out-of-domain speech
recognition tasks across many languages.
Comment: 20 pages, 7 figures, 8 tables
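Random-projection quantization (as in BEST-RQ-style pre-training) turns continuous speech features into discrete pre-training targets using a frozen random projection and a frozen random codebook. The sketch below is a minimal illustration; the dimensions, codebook size, and distance metric are assumptions, not USM's actual configuration.

```python
import numpy as np

rng = np.random.default_rng(0)
feat_dim, proj_dim, codebook_size = 80, 16, 256

# Both the projection and the codebook are randomly initialized and frozen:
# they are never trained.
projection = rng.normal(size=(feat_dim, proj_dim))
codebook = rng.normal(size=(codebook_size, proj_dim))

def quantize(frames):
    """Map each feature frame to the index of its nearest codebook entry."""
    z = frames @ projection                                # (T, proj_dim)
    d2 = ((z[:, None, :] - codebook[None, :, :]) ** 2).sum(axis=-1)
    return d2.argmin(axis=1)                               # (T,) int labels

frames = rng.normal(size=(10, feat_dim))  # 10 hypothetical feature frames
labels = quantize(frames)
print(labels.shape)
```

The discrete labels then serve as prediction targets for the encoder on unlabeled audio, which is what lets pre-training scale to millions of hours without transcripts.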
The mood stabilizer lamotrigine produces antidepressant behavioral effects in rats: role of brain-derived neurotrophic factor
The anticonvulsant drug lamotrigine has been shown to produce strong antidepressant effects in the treatment of patients with bipolar disorder. However, to date there are few preclinical reports on its behavioral actions in animal models of depression or its underlying molecular mechanisms. The current study investigated the effects of lamotrigine in the forced swimming test and the learned helplessness test. The results demonstrate that both 15 and 30 mg/kg acute treatment of lamotrigine significantly reduced immobility in the forced swimming test without affecting locomotor activity. Sub-chronic twice-daily injections of 30 mg/kg lamotrigine robustly decreased escape failures in animals that had developed learned helplessness symptoms. In parallel, the sub-chronic lamotrigine treatment also up-regulated frontal and hippocampal brain-derived neurotrophic factor expression in both naive and stressed animals and restored the stress-induced down-regulation of brain-derived neurotrophic factor expression. This study provides further evidence for the use of lamotrigine as a novel antidepressant in the treatment of bipolar disorders.