Constrained Output Embeddings for End-to-End Code-Switching Speech Recognition with Only Monolingual Data
The lack of code-switching training data is one of the major concerns in the development of end-to-end code-switching automatic speech recognition (ASR) models. In this work, we propose a method to train an improved end-to-end code-switching ASR model using only monolingual data. Our method encourages the distributions of output token embeddings of the monolingual languages to be similar, and hence enables the ASR model to code-switch easily between languages. Specifically, we propose Jensen-Shannon divergence and cosine distance based constraints. The former enforces similar distributions over the output embeddings of the monolingual languages, while the latter simply brings the centroids of the two distributions close to each other. Experimental results demonstrate the effectiveness of the proposed method, yielding up to 4.5% absolute mixed error rate improvement on a Mandarin-English code-switching ASR task.
Comment: 5 pages, 3 figures, accepted to INTERSPEECH 201
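The two constraints described above can be sketched as follows. This is a minimal illustrative reading, not the paper's implementation: the random token-embedding matrices, the softmax-based "soft histogram" used to approximate each language's embedding distribution, and the unweighted sum of the two terms are all assumptions of this sketch.

```python
import numpy as np

def js_divergence(p, q, eps=1e-12):
    # Jensen-Shannon divergence between two discrete distributions
    p, q = p / p.sum(), q / q.sum()
    m = 0.5 * (p + q)
    kl = lambda a, b: np.sum(a * np.log((a + eps) / (b + eps)))
    return 0.5 * kl(p, m) + 0.5 * kl(q, m)

def cosine_centroid_distance(E1, E2):
    # cosine distance between the centroids of two embedding sets
    c1, c2 = E1.mean(axis=0), E2.mean(axis=0)
    cos = np.dot(c1, c2) / (np.linalg.norm(c1) * np.linalg.norm(c2))
    return 1.0 - cos

def soft_histogram(E):
    # softmax over the mean embedding: a crude stand-in for the
    # distribution of a language's output token embeddings
    logits = E.mean(axis=0)
    ex = np.exp(logits - logits.max())
    return ex / ex.sum()

# toy example: rows are output token embeddings for each language
rng = np.random.default_rng(0)
E_zh = rng.normal(size=(100, 16))  # hypothetical Mandarin embeddings
E_en = rng.normal(size=(100, 16))  # hypothetical English embeddings

jsd = js_divergence(soft_histogram(E_zh), soft_histogram(E_en))
cosd = cosine_centroid_distance(E_zh, E_en)
loss = jsd + cosd  # would be added to the ASR training objective
```

Minimizing `jsd` pulls the two embedding distributions together, while minimizing `cosd` aligns their centroids; the abstract describes these as complementary constraints.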
Multilingual self-supervised speech representations improve the speech recognition of low-resource African languages with codeswitching
While many speakers of low-resource languages regularly code-switch between their languages and other regional languages or English, datasets of code-switched speech are too small to train bespoke acoustic models from scratch or to do language model rescoring. Here we propose finetuning self-supervised speech representations such as wav2vec 2.0 XLSR to recognize code-switched data. We find that finetuning self-supervised multilingual representations and augmenting them with n-gram language models trained from transcripts reduces absolute word error rates by up to 20% compared to baselines of hybrid models trained from scratch on code-switched data. Our findings suggest that in circumstances with limited training data, finetuning self-supervised representations is a better-performing and viable solution.
Comment: 5 pages, 1 figure. Computational Approaches to Linguistic Code-Switching, CALCS 2023 (co-located with EMNLP 2023)
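The n-gram language model component above can be illustrated with a toy n-best rescoring step. Everything here is hypothetical: the add-one-smoothed bigram LM, the two-sentence "transcript" corpus, the code-switched hypotheses, and the acoustic scores stand in for a real n-gram LM trained on actual transcripts.

```python
import math
from collections import Counter

def train_bigram(corpus):
    # add-one-smoothed bigram LM from a list of transcript strings
    unigrams, bigrams, vocab = Counter(), Counter(), set()
    for sent in corpus:
        toks = ["<s>"] + sent.split() + ["</s>"]
        vocab.update(toks)
        unigrams.update(toks[:-1])
        bigrams.update(zip(toks[:-1], toks[1:]))
    V = len(vocab)
    def logprob(sent):
        toks = ["<s>"] + sent.split() + ["</s>"]
        return sum(math.log((bigrams[(a, b)] + 1) / (unigrams[a] + V))
                   for a, b in zip(toks[:-1], toks[1:]))
    return logprob

def rescore(nbest, lm_logprob, alpha=0.5):
    # combine acoustic log-score with a weighted LM log-probability
    return max(nbest, key=lambda h: h[1] + alpha * lm_logprob(h[0]))

# hypothetical code-switched transcripts and ASR hypotheses
transcripts = ["ke moo ke ile school", "ke ile school maobane"]
lm = train_bigram(transcripts)
nbest = [("ke ile school", -4.1), ("ke ile skull", -4.0)]
best = rescore(nbest, lm)
```

Here the LM, trained on in-domain transcripts, prefers the hypothesis whose word sequence it has seen, overriding the slightly better acoustic score of the competing hypothesis; this is the intuition behind augmenting the finetuned acoustic model with n-gram LMs.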
An Effective Mixture-Of-Experts Approach For Code-Switching Speech Recognition Leveraging Encoder Disentanglement
With the massive development of end-to-end (E2E) neural networks, recent years have witnessed unprecedented breakthroughs in automatic speech recognition (ASR). However, the code-switching phenomenon remains a major obstacle that hinders ASR from perfection, as the lack of labeled data and the variation between languages often degrade ASR performance. In this paper, we focus exclusively on improving the acoustic encoder of E2E ASR to tackle the challenge caused by the code-switching phenomenon. Our main contributions are threefold. First, we introduce a novel disentanglement loss to enable the lower layers of the encoder to capture inter-lingual acoustic information while mitigating linguistic confusion at the higher layers. Second, through comprehensive experiments, we verify that our proposed method outperforms prior-art methods that use pretrained dual encoders, while having access only to the code-switching corpus and using only half the parameters. Third, the apparent differentiation of the encoders' output features also corroborates the complementarity between the disentanglement loss and the mixture-of-experts (MoE) architecture.
Comment: ICASSP 202
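One plausible reading of the encoder design above can be sketched as follows. This is not the paper's architecture: the shared lower layer, the two single-matrix "experts", the sigmoid gate, and the cosine-based disentanglement penalty are all simplifying assumptions made for illustration.

```python
import numpy as np

rng = np.random.default_rng(1)
d = 8  # hypothetical feature dimension

# shared lower layer plus two higher-layer language experts (toy weights)
W_shared = rng.normal(size=(d, d)) * 0.1
W_en = rng.normal(size=(d, d)) * 0.1
W_zh = rng.normal(size=(d, d)) * 0.1
W_gate = rng.normal(size=(d,)) * 0.1

def encoder(x):
    h = np.tanh(x @ W_shared)                           # lower layer: shared inter-lingual acoustics
    e_en, e_zh = np.tanh(h @ W_en), np.tanh(h @ W_zh)   # higher layer: per-language experts
    g = 1.0 / (1.0 + np.exp(-(h @ W_gate)))             # frame-level mixture-of-experts gate
    out = g[:, None] * e_en + (1 - g)[:, None] * e_zh
    return out, e_en, e_zh

def disentangle_loss(e_en, e_zh, eps=1e-8):
    # penalize similarity between the experts' outputs so they specialize
    num = (e_en * e_zh).sum(axis=1)
    den = np.linalg.norm(e_en, axis=1) * np.linalg.norm(e_zh, axis=1) + eps
    return float(np.mean(num / den) ** 2)

x = rng.normal(size=(10, d))  # 10 acoustic frames
out, e_en, e_zh = encoder(x)
loss = disentangle_loss(e_en, e_zh)
```

Driving `loss` toward zero pushes the two experts' outputs toward orthogonality, which is one simple way to encourage the differentiation between expert outputs that the abstract reports observing.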
Code-Switching in the Frequently Used English Vocabularies among College Students of Non-Native Speaker
Code-switching is a common linguistic phenomenon that occurs because of the need for two or more languages when communicating directly, in general or specific expressions, in the world's multilingual society. The research aimed to find out and describe code-switching in the English vocabularies frequently used among college students who are non-native speakers. The research used a qualitative, phenomenological approach. It involved first- to third-year students of the English Language Education Department of STKIP PGRI Trenggalek in the 2021/2022 academic year as subjects. The research instruments were questionnaires and documentation, analyzed using a Likert scale and qualitative description. The findings show that students mostly agree with the use of code-switching in their daily communication and on social media such as WhatsApp, Facebook, and Instagram. Of the 95 vocabulary items presented in the questionnaire, 85 are the most frequently used in students' daily communication. Given these findings, discussions, and conclusions, it is suggested that college students, lecturers, the college, and future researchers may find benefits in this research.