15 research outputs found

    Constrained Output Embeddings for End-to-End Code-Switching Speech Recognition with Only Monolingual Data

    The lack of code-switching training data is one of the major concerns in the development of end-to-end code-switching automatic speech recognition (ASR) models. In this work, we propose a method to train an improved end-to-end code-switching ASR model using only monolingual data. Our method encourages the output token embedding distributions of the monolingual languages to be similar, and hence promotes the ASR model's ability to code-switch between languages. Specifically, we propose Jensen-Shannon divergence and cosine-distance-based constraints: the former enforces similar distributions of the two languages' output embeddings, while the latter simply brings the centroids of the two distributions close to each other. Experimental results demonstrate the effectiveness of the proposed method, yielding up to 4.5% absolute mixed error rate improvement on a Mandarin-English code-switching ASR task.
    Comment: 5 pages, 3 figures, accepted to INTERSPEECH 201
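    The abstract summarizes the two constraints but not their exact formulation; below is a minimal PyTorch sketch of one plausible reading. The reduction of each language's embedding matrix to a probability vector (a softmax over the mean embedding) and the weights alpha and beta are illustrative assumptions, not the paper's definitions.

    import torch
    import torch.nn.functional as F

    def js_divergence(p, q, eps=1e-8):
        # Jensen-Shannon divergence between two probability vectors.
        m = 0.5 * (p + q)
        kl_pm = torch.sum(p * (torch.log(p + eps) - torch.log(m + eps)))
        kl_qm = torch.sum(q * (torch.log(q + eps) - torch.log(m + eps)))
        return 0.5 * (kl_pm + kl_qm)

    def embedding_constraint_loss(emb_a, emb_b, alpha=1.0, beta=1.0):
        # emb_a: (V_a, d) output-token embeddings of language A
        # emb_b: (V_b, d) output-token embeddings of language B
        # JS term: summarize each embedding set as a distribution over the
        # d dimensions and penalize their divergence (hypothetical reduction).
        p = F.softmax(emb_a.mean(dim=0), dim=-1)
        q = F.softmax(emb_b.mean(dim=0), dim=-1)
        js_term = js_divergence(p, q)
        # Cosine term: pull the centroids of the two embedding sets together.
        cos_term = 1.0 - F.cosine_similarity(emb_a.mean(dim=0),
                                             emb_b.mean(dim=0), dim=0)
        return alpha * js_term + beta * cos_term

    Such an auxiliary loss would be added to the usual ASR training objective with suitable weighting.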

    Multilingual self-supervised speech representations improve the speech recognition of low-resource African languages with codeswitching

    While many speakers of low-resource languages regularly code-switch between their languages and other regional languages or English, datasets of code-switched speech are too small to train bespoke acoustic models from scratch or to do language model rescoring. Here we propose finetuning self-supervised speech representations such as wav2vec 2.0 XLSR to recognize code-switched data. We find that finetuning self-supervised multilingual representations and augmenting them with n-gram language models trained from transcripts reduces absolute word error rates by up to 20% compared to baselines of hybrid models trained from scratch on code-switched data. Our findings suggest that, in circumstances with limited training data, finetuning self-supervised representations is the better-performing and more viable solution.
    Comment: 5 pages, 1 figure. Computational Approaches to Linguistic Code-Switching, CALCS 2023 (co-located with EMNLP 2023)
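    As an illustration of the recipe the abstract describes, here is a minimal finetuning sketch using HuggingFace Transformers; the XLSR-53 checkpoint name is real, while the vocabulary file and all settings are placeholder assumptions.

    from transformers import (Wav2Vec2CTCTokenizer, Wav2Vec2FeatureExtractor,
                              Wav2Vec2Processor, Wav2Vec2ForCTC)

    # Character vocabulary built from the code-switched transcripts
    # ("vocab.json" is a hypothetical path).
    tokenizer = Wav2Vec2CTCTokenizer("vocab.json", unk_token="[UNK]",
                                     pad_token="[PAD]",
                                     word_delimiter_token="|")
    feature_extractor = Wav2Vec2FeatureExtractor(feature_size=1,
                                                 sampling_rate=16000,
                                                 padding_value=0.0,
                                                 do_normalize=True,
                                                 return_attention_mask=True)
    processor = Wav2Vec2Processor(feature_extractor=feature_extractor,
                                  tokenizer=tokenizer)

    model = Wav2Vec2ForCTC.from_pretrained(
        "facebook/wav2vec2-large-xlsr-53",  # multilingual pretrained model
        ctc_loss_reduction="mean",
        pad_token_id=tokenizer.pad_token_id,
        vocab_size=len(tokenizer),
    )
    model.freeze_feature_encoder()  # keep the CNN feature encoder fixed

    The finetuned model can then be decoded with an n-gram language model trained on the transcripts, e.g. via pyctcdecode's build_ctcdecoder with a KenLM model, matching the LM-augmented setup the abstract reports.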

    An Effective Mixture-Of-Experts Approach For Code-Switching Speech Recognition Leveraging Encoder Disentanglement

    With the massive development of end-to-end (E2E) neural networks, recent years have witnessed unprecedented breakthroughs in automatic speech recognition (ASR). However, the code-switching phenomenon remains a major obstacle that hinders ASR from perfection, as the lack of labeled data and the variation between languages often degrade ASR performance. In this paper, we focus exclusively on improving the acoustic encoder of E2E ASR to tackle the challenge posed by code-switching. Our main contributions are threefold. First, we introduce a novel disentanglement loss that enables the lower layers of the encoder to capture inter-lingual acoustic information while mitigating linguistic confusion at the higher layers. Second, through comprehensive experiments, we verify that the proposed method outperforms prior-art methods that use pretrained dual encoders, while having access only to the code-switching corpus and using half as many parameters. Third, the clear differentiation of the encoders' output features corroborates the complementarity between the disentanglement loss and the mixture-of-experts (MoE) architecture.
    Comment: ICASSP 202
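    The abstract names the disentanglement loss but not its form; the sketch below shows one plausible realization under stated assumptions: frame-level features from a lower and a higher encoder layer, per-frame language labels, and a pull-together/push-apart objective. The paper's actual loss may differ.

    import torch
    import torch.nn.functional as F

    def disentanglement_loss(low_feats, high_feats, lang_ids):
        # low_feats:  (T, d) lower-layer encoder outputs
        # high_feats: (T, d) higher-layer encoder outputs
        # lang_ids:   (T,)   per-frame language labels (0 or 1);
        # assumes both languages occur in the batch.
        c0_low = low_feats[lang_ids == 0].mean(dim=0)
        c1_low = low_feats[lang_ids == 1].mean(dim=0)
        # Lower layers: pull the language centroids together so the shared
        # acoustic space stays language-agnostic (inter-lingual information).
        share = 1.0 - F.cosine_similarity(c0_low, c1_low, dim=0)

        c0_high = high_feats[lang_ids == 0].mean(dim=0)
        c1_high = high_feats[lang_ids == 1].mean(dim=0)
        # Higher layers: push the centroids apart to reduce linguistic
        # confusion before language-specific (MoE) processing.
        separate = F.cosine_similarity(c0_high, c1_high, dim=0).clamp(min=0.0)
        return share + separate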

    Code-Switching in the Frequently Used English Vocabularies among College Students of Non-Native Speaker

    Code-switching is a common linguistic phenomenon that arises from the need for two or more languages when communicating directly, in general or specific expressions, in the world's multilingual societies. The research aimed to identify and describe code-switching in the English vocabulary items most frequently used by non-native-speaker college students. The research used a qualitative, phenomenological approach. Its subjects were first- through third-year students of the English Language Education Department of STKIP PGRI Trenggalek in the 2021/2022 academic year. The research instruments were questionnaires and documentation, analyzed with a Likert scale and qualitative description. The findings show that students mostly agree with the use of code-switching in their daily communication and on social media such as WhatsApp, Facebook, and Instagram. Of the 95 vocabulary items presented in the questionnaire, 85 are frequently used in students' daily communication. Given these findings, discussions, and conclusions, college students, lecturers, institutions, and future researchers may find this research beneficial.