DARTS-ASR: Differentiable Architecture Search for Multilingual Speech Recognition and Adaptation

Authors: Chen Yi-Chen, Hsu Jui-Yang, Lee Cheng-Kuang, Lee Hung-yi
Publication date: 25/07/2020

In previous work, only the parameter weights of ASR models are optimized, under a fixed-topology architecture. However, the design of successful model architectures has always relied on human experience and intuition. In addition, many hyperparameters related to model architecture need to be tuned manually. Therefore, in this paper, we propose an ASR approach with efficient gradient-based architecture search, DARTS-ASR. To examine the generalizability of DARTS-ASR, we apply our approach not only to many languages for monolingual ASR, but also to a multilingual ASR setting. Following previous work, we conducted experiments on a multilingual dataset, IARPA BABEL. The experimental results show that our approach outperformed the baseline fixed-topology architecture, with relative character error rate reductions of 10.2% and 10.0% under the monolingual and multilingual ASR settings, respectively. Furthermore, we analyze the architectures found by DARTS-ASR.

Comment: Accepted at INTERSPEECH 202
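The abstract reports its gains as relative reductions in character error rate (CER). A minimal sketch of how such a figure is computed follows; the function name `relative_reduction` and the two CER values are hypothetical and chosen only for illustration, not taken from the paper.

```python
def relative_reduction(baseline_cer: float, new_cer: float) -> float:
    """Return the relative reduction of new_cer versus baseline_cer, in percent."""
    if baseline_cer <= 0:
        raise ValueError("baseline_cer must be positive")
    return 100.0 * (baseline_cer - new_cer) / baseline_cer


# Hypothetical CER values (in percent), NOT results from the paper.
baseline = 50.0   # assumed fixed-topology baseline CER
searched = 44.9   # assumed CER of a searched architecture

print(round(relative_reduction(baseline, searched), 1))  # prints 10.2
```

A relative reduction is measured against the baseline error, so in this made-up case an absolute drop of 5.1 points corresponds to a 10.2% relative reduction.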

arXiv.org e-Print Archive

Crossref

Unifying and Merging Well-trained Deep Neural Networks for Inference Stage

Authors: Chan Yi-Ming, Chen Chu-Song, Chiu Chih-Yi, Chou Yi-Min, Lee Jia-Hong
Publication date: 13/05/2018

We propose a novel method to merge convolutional neural networks for the inference stage. Given two well-trained networks that may have different architectures and handle different tasks, our method aligns the layers of the original networks and merges them into a unified model by sharing the representative codes of weights. The shared weights are then re-trained to fine-tune the performance of the merged model. The proposed method effectively produces a compact model that can run the original tasks simultaneously on resource-limited devices. Because it preserves the general architectures and leverages the co-used weights of well-trained networks, substantial training overhead can be avoided, shortening system development time. Experimental results demonstrate satisfactory performance and validate the effectiveness of the method.

Comment: To appear in the 27th International Joint Conference on Artificial Intelligence and the 23rd European Conference on Artificial Intelligence, 2018 (IJCAI-ECAI 2018)
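The abstract describes merging by "sharing the representative codes of weights" but does not give the details. One way to picture the idea is a single codebook learned over the pooled weights of two layers, with each layer then storing only indices into it. The sketch below assumes plain scalar k-means for the codebook; that choice, and every name in it (`shared_codebook`, `encode`, `decode`, the toy weights), is an illustrative assumption rather than the authors' method.

```python
def nearest(codebook, w):
    """Index of the codeword closest to weight w."""
    return min(range(len(codebook)), key=lambda i: abs(codebook[i] - w))


def shared_codebook(weights_a, weights_b, k, iters=20):
    """Learn k codewords over the pooled weights of both layers (scalar k-means)."""
    pooled = sorted(weights_a + weights_b)
    if not 0 < k <= len(pooled):
        raise ValueError("k must be between 1 and the number of pooled weights")
    # Initialise codewords at evenly spaced positions of the sorted pool.
    codebook = [pooled[(2 * i + 1) * len(pooled) // (2 * k)] for i in range(k)]
    for _ in range(iters):
        buckets = [[] for _ in range(k)]
        for w in pooled:
            buckets[nearest(codebook, w)].append(w)
        # Move each codeword to the mean of its bucket; keep it if the bucket is empty.
        codebook = [sum(b) / len(b) if b else c for b, c in zip(buckets, codebook)]
    return codebook


def encode(weights, codebook):
    """Replace each weight by the index of its nearest shared codeword."""
    return [nearest(codebook, w) for w in weights]


def decode(indices, codebook):
    """Reconstruct approximate weights from codeword indices."""
    return [codebook[i] for i in indices]


# Toy weights for two hypothetical layers from different networks.
layer_a = [0.11, 0.09, -0.52, -0.48]
layer_b = [0.10, -0.50, 0.90, 0.88]

codebook = shared_codebook(layer_a, layer_b, k=3)
codes_a = encode(layer_a, codebook)
codes_b = encode(layer_b, codebook)
print(codes_a, codes_b)
print(decode(codes_a, codebook))
```

Both layers now reference the same three codewords, so the merged model stores one codebook plus per-layer indices. In the abstract's pipeline, the shared weights would then be fine-tuned, a step this sketch omits.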

arXiv.org e-Print Archive

Crossref