Abstract

The methods commonly used today for automatic speech recognition and labeling are usually supervised and therefore need manually labeled transcriptions. Because manual labeling is costly and time-consuming, especially at the acoustic model training stage, a new data selection method named Grammar-based Semi-Supervised Incremental Learning is proposed, which requires only a small amount of manually labeled data to initialize the acoustic model. The initial model is used to recognize a large amount of unlabeled data and produce N-best results, which are analyzed and synthesized in order to choose the optimal results for iterative system retraining according to the proposed method. The experimental results show that this method can significantly improve the performance of the acoustic model in LVCSR and, moreover, of the whole labeling system. In addition, an explanation of the improvement in terms of syllable coverage is given.
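The loop described in the abstract (initialize on a small labeled set, decode unlabeled data into N-best lists, select results, retrain) can be illustrated with a minimal sketch. The abstract does not specify the grammar-based selection criterion, so the score-margin rule below is only a stand-in assumption, not the proposed method. All names (`Hypothesis`, `select_confident`, `incremental_self_training`, `train`, `decode_nbest`, `min_margin`, `max_rounds`) are hypothetical, and the trainer and decoder are passed in as callables rather than tied to any real toolkit.

```python
from dataclasses import dataclass
from typing import Any, Callable, List, Optional, Tuple


@dataclass
class Hypothesis:
    text: str      # candidate transcription
    score: float   # higher is better (e.g. a log-likelihood); an assumption


Labeled = Tuple[str, str]  # (utterance id, transcription)


def select_confident(nbest: List[Hypothesis], min_margin: float) -> Optional[str]:
    """Stand-in selection rule (NOT the grammar-based criterion of the paper):
    accept the top hypothesis only if it beats the runner-up by min_margin."""
    if not nbest:
        return None
    ranked = sorted(nbest, key=lambda h: h.score, reverse=True)
    if len(ranked) == 1 or ranked[0].score - ranked[1].score >= min_margin:
        return ranked[0].text
    return None


def incremental_self_training(
    seed: List[Labeled],
    unlabeled: List[str],
    train: Callable[[List[Labeled]], Any],
    decode_nbest: Callable[[Any, str], List[Hypothesis]],
    min_margin: float = 1.0,
    max_rounds: int = 3,
) -> Tuple[Any, List[Labeled]]:
    """Generic semi-supervised incremental loop: train, decode, select, retrain."""
    labeled = list(seed)
    pool = list(unlabeled)
    model = train(labeled)              # initial model from the small labeled set
    for _ in range(max_rounds):
        accepted: List[Labeled] = []
        remaining: List[str] = []
        for utt in pool:
            choice = select_confident(decode_nbest(model, utt), min_margin)
            if choice is None:
                remaining.append(utt)   # keep for a later, hopefully better, model
            else:
                accepted.append((utt, choice))
        if not accepted:                # nothing new to learn from; stop early
            break
        labeled.extend(accepted)
        pool = remaining
        model = train(labeled)          # retrain on the enlarged training set
    return model, labeled
```

Utterances rejected in one round stay in the pool, so a model retrained on more data gets another chance to label them; that is the incremental aspect of the loop under these assumptions.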