Generating Paired Transliterated-cognates Using Multiple Pronunciation Characteristics from Web corpora
This paper proposes a novel approach to automatically extracting paired transliterated cognates from Web corpora. One of the most important issues addressed is how to take multiple pronunciation characteristics into account. Terms originating from different languages may be pronounced very differently, so incorporating knowledge of word origin may improve the pronunciation accuracy of terms. The accuracy of the generated phonetic information has an important impact on term transliteration and hence on transliterated-term extraction, a fundamental task in natural language processing that extracts paired transliterated terms for the study of term transliteration. An experiment on transliterated-term extraction from two kinds of Web resources, Web pages and anchored texts, has been conducted and evaluated. The experimental results show that many transliterated-term pairs that cannot be extracted by an approach exploiting only English pronunciation characteristics are successfully extracted by the proposed approach. Taking multiple language-specific pronunciation transformations into account may further improve the output of transliterated-term extraction.
The Microsoft 2016 Conversational Speech Recognition System
We describe Microsoft's conversational speech recognition system, in which we
combine recent developments in neural-network-based acoustic and language
modeling to advance the state of the art on the Switchboard recognition task.
Inspired by machine learning ensemble techniques, the system uses a range of
convolutional and recurrent neural networks. I-vector modeling and lattice-free
MMI training provide significant gains for all acoustic model architectures.
Language model rescoring with multiple forward and backward running RNNLMs, and
word posterior-based system combination provide a 20% boost. The best single
system uses a ResNet architecture acoustic model with RNNLM rescoring, and
achieves a word error rate of 6.9% on the NIST 2000 Switchboard task. The
combined system has an error rate of 6.2%, representing an improvement over
previously reported results on this benchmark task.
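The word error rates quoted above follow the standard definition: the word-level edit distance (substitutions, deletions and insertions) between a reference transcript and the recognizer output, divided by the number of reference words. A minimal sketch of that computation is given below. The function name word_error_rate and the whitespace tokenization are assumptions made for illustration; they are not taken from the paper, and official NIST scoring applies additional text normalization that is not reproduced here.

```python
def word_error_rate(reference: str, hypothesis: str) -> float:
    """Word-level Levenshtein distance divided by the reference length."""
    ref = reference.split()  # assumption: plain whitespace tokenization
    hyp = hypothesis.split()
    if not ref:
        raise ValueError("reference must contain at least one word")

    # prev[j] holds the edit distance between the reference words seen
    # so far (before the current one) and the first j hypothesis words.
    prev = list(range(len(hyp) + 1))
    for i, ref_word in enumerate(ref, start=1):
        curr = [i] + [0] * len(hyp)
        for j, hyp_word in enumerate(hyp, start=1):
            substitution_cost = 0 if ref_word == hyp_word else 1
            curr[j] = min(
                prev[j] + 1,                      # deletion of a reference word
                curr[j - 1] + 1,                  # insertion of a hypothesis word
                prev[j - 1] + substitution_cost,  # match or substitution
            )
        prev = curr
    return prev[-1] / len(ref)


if __name__ == "__main__":
    # One deleted word out of six reference words.
    print(word_error_rate("the cat sat on the mat", "the cat sat on mat"))
```

Because insertions are counted against the reference length, the value can exceed 1.0 when the hypothesis is much longer than the reference.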
Multi-Module G2P Converter for Persian Focusing on Relations between Words
In this paper, we investigate the application of end-to-end and multi-module
frameworks for G2P conversion for the Persian language. The results demonstrate
that our proposed multi-module G2P system outperforms our end-to-end systems in
terms of accuracy and speed. The system consists of a pronunciation dictionary
as our look-up table, along with separate models to handle homographs, OOVs and
ezafe in Persian created using GRU and Transformer architectures. The system is
sequence-level rather than word-level, which allows it to effectively capture
the unwritten relations between words (cross-word information) necessary for
homograph disambiguation and ezafe recognition without the need for any
pre-processing. After evaluation, our system achieved a 94.48% word-level
accuracy, outperforming the previous G2P systems for Persian.
Comment: 10 pages, 4 figures
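The multi-module design described in this abstract (dictionary look-up first, with separate models for homographs, OOVs and ezafe) can be sketched roughly as follows. This is only an illustration of the control flow under stated assumptions: the function convert_sentence, its parameter names, and the callables standing in for the GRU and Transformer models are all hypothetical, and appending a fixed suffix for ezafe is a simplification rather than the authors' method.

```python
from typing import Callable, Dict, List


def convert_sentence(
    words: List[str],
    lexicon: Dict[str, List[str]],
    pick_homograph: Callable[[List[str], int, List[str]], str],
    predict_oov: Callable[[str], str],
    needs_ezafe: Callable[[List[str], int], bool],
    ezafe_suffix: str = "-e",
) -> List[str]:
    """Convert a whole word sequence so context-dependent modules can see neighbours."""
    output = []
    for i, word in enumerate(words):
        candidates = lexicon.get(word, [])
        if not candidates:
            # Out-of-vocabulary word: fall back to a learned model (hypothetical stand-in).
            phonemes = predict_oov(word)
        elif len(candidates) == 1:
            # Unambiguous dictionary entry: plain look-up.
            phonemes = candidates[0]
        else:
            # Homograph: choose among candidates using the whole sentence as context.
            phonemes = pick_homograph(words, i, candidates)
        if needs_ezafe(words, i):
            # Ezafe is not written, so it has to be predicted from cross-word context.
            phonemes += ezafe_suffix
        output.append(phonemes)
    return output


if __name__ == "__main__":
    # Toy placeholders only; these are not real Persian words or pronunciations.
    toy_lexicon = {"w1": ["p1"], "w2": ["p2a", "p2b"]}
    result = convert_sentence(
        ["w1", "w2", "w3"],
        toy_lexicon,
        pick_homograph=lambda sent, i, cands: cands[-1],
        predict_oov=lambda word: "oov:" + word,
        needs_ezafe=lambda sent, i: i == 0,
    )
    print(result)
```

Passing the full word list and the position to the homograph and ezafe modules mirrors the sequence-level framing in the abstract: both decisions depend on neighbouring words rather than on the word in isolation.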