675 research outputs found

Speech Synthesis Based on Hidden Markov Models

Author: Nankaku Y.
Oura K.
Toda T.
Tokuda K.
Yamagishi J.
Zen H.
Publication venue: Institute of Electrical and Electronics Engineers (IEEE)
Publication date: 01/05/2013

Prosodic scoring of word hypotheses graphs

Author: Batliner Anton
Kießling Andreas
Kompe Ralf
Niemann Heinrich
Nöth Elmar
Schukat-Talamazzini Ernst Günter
Zottmann A.
Publication venue: Other institutions. DFKI Deutsches Forschungszentrum für Künstliche Intelligenz (German Research Center for Artificial Intelligence)
Publication date: 01/01/1995

Prosodic boundary detection is important for disambiguating parses, especially in spontaneous speech, where elliptic sentences occur frequently. Word graphs are an efficient interface between the word recognizer and the parser. Prosodic classification of word chains has been published earlier; this paper discusses the adjustments needed to apply those classification techniques to word graphs. When classifying a word hypothesis, a set of context words has to be determined appropriately. A method has been developed that uses stochastic language models for prosodic classification, and it has also been adapted for use on word graphs. We also improved the set of acoustic-prosodic features, which reduced recognition errors by about 60% on the read speech we worked on previously; the error rate is now 10% for 3 boundary classes and 3% for 2 accent classes. Moving to spontaneous speech, the recognition error increases significantly (e.g., 16% for a 2-class boundary task). We show that even on word graphs, combining acoustic-prosodic classifiers with language models that model a larger context reduces the recognition error by up to 50%.

CiteSeerX

Universaar

In search of isoglosses: continuous and discrete language embeddings in Slavic historical phonology

Author: Cathcart Chundra A.
Wandl Florian
Publication date: 01/01/2020

This paper investigates the ability of neural network architectures to effectively learn diachronic phonological generalizations in a multilingual setting. We employ models using three different types of language embedding (dense, sigmoid, and straight-through). We find that the Straight-Through model outperforms the other two in terms of accuracy, but the Sigmoid model's language embeddings show the strongest agreement with the traditional subgrouping of the Slavic languages. We find that the Straight-Through model has learned coherent, semi-interpretable information about sound change, and outline directions for future research.

arXiv.org e-Print Archive

Crossref

ZORA

A Study on Acoustic Processing That Is Natural for Humans (ヒトニトッテシゼンナオンキョウショリニカンスルケンキュウ)

Author: Tachibana Ryuki
タチバナリュウキ
立花隆輝

Osaka University Knowledge Archive