82 research outputs found

    Optimizing Phonetic Encoding for Viennese Unit Selection Speech Synthesis

    While developing lexical resources for a particular language variety (Viennese), we experimented with a set of 5 different phonetic encodings, termed phone sets, used for unit selection speech synthesis. We started with a very rich phone set based on phonological considerations and covering as much phonetic variability as possible, which was then reduced to smaller sets by applying transformation rules that map or merge phone symbols. The optimal trade-off was found by measuring the phone error rates of automatically learnt grapheme-to-phone rules and by a perceptual evaluation of 27 representative synthesized sentences. Further, we describe a method to semi-automatically enlarge the lexical resources for the target language variety using a lexicon base for Standard Austrian German.

    Resources for speech synthesis of Viennese varieties

    This paper describes our work on developing corpora of three varieties of Viennese for unit selection speech synthesis. The synthetic voices for Viennese varieties, implemented with Multisyn, the open-domain unit selection speech synthesis engine of Festival, will also be released within Festival. The paper focuses on two questions: how we selected the appropriate speakers and how we obtained the text sources needed for recording these non-standard varieties. Regarding the first, it turned out that working with a 'prototypical' professional speaker was much preferable to striving for authenticity. In addition, we give a brief outline of the differences between the Austrian standard and its dialectal varieties and of how we solved certain technical problems related to these differences. In particular, the specific set of phones applicable to each variety had to be determined by applying various constraints. Since such a set does not serve any descriptive purpose but rather influences the quality of speech synthesis, the careful design of such a (in most cases reduced) set was an important task.

    sistemi di interazione vocale per la domotica

    One of the open questions in home automation is how to build human-machine interfaces that are not only effective for controlling the available functions but also easily accessible. Voice is the natural way to communicate requests and commands, so a speech interface offers considerable advantages over solutions such as touch screens and switches. This thesis aims to study and build a speech interaction system for home automation that can not only recognize individual spoken commands but also customize the requested services through speaker recognition and interact by means of synthesized speech. For each type of speech interaction, solutions are proposed to overcome the limitations of the classical approaches in the literature. First, a distributed speech recognition (DSR) system for voice control of a lighting system is presented; it implements ad hoc optimizations to operate non-invasively in the environment and to address the typical problems of a real scenario. The DSR system can also be integrated with a speaker identification algorithm so that spoken commands are customized according to the recognized user. In home automation, a speaker identification system must be able to classify the user from utterances shorter than 5 s. To this end, an algorithm based on the truncated Karhunen-Loève transform is proposed that, on short speech sequences (< 3.5 s), performs better than the conventional technique based on Mel-cepstral coefficients. Finally, a novel Hidden Markov Model/unit-selection speech synthesis framework based on the Modified Discrete Cosine Transform is presented, which guarantees perfect reconstruction of the speech signal and overcomes the main limitations of the Mel-cepstral technique. The algorithms and the proposed system will be applied to signals acquired under realistic conditions in order to verify their adequacy.

    Topics in Programming Languages, a Philosophical Analysis through the case of Prolog

    Programming languages seldom find proper anchorage in philosophy of logic, language and science. What is more, philosophy of language seems to be restricted to natural languages and linguistics, and even philosophy of logic is rarely framed in terms of programming language topics. The logic programming paradigm and Prolog are, thus, the most adequate paradigm and programming language to work on this subject, combining natural language processing and linguistics, logic programming and constriction methodology on both algorithms and procedures, on an overall philosophizing declarative status. Not only this, but the dimension of the Fifth Generation Computer system related to strong AI, wherein Prolog took a major role, and its historical frame in the very crucial dialectic between procedural and declarative paradigms, structuralist and empiricist biases, serve, in exemplary form, to treat philosophy of logic, language and science directly in the contemporary age as well. In recounting Prolog's philosophical, mechanical and algorithmic harbingers, the opportunity is open to various routes. We herein shall exemplify some: the mechanical-computational background explored by Pascal, Leibniz, Boole, Jacquard, Babbage and Konrad Zuse, up to the ACE (Alan Turing) and EDVAC (von Neumann), offering the backbone in computer architecture, and the work of Turing, Church, Gödel, Kleene, von Neumann, Shannon and others on computability, in parallel lines, thoroughly studied, permit us to interpret the evolving realm of programming languages. The proper line from lambda-calculus to the Algol family, the declarative and procedural split with the C language and Prolog, and the ensuing branching, explosion and further delimitation of programming languages are thereupon inspected so as to relate them to the proper syntax, semantics and philosophical élan of logic programming and Prolog.

    The evolution of language: Proceedings of the Joint Conference on Language Evolution (JCoLE)


    The perceptual flow of phonetic feature processing


    Across frequency processes involved in auditory detection of coloration
