6 research outputs found

    Close Copy Speech Synthesis for Speech Perception Testing

    The present study develops a speech synthesis subcomponent for perception testing in the context of evaluating cochlear implants in children. We provide a detailed requirements analysis and develop a strategy for maximally high-quality speech synthesis using Close Copy Speech synthesis techniques with the diphone-based speech synthesiser MBROLA. In this work, close copy is defined as a function that maps a pair consisting of a speech signal recording and a phonemic annotation aligned with that recording onto the pronunciation specification interface of the speech synthesiser. The design procedure has three phases: Manual Close Copy Speech (MCCS) synthesis as a "best case gold standard", in which the function is implemented manually as a preliminary step; Automatic Close Copy Speech (ACCS) synthesis, in which the steps taken in the manual transformation are emulated by software; and finally Parametric Close Copy Speech (PCCS) synthesis, in which prosodic parameters are modifiable while the diphones are retained. This contribution reports on the MCCS and ACCS synthesis phases.
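As a rough illustration of the close copy function described in this abstract, the sketch below maps a time-aligned phonemic annotation and a few pitch measurements to lines in MBROLA's `.pho` input format (phoneme label, duration in milliseconds, then optional pairs of position in percent and pitch in Hz). The function name `close_copy_pho`, the tuple layouts and the sample values are assumptions made for this example; the abstract does not give the study's actual MCCS or ACCS procedure.

```python
def close_copy_pho(segments, f0_points=()):
    """Map an aligned annotation to MBROLA .pho lines (illustrative sketch).

    segments:  list of (label, start_ms, end_ms), hypothetical layout;
               labels must belong to the phoneme set of the MBROLA voice used.
    f0_points: list of (time_ms, pitch_hz) measured on the recording.
    """
    lines = []
    for label, start, end in segments:
        duration = end - start
        if duration <= 0:
            raise ValueError(f"segment {label!r} has no positive duration")
        fields = [label, str(duration)]
        for time, hertz in f0_points:
            if start <= time < end:
                # Pitch targets are placed at a percentage of the segment duration.
                position = round(100 * (time - start) / duration)
                fields += [str(position), str(round(hertz))]
        lines.append(" ".join(fields))
    return "\n".join(lines)


# Hypothetical annotation: "_" is the MBROLA silence symbol.
annotation = [("_", 0, 100), ("h", 100, 180), ("a", 180, 330)]
pitch = [(200, 120.0), (300, 110.0)]
print(close_copy_pho(annotation, pitch))
```

Because durations and pitch targets are copied from the recording rather than predicted, the synthetic output keeps the timing and melody of the original utterance, which is the point of a close copy.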

    Dialogue system development for an emergency scenario

    Dialogue systems are commonly used in call centres, where they support human telephone operators in their work. For the present work, a corpus linguistic study was performed with the aim of finding patterns in dialogue which can be used to model typical dialogues, and of identifying stimuli which trigger alignment behaviours of particular types in humans. The main focus was on alignment at the semantic level. A model of human-computer dialogue is developed and implemented in a prototype dialogue system for an emergency scenario. The dialogue between the human and the computer is handled by two linked finite state automata (FSA): one for the dialogue manager and one for the map traversal.
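The idea of two linked finite state automata can be sketched as follows: one automaton tracks the position on the map, the other tracks the dialogue state, and the outcome of each map move is fed to the dialogue automaton as its input symbol. All state names, input symbols and the tiny map are hypothetical; the abstract does not specify the prototype's actual automata.

```python
class FSA:
    """Minimal deterministic finite state automaton."""

    def __init__(self, start, transitions):
        self.state = start
        self.transitions = transitions  # {(state, symbol): next_state}

    def step(self, symbol):
        """Follow a transition if one exists; report whether it did."""
        key = (self.state, symbol)
        if key not in self.transitions:
            return False
        self.state = self.transitions[key]
        return True


def make_automata():
    # Hypothetical map: landmarks are states, directions are input symbols.
    map_fsa = FSA("bridge", {
        ("bridge", "north"): "church",
        ("church", "east"): "hospital",
    })
    # Hypothetical dialogue manager: it reacts to the outcome of map moves.
    dialogue_fsa = FSA("ask_direction", {
        ("ask_direction", "moved"): "confirm",
        ("ask_direction", "blocked"): "repair",
        ("repair", "moved"): "confirm",
        ("repair", "blocked"): "repair",
        ("confirm", "continue"): "ask_direction",
    })
    return map_fsa, dialogue_fsa


def handle_instruction(map_fsa, dialogue_fsa, direction):
    """Link the automata: the map result becomes the dialogue input."""
    outcome = "moved" if map_fsa.step(direction) else "blocked"
    dialogue_fsa.step(outcome)
    return dialogue_fsa.state, map_fsa.state


map_fsa, dialogue_fsa = make_automata()
print(handle_instruction(map_fsa, dialogue_fsa, "south"))
print(handle_instruction(map_fsa, dialogue_fsa, "north"))
```

Keeping the map in its own automaton means the dialogue manager never needs to know the map layout; it only sees whether a requested move succeeded.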

    Komunikacyjne dopasowanie mowy syntetycznej (Communicative alignment of synthetic speech)

    Wydział Neofilologii: Instytut Językoznawstwa (Faculty of Modern Languages: Institute of Linguistics)
    The central claim of the thesis is that a dialogue system should be well motivated by dialogue theory and by analysis of actual dialogues, and that the resulting system should be tested in a real-world scenario. The thesis concentrates on methodology and investigates a range of methods, from discourse theory through corpus linguistic studies to automata theory, in order to fulfil this requirement. The operational aim is to provide a basic proof-of-concept dialogue system which is based on this claim and combines written and spoken communication. The operational aim is therefore not to develop a fully functional, engineering-standard dialogue system, but a prototype which demonstrates the methodology of the thesis and the fulfilment of the main claim in a simulated but realistic stressful scenario. An emergency scenario and a map-task dialogue were chosen as an example, and alignment of semantic representations of the map was claimed to be essential for successful communication. Linguistic specifications were outlined and their implications for the development of the spoken dialogue system were discussed. A prototype dialogue system using these specifications was developed and successfully evaluated with human users. The prototype system combines text input with speech output, with a dialogue engine based on two linked finite state automata: one for the dialogue manager and one for map traversal.

    Duration and speed of speech events: A selection of methods

    Gibbon D, Klessa K, Bachan J. Duration and speed of speech events: A selection of methods. Lingua Posnaniensis. 2015;56(1):59-83.
    The study of speech timing, i.e. the duration and speed or tempo of speech events, has increased in importance over the past twenty years, in particular in connection with increased demands for accuracy, intelligibility and naturalness in speech technology, with applications in language teaching and testing, and with the study of speech timing patterns in language typology. However, the methods used in such studies are very diverse, and so far there is no accessible overview of these methods. Since the field is too broad for us to provide an exhaustive account, we have made two choices: first, to provide a framework of paradigmatic (classificatory), syntagmatic (compositional) and functional (discourse-oriented) dimensions for duration analysis; and second, to provide worked examples of a selection of methods associated primarily with these three dimensions. Some of the methods covered are established state-of-the-art approaches (e.g. the paradigmatic Classification and Regression Trees, CART, analysis), while others are discussed in a critical light (e.g. so-called 'rhythm metrics'). A set of syntagmatic approaches applies to the tokenisation and tree parsing of duration hierarchies, based on speech annotations, and a functional approach describes duration distributions with sociolinguistic variables. Several of the methods are supported by a new web-based software tool for analysing annotated speech data, the Time Group Analyser.
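To make the notion of a 'rhythm metric' concrete, the sketch below computes the normalised Pairwise Variability Index (nPVI), one widely used metric of this kind, from a sequence of interval durations. The abstract does not say which metrics the paper examines, so the choice of nPVI, the function name `npvi` and the sample durations are assumptions made for this example.

```python
def npvi(durations):
    """Normalised Pairwise Variability Index of a duration sequence.

    durations: interval durations (e.g. vowel durations in ms) in
               utterance order; the values below are invented.
    """
    if len(durations) < 2:
        raise ValueError("nPVI needs at least two intervals")
    total = 0.0
    for current, following in zip(durations, durations[1:]):
        # Difference of neighbouring durations, normalised by their mean.
        total += abs(current - following) / ((current + following) / 2)
    return 100 * total / (len(durations) - 1)


print(npvi([100, 100, 100]))  # equal durations give 0.0
print(npvi([120, 60, 140, 70]))
```

The metric only compares neighbouring intervals, so it captures local alternation of long and short durations and says nothing about larger hierarchical timing patterns.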