113 research outputs found

    Incorporating typographic, logical and layout knowledge of documents into text-to-speech

    Although Text-to-Speech (TtS) is considered a mature technology capable of producing synthetic speech of very high quality, current TtS systems do not effectively convey in audio the semantics and the cognitive aspects of the visual knowledge (such as typographic cues) and non-visual knowledge (such as the logical structure) embedded in rich text documents. In this paper, after introducing an appropriate document architecture, we analyze the semantics of the document signals. Then, following a Design-for-All methodology, we present the Document-to-Audio approach we have developed for automatically rendering document signals from the typographic, logical and layout layers to the auditory modality. © 2013 The authors and IOS Press. All rights reserved

    Augmented auditory representation of e-texts for text-to-speech systems

    Emerging electronic text formats include hierarchical structure and visualization-related information that current Text-to-Speech (TtS) systems ignore. In this paper we present a novel approach for composing a detailed auditory representation of e-texts using speech and audio. Furthermore, we provide a scripting language (CAD scripts) for defining specific customizations of the operation of a TtS system. CAD scripts can also be assigned to specific text meta-data to enable their discrete auditory representation. This approach can serve as a means for a detailed exchange of functionality across different TtS implementations. Moreover, it can be hosted in current TtS systems with minor (or major) modifications. Finally, we briefly present the implementation of DEMOSTHeNES Composer for augmented auditory generation of meta-text using the above methodology. © Springer-Verlag Berlin Heidelberg 2001

    Tone-Group F0 selection for modeling focus prominence in small-footprint speech synthesis

    This work aims to improve the naturalness of synthetic intonational contours in Text-to-Speech synthesis through the provision of prominence, which is a major expression of human speech. Focusing on the tonal dimension of emphasis, we present a robust unit-selection methodology for generating realistic F0 curves in cases where focus prominence is required. The proposed approach is based on selecting Tone-Group units from commonly used prosodic corpora that are automatically transcribed as patterns of syllables. In contrast to related approaches, patterns represent only the most perceivable sections of the sampled curves and are encoded to serve morphologically different sequences of syllables. This minimizes the number of units required to achieve sufficient coverage within the database. In turn, this optimization enables the application of high-quality F0 generation to small-footprint text-to-speech synthesis. For generic F0 selection we query the database based on sequences of ToBI labels, though other intonational frameworks can be used as well. To realize focus prominence on specific Tone-Groups, the selection also incorporates a level indicator of emphasis. We set up a series of listening tests using a database built from a 482-utterance corpus, which featured partially purpose-uttered emphasis. The results showed a clear subjective preference for the proposed model over a linear regression one in 75% of the cases when used in generic synthesis. Furthermore, this model provided an ambiguous percept of emphasis in an experiment featuring major and minor degrees of prominence. © 2006 Elsevier B.V. All rights reserved

    Transforming spontaneous telegraphic language to well-formed Greek sentences for alternative and augmentative communication

    The domain of Augmentative and Alternative Communication (AAC) studies appropriate techniques and systems that enhance the remaining, or compensate for the non-existing, abilities for interpersonal communication. Some AAC users apply telegraphic language, either because they attempt to speed up interactive communication or because they are language impaired. In many AAC aids, a “sentence” is formulated by combining symbols of an icon-based communication system. To be accepted by the communication partner, the output should be a correct oral sentence of a natural language. In this paper we present our effort to develop a novel technique for expanding spontaneous telegraphic input to well-formed Greek sentences by adopting a feature-based surface realization for Natural Language generation. We first describe the general architecture of the system, which accepts compressed, incomplete, grammatically and syntactically ill-formed text and produces a correct full sentence. The NLP techniques of the two main modules, named preprocessor and translator/generator, are then analyzed. A prototype system has been developed using Component Based Technology (CBT) and is under field evaluation by a number of speech-disabled users. Currently it fully supports the BLISS and MAKATON icon-based communication systems. Some limitations of the module are also discussed, along with possibilities for future expansions. © Springer-Verlag Berlin Heidelberg 2002

    Design and developing methodology for 8-dot braille code systems

    Braille code, employing six embossed dots evenly arranged in rectangular letter spaces or cells, constitutes the dominant touch reading or typing system for the blind. Because the 6-dot cell is limited to 63 possible dot combinations, a number of applications, such as mathematics and sciences, and assistive technologies, such as braille displays, extend it to an 8-dot cell. This work proposes a language-independent methodology for the systematic development of an 8-dot braille code. Moreover, a set of design principles is introduced that focuses on: achieving an abbreviated representation of the supported symbols, retaining connectivity with the 6-dot representation, preserving similarity to the transition rules applied in other languages, removing ambiguities, and considering future extensions. The proposed methodology was successfully applied in the development of an 8-dot literary Greek braille code that covers both the modern and the ancient Greek orthography, including diphthongs, digits, and punctuation marks. © 2013 Springer-Verlag Berlin Heidelberg
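
The figure of 63 combinations in the abstract above follows from counting the non-empty subsets of six dots (2^6 - 1); the same count for eight dots gives 255. The Python sketch below illustrates this arithmetic and shows how a set of raised dots maps to a character in the Unicode Braille Patterns block. It is only an illustration and not the methodology of the paper: the function names `count_patterns`, `dots_to_char` and `keeps_six_dot_core` are hypothetical, and the last one is just one assumed reading of "retaining connectivity with the 6-dot representation".

```python
BRAILLE_BASE = 0x2800  # first code point of the Unicode Braille Patterns block


def count_patterns(num_dots: int) -> int:
    """Number of cells with at least one raised dot (blank cell excluded)."""
    return 2 ** num_dots - 1


def dots_to_char(dots) -> str:
    """Map raised dot numbers (1..8) to the matching Unicode braille character.

    In the Unicode block, dot n corresponds to bit n-1 of the offset from U+2800.
    """
    mask = 0
    for d in dots:
        if not 1 <= d <= 8:
            raise ValueError(f"dot number out of range: {d}")
        mask |= 1 << (d - 1)
    return chr(BRAILLE_BASE + mask)


def keeps_six_dot_core(dots8, dots6) -> bool:
    """Hypothetical check: does an 8-dot cell reuse a 6-dot cell unchanged
    in dots 1-6, adding only dots 7 and/or 8? (An assumed interpretation,
    not a rule taken from the paper.)"""
    return {d for d in dots8 if d <= 6} == set(dots6)


if __name__ == "__main__":
    print(count_patterns(6), count_patterns(8))
    print(dots_to_char([1]), dots_to_char([1, 7]))
    print(keeps_six_dot_core([1, 7], [1]))
```

Under this assumed reading, an 8-dot symbol formed by adding dot 7 or 8 to an existing 6-dot cell stays recognizable to readers of the 6-dot code, which is one way the extra 192 patterns could be allocated.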