
    HCI for the deaf community: developing human-like avatars for sign language synthesis

    With ever-increasing computing power and advances in 3D animation technologies, it is no surprise that 3D avatars for sign language (SL) generation are advancing too. Traditionally these avatars have been driven by somewhat expensive and inflexible motion capture technologies, which is perhaps why avatars feature in only a few user interfaces (UIs). SL synthesis is a competing technology that is less costly and more versatile, and it may prove to be the answer to the current lack of access for the Deaf in HCI. This paper outlines the current state of the art in SL synthesis for HCI and how we propose to advance it by improving avatar quality and realism, with a view to ameliorating communication and computer interaction for the Deaf community as part of a wider localisation project.

    Localization of speech synthesis systems for Irish voice web operations

    As the information age creates growing consumer demand for information anytime and anyplace, voice applications and voice-based services are dramatically changing the way people and businesses communicate. Voice portals established an early market, but the development of advanced voice applications has introduced speech technology to the masses.

    Phonetic inventory for an Arabic speech corpus

    Corpus design for speech synthesis is a well-researched topic in languages such as English but less so for Modern Standard Arabic (MSA), and there is a tendency to focus on methods (usually greedy methods) that automatically generate the orthographic transcript to be recorded. In this work, a study of MSA phonetics and phonology is conducted in order to create criteria for a greedy method to create a speech corpus transcript for recording. The size of the dataset is reduced a number of times using these optimisation methods with different parameters, yielding a much smaller dataset with phonetic coverage identical to that before the reduction, and this output transcript is chosen for recording. This is part of a larger work to create a completely annotated and segmented speech corpus for MSA.
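    The greedy reduction described in this abstract can be sketched as a set-cover-style selection: repeatedly pick the sentence that contributes the most uncovered phonetic units until the full corpus's coverage is reached. The sketch below is illustrative only; the function and data-structure names are assumptions, not from the paper.

```python
def greedy_select(sentences):
    """sentences: list of (text, set_of_phonetic_units) pairs.
    Returns a subset whose combined phonetic coverage equals that
    of the full list (a greedy approximation, not guaranteed minimal)."""
    # Overall coverage target: every unit that appears anywhere.
    target = set().union(*(units for _, units in sentences))
    covered, chosen = set(), []
    remaining = list(sentences)
    while covered != target:
        # Pick the sentence contributing the most not-yet-covered units.
        best = max(remaining, key=lambda s: len(s[1] - covered))
        if not (best[1] - covered):
            break  # no remaining sentence adds new coverage
        chosen.append(best)
        covered |= best[1]
        remaining.remove(best)
    return chosen
```

    Running the selection several times with different scoring parameters, as the abstract describes, would amount to varying the `key` function while keeping the same coverage check.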

    Vowel synthesis using feed-forward neural networks


    'The Difference Between Us': Using Early Medieval Northern European Texts in the Creation of a Work for Instrumental Ensemble, Voices and Electronics

    The aim of this investigation was to explore ways of using untranslated early medieval texts in contemporary musical composition, drawing on literature in Old English, medieval Irish, Old Norse and Middle Welsh. The ultimate goal of this research was to compose a chamber opera for three singers, ensemble and electronics. Despite focusing on the use of text, this is not a literary or linguistic research project. While a knowledge of languages, form and metre has been crucial to my work, the texts have been treated as an element of the creative compositional process. The music has not been written to exemplify text, but to explore and extrapolate the ideas that might arise from it. Moreover, although the texts under consideration were from early medieval northern Europe, the project did not address issues of historical performance practice. Neither was there any attempt to recreate a historical or imagined form of early music. Instead, the texts were used for the literary and sonic content that they provided. The musical language with which these ideas were expressed is my own, which owes its development both to contemporary music and to the legacy of the twentieth century. The first chapter introduces the background to the project, with reference to contemporary composers whose works have informed and influenced the development of my ideas. This is followed by a brief description of the major piece, a chamber opera for three singers, ensemble and electronics entitled The Difference Between Us. In order to hone and explore the various approaches that had the potential to be used in the chamber opera, it was necessary to compose a variety of supporting works. The first supporting work, We Are Apart; Our Song Together, is discussed in detail in Chapter Two, since the composition was included in its entirety as part of the final work. 
Additional supporting works are discussed in Chapter Three, with sections of this chapter devoted to the vocal, instrumental and electronic compositions of the portfolio. The ideas that were developed in these three compositional genres achieved synthesis in the final work, The Difference Between Us, which is discussed in Chapter Four. In writing The Difference Between Us, the supporting compositions provided invaluable preparatory research into the ways in which early medieval texts could shape the musical structure and content of the work at every level, from surface detail through to global structure. However, the use of untranslated texts in a chamber opera raised profound questions regarding communication and narrative. The form, structure and content of The Difference Between Us arose precisely as an attempt to answer these questions. Rather than limiting the scope of the chamber opera, the early medieval texts became the cornerstone of the musical structure and drama of the work. These conclusions are discussed and evaluated in Chapter Five.

    Spectral discontinuity in concatenative speech synthesis – perception, join costs and feature transformations

    This thesis explores the problem of determining an objective measure to represent human perception of spectral discontinuity in concatenative speech synthesis. Such measures are used as join costs to quantify the compatibility of speech units for concatenation in unit selection synthesis. No previous study has reported a spectral measure that satisfactorily correlates with human perception of discontinuity. An analysis of the limitations of existing measures and our understanding of the human auditory system were used to guide the strategies adopted to advance a solution to this problem. A listening experiment was conducted using a database of concatenated speech, with results indicating the perceived continuity of each concatenation. The results of this experiment were used to correlate proposed measures of spectral continuity with the perceptual results. A number of standard speech parametrisations and distance measures were tested as measures of spectral continuity and analysed to identify their limitations. Time-frequency resolution was found to limit the performance of standard speech parametrisations. As a solution to this problem, measures of continuity based on the wavelet transform were proposed and tested, as wavelets offer superior time-frequency resolution to standard spectral measures. A further limitation of standard speech parametrisations is that they are typically computed from the magnitude spectrum. However, the auditory system combines information relating to the magnitude spectrum, phase spectrum and spectral dynamics. The potential of phase and spectral dynamics as measures of spectral continuity was investigated. One widely adopted approach to detecting discontinuities is to compute the Euclidean distance between feature vectors about the join in concatenated speech. The detection of an auditory event, such as the detection of a discontinuity, involves processing high up the auditory pathway in the central auditory system.
The basic Euclidean distance cannot model such behaviour. A study was conducted to investigate feature transformations with sufficient processing complexity to mimic high-level auditory processing. Neural networks and principal component analysis were investigated as feature transformations. Wavelet-based measures were found to outperform all measures of continuity based on standard speech parametrisations. Phase and spectral dynamics based measures were found to correlate with human perception of discontinuity in the test database, although neither measure was found to contribute a significant increase in performance when combined with standard measures of continuity. Neural network feature transformations were found to significantly outperform all other measures tested in this study, producing correlations with perceptual results in excess of 90%.
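    The baseline join cost the abstract refers to, the Euclidean distance between feature vectors on either side of a concatenation point, can be sketched as follows. This is a minimal illustration, assuming the feature vectors (e.g. frames of cepstral coefficients) are already extracted; the function name is illustrative, not from the thesis.

```python
import math

def euclidean_join_cost(left_frame, right_frame):
    """Euclidean distance between the final feature frame of the left
    unit and the first feature frame of the right unit; larger values
    suggest a more audible spectral discontinuity at the join."""
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(left_frame, right_frame)))
```

    The thesis's point is that this raw distance, however the frames are parametrised, is too simple to mimic the high-level auditory processing involved in perceiving a discontinuity, motivating learned feature transformations such as neural networks applied before the distance is taken.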

    Interaction Design for Digital Musical Instruments

    The thesis aims to elucidate the process of designing interactive systems for musical performance that combine software and hardware in an intuitive and elegant fashion. The original contribution to knowledge consists of: (1) a critical assessment of recent trends in digital musical instrument design, (2) a descriptive model of interaction design for the digital musician and (3) a highly customisable multi-touch performance system that was designed in accordance with the model. Digital musical instruments are composed of a separate control interface and a sound generation system that exchange information. When designing the way in which a digital musical instrument responds to the actions of a performer, we are creating a layer of interactive behaviour that is abstracted from the physical controls. Often, the structure of this layer depends heavily upon: (1) the accepted design conventions of the hardware in use; (2) established musical systems, acoustic or digital; and (3) the physical configuration of the hardware devices and the grouping of controls that such a configuration suggests. This thesis proposes an alternative way to approach the design of digital musical instrument behaviour: examining the implicit characteristics of its composite devices. When we separate the conversational ability of a particular sensor type from its hardware body, we can look in a new way at the actual communication tools at the heart of the device. We can subsequently combine these separate pieces using a series of generic interaction strategies in order to create rich interactive experiences that are not immediately obvious or directly inspired by the physical properties of the hardware. This research ultimately aims to enhance and clarify the existing toolkit of interaction design for the digital musician.
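    The "layer of interactive behaviour" abstracted from the physical controls can be sketched as a small mapping layer: raw sensor readings are normalised and then routed through an interchangeable strategy to a sound-generation parameter, so the same sensor can drive different parameters and vice versa. All names and ranges below are hypothetical illustrations, not taken from the thesis.

```python
def normalise(value, lo, hi):
    """Scale a raw sensor reading into the 0..1 range, clamping at the ends."""
    return max(0.0, min(1.0, (value - lo) / (hi - lo)))

def linear_map(x, out_lo, out_hi):
    """One example interaction strategy: a straight linear mapping."""
    return out_lo + x * (out_hi - out_lo)

def map_sensor_to_param(raw, sensor_range, param_range, strategy=linear_map):
    """Generic layer: any sensor, any parameter, any strategy function.
    Decouples the control interface from the sound generation system."""
    x = normalise(raw, *sensor_range)
    return strategy(x, *param_range)
```

    Swapping `strategy` for, say, an exponential curve or a gesture-history function changes the instrument's behaviour without touching either the hardware or the synthesis engine, which is the decoupling the abstract argues for.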