78,340 research outputs found

    On the Computation of the Kullback-Leibler Measure for Spectral Distances

    Get PDF
    Efficient algorithms for the exact and approximate computation of the symmetrical Kullback-Leibler (1998) measure for spectral distances are presented for linear predictive coding (LPC) spectra. A interpretation of this measure is given in terms of the poles of the spectra. The performances of the algorithms in terms of accuracy and computational complexity are assessed for the application of computing concatenation costs in unit-selection-based speech synthesis. With the same complexity and storage requirements, the exact method is superior in terms of accuracy

    Study to determine potential flight applications and human factors design guidelines for voice recognition and synthesis systems

    Get PDF
    A study was conducted to determine potential commercial aircraft flight deck applications and implementation guidelines for voice recognition and synthesis. At first, a survey of voice recognition and synthesis technology was undertaken to develop a working knowledge base. Then, numerous potential aircraft and simulator flight deck voice applications were identified and each proposed application was rated on a number of criteria in order to achieve an overall payoff rating. The potential voice recognition applications fell into five general categories: programming, interrogation, data entry, switch and mode selection, and continuous/time-critical action control. The ratings of the first three categories showed the most promise of being beneficial to flight deck operations. Possible applications of voice synthesis systems were categorized as automatic or pilot selectable and many were rated as being potentially beneficial. In addition, voice system implementation guidelines and pertinent performance criteria are proposed. Finally, the findings of this study are compared with those made in a recent NASA study of a 1995 transport concept

    Design of multimedia processor based on metric computation

    Get PDF
    Media-processing applications, such as signal processing, 2D and 3D graphics rendering, and image compression, are the dominant workloads in many embedded systems today. The real-time constraints of those media applications have taxing demands on today's processor performances with low cost, low power and reduced design delay. To satisfy those challenges, a fast and efficient strategy consists in upgrading a low cost general purpose processor core. This approach is based on the personalization of a general RISC processor core according the target multimedia application requirements. Thus, if the extra cost is justified, the general purpose processor GPP core can be enforced with instruction level coprocessors, coarse grain dedicated hardware, ad hoc memories or new GPP cores. In this way the final design solution is tailored to the application requirements. The proposed approach is based on three main steps: the first one is the analysis of the targeted application using efficient metrics. The second step is the selection of the appropriate architecture template according to the first step results and recommendations. The third step is the architecture generation. This approach is experimented using various image and video algorithms showing its feasibility

    Assessment of cockpit interface concepts for data link retrofit

    Get PDF
    The problem is examined of retrofitting older generation aircraft with data link capability. The approach taken analyzes requirements for the cockpit interface, based on review of prior research and opinions obtained from subject matter experts. With this background, essential functions and constraints for a retrofit installation are defined. After an assessment of the technology available to meet the functions and constraints, candidate design concepts are developed. The most promising design concept is described in detail. Finally, needs for further research and development are identified

    17 ways to say yes:Toward nuanced tone of voice in AAC and speech technology

    Get PDF
    People with complex communication needs who use speech-generating devices have very little expressive control over their tone of voice. Despite its importance in human interaction, the issue of tone of voice remains all but absent from AAC research and development however. In this paper, we describe three interdisciplinary projects, past, present and future: The critical design collection Six Speaking Chairs has provoked deeper discussion and inspired a social model of tone of voice; the speculative concept Speech Hedge illustrates challenges and opportunities in designing more expressive user interfaces; the pilot project Tonetable could enable participatory research and seed a research network around tone of voice. We speculate that more radical interactions might expand frontiers of AAC and disrupt speech technology as a whole

    Relating Objective and Subjective Performance Measures for AAM-based Visual Speech Synthesizers

    Get PDF
    We compare two approaches for synthesizing visual speech using Active Appearance Models (AAMs): one that utilizes acoustic features as input, and one that utilizes a phonetic transcription as input. Both synthesizers are trained using the same data and the performance is measured using both objective and subjective testing. We investigate the impact of likely sources of error in the synthesized visual speech by introducing typical errors into real visual speech sequences and subjectively measuring the perceived degradation. When only a small region (e.g. a single syllable) of ground-truth visual speech is incorrect we find that the subjective score for the entire sequence is subjectively lower than sequences generated by our synthesizers. This observation motivates further consideration of an often ignored issue, which is to what extent are subjective measures correlated with objective measures of performance? Significantly, we find that the most commonly used objective measures of performance are not necessarily the best indicator of viewer perception of quality. We empirically evaluate alternatives and show that the cost of a dynamic time warp of synthesized visual speech parameters to the respective ground-truth parameters is a better indicator of subjective quality

    Speech Synthesis Based on Hidden Markov Models

    Get PDF
    corecore