2,181 research outputs found

    How to improve TTS systems for emotional expressivity

    Several experiments have revealed weaknesses in the emotional expressivity of current Text-To-Speech (TTS) systems. Although some TTS systems allow XML-based representations of prosodic and/or phonetic variables, few publications have considered using intelligent text processing as a pre-processing stage to detect affective information that can be used to tailor the parameters needed for emotional expressivity. This paper describes a technique for automatic prosodic parameterization based on affective clues. The technique recognizes the affective information conveyed in a text and, according to its emotional connotation, assigns appropriate pitch accents and other prosodic parameters by XML-tagging. This pre-processing helps the TTS system generate synthesized speech that contains emotional clues. The experimental results are encouraging and suggest that suitable emotional expressivity in speech synthesis is possible.
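
The pre-processing idea in this abstract (detect affect in the text, then emit XML prosody tags) can be illustrated with a minimal sketch. Everything specific here is an assumption for illustration: the tiny `AFFECT_LEXICON`, the `PROSODY` values, and the helper names are hypothetical, and the output uses SSML-style `<prosody>` markup rather than whatever tag set the paper actually used.

```python
from xml.sax.saxutils import escape

# Hypothetical affect lexicon: word -> emotion label (illustrative only).
AFFECT_LEXICON = {
    "happy": "joy", "wonderful": "joy", "delighted": "joy",
    "sad": "sadness", "lost": "sadness", "alone": "sadness",
    "furious": "anger", "hate": "anger",
}

# Hypothetical prosody settings per emotion; placeholder values,
# not parameters reported in the paper.
PROSODY = {
    "joy": {"pitch": "+15%", "rate": "110%"},
    "sadness": {"pitch": "-10%", "rate": "85%"},
    "anger": {"pitch": "+5%", "rate": "115%"},
}


def detect_emotion(sentence):
    """Return the most frequent emotion among lexicon hits, or None."""
    counts = {}
    for token in sentence.lower().split():
        word = token.strip(".,;:!?\"'")
        emotion = AFFECT_LEXICON.get(word)
        if emotion:
            counts[emotion] = counts.get(emotion, 0) + 1
    if not counts:
        return None
    return max(counts, key=counts.get)


def tag_sentence(sentence):
    """Wrap a sentence in a prosody tag when an emotion is detected."""
    emotion = detect_emotion(sentence)
    text = escape(sentence)  # keep the XML well-formed
    if emotion is None:
        return text
    p = PROSODY[emotion]
    return '<prosody pitch="{}" rate="{}">{}</prosody>'.format(
        p["pitch"], p["rate"], text)


def to_ssml(sentences):
    """Join tagged sentences into a single SSML-style document."""
    body = " ".join(tag_sentence(s) for s in sentences)
    return "<speak>" + body + "</speak>"
```

A keyword lexicon is the simplest possible detector; the abstract's "intelligent text processing" presumably does more, so this only shows where the detected label feeds into the markup.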

    A point process framework for modeling electrical stimulation of the auditory nerve

    Model-based studies of auditory nerve responses to electrical stimulation can provide insight into the functioning of cochlear implants. Ideally, these studies can identify limitations in sound processing strategies and lead to improved methods for providing sound information to cochlear implant users. To accomplish this, models must accurately describe auditory nerve spiking while avoiding excessive complexity that would preclude large-scale simulations of populations of auditory nerve fibers and obscure insight into the mechanisms that influence neural encoding of sound information. In this spirit, we develop a point process model of the auditory nerve that provides a compact and accurate description of neural responses to electric stimulation. Inspired by the framework of generalized linear models, the proposed model consists of a cascade of linear and nonlinear stages. We show how each of these stages can be associated with biophysical mechanisms and related to models of neuronal dynamics. Moreover, we derive a semi-analytical procedure that uniquely determines each parameter in the model on the basis of fundamental statistics from recordings of single fiber responses to electric stimulation, including threshold, relative spread, jitter, and chronaxie. The model also accounts for refractory and summation effects that influence the responses of auditory nerve fibers to high pulse rate stimulation. Throughout, we compare model predictions to published physiological data and explain differences in auditory nerve responses to high and low pulse rate stimulation. We close by performing an ideal observer analysis of simulated spike trains in response to sinusoidally amplitude modulated stimuli and find that carrier pulse rate does not affect modulation detection thresholds. Comment: 1 title page, 27 manuscript pages, 14 figures, 1 table, 1 appendi
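
To make the "cascade of linear and nonlinear stages" concrete, here is a generic linear-nonlinear point process sketch. It is not the paper's model: the exponential leaky-integrator filter, the exponential nonlinearity, the absolute refractory period, and every parameter value (`tau`, `gain`, `threshold`, `base_rate`, `refractory`) are assumptions chosen only to show the structure.

```python
import math
import random


def simulate_spikes(stimulus, dt=1e-5, tau=1e-4, gain=8.0, threshold=1.0,
                    base_rate=1e3, refractory=5e-4, seed=0):
    """Simulate spike times (seconds) for a sampled stimulus current.

    All parameter values are illustrative placeholders.
    """
    rng = random.Random(seed)
    decay = math.exp(-dt / tau)
    filtered = 0.0
    last_spike = -math.inf
    spikes = []
    for i, s in enumerate(stimulus):
        t = i * dt
        # Linear stage: leaky integration of the stimulus (summation effect).
        filtered = decay * filtered + s
        # Absolute refractory period: no spike can occur yet.
        if t - last_spike < refractory:
            continue
        # Nonlinear stage: map filtered input to a conditional intensity
        # in spikes per second (exponent clipped to avoid overflow).
        exponent = min(gain * (filtered - threshold), 50.0)
        rate = base_rate * math.exp(exponent)
        # Point process stage: spike probability within one time bin.
        if rng.random() < 1.0 - math.exp(-rate * dt):
            spikes.append(t)
            last_spike = t
    return spikes


# Usage: a pulse train with one suprathreshold pulse every 1 ms.
pulse_train = [3.0 if i % 100 == 0 else 0.0 for i in range(1000)]
spike_times = simulate_spikes(pulse_train)
```

Because the stochastic stage is a per-bin Bernoulli draw, quantities such as threshold and jitter emerge from repeated simulation rather than being set directly, which is the kind of link between parameters and recorded statistics the abstract describes.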

    ONE-BIT QUANTIZER PARAMETRIZATION FOR ARBITRARY LAPLACIAN SOURCES

    In this paper we propose an exact formula for the total distortion of a one-bit quantizer for an arbitrary Laplacian probability density function (pdf). The proposed formula extends the normalized case of zero mean and unit variance, which is the most commonly applied quantization case, not only in traditional quantization but also in contemporary solutions that involve quantization. Additionally, the symmetrical quantizer’s representation levels are calculated from the minimal distortion criterion. Note that one-bit quantization is the most sensitive form of quantization from the standpoint of accuracy degradation and quantization error, which increases the importance of the proposed parameterization of the one-bit quantizer.
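
The abstract does not reproduce its formula, so the following is only the textbook mean-squared-error calculation under stated assumptions: the decision threshold sits at the mean `mu`, the two representation levels are `mu - y` and `mu + y`, and distortion is MSE. For a Laplacian source with standard deviation `sigma`, E|x - mu| = sigma / sqrt(2), which gives D(y) = sigma^2 - sqrt(2) * sigma * y + y^2. The function names are hypothetical.

```python
import math


def distortion(sigma, y):
    """MSE of a symmetric one-bit quantizer with levels mu - y and mu + y.

    Assumes a Laplacian source with standard deviation sigma and a
    decision threshold at the mean, so the result does not depend on mu.
    """
    return sigma ** 2 - math.sqrt(2.0) * sigma * y + y ** 2


def one_bit_laplacian(mu, sigma):
    """Return (lower level, upper level, minimal distortion).

    Setting dD/dy = 0 gives y = sigma / sqrt(2) and D_min = sigma^2 / 2.
    """
    y = sigma / math.sqrt(2.0)
    return mu - y, mu + y, sigma ** 2 / 2.0


# Usage: the normalized case (zero mean, unit variance).
low, high, d_min = one_bit_laplacian(0.0, 1.0)
```

In the normalized case this puts the levels at roughly -0.707 and +0.707 with a distortion of 0.5, meaning a one-bit quantizer removes only half of the source variance.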

    The Local Structure of Space-Variant Images

    Local image structure is widely used in theories of both machine and biological vision. The form of the differential operators describing this structure for space-invariant images has been well documented (e.g. Koenderink, 1984). Although space-variant coordinates are universally used in mammalian visual systems, the form of the operators in the space-variant domain has received little attention. In this report we derive the form of the most common differential operators and surface characteristics in the space-variant domain and show examples of their use. The operators include the Laplacian, the gradient and the divergence, as well as the fundamental forms of the image treated as a surface. We illustrate the use of these results by deriving the space-variant form of corner detection and image enhancement algorithms. The latter is shown to have interesting properties in the complex log domain, implicitly encoding a variable grid-size integration of the underlying PDE, allowing rapid enhancement of large scale peripheral features while preserving high spatial frequencies in the fovea. Office of Naval Research (N00014-95-I-0409
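
As one example of an operator in the complex log domain, the sketch below uses the standard conformal-map identity: with w = log z, u = log r and v = theta, the Cartesian Laplacian equals exp(-2u) * (g_uu + g_vv), where g is the image resampled in (u, v). The report's own derivation is not shown in this abstract, so this is an assumption about its likely form; the function names and the finite-difference step `h` are hypothetical.

```python
import math


def to_logpolar(f):
    """Resample a Cartesian image function f(x, y) as g(u, v)."""
    def g(u, v):
        r = math.exp(u)  # u = log r, v = polar angle
        return f(r * math.cos(v), r * math.sin(v))
    return g


def laplacian_logpolar(g, u, v, h=1e-3):
    """Cartesian Laplacian estimated from log-polar samples of g.

    Central differences approximate g_uu and g_vv; the exp(-2u) factor
    rescales them back to Cartesian units.
    """
    g_uu = (g(u + h, v) - 2.0 * g(u, v) + g(u - h, v)) / h ** 2
    g_vv = (g(u, v + h) - 2.0 * g(u, v) + g(u, v - h)) / h ** 2
    return math.exp(-2.0 * u) * (g_uu + g_vv)


# Usage: x^2 + y^2 has Laplacian 4 everywhere; x^2 - y^2 is harmonic.
bowl = to_logpolar(lambda x, y: x * x + y * y)
saddle = to_logpolar(lambda x, y: x * x - y * y)
bowl_value = laplacian_logpolar(bowl, 0.5, 0.7)
saddle_value = laplacian_logpolar(saddle, 0.5, 0.7)
```

The exp(-2u) factor shows why a uniform grid in (u, v) corresponds to a grid size that grows with eccentricity, consistent with the abstract's remark about variable grid-size integration.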