77 research outputs found

    Long‐term memory for unfamiliar voices

    From a sample of young male Californians, ten speakers were selected whose voices were approximately normally distributed with respect to the "easy-to-remember" versus "hard-to-remember" judgments of a group of raters. A separate group of listeners each heard one of the voices, and, after delays of 1, 2, or 4 weeks, tried to identify the voice they had heard, using an open-set, independent-judgment task. Distributions of the results did not differ from the distributions expected under the hypothesis of independent judgments. For both "heard previously" and "not heard previously" responses, there was a trend toward increasing accuracy as a function of increasing listener certainty. Overall, "heard previously" responses were less accurate than "not heard previously" responses. For "heard previously" responses, there was a trend toward decreasing accuracy as a function of delay between hearing a voice and trying to identify it. Information-theoretic analysis showed loss of information as a function of delay and provided means to quantify the effects of patterns of voice confusability. Signal-detection analysis revealed the similarity of results from diverse experimental paradigms. A "prototype" model is advanced to explain the fact that certain voices are preferentially selected as having been heard previously. The model also unites several previously unconnected findings in the literature on voice recognition and makes testable predictions
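As a rough illustration of the kind of signal-detection measure the abstract mentions, the sketch below computes the sensitivity index d' from counts of "heard previously" and "not heard previously" responses. The abstract does not say which statistic was used; the function name `d_prime`, the log-linear correction, and the counts in the usage note are assumptions made for this example only.

```python
from statistics import NormalDist


def d_prime(hits, misses, false_alarms, correct_rejections):
    """Sensitivity index d' = z(hit rate) - z(false-alarm rate).

    Hypothetical helper, not taken from the paper.
    """
    # Log-linear correction (an assumption): adding 0.5 to each count keeps
    # the rates strictly between 0 and 1 so the z-transform stays finite.
    hit_rate = (hits + 0.5) / (hits + misses + 1.0)
    fa_rate = (false_alarms + 0.5) / (false_alarms + correct_rejections + 1.0)
    z = NormalDist().inv_cdf  # inverse of the standard normal CDF
    return z(hit_rate) - z(fa_rate)
```

With made-up counts, `d_prime(50, 50, 50, 50)` is 0 (chance performance), and larger values indicate better discrimination between heard and unheard voices. Because d' does not depend on the response format, it is one way results from different paradigms might be placed on a common scale.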

    The multidimensional nature of pathologic vocal quality

    Although the terms "breathy" and "rough" are frequently applied to pathological voices, widely accepted definitions are not available and the relationship between these qualities is not understood. To investigate these matters, expert listeners judged the dissimilarity of pathological voices with respect to breathiness and roughness. A second group of listeners rated the voices on unidimensional scales for the same qualities. Multidimensional scaling analyses suggested that breathiness and roughness are related, multidimensional constructs. Unidimensional ratings of both breathiness and roughness were necessary to describe patterns of similarity with respect to either quality. Listeners differed in the relative importance given to different aspects of voice quality, particularly when judging roughness. The presence of roughness in a voice did not appear to influence raters' judgments of breathiness; however, judgments of roughness were heavily influenced by the degree of breathiness, the particular nature of the influence varying from listener to listener. Differences in how listeners focus their attention on the different aspects of multidimensional perceptual qualities apparently are a significant source of interrater unreliability (noise) in voice quality ratings

    Toward a unified theory of voice production and perception (Hacia una teoría unificada de la producción y la percepción de la voz)

    At present, two important questions about voice remain unanswered: When voice quality changes, what physiological alteration caused this change, and if a change to the voice production system occurs, what change in perceived quality can be expected? We argue that these questions can only be answered by an integrated model of voice linking production and perception, and we describe steps towards the development of such a model. Preliminary evidence in support of this approach is also presented. We conclude that development of such a model should be a priority for scientists interested in voice, to explain what physical condition(s) might underlie a given voice quality, or what voice quality might result from a specific physical configuration.

    Perceptual importance of time-domain features of the voice source

    Our previous study examined the perceptual adequacy of different source models. We found that perceived similarity between modeled and natural voice samples was best predicted (in the time dimension) by the match between waveforms at the negative peak of the flow derivative (R² = 0.34). The extent of fit during the opening phase of the source pulses added only 2% to perceived match. However, in that study, model fitting was unweighted, and results might differ if another approach were used. In this study, we constrained the models to fit the negative peak of the flow derivative precisely. We fit 6 different source models to 40 natural voice sources, and then generated synthetic copies of the voices using each modeled source pulse, with all other synthesizer parameters held constant. We then conducted a visual sort-and-rate task in which listeners assessed the extent of perceived match between the original natural voice samples and each copy. Discussion will focus on the specific strengths and weaknesses of each modeling approach for characterizing differences in vocal quality, and on the importance of matches to specific time-domain events versus spectral features in determining voice quality. [Work supported by NIH/NIDCD grant DC01797 and NSF grant IIS-1018863.]

    Validity of rating scale measures of voice quality

    The validity of perceptual measures of vocal quality has been neglected in studies of voice, which focus more commonly on rater reliability. Validity depends in part on reliability, because an unreliable test does not measure what it is intended to measure. However, traditional measures of rating reliability only partially represent interrater agreement, because they cannot reflect variations or patterns of agreement for specific voice samples. In this paper the likelihood that two raters would agree in their ratings of a single voice is examined, for each voice in five previously gathered data sets. Results do not support the continued assumption that traditional rating procedures produce useful indices of listeners' perceptions. Listeners agreed very poorly in the midrange of scales for breathiness and roughness, and mean ratings in the midrange of such scales did not represent the extent to which a voice possesses a quality, but served only to indicate that listeners disagreed. Techniques like analysis by synthesis or judgment of similarity avoid decomposing quality into constituent dimensions, and do not require a listener to compare an external stimulus to an unstable internal representation, thus decreasing the error in measures of quality. Modeling individual differences in perception can increase the variance accounted for in models of quality, further reducing the error in perceptual measures. Thus such techniques may provide valid alternatives to current approaches
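The per-voice agreement idea described above can be sketched as the fraction of rater pairs who give one voice the same rating. This is only one plausible reading of "the likelihood that two raters would agree"; the function name `pairwise_agreement` and the ratings in the usage note are hypothetical and not data from the paper.

```python
from itertools import combinations


def pairwise_agreement(ratings):
    """Fraction of rater pairs that gave the same rating to a single voice.

    Hypothetical helper: exact-match agreement is an assumption; the paper
    may have used a different agreement criterion.
    """
    pairs = list(combinations(ratings, 2))
    if not pairs:
        raise ValueError("need at least two ratings")
    # True counts as 1, so the sum is the number of agreeing pairs.
    return sum(a == b for a, b in pairs) / len(pairs)
```

For invented 7-point ratings, `pairwise_agreement([1, 1, 1, 1, 2])` gives 0.6, while `pairwise_agreement([2, 3, 4, 5, 6])` gives 0.0 even though the mean rating is exactly 4. The second case shows how a midrange mean can reflect disagreement rather than a moderate amount of the rated quality.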

    Perception of aperiodicity in pathological voice

    Although jitter, shimmer, and noise acoustically characterize all voice signals, their perceptual importance in naturally produced pathological voices has not been established psychoacoustically. To determine the role of these attributes in the perception of vocal quality, listeners were asked to adjust levels of jitter, shimmer, and the noise-to-signal ratio in a speech synthesizer, so that synthetic voices matched naturally produced tokens. Results showed that, although listeners agreed well in their judgments of the noise-to-signal ratio, they did not agree with one another in their chosen settings for jitter and shimmer. Noise-dependent differences in listeners' ability to detect changes in amounts of jitter and shimmer implicate both listener insensitivity and inability to isolate jitter and shimmer as separate dimensions in the overall pattern of aperiodicity in a voice as causes of this poor agreement. These results suggest that jitter and shimmer are not useful as independent indices of perceived vocal quality, apart from their acoustic contributions to the overall pattern of spectrally shaped noise in a voice. (c) 2005 Acoustical Society of America
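For readers unfamiliar with the terms, jitter and shimmer are cycle-to-cycle perturbations in period and amplitude, respectively. The sketch below uses one common "local" definition (mean absolute difference between consecutive cycles, divided by the mean value); the abstract does not state which definition the synthesizer used, so the formula and the name `local_perturbation` are assumptions.

```python
def local_perturbation(values):
    """Mean absolute cycle-to-cycle difference, relative to the mean value.

    Pass glottal cycle periods to get local jitter, or cycle peak
    amplitudes to get local shimmer. Hypothetical helper for illustration.
    """
    if len(values) < 2:
        raise ValueError("need at least two cycles")
    # Absolute difference between each cycle and the one that follows it.
    diffs = [abs(b - a) for a, b in zip(values, values[1:])]
    mean_diff = sum(diffs) / len(diffs)
    mean_value = sum(values) / len(values)
    return mean_diff / mean_value
```

A perfectly periodic sequence such as `[10.0, 10.0, 10.0]` yields 0, and an alternating sequence such as `[10.0, 11.0, 10.0, 11.0]` yields 1/10.5, roughly 9.5%.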

    On Peer Review
