Although voices provide listeners with significant information about speakers, defining and quantifying voice quality remain elusive goals. The ANSI standard definition of quality (that attribute of auditory sensation in terms of which a listener can judge that two sounds similarly presented and having the same loudness and pitch are dissimilar) is often criticized because it specifies what quality is not, rather than what it is. It has also proven difficult to devise measurement protocols for quality as specified in the ANSI definition. We argue that the ANSI definition is in fact appropriate, because it treats quality as the result of perceptual processes, that is, interactions between listeners and signals in the context of specific perceptual goals. Applying speech synthesis in method-of-adjustment tasks allows quality to be measured psychoacoustically, as those aspects of the signal that allow a listener to determine that two sounds of equal pitch and loudness are different. This approach is consistent with the ANSI definition and provides insight into the salient acoustic attributes contributing to quality. The technique holds promise for improving the reliability and validity of measures of voice quality.

WHAT IS VOICE? THE DEFINITIONAL DILEMMA