1 research outputs found
Non-verbal information in spontaneous speech -- towards a new framework of analysis
Non-verbal signals in speech are encoded by prosody and carry information
that ranges from conversation action to attitude and emotion. Despite its
importance, the principles that govern prosodic structure are not yet
adequately understood. This paper offers an analytical schema and a
technological proof-of-concept for the categorization of prosodic signals and
their association with meaning. The schema interprets surface-representations
of multi-layered prosodic events. As a first step towards implementation, we
present a classification process that disentangles prosodic phenomena of three
orders. It relies on fine-tuning a pre-trained speech recognition model,
enabling the simultaneous multi-class/multi-label detection. It generalizes
over a large variety of spontaneous data, performing on a par with, or superior
to, human annotation. In addition to a standardized formalization of prosody,
disentangling prosodic patterns can direct a theory of communication and speech
organization. A welcome by-product is an interpretation of prosody that will
enhance speech- and language-related technologies