Combining Acoustics, Content and Interaction Features to Find Hot Spots in Meetings
Involvement hot spots have been proposed as a useful concept for meeting
analysis and studied off and on for over 15 years. These are regions of
meetings that are marked by high participant involvement, as judged by human
annotators. However, prior work was either not conducted in a formal machine
learning setting, or focused on only a subset of possible meeting features or
downstream applications (such as summarization). In this paper we investigate
to what extent various acoustic, linguistic and pragmatic aspects of the
meetings, both in isolation and jointly, can help detect hot spots. In this
context, the openSMILE toolkit is used to extract features based on
acoustic-prosodic cues, BERT word embeddings are used for encoding the lexical
content, and a variety of statistics based on speech activity are used to
describe the verbal interaction among participants. In experiments on the
annotated ICSI meeting corpus, we find that the lexical model is the most
informative, with incremental contributions from interaction and
acoustic-prosodic model components.
Comment: Revised for publicatio
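
To make the notion of speech-activity statistics concrete, the following is a minimal sketch of how a few interaction features could be computed for one analysis window. The abstract does not list the actual statistics used, so the function name `interaction_stats`, the segment format, and the four features chosen here (speaker count, speech fraction, overlap fraction, speaker changes) are all illustrative assumptions and not the paper's feature set.

```python
from typing import Dict, List, Tuple

# Hypothetical segment format: (speaker_id, start_sec, end_sec).
Segment = Tuple[str, float, float]


def interaction_stats(
    segments: List[Segment], win_start: float, win_end: float
) -> Dict[str, float]:
    """Compute simple speech-activity statistics for one window.

    Assumes win_end > win_start and that segments from the same
    speaker do not overlap each other. These features are
    illustrative, not the ones used in the paper.
    """
    # Clip every segment to the window and drop those outside it.
    clipped = []
    for spk, start, end in segments:
        s, e = max(start, win_start), min(end, win_end)
        if e > s:
            clipped.append((spk, s, e))

    # Sweep over start/end events to measure speech and overlap time.
    # At equal timestamps, end events (-1) sort before start events (+1),
    # so segments that merely touch are not counted as overlapping.
    events = []
    for _, s, e in clipped:
        events.append((s, 1))
        events.append((e, -1))
    events.sort()

    active = 0          # number of speakers talking right now
    prev = win_start    # time of the previous event
    speech = 0.0        # time with at least one active speaker
    overlap = 0.0       # time with at least two active speakers
    for t, delta in events:
        span = t - prev
        if active >= 1:
            speech += span
        if active >= 2:
            overlap += span
        active += delta
        prev = t

    # Count speaker changes between consecutive segments by start time.
    ordered = sorted(clipped, key=lambda seg: seg[1])
    changes = sum(1 for a, b in zip(ordered, ordered[1:]) if a[0] != b[0])

    duration = win_end - win_start
    return {
        "num_speakers": float(len({spk for spk, _, _ in clipped})),
        "speech_fraction": speech / duration,
        "overlap_fraction": overlap / duration,
        "speaker_changes": float(changes),
    }
```

A vector like this, computed per window, could then be concatenated with acoustic-prosodic and lexical features before classification; that fusion step is likewise an assumption about one possible way to combine the components.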