The COVID-19 pandemic shifted many events in our daily lives into the virtual
domain. While virtual conference systems provide an alternative to physical
meetings, larger events require a muted audience to avoid an accumulation of
background noise and distorted audio. However, performing artists rely heavily
on the feedback of their audience. We propose a concept for a virtual audience
framework that provides all participants with the ambience of a real audience.
Audience feedback is collected locally, allowing users to express enthusiasm or
discontent by selecting reactions such as clapping, whistling, booing, and
laughter. This feedback is sent as abstract information to a virtual audience
server. The combined virtual audience feedback information is then broadcast to
all participants, and each client can synthesize it into a single acoustic
feedback signal. The synthesis can be done by turning the collective audience feedback
into a prompt that is fed to state-of-the-art models such as AudioGen. This
way, each user hears a single acoustic feedback sound of the entire virtual
event, without having to unmute or risk hearing distorted, unsynchronized
feedback.

Comment: 4 pages, 2 figures
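
The aggregation and prompt-building steps described above could look roughly like the following minimal sketch. It is an illustration under stated assumptions, not the implementation behind this work: the function names (`aggregate`, `intensity`, `build_prompt`), the intensity thresholds, and the prompt wording are all hypothetical. The resulting text prompt would then be passed to a text-to-audio model such as AudioGen, a step that is deliberately left out here.

```python
from collections import Counter

# Feedback kinds taken from the examples named in the abstract.
FEEDBACK_KINDS = ("clapping", "whistling", "booing", "laughter")


def aggregate(events):
    """Server side (assumed): count abstract feedback events per kind.

    `events` is an iterable of strings, one per user reaction.
    Unknown kinds are ignored.
    """
    counts = Counter(kind for kind in events if kind in FEEDBACK_KINDS)
    return {kind: counts.get(kind, 0) for kind in FEEDBACK_KINDS}


def intensity(count, audience_size):
    """Map the share of reacting users to a coarse loudness word.

    The thresholds are arbitrary illustrative choices.
    """
    share = count / audience_size if audience_size else 0.0
    if share >= 0.5:
        return "loud"
    if share >= 0.1:
        return "moderate"
    return "faint"


def build_prompt(counts, audience_size):
    """Client side (assumed): turn combined feedback into a text prompt."""
    active = [(kind, n) for kind, n in counts.items() if n > 0]
    if not active:
        return "a quiet audience"
    # Most frequent reaction first; ties keep their original order.
    active.sort(key=lambda item: item[1], reverse=True)
    parts = [f"{intensity(n, audience_size)} {kind}" for kind, n in active]
    return f"an audience of about {audience_size} people with " + ", ".join(parts)


if __name__ == "__main__":
    events = ["clapping"] * 60 + ["laughter"] * 15 + ["booing"] * 2
    print(build_prompt(aggregate(events), audience_size=100))
```

Because only counts per reaction type travel over the network, every client can build the same prompt from the same broadcast, which matches the idea of a single shared feedback sound for the whole event.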