Dominant speaker detection in multipoint video communication using Markov chain with non-linear weights and dynamic transition window

Baskaran, Vishnu Monn; Chang, Yoong Choon; Gan, Ming Tao; Loo, Jonathan; Wong, KokSheik

Publication date: 1 January 2018
Publisher: Elsevier BV

Abstract

This paper proposes an enhanced discrete-time Markov chain algorithm for predicting the dominant speaker(s) in a multipoint video communication system in the presence of transient speech. The proposed algorithm exploits statistical properties of past speech patterns to accurately predict the dominant speaker for the next time state. Non-linear weight-based coefficients are employed in the enhanced Markov chain for both the initial state vector and the transition probability matrix. These weights significantly shorten the time taken to predict a new dominant speaker during a conference session. In addition, a mechanism that dynamically modifies the size of the transition probability matrix window/container is introduced to improve the adaptability of the Markov chain to the variability of speech characteristics. Simulation results indicate that, for a test scenario with 11 conference participants, the enhanced Markov chain prediction algorithm achieved 85% accuracy in predicting the dominant speaker when compared to an ideal case with no transient speech. Misclassification of dominant speakers due to transient speech was also reduced by 87%.
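The general idea described in the abstract can be illustrated with a minimal sketch. The abstract does not give the paper's weight function or its window-adaptation rule, so everything specific here is an assumption made for illustration: the exponential decay weights (`alpha`), the speaker-change-rate thresholds in `adapt_window`, and all function names are hypothetical and are not the authors' method. The history is assumed to be a non-empty list of integer speaker indices, one per time state.

```python
import numpy as np

def weighted_state_vector(history, n_speakers, window, alpha=0.5):
    """Initial state vector over the last `window` states.
    Recent states get larger weights (exponential decay: an assumption)."""
    recent = history[-window:]
    v = np.zeros(n_speakers)
    for k, speaker in enumerate(recent):
        age = len(recent) - 1 - k          # 0 for the most recent state
        v[speaker] += np.exp(-alpha * age)
    return v / v.sum()

def weighted_transition_matrix(history, n_speakers, window, alpha=0.5):
    """Row-stochastic transition matrix from the last `window` transitions,
    with the same assumed exponential weighting."""
    recent = history[-(window + 1):]
    counts = np.zeros((n_speakers, n_speakers))
    n_trans = len(recent) - 1
    for k in range(n_trans):
        age = n_trans - 1 - k
        counts[recent[k], recent[k + 1]] += np.exp(-alpha * age)
    row_sums = counts.sum(axis=1, keepdims=True)
    safe = np.where(row_sums > 0, row_sums, 1.0)
    # Speakers never observed in the window keep their state (assumption).
    return np.where(row_sums > 0, counts / safe, np.eye(n_speakers))

def predict_dominant(history, n_speakers, window, alpha=0.5):
    """One-step Markov prediction: argmax of v @ P."""
    v = weighted_state_vector(history, n_speakers, window, alpha)
    P = weighted_transition_matrix(history, n_speakers, window, alpha)
    return int(np.argmax(v @ P))

def adapt_window(history, window, min_window=4, max_window=32):
    """Hypothetical rule: shrink the window when speakers change often,
    grow it when the conversation is stable."""
    recent = history[-window:]
    changes = sum(a != b for a, b in zip(recent, recent[1:]))
    rate = changes / max(len(recent) - 1, 1)
    if rate > 0.5:
        return max(min_window, window // 2)
    if rate < 0.1:
        return min(max_window, window * 2)
    return window
```

For example, with the history `[0, 0, 0, 0, 0, 0, 1, 0, 0, 0]`, where speaker 1 produces a single burst of transient speech, `predict_dominant(history, 3, 8)` still selects speaker 0, because the brief interruption contributes little weight to either the state vector or the transition matrix.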


Available Versions

SHDL@MMU Digital Repository

oai:shdl.mmu.edu.my:7390

Last time updated on 06/12/2020

UWL Repository

oai:repository.uwl.ac.uk:5297

Last time updated on 01/08/2018