Gamma Boltzmann Machine for Simultaneously Modeling Linear- and Log-amplitude Spectra
In audio applications, one of the most important representations of audio
signals is the amplitude spectrogram. It is utilized in many
machine-learning-based information processing methods, including those using
restricted Boltzmann machines (RBMs). However, the ordinary
Gaussian-Bernoulli RBM (the most popular RBM variant) cannot
directly handle amplitude spectra because the Gaussian distribution is a
symmetric model that allows negative values, which never appear in amplitude spectra.
In this paper, after proposing a general gamma Boltzmann machine, we propose a
practical model called the gamma-Bernoulli RBM that simultaneously handles both
linear- and log-amplitude spectrograms. Its conditional distribution of the
observable data is given by the gamma distribution, and thus the proposed RBM
can naturally handle data represented by positive numbers, such as amplitude
spectra. It can also treat amplitude on a logarithmic scale, which is
important for audio signals from a perceptual point of view. The advantage of
the proposed model over the ordinary Gaussian-Bernoulli RBM was
confirmed in terms of PESQ and MSE in an experiment on representing the amplitude
spectrograms of speech signals.
Comment: Submitted to APSIPA202
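
As a rough illustration of the argument in the abstract (not the paper's model), the sketch below compares a plain gamma log-density with a Gaussian log-density at a negative value. The function names and parameter values are hypothetical and chosen only for the example. The gamma log-density depends on x through both x and log x, and it assigns no probability to non-positive values, whereas the Gaussian gives a finite density to a negative "amplitude".

```python
import math


def gamma_log_pdf(x, shape, rate):
    """Log-density of a Gamma(shape, rate) distribution at x.

    The x-dependent part is (shape - 1) * log(x) - rate * x, so the density
    involves both the linear value x and its logarithm log(x).
    """
    if x <= 0:
        return float("-inf")  # the gamma distribution has support x > 0 only
    return (
        shape * math.log(rate)
        - math.lgamma(shape)
        + (shape - 1.0) * math.log(x)
        - rate * x
    )


def gaussian_log_pdf(x, mean, std):
    """Log-density of a Gaussian N(mean, std^2) at x (defined for any real x)."""
    z = (x - mean) / std
    return -0.5 * z * z - math.log(std * math.sqrt(2.0 * math.pi))


# Hypothetical values: a negative number can never be an amplitude.
impossible_amplitude = -0.1
print(gamma_log_pdf(impossible_amplitude, shape=2.0, rate=4.0))
print(gaussian_log_pdf(impossible_amplitude, mean=0.5, std=0.3))
```

The first call returns negative infinity because the value lies outside the gamma support, while the second returns a finite number, which is the mismatch the abstract attributes to the Gaussian-Bernoulli RBM.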