Abstract — Traditionally, static mel-frequency cepstral coefficients (MFCCs) are derived by the discrete cosine transform (DCT), and dynamic MFCCs are derived by linear regression. Their derivation may be generalized as a frequency-domain transformation of the log filter-bank energies (FBEs) followed by a time-domain transformation. In the past, these two transformations have usually been estimated or optimized separately. In this paper, we treat sequences of log FBEs as a set of spectrogram images and investigate an image compression technique that jointly optimizes the two transformations so that the reconstruction error of the spectrogram images is minimized. The optimization problem can be solved by an efficient algorithm, and the framework can be extended to other optimization costs as well.

Index Terms — low-rank approximation of matrices, time-frequency representation, mel-frequency cepstral coefficients, discrete cosine transform
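To make the "frequency-domain transformation followed by a time-domain transformation" view concrete, the following minimal sketch computes conventional static and delta MFCCs for one window of log FBEs as a product F X T. It is an illustration under stated assumptions, not the method proposed in the paper: the function names, the unnormalized DCT-II, the five-frame regression window, and the toy input are all choices made here for the example.

```python
import math

def dct_matrix(num_ceps, num_filters):
    # Frequency-domain transform F: each row is an unnormalized DCT-II basis
    # vector applied across the filter-bank axis (scaling is an assumption).
    return [[math.cos(math.pi * k * (j + 0.5) / num_filters)
             for j in range(num_filters)]
            for k in range(num_ceps)]

def delta_weights(half_width):
    # Linear-regression (slope) weights over 2 * half_width + 1 frames.
    denom = 2 * sum(t * t for t in range(1, half_width + 1))
    return [t / denom for t in range(-half_width, half_width + 1)]

def time_matrix(half_width):
    # Time-domain transform T: column 0 selects the centre frame (static),
    # column 1 holds the regression weights (dynamic).
    w = delta_weights(half_width)
    return [[1.0 if t == half_width else 0.0, w[t]] for t in range(len(w))]

def matmul(a, b):
    # Plain nested-list matrix product.
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*b)]
            for row in a]

def separable_features(log_fbe_window, freq_transform, time_transform):
    # F * X * T for one window X of shape (filters x frames).
    return matmul(matmul(freq_transform, log_fbe_window), time_transform)

# Toy window (hypothetical data): 4 filters, 5 frames, each log FBE rising
# linearly in time with slope 0.5.
window = [[j + 0.5 * t for t in range(5)] for j in range(4)]
features = separable_features(window, dct_matrix(3, 4), time_matrix(2))
# features[k][0] is the static coefficient c_k, features[k][1] its delta.
```

In this form the two transformations are fixed in advance (DCT rows and regression weights). The joint optimization described in the abstract would instead estimate both matrices from the spectrogram data.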