This paper proposes a text-independent automatic speaker verification system using IMFCC (Inverse/
Reverse Mel Frequency Cepstral Coefficients) and IT-EM (Information Theoretic Expectation Maximization). To
perform speaker verification, feature extraction using the Mel scale, as in MFCC (Mel Frequency Cepstral
Coefficients), has been widely applied and has yielded good results. The IMFCC is based on the inverse
Mel scale and effectively captures information available at the high-frequency formants, which the MFCC
ignores. In this paper, the fusion of MFCC and IMFCC at the input level is proposed. GMMs (Gaussian
Mixture Models) based on EM
(Expectation Maximization) have been widely used for classification in text-independent speaker
verification. However, EM suffers from slow convergence. In this paper, we train the speaker models with
our proposed IT-EM, which converges faster. IT-EM uses information-theoretic principles such as PDE
(Parzen Density Estimation) and the KL (Kullback-Leibler) divergence measure. IT-EM adapts the
weights, means, and covariances, like EM. However, the IT-EM process is performed not on the feature
vector sets but on a set of centroids obtained using an IT (Information Theoretic) metric. The IT-EM
process simultaneously reduces the divergence measure between the PDE estimate of the feature
distribution within a given class and the centroid distribution within the same class. The feature-level
fusion and IT-EM are tested on the task
of speaker verification using NIST2001 and NIST2004. The experimental evaluation shows that
MFCC/IMFCC gives better results than the conventional delta/MFCC feature set. The MFCC/IMFCC
feature vector is also much smaller than the delta/MFCC vector, which reduces the computational burden
as well. The IT-EM method also showed faster convergence than the conventional EM method, and thus
leads to higher speaker recognition scores.
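
As a rough illustration of the inverse Mel scale idea and the input-level fusion described above, the
following Python sketch places filter centres on the Mel scale, mirrors them about the band midpoint to
obtain inverted centres that are dense at high frequencies, and concatenates two per-frame feature
matrices. Mirroring is one common way to build an inverted filter bank and is assumed here; it is not
necessarily the exact construction used in this work. The function names, the 20-filter bank, and the
4 kHz band edge are hypothetical choices made only for the example.

```python
import numpy as np

def hz_to_mel(f):
    # Standard Mel mapping: mel = 2595 * log10(1 + f / 700).
    return 2595.0 * np.log10(1.0 + np.asarray(f, dtype=float) / 700.0)

def mel_to_hz(m):
    # Inverse of hz_to_mel.
    return 700.0 * (10.0 ** (np.asarray(m, dtype=float) / 2595.0) - 1.0)

def mel_centers(n_filters, f_max):
    # Centre frequencies equally spaced on the Mel scale inside (0, f_max).
    mel_points = np.linspace(0.0, hz_to_mel(f_max), n_filters + 2)
    return mel_to_hz(mel_points)[1:-1]

def inverted_mel_centers(n_filters, f_max):
    # Assumption: the inverted bank is the Mel bank mirrored about f_max / 2,
    # so resolution is finest near f_max instead of near 0 Hz.
    return np.sort(f_max - mel_centers(n_filters, f_max))

def fuse(mfcc, imfcc):
    # Input-level (feature-level) fusion: concatenate per-frame vectors.
    return np.concatenate([mfcc, imfcc], axis=-1)

# Hypothetical settings: 20 filters over a 0-4000 Hz band (8 kHz sampling).
centers = mel_centers(20, 4000.0)
inv_centers = inverted_mel_centers(20, 4000.0)
```

In this sketch the spacing between `centers` grows with frequency, while the spacing between
`inv_centers` shrinks, which is why the inverted bank resolves the high-frequency region that the Mel
bank covers only coarsely. Calling `fuse` on two arrays of shape (frames, coefficients) gives one array
whose last dimension is the sum of the two coefficient counts.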