14,120 research outputs found
Speaker recognition using frequency filtered spectral energies
The spectral parameters that result from filtering the
frequency sequence of log mel-scaled filter-bank energies
with a simple first or second order FIR filter have proved
to be an efficient speech representation in terms of both
speech recognition rate and computational load. Recently,
the authors have shown that this frequency filtering can
approximately equalize the cepstrum variance enhancing
the oscillations of the spectral envelope curve that are
most effective for discrimination between speakers. Even
better speaker identification results than using melcepstrum
have been obtained on the TIMIT database,
especially when white noise was added. On the other
hand, the hybridization of both linear prediction and
filter-bank spectral analysis using either cepstral
transformation or the alternative frequency filtering has
been explored for speaker verification. The combination
of hybrid spectral analysis and frequency filtering, that
had shown to be able to outperform the conventional
techniques in clean and noisy word recognition, has yield
good text-dependent speaker verification results on the
new speaker-oriented telephone-line POLYCOST
database.Peer ReviewedPostprint (published version
Efficient Invariant Features for Sensor Variability Compensation in Speaker Recognition
In this paper, we investigate the use of invariant features for speaker recognition. Owing to their characteristics, these features are introduced to cope with the difficult and challenging problem of sensor variability and the source of performance degradation inherent in speaker recognition systems. Our experiments show: (1) the effectiveness of these features in match cases; (2) the benefit of combining these features with the mel frequency cepstral coefficients to exploit their discrimination power under uncontrolled conditions (mismatch cases). Consequently, the proposed invariant features result in a performance improvement as demonstrated by a reduction in the equal error rate and the minimum decision cost function compared to the GMM-UBM speaker recognition systems based on MFCC features
- …