86,869 research outputs found
Attentive Statistics Pooling for Deep Speaker Embedding
This paper proposes attentive statistics pooling for deep speaker embedding
in text-independent speaker verification. In conventional speaker embedding,
frame-level features are averaged over all the frames of a single utterance to
form an utterance-level feature. Our method utilizes an attention mechanism to
give different weights to different frames and generates not only weighted
means but also weighted standard deviations. In this way, it can capture
long-term variations in speaker characteristics more effectively. An evaluation
on the NIST SRE 2012 and the VoxCeleb data sets shows that it reduces equal
error rates (EERs) from the conventional method by 7.5% and 8.1%, respectively.Comment: Proc. Interspeech 2018, pp2252--2256. arXiv admin note: text overlap
with arXiv:1809.0931
Improvement of Text Dependent Speaker Identification System Using Neuro-Genetic Hybrid Algorithm in Office Environmental Conditions
In this paper, an improved strategy for automated text dependent speaker identification system has been proposed in noisy environment. The identification process incorporates the Neuro-Genetic hybrid algorithm with cepstral based features. To remove the background noise from the source utterance, wiener filter has been used. Different speech pre-processing techniques such as start-end point detection algorithm, pre-emphasis filtering, frame blocking and windowing have been used to process the speech utterances. RCC, MFCC, ?MFCC, ??MFCC, LPC and LPCC have been used to extract the features. After feature extraction of the speech, Neuro-Genetic hybrid algorithm has been used in the learning and identification purposes. Features are extracted by using different techniques to optimize the performance of the identification. According to the VALID speech database, the highest speaker identification rate of 100.000% for studio environment and 82.33% for office environmental conditions have been achieved in the close set text dependent speaker identification system
- …