15 research outputs found

    Analysis of Large-Scale SVM Training Algorithms for Language and Speaker Recognition

    This paper compares a set of large-scale support vector machine (SVM) training algorithms for language and speaker recognition tasks. We analyze five approaches for training phonetic and acoustic SVM models for language recognition. We compare the performance of these approaches as a function of the training time each requires to reach convergence, and we discuss their scalability towards large corpora. Two of these algorithms can be used in speaker recognition to train an SVM that classifies pairs of utterances as belonging either to the same speaker or to two different speakers. Our results show that the accuracy of these algorithms is asymptotically equivalent, but that they behave differently with respect to the time required to converge. Some of these algorithms not only scale linearly with the training set size, but are also able to give their best results after just a few iterations. State-of-the-art performance has been obtained in the female subset of the NIST 2010 Speaker Recognition Evaluation extended core test using a single SVM system.

    Memory and computation trade-offs for efficient i-vector extraction

    This work aims at reducing the memory demand of the data structures that are usually pre-computed and stored for fast computation of i-vectors, a compact representation of spoken utterances that is used by most state-of-the-art speaker recognition systems. We propose two new approaches that allow accurate i-vector extraction while requiring less memory, and we show their relations with the standard computation method introduced for eigenvoices and with the recently proposed fast eigen-decomposition technique. The first approach computes an i-vector in a Variational Bayes (VB) framework by iteratively estimating one sub-block of i-vector elements at a time while keeping all the others fixed. It can obtain i-vectors as accurate as those obtained by the standard technique while requiring only 25% of its memory. The second technique is based on the Conjugate Gradient solution of a linear system; it is accurate and uses even less memory, but is slower than the VB approach. We analyze and compare the time and memory resources required by all these solutions, which are suited to different applications, and we show that it is possible to get accurate results at almost the same speed as the standard solution while greatly reducing memory demand.
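
As an illustration of the Conjugate Gradient idea mentioned in the abstract above, the following is a minimal sketch and not the authors' implementation. It assumes the commonly used i-vector posterior-mean system (I + T' N Σ⁻¹ T) w = T' Σ⁻¹ f with diagonal covariances. The function names, argument names, and toy dimensions are all hypothetical. The point it shows is that Conjugate Gradient only needs matrix-vector products with T, so the per-component matrices that the standard method pre-computes never have to be stored.

```python
import numpy as np


def conjugate_gradient(matvec, b, tol=1e-8, max_iter=1000):
    """Solve A x = b for symmetric positive definite A, given only v -> A v."""
    x = np.zeros_like(b)
    r = b - matvec(x)          # initial residual
    p = r.copy()               # initial search direction
    rs_old = r @ r
    b_norm = np.sqrt(b @ b)
    for _ in range(max_iter):
        # Stop when the residual is small relative to the right-hand side.
        if np.sqrt(rs_old) <= tol * b_norm:
            break
        Ap = matvec(p)
        alpha = rs_old / (p @ Ap)
        x += alpha * p
        r -= alpha * Ap
        rs_new = r @ r
        p = r + (rs_new / rs_old) * p
        rs_old = rs_new
    return x


def extract_ivector_cg(T, sigma_inv, n, f, feat_dim):
    """Hypothetical helper: i-vector posterior mean via Conjugate Gradient.

    T         : (C * feat_dim, M) total variability matrix (assumed layout)
    sigma_inv : (C * feat_dim,)   inverse diagonal covariances
    n         : (C,)              zero-order statistics per Gaussian
    f         : (C * feat_dim,)   centered first-order statistics
    """
    n_expanded = np.repeat(n, feat_dim)   # one occupation count per feature row
    weights = n_expanded * sigma_inv

    def matvec(v):
        # Applies (I + T' N Sigma^-1 T) to v without forming the M x M matrix.
        return v + T.T @ (weights * (T @ v))

    b = T.T @ (sigma_inv * f)
    return conjugate_gradient(matvec, b)
```

On a small random problem the result can be checked against a direct solve with `np.linalg.solve`. The memory saving comes from never building the M x M precision matrix or its per-component terms, at the cost of several matrix-vector products per utterance.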

    Memory and computation effective approaches for i-vector extraction

    This paper focuses on the extraction of i-vectors, a compact representation of spoken utterances that is used by most state-of-the-art speaker recognition systems. This work was mainly motivated by the need to reduce the memory demand of the huge data structures that are usually precomputed for fast computation of the i-vectors. We propose a set of new approaches that allow accurate i-vector extraction while requiring less memory, and we show their relations with the standard computation method introduced for eigenvoices. We analyze the time and memory resources required by these solutions, which are suited to different fields of application, and we show that it is possible to get accurate results with solutions that reduce both computation time and memory demand compared with the standard solution.