    Effectiveness in the Realisation of Speaker Authentication

    © 2019 IEEE. Personal use of this material is permitted. Permission from IEEE must be obtained for all other uses, in any current or future media, including reprinting/republishing this material for advertising or promotional purposes, creating new collective works, for resale or redistribution to servers or lists, or reuse of any copyrighted component of this work in other works.

    An important consideration for the deployment of speaker recognition in authentication applications is the approach to the formation of training and testing utterances. Whilst defining this for a specific scenario is influenced by the associated requirements and conditions, the process can be further guided through the establishment of the relative usefulness of alternative frameworks for composing the training and testing material. In this regard, the present paper provides an analysis of the effects, on the speaker recognition accuracy, of various bases for the formation of the training and testing data. The experimental investigations are conducted based on the use of digit utterances taken from the XM2VTS database. The paper presents a detailed description of the individual approaches considered and discusses the experimental results obtained in different cases.

    On the Security of HMM-Based Speaker Verification Systems Against Imposture Using Synthetic Speech

    For speaker verification systems, security against imposture is one of the most important problems, and many approaches to reducing false acceptance of impostors as well as false rejection of clients have been investigated. On the other hand, imposture using synthetic speech has not been considered. In this paper, we investigate imposture against speaker verification systems using synthetic speech. We use an HMM-based text-prompted speaker verification system with a false acceptance rate of 0% for human impostors as a reference system, and adopt a trainable HMM-based speech synthesis system for imposture. Experimental results show that false acceptance rates for synthetic speech reached over 70% when the synthesis system was trained using only one sentence from each customer, indicating that the current security of HMM-based speaker verification systems against synthetic speech is inadequate.
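The two error measures named in the abstract above, false acceptance and false rejection, can be illustrated with a minimal sketch. The function name `error_rates`, the convention that a trial is accepted when its score is at or above the threshold, and all score values are assumptions made for illustration; none of them come from the paper.

```python
def error_rates(impostor_scores, client_scores, threshold):
    """Return (false acceptance rate, false rejection rate) at a threshold.

    Assumed convention: a trial is accepted when its score >= threshold.
    """
    # Impostor trials that were wrongly accepted.
    false_accepts = sum(1 for s in impostor_scores if s >= threshold)
    # Genuine client trials that were wrongly rejected.
    false_rejects = sum(1 for s in client_scores if s < threshold)
    far = false_accepts / len(impostor_scores)
    frr = false_rejects / len(client_scores)
    return far, frr


# Invented toy scores, used only to exercise the function.
toy_impostor_scores = [0.1, 0.2, 0.9, 0.8]
toy_client_scores = [0.7, 0.95, 0.4]
far, frr = error_rates(toy_impostor_scores, toy_client_scores, threshold=0.5)
print(f"FAR = {far:.2f}, FRR = {frr:.2f}")
```

Raising the threshold lowers the false acceptance rate at the cost of a higher false rejection rate, which is why both figures are normally reported together.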

    Secure Speech Biometric Templates

    Use of Harmonic Phase in Synthetic Speech Detection

    156 pp. Speaker verification (SV) systems must face the possibility of being attacked through spoofing techniques. Today, voice conversion and speaker-adapted speech synthesis technologies have advanced enough to create voices capable of deceiving an SV system. This thesis proposes a synthetic speech detection (SSD) module that can be used as a complement to an SV system but is also able to operate independently. It consists of a GMM-based classifier equipped with models of human and synthetic speech. Each input is compared against both models, and if the likelihood difference exceeds a given threshold, the input is accepted as human; otherwise it is rejected. The system developed is speaker-independent. RPS parameters are used to build the models. A technique is proposed to reduce the complexity of the training process by avoiding the need to build an adapted TTS system or a voice converter for each speaker. Since most modern adaptation and synthesis systems use vocoders, the proposal is to transcode human signals through vocoders to obtain synthetic versions of them, from which the classifier's synthetic models are then built. It will be shown that synthetic signals can be detected by detecting that they were created with a vocoder. The system's performance is tested under different conditions: with the transcoded signals themselves and with TTS attacks. Finally, strategies for training models for SSD systems are proposed.
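The decision rule described in the abstract above (compare each input against a human model and a synthetic model, and accept it as human when the likelihood difference exceeds a threshold) can be sketched as follows. This is a minimal illustration and not the thesis implementation: the diagonal-covariance GMMs, the averaging of the log-likelihood difference over frames, the two-dimensional toy features, the model parameters, and the names `gmm_log_likelihood` and `is_human` are all assumptions.

```python
import math


def log_gaussian_diag(x, mean, var):
    """Log density of a diagonal-covariance Gaussian at vector x."""
    return sum(
        -0.5 * (math.log(2.0 * math.pi * v) + (xi - m) ** 2 / v)
        for xi, m, v in zip(x, mean, var)
    )


def gmm_log_likelihood(x, gmm):
    """Log-likelihood of one frame under a GMM of (weight, mean, var) tuples."""
    terms = [math.log(w) + log_gaussian_diag(x, m, v) for w, m, v in gmm]
    peak = max(terms)  # log-sum-exp trick for numerical stability
    return peak + math.log(sum(math.exp(t - peak) for t in terms))


def is_human(frames, human_gmm, synthetic_gmm, threshold=0.0):
    """Accept as human when the mean log-likelihood difference exceeds threshold."""
    diff = sum(
        gmm_log_likelihood(f, human_gmm) - gmm_log_likelihood(f, synthetic_gmm)
        for f in frames
    ) / len(frames)
    return diff > threshold


# Hypothetical toy models over 2-dimensional features (illustrative values only).
HUMAN_GMM = [(0.5, [0.0, 0.0], [1.0, 1.0]), (0.5, [1.0, 1.0], [1.0, 1.0])]
SYNTHETIC_GMM = [(1.0, [4.0, 4.0], [1.0, 1.0])]

print(is_human([[0.2, 0.1], [0.9, 1.1]], HUMAN_GMM, SYNTHETIC_GMM))
print(is_human([[4.1, 3.9]], HUMAN_GMM, SYNTHETIC_GMM))
```

In a real system the features would be the phase-based parameters the abstract mentions and the models would be trained on data; here both are replaced by hand-picked values so the rule itself stays visible.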