Real Time Speaker Recognition on TMS320C6713

Authors: Sahoo Akash; Tripathy Abhijit
Publication date: 14/01/2012

Speaker recognition is the process of identifying a person on the basis of the information contained in their speech. In a world where breaches of security are a major threat, speaker recognition is one of the major biometric recognition techniques. Many organizations, such as banks, defence laboratories, industry and forensic surveillance, use this technology for security purposes. Speaker recognition is mainly divided into two categories: speaker identification and speaker verification. In speaker identification we find out which speaker has uttered the given speech, whereas in speaker verification we determine whether a speaker claiming a particular identity is telling the truth. In our first phase we performed speaker recognition in MATLAB. The process comprised three parts. First, we preprocessed the signal by truncating it and applying thresholding. Then we extracted features from the speech signals using Mel-frequency cepstral coefficients (MFCCs). These extracted features were then matched against a set of speakers using a vector quantization (VQ) approach. In our second phase we tried to implement speaker recognition in real time. Because speaker recognition is a signal processing task whose stages consist primarily of additions and multiplications, we chose to implement it on a DSP (digital signal processor), which performs multiply-and-accumulate (MAC) operations very fast. The second phase comprises our familiarisation with the TMS320C6713 DSP, the first few audio applications we ran on it, some real-time filters we developed on it and, finally, our speech recognition problem.
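The vector quantization matching step mentioned in this abstract can be illustrated with a minimal Python sketch. It assumes each enrolled speaker is represented by a small codebook of centroid vectors and that the test utterance is assigned to the speaker whose codebook gives the lowest average distortion. The function names, the toy 2-D vectors and the hand-written codebooks are hypothetical; the thesis itself used MATLAB and a DSP, and its codebook training is not shown here.

```python
import math


def distortion(features, codebook):
    """Average Euclidean distance from each feature vector to its nearest codeword."""
    total = 0.0
    for vec in features:
        total += min(math.dist(vec, code) for code in codebook)
    return total / len(features)


def identify_speaker(features, codebooks):
    """Return the enrolled speaker whose codebook gives the lowest distortion."""
    return min(codebooks, key=lambda name: distortion(features, codebooks[name]))


# Toy 2-D "feature vectors" (hypothetical); real MFCC vectors have more dimensions.
codebooks = {
    "speaker_a": [(0.0, 0.0), (1.0, 1.0)],
    "speaker_b": [(5.0, 5.0), (6.0, 6.0)],
}
test_features = [(0.1, -0.1), (0.9, 1.2), (0.2, 0.1)]
best = identify_speaker(test_features, codebooks)
```

In a fuller system the codebooks would be trained from enrollment speech with a clustering algorithm rather than written by hand.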

ethesis@nitr

Speaker verification using Mel Frequency Cepstral Coefficient and Artificial Neural Network

Authors: Behera Sujit; Singh Jatindra Kumar
Publication date: 14/05/2012

Speaker recognition is the task of establishing whether a person is who they claim to be. It is one of the biometric recognition techniques and is useful in almost all areas where security is a concern. Speaker recognition can be divided into speaker identification and speaker verification. Speaker identification decides whether a speaker is a specific person or belongs to a group. In speaker verification, a person makes an identity claim (e.g., by entering a PIN with a debit/credit card at an ATM). The technique has two main stages: feature extraction and feature matching. Feature extraction is the process of extracting useful data that can later be used to represent the speaker. Feature matching identifies the unknown speaker by comparing the features extracted from the voice with the enrolled voices of known speakers. In this project we extracted the MFCCs of the speech signal, which involves recording the speech signal, windowing, framing, thresholding, calculating the STDFT (short-time discrete Fourier transform) and then passing the result through a mel-frequency filter. The extracted features are then matched with the stored templates. The algorithms used in feature extraction are real cepstral coefficient calculation and mel-frequency cepstral coefficient calculation. For feature matching we used the multi-layer perceptron method of artificial neural networks.
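Two of the early MFCC steps named in this abstract, framing with windowing and the mel frequency mapping, can be sketched in Python as follows. This is an illustration under assumptions: the Hamming window, the frame length and hop, and the particular Hz-to-mel formula are common choices that the abstract does not specify, and all function names are hypothetical. Thresholding, the STDFT, the filter bank and the cepstral calculation are omitted.

```python
import math


def hz_to_mel(freq_hz):
    """Convert Hz to mel using one commonly used formula (an assumption here)."""
    return 2595.0 * math.log10(1.0 + freq_hz / 700.0)


def hamming(n):
    """Hamming window of length n (n >= 2)."""
    return [0.54 - 0.46 * math.cos(2.0 * math.pi * i / (n - 1)) for i in range(n)]


def frame_signal(samples, frame_len, hop):
    """Split samples into overlapping frames and apply a Hamming window to each."""
    window = hamming(frame_len)
    frames = []
    for start in range(0, len(samples) - frame_len + 1, hop):
        frame = samples[start:start + frame_len]
        frames.append([s * w for s, w in zip(frame, window)])
    return frames


# Toy input: 100 samples of a sine wave (hypothetical test signal).
signal = [math.sin(2.0 * math.pi * 5.0 * t / 100.0) for t in range(100)]
frames = frame_signal(signal, frame_len=40, hop=20)
```

Each windowed frame would then be passed to the short-time Fourier transform before the mel filter stage.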

ethesis@nitr

An Optimized and Privacy-Preserving System Architecture for Effective Voice Authentication over Wireless Network

Authors: Dr. Aniruddha Deka; Dr. Debashis Dev Misra
Publication date: 30/09/2023

Speaker authentication systems help determine the identity of a speaker in audio through distinctive voice characteristics. Accurate speaker authentication over wireless networks is becoming more challenging because of phishing attacks over the network. Many kinds of speech authentication models have been built for applications in which voice authentication is the primary means of verifying user identity. However, the voice authentication models explored so far have limitations related to accuracy and to phishing attacks in real time over wireless networks. This research proposes an optimized and privacy-preserving system architecture for effective speaker authentication over a wireless network, intended to identify the speaker's voice accurately in real time and to prevent phishing attacks over the network more accurately. The proposed system achieved very good performance: the measured accuracy, precision, recall and F1 score of the proposed model were 98.91%, 96.43%, 95.37% and 97.99%, respectively. The measured training losses at epochs 0, 10, 20, 30, 40, 50, 60, 70, 80, 90 and 100 were 2.4, 2.1, 1.8, 1.5, 1.2, 0.9, 0.6, 0.3, 0.3, 0.3 and 0.2, respectively. The measured testing losses at epochs 0, 10, 20, 30, 40, 50, 60, 70, 80, 90 and 100 were 2.2, 2, 1.5, 1.4, 1.1, 0.8, 0.8, 0.7, 0.4, 0.1 and 0.1, respectively. Voice authentication over wireless networks is a serious issue because of various phishing attacks and inaccuracy in voice identification. This field therefore requires considerable further research to develop less computationally complex speech authentication systems. Published by: Blue Eyes Intelligence Engineering and Sciences Publication (BEIESP).
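For readers unfamiliar with the four metrics quoted in this abstract, the following Python sketch shows how accuracy, precision, recall and F1 score are conventionally derived from binary confusion counts. The counts used are hypothetical and are not taken from the paper, whose evaluation data is not given here.

```python
def classification_metrics(tp, fp, fn, tn):
    """Return (accuracy, precision, recall, f1) from binary confusion counts."""
    accuracy = (tp + tn) / (tp + fp + fn + tn)
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    # F1 is the harmonic mean of precision and recall.
    f1 = 2 * precision * recall / (precision + recall)
    return accuracy, precision, recall, f1


# Hypothetical counts for illustration only.
acc, prec, rec, f1 = classification_metrics(tp=90, fp=10, fn=5, tn=95)
```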

ZENODO

A Real Time Speaker Recognition System Based on GMM

Authors: 洪青阳; 胡益平; 蔡骏
Publication date: 17/06/2007

This paper presents the design and implementation of a real-time speaker recognition system based on the Gaussian mixture model (GMM). The system supports real-time speaker identification and real-time speaker verification. Under laboratory conditions, the performance of the system and its model adaptation were tested with GMMs of different numbers of Gaussian mixtures and at different sampling rates. The test results show that the GMM-based system achieves satisfactory accuracy in speaker recognition. Supported by the Xiamen University "985 Project" Phase II "Information Technology" Innovation Platform project, project number 0000-X0720.
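As a rough illustration of how a GMM can serve both identification and verification, the Python sketch below scores feature vectors against diagonal-covariance GMMs. It is a sketch under assumptions, not the system described in the abstract: the models are hand-written rather than trained, the speaker names and threshold are hypothetical, and verification is simplified to a threshold on the average log-likelihood.

```python
import math


def log_gaussian(x, mean, var):
    """Log density of a diagonal-covariance Gaussian at vector x."""
    return sum(
        -0.5 * (math.log(2.0 * math.pi * v) + (xi - m) ** 2 / v)
        for xi, m, v in zip(x, mean, var)
    )


def gmm_log_likelihood(x, gmm):
    """Log-likelihood of x under a GMM given as (weight, mean, var) tuples."""
    terms = [math.log(w) + log_gaussian(x, m, v) for w, m, v in gmm]
    peak = max(terms)  # log-sum-exp for numerical stability
    return peak + math.log(sum(math.exp(t - peak) for t in terms))


def average_score(features, gmm):
    """Average per-frame log-likelihood of an utterance under one model."""
    return sum(gmm_log_likelihood(x, gmm) for x in features) / len(features)


def identify(features, models):
    """Identification: pick the enrolled model with the highest average score."""
    return max(models, key=lambda name: average_score(features, models[name]))


def verify(features, gmm, threshold):
    """Verification: accept the claim if the score reaches the threshold."""
    return average_score(features, gmm) >= threshold


# Hypothetical hand-written models; real models would be trained from speech.
models = {
    "alice": [(0.5, (0.0, 0.0), (1.0, 1.0)), (0.5, (2.0, 2.0), (1.0, 1.0))],
    "bob": [(1.0, (8.0, 8.0), (1.0, 1.0))],
}
utterance = [(0.2, 0.1), (1.8, 2.1)]
who = identify(utterance, models)
accepted = verify(utterance, models["alice"], threshold=-10.0)
```

The number of mixture components per model is the kind of parameter the abstract reports varying in its tests.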

Xiamen University Institutional Repository

MCE 2018: The 1st Multi-target Speaker Detection and Identification Challenge Evaluation

Authors: Dehak Najim; Glass James; Reynolds Douglas; Shon Suwon
Publication date: 07/04/2019

The Multi-target Challenge aims to assess how well current speech technology is able to determine whether or not a recorded utterance was spoken by one of a large number of blacklisted speakers. It is a form of multi-target speaker detection based on real-world telephone conversations. Data recordings are generated from call center customer-agent conversations. The task is to measure how accurately one can detect 1) whether a test recording is spoken by a blacklisted speaker, and 2) which specific blacklisted speaker was talking. This paper outlines the challenge and provides its baselines, results, and discussions. Comment: http://mce.csail.mit.edu . arXiv admin note: text overlap with arXiv:1807.0666
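The two-part task described in this abstract can be pictured with a small Python sketch. It assumes that some upstream system has already produced one similarity score per blacklisted speaker for a test recording; the scores, speaker labels and threshold below are hypothetical and are not part of the challenge definition.

```python
def detect_blacklisted(scores, threshold):
    """Return (is_blacklisted, speaker) for one test recording.

    scores maps each blacklisted speaker to a similarity score.
    Part 1 asks whether any blacklisted speaker matches;
    part 2 asks which one.
    """
    best_speaker = max(scores, key=scores.get)
    if scores[best_speaker] >= threshold:
        return True, best_speaker
    return False, None


# Hypothetical scores for two test recordings.
hit = detect_blacklisted({"spk_01": 0.12, "spk_02": 0.81, "spk_03": 0.40}, threshold=0.6)
miss = detect_blacklisted({"spk_01": 0.10, "spk_02": 0.22, "spk_03": 0.31}, threshold=0.6)
```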

arXiv.org e-Print Archive

Crossref

Speaker Re-identification with Speaker Dependent Speech Enhancement

Authors: Hain Thomas; Huang Qiang; Shi Yanpei
Publication date: 27/08/2020

While the use of deep neural networks has significantly boosted speaker recognition performance, it is still challenging to separate speakers in poor acoustic environments. Here, speech enhancement methods have traditionally improved performance. Recent work has shown that adapting speech enhancement can lead to further gains. This paper introduces a novel approach that cascades speech enhancement and speaker recognition. In the first step, a speaker embedding vector is generated, which is used in the second step to enhance the speech quality and re-identify the speakers. Models are trained in an integrated framework with joint optimisation. The proposed approach is evaluated using the VoxCeleb1 dataset, which aims to assess speaker recognition in real-world situations. In addition, three types of noise at different signal-to-noise ratios were added for this work. The results show that the proposed approach using speaker-dependent speech enhancement can yield better speaker recognition and speech enhancement performance than two baselines in various noise conditions. Comment: Accepted for presentation at Interspeech202

arXiv.org e-Print Archive

Crossref

White Rose Research Online