
13,949 research outputs found

Implementasi Speech Recognition Berbasis Android Dalam Optimalisasi Komunikasi Bagi Penyandang Tunarungu (Implementation of Android-Based Speech Recognition to Optimize Communication for Deaf People)

Author: Fergina Anggun
Jaman Adang Badru
Publication venue: Universitas Katolik Santo Thomas
Publication date: 13/12/2021

This study implements Android-based speech recognition (speech-to-text) to make communication with deaf people easier, so that people who want to communicate do not need to know a particular sign language when interacting with a deaf person. The system was developed using the SDLC (System Development Life Cycle) method, often referred to as the waterfall approach. Voice identification uses the Vector Quantization method, which performs learning in a supervised competitive layer. The study concludes that Vector Quantization gives better sound accuracy and clarity, with a performance result of 93%. According to a survey of 45 people with hearing impairment, the application played a very good role in keeping the interaction smooth and easy for deaf users to understand. Keywords: Speech recognition, Speech to Text, Vector Quantization

Media Publikasi Ilmiah UNIKA (Universitas Katolik) Santo Thomas Medan

A Hybrid Rough Sets K-Means Vector Quantization Model For Neural Networks Based Arabic Speech Recognition

Author: Babiker Elsadig Ahmed Mohamed
Publication venue
Publication date: 01/09/2002

Speech is a natural, convenient and rapid means of human communication. The ability to respond to spoken language is of special importance in computer applications where the user cannot use his or her limbs properly, and may be useful in office automation systems. It can help in developing control systems for many applications, such as telephone assistance systems. Rough sets theory represents a mathematical approach to vagueness and uncertainty. Data analysis, data reduction, approximate classification, machine learning, and discovery of patterns in data are functions performed by a rough sets analysis. It was one of the first non-statistical methodologies of data analysis. It extends classical set theory by incorporating into the set model the notion of classification as an indiscernibility relation. In previous work, the application of the rough sets approach to speech recognition was limited to the pattern matching stage, that is, to using training speech patterns to generate classification rules that can later be used to classify input word patterns. In this thesis the rough sets approach was used in the preprocessing stages, namely in the vector quantization operation, in which feature vectors are quantized or classified into a finite set of codebook classes. Classification rules were generated from the training feature vector set, and a modified form of the standard voter classification algorithm, which uses the rules generated by rough sets, was applied. A vector quantization model that incorporates rough sets attribute reduction and rule generation with a modified version of the K-means clustering algorithm was developed, implemented and tested as part of a speech recognition framework, in which the Learning Vector Quantization (LVQ) neural network model was used in the pattern matching stage.
In addition to the Arabic speech data used in the original experiments, for both speaker-dependent and speaker-independent tests, further verification experiments were conducted using the TI20 speech data. The rough sets vector quantization model proved its usefulness in the speech recognition framework; however, it can be extended to other applications that involve large amounts of data, such as speaker verification

Universiti Putra Malaysia Institutional Repository

Speech Recognition Using Vector Quantization through Modified K-meansLBG Algorithm

Author: Doye D. D.
Sonkamble Balwant A.
Publication venue: The International Institute for Science, Technology and Education (IISTE)
Publication date: 31/07/2012

In Vector Quantization, the main task is to generate a good codebook. The distortion between the original pattern and the reconstructed pattern should be minimal. In this paper, a proposed algorithm called the Modified K-meansLBG algorithm is used to obtain a good codebook. The system has shown good performance on limited vocabulary tasks. Keywords: K-means algorithm, LBG algorithm, Vector Quantization, Speech Recognition
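
To make the codebook idea above concrete, here is a minimal sketch of a standard LBG-style codebook builder: start from the global centroid, split every codeword in two, and refine with K-means (Lloyd) iterations. It is not the paper's Modified K-meansLBG algorithm, whose details are not given in the abstract. All function names, the additive split offset `eps`, the fixed iteration count, and the requirement that `size` be a power of two are assumptions made for illustration.

```python
def sq_dist(a, b):
    """Squared Euclidean distance, used here as the distortion measure."""
    return sum((x - y) ** 2 for x, y in zip(a, b))

def mean_vector(vectors):
    """Component-wise mean of a non-empty list of vectors."""
    n = len(vectors)
    return [sum(col) / n for col in zip(*vectors)]

def nearest(codebook, v):
    """Index of the codeword closest to v."""
    return min(range(len(codebook)), key=lambda i: sq_dist(codebook[i], v))

def lloyd(vectors, codebook, iters=20):
    """K-means refinement: assign vectors to cells, then move codewords to cell means."""
    for _ in range(iters):
        cells = [[] for _ in codebook]
        for v in vectors:
            cells[nearest(codebook, v)].append(v)
        # An empty cell keeps its previous codeword.
        codebook = [mean_vector(c) if c else cw for c, cw in zip(cells, codebook)]
    return codebook

def lbg(vectors, size, eps=0.01):
    """Grow a codebook by repeated splitting (size assumed to be a power of two)."""
    codebook = [mean_vector(vectors)]
    while len(codebook) < size:
        # Split each codeword into two slightly perturbed copies.
        codebook = [[x + s * eps for x in cw] for cw in codebook for s in (1, -1)]
        codebook = lloyd(vectors, codebook)
    return codebook

def distortion(vectors, codebook):
    """Average squared distance from each vector to its nearest codeword."""
    return sum(sq_dist(v, codebook[nearest(codebook, v)]) for v in vectors) / len(vectors)
```

Calling `lbg(features, 2)` on feature vectors that form two well-separated groups should place one codeword near each group, and `distortion` should drop compared with a one-entry codebook.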

International Institute for Science, Technology and Education (IISTE): E-Journals

Enhancing Speech Recognition Using Improved Particle Swarm Optimization Based Hidden Markov Model

Author: Balakrishnan Ganesan
Lokesh Selvaraj
Publication venue: Hindawi Limited
Publication date: 01/01/2014

Enhancing speech recognition is the primary aim of this work. This paper proposes a novel speech recognition method based on vector quantization and improved particle swarm optimization (IPSO). The proposed methodology has four stages: (i) denoising, (ii) feature mining, (iii) vector quantization, and (iv) an IPSO-based hidden Markov model (HMM) technique (IP-HMM). First, the speech signals are denoised using a median filter. Next, characteristics such as peak, pitch spectrum, Mel-frequency cepstral coefficients (MFCC), mean, standard deviation, and minimum and maximum of the signal are extracted from the denoised signal. Then, for training, the extracted characteristics are passed to genetic-algorithm-based codebook generation in vector quantization. The initial populations for the genetic algorithm are created by selecting random code vectors from the training set for the codebooks, and IP-HMM performs the recognition. The novelty at this point lies in one of the genetic operations, crossover. The proposed speech recognition technique offers 97.14% accuracy
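
The denoising stage named in the abstract above can be illustrated with a one-dimensional median filter. This is only a sketch: the abstract does not state the window length or how signal edges are handled, so the `window` parameter and the edge padding (repeating the first and last sample) are assumptions, and `median_filter` is a hypothetical helper name.

```python
from statistics import median

def median_filter(signal, window=3):
    """Replace each sample with the median of the window centred on it."""
    if window < 1 or window % 2 == 0:
        raise ValueError("window must be a positive odd integer")
    half = window // 2
    # Pad by repeating the edge samples so the output has the input's length.
    padded = [signal[0]] * half + list(signal) + [signal[-1]] * half
    return [median(padded[i:i + window]) for i in range(len(signal))]
```

A single-sample spike is removed while the flat parts of the signal are left unchanged, which is why median filtering is often used against impulsive noise.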

Crossref

Directory of Open Access Journals

PubMed Central

Word And Speaker Recognition System

Author: TAN SHWU FEI
Publication venue: Universiti Teknologi Petronas
Publication date: 01/01/2010

This report describes a system that combines user-dependent word recognition and text-dependent speaker recognition. Word recognition is the process of converting an audio signal, captured by a microphone, into a word. Speaker identification is the ability to recognize a person's identity based on the specific word he or she uttered. A person's voice contains various parameters that convey information such as gender, emotion, health, attitude and identity. Speaker recognition identifies the speaker based on the unique voiceprint in the speech data. Voice Activity Detection (VAD), Spectral Subtraction (SS), Mel-Frequency Cepstrum Coefficients (MFCC), Vector Quantization (VQ), Dynamic Time Warping (DTW) and k-Nearest Neighbour (k-NN) are the methods used in the word recognition part of the project, which is implemented in MATLAB. For the speaker recognition part, Vector Quantization (VQ) is used. The implemented system achieves a recognition rate of 84.44% for word recognition and 54.44% for speaker recognition
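
Of the methods listed above, Dynamic Time Warping is the one that compares utterances of different lengths. The following is a minimal sketch of classic DTW over sequences of feature vectors, plus a nearest-template word lookup. It is written in Python rather than the report's MATLAB, and the function names, the Euclidean frame distance, and the `templates` dictionary are assumptions for illustration, not the report's implementation.

```python
import math

def frame_dist(a, b):
    """Euclidean distance between two feature frames."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))

def dtw_distance(seq_a, seq_b):
    """Cumulative cost of the best monotonic alignment between two sequences."""
    n, m = len(seq_a), len(seq_b)
    inf = float("inf")
    cost = [[inf] * (m + 1) for _ in range(n + 1)]
    cost[0][0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            d = frame_dist(seq_a[i - 1], seq_b[j - 1])
            # Extend the cheapest of: insertion, deletion, or match.
            cost[i][j] = d + min(cost[i - 1][j], cost[i][j - 1], cost[i - 1][j - 1])
    return cost[n][m]

def recognize(utterance, templates):
    """Return the word whose stored template is closest to the utterance under DTW."""
    return min(templates, key=lambda word: dtw_distance(utterance, templates[word]))
```

Because DTW allows frames to be repeated during alignment, a slower rendition of the same word can still match its template with low cost.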

UTPedia

Speaker Recognition Based Home Automation Using Matlab

Author: Khandade S. (Shilpa)
Khot S. (Sucheta)
Publication venue: Infogain Publication
Publication date: 01/09/2016

Due to a decline in both physical and mental abilities, some elderly people are not allowed to leave the bed without assistance. Sometimes they are unable to make the desired bodily movements or reposition themselves. In this paper, home automation is achieved using MATLAB-based speaker recognition. Feature extraction from the speech signal is done using MFCC, and vector quantization is used to select features of the speech signal. Using these two steps the speaker is recognized, and the result is sent to the microcontroller over serial communication. The corresponding home appliance is then operated

Neliti

A Review on Emotion Recognition Algorithms using Speech Analysis

Author: Alghifari Muhammad Fahreza
Gunawan Teddy Surya
Kartiwi Mira
Morshidi Malik Arman
Publication venue: IAES Indonesia Section
Publication date: 01/03/2018

In recent years, there has been growing interest in speech emotion recognition (SER) by analyzing input speech. SER can be regarded simply as a pattern recognition task that includes feature extraction, a classifier, and a speech emotion database. The objective of this paper is to provide a comprehensive review of the literature available on SER. Several audio features are available, including linear predictive coding coefficients (LPCC), Mel-frequency cepstral coefficients (MFCC), and Teager energy based features. For the classifier, many algorithms are available, including the hidden Markov model (HMM), Gaussian mixture model (GMM), vector quantization (VQ), artificial neural networks (ANN), and deep neural networks (DNN). The paper also reviews various speech emotion databases. Finally, recent related work on SER using DNN is discussed

Indonesian Journal of Electrical Engineering and Informatics (IJEEI)

VOICE RECOGNITION SECURITY SYSTEM USING MEL-FREQUENCY CEPSTRUM COEFFICIENTS

Author: A Sharmila
M Muruganandam
Mahalakshmi P
Publication venue: Innovare Academic Sciences Pvt Ltd
Publication date: 01/12/2016

ABSTRACT. Objective: Voice recognition is a fascinating field spanning several areas of computer science and mathematics. Reliable speaker recognition is a hard problem, requiring a combination of many techniques; however, modern methods have been able to achieve an impressive degree of accuracy. The objective of this work is to examine various speech and speaker recognition techniques and to apply them to build a simple voice recognition system. Method: The project is implemented in software using techniques such as Mel-frequency cepstrum coefficients (MFCC) and Vector Quantization (VQ), implemented in MATLAB. Results: MFCC is used to extract the characteristics of the input speech signal for a particular word uttered by a particular speaker. A VQ codebook is generated by clustering the training feature vectors of each speaker and is then stored in the speaker database. Conclusion: Verification of the speaker is carried out using Euclidean distance. For voice recognition we implement the MFCC approach on the MATLAB R2013b software platform. Keywords: Mel-frequency cepstrum coefficient, Vector quantization, Voice recognition, Hidden Markov model, Euclidean distance
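
The matching step described in this abstract (per-speaker VQ codebooks compared by Euclidean distance) can be sketched as follows. The sketch is in Python rather than MATLAB and assumes the MFCC feature vectors and the per-speaker codebooks have already been computed; `speaker_db`, `identify`, `verify`, and the acceptance `threshold` are hypothetical names and parameters, since the abstract does not describe the decision rule in detail.

```python
import math

def euclidean(a, b):
    """Euclidean distance between two feature vectors."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))

def avg_distortion(features, codebook):
    """Mean distance from each feature vector to its nearest codeword."""
    return sum(min(euclidean(f, cw) for cw in codebook) for f in features) / len(features)

def identify(features, speaker_db):
    """Return (speaker, distortion) for the codebook that fits the features best."""
    scores = {name: avg_distortion(features, cb) for name, cb in speaker_db.items()}
    best = min(scores, key=scores.get)
    return best, scores[best]

def verify(features, codebook, threshold):
    """Accept the claimed speaker if the average distortion is within the threshold."""
    return avg_distortion(features, codebook) <= threshold
```

In practice the threshold would have to be tuned on held-out recordings; a value that is too loose accepts impostors, while one that is too strict rejects the genuine speaker.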

Innovare Academic Sciences: E-Journals