
1,591 research outputs found

Modelling and Understanding of Speech and Speaker Recognition

Author: Gautam Sanyal
Tilendra Shishir Sinha
Publication venue: 'IntechOpen'
Publication date: 12/09/2011

IntechOpen

Crossref

Low-delay nonuniform pseudo-QMF banks with application to speech enhancement

Author: Deng Ying
Mathews V. John
Publication venue: 'Institute of Electrical and Electronics Engineers (IEEE)'
Publication date: 01/01/2007

Journal Article. Abstract: This paper presents a method for designing low-delay nonuniform pseudo quadrature mirror filter (QMF) banks. This method is motivated by the work of Li, Nguyen, and Tantaratana, in which the nonuniform filter bank is realized by combining an appropriate number of adjacent sub-bands of a uniform pseudo-QMF bank. In prior work, the prototype filter of the uniform pseudo-QMF bank was constrained to have linear phase, and the overall delay associated with the filter bank was often unacceptably large for filter banks with a large number of sub-bands. This paper proposes a pseudo-QMF filter bank design technique that significantly reduces the delay by relaxing the linear phase constraints. An example is presented in which an oversampled critical-band nonuniform filter bank is designed and applied to a two-state modeling speech enhancement system. Comparison of the performance of this system to competing methods employing tree-structured, linear phase multiresolution analysis indicates that the approach described in this paper strikes a good balance between system performance and low delay.

The University of Utah: J. Willard Marriott Digital Library

Optimal analog wavelet bases construction using hybrid optimization algorithm

Author: He Yigang
Li Hongmin
Sun Yichuang
Publication venue: 'Exeley, Inc.'
Publication date: 01/01/2016

An approach for the construction of optimal analog wavelet bases is presented. First, the definition of an analog wavelet is given. Based on this definition and the least-squares error criterion, a general framework for designing optimal analog wavelet bases is established, which leads to a difficult nonlinear constrained optimization problem. To solve this problem, a hybrid algorithm combining chaotic map particle swarm optimization (CPSO) with local sequential quadratic programming (SQP) is proposed. CPSO is an improved PSO in which the sawtooth chaotic map is used to raise its global search ability. CPSO acts as a global optimizer that searches for estimates of the global solution, while SQP is employed for local search and refinement of those estimates. Benefiting from the good global search ability of CPSO and the powerful local search ability of SQP, a high-precision global optimum for this problem can be obtained. Finally, a series of optimal analog wavelet bases is constructed using the hybrid algorithm. The proposed method is tested on various wavelet bases, and its performance is compared with previous work. Peer reviewed. Final published version.

Crossref

Exeley Inc.

University of Hertfordshire Research Archive

Audio-Visual Automatic Speech Recognition Using PZM, MFCC and Statistical Analysis

Author: Debnath Saswati
Roy Pinki
Publication venue: 'Universidad Internacional de La Rioja'
Publication date: 10/05/2022

Audio-Visual Automatic Speech Recognition (AV-ASR) has become the most promising research area for cases in which the audio signal is corrupted by noise. The main objective of this paper is to select the important and discriminative audio and visual speech features for recognizing audio-visual speech. This paper proposes the Pseudo Zernike Moment (PZM) and a feature selection method for audio-visual speech recognition. Visual information is captured from the lip contour, and moments are computed from it for lip reading. We extract 19th-order Mel Frequency Cepstral Coefficients (MFCC) as speech features from the audio. Since not all 19 speech features are equally important, feature selection algorithms are used to select the most efficient features. Various statistical tests, such as Analysis of Variance (ANOVA), Kruskal-Wallis, and the Friedman test, are employed to analyze the significance of the features, along with the Incremental Feature Selection (IFS) technique. Statistical analysis is used to assess the statistical significance of the speech features, after which IFS is used to select the speech feature subset. Furthermore, multiclass Support Vector Machine (SVM), Artificial Neural Network (ANN), and Naive Bayes (NB) machine learning techniques are used to recognize speech for both the audio and visual modalities. Based on the recognition rate, a combined decision is taken from the two individual recognition systems. This paper compares the results achieved by the proposed model and the existing model for both audio and visual speech recognition. Zernike Moment (ZM) is compared with PZM, and the comparison shows that the proposed model using PZM extracts better discriminative features for visual speech recognition. This study also proves that audio feature selection using statistical analysis outperforms methods without any feature selection technique.
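The statistical ranking step mentioned in this abstract can be illustrated with a minimal sketch. It computes the standard one-way ANOVA F-statistic for each feature and orders the features by it. The feature names (mfcc_a, mfcc_b), the toy values, and the helper functions are hypothetical and are not taken from the paper; the paper's actual pipeline (Kruskal-Wallis, Friedman test, IFS with classifiers) is not reproduced here.

```python
def anova_f(groups):
    """One-way ANOVA F-statistic for a list of groups (one group per class).

    Assumes at least two groups and non-zero within-group variance.
    """
    k = len(groups)                      # number of classes
    n = sum(len(g) for g in groups)      # total number of observations
    grand_mean = sum(sum(g) for g in groups) / n
    means = [sum(g) / len(g) for g in groups]
    ss_between = sum(len(g) * (m - grand_mean) ** 2 for g, m in zip(groups, means))
    ss_within = sum((x - m) ** 2 for g, m in zip(groups, means) for x in g)
    return (ss_between / (k - 1)) / (ss_within / (n - k))


def rank_features(features):
    """Return feature names sorted by decreasing F-statistic."""
    return sorted(features, key=lambda name: anova_f(features[name]), reverse=True)


# Hypothetical toy data: each feature maps to its values grouped by class.
features = {
    "mfcc_a": [[1, 2, 3], [7, 8, 9]],    # classes are well separated
    "mfcc_b": [[1, 5, 9], [2, 6, 10]],   # classes overlap heavily
}
ranking = rank_features(features)
print(ranking)
```

An incremental selection step would then add features in this order one at a time and keep the subset with the best recognition rate; that part depends on the chosen classifier and is left out of the sketch.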

Re-UNIR

Improved anti-noise attack ability of image encryption algorithm using de-noising technique

Author: Abdulwahed Mohanad Najm
Ahmed Ali Kamil
Publication venue: 'Universitas Ahmad Dahlan'
Publication date: 01/12/2020

Information security is considered one of the important issues of the information age; it is used to preserve secret information throughout transmission in practical applications. Many information security schemes have been applied to image encryption. Such approaches can be categorized into two domains: the frequency domain and the spatial domain. The presented work develops an encryption technique based on a conventional watermarking system that uses singular value decomposition (SVD), discrete cosine transform (DCT), and discrete wavelet transform (DWT) together. The suggested DWT-DCT-SVD method has high robustness in comparison with other conventional approaches, and an enhanced approach achieves high robustness against Gaussian noise attacks by using a DWT-based denoising approach. Mean squared error (MSE) and peak signal-to-noise ratio (PSNR) are the performance measures on which this study's results are based, and they show that the algorithm used in this study has high robustness against Gaussian noise attacks.
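The two performance measures named in this abstract have standard definitions, sketched below for a flattened 8-bit image. The pixel values and function names are hypothetical illustrations, not data or code from the paper.

```python
import math


def mse(original, distorted):
    """Mean squared error between two equally sized pixel sequences."""
    if len(original) != len(distorted):
        raise ValueError("inputs must have the same length")
    return sum((a - b) ** 2 for a, b in zip(original, distorted)) / len(original)


def psnr(original, distorted, max_value=255):
    """Peak signal-to-noise ratio in dB; infinite when the inputs are identical."""
    error = mse(original, distorted)
    if error == 0:
        return math.inf
    return 10 * math.log10(max_value ** 2 / error)


# Hypothetical flattened 8-bit pixel values.
reference = [52, 55, 61, 59]
noisy = [54, 53, 61, 62]
print(mse(reference, noisy))    # squared differences 4, 4, 0, 9 give 17 / 4 = 4.25
print(psnr(reference, noisy))
```

A higher PSNR (lower MSE) after decryption of a noise-attacked image indicates that more of the original image survived the attack.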

Journal of Education and Learning (EduLearn)

TELKOMNIKA (Telecommunication Computing Electronics and Control)

UAD Journal Management System

Image Outlier filtering (IOF) : A Machine learning based DWT optimization Approach

Author: Dr. R.Sunitha
Yugandhar Dasari
Publication venue: Global Journals Inc. (US)
Publication date: 22/10/2012

In this paper, an image outlier filtering technique is introduced: a hybrid model called SVM regression based DWT optimization. Outlier filtering of RGB images uses a DWT model, namely Optimal-HAAR wavelet changeover (OHC), which is optimized by the Least Square Support Vector Machine (LS-SVM). The LS-SVM regression predicts hyper coefficients obtained by using the QPSO model. The mathematical models are discussed in brief in this paper: (i) OHC, which results in better performance and reduces the complexity, resulting in Optimized FHT. (ii) QPSO, which replaces the least good particle with the new best obtained particle, resulting in "Optimized Least Significant Particle based QPSO" (OLSP-QPSO). On comparing the proposed cross model of optimizing DWT by LS-SVM to perform outlier filtering with linear and nonlinear noise removal standards
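For orientation, the plain Haar transform that underlies the DWT model in this abstract can be sketched in a few lines. This is the textbook one-level orthonormal Haar DWT on a 1-D sequence, not the paper's Optimal-HAAR wavelet changeover or its LS-SVM/QPSO optimization; the function names and sample values are hypothetical.

```python
import math


def haar_dwt_1d(signal):
    """One level of the orthonormal Haar DWT on an even-length sequence."""
    if len(signal) % 2 != 0:
        raise ValueError("signal length must be even")
    s = 1 / math.sqrt(2)
    # Scaled pairwise sums (approximation) and differences (detail).
    approx = [(signal[i] + signal[i + 1]) * s for i in range(0, len(signal), 2)]
    detail = [(signal[i] - signal[i + 1]) * s for i in range(0, len(signal), 2)]
    return approx, detail


def haar_idwt_1d(approx, detail):
    """Inverse of haar_dwt_1d: rebuilds the original sequence."""
    s = 1 / math.sqrt(2)
    out = []
    for a, d in zip(approx, detail):
        out.extend([(a + d) * s, (a - d) * s])
    return out


# Hypothetical sample row of pixel intensities.
row = [4, 6, 10, 12]
approx, detail = haar_dwt_1d(row)
restored = haar_idwt_1d(approx, detail)
print(approx, detail, restored)
```

For an image, the same step would be applied along rows and then along columns of each color channel; filtering schemes typically modify the detail coefficients before inverting the transform.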

Global Journal of Computer Science and Technology (GJCST)