
4 research outputs found

Feature Transformation Based on Generalization of Linear Discriminant Analysis

Authors: Makoto Sakai, Norihide Kitaoka, Seiichi Nakagawa
Publication venue: 'IntechOpen'
Publication date: 01/11/2008

IntechOpen

Crossref

Applying Bottle Neck Feature to Vietnamese Speech Recognition (Áp dụng Bottle Neck Feature cho nhận dạng tiếng nói tiếng Việt)

Authors: Huy Nguyễn Văn, Mai Lương Chi, Thắng Vũ Tất
Publication venue: 'Publishing House for Science and Technology, Vietnam Academy of Science and Technology'
Publication date: 03/12/2013

This paper presents the basic idea of the Bottle Neck Feature (BNF), a type of speech feature extracted through a neural network, and the process for extracting it. We apply BNF to Vietnamese speech recognition using a five-layer Multilayer Perceptron (MLP) with different sizes for the first hidden layer. BNFs are extracted from two types of input features, Perceptual Linear Prediction (PLP) and Mel Frequency Cepstral Coefficient (MFCC), in order to evaluate how each performs once BNF is applied. The experiments are carried out on a VOV (Voice of Vietnam) data set. The results show that BNF is effective for Vietnamese speech: the Word Error Rate (WER) improves by 6% to 7% compared to the baseline system, and MFCC input gives better results than PLP.
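The extraction step described in this abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the layer sizes, the sigmoid activation, the choice to read the bottleneck before the nonlinearity, and the names `init_mlp` and `extract_bnf` are all assumptions made for illustration, and the weights here are random rather than trained.

```python
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def init_mlp(layer_sizes, seed=0):
    """Create random (untrained) weights for an MLP; a real system would train these."""
    rng = np.random.default_rng(seed)
    return [
        (rng.standard_normal((n_in, n_out)) * 0.1, np.zeros(n_out))
        for n_in, n_out in zip(layer_sizes[:-1], layer_sizes[1:])
    ]

def extract_bnf(frames, params, bottleneck_index=1):
    """Run frames forward and return the bottleneck layer's linear outputs.

    Layers after the bottleneck are only needed while training the MLP,
    so the forward pass stops once the bottleneck is reached.
    """
    h = frames
    for i, (W, b) in enumerate(params):
        z = h @ W + b
        if i == bottleneck_index:
            return z  # assumption: features are taken before the nonlinearity
        h = sigmoid(z)
    raise ValueError("bottleneck_index is outside the network")

# Hypothetical five-layer MLP: input, first hidden, bottleneck, third hidden, output.
# All sizes are illustrative assumptions, not values from the paper.
layer_sizes = [39, 1000, 39, 1000, 100]
params = init_mlp(layer_sizes)

# Five fake 39-dimensional input frames standing in for MFCC or PLP vectors.
frames = np.random.default_rng(1).standard_normal((5, 39))
bnf = extract_bnf(frames, params)
print(bnf.shape)  # (5, 39): one bottleneck feature vector per input frame
```

Varying the second entry of `layer_sizes` corresponds to the abstract's experiment with different sizes for the first hidden layer; the resulting `bnf` vectors would replace the raw MFCC or PLP features as input to the recognizer.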

Vietnam Academy of Science and Technology: Journals Online

Generalization of Linear Discriminant Analysis used in Segmental Unit Input HMM for Speech Recognition

Publication venue: 'Institute of Electrical and Electronics Engineers (IEEE)'

Crossref

Speech Recognition

Publication venue: 'IntechOpen'
Publication date: 20/04/2021

Chapters in the first part of the book cover the essential speech processing techniques for building robust automatic speech recognition systems: the representation of speech signals and the methods for speech-feature extraction, acoustic and language modeling, efficient algorithms for searching the hypothesis space, and multimodal approaches to speech recognition. The last part of the book is devoted to other speech processing applications that can use the information from automatic speech recognition: speaker identification and tracking, prosody modeling in emotion-detection systems, and other applications that are able to operate in real-world environments, such as mobile communication services and smart homes.

Directory of Open Access Books (DOAB)