Search CORE

6 research outputs found

A Study on Speech Classification Based on Deep Neural Network under Adverse Environments

Author: PHAPATANABURI KHOMDET
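Publication venue: Professor (chief examiner) Masahiro Iwahashi; Professor Koichi Yamada; Professor Masato Aketagawa; Professor Takashi Yukawa; Associate Professor Yasunori Sugita; Professor Longbiao Wang, Tianjin University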
Publication date: 31/08/2017

National University Corporation Nagaoka University of Technology

Nagaoka University of Technology Institutional Repository

A Study on Speech Classification Based on Deep Neural Network under Adverse Environments

Author: PHAPATANABURI KHOMDET
Publication venue: Nagaoka University of Technology

Institutional Repositories DataBase (IRDB)

Real-Time Gait Phase Detection Using Wearable Sensors for Transtibial Prosthesis Based on a kNN Algorithm

Authors: Atcharawan Rattanasak, Bura Sindhupakorn, Khomdet Phapatanaburi, Monthippa Uthansakul, Peerapong Uthansakul, Supakit Rooppakhun, Talit Jumphoo
Publication venue: MDPI AG
Publication date: 01/06/2022

People who have lost a leg must use a prosthesis to walk. However, traditional prostheses cannot move with and support the human gait because they have no mechanisms or algorithms to control them, which makes walking difficult for the wearer. To overcome this problem, we developed an insole device with a wearable sensor for real-time gait phase detection based on the kNN (k-nearest neighbor) algorithm for prosthetic control. The kNN algorithm is applied to the raw data obtained from the pressure sensors in the insole to predict seven walking phases: stand, heel strike, foot flat, midstance, heel off, toe-off, and swing. As a result, the predictive decision in each gait cycle that controls the ankle movement of the transtibial prosthesis improves with each walk. The approach achieves 81.43% accuracy for gait phase detection and can control the transtibial prosthesis effectively at a maximum walking speed of 6 km/h. Moreover, the insole device is small, lightweight, and unaffected by the physical factors of the wearer.
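The kNN classification step described in the abstract can be illustrated with a minimal sketch. Only the seven phase names come from the abstract; the four pressure channels, the cluster centers, the noise level, and k=3 are hypothetical values chosen for illustration and are not the authors' sensor layout or settings.

```python
import numpy as np

# The seven gait phases named in the abstract.
PHASES = ["stand", "heel strike", "foot flat", "midstance",
          "heel off", "toe-off", "swing"]

def knn_predict(train_x, train_y, query, k=3):
    """Return the majority label among the k training samples nearest to query."""
    dists = np.linalg.norm(train_x - query, axis=1)   # Euclidean distance to each sample
    nearest = np.argsort(dists)[:k]                   # indices of the k closest samples
    votes = np.bincount(train_y[nearest], minlength=len(PHASES))
    return int(np.argmax(votes))                      # ties go to the lowest label index

# Hypothetical pressure patterns (heel, midfoot, forefoot, toe), one per phase.
centers = np.array([
    [0.5, 0.5, 0.5, 0.5],   # stand
    [1.0, 0.0, 0.0, 0.0],   # heel strike
    [0.8, 0.8, 0.3, 0.0],   # foot flat
    [0.4, 0.9, 0.6, 0.1],   # midstance
    [0.0, 0.3, 1.0, 0.5],   # heel off
    [0.0, 0.0, 0.3, 1.0],   # toe-off
    [0.0, 0.0, 0.0, 0.0],   # swing
])

# Synthetic training set: 20 noisy readings around each hypothetical pattern.
rng = np.random.default_rng(0)
train_x = np.vstack([c + 0.02 * rng.standard_normal((20, 4)) for c in centers])
train_y = np.repeat(np.arange(len(PHASES)), 20)

sample = centers[5] + 0.02            # a reading close to the toe-off pattern
phase = PHASES[knn_predict(train_x, train_y, sample)]
print(phase)
```

In a real-time setting, a function like `knn_predict` would be called once per incoming sensor frame, and the predicted phase would drive the ankle controller; how the authors do this is not stated in the abstract.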

Multidisciplinary Digital Publishing Institute

Directory of Open Access Journals

PubMed Central

Whispered Speech Detection Using Glottal Flow-Based Features

Authors: Khomdet Phapatanaburi, Longbiao Wang, Monthippa Uthansakul, Patikorn Anchuen, Peerapong Uthansakul, Prawit Buayai, Talit Jumphoo, Wongsathon Pathonsuwan
Publication venue: MDPI AG
Publication date: 01/04/2022

Recent studies have reported that the performance of Automatic Speech Recognition (ASR) systems designed for normal speech deteriorates notably when they are evaluated on whispered speech. Detecting whispered speech is therefore useful for reducing the mismatch between training and testing conditions. This paper proposes two new Glottal Flow (GF)-based features for whispered speech detection: the GF-based Mel-Frequency Cepstral Coefficient (GF-MFCC), a magnitude-based feature, and the GF-based relative phase (GF-RP), a phase-based feature. The main contribution of the proposed features is that they extract magnitude and phase information from the GF signal. In GF-MFCC, Mel-frequency cepstral coefficient (MFCC) feature extraction is modified so that the estimated GF signal, derived by iterative adaptive inverse filtering, replaces the raw speech signal as the input. Similarly, GF-RP modifies relative phase (RP) feature extraction by using the GF signal instead of the raw speech signal. Whispered speech production yields a lower-amplitude glottal source than normal speech production; as a result, the Discrete Fourier Transform (DFT) of whispered speech yields lower magnitude and phase information, which distinguishes it from normal speech. It is therefore hypothesized that both proposed features are useful for whispered speech detection. In addition, feature-level and score-level combinations built on the individual GF-MFCC and GF-RP features are proposed to further improve detection performance. The proposed features and combinations are evaluated on the CHAIN corpus. GF-MFCC outperforms MFCC, and GF-RP outperforms RP. The feature-level combinations of MFCC and GF-MFCC (MFCC&GF-MFCC) and of RP and GF-RP (RP&GF-RP) improve further on using either constituent feature alone. Finally, the combined score of MFCC&GF-MFCC and RP&GF-RP gives the best frame-level accuracy of 95.01% and an utterance-level accuracy of 100%.
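The difference between feature-level and score-level combination can be sketched as follows. The abstract does not state the fusion rule or how frame decisions are turned into an utterance decision, so the weighted sum, the `weight` and `threshold` parameters, the majority rule, and all numbers below are assumptions for illustration only.

```python
import numpy as np

def feature_level_combination(feat_a, feat_b):
    """Concatenate two per-frame feature matrices (frames x dims) along the feature axis."""
    return np.concatenate([feat_a, feat_b], axis=1)

def score_level_combination(score_a, score_b, weight=0.5):
    """Weighted sum of two per-frame score vectors; weight is a hypothetical tuning knob."""
    return weight * score_a + (1.0 - weight) * score_b

def utterance_decision(frame_scores, threshold=0.0):
    """Label an utterance as whispered when most frames score above the threshold."""
    frame_is_whisper = frame_scores > threshold
    return bool(frame_is_whisper.mean() > 0.5)

# Made-up inputs: 5 frames of 13-dimensional MFCC and GF-MFCC features.
mfcc = np.zeros((5, 13))
gf_mfcc = np.ones((5, 13))
combined = feature_level_combination(mfcc, gf_mfcc)   # one 26-dim vector per frame

# Made-up per-frame scores from a magnitude-based and a phase-based detector.
magnitude_scores = np.array([1.2, -0.3, 0.8, 0.5, -0.1])
phase_scores = np.array([0.4, 0.6, -0.2, 0.9, 0.3])
fused = score_level_combination(magnitude_scores, phase_scores)
is_whisper = utterance_decision(fused)
print(combined.shape, fused, is_whisper)
```

Feature-level combination hands a single classifier a longer input vector, whereas score-level combination keeps two classifiers separate and merges only their outputs, which is why the two can be stacked as in the abstract's best-performing configuration.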

Directory of Open Access Journals

Whispered Speech Detection Using Glottal Flow-Based Features

Authors: Khomdet Phapatanaburi, Longbiao Wang, Monthippa Uthansakul, Patikorn Anchuen, Peerapong Uthansakul, Prawit Buayai, Talit Jumphoo, Wongsathon Pathonsuwan
Publication venue: MDPI AG
Publication date: 08/04/2022

Multidisciplinary Digital Publishing Institute

Noise robust voice activity detection using joint phase and magnitude based feature enhancement

Authors: A Benyassine, B Ren, D Ying, DS Williamson, G Hinton, GE Hinton, I McCowan, J Wu, J-H Chang, Khomdet Phapatanaburi, Longbiao Wang, Masahiro Iwahashi, N Kitaoka, R Tucker, RE Fan, S Nakagawa, SB Davis, Seiichi Nakagawa, Weifeng Li, X-L Zhang, XL Zhang, Y Ueda, Y Xu, Zeyan Oo
Publication venue: Springer Science and Business Media LLC

Crossref