Design of Multiplier-less FIR filters with Simultaneously Variable Bandwidth and Fractional Delay
Low complexity and reconfigurability are key features in emerging communication applications that must support multiple standards and operation modes. To obtain these features, an efficient implementation of a finite impulse response (FIR) filter with simultaneously variable bandwidth and fractional delay is proposed in this paper. The reduction in implementation complexity is achieved by converting the continuous filter coefficients to the signed power of two (SPT) space, which may degrade performance. Hence, in this paper, Artificial Bee Colony (ABC) optimization is deployed to find a near-optimal solution in the discrete space.
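The abstract leaves out the details of the SPT conversion and the ABC search. As a rough, assumed illustration of the discrete space involved, the Python sketch below greedily quantizes a single coefficient into a small number of signed power-of-two terms; the term budget (max_terms) and exponent floor (min_exp) are hypothetical choices, and the paper's ABC optimization would instead search over such discretized coefficient sets to limit the performance degradation.

```python
import math

# Illustrative only: greedy decomposition of one filter coefficient into a
# sum of signed power-of-two (SPT) terms. Each term costs a shift-and-add
# in hardware instead of a multiplication.
def to_spt(coeff, max_terms=3, min_exp=-12):
    terms = []
    residual = coeff
    for _ in range(max_terms):
        if residual == 0.0:
            break
        # Nearest power of two to the residual (in the log domain),
        # clamped to the assumed word-length floor `min_exp`.
        exp = max(round(math.log2(abs(residual))), min_exp)
        term = math.copysign(2.0 ** exp, residual)
        terms.append(term)
        residual -= term
    return terms, residual  # residual is the quantization error

terms, err = to_spt(0.7071)
print(terms, err)  # approx. [1.0, -0.25, -0.03125], error approx. -0.0117
```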
The impact of speaking rate on acoustic-to-articulatory inversion
Acoustic characteristics and articulatory movements are known to vary with speaking rate. This study investigates the role of speaking rate in acoustic-to-articulatory inversion (AAI) performance using deep neural networks (DNNs). Since a fast speaking rate causes fast articulatory motion as well as changes in the spectro-temporal characteristics of the speech signal, the articulatory-acoustic map at a fast speaking rate could differ from that at a slow speaking rate. We examine how these differences alter the accuracy with which different articulatory positions can be recovered from the acoustics. AAI experiments are performed in both matched and mismatched train-test conditions using data from five subjects at three different rates - normal, fast and slow (the fast and slow rates are at least 1.3 times faster and slower, respectively, than the normal rate). Experiments in the matched case reveal that the errors in estimating the vertical motion of sensors on the tongue from acoustics at a fast speaking rate are significantly higher than those at a slow speaking rate. Experiments in mismatched conditions reveal a consistent drop in AAI performance compared to the matched condition. Further experiments, in which AAI is trained on acoustic-articulatory data pooled from different speaking rates, reveal that a single DNN-based AAI model is capable of learning multiple rate-specific mappings.
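A minimal sketch of the DNN-based AAI setup described above follows; the feature dimensions, network size, and the use of scikit-learn's MLPRegressor are assumptions, and random arrays stand in for real acoustic-articulatory recordings.

```python
import numpy as np
from sklearn.neural_network import MLPRegressor

rng = np.random.default_rng(0)
n_frames = 2000
X = rng.normal(size=(n_frames, 39))  # e.g. MFCCs + deltas per acoustic frame
Y = rng.normal(size=(n_frames, 12))  # sensor positions on the articulators

# Matched-condition training: acoustics and articulography from one rate.
aai = MLPRegressor(hidden_layer_sizes=(256, 256), activation="tanh",
                   max_iter=200)
aai.fit(X, Y)

# RMSE per articulatory dimension; a rate-mismatch experiment would
# instead train on one rate's data and evaluate on another's.
rmse = np.sqrt(((Y - aai.predict(X)) ** 2).mean(axis=0))
```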
A Comparative Study of Acoustic-to-Articulatory Inversion for Neutral and Whispered Speech
Whispered speech is known to differ from neutral speech in both acoustics and articulation. In this study, we compare the accuracy with which articulation can be recovered from the acoustics of each type of speech individually. Acoustic-to-articulatory inversion (AAI) is performed on twelve articulatory features using a deep neural network (DNN) with data obtained from four subjects. We consider AAI in matched and mismatched train-test conditions, where the speech types in training and test are identical and different, respectively. Experiments in the matched condition reveal that AAI performance for whispered speech drops significantly compared to that for neutral speech only for the jaw, tongue tip and tongue body, consistently across all four subjects. This indicates that whispered speech encodes information about the remaining articulators to a degree similar to that of neutral speech. Experiments in the mismatched condition show a consistent drop in AAI performance compared to the matched condition. The drop from the matched to the mismatched condition is found to be highest for the upper lip, which indicates that upper lip movement could be encoded differently in whispered speech than in neutral speech.
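The matched/mismatched protocol can be pictured as a small train-test grid over speech types. The sketch below is a hypothetical, self-contained version with placeholder data; fake_split merely stands in for each speech type's train/test split.

```python
import numpy as np
from sklearn.neural_network import MLPRegressor

rng = np.random.default_rng(1)

def fake_split(dim_in=39, dim_out=12, n=500):
    # Placeholder (train, test) split for one speech type.
    make = lambda: (rng.normal(size=(n, dim_in)), rng.normal(size=(n, dim_out)))
    return make(), make()

data = {"neutral": fake_split(), "whispered": fake_split()}

results = {}
for train_type, ((X_tr, Y_tr), _) in data.items():
    model = MLPRegressor(hidden_layer_sizes=(256,), max_iter=100).fit(X_tr, Y_tr)
    for test_type, (_, (X_te, Y_te)) in data.items():
        err = np.sqrt(((Y_te - model.predict(X_te)) ** 2).mean(axis=0))
        results[(train_type, test_type)] = err  # RMSE per articulatory feature

# Equal (train, test) types are matched conditions; unequal are mismatched.
```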
Comparison of Speech Tasks for Automatic Classification of Patients with Amyotrophic Lateral Sclerosis and Healthy Subjects
In this work, we consider acoustic- and articulatory-feature-based automatic classification of amyotrophic lateral sclerosis (ALS) patients and healthy subjects using speech tasks. In particular, we compare the roles of three types of speech tasks, namely rehearsed speech, spontaneous speech and repeated words. Simultaneous articulatory and speech data were recorded from 8 healthy controls and 8 ALS patients using an AG501 electromagnetic articulograph for the classification experiments. In addition to typical acoustic and articulatory features, new articulatory features are proposed for classification. As classifiers, both deep neural networks (DNNs) and support vector machines (SVMs) are examined. The classification experiments reveal that the proposed articulatory features outperform the other acoustic and articulatory features with both classifiers; however, the SVM performs better than the DNN with the proposed features. Among the three speech tasks considered, rehearsed speech was found to provide the highest F-score of 1, followed by an F-score of 0.92 when both repeated words and spontaneous speech are used for classification.
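A hedged sketch of the classifier comparison follows; the feature dimension, model hyperparameters and the random placeholder data are assumptions for illustration, not the paper's pipeline.

```python
import numpy as np
from sklearn.svm import SVC
from sklearn.neural_network import MLPClassifier
from sklearn.model_selection import train_test_split
from sklearn.metrics import f1_score

rng = np.random.default_rng(0)
X = rng.normal(size=(160, 24))    # per-utterance acoustic + articulatory features
y = rng.integers(0, 2, size=160)  # 0 = healthy control, 1 = ALS (placeholder)
X_tr, X_te, y_tr, y_te = train_test_split(X, y, test_size=0.25, random_state=0)

for name, clf in [("SVM", SVC(kernel="rbf")),
                  ("DNN", MLPClassifier(hidden_layer_sizes=(64, 64),
                                        max_iter=300))]:
    clf.fit(X_tr, y_tr)
    print(name, "F-score:", f1_score(y_te, clf.predict(X_te)))
```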