
    Identification of nonlinear lateral flow immunoassay state-space models via particle filter approach

    This is the post-print of the article; the official published version can be accessed from the link below. Copyright @ 2012 IEEE.
    In this paper, the particle filtering approach is used, together with the kernel smoothing method, to identify a state-space model for the lateral flow immunoassay from available but short time-series measurements. The lateral flow immunoassay is viewed as a nonlinear dynamic stochastic model consisting of equations for the biochemical reaction system as well as the measurement output. The well-known extended Kalman filter is chosen as the importance density of the particle filter for the purpose of modeling the nonlinear lateral flow immunoassay. Using the developed particle filter, both the states and the parameters of the nonlinear state-space model can be identified simultaneously. The identified model is of fundamental significance for the development of lateral flow immunoassay quantification. It is shown that the proposed particle filtering approach works well for modeling the lateral flow immunoassay.
    This work was supported in part by the International Science and Technology Cooperation Project of China under Grant 2009DFA32050, the Natural Science Foundation of China under Grant 61104041, and the International Science and Technology Cooperation Project of Fujian Province of China under Grant 2009I0016.
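    The joint state-and-parameter estimation described in this abstract can be sketched in a few lines. The model and all numbers below are made up for illustration (a scalar logistic-growth stand-in for the biochemical reaction dynamics), and the paper's EKF importance density and kernel smoothing are replaced by the simpler bootstrap proposal with parameter jitter; this is a minimal sketch, not the authors' method.

```python
import math
import random

random.seed(0)

def simulate(theta, n, q=0.01, r=0.05):
    """Generate a short noisy time series from a toy scalar model
    x[t+1] = x[t] + theta * x[t] * (1 - x[t]) + process noise."""
    x, ys = 0.1, []
    for _ in range(n):
        x = x + theta * x * (1 - x) + random.gauss(0, q)
        ys.append(x + random.gauss(0, r))  # noisy measurement
    return ys

def particle_filter(ys, n_particles=500, q=0.01, r=0.05):
    """Bootstrap particle filter over an augmented (state, parameter)
    particle; small parameter jitter is a crude stand-in for the
    kernel-smoothing step used in the paper."""
    parts = [(0.1, random.uniform(0.1, 0.9)) for _ in range(n_particles)]
    for y in ys:
        moved, weights = [], []
        for x, th in parts:
            th = th + random.gauss(0, 0.005)            # parameter jitter
            x = x + th * x * (1 - x) + random.gauss(0, q)  # propagate state
            weights.append(math.exp(-0.5 * ((y - x) / r) ** 2))
            moved.append((x, th))
        total = sum(weights)
        probs = [w / total for w in weights]
        parts = random.choices(moved, weights=probs, k=n_particles)
    # posterior mean of the parameter after seeing all measurements
    return sum(th for _, th in parts) / n_particles

ys = simulate(theta=0.5, n=60)
est = particle_filter(ys)
print(round(est, 2))  # estimate of theta (true value here is 0.5)
```

    Even with only 60 measurements, the resampled particle cloud concentrates around the generating parameter, which mirrors the paper's point that states and parameters can be identified jointly from short series.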

    Effects of errorless learning on the acquisition of velopharyngeal movement control

    Session 1pSC - Speech Communication: Cross-Linguistic Studies of Speech Sound Learning of the Languages of Hong Kong (Poster Session).
    The implicit motor learning literature suggests a benefit for learning if errors are minimized during practice. This study investigated whether the same principle holds for learning velopharyngeal movement control. Normal-speaking participants learned to produce hypernasal speech in either an errorless learning condition (in which the possibility for errors was limited) or an errorful learning condition (in which it was not). The nasality level of the participants' speech was measured by nasometer and reflected by nasalance scores (in %). Errorless learners practiced producing hypernasal speech with a threshold nasalance score of 10% at the beginning, which gradually increased to a threshold of 50% at the end. The same set of threshold targets was presented to errorful learners, but in reversed order. Errors were defined as the proportion of speech with a nasalance score below the threshold. The results showed that, relative to errorful learners, errorless learners displayed fewer errors (17.7% vs. 50.7%) and a higher mean nasalance score (46.7% vs. 31.3%) during the acquisition phase. Furthermore, errorless learners outperformed errorful learners in both retention and novel transfer tests.
    Acknowledgment: Supported by The University of Hong Kong Strategic Research Theme for Sciences of Learning. © 2012 Acoustical Society of America.
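    The threshold scheduling described in this abstract (errorless: thresholds rising from 10% to 50%; errorful: the same thresholds in reverse) can be sketched as follows. The nasalance scores below are invented for illustration, not data from the study; an "error" is a production whose nasalance falls below the current threshold.

```python
def error_rate(nasalance_scores, thresholds):
    """Proportion of productions whose nasalance score (%) falls
    below the threshold in force at that point in practice."""
    errors = sum(1 for s, t in zip(nasalance_scores, thresholds) if s < t)
    return errors / len(nasalance_scores)

errorless_thresholds = [10, 20, 30, 40, 50]       # gradually raised
errorful_thresholds = errorless_thresholds[::-1]  # same targets, reversed

# hypothetical learner whose nasalance improves steadily over practice
scores = [15, 22, 33, 41, 48]

print(error_rate(scores, errorless_thresholds))  # 0.2
print(error_rate(scores, errorful_thresholds))   # 0.4
```

    For this toy learner the errorless schedule produces fewer errors than the errorful one, which is the direction of the effect the study reports.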

    An evaluation of intrusive instrumental intelligibility metrics

    Instrumental intelligibility metrics are commonly used as an alternative to listening tests. This paper evaluates 12 monaural intrusive intelligibility metrics: SII, HEGP, CSII, HASPI, NCM, QSTI, STOI, ESTOI, MIKNN, SIMI, SIIB, and sEPSM^corr. In addition, this paper investigates the ability of intelligibility metrics to generalize to new types of distortion and analyzes why the top-performing metrics perform so well. The intelligibility data were obtained from 11 listening tests described in the literature. The stimuli included Dutch, Danish, and English speech distorted by additive noise, reverberation, competing talkers, pre-processing enhancement, and post-processing enhancement. SIIB and HASPI had the highest performance, achieving average correlations with listening test scores of ρ = 0.92 and ρ = 0.89, respectively. The high performance of SIIB may, in part, be the result of SIIB's developers having had access to all the intelligibility data considered in the evaluation. The results show that intelligibility metrics tend to perform poorly on data sets that were not used during their development. By modifying the original implementations of SIIB and STOI, the advantage of reducing statistical dependencies between input features is demonstrated. Additionally, the paper presents a new version of SIIB called SIIB^Gauss, which has performance similar to SIIB and HASPI but is two orders of magnitude faster to compute.
    Comment: Published in IEEE/ACM Transactions on Audio, Speech, and Language Processing, 201
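    The evaluation criterion used throughout this abstract, the correlation ρ between per-condition metric scores and listening-test intelligibility, can be sketched directly. The scores below are hypothetical stand-ins for a metric's output and the corresponding listening-test results; the paper averages such correlations over 11 listening tests.

```python
import math

def pearson(xs, ys):
    """Pearson correlation coefficient between two equal-length lists."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sx = math.sqrt(sum((x - mx) ** 2 for x in xs))
    sy = math.sqrt(sum((y - my) ** 2 for y in ys))
    return cov / (sx * sy)

metric_scores = [0.2, 0.4, 0.55, 0.7, 0.9]  # hypothetical metric output
listening_scores = [10, 35, 60, 80, 95]     # hypothetical % words correct

r = pearson(metric_scores, listening_scores)
print(round(r, 3))  # close to 1.0 for a well-behaved metric
```

    In practice a monotonic mapping (often a logistic function) is fitted between metric output and intelligibility before correlating, but the bare correlation already captures the ranking reported here.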

    Vocal fold vibratory and acoustic features in fatigued Karaoke singers

    Session 3aMU - Musical Acoustics and Speech Communication: Singing Voice in Asian Cultures.
    Karaoke is a popular singing entertainment, particularly in Asia, and is gaining popularity in the rest of the world. In Karaoke, an amateur singer sings with background music and video (usually guided by lyric captions on the video screen) played by a Karaoke machine, using a microphone and an amplification system. As Karaoke singers usually have no formal training, they may be more vulnerable to vocal fatigue, since they may overuse and/or misuse their voices in intensive and extensive singing activities. It is unclear whether vocal fatigue is accompanied by any change in the vibration pattern or physiology of the vocal folds. In this study, 20 participants aged 18 to 23 years with normal voice were recruited to take part in a prolonged singing task that induced vocal fatigue. High-speed laryngoscopic images and acoustic signals were recorded before and after the singing task. Images of /i/ phonation were quantitatively analyzed using the High Speed Video Processing (HSVP) program (Yiu, et al. 2010). It was found that the glottis became relatively narrower following fatigue, while the acoustic measures were not sensitive to changes following fatigue. © 2012 Acoustical Society of America.

    Language experience enhances early cortical pitch-dependent responses

    Pitch processing at cortical and subcortical stages of processing is shaped by language experience. We recently demonstrated that specific components of the cortical pitch response (CPR) index the more rapidly-changing portions of the high rising Tone 2 of Mandarin Chinese, in addition to marking pitch onset and sound offset. In this study, we examine how language experience (Mandarin vs. English) shapes the processing of different temporal attributes of pitch reflected in the CPR components, using stimuli representative of within-category variants of Tone 2. Results showed that the magnitude of the CPR components (Na–Pb and Pb–Nb) and the correlation between these two components and pitch acceleration were stronger for Chinese listeners than for English listeners for stimuli that fell within the range of Tone 2 citation forms. Discriminant function analysis revealed that the Na–Pb component was more than twice as important as Pb–Nb in grouping listeners by language affiliation. In addition, a stronger stimulus-dependent, rightward asymmetry was observed for the Chinese group at the temporal, but not frontal, electrode sites. This finding may reflect selective recruitment of experience-dependent, pitch-specific mechanisms in the right auditory cortex to extract more complex, time-varying pitch patterns. Taken together, these findings suggest that long-term language experience shapes early sensory-level processing of pitch in the auditory cortex, and that the sensitivity of the CPR may vary depending on the relative linguistic importance of specific temporal attributes of dynamic pitch.

    Improved Emotion Recognition Using Gaussian Mixture Model and Extreme Learning Machine in Speech and Glottal Signals

    Recently, researchers have paid increasing attention to inferring the emotional state of an individual from his/her speech signals, since speech is the fastest and most natural method of communication between individuals. In this work, a new feature enhancement method using a Gaussian mixture model (GMM) was proposed to enhance the discriminatory power of the features extracted from speech and glottal signals. Three different emotional speech databases were used to evaluate the proposed methods. An extreme learning machine (ELM) and a k-nearest neighbor (kNN) classifier were employed to classify the different types of emotions. Several experiments were conducted, and the results show that the proposed methods significantly improved speech emotion recognition performance compared to works published in the literature.
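    The classification stage mentioned in this abstract can be sketched with the simpler of its two classifiers, kNN. The two-dimensional features and emotion labels below are toy stand-ins for the GMM-enhanced speech/glottal features used in the paper, and the ELM is omitted for brevity; this is an illustration of the classification step, not the paper's pipeline.

```python
import math
from collections import Counter

def knn_predict(train, query, k=3):
    """Classify a query feature vector by majority vote among its
    k nearest training examples (Euclidean distance)."""
    dists = sorted((math.dist(feat, query), label) for feat, label in train)
    votes = Counter(label for _, label in dists[:k])
    return votes.most_common(1)[0][0]

# toy training set: (feature vector, emotion label)
train = [
    ((0.90, 0.10), "angry"), ((0.80, 0.20), "angry"), ((0.85, 0.15), "angry"),
    ((0.10, 0.90), "sad"),   ((0.20, 0.80), "sad"),   ((0.15, 0.85), "sad"),
]

print(knn_predict(train, (0.80, 0.25)))  # angry
```

    The paper's contribution is upstream of this step: the GMM-based enhancement reshapes the features so that classes separate more cleanly before any classifier, kNN or ELM, is applied.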