294 research outputs found
Acoustic and Respiratory Characteristics of Infant Vocalization
The purpose of this dissertation was to explore the vibratory regimes of infant phonation. The first study examined 1) differences in overall levels of acoustic and respiratory variables between different regimes and 2) differences in relationships between the acoustic and respiratory variables among regimes. The second study examined 3) the acoustic and respiratory ranges of modal phonation with respect to other regimes and 4) the range of modal phonation among infants of different ages. Two datasets were used in the study. Dataset I was acquired from eight infants aged 8-18 months, and Dataset II from one infant aged 4-6 months. Their vocalizations and respiratory movements were recorded during interaction with adults. Phonated segments were identified through waveform, spectrogram, and auditory inspection, and categorized into six mutually exclusive regimes (modal, pulse, loft, subharmonics, biphonation, and chaos). For each regime segment, the following measurements were made: fundamental frequency (F0), sound pressure level (SPL), expiratory slope, and relative lung volume at regime initiation. A series of linear mixed-effects model analyses and analyses of variance revealed differences in mean F0 between regimes, mean SPL, and mean. Correlations between the acoustic and respiratory variables differed among regimes, indicating that their relationships were regime-dependent. The most revealing findings were that regime categories readily distributed into different regions of the intensity-frequency space, and that F0 ranges of the modal regime tended to decrease with increasing age. In addition to modal, pulse, and loft distributing around the mid, low, and high intensity-frequency regions, respectively, biphonation and subharmonics were found between the modal and loft ranges. The upper end of the F0 range for pulse was much higher in infants than in adults; however, biphonation and subharmonics rarely occurred between the pulse and modal ranges.
The range of modal F0 was about 500 Hz for the young infant in the vocal expansion stage, and about 200 Hz for older infants in the (post-)canonical stage. Although the results are tentative, this finding suggests that F0 variability decreases with age and that phonation becomes more restricted to the lower end of the F0 range.
Vocal communication of simulated pain
While evidence suggests that pain cries produced by human babies and other mammal infants communicate acoustic cues to pain intensity, whether the pain vocalisations of human adults also encode pain intensity, and which acoustic characteristics influence listeners' perceptions, remains unexplored. Here, we investigated how trained actors communicated pain by comparing the acoustic characteristics of nonverbal vocalisations expressing different levels of pain intensity (mild, moderate, and severe). We then performed playback experiments to examine whether vocalisers successfully communicated pain intensity to listeners, and which acoustic characteristics were responsible for variation in pain ratings. We found that the mean and range of voice fundamental frequency (F0, perceived as pitch), the amplitude of the vocalisation, the degree of periodicity of the vocalisation, and the proportion of the signal displaying nonlinear phenomena all increased with the level of simulated pain intensity. In turn, these parameters predicted increases in listeners' ratings of pain intensity. We also found that while different voice features contributed to increases in pain ratings within each level of expressed pain, a combination of these features explained an impressive amount of the variance in listeners' pain ratings, both across (76%) and within (31-54%) pain levels. Our results show that adult vocalisers can volitionally simulate and modulate pain vocalisations to influence listeners' perceptions of pain in a manner consistent with authentic human infant and nonhuman mammal pain vocalisations, and highlight potential for the development of a practical quantitative tool to improve pain assessment in populations unable to self-report their subjective pain experience.
Reliability of Subjective Endoscopic Parameters in the Differentiation of Essential Voice Tremor and Adductor Spasmodic Dysphonia Using High-Speed Videoendoscopy
Certain neurogenic voice disorders present with similar or overlapping auditory-perceptual voice characteristics. Developing reliable and standardized perceptual measures of vocal fold vibratory characteristics for such voice disorders can enable accurate diagnosis and lead to faster, targeted treatment. In this study, subjective perceptual vocal fold vibratory characteristics and the presence or absence of supraglottic events during phonation were investigated to differentiate between Adductor Spasmodic Dysphonia (AdSD) and Essential Vocal Fold Tremor (EVT) using high-speed videoendoscopy (HSV). The specific aims of the study were to 1) assess which subjective endoscopic vocal fold vibratory measures differentiate EVT from AdSD; and 2) assess the inter-rater and intra-rater reliability of the ratings. High-speed video recordings of vibratory vocal fold motion were selected for a retrospective analysis of existing data. The participants were classified into three groups: 16 participants with a diagnosis of Adductor Spasmodic Dysphonia, 8 participants with a clinical diagnosis of Essential Vocal Tremor, and 10 participants with a diagnosis of Both (AdSD with Tremor). The inclusion criterion for HSV data was the presence of a full view of the true vocal folds and supraglottic structures during vibration. It was hypothesized that HSV vocal fold vibratory measures and supraglottic events would distinguish EVT from AdSD and that these measures would be reliable. In addition, it was hypothesized that the vocal fold vibratory features would be more reliable than supraglottic events in differentiating between the groups. Results demonstrated mixed reliability for supraglottic and vocal fold vibratory parameters. None of the hypothesized supraglottic parameters demonstrated any significant distinction between diagnostic groups given the three raters' responses.
While all four vocal fold vibratory parameters revealed distinctive patterns between the three diagnostic categories, only two, right/left true vocal fold (TVF) symmetry and anterior/posterior TVF symmetry, met the requirements for both reliability and differentiation. For these parameters, EVT demonstrated greater vocal fold symmetry than AdSD; however, those with a diagnosis of both demonstrated the highest vocal fold symmetry.
Tone production using inspiratory phonation by Cantonese speakers
Also available in print. Thesis (B.Sc)--University of Hong Kong, 2008. A dissertation submitted in partial fulfilment of the requirements for the Bachelor of Science (Speech and Hearing Sciences), The University of Hong Kong, June 30, 2008. Includes bibliographical references (p. 27-29).
Detection of Irregular Phonation in Speech
This work addresses the detection and characterization of irregular phonation in spontaneous speech. While published work tackles this problem as a two-hypothesis problem only in regions of speech with phonation, this work focuses on distinguishing aperiodicity due to frication from that due to irregular voicing. This work also deals with the correction of a current pitch tracking algorithm in regions of irregular phonation, where most pitch trackers fail to perform well. Relying on the detection of regions of irregular phonation, an acoustic parameter is developed to characterize these regions for speaker identification applications. The detection performance of the algorithm on a clean speech corpus (TIMIT) is 91.8%, with 17.42% false detections. On a telephone speech corpus (NIST 98), the detection performance is 89.2%, with 12.8% false detections. The pitch detection accuracy increased from 95.4% to 98.3% for the TIMIT database, and from 94.8% to 97.4% for the NIST 98 database. The creakiness parameter was added to a set of seven acoustic parameters for speaker identification on the NIST 98 database, and the performance was found to be enhanced by 1.5% for female speakers and 0.4% for male speakers for a population of 250 speakers.
A therapeutic approach for improved vocal performance in individuals in teaching occupations
"Teachers represent one of the largest groups of professional voice users in the country and are among those individuals at greatest risk for developing vocal problems. This study investigated the efficacy of a specific therapy approach for treating voice problems among teachers. Five female teachers with reported voice problems participated in six sessions of voice therapy to improve body posture and diaphragmatic breathing, establish forward resonance patterns, reduce laryngeal tension through speaking and singing exercises, and improve vocal hygiene habits. Data was obtained via perceptual analysis, objective voice measurements, and two patient-based treatment outcome measures: the Voice Handicap Index (VHI) and a Vocal Symptoms Questionnaire. Results suggested that teachers with reported voice problems can establish and maintain healthier, more efficient voice use and improved vocal hygiene habits with the described course of treatment. Furthermore, voice clinicians can consider these techniques as effective alternatives in clinical settings when treating this population."--Abstract from author-supplied metadata