
    Objective Gender and Age Recognition from Speech Sentences

    In this work, an automatic gender and age recognizer from speech is investigated. The features relevant to gender recognition are selected from the first four formant frequencies and twelve MFCCs and are fed to an SVM classifier, while the features relevant to age are used with a k-NN classifier for the age recognition model, using MATLAB as a simulation tool. A special selection of robust features, based on the frequency range each feature represents, is used in this work to improve the results of the gender and age classifiers. The gender and age classification algorithms are evaluated using 114 (clean and noisy) speech samples uttered in the Kurdish language. The two-class gender recognition model (adult males and adult females) reached 96% recognition accuracy, while the three-category model (adult males, adult females, and children) achieved 94% recognition accuracy. For the age recognition model, speakers are categorized into seven age groups; after selecting the features relevant to age, the model achieved 75.3% recognition accuracy. For further improvement, a de-noising technique is applied to the noisy speech signals, followed by selection of the features affected by the de-noising process, resulting in 81.44% recognition accuracy.
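    A minimal sketch of a pipeline of this kind, assuming Python with librosa and scikit-learn rather than the authors' MATLAB implementation: twelve MFCCs plus four rough formant estimates feed an SVM for gender and a k-NN for age. The feature-extraction details, file paths and label arrays (wav_paths, y_gender, y_age) are illustrative assumptions, not the paper's exact procedure.

    ```python
    # Hedged sketch: MFCC + formant-style features -> SVM (gender) and k-NN (age).
    import numpy as np
    import librosa
    from sklearn.svm import SVC
    from sklearn.neighbors import KNeighborsClassifier
    from sklearn.pipeline import make_pipeline
    from sklearn.preprocessing import StandardScaler

    def formant_estimates(y, sr, order=12, n_formants=4):
        """Rough formant estimates (Hz) from the angles of LPC roots."""
        a = librosa.lpc(y, order=order)
        roots = [r for r in np.roots(a) if np.imag(r) >= 0]
        freqs = sorted(np.angle(r) * sr / (2 * np.pi) for r in roots)
        freqs = [f for f in freqs if f > 90.0]            # drop near-DC roots
        return (freqs + [0.0] * n_formants)[:n_formants]  # pad/trim to n_formants

    def extract_features(path):
        y, sr = librosa.load(path, sr=16000)
        mfcc = librosa.feature.mfcc(y=y, sr=sr, n_mfcc=12).mean(axis=1)  # 12 MFCCs
        return np.concatenate([formant_estimates(y, sr), mfcc])          # 4 + 12 dims

    # wav_paths, y_gender and y_age are assumed to come with the dataset:
    # X = np.vstack([extract_features(p) for p in wav_paths])
    # gender_clf = make_pipeline(StandardScaler(), SVC(kernel="rbf")).fit(X, y_gender)
    # age_clf = make_pipeline(StandardScaler(), KNeighborsClassifier(5)).fit(X, y_age)
    ```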

    Expression of gender in the human voice: investigating the “gender code”

    We can easily and reliably identify the gender of an unfamiliar interlocutor over the telephone. This is because our voice is “sexually dimorphic”: men typically speak with a lower fundamental frequency (F0 - lower pitch) and lower vocal tract resonances (ΔF – “deeper” timbre) than women. While the biological bases of these differences are well understood, and mostly down to size differences between men and women, very little is known about the extent to which we can play with these differences to accentuate or de-emphasise our perceived gender, masculinity and femininity in a range of social roles and contexts. The general aim of this thesis is to investigate the behavioural basis of gender expression in the human voice in both children and adults. More specifically, I hypothesise that, on top of the biologically determined sexual dimorphism, humans use a “gender code” consisting of vocal gestures (global F0 and ΔF adjustments) aimed at altering the gender attributes conveyed by their voice. In order to test this hypothesis, I first explore how acoustic variation of sexually dimorphic acoustic cues (F0 and ΔF) relates to physiological differences in pre-pubertal speakers (vocal tract length) and adult speakers (body height and salivary testosterone levels), and show that voice gender variation cannot be solely explained by static, biologically determined differences in vocal apparatus and body size of speakers. Subsequently, I show that both children and adult speakers can spontaneously modify their voice gender by lowering (raising) F0 and ΔF to masculinise (feminise) their voice, a key ability for the hypothesised control of voice gender. Finally, I investigate the interplay between voice gender expression and social context in relation to cultural stereotypes. I report that listeners spontaneously integrate stereotypical information in the auditory and visual domain to make stereotypical judgments about children’s gender and that adult actors manipulate their gender expression in line with stereotypical gendered notions of homosexuality. Overall, this corpus of data supports the existence of a “gender code” in human nonverbal vocal communication. This “gender code” provides not only a methodological framework with which to empirically investigate variation in voice gender and its role in expressing gender identity, but also a unifying theoretical structure to understand the origins of such variation from both evolutionary and social perspectives

    Automatic classification possibilities of the voices of children with dysphonia

    Dysphonia is a common complaint: almost every fourth child produces a pathological voice. A mobile-based filtering system that pre-school workers could use to recognize children with dysphonic voices, so that professional help can be sought as soon as possible, would be desirable. The goal of this research is to identify acoustic parameters that are able to distinguish the healthy voices of children from the voices of children with dysphonia. In addition, the possibility of automatic classification is examined. Two-sample t-tests were used to test for statistically significant differences in the mean values of the acoustic parameters between healthy and dysphonic voices. Two-class classification between the two groups was performed using leave-one-out cross-validation with a support vector machine (SVM) classifier. Formant frequencies, mel-frequency cepstral coefficients (MFCCs), harmonics-to-noise ratio (HNR), soft phonation index (SPI) and frequency-band energy ratios based on intrinsic mode functions, measured on different variations of phonemes, showed statistically significant differences between the groups. A high classification accuracy of 93% was achieved by the SVM with linear and RBF kernels using only eight acoustic parameters. Additional data are needed to build a more general model, but this research can serve as a reference point for classifying voices of healthy children and children with dysphonia from continuous speech.
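    The screening-plus-classification procedure described above can be sketched as follows, assuming Python with SciPy and scikit-learn (the study does not specify an implementation): two-sample t-tests keep the acoustic parameters whose group means differ, and an SVM is then scored with leave-one-out cross-validation. The data layout (X, y) and the 0.05 threshold are assumptions for illustration.

    ```python
    # Illustrative sketch, not the authors' code.
    import numpy as np
    from scipy.stats import ttest_ind
    from sklearn.svm import SVC
    from sklearn.preprocessing import StandardScaler
    from sklearn.pipeline import make_pipeline
    from sklearn.model_selection import LeaveOneOut, cross_val_score

    def screen_features(X, y, alpha=0.05):
        """Keep columns whose means differ significantly between the two groups."""
        healthy, dysphonic = X[y == 0], X[y == 1]
        pvals = np.array([ttest_ind(healthy[:, j], dysphonic[:, j]).pvalue
                          for j in range(X.shape[1])])
        return np.where(pvals < alpha)[0]

    def evaluate(X, y, kernel="linear"):
        """Leave-one-out accuracy of an SVM on the screened acoustic parameters."""
        keep = screen_features(X, y)
        clf = make_pipeline(StandardScaler(), SVC(kernel=kernel))
        scores = cross_val_score(clf, X[:, keep], y, cv=LeaveOneOut())
        return scores.mean()  # proportion of correctly classified children

    # X: rows = children, columns = acoustic parameters (formants, MFCCs, HNR, SPI, ...)
    # y: 0 = healthy voice, 1 = dysphonic voice; e.g. evaluate(X, y, kernel="rbf")
    ```

    Note that, mirroring the abstract, the t-test screening here is applied to the full sample before cross-validation; a stricter protocol would nest the screening inside the leave-one-out loop to avoid selection bias.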

    Auditory-motor adaptation is reduced in adults who stutter but not in children who stutter

    Previous studies have shown that adults who stutter produce smaller corrective motor responses to compensate for unexpected auditory perturbations in comparison to adults who do not stutter, suggesting that stuttering may be associated with deficits in integration of auditory feedback for online speech monitoring. In this study, we examined whether stuttering is also associated with deficiencies in integrating and using discrepancies between expected and received auditory feedback to adaptively update motor programs for accurate speech production. Using a sensorimotor adaptation paradigm, we measured adaptive speech responses to auditory formant frequency perturbations in adults and children who stutter and their matched nonstuttering controls. We found that the magnitude of the speech adaptive response for children who stutter did not differ from that of fluent children. However, the adaptation magnitude of adults who stutter in response to formant perturbation was significantly smaller than the adaptation magnitude of adults who do not stutter. Together, these results indicate that stuttering is associated with deficits in integrating discrepancies between predicted and received auditory feedback to calibrate the speech production system in adults but not children. This auditory-motor integration deficit thus appears to be a compensatory effect that develops over years of stuttering.
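    As an illustration of how an adaptive response of this kind is often quantified (not the authors' exact analysis), the sketch below assumes per-trial first-formant (F1) measurements in Hz and expresses the produced-F1 change at the end of the perturbation phase relative to the speaker's own baseline; the trial counts are arbitrary.

    ```python
    # Hypothetical adaptation-magnitude measure; trial counts and normalization assumed.
    import numpy as np

    def adaptation_magnitude(f1_trials, n_baseline=20, n_hold=20):
        """Percent change in produced F1 from the unperturbed baseline to the
        final (hold) trials of the formant-perturbation phase."""
        f1 = np.asarray(f1_trials, dtype=float)
        baseline = f1[:n_baseline].mean()   # unperturbed trials
        hold = f1[-n_hold:].mean()          # end of perturbation phase
        return (hold - baseline) / baseline * 100.0
    ```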

    Developmental refinement of cortical systems for speech and voice processing

    Development typically leads to optimized and adaptive neural mechanisms for the processing of voice and speech. In this fMRI study we investigated how this adaptive processing reaches its mature efficiency by examining the effects of task, age and phonological skills on cortical responses to voice and speech in children (8-9 years), adolescents (14-15 years) and adults. Participants listened to vowels (/a/, /i/, /u/) spoken by different speakers (boy, girl, man) and performed delayed-match-to-sample tasks on vowel and speaker identity. Across age groups, similar behavioral accuracy and comparable sound-evoked auditory cortical fMRI responses were observed. Analysis of task-related modulations indicated a developmental enhancement of responses in the (right) superior temporal cortex during the processing of speaker information. This effect was most evident in an analysis based on individually determined voice-sensitive regions. Analysis of age effects indicated that the recruitment of regions in the temporal-parietal cortex and posterior cingulate/cingulate gyrus decreased with development. Beyond age-related changes, the strength of speech-evoked activity in left posterior and right middle superior temporal regions scaled significantly with individual differences in phonological skills. Together, these findings suggest a prolonged development of the cortical functional network for speech and voice processing. This development includes a progressive refinement of the neural mechanisms for the selection and analysis of auditory information relevant to the ongoing behavioral task.

    Peer audience effects on children’s vocal masculinity and femininity

    Existing evidence suggests that children from around the age of 8 years strategically alter their public image in accordance with known values and preferences of peers, through the self-descriptive information they convey. However, an important but neglected aspect of this ‘self-presentation’ is the medium through which such information is communicated: the voice itself. The present study explored peer audience effects on children's vocal productions. Fifty-six children (26 females, aged 8–10 years) were presented with vignettes where a fictional child, matched to the participant's age and sex, is trying to make friends with a group of same-sex peers with stereotypically masculine or feminine interests (rugby and ballet, respectively). Participants were asked to impersonate the child in that situation and, as the child, to read out loud masculine, feminine and gender-neutral self-descriptive statements to these hypothetical audiences. They also had to decide which of those self-descriptive statements would be most helpful for making friends. In line with previous research, boys and girls preferentially selected masculine or feminine self-descriptive statements depending on the audience interests. Crucially, acoustic analyses of fundamental frequency and formant frequency spacing revealed that children also spontaneously altered their vocal productions: they feminized their voices when speaking to members of the ballet club, while they masculinized their voices when speaking to members of the rugby club. Both sexes also feminized their voices when uttering feminine sentences, compared to when uttering masculine and gender-neutral sentences. Implications for the hitherto neglected role of acoustic qualities of children's vocal behaviour in peer interactions are discussed
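    The two acoustic cues analysed here, fundamental frequency (F0) and formant frequency spacing (ΔF), can be estimated roughly as sketched below. The F0 tracker (librosa's pYIN), the sampling rate, and the closed-tube regression used for ΔF are assumptions for illustration; formant values are taken as given, since robust formant tracking is beyond a short example.

    ```python
    # Hedged sketch of F0 and formant-spacing (ΔF) estimation.
    import numpy as np
    import librosa

    def mean_f0(path, fmin=100.0, fmax=600.0):
        """Mean F0 (Hz) over voiced frames, using librosa's pYIN tracker."""
        y, sr = librosa.load(path, sr=16000)
        f0, voiced, _ = librosa.pyin(y, fmin=fmin, fmax=fmax, sr=sr)
        return float(np.nanmean(f0[voiced]))

    def delta_f(formants_hz):
        """Formant spacing ΔF: least-squares fit of F_i against (2i - 1)/2,
        the pattern expected for a uniform tube closed at one end."""
        F = np.asarray(formants_hz, dtype=float)            # e.g. [F1, F2, F3, F4]
        x = (2.0 * np.arange(1, len(F) + 1) - 1.0) / 2.0    # 0.5, 1.5, 2.5, 3.5
        return float(np.dot(x, F) / np.dot(x, x))

    # Illustrative formant values (Hz): delta_f([1000, 2300, 3500, 4700]) ≈ 1388 Hz
    ```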

    The effect of telepractice on vocal interaction between provider, deaf and hard-of-hearing pediatric patients, and caregivers.

    The purpose of this thesis is to examine how telepractice affects vocal interaction between a speech-language pathologist (SLP), deaf and hard-of-hearing children who received cochlear implants (n = 7), and caregivers as they engage in speech-language interventions conducted in person and via telepractice (tele). Frequency of vocalizations, vocal turns, pause duration, fundamental frequency (F0) mean and range, utterance duration, syllable rate per utterance duration, and mean length of utterance (MLU) were examined. The SLP vocalized more during in-person sessions than tele-sessions; the opposite was found for the mother. There were more SLP-child turns during in-person sessions than tele-sessions, and the opposite was found for mother-child turns. Pauses in SLP-child and mother-child turns were longer during tele-sessions than in-person sessions. The SLP increased mean F0, and both the SLP and the child expanded their F0 range in tele-sessions. The mother had longer utterance durations and higher MLU during in-person sessions than tele-sessions. Results suggest that vocal interactions between provider, patient, and caregiver are impacted by intervention service modality.
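    As a rough illustration of how vocal turn counts and between-speaker pause durations of the kind reported above could be derived, the sketch below assumes time-stamped utterances labelled by speaker ("SLP", "child", "mother") and a 3-second gap threshold; both are assumptions, not the thesis's coding protocol.

    ```python
    # Hypothetical turn/pause extraction from speaker-labelled utterance intervals.
    from dataclasses import dataclass

    @dataclass
    class Utterance:
        speaker: str
        start: float  # seconds
        end: float    # seconds

    def turns_and_pauses(utterances, a, b, max_gap=3.0):
        """Count a<->b turn exchanges and collect the silent gaps between them."""
        turns, pauses = 0, []
        ordered = sorted(utterances, key=lambda u: u.start)
        for prev, nxt in zip(ordered, ordered[1:]):
            if {prev.speaker, nxt.speaker} == {a, b}:
                gap = nxt.start - prev.end
                if 0.0 <= gap <= max_gap:  # treat longer silences as turn breaks
                    turns += 1
                    pauses.append(gap)
        return turns, pauses

    # Example: n_turns, pauses = turns_and_pauses(session, "SLP", "child")
    ```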