Objective Gender and Age Recognition from Speech Sentences
In this work, an automatic gender and age recognizer from speech is investigated. Features relevant to gender recognition are selected from the first four formant frequencies and twelve MFCCs and fed to an SVM classifier, while features relevant to age are used with a k-NN classifier for the age recognition model, with MATLAB as the simulation tool. A special selection of robust features, based on the frequency range each feature represents, is used to improve the results of the gender and age classifiers. The gender and age classification algorithms are evaluated on 114 (clean and noisy) speech samples uttered in the Kurdish language. The two-class gender recognition model (adult males vs. adult females) reached 96% recognition accuracy, while the three-category model (adult males, adult females, and children) achieved 94%. For the age recognition model, speakers are categorized into seven age groups. After selecting the features relevant to age, the model achieved 75.3% accuracy. For further improvement, a de-noising technique is applied to the noisy speech signals, followed by selection of the features affected by the de-noising process, resulting in 81.44% recognition accuracy.
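The pipeline described above (spectral features feeding an SVM) can be sketched as follows. This is a minimal illustration, not the paper's implementation: the feature values are synthetic stand-ins for the 4 formant + 12 MFCC measurements, and the class separation is artificial.

```python
import numpy as np
from sklearn.model_selection import train_test_split
from sklearn.svm import SVC
from sklearn.metrics import accuracy_score

rng = np.random.default_rng(0)

# Synthetic stand-in for 16 features per sample:
# 4 formant frequencies + 12 MFCCs (the paper's feature set).
n_per_class = 100
male = rng.normal(loc=0.0, scale=1.0, size=(n_per_class, 16))
female = rng.normal(loc=3.0, scale=1.0, size=(n_per_class, 16))

X = np.vstack([male, female])
y = np.array([0] * n_per_class + [1] * n_per_class)  # 0 = male, 1 = female

X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.3, random_state=0, stratify=y)

# Two-class gender model, as in the paper's adult male/female setup.
clf = SVC(kernel="linear").fit(X_train, y_train)
acc = accuracy_score(y_test, clf.predict(X_test))
print(f"gender accuracy: {acc:.2f}")
```

On real speech the features would come from a front end such as an MFCC extractor rather than a random generator; the structure of the classifier stage is unchanged.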
Expression of gender in the human voice: investigating the 'gender code'
We can easily and reliably identify the gender of an unfamiliar interlocutor over
the telephone. This is because our voice is 'sexually dimorphic': men typically speak
with a lower fundamental frequency (F0 - lower pitch) and lower vocal tract resonances
(ΔF - 'deeper' timbre) than women. While the biological bases of these differences are
well understood, and mostly down to size differences between men and women, very
little is known about the extent to which we can play with these differences to
accentuate or de-emphasise our perceived gender, masculinity and femininity in a range
of social roles and contexts.
The general aim of this thesis is to investigate the behavioural basis of gender
expression in the human voice in both children and adults. More specifically, I
hypothesise that, on top of the biologically determined sexual dimorphism, humans use
a 'gender code' consisting of vocal gestures (global F0 and ΔF adjustments) aimed at
altering the gender attributes conveyed by their voice. In order to test this hypothesis, I
first explore how acoustic variation of sexually dimorphic acoustic cues (F0 and ΔF)
relates to physiological differences in pre-pubertal speakers (vocal tract length) and
adult speakers (body height and salivary testosterone levels), and show that voice
gender variation cannot be solely explained by static, biologically determined
differences in vocal apparatus and body size of speakers. Subsequently, I show that both
children and adult speakers can spontaneously modify their voice gender by lowering
(raising) F0 and ΔF to masculinise (feminise) their voice, a key ability for the
hypothesised control of voice gender. Finally, I investigate the interplay between voice
gender expression and social context in relation to cultural stereotypes. I report that
listeners spontaneously integrate stereotypical information in the auditory and visual
domain to make stereotypical judgments about children's gender and that adult actors
manipulate their gender expression in line with stereotypical gendered notions of
homosexuality. Overall, this corpus of data supports the existence of a 'gender code' in
human nonverbal vocal communication. This 'gender code' provides not only a
methodological framework with which to empirically investigate variation in voice
gender and its role in expressing gender identity, but also a unifying theoretical
structure to understand the origins of such variation from both evolutionary and social
perspectives.
Automatic classification possibilities of the voices of children with dysphonia
Dysphonia is a common complaint: almost every fourth child produces a pathological voice. A mobile-based screening system that pre-school workers could use to recognize children with dysphonic voices, so that professional help can be sought as soon as possible, would be desirable. The goal of this research is to identify acoustic parameters that can distinguish the healthy voices of children from the voices of children with dysphonia. In addition, the possibility of automatic classification is examined. Two-sample t-tests were used to test the statistical significance of differences in the mean values of the acoustic parameters between healthy and dysphonic voices. A two-class classification between the two groups was performed with a support vector machine (SVM) classifier using leave-one-out cross-validation. Formant frequencies, mel-frequency cepstral coefficients (MFCCs), harmonics-to-noise ratio (HNR), soft phonation index (SPI) and frequency-band energy ratios based on intrinsic mode functions, measured on different variations of phonemes, showed statistically significant differences between the groups. A high classification accuracy of 93% was achieved by SVM with linear and RBF kernels using only 8 acoustic parameters. Additional data are needed to build a more general model, but this research can serve as a reference point for classifying healthy children and children with dysphonia from continuous speech.
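The evaluation scheme described above (two-class SVM with leave-one-out cross-validation over both linear and RBF kernels) can be sketched as follows; the 8-dimensional feature vectors are synthetic placeholders, not the study's acoustic measurements:

```python
import numpy as np
from sklearn.model_selection import LeaveOneOut, cross_val_score
from sklearn.svm import SVC

rng = np.random.default_rng(1)

# Synthetic stand-in: 8 acoustic parameters per child (the study's
# feature count), 30 healthy and 30 dysphonic voices.
healthy = rng.normal(loc=0.0, scale=1.0, size=(30, 8))
dysphonic = rng.normal(loc=2.5, scale=1.0, size=(30, 8))
X = np.vstack([healthy, dysphonic])
y = np.array([0] * 30 + [1] * 30)

# Leave-one-out: each voice is held out once and predicted by a model
# trained on all the others -- the natural choice for small clinical samples.
for kernel in ("linear", "rbf"):
    scores = cross_val_score(SVC(kernel=kernel), X, y, cv=LeaveOneOut())
    print(f"{kernel} LOOCV accuracy: {scores.mean():.2f}")
```

Leave-one-out is attractive here precisely because the dataset is small: every recording contributes to both training and evaluation without a held-out split.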
Auditory-motor adaptation is reduced in adults who stutter but not in children who stutter
Previous studies have shown that adults who stutter produce smaller corrective motor responses to compensate for unexpected auditory perturbations in comparison to adults who do not stutter, suggesting that stuttering may be associated with deficits in the integration of auditory feedback for online speech monitoring. In this study, we examined whether stuttering is also associated with deficiencies in integrating and using discrepancies between expected and received auditory feedback to adaptively update motor programs for accurate speech production. Using a sensorimotor adaptation paradigm, we measured adaptive speech responses to auditory formant frequency perturbations in adults and children who stutter and their matched nonstuttering controls. We found that the magnitude of the speech adaptive response for children who stutter did not differ from that of fluent children. However, the adaptation magnitude of adults who stutter in response to formant perturbation was significantly smaller than the adaptation magnitude of adults who do not stutter. Together these results indicate that stuttering is associated with deficits in integrating discrepancies between predicted and received auditory feedback to calibrate the speech production system in adults but not children. This auditory-motor integration deficit thus appears to be a compensatory effect that develops over years of stuttering.
Developmental refinement of cortical systems for speech and voice processing
Development typically leads to optimized and adaptive neural mechanisms for the processing of voice and speech. In this fMRI study we investigated how this adaptive processing reaches its mature efficiency by examining the effects of task, age and phonological skills on cortical responses to voice and speech in children (8-9 years), adolescents (14-15 years) and adults. Participants listened to vowels (/a/, /i/, /u/) spoken by different speakers (boy, girl, man) and performed delayed-match-to-sample tasks on vowel and speaker identity. Across age groups, similar behavioral accuracy and comparable sound-evoked auditory cortical fMRI responses were observed. Analysis of task-related modulations indicated a developmental enhancement of responses in the (right) superior temporal cortex during the processing of speaker information. This effect was most evident through an analysis based on individually determined voice-sensitive regions. Analysis of age effects indicated that the recruitment of regions in the temporal-parietal cortex and posterior cingulate/cingulate gyrus decreased with development. Beyond age-related changes, the strength of speech-evoked activity in left posterior and right middle superior temporal regions significantly scaled with individual differences in phonological skills. Together, these findings suggest a prolonged development of the cortical functional network for speech and voice processing. This development includes a progressive refinement of the neural mechanisms for the selection and analysis of auditory information relevant to the ongoing behavioral task.
Peer audience effects on children's vocal masculinity and femininity
Existing evidence suggests that children from around the age of 8 years strategically alter their public image in accordance with known values and preferences of peers, through the self-descriptive information they convey. However, an important but neglected aspect of this 'self-presentation' is the medium through which such information is communicated: the voice itself. The present study explored peer audience effects on children's vocal productions. Fifty-six children (26 females, aged 8-10 years) were presented with vignettes where a fictional child, matched to the participant's age and sex, is trying to make friends with a group of same-sex peers with stereotypically masculine or feminine interests (rugby and ballet, respectively). Participants were asked to impersonate the child in that situation and, as the child, to read out loud masculine, feminine and gender-neutral self-descriptive statements to these hypothetical audiences. They also had to decide which of those self-descriptive statements would be most helpful for making friends. In line with previous research, boys and girls preferentially selected masculine or feminine self-descriptive statements depending on the audience interests. Crucially, acoustic analyses of fundamental frequency and formant frequency spacing revealed that children also spontaneously altered their vocal productions: they feminized their voices when speaking to members of the ballet club, while they masculinized their voices when speaking to members of the rugby club. Both sexes also feminized their voices when uttering feminine sentences, compared to when uttering masculine and gender-neutral sentences. Implications for the hitherto neglected role of acoustic qualities of children's vocal behaviour in peer interactions are discussed.
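The formant-spacing measure (ΔF) used in acoustic analyses like the one above is commonly estimated by regressing measured formant frequencies against the odd-multiple pattern of a uniform tube closed at one end, F_i ≈ (2i - 1)/2 · ΔF. A minimal sketch of that regression, assuming the formant values themselves have already been measured elsewhere (e.g. with a tool such as Praat):

```python
import numpy as np

def estimate_delta_f(formants_hz):
    """Least-squares estimate of formant spacing (delta F) from measured
    formants, under the uniform-tube model F_i = (2i - 1)/2 * delta_F."""
    F = np.asarray(formants_hz, dtype=float)
    x = (2 * np.arange(1, len(F) + 1) - 1) / 2  # 0.5, 1.5, 2.5, ...
    # Regression through the origin: delta_F = sum(F*x) / sum(x*x).
    return float(np.sum(F * x) / np.sum(x * x))

# Example: formants of an ideal uniform tube with 1000 Hz spacing.
print(estimate_delta_f([500, 1500, 2500, 3500]))  # -> 1000.0
```

A smaller ΔF implies longer apparent vocal tract length, which is why lowering ΔF masculinizes the perceived voice.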
The effect of telepractice on vocal interaction between provider, deaf and hard-of-hearing pediatric patients, and caregivers.
The purpose of this thesis is to examine how telepractice affects vocal interaction between a speech-language pathologist (SLP), deaf and hard-of-hearing children who received cochlear implants (n = 7), and caregivers as they engage in speech-language interventions conducted in person and via telepractice (tele). Frequency of vocalizations, vocal turns, pause duration, fundamental frequency (F0) mean and range, utterance duration, syllable rate per utterance duration, and mean length of utterance (MLU) were examined. The SLP vocalized more during in-person sessions than tele-sessions; the opposite was found for the mother. There were more SLP-child turns during in-person sessions than tele-sessions; the opposite was found for mother-child turns. Pauses were longer in SLP-child and mother-child turns during tele-sessions than in-person sessions. The SLP increased mean F0, and both the SLP and child expanded their F0 range, in tele-sessions. The mother had longer utterance durations and higher MLU during in-person sessions than tele-sessions. Results suggest that vocal interactions between provider, patient, and caregiver are affected by the intervention service modality.
Environment- and listener-oriented speaking style adaptations across the lifespan
This dissertation examines how age affects the ability to produce intelligibility-enhancing speaking style adaptations in response to environment-related difficulties (noise-adapted speech) and in response to listeners' perceptual difficulties (clear speech). Materials consisted of conversational and clear speech sentences produced in quiet and in response to noise by children (11-13 years), young adults (18-29 years), and older adults (60-84 years). Acoustic measures of global, segmental, and voice characteristics were obtained. Young adult listeners participated in word-recognition-in-noise and perceived age tasks. The study also examined relative talker intelligibility as well as the relationship between the acoustic measurements and intelligibility results. Several age-related differences in speaking style adaptation strategies were found. Children increased mean F0 and F1 more than adults in response to noise, and exhibited greater changes to voice quality when producing clear speech (increased HNR, decreased shimmer). Older adults lengthened pause duration more in clear speech compared to younger talkers. Word-recognition-in-noise results revealed no age-related differences in the intelligibility of conversational speech. Noise-adapted and clear speech modifications increased intelligibility for all talker groups. However, the acoustic changes implemented by children when producing noise-adapted and clear speech were less efficient in enhancing intelligibility compared to the young adult talkers. Children were also less intelligible than older adults for speech produced in quiet. Results confirmed that the talkers formed 3 perceptually distinct age groups. Correlation analyses revealed that relative talker intelligibility was consistent for conversational and clear speech in quiet. However, relative talker intelligibility was found to be more variable with the inclusion of additional speaking style adaptations.
1-3 kHz energy, speaking rate, and vowel and pause durations all emerged as significant acoustic-phonetic predictors of intelligibility. This is the first study to investigate how clear speech and noise-adapted speech benefits interact with each other across multiple talker groups. The findings enhance our understanding of intelligibility variation across the lifespan and have implications for a number of applied realms, from audiologic rehabilitation to speech synthesis.
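The 1-3 kHz energy predictor mentioned above is typically computed as the proportion of spectral energy falling in that band. A minimal numpy sketch on a synthetic signal (the sampling rate and test tones are illustrative choices, not the dissertation's materials):

```python
import numpy as np

def band_energy_ratio(signal, sr, lo=1000.0, hi=3000.0):
    """Fraction of total spectral energy between lo and hi (Hz)."""
    spectrum = np.abs(np.fft.rfft(signal)) ** 2
    freqs = np.fft.rfftfreq(len(signal), d=1.0 / sr)
    band = (freqs >= lo) & (freqs <= hi)
    return float(spectrum[band].sum() / spectrum.sum())

# Example: equal-amplitude tones at 500 Hz (outside the band) and
# 2000 Hz (inside it) put half the energy in the 1-3 kHz band.
sr = 16000
t = np.arange(sr) / sr
x = np.sin(2 * np.pi * 500 * t) + np.sin(2 * np.pi * 2000 * t)
print(round(band_energy_ratio(x, sr), 2))  # -> 0.5
```

Energy in this band is often taken as a correlate of vocal effort, which is one reason it tracks intelligibility in noise.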
Children's Perception of Conversational and Clear American-English Vowels in Noise
A handful of studies have examined children's perception of clear speech in the presence of background noise. Although accurate vowel perception is important for listeners' comprehension, no study has focused on whether vowels uttered in clear speech aid intelligibility for child listeners. In the present study, American-English (AE) speaking children repeated the AE vowels /ɛ, æ, ɑ, ʌ/ in the nonsense word /gəbVpə/ in phrases produced in conversational and clear speech by two female AE-speaking adults. The recordings of the adults' speech were presented at a signal-to-noise ratio (SNR) of -6 dB to 15 AE-speaking children (ages 5.0-8.5) to examine whether AE school-age children identify vowels in noise more accurately when utterances are produced in clear speech than in conversational speech. Effects of the particular vowel uttered and talker effects were also examined. Clear speech vowels were repeated significantly more accurately (87%) than conversational speech vowels (59%), suggesting that clear speech aids children's vowel identification. Results varied as a function of the talker and the particular vowel uttered. Child listeners repeated one talker's vowels more accurately than the other's, and front vowels more accurately than central and back vowels. The findings support the use of clear speech for enhancing adult-to-child communication in AE, particularly in noisy environments.
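Presenting stimuli at a fixed SNR, as in the -6 dB condition above, amounts to scaling the noise so that the speech-to-noise power ratio hits the target before mixing. A minimal sketch (the signals here are synthetic placeholders, not the study's stimuli):

```python
import numpy as np

def mix_at_snr(speech, noise, snr_db):
    """Scale noise so 10*log10(P_speech / P_noise) == snr_db, then mix."""
    p_speech = np.mean(speech ** 2)
    p_noise = np.mean(noise ** 2)
    target_p_noise = p_speech / (10 ** (snr_db / 10))
    scaled_noise = noise * np.sqrt(target_p_noise / p_noise)
    return speech + scaled_noise, scaled_noise

rng = np.random.default_rng(0)
speech = np.sin(2 * np.pi * 220 * np.arange(16000) / 16000)  # placeholder tone
noise = rng.normal(size=16000)                               # white noise

mixed, scaled = mix_at_snr(speech, noise, snr_db=-6.0)
snr = 10 * np.log10(np.mean(speech ** 2) / np.mean(scaled ** 2))
print(f"achieved SNR: {snr:.1f} dB")  # -> achieved SNR: -6.0 dB
```

At -6 dB the noise power is about four times the speech power, which is why conversational-speech intelligibility drops so sharply in such conditions.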