61 research outputs found
Perceptuo-motor biases in the perceptual organization of the height feature in French vowels
To appear in Acta Acustica. This paper reports on the organization of the perceived vowel space in French. In a previous paper [28], we investigated the implementation of vocal height contrasts along the F1 dimension in French speakers. In this paper, we present results from perceptual identification tests performed by twelve participants who took part in the production experiment reported in the earlier paper. For each subject, stimuli presented in the identification test were synthesized in two different vowel spaces, corresponding to two different vocal tract lengths. The results showed, first, that the perceived French vowels belonging to similar height degrees were aligned on stable F1 values, independent of place of articulation and roundedness, as was the case for produced vowels. Second, the produced F1 distances between height degrees correlated with the perceived F1 distances. This suggests that there is a link between perceptual and motor phonemic prototypes in the human brain. The results are discussed using the framework of the Perception for Action Control (PACT) theory, in which speech units are considered to be gestures shaped by perceptual processes.
Maximal vowel space method in the analysis of vowels during the prelingual phase
The main problems in the analysis of vowels that occur in the prelingual
speech phase are centralization of pronunciation and the unknown dimensions
of the vocal tract. Most research in this field is based on the analysis of
maximal vowel space (MVS) because discrimination of vowels is very
difficult in this early period. MVS analysis includes the estimation of vocal
tract (VT) physical dimensions. The aim of this research was to estimate
and define changes in vowel pronunciation during the prelingual speech
phase. Voice recording and analysis were performed on a child from the age
of two months until he turned one. The recording was performed in 42 sessions,
on average 4 sessions every month. Sound segments resembling vowel
pronunciation were extracted from the recordings and used for
formant frequency estimation in PRAAT software. The Burg method
was used for formant frequency estimation. Research results showed
that MVS can be used in diagnostic procedures from a child's earliest age.
MVS analysis is appropriate at this age because the child only needs to
pronounce individual phonemes and does not need to respond to speech
stimuli. These results need to be confirmed on a larger sample, where
extended analysis should define criteria for discrimination of typical and
atypical formant frequencies.
Maturing Temporal Bones as Non-Neural Sites for Transforming the Speech Signal during Language Development
Developmental events in the temporal bones shift the pattern of a given speech sound's acoustic profile through the time children are mapping linguistic sound systems. Before age 5 years, frequency information in vowels is differentially accessible through the years children are acquiring the sound systems of their native language(s). To model the acoustic effects caused by developing temporal bones, data collected to elicit steady-state vowels from adult native speakers of English and Diné were modified to reflect the form of children's hearing sensitivities at different ages based on patterns established in the psychoacoustic literature. It was assumed, based on the work of psychoacousticians (e.g., Werner, Fay & Popper 2012; and Werner & Marean 1996), that the effects caused by immature temporal bones were conductive immaturities, and the age-sensitive filters were constructed based on psychoacoustic research into the hearing of infants and children. Data were partitioned by language, sex, and individual vowels and compared for points of similarity and difference in the way information in vowels is filtered because of the constraints imposed by the immaturity of the temporal bones. Results show that the early formant pattern becomes successively modified in a constrained pattern reflecting maturational processes. Results also suggest that children may well be switching strategies for processing vowels, using a more adult-like process after 18 months. Future research should explore whether early hearing affects not only individual speech sounds but also their relationships to one another in the vowel space. Additionally, there is an interesting artifact in the observed gradual progression to full adult hearing which may be the effect of the foramen of Huschke contributing to the filters at 1 year and 18 months.
Given that immature temporal bones reflect brain expansion and rotational birth in hominids, these results contribute to the discussion of the biological underpinnings of the evolution of language.
Plasticity in second language (L2) learning: perception of L2 phonemes by native Greek speakers of English
This thesis was submitted for the degree of Doctor of Philosophy and awarded by Brunel University. Understanding the process of language acquisition is a challenge that many researchers spanning different disciplines (e.g. linguistics, psychology, neuroscience) have grappled with for centuries. One area which has in recent years attracted a lot of attention is non-native phoneme acquisition. Speech sounds that contain multiple phonetic cues are often difficult for foreign-language learners, especially if certain cues are weighted differently in the foreign and native languages. Greek adult and child speakers of English were studied to determine which cues (duration or spectral) they were using to make discrimination and identification judgments for an English vowel contrast pair. To this end, two forms of identification and discrimination tasks were used: one with natural (unedited) stimuli and another with "modified" vowel duration stimuli, which were edited so that there were no duration differences between the vowels. Results show the Greek speakers were particularly impaired when they were unable to use the duration cue as compared to the native English speakers. Similar results were also obtained in control experiments where there was no orthographic representation or where the stimuli were cross-spliced to modify the phonetic neighborhood. Further experiments used high-variability training sessions to enhance vowel perception. Following training, performance improved for both Greek adult and child groups as revealed by post-training tests. However, the improvements were most pronounced for the child Greek speaker group. A further study examined the effect of different orthographic cues that might affect rhyme and homophony judgment. The results of that study showed that Greek speakers were in general more affected by orthography and regularity (particularly of the vowel) in making these judgments.
This would suggest that Greek speakers were more sensitive to irrelevant orthographic cues, mirroring the results in the auditory modality where they focused on irrelevant acoustic cues. The results are discussed in terms of current theories of language acquisition, with particular reference to acquisition of non-native phonemes. School of Social Sciences, Brunel Universit
Vowel normalization is a computation that is meant to account for the differences in the absolute direct (physical or psychophysical) representations of qualitatively equivalent vowel productions that arise due to differences in speaker properties such as body size types, age, gender, and other socially interpreted categories that are based on natural variation in vocal tract size and shape. In this dissertation, we address the metaphysical and epistemological aspects of vowel normalization pertaining to spoken language acquisition during early infancy. We begin by reviewing approaches to conceptualizing and modeling the phonetic components of early spoken language acquisition, forming a catalog of phenomena that serves as the basis for our discourse. We then establish the existence of a vowel normalization computation carried out by infants early in their spoken language acquisition, and put forward a conceptual and technical framework for its investigation which focuses attention on the generative nature of the computation. We then situate the acquisition of vowel normalization within a broader developmental framework encompassing a suite of vocal learning phenomena, including language-specific caretaker vocal exchanges
Expression of gender in the human voice: investigating the "gender code"
We can easily and reliably identify the gender of an unfamiliar interlocutor over
the telephone. This is because our voice is "sexually dimorphic": men typically speak
with a lower fundamental frequency (F0 - lower pitch) and lower vocal tract resonances
(ΔF - "deeper" timbre) than women. While the biological bases of these differences are
well understood, and mostly down to size differences between men and women, very
little is known about the extent to which we can play with these differences to
accentuate or de-emphasise our perceived gender, masculinity and femininity in a range
of social roles and contexts.
The general aim of this thesis is to investigate the behavioural basis of gender
expression in the human voice in both children and adults. More specifically, I
hypothesise that, on top of the biologically determined sexual dimorphism, humans use
a "gender code" consisting of vocal gestures (global F0 and ΔF adjustments) aimed at
altering the gender attributes conveyed by their voice. In order to test this hypothesis, I
first explore how acoustic variation of sexually dimorphic acoustic cues (F0 and ΔF)
relates to physiological differences in pre-pubertal speakers (vocal tract length) and
adult speakers (body height and salivary testosterone levels), and show that voice
gender variation cannot be solely explained by static, biologically determined
differences in vocal apparatus and body size of speakers. Subsequently, I show that both
children and adult speakers can spontaneously modify their voice gender by lowering
(raising) F0 and ΔF to masculinise (feminise) their voice, a key ability for the
hypothesised control of voice gender. Finally, I investigate the interplay between voice
gender expression and social context in relation to cultural stereotypes. I report that
listeners spontaneously integrate stereotypical information in the auditory and visual
domain to make stereotypical judgments about children's gender and that adult actors
manipulate their gender expression in line with stereotypical gendered notions of
homosexuality. Overall, this corpus of data supports the existence of a "gender code" in
human nonverbal vocal communication. This "gender code" provides not only a
methodological framework with which to empirically investigate variation in voice
gender and its role in expressing gender identity, but also a unifying theoretical
structure to understand the origins of such variation from both evolutionary and social
perspectives
Authentic self, incongruent acoustics: a corpus-based sociophonetic analysis of nonbinary speech
This thesis examines the ways six nonbinary speakers in Christchurch, New Zealand
present their gender identity via speech. It examines their productions in reference to
both established trends in the literature, as well as speech collected from ten binary
speakers (5M, 5F) at the same time. It seeks to examine whether, in addition to
encoding binary gender, speech also encodes nonbinary gender.
Three hypotheses are proposed and tested across multiple linguistic variables.
The first hypothesis regards acoustic incongruence, and posits that nonbinary speakers
may assert their nonbinary identities via speech that utilises particular combinations
of variables which create either ambiguity or dissonance in regards to established
binary-gender norms. Ambiguous gender incongruence arises from the use of speech
that is neither reliably perceived as female, nor reliably perceived as male. Dissonant
gender incongruence arises from the use of speech that is reliably perceived as both
male and female. The second hypothesis predicts that nonbinary speakers will show
greater variation in speech based on immediate contextual factors, compared to
binary speakers. This difference is hypothesised to be due to nonbinary speakers
paying greater attention to production, and the greater degree of variation in their
own speech over time compared to binary speakers. Hypothesis 3 predicts that
nonbinary speakers are not a uniform population, and that their use of incongruence
will be influenced extensively by their individual condition, including their professed
speech goals, history, and gender identity.
The hypotheses are tested quantitatively in regards to five linguistic variables:
Pitch, pitch range, monophthong production, Vowel Space Area (VSA), and intervocalic
/t/ frication rates. The interaction between multiple variables together is also
considered. In-depth examinations of the variation utilised by a single speaker in the
form of "Spotlights" address the hypotheses from a qualitative perspective.
Overall, the thesis finds some evidence for Hypothesis 1. In every linguistic
variable examined, nonbinary speakers show some distinction from binary speakers
that is not explained fully via speaker Assigned Sex at Birth (ASAB). Some binary
speakers also seem to produce incongruence, particularly binary women and particularly
within single variables. The small scale of the study presents a limitation in
addressing Hypothesis 2, but avenues for future work are identified. The qualitative
evidence provides strong support for Hypothesis 3, in the examination of individual
nonbinary speakers and the way their measured productions support their professed
speech goals and identities. Overall, this dissertation presents one of the first comparative
analyses of nonbinary speech, and presents a number of novel approaches
to examining phonetic data from a statistical perspective that still accommodates an
analysis of individual agency and goals in identity building
Models and analysis of vocal emissions for biomedical applications
This book of Proceedings collects the papers presented at the 3rd International Workshop on Models and Analysis of Vocal Emissions for Biomedical Applications, MAVEBA 2003, held 10-12 December 2003, Firenze, Italy. The workshop is organised every two years, and aims to stimulate contacts between specialists active in research and industrial developments, in the area of voice analysis for biomedical applications. The scope of the Workshop includes all aspects of voice modelling and analysis, ranging from fundamental research to all kinds of biomedical applications and related established and advanced technologies