Deep learning assessment of syllable affiliation of intervocalic consonants
In English, a sentence like “He made out our intentions.” could be misperceived as “He may doubt our intentions.” because the coda /d/ sounds as if it has become the onset of the next syllable. The nature of this resyllabification phenomenon and the conditions under which it occurs are unclear, however. Previous empirical studies mainly relied on listener judgment, limited acoustic evidence, such as voice onset time, or average formant values to determine the occurrence of resyllabification. This study tested the hypothesis that resyllabification is a coarticulatory reorganisation that realigns the coda consonant with the vowel of the next syllable. Deep learning in conjunction with dynamic time warping (DTW) was used to assess the syllable affiliation of intervocalic consonants. The results suggest that convolutional neural network- and recurrent neural network-based models can detect cases of resyllabification using Mel-frequency spectrograms. DTW analysis shows that sequences inferred by the neural networks to be resyllabified are acoustically more similar to their onset counterparts than to their canonical productions. A binary classifier further suggests that, like genuine onsets, the inferred resyllabified coda consonants are coarticulated with the following vowel. These results are interpreted with an account of resyllabification as a speech-rate-dependent coarticulatory reorganisation mechanism in speech.
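The DTW comparison mentioned in this abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the function name `dtw_distance`, the Euclidean frame distance, and the toy one-dimensional "frames" are all assumptions made for illustration (the study itself used Mel-frequency spectrograms).

```python
import math


def dtw_distance(a, b):
    """Classic dynamic time warping cost between two frame sequences.

    a, b: lists of frames, each frame a list of floats of equal length.
    The local cost is the Euclidean distance between frames (an assumption).
    """
    n, m = len(a), len(b)
    inf = float("inf")
    # cost[i][j] = minimal accumulated cost of aligning a[:i] with b[:j]
    cost = [[inf] * (m + 1) for _ in range(n + 1)]
    cost[0][0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            local = math.dist(a[i - 1], b[j - 1])
            cost[i][j] = local + min(
                cost[i - 1][j],      # a advances, b stays on the same frame
                cost[i][j - 1],      # b advances, a stays on the same frame
                cost[i - 1][j - 1],  # both advance
            )
    return cost[n][m]


# Hypothetical toy data: one feature per frame, not real spectrogram frames.
token = [[0.0], [1.0], [2.0], [2.0], [3.0]]   # token under test
onset_ref = [[0.0], [1.0], [2.0], [3.0]]      # stand-in "onset" template
coda_ref = [[3.0], [2.0], [1.0], [0.0]]       # stand-in "coda" template

closer_to_onset = dtw_distance(token, onset_ref) < dtw_distance(token, coda_ref)
```

In this toy case the token differs from the onset template only in timing, so warping absorbs the difference, whereas the coda template has a different trajectory and incurs a positive cost.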
A syllable-based investigation of coarticulation
Coarticulation has long been investigated in Speech Sciences and Linguistics (Kühnert & Nolan, 1999). This thesis explores coarticulation through a syllable-based model (Y. Xu, 2020). First, it is hypothesised that consonant and vowel are synchronised at the syllable onset for the sake of reducing temporal degrees of freedom, and that such synchronisation is the essence of coarticulation. Previous examinations of CV alignment mainly report onset asynchrony (Gao, 2009; Shaw & Chen, 2019). The first study of this thesis tested the synchrony hypothesis using articulatory and acoustic data in Mandarin. Departing from conventional approaches, a minimal triplet paradigm was applied, in which the CV onsets were determined through the consonant and vowel minimal pairs, respectively. Both articulatory and acoustic results showed that CV articulation started in close temporal proximity, supporting the synchrony hypothesis. The second study extended the research to English and to syllables with cluster onsets. By using acoustic data in conjunction with Deep Learning, supporting evidence was found for co-onset, in contrast to the widely reported c-center effect (Byrd, 1995). Second, the thesis investigated the mechanism that can maximise synchrony, Dimension Specific Sequential Target Approximation (DSSTA), which is highly relevant to what is commonly known as coarticulation resistance (Recasens & Espinosa, 2009). Evidence from the first two studies shows that, when conflicts arise between the articulatory requirements of C and V, the CV gestures can be fulfilled by the same articulator on separate dimensions simultaneously. Finally, the last study tested the hypothesis that resyllabification is the result of coarticulation asymmetry between onset and coda consonants. It was found that neural network-based models could infer the syllable affiliation of consonants, and that the inferred resyllabified codas had a coarticulatory structure similar to that of canonical onset consonants. In conclusion, this thesis found that many coarticulation-related phenomena, including local vowel-to-vowel anticipatory coarticulation, coarticulation resistance, and resyllabification, stem from the articulatory mechanism of the syllable.
Words without vowels: phonetic and phonological evidence from Tashlhiyt Berber
This article deals with the Tashlhiyt dialect of Berber (henceforth TB), spoken in the southern part of Morocco. In TB, words may consist entirely of consonants without vowels, and sometimes of only voiceless obstruents, e.g. tft#tstt "you rolled it (fem)". In this study we carried out acoustic, video-endoscopic and phonological analyses to answer the following question: is schwa, which may function as syllabic, a segment at the level of phonetic representations in TB? Video-endoscopic films were made of one male native speaker of TB producing a list of forms consisting entirely of voiceless obstruents. The same list was produced by 7 male native speakers of TB for the acoustic analysis. The phonological analysis is based on the behaviour of vowels with respect to the phonological rule of assibilation. This study shows the absence of schwa vowels in forms consisting of voiceless obstruents.
Segmental alignment of English syllables with singleton and cluster onsets
Recent research has provided fresh evidence that consonant and vowel are synchronised at the syllable onset, as predicted by a number of theoretical models. The finding was made by using a minimal contrast paradigm to determine segment onset in Mandarin CV syllables, which differed from the conventional method of detecting gesture onset with a velocity threshold [1]. It has remained unclear, however, whether CV co-onset also occurs between the nucleus vowel and a consonant cluster, as predicted by the articulatory syllable model [2]. This study applied the minimal contrast paradigm to British English in both CV and clusterV (CLV) syllables, and analysed the spectral patterns with signal chopping in conjunction with recurrent neural networks (RNN) with long short-term memory (LSTM) [3]. Results show that vowel onset is synchronised with the onset of the first consonant in a cluster, thus supporting the articulatory syllable model.
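For readers unfamiliar with the conventional velocity-threshold method that this abstract contrasts with the minimal contrast paradigm, a rough sketch follows. It is illustrative only: the function name `gesture_onset_index`, the 20% fraction of peak velocity, and the toy trajectory are assumptions, not details taken from the abstract.

```python
def gesture_onset_index(positions, threshold_fraction=0.2):
    """Return the index of the first sample interval whose speed reaches
    a fixed fraction of the peak speed, or None if there is no movement.

    positions: articulator position samples at a constant sampling rate.
    threshold_fraction: assumed cut-off relative to peak speed (hypothetical).
    """
    # Finite-difference speed between consecutive samples
    speeds = [abs(positions[i + 1] - positions[i])
              for i in range(len(positions) - 1)]
    if not speeds:
        return None
    peak = max(speeds)
    if peak == 0:
        return None
    for i, speed in enumerate(speeds):
        if speed >= threshold_fraction * peak:
            return i
    return None


# Hypothetical trajectory: at rest, then a movement toward a target.
trajectory = [0.0, 0.0, 0.0, 0.1, 0.5, 1.5, 2.5, 3.0, 3.0]
onset = gesture_onset_index(trajectory)
```

Because the detected onset depends on the chosen fraction, slow initial movement below the threshold is excluded, which is one reason a threshold-based onset can differ from an onset located by comparing minimal pairs.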
Perception of allophonic cues to English word boundaries by Polish learners: Approximant devoicing in English
The study investigates the perception of devoicing of English /w, r, j, l/ after /p, t, k/ as a word-boundary cue by Polish listeners. Polish does not devoice sonorants following voiceless stops in word-initial positions. As a result, Polish learners are not sensitised to sonorant devoicing as a segmentation cue. Higher-proficiency and lower-proficiency Polish learners of English participated in a task in which they recognised phrases such as buy train vs. bite rain or pie plot vs. pipe lot. The analysis of accuracy scores revealed that successful segmentation was only above chance level, indicating that the sonorant voicing/devoicing cue was largely unattended to in identifying the boundary location. Moreover, higher proficiency did not lead to more successful segmentation. The analysis of reaction times showed an unclear pattern in which higher-proficiency listeners segmented the test phrases faster but not more accurately than lower-proficiency listeners. Finally, #CS sequences were recognised more accurately than C#S sequences, which was taken to suggest that the listeners may have had some limited knowledge that devoiced sonorants appear only in word-initial positions, but that they treated voiced sonorants as equal candidates for word-final and word-initial position.
Structure in mind, structure in vocal tract
We update our understanding of the view that grammar regulates inter-segmental temporal coordination and present an extension of that view to a new domain: we argue that inter-segmental coordination is basic to prosody. It is the glue joining segments together differently in different languages (here, illustrated with examples from Arabic and Spanish) and orchestrates their unfolding in ways corresponding to constructs posited in theoretical analysis. The correspondence is one between organization in mind-brain and organization in vocal tract. Moreover, for both mind-brain and vocal tract, the organization is phonological and abstract. It is so because it holds over segments of various identities: in Arabic, the first segment in /bka/ is not prosodified as part of the same unit as /ka/ and this holds true also for /blat/, /klat/ and so on, regardless of sonority. In contrast, in English or Spanish, a different organization holds. Crucially, uniformity in organization (same organization presiding over sequences with varying segmental makeup) does not imply uniqueness of phonetic exponents: prosodic organization is pleiotropic, simultaneously expressed by more than one phonetic exponent. Finally, two properties of coordination relations are underscored: lawful flexibility and abstractness. The first is revealed in the degrees of freedom with which movements corresponding to any given effector begin; the second in invariances of task-relevant kinematic signatures regardless of the effectors implicated in any given segmental sequence. Once again, abstract phonological structure is mirrored in vocal tracts via coordination relations holding across physiology and the particular modes of its operation.
Looking into Segments
This paper outlines Q Theory, in which the traditional segment (consonant, vowel) is decomposed into a string of three ordered subsegments, or q, representing the onset, target, and offset of the segment. The postulation of subsegmental structure permits the representation of complex (contour) segments as well as subtle contrasts in segment-internal changes of state. Q Theory synthesizes insights from Autosegmental Phonology, Aperture Theory, and Articulatory Phonology in a representation that standard phonological constraints can refer to. Q Theory is supported by arguments that subsegments act independently and need to be independently referenced by the phonological grammar. Embedded into Agreement by Correspondence Theory, Q Theory permits the analysis of contour assimilation as well as contour formation, both in the tonal and segmental domains.
Catalan
This chapter presents a general overview of the phonetics and phonology of Catalan, taking into consideration both segmental and suprasegmental phenomena. The chapter provides an updated state-of-the-art report of the most recent investigations in this area. Though the report concentrates on the standard variety, Central Catalan, we also highlight the rich dialectal variation that affects both segmental and prosodic properties. After a brief introduction to the dialectal distribution of the language, the first part of the chapter is devoted to segmental phonology. In this part, we first present the basic segmental inventories of the language, as well as syllable structure types; we then describe the phonological processes that affect both vocalic and consonantal systems. The prosodic part of the chapter covers the main prosodic characteristics of the language, followed by a description of the intonational variation found for a variety of sentence types.