Elements, Government, and Licensing: Developments in phonology
Elements, Government, and Licensing brings together new theoretical and empirical developments in phonology. It covers three principal domains of phonological representation: melody and segmental structure; tone, prosody and prosodic structure; and phonological relations, empty categories, and vowel-zero alternations. Theoretical topics covered include the formalisation of Element Theory, the hotly debated topic of structural recursion in phonology, and the empirical status of government.
In addition, a wealth of new analyses and empirical evidence sheds new light on empty categories in phonology, the analysis of certain consonantal sequences, phonological and non-phonological alternation, the elemental composition of segments, and much more. Taking up long-standing empirical and theoretical issues informed by Government Phonology and Element Theory, this book provides theoretical advances while also bringing to light new empirical evidence and analysis challenging previous generalisations.
The insights offered here will be equally exciting for phonologists working on related issues inside and outside the Principles & Parameters programme, such as researchers working in Optimality Theory or classical rule-based phonology.
Go large: The impact of size on gestural interaction in digital musical instrument design
This research is about the impact on musical gestural interaction of oversized Digital Musical Instrument (DMI) design, that is, instruments with physical dimensions that are larger than the human body performing them, but smaller than the room they are in. When interacting with an interface, not only does the performer move their body to control the interface, but the interface design and affordances also control the way the performer moves their body. In the context of DMIs, two instruments with the same sonic capabilities will elicit different patterns of gestural interaction depending on their physical layout. Using the methodology of designing instruments for the purpose of exploring research questions, this research examines the gestural interaction and music made by musicians performing with large DMIs to investigate the impact of instrument size on music making. In this thesis I propose a process of investigating gestural interaction and how it shapes compositional choices through two studies. Each study examines the relative effects on performance and composition of various factors of affordances and idiomatic gestural language performed with large DMIs. Because participants have not yet developed idiomatic gestural languages for large instruments with novel layouts, studying their interactions and the music they compose with such instruments yields new discoveries that are relevant to the design of large instruments as well as instruments of all sizes. This research is relevant for digital musical instrument designers and Human-Computer Interaction researchers, as it will elucidate the influence that a DMI’s physical size and layout have on the performances and compositions created using digital musical instruments, so that designers can make informed decisions to either support or suppress specific influences in future DMI design.
Further, this research contributes the design of a new digital musical instrument, Chaos Bells, that can be used by digital musical instrument performers and researchers in the future.
Predictive Articulatory speech synthesis Utilizing Lexical Embeddings (PAULE)
The Predictive Articulatory speech synthesis Utilizing Lexical Embeddings (PAULE) model is a new model for controlling the articulatory speech synthesizer VocalTractLab (VTL) [15]. PAULE can synthesize German words. Word synthesis can either be started from a semantic vector that encodes the word meaning, together with the desired duration of the synthesized word, or it can be carried out as a resynthesis of an audio file. The audio file may contain arbitrary recordings of speakers, but the resynthesis is always produced with the default speaker of the VTL. The synthesis quality varies depending on the word meaning and the audio file.
What is new about PAULE is that it uses a predictive approach: from the planned articulation it predicts the corresponding perceptual acoustics and derives the word meaning from them. Both the acoustics and the word meaning are implemented as metric vector spaces. This makes it possible to compute and minimize an error relative to a desired target acoustics and target meaning. The minimized error is not the actual error that results from synthesis with the VTL, but the error generated from the predictions of a predictive model. Although it is not the actual error, it can be used to improve the actual articulation. To bring the predictive model into line with the actual acoustics, PAULE listens to itself.
A central one-to-many problem in speech synthesis is that one acoustic output can be produced by many different articulations. PAULE resolves this one-to-many problem through prediction-error minimization, together with the constraint that the articulation be as stationary as possible and be executed with as constant a force as possible. PAULE works without any symbolic representation in the acoustics (phonemes) or in the articulation (motor gestures or targets). PAULE thereby shows that spoken words can be modelled without a symbolic level of description. Spoken language might therefore rest on a fundamentally different level of processing than written language. PAULE integrates experience successively. As a result, PAULE does not find the globally best articulation but locally good articulations. Internally, PAULE relies on artificial neural networks and their associated gradients, which are used for error correction. PAULE can neither synthesize whole sentences nor take somatosensory feedback into account. Preliminary work exists on both, and it is to be integrated into future versions.
The Predictive Articulatory speech synthesis Utilizing Lexical Embeddings (PAULE) model is a new control model for the VocalTractLab (VTL) [15] speech synthesizer, a simulator of the human speech system. It is capable of synthesizing single words in the German language. The speech synthesis can be based on a target semantic vector or on target acoustics, i.e., a recorded word token. VTL is controlled by 30 parameters. These parameters have to be estimated for each time point during the production of a word, which is roughly every 2.5 milliseconds. The time-series of these 30 control parameters (cps) of the VTL are the control parameter trajectories (cp-trajectories). The high dimensionality of the cp-trajectories in combination with non-linear interactions leads to a many-to-one mapping problem, where many sets of cp-trajectories produce highly similar synthesized audio.
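To give a sense of the dimensionality involved, the following minimal sketch computes the size of a cp-trajectory from the two figures stated in the abstract (30 control parameters, one time point roughly every 2.5 ms). The 500 ms word duration and the function name are hypothetical and chosen only for illustration; they are not taken from PAULE or the VTL.

```python
from typing import Tuple

# Figures stated in the abstract.
N_CONTROL_PARAMETERS = 30   # control parameters (cps) of the VTL
TIME_STEP_MS = 2.5          # approximate spacing between time points


def cp_trajectory_shape(word_duration_ms: float) -> Tuple[int, int]:
    """Return (time steps, control parameters) for a word of the given duration."""
    n_steps = round(word_duration_ms / TIME_STEP_MS)
    return n_steps, N_CONTROL_PARAMETERS


# Hypothetical example: a word lasting 500 ms.
steps, params = cp_trajectory_shape(500.0)
total_values = steps * params
print(steps, params, total_values)  # prints: 200 30 6000
```

Under these assumptions, a single half-second word already requires about 6000 values to be estimated.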
PAULE solves this many-to-one mapping problem by anticipating the effects of cp-trajectories and minimizing a semantic and acoustic error between this anticipation and a targeted meaning and acoustics. The quality of the anticipation is improved by an outer loop, where PAULE listens to itself. PAULE has three central design features that distinguish it from other control models: First, PAULE does not use any symbolic units, neither motor primitives, articulatory targets, nor gestural scores on the movement side, nor any phone or syllable representation on the acoustic side. Second, PAULE is a learning model that accumulates experience with articulated words. As a consequence, PAULE will not find a global optimum for the inverse kinematic optimization task it has to solve. Instead, it finds a local optimum that is conditioned on its past experience. Third, PAULE uses gradient-based internal prediction errors of a predictive forward model to plan cp-trajectories for a given semantic or acoustic target. Thus, PAULE is an error-driven model that takes its previous experiences into account.
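The idea of planning a cp-trajectory by following the gradient of a predicted error can be illustrated with a toy sketch. This is not PAULE's implementation: the abstract describes neural-network predictive models and a combined semantic and acoustic error, whereas the sketch below assumes a fixed linear forward model `W`, an acoustic error only, and a simple penalty on frame-to-frame change to keep the trajectory smooth. All names, dimensions, and weights are hypothetical.

```python
import numpy as np


def plan_cp_trajectory(target, W, n_iter=300, smooth_weight=0.1, seed=0):
    """Gradient descent on a cp-trajectory through a toy linear forward model.

    target: (n_steps, n_acoustic) target acoustics
    W:      (n_acoustic, n_cps) assumed linear forward model
    """
    rng = np.random.default_rng(seed)
    n_steps, n_cps = target.shape[0], W.shape[1]
    cp = rng.normal(scale=0.1, size=(n_steps, n_cps))  # initial plan

    # Step size 1/L, where L bounds the curvature of the quadratic loss.
    sigma = np.linalg.norm(W, 2)  # largest singular value of W
    lr = 1.0 / (2.0 * sigma**2 + 8.0 * smooth_weight)

    losses = []
    for _ in range(n_iter):
        predicted = cp @ W.T              # predicted acoustics
        err = predicted - target          # prediction error
        vel = np.diff(cp, axis=0)         # frame-to-frame change
        losses.append(np.sum(err**2) + smooth_weight * np.sum(vel**2))

        grad = 2.0 * err @ W              # gradient of the acoustic error
        grad_smooth = np.zeros_like(cp)   # gradient of the smoothness penalty
        grad_smooth[1:] += 2.0 * vel
        grad_smooth[:-1] -= 2.0 * vel
        cp -= lr * (grad + smooth_weight * grad_smooth)
    return cp, losses


# Hypothetical sizes: 20 time steps, 5 acoustic dimensions, 30 cps.
rng = np.random.default_rng(1)
W = rng.normal(size=(5, 30)) / np.sqrt(30)
target = rng.normal(size=(20, 5))
cp, losses = plan_cp_trajectory(target, W)
print(cp.shape, losses[0] > losses[-1])
```

Because there are more control parameters than acoustic dimensions in this toy setup, many trajectories fit the target equally well; the result depends on the starting point, which loosely mirrors the point that such planning yields a local rather than a global optimum.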
Pilot study results indicate that PAULE is able to minimize a semantic and acoustic error in the resynthesized audio. This allows PAULE to find cp-trajectories that are classified as the correct word by a classification model with an accuracy of 60 %, which is close to the accuracy for human recordings of 63 %. Furthermore, PAULE seems to model vowel-to-vowel anticipatory coarticulation in terms of formant shifts correctly and can be compared to human electromagnetic articulography (EMA) recordings in a straightforward way. In addition, with PAULE it is possible to condition on already executed past cp-trajectories and to smoothly continue the cp-trajectories from the current state. As a side-effect of developing PAULE, it is possible to create large amounts of training data for the VTL through an automated segment-based approach.
Next steps in the development of PAULE include adding a somatosensory feedback channel, extending PAULE from producing single words to the articulation of small utterances, and adding a thorough evaluation.
Resilabificación incompleta y acoplamiento gestual ambisilábico en español
In the generative literature, the pattern of coronal fricative lenition found in the traditional Chinato Spanish dialect is commonly cited as a phonological argument that the resyllabification of word-final prevocalic consonants is complete, in the sense that onsets derived by resyllabification are structurally identical to canonical (word-level) onsets. However, recent acoustic studies of Northern-Central Peninsular Spanish have problematized the completeness of resyllabification with experimental evidence that /s̺/ is shorter and more voiced as a derived onset than as a canonical onset. Using a split-gesture, competitive, coupled oscillator model of the syllable in Articulatory Phonology, which divides consonants into a separate constriction and release gesture, we propose a novel representation of ambisyllabicity that predicts the phonetic behavior of derived onset /s̺/ in Northern-Central Peninsular Spanish. We then show that ambisyllabic coupling permits a simpler phonological analysis of coronal fricative lenition in Chinato Spanish as compared to alternative accounts. Our analysis makes typological predictions that are confirmed by patterns from other contemporary Spanish varieties. Lastly, we examine the consequences of ambisyllabicity for the analysis of Spanish rhotic consonants, which have also been argued to support complete resyllabification. We offer an analysis of rhotics that is entirely compatible with an ambisyllabic representation of incomplete resyllabification.
In the generative literature, the weakening of coronal fricatives in the Chinato dialect of Peninsular Spanish is commonly cited as a phonological argument in favor of the complete resyllabification of word-final prevocalic consonants, that is, that onsets derived by resyllabification are structurally identical to canonical word-level onsets. However, some recent acoustic studies have problematized complete resyllabification in Northern-Central Peninsular Spanish by presenting experimental evidence that /s̺/ is shorter and more voiced as a derived onset than as a canonical onset. We use a competitive coupling model from Articulatory Phonology, which divides consonants into a constriction gesture and a release gesture, to propose a new representation of ambisyllabicity that predicts the phonetic behavior of /s̺/ as a derived onset in Northern-Central Peninsular Spanish. We then show that ambisyllabic coupling allows a better analysis of coronal fricative weakening in Chinato Spanish than alternative accounts. We confirm the typological predictions of our analysis for other contemporary varieties of Spanish. Finally, we examine the consequences of ambisyllabicity for the analysis of Spanish rhotic consonants, which have also been cited as a further argument in favor of complete resyllabification. We offer an analysis of rhotics that is fully compatible with an ambisyllabic representation of incomplete resyllabification.
A model of sonority based on pitch intelligibility
Synopsis:
Sonority is a central notion in phonetics and phonology and it is essential for generalizations related to syllabic organization. However, to date there is no clear consensus on the phonetic basis of sonority, either in perception or in production. The widely used Sonority Sequencing Principle (SSP) represents the speech signal as a sequence of discrete units, where phonological processes are modeled as symbol-manipulating rules that lack a temporal dimension and are devoid of inherent links to perceptual, motoric or cognitive processes. The current work aims to change this by outlining a novel approach for the extraction of continuous entities from acoustic space in order to model dynamic aspects of phonological perception. It is used here to advance a functional understanding of sonority as a universal aspect of prosody that requires pitch-bearing syllables as the building blocks of speech.
This book argues that sonority is best understood as a measurement of pitch intelligibility in perception, which is closely linked to periodic energy in acoustics. It presents a novel principle for sonority-based determinations of well-formedness – the Nucleus Attraction Principle (NAP). Two complementary NAP models independently account for symbolic and continuous representations, and they mostly outperform SSP-based models, as demonstrated here with experimental perception studies and with a corpus study of Modern Hebrew nouns.
This work also includes a description of ProPer (Prosodic Analysis with Periodic Energy). The ProPer toolbox further exploits the proposal that periodic energy reflects sonority in order to cover major topics in prosodic research, such as prominence, intonation and speech rate. The book concludes with brief discussions on selected topics: (i) the phonotactic division of labor with respect to /s/-stop clusters; (ii) the debate about the universality of sonority; and (iii) the fate of the classic phonetics–phonology dichotomy as it relates to continuity and dynamics in phonology.
The Classification of Arabic Dialects: Traditional Approaches, New Proposals, and Methodological Problems
The question of how to classify the different varieties of spoken Arabic is a long-standing problem in the fields of Arabic and Semitic linguistics, and it has been addressed by several authors and from a number of different perspectives. This collection of articles represents a further contribution to the vast collective effort of attempting to more effectively assess, organize, and understand the varieties of spoken Arabic, applying a classification of Arabic dialects in the broadest possible sense. The authors who contribute to this volume tackle this issue by examining varieties spoken from the Maghreb to the Mashreq and employing various approaches and perspectives, e.g., diatopic and diachronic, syntactical, and typological.