Paradoxes of MaxEnt markedness

Author: Anttila Arto
Magri Giorgio
Publication venue: Linguistic Society of America
Publication date: 13/05/2023

Over the past two decades, theoretical linguistics has taken a probabilistic turn. Maximum entropy (ME) has been endorsed as a model of probabilistic phonology because of its classical guarantees for grammatical inference. Yet, little is known about the basic organizing principles of ME phonology beyond circumstantial evidence of ME’s ability to fit specific patterns of empirical frequencies. The study of ME typologies is difficult because they consist of infinitely many grammars that cannot be exhaustively listed and directly inspected. Uniform Probability Inequalities (Anttila and Magri 2018) are a new tool that solves the problem: they characterize cases where one phonological mapping has a probability smaller than another mapping and this probability inequality holds uniformly for every grammar in the typology. In other words, uniform probability inequalities are universals of probabilistic grammars. We present a new generalization about ME uniform probability inequalities and argue that they are phonologically paradoxical and prune the set of ME universals down to almost nothing. This suggests that ME is not a suitable model of phonology
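The notion of a uniform probability inequality can be illustrated with a small numerical sketch. The following Python fragment is not taken from the paper: the two toy tableaux, the function names, and the sampling range for the weights are all assumptions made for illustration. Random sampling can only fail to find a counterexample; it cannot prove that an inequality holds for every grammar in the typology.

```python
import math
import random


def maxent_probs(violations, weights):
    """MaxEnt probabilities for one candidate set.

    violations: one violation vector per candidate.
    weights: nonnegative constraint weights.
    """
    harmonies = [-sum(w * v for w, v in zip(weights, row)) for row in violations]
    top = max(harmonies)  # subtract the maximum for numerical stability
    exps = [math.exp(h - top) for h in harmonies]
    total = sum(exps)
    return [e / total for e in exps]


# Hypothetical toy data: two underlying forms, two candidates each,
# evaluated by the same two constraints.
TABLEAU_1 = [[1, 0], [0, 1]]  # candidates a, b
TABLEAU_2 = [[2, 0], [0, 1]]  # candidates c, d


def inequality_survives_sampling(lesser, greater, n_samples=10000, seed=0):
    """Check P(lesser) <= P(greater) on randomly sampled weight vectors.

    lesser, greater: (tableau, candidate_index) pairs.
    Returns False as soon as a sampled grammar violates the inequality.
    """
    rng = random.Random(seed)
    for _ in range(n_samples):
        weights = [rng.uniform(0, 10), rng.uniform(0, 10)]
        p_lesser = maxent_probs(lesser[0], weights)[lesser[1]]
        p_greater = maxent_probs(greater[0], weights)[greater[1]]
        if p_lesser > p_greater + 1e-12:  # small tolerance for rounding
            return False
    return True


# Candidate c never outscores candidate a in this toy typology.
c_below_a = inequality_survives_sampling((TABLEAU_2, 0), (TABLEAU_1, 0))
# The reverse inequality is refuted by sampled grammars.
a_below_c = inequality_survives_sampling((TABLEAU_1, 0), (TABLEAU_2, 0))
```

In this toy case the result can also be checked by hand: P(a) = 1/(1 + e^(w1 - w2)) and P(c) = 1/(1 + e^(2·w1 - w2)), so P(c) ≤ P(a) whenever w1 ≥ 0.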

Proceedings Published by the LSA (Linguistic Society of America)

MaxEnt Learners are Biased Against Giving Probability to Harmonically Bounded Candidates

Author: O'Hara Charlie
Publication venue: ScholarWorks@UMass Amherst
Publication date: 01/02/2022

One of the major differences between MaxEnt Harmonic Grammar (Goldwater and Johnson, 2003) and Noisy Harmonic Grammar (Boersma and Pater, 2016) is that in MaxEnt, harmonically bounded candidates are able to get some probability, whereas in most other constraint-based grammars they can never be output (Jesney, 2007). The probability given to harmonically bounded candidates is taken from other candidates, in some cases allowing MaxEnt to model grammars that subvert some of the universal implications that are true in NoisyHG (Anttila and Magri, 2018). Magri (2018) argues that the types of implicational universals that remain valid in MaxEnt are phonologically implausible, suggesting that MaxEnt overgenerates relative to NoisyHG. However, recent work has shown that some of the possible grammars in a constraint-based grammar may be unlikely to be observed because they are difficult to learn (Staubs, 2014; Stanton, 2016; Pater and Moreton, 2012; Hughto, 2019; O’Hara, 2021). Here, I show that grammars that give weight to harmonically bounded candidates are harder to learn than other grammars. With learnability applied, I claim that the typological predictions of MaxEnt and NoisyHG are in fact much more similar than they would seem based on the grammars alone
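A toy calculation may make the contrast concrete. The sketch below is illustrative only: the three-candidate tableau, the weights, and the helper names are hypothetical and do not come from the paper. The third candidate incurs every violation of each of the others plus one more, so it is harmonically bounded. It still receives positive MaxEnt probability, although it can never be the categorical HG winner under positive weights.

```python
import math


def maxent_probs(violations, weights):
    """MaxEnt probabilities for one candidate set."""
    harmonies = [-sum(w * v for w, v in zip(weights, row)) for row in violations]
    top = max(harmonies)  # subtract the maximum for numerical stability
    exps = [math.exp(h - top) for h in harmonies]
    total = sum(exps)
    return [e / total for e in exps]


def hg_winner(violations, weights):
    """Index of the candidate with the smallest weighted violation sum."""
    costs = [sum(w * v for w, v in zip(weights, row)) for row in violations]
    return costs.index(min(costs))


# Hypothetical tableau: rows are candidates, columns are constraints.
# Candidate 2 is harmonically bounded by candidates 0 and 1.
TABLEAU = [[1, 0], [0, 1], [1, 1]]
WEIGHTS = [1.0, 1.0]  # assumed weights, chosen only for illustration

probs = maxent_probs(TABLEAU, WEIGHTS)
bounded_prob = probs[2]                   # strictly positive in MaxEnt
winner = hg_winner(TABLEAU, WEIGHTS)      # never the bounded candidate
```

With these assumed weights the bounded candidate's probability is 1/(2e + 1), roughly 0.155, and that mass is necessarily taken from the two unbounded candidates.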

ScholarWorks@UMass Amherst

Does MaxEnt Overgenerate? Implicational Universals in Maximum Entropy Grammar

Author: Anttila Arto
Magri Giorgio
Publication venue: Linguistic Society of America
Publication date: 10/02/2018

A good linguistic theory should neither undergenerate (i.e., it should not miss any attested patterns) nor overgenerate (i.e., it should not predict any "unattestable" patterns). We investigate the question of overgeneration in Maximum Entropy Grammar (ME) in the context of basic syllabification (Prince and Smolensky 2004) and obstruent voicing (Lombardi 1999), using the theory's T-order as a measure of typological strength. We find that ME has non-trivial T-orders, but compared to OT and HG, they are relatively sparse and sometimes linguistically counterintuitive. The fact that many reasonable implicational universals fail under ME suggests that the theory overgenerates, at least in the two phonological examples we examine. More generally, our results serve as a reminder that linguistic theories should be evaluated in terms of both descriptive fit and explanatory depth. A good theory succeeds on both fronts: we want a flexible theory that best fits the data, but we also want an informative theory that excludes unnatural patterns and derives the correct implicational universals

Proceedings Published by the LSA (Linguistic Society of America)

T-orders across categorical and probabilistic constraint-based phonology

Author: Magri Giorgio
Tapani Anttila Arto
Publication venue: ScholarWorks@UMass Amherst
Publication date: 01/01/2018

ScholarWorks@UMass Amherst

Stochastic harmonic grammars do not peak on the mappings singled out by categorical harmonic grammars

Author: Magri Giorgio
Publication venue: ScholarWorks@UMass Amherst
Publication date: 01/06/2023

A candidate surface phonological realization is called a peak of a probabilistic constraint-based phonological grammar provided it achieves the largest probability mass over its candidate set. Obviously, the set of peaks of a maximum entropy grammar is the categorical harmonic grammar corresponding to the same weights. This paper shows that the set of peaks of a stochastic harmonic grammar instead can be different from the categorical harmonic grammar corresponding to any weights. Thus in particular, maximum entropy and stochastic harmonic grammars can peak on different candidates
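The observation about maximum entropy peaks follows because the exponential is monotone: the candidate with the highest harmony also has the highest probability. A minimal Python sketch, using a hypothetical tableau and randomly sampled weights (both assumptions, not data from the paper), checks this agreement numerically. It does not model stochastic harmonic grammar, for which the paper reports that the agreement can fail.

```python
import math
import random


def maxent_probs(violations, weights):
    """MaxEnt probabilities for one candidate set."""
    harmonies = [-sum(w * v for w, v in zip(weights, row)) for row in violations]
    top = max(harmonies)  # subtract the maximum for numerical stability
    exps = [math.exp(h - top) for h in harmonies]
    total = sum(exps)
    return [e / total for e in exps]


def hg_winner(violations, weights):
    """Categorical HG: index of the smallest weighted violation sum."""
    costs = [sum(w * v for w, v in zip(weights, row)) for row in violations]
    return costs.index(min(costs))


def maxent_peak(violations, weights):
    """Index of the candidate with the largest MaxEnt probability."""
    probs = maxent_probs(violations, weights)
    return probs.index(max(probs))


# Hypothetical tableau: four candidates, three constraints.
TABLEAU = [[2, 0, 1], [0, 1, 1], [1, 1, 0], [0, 3, 0]]


def peaks_agree(n_samples=1000, seed=0):
    """Compare the MaxEnt peak with the HG winner on sampled weights."""
    rng = random.Random(seed)
    for _ in range(n_samples):
        weights = [rng.uniform(0.1, 10) for _ in range(3)]
        if maxent_peak(TABLEAU, weights) != hg_winner(TABLEAU, weights):
            return False
    return True


agreement = peaks_agree()
```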

ScholarWorks@UMass Amherst

Examining variability in Spanish monolingual and bilingual phonotactics: A look at sC-clusters

Author: Tetzloff Katerina A
Publication venue: ScholarWorks@UMass Amherst
Publication date: 26/10/2022

Current models of generative phonology have failed to address the variability that is observed in bilingual language patterns. This dissertation addresses exactly that issue by examining the perception of Spanish sC-clusters in Spanish monolinguals and English-Spanish bilinguals. Surface sC-clusters in onset position are prohibited in Spanish and are repaired by inserting a prothetic /e/ (sC → esC). English differs in that it allows sC-cluster onsets, and the structure of the sC-cluster has been shown to differ based on the sonority profile (i.e., s+stop clusters are bisyllabic, s+liquid clusters are tautosyllabic). A batch version of a Harmonic Grammar Gradual Learning Algorithm (HG-GLA) was given Spanish input and predicted that Spanish sC-clusters may be syllabified differently based on the sonority of the sC-cluster. It predicted that s+stop clusters are more likely to instantiate /e/ prothesis than s+liquid clusters, but that s+liquid clusters are most likely to be syllabified as a true branching onset like in English. This led to the hypothesis that s+stop and s+liquid clusters may show observable differences in perception in Spanish. Furthermore, studies in bilingualism have shown strong evidence for bilingual variability, or non-monolingual-like language behavior, particularly in areas where there is non-identical structural overlap, as is the case with sC-clusters in Spanish and English. The perception of s+stop and s+liquid clusters was thus also analyzed with respect to the following language-external variables that affect bilingual variability: language profile (monolingual versus bilingual), age of exposure to bilingualism, and bilingual dominance. To test these hypotheses, two experiments were performed. The first was a replication of an AX task that has been shown to exhibit variability in Spanish sC-cluster perception in past studies. In this task, native Spanish speakers (monolingual and bilingual) listened to stimulus pairs that differed in the duration and quality of the initial vowel preceding the sC-cluster and were asked to respond whether they were the same or different. The second was a nonce word judgment task where participants were presented with Spanish-like nonce words beginning with sC-clusters and had to give them acceptability ratings of how 'Spanish-like' they sounded.
The results did not show evidence of a language-internal effect. s+stop and s+liquid clusters were treated the same in perception by Spanish native speakers, contrary to the predictions of the HG-GLA. Regarding the language-external variables, there was a strong effect of language profile on perception of sC-clusters in Spanish: monolinguals showed a strong dis-preference for sC-initial words, whereas bilinguals were more accepting of such clusters. However, the bilingual variability observed was not affected by age of exposure to bilingualism or by language dominance. Finally, a sketch of a proposal is made for how generative theories of phonology, like Harmonic Grammar, could potentially be adapted to accommodate the observed differences between the phonotactics of monolinguals and bilinguals, particularly for the case of sC-clusters in English-Spanish bilinguals

ScholarWorks@UMass Amherst

The principle of least effort within the hierarchy of linguistic preferences: external evidence from English

Author: Kul Malgorzata
Publication venue: Unpublished PhD thesis
Publication date: 01/01/2007

The thesis is an investigation of the principle of least effort (Zipf 1949 [1972]). The principle is simple (all effort should be least) and universal (it governs the totality of human behavior). Since the principle is also functional, the thesis adopts a functional theory of language as its theoretical framework, i.e. Natural Linguistics. The explanatory system of Natural Linguistics posits that higher principles govern preferences, which, in turn, manifest themselves as concrete, specific processes in a given language. Therefore, the thesis’ aim is to investigate the principle of least effort on the basis of external evidence from English. The investigation falls into the three following strands: the investigation of the principle itself, the investigation of its application in articulatory effort and the investigation of its application in phonological processes. The structure of the thesis reflects the division of its broad aims. The first part of the thesis presents its theoretical background (Chapter One and Chapter Two), the second part of the thesis deals with application of least effort in articulatory effort (Chapter Three and Chapter Four), whereas the third part discusses the principle of least effort in phonological processes (Chapter Five and Chapter Six). Chapter One serves as an introduction, examining various aspects of the principle of least effort such as its history, literature, operation and motivation. It overviews various names which denote least effort, explains the origins of the principle and reviews the literature devoted to the principle of least effort in a chronological order. The chapter also discusses the nature and operation of the principle, providing numerous examples of the principle at work. 
It emphasizes the universal character of the principle from the linguistic field (low-level phonetic processes and language universals) and the non-linguistic ones (physics, biology, psychology and cognitive sciences), proving that the principle governs human behavior and choices. Chapter Two provides the theoretical background of the thesis in terms of its theoretical framework and discusses the terms used in the thesis’ title, i.e. hierarchy and preference. It justifies the selection of Natural Linguistics as the thesis’ theoretical framework by outlining its major assumptions and demonstrating its explanatory power. As far as the concepts of hierarchy and preference are concerned, the chapter provides their definitions and reviews their various understandings via decision theories and linguistic preference-based theories. Since the thesis investigates the principle of least effort in language and speech, Chapter Three considers the articulatory aspect of effort. It reviews the notion of easy and difficult sounds and discusses the concept of articulatory effort, overviewing its literature as well as various understandings in a chronological fashion. The chapter also presents the concept of articulatory gestures within the framework of Articulatory Phonology. The thesis’ aim is to investigate the principle of least effort on the basis of external evidence, therefore Chapters Four and Six provide evidence in terms of three experiments, text message studies (Chapter Four) and phonological processes in English (Chapter Six). Chapter Four contains evidence for the principle of least effort in articulation on the basis of experiments. It describes the experiments in terms of their predictions and methodology. In particular, it discusses the adopted measure of effort established by means of the effort parameters as well as their status. The statistical methods of the experiments are also clarified. 
The chapter reports on the results of the experiments, presenting them in a graphical way and discusses their relation to the tested predictions. Chapter Four establishes a hierarchy of speakers’ preferences with reference to articulatory effort (Figures 30, 31). The thesis investigates the principle of least effort in phonological processes, thus Chapter Five is devoted to the discussion of phonological processes in Natural Phonology. The chapter explains the general nature and motivation of processes as well as the development of processes in child language. It also discusses the organization of processes in terms of their typology as well as the order in which processes apply. The chapter characterizes the semantic properties of processes and overviews Luschützky’s (1997) contribution to NP with respect to processes in terms of their typology and incorporation of articulatory gestures in the concept of a process. Chapter Six investigates phonological processes. In particular, it identifies the issues of lenition/fortition definition and process typology by presenting the current approaches to process definitions and their typology. Since the chapter concludes that no coherent definition of lenition/fortition exists, it develops alternative lenition/fortition definitions. The chapter also revises the typology of phonological processes under effort management, which is an extended version of the principle of least effort. Chapter Seven concludes the thesis with a list of the concepts discussed in the thesis, enumerates the proposals made by the thesis in discussing the concepts and presents some questions for future research which have emerged in the course of investigation. The chapter also specifies the extent to which the investigation of the principle of least effort is a meaningful contribution to phonology

Adam Mickiewicz University Repository (AMUR)

The variable development of /s/ + consonant onset clusters in Farsi-English interlanguage

Author: Boudaoud Malek
Publication date: 01/01/2008

This thesis investigates the variable production of English /s/ + consonant onset clusters in the speech of 30 adult native Farsi speakers learning English as a second language (L2). In particular, the study examines the development of the homorganic /st/, /sn/ and /sl/ sequences (sC clusters), which are realized variably either via e-epenthesis (e.g., [e̱st]op) or via their target L2 pronunciation (e.g., [st]op). The sentence reading task as well as the picture-based interview utilized in this investigation followed standard sociolinguistic procedures for data collection and analyses, and included a set of linguistic (e.g., preceding phonological environment, sonority profile of the cluster) and extra-linguistic factors (e.g., level of formality, proficiency in English) whose effects were measured statistically via GoldVarb X. The results reveal that: (1) the proportion of [e]-epenthesis is higher after a word-final consonant or pause than after a vowel (in which case the sC cluster is resyllabified as two separate syllables, i.e. [V s.C V]); (2) over time (hence with increased L2 proficiency) and in formal situations, the amount of epenthesis decreases, conforming with Major's (2001) Ontogeny Phylogeny Model; and (3) as observed in several studies of L1 acquisition, markedness on continuancy - rather than markedness on sonority - is better able to capture the variable patterns of e-epenthesis in the Farsi-English interlanguage data (i.e., the more marked structures /st/ and /sn/, in which the continuancy feature varies (from [+continuant] /s/ to [-continuant] /t/ and /n/), are more likely to trigger the phenomenon of [e]-epenthesis than the less marked nonnative cluster /sl/, in which continuancy is maintained constant (from [+continuant] /s/ to [+continuant] /l/)). Based on these results, I analyze the data within a stochastic version of Optimality Theory, and discuss their implications and pedagogical applications for the teaching of pronunciation

Concordia University Research Repository