16 research outputs found

    A Transformation-Based Learning Method on Generating Korean Standard Pronunciation

    PACLIC 21 / Seoul National University, Seoul, Korea / November 1-3, 200

    Learning Local Phonological Processes

    We present a learning algorithm for local phonological processes that relies on a restriction on the expressive power needed to compute phonological patterns that apply locally. Representing phonological processes as a functional mapping from an input to output form (an assumption compatible with either the SPE or OT formalism), the learner assumes the target process can be described with the functional counterpart to the Strictly Local (McNaughton and Papert 1971, Rogers and Pullum 2011) formal languages. Given a data set of input-output string pairs, the learner applies the two-stage grammatical induction procedure of 1) constructing a prefix tree representation of the input and 2) generalizing the pattern to words not found in the data set by merging states (Garcia and Vidal 1990, Oncina et al. 1993, Heinz 2007, 2009, de la Higuera 2010). The learner’s criterion for state merging enforces a locality requirement on the kind of function it can converge to and thereby directly reflects its own hypothesis space. We demonstrate with the example of German final devoicing, using a corpus of string pairs derived from the CELEX2 lemma corpus. The implications of our results include a proposal for how humans generalize to learn phonological patterns and a consequent explanation for why local phonological patterns have this property.
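
The two-stage procedure described in this abstract can be illustrated with a minimal sketch. It is not the authors' learner: the word pairs are a made-up toy set (not CELEX2 data), the function names are hypothetical, and the merging criterion shown (grouping prefix-tree states that share their last k-1 input symbols) is an assumption about how a locality requirement might be enforced. Output handling on the transitions is left out entirely.

```python
from collections import defaultdict

def build_prefix_tree(pairs):
    """Stage 1: one state per prefix of every input string.

    Returns (children, outputs): children maps each prefix to the set of
    symbols that can follow it; outputs maps each full input to its output.
    """
    children = defaultdict(set)
    outputs = {}
    for inp, out in pairs:
        for i in range(len(inp)):
            children[inp[:i]].add(inp[i])
        children.setdefault(inp, set())  # make sure the full word is a state
        outputs[inp] = out
    return dict(children), outputs

def merge_by_suffix(children, k):
    """Stage 2 (assumed criterion): merge states with the same last k-1 symbols."""
    blocks = defaultdict(set)
    for prefix in children:
        suffix = prefix[-(k - 1):] if k > 1 else ""
        blocks[suffix].add(prefix)
    return dict(blocks)

# Toy input-output pairs in the spirit of final devoicing (hypothetical data).
pairs = [("rad", "rat"), ("rade", "rade"), ("bad", "bat")]
tree, outputs = build_prefix_tree(pairs)
blocks = merge_by_suffix(tree, k=2)
```

With k = 2, the states reached by "rad" and "bad" fall into the same block, which is how a generalization learned from one word could carry over to another word that ends the same way.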

    Inhibitory and Facilitatory Effects of Phonological and Orthographic Similarity on L2 Word Recognition Across Modalities in Bilinguals

    Language perception studies on bilinguals often show that words that share form and meaning across languages (cognates) are easier to process than words that share only meaning. This facilitatory phenomenon is known as the cognate effect. Most previous studies have shown this effect visually, whereas the auditory modality as well as the interplay between type of similarity and modality remain largely unexplored. In this study, highly proficient late Spanish-English bilinguals carried out a lexical decision task in their second language, both visually and auditorily. Words had high or low phonological and orthographic similarity, fully crossed. We also included orthographically identical words (perfect cognates). Our results suggest that similarity in the same modality (i.e., orthographic similarity in the visual modality and phonological similarity in the auditory modality) leads to improved signal detection, whereas similarity across modalities hinders it. We provide support for the idea that perfect cognates are a special category within cognates. Results suggest a need for a conceptual and practical separation between types of similarity in cognate studies. The theoretical implication is that the representations of items are active in both modalities of the non-target language during language processing, which needs to be incorporated into our current processing models.
    This research was supported by the Basque Government through the BERC 2018-2021 program and by the Spanish State Research Agency through BCBL Severo Ochoa excellence accreditation SEV-2015-0490. CF and ENB are supported by MINECO predoctoral grants from the Spanish government (BES-2016-077169 and BES-2016-078896, respectively). CDM is further supported by the Spanish Ministry of Economy and Competitiveness [PSI2017-82941-P; RED2018-102615-T], the Basque Government [PIBA18-29], and through a grant from the H2020 European Research Council [ERC Consolidator Grant ERC-2018-COG-819093].

    Linguistic probes into human history

    This thesis in linguistics includes five published articles and one study to appear, in which I review, test and use computational linguistic methods to classify languages and dialects on the basis of samples consisting of lexical items, the sort of material that is generally readily available from linguistic atlases and databases. To compare linguistic varieties and classify them, methods that lead to the computation of a linguistic distance matrix are used. The studies reported concern, respectively, the classification of Dutch dialects from the Netherlands; languages and dialects from Spain; Bantu languages from Gabon and Tanzania; and, finally, Turkic and Indo-Iranian languages spoken in Kyrgyzstan, Tajikistan and Uzbekistan. In a multidisciplinary perspective aimed at providing a higher level of anthropological synthesis, linguistic diversity is used as a proxy for the cultural differences of corresponding populations and is then compared to the variability of family names (their number, frequency and geographic distribution) or to genetic differences based on molecular markers in the DNA. The analysis of family names enables the depiction of migrations which have taken place in historical times, and allows us to distinguish regions that have received many immigrants from those that have remained demographically more stable. We conjecture that such migration patterns have influenced dialect and language contact. This is a novel perspective from which we may examine the effects of migration on language change; for example, it appears that Spanish languages have remained lively because the regions where they are spoken have often been quite isolated demographically.
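
As an illustration of what computing a linguistic distance matrix from lexical items can look like, here is a minimal sketch. The abstract does not name the measure used, so the length-normalized Levenshtein distance below is an assumption, the function names are hypothetical, and the word lists are invented toy forms rather than data from the thesis.

```python
def levenshtein(a, b):
    """Edit distance between two strings (insert, delete, substitute cost 1)."""
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, 1):
        curr = [i]
        for j, cb in enumerate(b, 1):
            curr.append(min(prev[j] + 1,                # deletion
                            curr[j - 1] + 1,            # insertion
                            prev[j - 1] + (ca != cb)))  # substitution or match
        prev = curr
    return prev[-1]

def distance_matrix(varieties):
    """Mean normalized edit distance over aligned word lists.

    varieties maps a variety name to a list of word forms; position i in
    every list is assumed to express the same concept.
    """
    names = sorted(varieties)
    matrix = {}
    for x in names:
        for y in names:
            dists = [levenshtein(a, b) / max(len(a), len(b))
                     for a, b in zip(varieties[x], varieties[y])]
            matrix[x, y] = sum(dists) / len(dists)
    return matrix

# Invented toy word lists for three hypothetical varieties.
varieties = {
    "A": ["water", "huis"],
    "B": ["wasser", "haus"],
    "C": ["water", "hus"],
}
matrix = distance_matrix(varieties)
```

A matrix of this kind can then be passed to a clustering or scaling method to group the varieties; that step is not shown here.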

    Learning Phonological Mappings by Learning Strictly Local Functions

    In this paper we identify strict locality as a defining computational property of the input-output mapping that underlies local phonological processes. We provide an automata-theoretic characterization for the class of Strictly Local functions, which are based on the well-studied Strictly Local formal languages (McNaughton & Papert 1971; Rogers & Pullum 2011; Rogers et al. 2013), and show how they can model a range of phonological processes. We then present a learning algorithm, the SLFLA, which uses the defining property of strict locality as an inductive principle to learn these mappings from finite data. The algorithm is a modification of an algorithm developed by Oncina et al. (1993) (called OSTIA) for learning the class of subsequential functions, of which the SL functions are a proper subset. We provide a proof that the SLFLA learns the class of SL functions and discuss these results alongside previous studies on using OSTIA to learn phonological mappings (Gildea and Jurafsky 1996).
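
To make the idea of a strictly local input-output mapping concrete, the sketch below hand-codes one such mapping as a small transducer whose only memory is the previous input symbol (a window of k = 2). It is not the SLFLA and learns nothing; the choice of word-final devoicing as the example, the three-segment obstruent inventory, and the function name are all assumptions made for illustration.

```python
# Toy inventory: voiced obstruents and their voiceless counterparts (assumed).
VOICED_TO_VOICELESS = {"b": "p", "d": "t", "g": "k"}

def final_devoicing(word):
    """Map an input string to its output using only the previous input symbol."""
    state = ""    # previous input symbol; "" marks the start of the word
    output = []
    for symbol in word:
        # A held voiced obstruent turned out not to be word-final: emit it as is.
        if state in VOICED_TO_VOICELESS:
            output.append(state)
        # Hold voiced obstruents until the next symbol is known; emit the rest now.
        if symbol not in VOICED_TO_VOICELESS:
            output.append(symbol)
        state = symbol
    # End of word: a held voiced obstruent surfaces as voiceless.
    if state in VOICED_TO_VOICELESS:
        output.append(VOICED_TO_VOICELESS[state])
    return "".join(output)

examples = {w: final_devoicing(w) for w in ["rad", "rade", "bad", "abba"]}
```

Because the state never records more than the last symbol read, the behavior on any word is fixed by its length-2 windows, which is the property the abstract uses as an inductive principle.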

    Strict Locality and Phonological Maps

    Learning Repairs for Marked Structures

    [Abstract not available]