686 research outputs found

Recommended from our members

Towards a Learning-Based Account of Underlying Forms: A Case Study in Turkish

Author: Belth Caleb
Publication venue: ScholarWorks@UMass Amherst
Publication date: 01/06/2023

A traditional concept in phonological theory is that of the underlying form. However, the history of phonology has witnessed a debate about how abstract underlying representations ought to be allowed to be, and a number of arguments have been given that phonology should abandon such representations altogether. In this paper, we consider a learning-based approach to the question. We propose a model that, by default, constructs concrete representations of morphemes. When and only when such concrete representations make it challenging to generalize in the face of the sparse statistical profile of language, our proposed model constructs abstract underlying forms that allow for effective generalization. As a case study, we consider the highly agglutinative language Turkish. We demonstrate that the underlying forms that our model constructs account for the complexities of Turkish phonology resulting from its multifaceted vowel harmony. Moreover, these underlying forms enable the highly accurate prediction of novel surface forms, demonstrating the importance of some underlying forms to generalization.

ScholarWorks@UMass Amherst

What is in a morpheme? Theoretical, experimental and computational approaches to the relation of meaning and form in morphology

Author: Hammarström Harald
Kastner Itamar
Manova Stela
Nie Yining
Publication date: 31/03/2020

Edinburgh Research Explorer

Early decomposition in visual word recognition: Dissociating morphology, form, and meaning

Author: Baayen R.H.
Billi Randall
Ford M.
Gold B.T.
Marchand H.
Mirjana Bozic
Rastle K.
William D. Marslen-Wilson
Publication venue: Taylor & Francis

The role of morphological, semantic, and form-based factors in the early stages of visual word recognition was investigated across different SOAs in a masked priming paradigm, focusing on English derivational morphology. In a first set of experiments, stimulus pairs co-varying in morphological decomposability and in semantic and orthographic relatedness were presented at three SOAs (36, 48, and 72 ms). No effects of orthographic relatedness were found at any SOA. Semantic relatedness did not interact with effects of morphological decomposability, which came through strongly at all SOAs, even for pseudo-suffixed pairs such as archer-arch. Derivational morphological effects in masked priming seem to be primarily driven by morphological decomposability at an early stage of visual word recognition, and are independent of semantic factors. A second experiment reversed the order of prime and target (stem-derived rather than derived-stem), and again found that morphological priming did not interact with semantic relatedness. This points to an early segmentation process that is driven by morphological decomposability and not by the structure or content of central lexical representations.

Crossref

PubMed Central

Unsupervised learning of Arabic non-concatenative morphology

Author: Khaliq Bilal
Publication date: 24/04/2015

Unsupervised approaches to learning the morphology of a language play an important role in computer processing of language from a practical and theoretical perspective, due to their minimal reliance on manually produced linguistic resources and human annotation. Such approaches have been widely researched for the problem of concatenative affixation, but less attention has been paid to the intercalated (non-concatenative) morphology exhibited by Arabic and other Semitic languages. The aim of this research is to learn the root and pattern morphology of Arabic, with accuracy comparable to manually built morphological analysis systems. The approach is kept free from human supervision or manual parameter settings, assuming only that roots and patterns intertwine to form a word. Promising results were obtained by applying a technique adapted from previous work in concatenative morphology learning, which uses machine learning to determine relatedness between words. The output, with probabilistic relatedness values between words, was then used to rank all possible roots and patterns to form a lexicon. Analysis using triliteral roots resulted in correct root identification accuracy of approximately 86% for inflected words. Although the machine learning-based approach is effective, it is conceptually complex. An alternative, simpler, and more computationally efficient approach was therefore devised to obtain morpheme scores based on comparative counts of roots and patterns. In this approach, root and pattern scores are defined in terms of each other in a mutually recursive relationship, converging to an optimized morpheme ranking. This technique gives slightly better accuracy while being conceptually simpler and more efficient. The approach, after further enhancements, was evaluated on a version of the Quranic Arabic Corpus, attaining a final accuracy of approximately 93%. A comparative evaluation shows this to be superior to two existing, widely used, manually built Arabic stemmers, thus demonstrating the practical feasibility of unsupervised learning of non-concatenative morphology.
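
The mutually recursive scoring mentioned in the abstract above can be illustrated with a small sketch. This is not the thesis's actual formulation: the update rule (sum, then normalize), the toy transliterated word list, and the restriction of root letters to consonants via a hand-written VOWELS set are all assumptions made only to keep the example short. The names candidate_analyses, score_morphemes, and best_analysis are hypothetical.

```python
from itertools import combinations

# Assumption: toy transliterated words and a hand-written vowel set.
# The abstract describes an approach without such manual settings.
WORDS = ["kataba", "kaatib", "maktab", "darasa", "daaris", "madras"]
VOWELS = set("aiu")


def candidate_analyses(word, root_len=3):
    """Yield (root, pattern) pairs: choose root_len consonants in order
    as the root and mask their positions with '-' to get the pattern."""
    consonants = [i for i, ch in enumerate(word) if ch not in VOWELS]
    for idx in combinations(consonants, root_len):
        chosen = set(idx)
        root = "".join(word[i] for i in idx)
        pattern = "".join("-" if i in chosen else ch
                          for i, ch in enumerate(word))
        yield root, pattern


def score_morphemes(words, iterations=20):
    """Alternate two updates: a root's score is the sum of the scores of
    the patterns it occurs with, and vice versa, normalizing each time."""
    pairs = {rp for w in words for rp in candidate_analyses(w)}
    root_score = {r: 1.0 for r, _ in pairs}
    pattern_score = {p: 1.0 for _, p in pairs}
    for _ in range(iterations):
        new_root = dict.fromkeys(root_score, 0.0)
        for r, p in pairs:
            new_root[r] += pattern_score[p]
        total = sum(new_root.values())
        root_score = {r: s / total for r, s in new_root.items()}

        new_pattern = dict.fromkeys(pattern_score, 0.0)
        for r, p in pairs:
            new_pattern[p] += root_score[r]
        total = sum(new_pattern.values())
        pattern_score = {p: s / total for p, s in new_pattern.items()}
    return root_score, pattern_score


def best_analysis(word, root_score, pattern_score):
    """Pick the candidate whose root and pattern scores multiply highest."""
    return max(candidate_analyses(word),
               key=lambda rp: root_score[rp[0]] * pattern_score[rp[1]])


root_score, pattern_score = score_morphemes(WORDS)
print(best_analysis("maktab", root_score, pattern_score))
```

In this toy setting, roots that recur across several shared patterns reinforce those patterns, which in turn reinforce the roots, so analyses built from one-off letter combinations fall in the ranking with each iteration.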

Sussex Research Online

A Novel Schema-Oriented Approach for Chinese New Word Identification

Author: Gu Junzhong
Lu Zhao
Yan Zhixian
Publication venue: Department of English, National Chengchi University
Publication date: 01/01/2013

Waseda University Repository

Does linear position matter for morphological processing? Evidence from a Tagalog masked priming experiment

Author: Cayado DKT
Stockall L
Wray S
Publication venue: Informa UK Limited
Publication date: 01/01/2023

This study investigated morphological decomposition of Tagalog infixed, prefixed, and suffixed words using the masked priming paradigm. We directly compared morphological priming of infixed, ni- prefixed and -in suffixed words to examine whether infixes are processed similarly to other affixes during early and automatic decomposition. We found significant priming effects for infixed, prefixed, and suffixed words, but no semantic or orthographic similarity priming. Magnitudes of priming effects for infixed and prefixed words were not significantly different, suggesting that decomposition of infixed words was not more costly for Tagalog speakers, contrary to phonological readjustment-based accounts of infixation. This is the first psycholinguistic experiment showing that infixed words are decomposed into morphological units during visual word recognition. We provide evidence that the imperfect edge-alignment of the stem within infixed words does not hamper the early morphological decomposition mechanisms, suggesting that edge-alignment might not be critical to trigger activation of morphological units.

Queen Mary Research Online

A brain network for integration of tone and suffix

Author: Horne Merle
Roll Mikael
Söderström Pelle
Publication date: 01/01/2015

Lund University Publications

A Dual-Route Approach to Orthographic Processing

Author: Grainger Jonathan
Ziegler Johannes C.
Publication venue: Frontiers Research Foundation
Publication date: 01/01/2011

In the present theoretical note we examine how different learning constraints, thought to be involved in optimizing the mapping of print to meaning during reading acquisition, might shape the nature of the orthographic code involved in skilled reading. On the one hand, optimization is hypothesized to involve selecting combinations of letters that are the most informative with respect to word identity (diagnosticity constraint), and on the other hand to involve the detection of letter combinations that correspond to pre-existing sublexical phonological and morphological representations (chunking constraint). These two constraints give rise to two different kinds of prelexical orthographic code, a coarse-grained and a fine-grained code, associated with the two routes of a dual-route architecture. Processing along the coarse-grained route optimizes fast access to semantics by using minimal subsets of letters that maximize information with respect to word identity, while coding for approximate within-word letter position independently of letter contiguity. Processing along the fine-grained route, on the other hand, is sensitive to the precise ordering of letters, as well as to position with respect to word beginnings and endings. This enables the chunking of frequently co-occurring contiguous letter combinations that form relevant units for morpho-orthographic processing (prefixes and suffixes) and for the sublexical translation of print to sound (multi-letter graphemes).

Crossref

HAL AMU

Directory of Open Access Journals

PubMed Central

Frontiers - Publisher Connector