A lexical transducer for North Slope Iñupiaq

Abstract

Thesis (M.A.) University of Alaska Fairbanks, 2011

This thesis describes the creation and evaluation of software designed to analyze and generate North Slope Iñupiaq words. Given a complete Iñupiaq word as input, it attempts to identify the word's stem and suffixes, including the grammatical category and any inflectional information contained in the word. Given a stem and list of suffixes as input, it attempts to produce the corresponding Iñupiaq word, applying phonological processes as necessary. Innovations in the implementation of this software include Iñupiaq-specific formats for specifying lexical data, including a table-based format for specifying inflectional suffixes in paradigms; a treatment of phonologically conditioned irregular allomorphy which leverages the pattern-recognition capabilities of the xfst programming language; and an idiom for composing morphographemic rules together in xfst which captures the state of the software each time a new rule is added, maximizing feedback during software compilation and facilitating troubleshooting. In testing, the software recognized 81.2% of all word tokens (78.3% of unique word types) and guessed at the morphology of an additional 16.8% of tokens (19.4% of types). Analyses of recognized words were largely accurate; a heuristic for identifying accurate parses is proposed. Most guesses were at least partly inaccurate. Improvements and applications are proposed.

National Science Foundation (Award 0534217)
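
The rule-composition idiom mentioned in the abstract, in which the state of the system is captured each time a new rule is added, can be illustrated with a minimal sketch. This is not the thesis's xfst code: it is a Python analogy that chains plain string-rewriting functions instead of composing finite-state transducers. The function name `compose_with_snapshots`, the two toy rules, and the sample forms are all hypothetical and do not represent Iñupiaq data or phonology.

```python
import re
from typing import Callable, List, Tuple

# A rule is a (name, rewrite function) pair. Both rules below are invented
# for illustration only and are not taken from the thesis.
Rule = Tuple[str, Callable[[str], str]]


def compose_with_snapshots(
    rules: List[Rule], samples: List[str]
) -> List[Tuple[str, List[str]]]:
    """Apply rules cumulatively, recording the sample outputs after each rule."""
    snapshots = []
    current = list(samples)
    for name, rule in rules:
        current = [rule(form) for form in current]
        # Keep a copy of the intermediate forms so that a faulty rule can be
        # located by inspecting the first snapshot that looks wrong.
        snapshots.append((name, list(current)))
    return snapshots


# Hypothetical toy rules over made-up forms; "+" marks a morpheme boundary.
RULES: List[Rule] = [
    ("t-to-d-before-boundary-vowel", lambda s: re.sub(r"t(?=\+[aiu])", "d", s)),
    ("remove-boundary", lambda s: s.replace("+", "")),
]

SAMPLES = ["ata+ta", "kat+a"]

snapshots = compose_with_snapshots(RULES, SAMPLES)
for rule_name, forms in snapshots:
    print(rule_name, forms)
```

Inspecting the snapshots in order shows which rule first changes a form in an unexpected way, which is the kind of feedback the abstract attributes to the idiom.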
