
    Big words, small phrases: Mismatches between pause units and the polysynthetic word in Dalabon

    This article uses instrumental data from natural speech to examine the phenomenon of pause placement within the verbal word in Dalabon, a polysynthetic Australian language of Arnhem Land. Though the phenomenon is incipient and in two sample texts occurs in only around 4% of verbs, there are clear possibilities for interrupting the grammatical word by pause after the pronominal prefix and some associated material at the left edge, though these within-word pauses are significantly shorter, on average, than those between words. Within-word pause placement is not random, but is restricted to certain affix boundaries; it requires that the paused-after material be at least dimoraic, and that the remaining material in the verbal word be at least disyllabic. Bininj Gun-wok, another polysynthetic language closely related to Dalabon, does not allow pauses to interrupt the verbal word, and the Dalabon development appears to be tied up with certain morphological innovations that have increased the proportion of closed syllables in the pronominal prefix zone of the verb. Though only incipient and not yet phonologized, pause placement in Dalabon verbs suggests a phonology-driven route by which polysynthetic languages may ultimately become less morphologically complex by fracturing into smaller units.

    Structure Theorem and Strict Alternation Hierarchy for FO^2 on Words

    It is well known that every first-order property on words is expressible using at most three variables. The subclass of properties expressible with only two variables is also quite interesting and well studied. We prove precise structure theorems that characterize the exact expressive power of first-order logic with two variables on words. Our results apply to both the case with and without a successor relation. For both languages, our structure theorems show exactly what is expressible using a given quantifier depth, n, and using m blocks of alternating quantifiers, for any m ≤ n. Using these characterizations, we prove, among other results, that there is a strict hierarchy of alternating quantifiers for both languages. The question whether there was such a hierarchy had been completely open. As another consequence of our structural results, we show that satisfiability for first-order logic with two variables without successor, which is NEXP-complete in general, becomes NP-complete once we only consider alphabets of a bounded size.

    A Very Low Resource Language Speech Corpus for Computational Language Documentation Experiments

    Most speech and language technologies are trained with massive amounts of speech and text information. However, most of the world's languages do not have such resources or a stable orthography. Systems constructed under these almost zero-resource conditions are promising not only for speech technology but also for computational language documentation. The goal of computational language documentation is to help field linguists to (semi-)automatically analyze and annotate audio recordings of endangered and unwritten languages. Example tasks are automatic phoneme discovery or lexicon discovery from the speech signal. This paper presents a speech corpus collected during a realistic language documentation process. It is made up of 5k speech utterances in Mboshi (Bantu C25) aligned to French text translations. Speech transcriptions are also made available: they correspond to a non-standard graphemic form close to the language phonology. We present how the data was collected, cleaned and processed, and we illustrate its use through a zero-resource task: spoken term discovery. The dataset is made available to the community for reproducible computational language documentation experiments and their evaluation. Comment: accepted to LREC 201

    On periodic points of free inverse monoid endomorphisms

    It is proved that the periodic point submonoid of a free inverse monoid endomorphism is always finitely generated. Using Chomsky's hierarchy of languages, we prove that the fixed point submonoid of an endomorphism of a free inverse monoid can be represented by a context-sensitive language but, in general, it cannot be represented by a context-free language. Comment: 18 pages