2,058 research outputs found

    Focus in Gur and Kwa

    The project investigates focus phenomena in the two genetically related West African Gur and Kwa language groups of the Niger-Congo phylum. Most of their members are tone languages; they are similar with respect to word order typology (all are SVO languages) but of divergent morphological type (agglutinating Gur versus isolating Kwa).

    Applying Tools and Techniques of Natural Language Processing to the Creation of Resources for Less Commonly Taught Languages

    This paper proposes that research results from the area of natural language processing could effectively be applied to creating software to facilitate the development of language learning materials for any natural language. We will suggest that a knowledge-elicitation system called Boas, which was originally created to support a machine-translation application, could be modified to support language-learning ends. Boas leads a speaker of any natural language, who is not necessarily trained in linguistics, through a series of pedagogically-supported questionnaires, the responses to which constitute a "profile" of the language. This profile includes morphological, lexical and syntactic information. Once this structured profile is created, it can feed into virtually any type of system, including one to support language learning. Creating language learning software using a system like this would be efficient in two ways: first, it would exploit extant cutting-edge research and technologies in natural language processing; and second, it would permit a single tool to be used for all languages, including less commonly taught ones, for which limited funding for resource development is a bottleneck.

    On the Typology of Inflection Class Systems

    Inflectional classes are a property of the ideal inflecting-fusional language type. Thus strongly inflecting languages have the most complex vertical and horizontal stratification of hierarchical tree structures. Weakly inflecting languages which also approach the ideal isolating type or languages which also approach the agglutinating type have much shallower structures. Such properties follow from principles of Natural Morphology and from the distinction of the descendent hierarchy of macroclasses, classes, subclasses, subsubclasses etc. and homogeneous microclasses. The main languages of illustration are Latin, Lithuanian, Russian, German, French, Finnish, Hungarian and Turkish.

    Polysynthetic Tendencies in Modern Greek

    The aim of this paper is to provide a more accurate typological classification of Modern Greek (MG). The verb in MG shows many polysynthetic traits, such as noun and adverb incorporation into the verbal complex, a large inventory of bound morphemes, pronominal marking of objects, many potential slots before the verbal head, nonconfigurational syntax, etc. On the basis of these traits, MG has similarities with polysynthetic languages such as Abkhaz, Cayuga, Chukchi, Mohawk and Nahuatl, among others. I will show that the abundance of similar patterns between MG and polysynthesis points to the evolution of a new system away from the traditional dependent-marking strategy and simple synthesis towards head-marking and polysynthesis. Finally, I will point to the risk of undertaking a direct comparison of different language systems by discussing the pronominal head-marking strategies in MG and the North American languages.

    The morphology-phonology interface: Isolating to polysynthetic languages

    Given the substantial variation in the nature of the grammatical word (GW) across languages, this paper addresses the question of whether the Phonological Word (PW) exhibits the same degree of variation or rather abstracts away from it due to the typically flatter nature of the phonological hierarchy. Various types of languages are examined, focusing on isolating and polysynthetic languages—opposite ends of a word structure continuum. It is demonstrated that, indeed, the PW exhibits substantially less variation across languages than might be expected on the basis of the differences in GW structure. Furthermore, it is shown that an additional constituent (i.e., the Clitic Group, renamed Composite Group) is required between the PW and the Phonological Phrase to fully account for the interface between morpho-syntactic and phonological structures

    Kannada named entity recognition and classification (NERC) based on multinomial naïve Bayes (MNB) classifier

    Named Entity Recognition and Classification (NERC) is the process of identifying proper nouns in a text and classifying them into predefined categories such as person name, location, organization, date, and time. NERC in Kannada is an essential and challenging task. The aim of this work is to develop a novel model for NERC based on a Multinomial Naïve Bayes (MNB) classifier. The methodology adopted in this paper is based on feature extraction from the training corpus, using term frequency and inverse document frequency and fitting them to a tf-idf vectorizer. The paper discusses the various issues in developing the proposed model. The details of implementation and performance evaluation are discussed. The experiments are conducted on a training corpus of 95,170 tokens and a test corpus of 5,000 tokens. It is observed that the model achieves Precision, Recall and F1-measure of 83%, 79% and 81% respectively. Comment: 14 pages, 3 figures, International Journal on Natural Language Computing (IJNLC) Vol. 4, No.4, August 201
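
    As a quick sanity check on the figures quoted in this abstract, the sketch below recomputes the F1-measure from the reported precision and recall. It assumes the standard definition of F1 as the harmonic mean of the two; the abstract does not say which formula the authors used, and the function name `f1_score` is only an illustrative choice.

```python
def f1_score(precision: float, recall: float) -> float:
    """Harmonic mean of precision and recall (standard F1 definition, assumed here)."""
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)


# Precision and recall as reported in the abstract (83% and 79%).
reported_precision = 0.83
reported_recall = 0.79

f1 = f1_score(reported_precision, reported_recall)
print(f"F1 = {f1:.4f}")  # prints "F1 = 0.8095"
```

    Under that assumption the result rounds to 81%, which agrees with the F1-measure stated in the abstract.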