164 research outputs found

    Factors Affecting Part-of-Speech Tagging for Tagalog

    PACLIC 23 / City University of Hong Kong / 3-5 December 200

    Dependency-based Analysis for Tagalog Sentences

    Using Stanford Part-of-Speech Tagger for the Morphologically-rich Filipino Language

    An investigation into deviant morphology : issues in the implementation of a deep grammar for Indonesian

    This thesis investigates deviant morphology in Indonesian for the implementation of a deep grammar. In particular we focus on the implementation of the verbal suffix -kan. This suffix has been described as having many functions, which alter the kinds of arguments and the number of arguments the verb takes (Dardjowidjojo 1971; Chung 1976; Arka 1993; Vamarasi 1999; Kroeger 2007; Son and Cole 2008). Deep grammars or precision grammars (Butt et al. 1999a; Butt et al. 2003; Bender et al. 2011) have been shown to be useful for natural language processing (NLP) tasks, such as machine translation and generation (Oepen et al. 2004; Cahill and Riester 2009; Graham 2011), and information extraction (MacKinlay et al. 2012), demonstrating the need for linguistically rich information to aid NLP tasks. Although these linguistically-motivated grammars are invaluable resources to the NLP community, the biggest drawback is the time required for the manual creation and curation of the lexicon. Our work aims to expedite this process by applying methods to assign syntactic information to kan-affixed verbs automatically. The method we employ exploits the hypothesis that semantic similarity is tightly connected with syntactic behaviour (Levin 1993). Our endeavour in automatically acquiring verbal information for an Indonesian deep grammar poses a number of linguistic challenges. First of all, Indonesian verbs exhibit voice marking that is characteristic of the subgrouping of its language family. In order to be able to characterise verbal behaviour in Indonesian, we first need to devise a detailed analysis of voice for implementation. Another challenge we face is the claim that all open class words in Indonesian, at least as it is spoken in some varieties (Gil 1994; Gil 2010), cannot linguistically be analysed as being distinct from each other.
That is, there is no distinction between nouns, verbs or adjectives in Indonesian, and all words from the open class categories should be analysed uniformly. This poses difficulties in implementing a grammar in a linguistically motivated way, as well as in discovering the syntactic behaviour of verbs, if verbs cannot be distinguished from nouns. As part of our investigation we conduct experiments to verify the need to employ word class categories, and we find that indeed these are linguistically motivated labels in Indonesian. Through our investigation into deviant morphological behaviour, we gain a better characterisation of the morphosyntactic effects of -kan, and we discover that, although Indonesian has been labelled as a language with no open word class distinctions, word classes can be established as being linguistically motivated.

    Social and structural aspects of language contact and change

    This book brings together papers that discuss social and structural aspects of language contact and language change. Several papers look at the relevance of historical documents to determine the linguistic nature of early contact varieties, while others investigate the specific processes of contact-induced change that were involved in the emergence and development of these languages. A third set of papers looks at how new datasets and greater sensitivity to social issues can help to (re)assess persistent theoretical and empirical questions as well as help to open up new avenues of research. In particular they highlight the heterogeneity of contemporary language practices and attitudes often obscured in sociolinguistic research. The contributions all focus on language variation and change but investigate it from a variety of disciplinary and empirical perspectives and cover a range of linguistic contexts.
