    Information-theoretic causal inference of lexical flow

    This volume seeks to infer large phylogenetic networks from phonetically encoded lexical data and contribute in this way to the historical study of language varieties. The technical step that enables progress in this case is the use of causal inference algorithms. Sample sets of words from language varieties are preprocessed into automatically inferred cognate sets, and then modeled as information-theoretic variables based on an intuitive measure of cognate overlap. Causal inference is then applied to these variables in order to determine the existence and direction of influence among the varieties. The directed arcs in the resulting graph structures can be interpreted as reflecting the existence and directionality of lexical flow, a unified model which subsumes inheritance and borrowing as the two main ways of transmission that shape the basic lexicon of languages. A flow-based separation criterion and domain-specific directionality detection criteria are developed to make existing causal inference algorithms more robust against imperfect cognacy data, giving rise to two new algorithms. The Phylogenetic Lexical Flow Inference (PLFI) algorithm requires lexical features of proto-languages to be reconstructed in advance, but yields fully general phylogenetic networks, whereas the more complex Contact Lexical Flow Inference (CLFI) algorithm treats proto-languages as hidden common causes, and only returns hypotheses of historical contact situations between attested languages. The algorithms are evaluated both against a large lexical database of Northern Eurasia spanning many language families, and against simulated data generated by a new model of language contact that builds on the opening and closing of directional contact channels as primary evolutionary events. The algorithms are found to infer the existence of contacts very reliably, whereas the inference of directionality remains difficult. 
    This currently limits the new algorithms to a role as exploratory tools for quickly detecting salient patterns in large lexical datasets, but it should soon be possible for the framework to be enhanced, e.g. by confidence values for each directionality decision.

    Céad míle fáilte: a corpus-based study of the development of a community of practice within the Irish hotel management training sector

    This thesis examines the discourse of a unique third-level academic institution in order to identify the linguistic features that align it, first of all, with the higher education sector in general, but more specifically with a particular professional world in which students are being educated for their future careers. The locus of research is a college of hotel management education in the south of Ireland. Students complete a four-year Business Degree in International Hotel Management, during which time they gain academic and theoretical knowledge along with practical experience during industry placement internships. Data collection using oral recordings spanned a twelve-month period and two academic years. This allowed for a comprehensive matrix of recording events encapsulating the full gamut of college academic life across the three years of student presence on campus. Recordings included a variety of hotel-specific and business lectures, practical working sessions, language classes and some miscellaneous events, thus creating a one-million-word spoken corpus devoted to this sector. The primary research question concerns the identification and quantification of the discourse specific to this academic and professionally-oriented environment, using corpus linguistics methodologies. Parallel to and supported by this specialised linguistic repertoire lies the development of the emergent identity among the students themselves and their place and future careers within the international hotel management sector. This aspect will be analysed within Wenger’s (1998) framework of community of practice and Lave and Wenger’s (1991) initial theory of legitimate peripheral participation.
    In addition, an ethnographic lens will be employed to shed light on the day-to-day operations of this college and on how the totality of this unique community, expressed through its discourse though not through discourse alone, establishes and fosters an environment where the students develop their future professional identities, supported by academic professionals who are experienced industry practitioners in the field of international hotel management.

    Proceedings of the Fifth Workshop on NLP for Similar Languages, Varieties and Dialects (VarDial 2018)

    Peer reviewed

    LSP Journal Vol5 No 1 2014

    Focal I : papers from the Fourth International Conference on Austronesian Linguistics

    Measuring phonological distance between languages

    Three independent approaches to measuring cross-language phonological distance are pursued in this thesis: exploiting phonological typological parameters; measuring the cross-entropy of phonologically transcribed texts; and measuring the phonetic similarity of non-word nativisations by speakers from different language backgrounds. Firstly, a set of freely accessible online tools is presented to aid in establishing parametric values for syllable structure and phoneme inventory in different languages. The tools allow researchers to make differing analytical and observational choices and compare the results. These tools are applied to 16 languages, and correspondence between the resulting parameter values is used as a measure of phonological distance. Secondly, the computational technique of cross-entropy measurement is applied to texts from seven languages, transcribed in four different ways: in a phonemic IPA transcription; with Elements; and with two sets of binary distinctive features in the SPE tradition. This technique results in consistently replicable rankings of phonological similarity for each transcription system. It is sensitive to differences in transcription systems, and it can be used to probe the consequences for information transfer of the choices made in devising a representational system. Thirdly, participants from different language backgrounds are presented with non-words covering the vowel space, and asked to nativise them. The accent distance metric ACCDIST is applied to the resulting words. A profile of how each speaker’s productions cluster in the vowel space is produced, and ACCDIST measures the similarity of these profiles. Averaging across speakers with a shared native language produces a measure of similarity between language profiles. Each of these three approaches delivers a quantitative measure of phonological similarity between individual languages.
    They are each sensitive to different analytical choices, and require different types and quantities of input data, and so can complement each other. This thesis provides a proof-of-concept for methods which are both internally consistent and falsifiable.

    A Study of Lexical Variation, Comprehension and Language Attitudes in Deaf Users of Chinese Sign Language (CSL) from Beijing and Shanghai

    Regional variation between the Beijing and Shanghai varieties, particularly at the lexical level, has been observed by sign language researchers in China (Fischer & Gong, 2010; Shen, 2008; Yau, 1977). However, few investigations into the variation in Chinese Sign Language (CSL) from a sociolinguistic perspective have previously been undertaken. The current study is the first to systematically study sociolinguistic variation in CSL signers’ production and comprehension of lexical signs as well as their language attitudes. This thesis consists of three studies. The first study investigates the lexical variation between Beijing and Shanghai varieties. Results of analyses show that age, region and semantic category are the factors influencing lexical variation in Beijing and Shanghai signs. To further explore the findings of lexical variation, a lexical recognition task was undertaken with Beijing and Shanghai signers in a second study looking at mutual comprehension of lexical signs used in Beijing and Shanghai varieties. The results demonstrate that Beijing participants were able to understand more Shanghai signs than Shanghai participants could understand Beijing signs. Historical contact is proposed in the study as a possible major cause for the asymmetrical intelligibility between the two varieties. The third study investigated signers’ attitudes towards regional varieties of CSL and Signed Chinese via a questionnaire. The findings demonstrate that older signers tended to have a conservative attitude towards their comprehension of regional signs of CSL, and that participants of both regions tended to ascribe high solidarity to their own varieties and high social status to Signed Chinese. This study has expanded our knowledge of sociolinguistic variation in Beijing and Shanghai signing varieties, and lays the groundwork for a future comprehensive study of the regional varieties in CSL. 
    This study may also serve as a useful reference for official sign language planning in China, including such issues as promoting a standardised lexicon across China and offering qualifications for CSL learners and interpreters.

    Austronesian and other languages of the Pacific and South-east Asia : an annotated catalogue of theses and dissertations

    Proceedings of the 42nd Australian Linguistic Society Conference - 2011

    ANU College of Arts & Social Sciences, School of Language Studies; ANU College of Asia and the Pacific, School of Culture, History and Language