
    A Review of Corpus-based Statistical Models of Language Variation

    This paper is a brief review of the research on language variation using corpus data and statistical modeling methods. The variation phenomena covered in this review include phonetic variation (in spontaneous speech) and syntactic variation, with a focus on studies of English and Chinese. The goal of this paper is to demonstrate the use of corpus-driven statistical models in the study of language variation, and discuss the contribution and future directions of this line of research.

    The role of language proficiency and statistical learning in on-line comprehension of syntax among bilingual adult readers

    Statistical learning (SL) is the ability to identify co-occurring regularities from the environment, and has been implicated in learning across a range of skills, including language. This research project investigated whether there are associations between SL and on-line sentence processing in L1 Chinese L2 English bilinguals, and sought to examine whether second language proficiency mediated the relationship between visual SL and L2 language processing. To this end, two studies were conducted. In Study 1, sixty Chinese-English bilinguals completed a self-paced reading task in Mandarin and English, which tested participants’ on-line processing of subject and object relative clauses (RCs). They also completed a nonlinguistic visual SL task and a battery of additional measures assessing L2 English proficiency and general cognitive abilities. The results revealed that only nonverbal intelligence predicted L1 Chinese RC processing, and neither visual SL capacity nor L2 proficiency predicted L2 English RC processing. One possible explanation is that SL is partially modality-specific. Therefore, an auditory SL task was employed in addition to the visual SL task in Study 2. In Study 2, fifty-two native Mandarin-speaking adults completed tests of visual and auditory SL, a self-paced reading task measuring the on-line processing of Mandarin relative clauses, and measures of general cognitive abilities. The results showed that auditory SL capacity independently predicted reading times in the self-paced reading task. Visual SL was also related to language processing, although the effect was marginal. The findings from Study 2 suggest that individual differences in adults’ capacity for SL are associated with on-line processing of Chinese

    The acquisition and use of Mandarin relative clauses by monolingual and bilingual children and adults

    Children have been found to understand and use relative clauses (RCs) at an early age. However, not all types of RCs are acquired at the same time, nor are they used with the same frequency (e.g., Diessel & Tomasello, 2000, 2005). Using corpus-based and experimental methodologies, the three studies presented in this thesis investigate the acquisition and processing of different types of RCs in Mandarin, aiming to understand the mechanisms involved in the acquisition and processing of RCs involving varying degrees of complexity. The first study (Chapter 3) presents a corpus analysis examining the naturalistic production of Mandarin RCs by Mandarin-speaking monolingual and heritage Mandarin-English bilingual children (1;00-5;00). The results show that both monolingual and bilingual children produce more object RCs than subject RCs in Mandarin. This is because Mandarin object RCs resemble simple Subject-Verb-Object (SVO) sentences the children had previously acquired, and occur more frequently than subject RCs in their input. Compared to monolingual children, bilingual children produce more object RCs, suggesting that the acquisition of Mandarin RCs is facilitated not only by SVO transitives in Mandarin but also by SVO transitives in English. In contrast to the first study, the second study (Chapter 4) reports a subject RC advantage by looking at the comprehension of Mandarin subject and object RCs in heritage Mandarin-English bilingual children (4;00-10;11) and their vocabulary-matched monolingual peers (4;00-5;09). Using a character-sentence matching task, the results reveal that simple SVO transitives hinder children’s comprehension of Mandarin object RCs by misleading them to interpret the noun phrase occurring first as the head noun. 
Compared to monolingual children, bilingual children who are more English dominant make this type of error more frequently for Mandarin object RCs, suggesting that both English SVO transitives and language dominance contribute to cross-linguistic influence. However, unlike either the subject or object RC advantage shown in children, mixed results are found in the writing of adult Mandarin native speakers (L1) and advanced second language learners (L2) in the third study (Chapter 5). Using conditional inference trees and random forests, the results show that both adult Mandarin L1 and L2 speakers’ selection of subject and object RCs heavily depends on the discourse context that RCs are situated in. The first and second studies (Chapters 3 and 4) are novel in taking Mandarin RCs with omitted head nouns into account. In spontaneous speech (Chapter 3), the results indicate that monolingual and bilingual children as young as two can produce Mandarin RCs with omitted head nouns, and the omission of a head noun does not influence the subject-object asymmetry. Similarly, the absence of a head noun does not influence monolingual and bilingual children’s comprehension of Mandarin RCs (Chapter 4), suggesting that they are able to recover omitted head nouns from the context provided. In addition, the first and third studies (Chapters 3 and 5) also examine the matrix-clause positions in which Mandarin RCs tend to occur. RCs that occur in the non-centre-embedded matrix-clause position (e.g., The goat saw the horse [that hugged the pig]) are expected to be easier to process than RCs in the centre-embedded matrix-clause position (e.g., The horse [that hugged the pig] saw the goat), as they require lower working memory load (e.g., Gibson, 1998, 2000). Supporting this assumption, in adult Mandarin L1 and L2 speakers’ writing (Chapter 5), non-centre-embedded RCs occur more often than centre-embedded RCs. 
Moreover, the longer the RCs, the higher the possibility they are placed in the non-centre-embedded matrix-clause position. However, in children’s spontaneous speech (Chapter 3), neither monolingual nor bilingual children show a tendency to prefer non-centre-embedded over centre-embedded RCs, which may relate to the short length of the RCs they produce. The shorter the RCs, the less memory load is needed to process centre-embedded RCs, and therefore the disadvantage of centre-embedded RCs may diminish. The three studies of this thesis present mixed findings regarding Mandarin RC processing, but consistently provide evidence to support the usage-based account. That is, the processing of RCs is shaped by an individual’s age and language experience, including input frequency, the related structures that have been acquired, language dominance and the discourse contexts that RCs tend to appear in

    Negative vaccine voices in Swedish social media

    Vaccinations are one of the most significant interventions in public health, but vaccine hesitancy creates concerns for a portion of the population in many countries, including Sweden. Since discussions on vaccine hesitancy often take place on social networking sites, data from Swedish social media are used to study and quantify the sentiment among the discussants on the vaccination-or-not topic during phases of the COVID-19 pandemic. Of all the posts analyzed, a majority showed a stronger negative sentiment, which prevailed throughout the whole of the examined period, with some spikes or jumps due to certain vaccine-related events distinguishable in the results. Sentiment analysis can be a valuable tool to track public opinions regarding the use, efficacy, safety, and importance of vaccination

    Order in NP conjuncts in spoken English and Japanese

    In the emerging field of cross-linguistic studies on language production, one particularly interesting line of inquiry is possible differences between English and Japanese in ordering words and phrases. Previous research gives rise to the idea that there is a difference in accessing meaning versus form during linearization between these two languages. This assumption is based on observations of language-specific effects of the length factor on the order of phrases (short-before-long in English, long-before-short in Japanese). We contribute to the cross-linguistic exploration of such differences by investigating the variables underlying the internal order of NP conjuncts in spoken English and Japanese. Our quantitative analysis shows that similar influences underlie the ordering process across the two languages. Thus we do not find evidence for the aforementioned difference in accessing meaning versus form with this syntactic phenomenon. With regard to length, Japanese also exhibits a short-before-long preference. However, this tendency is significantly weaker in Japanese than in English, which we explain through an attenuating influence of the typical Japanese phrase structure pattern on the universal effect of short phrases being more accessible. We propose that a similar interaction between entrenched long-before-short schemas and universal accessibility effects is responsible for the varying effects of length in Japanese

    Getting Past the Language Gap: Innovations in Machine Translation

    In this chapter, we will be reviewing state of the art machine translation systems, and will discuss innovative methods for machine translation, highlighting the most promising techniques and applications. Machine translation (MT) has benefited from a revitalization in the last 10 years or so, after a period of relatively slow activity. In 2005 the field received a jumpstart when a powerful complete experimental package for building MT systems from scratch became freely available as a result of the unified efforts of the MOSES international consortium. Around the same time, hierarchical methods had been introduced by Chinese researchers, which allowed the introduction and use of syntactic information in translation modeling. Furthermore, the advances in the related field of computational linguistics, making off-the-shelf taggers and parsers readily available, helped give MT an additional boost. Yet there is still more progress to be made. For example, MT will be enhanced greatly when both syntax and semantics are on board: this still presents a major challenge though many advanced research groups are currently pursuing ways to meet this challenge head-on. The next generation of MT will consist of a collection of hybrid systems. It also augurs well for the mobile environment, as we look forward to more advanced and improved technologies that enable the working of Speech-To-Speech machine translation on hand-held devices, i.e. speech recognition and speech synthesis. We review all of these developments and point out in the final section some of the most promising research avenues for the future of MT

    Directional adposition use in English, Swedish and Finnish

    Directional adpositions such as to the left of describe where a Figure is in relation to a Ground. English and Swedish directional adpositions refer to the location of a Figure in relation to a Ground, whether both are static or in motion. In contrast, the Finnish directional adpositions edellä (in front of) and jäljessä (behind) solely describe the location of a moving Figure in relation to a moving Ground (Nikanne, 2003). When using directional adpositions, a frame of reference must be assumed for interpreting their meaning. For example, the meaning of to the left of in English can be based on a relative (speaker or listener based) reference frame or an intrinsic (object based) reference frame (Levinson, 1996). When a Figure and a Ground are both in motion, it is possible for a Figure to be described as being behind or in front of the Ground, even if neither has intrinsic features. As shown by Walker (in preparation), there are good reasons to assume that in the latter case a motion based reference frame is involved. This means that if Finnish speakers used edellä (in front of) and jäljessä (behind) more frequently in situations where both the Figure and Ground are in motion, a difference in reference frame use between Finnish on one hand and English and Swedish on the other could be expected. We asked native English, Swedish and Finnish speakers to select adpositions from a language-specific list to describe the location of a Figure relative to a Ground when both were shown to be moving on a computer screen. We were interested in any differences between Finnish, English and Swedish speakers. All languages showed a predominant use of directional spatial adpositions referring to the lexical concepts TO THE LEFT OF, TO THE RIGHT OF, ABOVE and BELOW. There were no differences between the languages in directional adposition use or reference frame use, including reference frame use based on motion. 
We conclude that despite differences in the grammars of the languages involved, and potential differences in reference frame system use, the three languages investigated encode Figure location in relation to Ground location in a similar way when both are in motion. Levinson, S. C. (1996). Frames of reference and Molyneux’s question: Crosslinguistic evidence. In P. Bloom, M.A. Peterson, L. Nadel & M.F. Garrett (Eds.), Language and Space (pp. 109-170). Massachusetts: MIT Press. Nikanne, U. (2003). How Finnish postpositions see the axis system. In E. van der Zee & J. Slack (Eds.), Representing direction in language and space. Oxford, UK: Oxford University Press. Walker, C. (in preparation). Motion encoding in language, the use of spatial locatives in a motion context. Unpublished doctoral dissertation, University of Lincoln, Lincoln, United Kingdom.

    Linguistic Variation from Cognitive Variability: The Case of English ‘Have’

    In this dissertation, I seek to construct a model of meaning variation built upon variability in linguistic structure, conceptual structure, and cognitive makeup, and in doing so, exemplify an approach to studying meaning that is both linguistically principled and neuropsychologically grounded. As my test case, I make use of the English lexical item ‘have’ by proposing a novel analysis of its meaning based on its well-described variability in English and its embedding into crosslinguistically consistent patterns of variation and change. I support this analysis by investigating its real-time comprehension patterns through behavioral, electropsychophysiological, and hemodynamic brain data, thereby incorporating dimensions of domain-general cognitive variability as crucial determinants of linguistic variability. Per my account, ‘have’ retrieves a generalized relational meaning which can give rise to a conceptually constrained range of readings, depending on the degree of causality perceived from either linguistic or contextual cues. Results show that comprehenders can make use of both for ‘have’-sentences, though they vary in the degree to which they rely on each. At the very broadest level, the findings support a model in which the semantic distribution of ‘have’ is inherently principled due to a unified conceptual structure. This underlying conceptual structure and relevant context cooperate in guiding comprehension by modulating the salience of potential readings as comprehension unfolds, though this ability to use relevant context (context-sensitivity) is variable but systematic across comprehenders. These linguistic and cognitive factors together form the core of normal language processing and, with a gradient conceptual framework, the minimal infrastructure for meaning variation and change
