6,078 research outputs found

    Regimes in Babel are Confirmed: Report on Findings in Several Indonesian Ethnic Biblical Texts

    The paper reports the presence of three statistical regimes in the Zipfian analysis of texts in quantitative linguistics: the Mandelbrot, original Zipf, and Cancho-Solé-Montemurro regimes. The work covers nine languages carrying the same semantic content: the Bible in several Indonesian ethnic languages and in the national language. As usual, the same analysis is also applied to an English version of the Bible for reference. The existence of the three regimes is confirmed, while the length of the texts also becomes an important issue. We outline further work on the quantitative parameterization used to analyze the three regimes and on the task of finding a broad explanation, especially in the microstructure of language in human decision or linguistic effort, for the emergence of their robustness.

    Theoretical issues in the interpretation of Cappadocian, a not-so-dead Greek contact language

    Cappadocian is a mixed Greek-Turkish dialect continuum spoken in the Turkish Central Anatolia Region until the population exchange between Greece and Turkey in the 1920s. Only a few Cappadocian dialects are still spoken in present-day Greece. Since the publication of Thomason and Kaufman’s Language Contact, Creolization, and Genetic Linguistics in 1988, Cappadocian has attracted the attention of historical and contact linguists, because of its unique mixed character. In this paper, I will discuss a number of theoretical issues in the interpretation of the linguistic structure of Cappadocian, focusing on the following topics: (1) the status of loan phonemes and loan morphemes in contact languages, (2) the distinction between code switching and code mixing in relation to Poplack’s Free Morpheme Constraint, (3) the schizoid typology of contact languages

    Gathering Statistics to Aspectually Classify Sentences with a Genetic Algorithm

    This paper presents a method for large corpus analysis to semantically classify an entire clause. In particular, we use co-occurrence statistics among similar clauses to determine the aspectual class of an input clause. The process examines linguistic features of clauses that are relevant to aspectual classification. A genetic algorithm determines what combinations of linguistic features to use for this task. Comment: postscript, 9 pages, Proceedings of the Second International Conference on New Methods in Language Processing, Oflazer and Somers ed

    Verb Physics: Relative Physical Knowledge of Actions and Objects

    Learning commonsense knowledge from natural language text is nontrivial due to reporting bias: people rarely state the obvious, e.g., "My house is bigger than me." However, while rarely stated explicitly, this trivial everyday knowledge does influence the way people talk about the world, which provides indirect clues to reason about the world. For example, a statement like, "Tyler entered his house" implies that his house is bigger than Tyler. In this paper, we present an approach to infer relative physical knowledge of actions and objects along five dimensions (e.g., size, weight, and strength) from unstructured natural language text. We frame knowledge acquisition as joint inference over two closely related problems: learning (1) relative physical knowledge of object pairs and (2) physical implications of actions when applied to those object pairs. Empirical results demonstrate that it is possible to extract knowledge of actions and objects from language and that joint inference over different types of knowledge improves performance. Comment: 11 pages, published in Proceedings of ACL 201

    Gradient Metaphoricity of the Preposition in: A Corpus-based Approach to Chinese Academic Writing in English

    In Cognitive Linguistics, a conceptual metaphor is a systematic set of correspondences between two domains of experience (Kövecses 2020: 2). To build a fuller understanding of metaphors, research on metaphoricity (Müller and Tag 2010; Dunn 2011; Jensen and Cuffari 2014; Nacey and Jensen 2017) has emphasized one property of metaphors in language usage: gradience (Hanks 2006; Dunn 2011, 2014), which indicates that the degree to which an expression is metaphorical can be measured. Despite many noteworthy contributions, studies of metaphoricity are often accused of subjectivity (Müller 2008; Jensen and Cuffari 2014; Jensen 2017), which is why this study uses a large corpus as its database. The main aim of this dissertation is therefore to measure objectively the gradient senses of the preposition in, thus mapping its highly systematic semantic extension. Based on these gradient senses, the semantic and syntactic features of the preposition in as produced by advanced Chinese English-major learners are investigated, combining quantitative and qualitative research methods. A quantitative analysis of the literal sense and the ten metaphorical senses of the preposition in is made first. Taking into account the five factors that influence the image schema of each sense (“scale of Landmark”, “visibility”, “path”, “inclusion” and “boundary”), a formula for measuring the degree of metaphoricity is derived: Metaphoricity = [[#Visibility] + [#Path] + [#Inclusion] + [#Boundary]] * [#Scale of Landmark]. The result is that the primary sense has the highest value, 12, and the extended senses have values ranging down to zero. The more features a sense shares with the proto-scene, the higher its value and the less metaphorical the sense.
EVENT and PERSON are the “least metaphoric” (value = 9-11); SITUATION, NUMBER, CONTENT and FIELD are “weak metaphoric” (value = 6-8); SEGMENTATION, TIME and MANNER are “strong metaphoric” (value = 3-5); PURPOSE shares the fewest features with the proto-scene and has the lowest value, so it is “most metaphoric” (value = 0-2). Then a corpus-based approach is employed, which offers a model for corpus-based work in Cognitive Linguistics. It compares two compiled sub-corpora: the Chinese Master Academic Writing Corpus and the Chinese Doctorate Academic Writing Corpus. The findings show that, on the semantic level, Chinese English-major students overuse in with a low level of metaphoricity; even advanced learners rarely use the most metaphorical in. In terms of syntactic behaviour, the most frequent nouns in the [in+noun] construction are weakly metaphoric, whilst the nouns in the construction [in the noun of] carry the EVENT sense, which is least metaphorical. Moreover, action verbs tend to be used in the constructions [verb+in] and [in doing sth.] in both the master and doctorate groups. In the qualitative study, the divergent usages of the preposition in are explored. The preposition in is often substituted with other prepositions, such as on and at. The fundamental reason for the Chinese learners’ weakness is negative transfer from their mother tongue (Wang 2001; Gong 2007; Zhang 2010). Although in and its Chinese equivalent zai...li (在...里) share the same proto-scene, there are discrepancies: the metaphorical senses of the preposition in are TIME, PURPOSE, NUMBER, CONTENT, FIELD, EVENT, SITUATION, SEGMENTATION, MANNER and PERSON, while zai...li (在...里) has only five: TIME, CONTENT, EVENT, SITUATION and PERSON. Thus the image schemata of each sense cannot be mapped one-to-one across the two languages.
This study also provides evidence for the universality and variation of spatial metaphors on the grounds of cultural models. Philosophically, it supports the standpoint of Embodiment philosophy that abstract concepts are constructed on the basis of spatial metaphors that are grounded in physical and cultural experience.
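
The scoring formula and value bands quoted in the abstract above can be illustrated with a minimal Python sketch. The abstract does not state the range of each feature, so the sketch assumes (hypothetically) that the four summed features are each coded 0 or 1 and that the scale of Landmark runs from 0 to 3, which reproduces the stated maximum of 12. The names `SenseFeatures`, `metaphoricity` and `band` are illustrative and do not come from the dissertation.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class SenseFeatures:
    """Image-schema features of one sense of the preposition.

    ASSUMPTION: the 0/1 coding of the first four features and the 0-3
    range of scale_of_landmark are guesses chosen to reach the stated
    maximum of 12; the abstract does not give the actual coding.
    """
    visibility: int
    path: int
    inclusion: int
    boundary: int
    scale_of_landmark: int


def metaphoricity(f: SenseFeatures) -> int:
    """(Visibility + Path + Inclusion + Boundary) * Scale of Landmark."""
    return (f.visibility + f.path + f.inclusion + f.boundary) * f.scale_of_landmark


def band(value: int) -> str:
    """Map a score to the bands named in the abstract (higher = less metaphorical)."""
    if not 0 <= value <= 12:
        raise ValueError("score must lie between 0 and 12")
    if value == 12:
        return "primary sense"
    if value >= 9:
        return "least metaphoric"
    if value >= 6:
        return "weak metaphoric"
    if value >= 3:
        return "strong metaphoric"
    return "most metaphoric"


# Hypothetical feature codings, for illustration only.
proto_scene = SenseFeatures(1, 1, 1, 1, 3)
partial = SenseFeatures(1, 0, 1, 1, 3)
print(metaphoricity(proto_scene), band(metaphoricity(proto_scene)))
print(metaphoricity(partial), band(metaphoricity(partial)))
```

Because the sum is multiplied by the scale of Landmark, a sense with a scale of zero scores zero regardless of its other features, which is consistent with the abstract's statement that extended senses range down to zero.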