
    Lower Perplexity is Not Always Human-Like

    In computational psycholinguistics, various language models have been evaluated against human reading behavior (e.g., eye movements) to build human-like computational models. However, most previous efforts have focused almost exclusively on English, despite the recent trend toward linguistic universals within the general community. To fill this gap, this paper investigates whether established results in computational psycholinguistics generalize across languages. Specifically, we re-examine an established generalization -- the lower perplexity a language model has, the more human-like the language model is -- in Japanese, a language typologically different from English. Our experiments demonstrate that this established generalization exhibits a surprising lack of universality; namely, lower perplexity is not always human-like. We further explore this discrepancy between English and Japanese from the perspective of (non-)uniform information density. Overall, our results suggest that cross-lingual evaluation will be necessary to construct human-like computational models. Comment: Accepted by ACL 202
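The perplexity measure at issue can be sketched in a few lines: perplexity is the exponentiated average negative log-probability a model assigns to the tokens of a text. The per-token log-probabilities below are invented for illustration; the paper's point is precisely that the model with lower perplexity on such sequences need not fit human reading behavior better.

```python
import math

def perplexity(token_logprobs):
    """Perplexity = exp of the mean negative log-probability per token."""
    n = len(token_logprobs)
    return math.exp(-sum(token_logprobs) / n)

# Hypothetical natural-log probabilities two models assign to the same
# four-token sentence (values closer to 0 = more probable).
model_a = [-2.1, -0.9, -1.5, -0.4]
model_b = [-3.0, -1.8, -2.2, -1.1]

# Model A has the lower perplexity, but on the paper's account this
# alone does not make it the more human-like model.
assert perplexity(model_a) < perplexity(model_b)
```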

    Natural Language Understanding: Methodological Conceptualization

    This article presents the results of a theoretical analysis of the phenomenon of natural language understanding (NLU) as a methodological problem. The combination of structural-ontological and informational-psychological approaches made it possible to describe the subject-matter field of NLU as a composite function of the mind that systemically combines the verbal and discursive structural layers. In particular, NLU is presented, on the one hand, as the relation between the discourse of a specific speech message and the meta-discourse of a language, activated in turn by need-motivational factors. On the other hand, it is conceptualized as a process with a specific structure of information metabolism, whose study requires differentiating the affective (emotional) and need-motivational influences on NLU, as well as taking their interaction into account. The article also argues for the hypothesis that needs influence NLU in a scenario similar to the Yerkes-Dodson pattern, and substantiates the theoretical conclusion that emotions fulfill the function of the operator of the structural features of the information metabolism of NLU. Thus, depending on the modality of emotions in the process of NLU, two scenarios for the implementation of information metabolism are distinguished: reductive and synthetic. An argument in favor of the productive and constitutive role of emotions in the process of NLU is also given.

    Voices: a clinical computational psycholinguistic approach to language and hallucinations in schizophrenia spectrum disorders

    Spontaneous speech contains a wealth of information that reflects personal characteristics of the speaker, such as mood, motivation, intelligence, arousal, and variability in word use. Recent advances in Natural Language Processing (NLP) have paved the way for systematic recording and near real-time analysis of quantifiable properties of spoken language. NLP can reliably provide variables relevant to various aspects of brain functioning within seconds, while the cost and effort of speech recording are negligible. In this thesis, we investigated the use of state-of-the-art NLP models to support the diagnosis of psychotic disorders (e.g., schizophrenia). Psychiatric diagnoses are currently not reliable because no objective quantitative biomarkers are available. This is a serious social problem, because incorrect diagnoses lead to over- and under-treatment. NLP analyses of spontaneous speech provide reproducible quantitative assessment. In this thesis, we have shown that acoustic, semantic, and grammatical aspects of language can be quantified and used as markers for psychotic disorders. Based on these analyses, we can say with ~85% certainty whether someone has a psychosis or not. In addition, we have shown that computational language analyses provide clinically relevant insights into the study of auditory verbal hallucinations. In the future, these analyses may be used to detect a relapse into psychosis earlier, so that an impending psychosis can be recognized before people become seriously ill.

    Psychometric Predictive Power of Large Language Models

    Next-word probabilities from language models have been shown to successfully simulate human reading behavior. Building on this, we show that, interestingly, instruction-tuned large language models (LLMs) yield worse psychometric predictive power (PPP) for human reading behavior than base LLMs with equivalent perplexities. In other words, instruction tuning, which helps LLMs provide human-preferred responses, does not always make them human-like from the computational psycholinguistics perspective. In addition, we explore prompting methodologies for simulating human reading behavior with LLMs, showing that prompts reflecting a particular linguistic hypothesis lead LLMs to exhibit better PPP, though still worse than that of base LLMs. These findings highlight that recent instruction tuning and prompting do not offer better estimates than direct probability measurements from base LLMs in cognitive modeling. Comment: 8 page
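PPP is commonly operationalized as how well per-word surprisal derived from a model's next-word probabilities predicts per-word human reading times. A minimal sketch of that link, with invented probabilities and reading times (a real evaluation would use regression on eye-tracking or self-paced-reading corpora):

```python
import math

def surprisal(p):
    """Surprisal in bits: -log2 P(word | context)."""
    return -math.log2(p)

def pearson(xs, ys):
    """Pearson correlation of two equal-length sequences."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sx = math.sqrt(sum((x - mx) ** 2 for x in xs))
    sy = math.sqrt(sum((y - my) ** 2 for y in ys))
    return cov / (sx * sy)

# Hypothetical next-word probabilities from a model, and observed
# per-word reading times (ms) for the same words.
probs = [0.25, 0.05, 0.40, 0.02, 0.10]
rts_ms = [310, 420, 295, 480, 360]

surprisals = [surprisal(p) for p in probs]
ppp = pearson(surprisals, rts_ms)  # higher = surprisal tracks reading times better
```

Comparing this fit statistic across a base LLM and its instruction-tuned counterpart, at matched perplexity, is the kind of contrast the abstract describes.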

    Language Teaching in India: Issues and Innovations

    This collected volume on English language teaching (ELT) in India contains 22 articles written by Indian teachers and researchers. The book has been divided into six sections. The first section—“Problematizing ELT in India”—offers a critical, historical perspective along with innovative ideas for making English language learning and teaching meaningful and purposive in modern India. The second section—“Nature of ELT Materials”—demonstrates how the ELT materials used in Indian classrooms are not embedded in local needs and indigenous contexts. The section emphasizes the importance of developing instructional materials that not only make use of the rich linguistic and cultural resources available in India but also promote effective communication skills among the learners. The third section—“Learner Profiles”—provides interesting insights into the needs, wants, and lacks of Indian learners of English. This section shows how the instruments of needs analysis developed in monocultural and monolingual settings are inadequate for assessing the needs and wants of learners in multilingual and multicultural India. The fourth section—“Classroom Issues”—focuses on certain central issues affecting teaching and learning in the classroom context, particularly the role of native language knowledge and skills that Indian learners bring with them. The fifth section—“Course Evaluation and Teacher Development”—suggests ideas for making teacher education responsive to the changing roles and responsibilities of language teachers. The sixth and final section—“Curriculum Change”—deals with the principles and procedures for curricular changes that are in tune with the evolving knowledge about learning and teaching and the increasing desire for learner control of the process of materials development and evaluation.

    Sentence parsing


    Availability-Based Production Predicts Speakers' Real-time Choices of Mandarin Classifiers

    Speakers often face choices as to how to structure their intended message into an utterance. Here we investigate the influence of contextual predictability on the encoding of linguistic content, as manifested by speaker choice in a classifier language. In English, a numeral modifies a noun directly (e.g., three computers). In classifier languages such as Mandarin Chinese, it is obligatory to use a classifier (CL) with the numeral and the noun (e.g., three CL.machinery computer, three CL.general computer). While different nouns are compatible with different specific classifiers, there is a general classifier "ge" (CL.general) that can be used with most nouns. When the upcoming noun is less predictable, the use of a more specific classifier would reduce surprisal at the noun and thus potentially facilitate comprehension (as predicted by Uniform Information Density; Levy & Jaeger, 2007), but that more specific classifier may be dispreferred from a production standpoint if the general classifier is always available (as predicted by Availability-Based Production; Bock, 1987; Ferreira & Dell, 2000). Here we report a picture-naming experiment showing that Availability-Based Production predicts speakers' real-time choices of Mandarin classifiers. Comment: To appear in proceedings of CogSci 201
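The UID-based prediction can be illustrated with toy numbers: conditioning on a specific classifier narrows the set of plausible upcoming nouns, raising P(noun | context) and so lowering the noun's surprisal. All probabilities below are hypothetical.

```python
import math

# P(noun | context) after each classifier choice (invented values):
# after general "ge", many nouns remain possible; after CL.machinery,
# only machinery nouns like "computer" remain plausible.
p_noun_after_general = 0.01
p_noun_after_specific = 0.20

surprisal_general = -math.log2(p_noun_after_general)    # higher, ~6.6 bits
surprisal_specific = -math.log2(p_noun_after_specific)  # lower, ~2.3 bits

# UID favors the specific classifier here; Availability-Based
# Production can still favor "ge" because it is always accessible.
assert surprisal_specific < surprisal_general
```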

    Linear Logic for Meaning Assembly

    Semantic theories of natural language associate meanings with utterances by providing meanings for lexical items and rules for determining the meaning of larger units given the meanings of their parts. Meanings are often assumed to combine via function application, which works well when constituent structure trees are used to guide semantic composition. However, we believe that the functional structure of Lexical-Functional Grammar is best used to provide the syntactic information necessary for constraining derivations of meaning in a cross-linguistically uniform format. It has been difficult, however, to reconcile this approach with the combination of meanings by function application. In contrast to compositional approaches, we present a deductive approach to assembling meanings, based on reasoning with constraints, which meshes well with the unordered nature of information in the functional structure. Our use of linear logic as a `glue' for assembling meanings allows for a coherent treatment of the LFG requirements of completeness and coherence as well as of modification and quantification.Comment: 19 pages, uses lingmacros.sty, fullname.sty, tree-dvips.sty, latexsym.sty, requires the new version of Late
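The function-application mode of composition that the paper contrasts with its deductive, linear-logic approach can be sketched minimally: lexical meanings are functions, and composing a sentence meaning is just applying them in the order the constituent structure dictates. The lexicon and logical-form strings here are invented for illustration.

```python
# Meanings as callables; composition by function application.
# A transitive verb meaning applies first to its object, then to
# its subject, yielding a logical-form string.
def likes(obj):
    return lambda subj: f"likes({subj}, {obj})"

# "John likes Mary": the tree [john [likes mary]] guides application.
logical_form = likes("mary")("john")
assert logical_form == "likes(john, mary)"
```

The glue-logic approach replaces this fixed application order with deduction over unordered constraints, which is what lets it handle the functional structure's unordered information.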