    Event knowledge in large language models: the gap between the impossible and the unlikely

    Word co-occurrence patterns in language corpora contain a surprising amount of conceptual knowledge. Large language models (LLMs), trained to predict words in context, leverage these patterns to achieve impressive performance on diverse semantic tasks requiring world knowledge. An important but understudied question about LLMs' semantic abilities is whether they acquire generalized knowledge of common events. Here, we test whether five pre-trained LLMs (from 2018's BERT to 2023's MPT) assign higher likelihood to plausible descriptions of agent-patient interactions than to minimally different implausible versions of the same event. Using three curated sets of minimal sentence pairs (total n=1,215), we found that pre-trained LLMs possess substantial event knowledge, outperforming other distributional language models. In particular, they almost always assign higher likelihood to possible vs. impossible events (The teacher bought the laptop vs. The laptop bought the teacher). However, LLMs show less consistent preferences for likely vs. unlikely events (The nanny tutored the boy vs. The boy tutored the nanny). In follow-up analyses, we show that (i) LLM scores are driven by both plausibility and surface-level sentence features, (ii) LLM scores generalize well across syntactic variants (active vs. passive constructions) but less well across semantic variants (synonymous sentences), (iii) some LLM errors mirror human judgment ambiguity, and (iv) sentence plausibility serves as an organizing dimension in internal LLM representations. Overall, our results show that important aspects of event knowledge naturally emerge from distributional linguistic patterns, but also highlight a gap between representations of possible/impossible and likely/unlikely events. Comment: The two lead authors have contributed equally to this work

    Event Structure In Vision And Language

    Our visual experience is surprisingly rich: We do not only see low-level properties such as colors or contours; we also see events, or what is happening. Within linguistics, the examination of how we talk about events suggests that relatively abstract elements exist in the mind which pertain to the relational structure of events, including general thematic roles (e.g., Agent), Causation, Motion, and Transfer. For example, “Alex gave Jesse flowers” and “Jesse gave Alex flowers” both refer to an event of transfer, with the directionality of the transfer having different social consequences. The goal of the present research is to examine the extent to which abstract event information of this sort (event structure) is generated in visual perceptual processing. Do we perceive this information, just as we do with more ‘traditional’ visual properties like color and shape? In the first study (Chapter 2), I used a novel behavioral paradigm to show that event roles – who is acting on whom – are rapidly and automatically extracted from visual scenes, even when participants are engaged in an orthogonal task, such as color or gender identification. In the second study (Chapter 3), I provided functional magnetic resonance (fMRI) evidence for commonality in content between neural representations elicited by static snapshots of actions and by full, dynamic action sequences. These two studies suggest that relatively abstract representations of events are spontaneously extracted from sparse visual information. In the final study (Chapter 4), I return to language, the initial inspiration for my investigations of events in vision. Here I test the hypothesis that the human brain represents verbs in part via their associated event structures. 
Using a model of verbs based on event-structure semantic features (e.g., Cause, Motion, Transfer), it was possible to successfully predict fMRI responses in language-selective brain regions as people engaged in real-time comprehension of naturalistic speech. Taken together, my research reveals that in both perception and language, the mind rapidly constructs a representation of the world that includes events with relational structure.

    Understanding and Supporting Vocabulary Learners via Machine Learning on Behavioral and Linguistic Data

    This dissertation presents various machine learning applications for predicting different cognitive states of students while they are using a vocabulary tutoring system, DSCoVAR. We conduct four studies, each of which includes a comprehensive analysis of behavioral and linguistic data and provides data-driven evidence for designing personalized features for the system. The first study presents how behavioral and linguistic interactions from the vocabulary tutoring system can be used to predict students' off-task states. The study identifies which predictive features from interaction signals are more important and examines different types of off-task behaviors. The second study investigates how to automatically evaluate students' partial word knowledge from open-ended responses to definition questions. We present a technique that augments modern word-embedding techniques with a classic semantic differential scaling method from cognitive psychology. We then use this interpretable semantic scale method for predicting students' short- and long-term learning. The third and fourth studies show how to develop a model that can generate more efficient training curricula for both human and machine vocabulary learners. The third study illustrates a deep-learning model to score sentences for a contextual vocabulary learning curriculum. We use pre-trained language models, such as ELMo or BERT, and an additional attention layer to capture how the context words are less or more important with respect to the meaning of the target word. The fourth study examines how the contextual informativeness model, originally designed to develop curricula for human vocabulary learning, can also be used for developing curricula for various word embedding models. We identify sentences predicted as low informative for human learners are also less helpful for machine learning algorithms. 
Having a rich understanding of user behaviors, responses, and learning stimuli is imperative for developing an intelligent online system. Our studies demonstrate interpretable methods with cross-disciplinary approaches to understand various cognitive states of students during learning. The analysis results provide data-driven evidence for designing personalized features that can maximize learning outcomes. Datasets we collected from the studies will be shared publicly to promote future studies related to online tutoring systems. These findings can also be applied to represent different user states observed in other online systems. In the future, we believe our findings can help to implement a more personalized vocabulary learning system, to develop a system that uses non-English texts or different types of inputs, and to investigate how the machine learning outputs interact with students. PHD; Information; University of Michigan, Horace H. Rackham School of Graduate Studies; http://deepblue.lib.umich.edu/bitstream/2027.42/162999/1/sjnam_1.pd

    Detecting Sarcasm in Multimodal Social Platforms

    Sarcasm is a peculiar form of sentiment expression, where the surface sentiment differs from the implied sentiment. The detection of sarcasm in social media platforms has been applied in the past mainly to textual utterances where lexical indicators (such as interjections and intensifiers), linguistic markers, and contextual information (such as user profiles, or past conversations) were used to detect the sarcastic tone. However, modern social media platforms allow users to create multimodal messages where audiovisual content is integrated with the text, making the analysis of a mode in isolation partial. In our work, we first study the relationship between the textual and visual aspects in multimodal posts from three major social media platforms, i.e., Instagram, Tumblr and Twitter, and we run a crowdsourcing task to quantify the extent to which images are perceived as necessary by human annotators. Moreover, we propose two different computational frameworks to detect sarcasm that integrate the textual and visual modalities. The first approach exploits visual semantics trained on an external dataset, and concatenates the semantic features with state-of-the-art textual features. The second method adapts a visual neural network initialized with parameters trained on ImageNet to multimodal sarcastic posts. Results show the positive effect of combining modalities for the detection of sarcasm across platforms and methods. Comment: 10 pages, 3 figures, final version published in the Proceedings of ACM Multimedia 201

    The Processing of Negation and Polarity: An Overview

    Negation is a universal component of human language; polarity sensitivity (i.e., lexical distributional constraints in relation to negation) is arguably so while being pervasive across languages. Negation has long been a field of inquiry in psychological theories and experiments of reasoning, which inspired many follow-up studies of negation and negation-related phenomena in psycholinguistics. In generative theoretical linguistics, negation and polarity sensitivity have been extensively studied, as the related phenomena are situated at the interfaces of syntax, semantics and pragmatics, and are thus extremely revealing about the architecture of grammar. With the now long tradition of research on negation and polarity in psychology and psycholinguistics, and the emerging field of experimental semantics and pragmatics, a multitude of interests and experimental paradigms have emerged which call for re-evaluations and further development and integration. This special issue contains a collection of 16 research articles on the processing of negation and negation-related phenomena including polarity items, questions, conditionals, and irony, using a combination of behavioral (e.g., rating, reading, eye-tracking and sentence completion) and neuroimaging techniques (e.g., EEG). They showcase the processing of negation and polarity with or without context, in various languages and across different populations (adults, typically developing and ADHD children). The integration of multiple theoretical and empirical perspectives in this collection provides new insights, methodological advances and directions for future research. Deutsche Forschungsgemeinschaft (http://dx.doi.org/10.13039/501100001659); Humboldt-Universität zu Berlin (1034); Peer Reviewed

    The Processing of Emotional Sentences by Young and Older Adults: A Visual World Eye-movement Study

    Carminati MN, Knoeferle P. The Processing of Emotional Sentences by Young and Older Adults: A Visual World Eye-movement Study. Presented at the Architectures and Mechanisms of Language and Processing (AMLaP), Riva del Garda, Italy

    Implications of Computational Cognitive Models for Information Retrieval

    This dissertation explores the implications of computational cognitive modeling for information retrieval. The parallel between information retrieval and human memory is that the goal of an information retrieval system is to find the set of documents most relevant to the query whereas the goal for the human memory system is to assess the relevance of items stored in memory given a memory probe (Steyvers & Griffiths, 2010). The two major topics of this dissertation are desirability and information scent. Desirability is the context independent probability of an item receiving attention (Recker & Pitkow, 1996). Desirability has been widely utilized in numerous experiments to model the probability that a given memory item would be retrieved (Anderson, 2007). Information scent is a context dependent measure defined as the utility of an information item (Pirolli & Card, 1996b). Information scent has been widely utilized to predict the memory item that would be retrieved given a probe (Anderson, 2007) and to predict the browsing behavior of humans (Pirolli & Card, 1996b). In this dissertation, I proposed the theory that desirability observed in human memory is caused by preferential attachment in networks. Additionally, I showed that documents accessed in large repositories mirror the observed statistical properties in human memory and that these properties can be used to improve document ranking. Finally, I showed that the combination of information scent and desirability improves document ranking over existing well-established approaches.