7 research outputs found

    Don’t Invite BERT to Drink a Bottle: Modeling the Interpretation of Metonymies Using BERT and Distributional Representations

    Get PDF
    In this work, we carry out two experiments in order to assess the ability of BERT to capture the meaning shift associated with metonymic expressions. We test the model on a new dataset that is representative of the most common types of metonymy. We compare BERT with the Structured Distributional Model (SDM), a model for the representation of words in context which is based on the notion of Generalized Event Knowledge. The results reveal that, while BERT's ability to deal with metonymy is quite limited, SDM is good at predicting the meaning of metonymic expressions, providing support for an account of metonymy based on event knowledge.
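    As a rough illustration of how this kind of probing can be set up (not the authors' code or materials), the sketch below uses the Hugging Face transformers library and bert-base-uncased to check whether the in-context representation of a metonymic noun drifts toward its intended referent; the sentences and candidate words are invented for illustration.

        # Minimal sketch: compare BERT's contextual vector for a metonymic noun
        # ("bottle" standing for its content) against the vector of a plausible
        # intended referent ("wine"). Model name and sentences are assumptions.
        import torch
        from transformers import AutoTokenizer, AutoModel

        tokenizer = AutoTokenizer.from_pretrained("bert-base-uncased")
        model = AutoModel.from_pretrained("bert-base-uncased")
        model.eval()

        def contextual_vector(sentence, target):
            """Mean of the last-layer hidden states over the target word's subtokens."""
            enc = tokenizer(sentence, return_tensors="pt")
            with torch.no_grad():
                hidden = model(**enc).last_hidden_state[0]          # (seq_len, dim)
            target_ids = tokenizer(target, add_special_tokens=False)["input_ids"]
            tokens = enc["input_ids"][0].tolist()
            # locate the first occurrence of the target's subtoken span
            for i in range(len(tokens) - len(target_ids) + 1):
                if tokens[i:i + len(target_ids)] == target_ids:
                    return hidden[i:i + len(target_ids)].mean(dim=0)
            raise ValueError(f"{target!r} not found in sentence")

        def cosine(a, b):
            return torch.nn.functional.cosine_similarity(a, b, dim=0).item()

        # Does "bottle" in a drinking context sit closer to "wine" than in a literal context?
        metonymic = contextual_vector("Mary drank the whole bottle.", "bottle")
        literal = contextual_vector("Mary washed the empty bottle.", "bottle")
        referent = contextual_vector("Mary drank the wine.", "wine")

        print("metonymic bottle ~ wine:", cosine(metonymic, referent))
        print("literal   bottle ~ wine:", cosine(literal, referent))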

    Event knowledge in large language models: the gap between the impossible and the unlikely

    Full text link
    Word co-occurrence patterns in language corpora contain a surprising amount of conceptual knowledge. Large language models (LLMs), trained to predict words in context, leverage these patterns to achieve impressive performance on diverse semantic tasks requiring world knowledge. An important but understudied question about LLMs' semantic abilities is whether they acquire generalized knowledge of common events. Here, we test whether five pre-trained LLMs (from 2018's BERT to 2023's MPT) assign higher likelihood to plausible descriptions of agent-patient interactions than to minimally different implausible versions of the same event. Using three curated sets of minimal sentence pairs (total n=1,215), we found that pre-trained LLMs possess substantial event knowledge, outperforming other distributional language models. In particular, they almost always assign higher likelihood to possible vs. impossible events (The teacher bought the laptop vs. The laptop bought the teacher). However, LLMs show less consistent preferences for likely vs. unlikely events (The nanny tutored the boy vs. The boy tutored the nanny). In follow-up analyses, we show that (i) LLM scores are driven by both plausibility and surface-level sentence features, (ii) LLM scores generalize well across syntactic variants (active vs. passive constructions) but less well across semantic variants (synonymous sentences), (iii) some LLM errors mirror human judgment ambiguity, and (iv) sentence plausibility serves as an organizing dimension in internal LLM representations. Overall, our results show that important aspects of event knowledge naturally emerge from distributional linguistic patterns, but also highlight a gap between representations of possible/impossible and likely/unlikely events. Comment: The two lead authors have contributed equally to this work.
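    The core comparison can be approximated with a short script: score each member of a minimal pair with a pretrained LM and check which one receives the higher length-normalized log-likelihood. The sketch below is not the paper's evaluation code; it uses GPT-2 via the Hugging Face transformers library as a stand-in for the models tested, with the example pairs taken from the abstract above.

        # Minimal sketch: does a pretrained LM prefer the plausible member of a
        # minimal sentence pair? GPT-2 is an assumed stand-in model here.
        import torch
        from transformers import GPT2LMHeadModel, GPT2TokenizerFast

        tokenizer = GPT2TokenizerFast.from_pretrained("gpt2")
        model = GPT2LMHeadModel.from_pretrained("gpt2")
        model.eval()

        def mean_log_likelihood(sentence):
            """Average per-token log-probability of the sentence under the LM."""
            enc = tokenizer(sentence, return_tensors="pt")
            with torch.no_grad():
                out = model(**enc, labels=enc["input_ids"])
            return -out.loss.item()   # loss is the mean negative log-likelihood

        pairs = [
            # possible vs. impossible
            ("The teacher bought the laptop.", "The laptop bought the teacher."),
            # likely vs. unlikely
            ("The nanny tutored the boy.", "The boy tutored the nanny."),
        ]
        for plausible, implausible in pairs:
            preferred = mean_log_likelihood(plausible) > mean_log_likelihood(implausible)
            print(f"{plausible!r} preferred over its counterpart: {preferred}")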

    A Structured Distributional Model of Sentence Meaning and Processing

    No full text
    Most compositional distributional semantic models represent sentence meaning with a single vector. In this paper, we propose a Structured Distributional Model (SDM) that combines word embeddings with formal semantics and is based on the assumption that sentences represent events and situations. The semantic representation of a sentence is a formal structure derived from Discourse Representation Theory and containing distributional vectors. This structure is dynamically and incrementally built by integrating knowledge about events and their typical participants, as they are activated by lexical items. Event knowledge is modeled as a graph extracted from parsed corpora and encoding roles and relationships between participants that are represented as distributional vectors. SDM is grounded on extensive psycholinguistic research showing that generalized knowledge about events stored in semantic memory plays a key role in sentence comprehension. We evaluate SDM on two recently introduced compositionality datasets, and our results show that combining a simple compositional model with event knowledge consistently improves performance, even with different types of word embeddings.
    1 Sentence Meaning in Vector Spaces
    While for decades sentence meaning has been represented in terms of complex formal structures, the most recent trend in computational semantics is to model semantic representations with dense distributional vectors (aka embeddings). As a matter of fact, distributional semantics has become one of the most influential approaches to lexical meaning, because of the important theoretical and computational advantages of representing words with continuous vectors, such as automatically learning lexical representations from natural language corpora and multimodal data, assessing semantic similarity in terms of the distance between the vectors, and dealing with the inherently gradient and fuzzy nature of meaning (Erk 2012, Lenci 2018a).
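    To make the idea of event knowledge as a weighted graph of typical participants concrete, here is a minimal toy sketch, not the SDM implementation: (verb, role, filler, weight) entries stand in for the corpus-extracted graph, a role's expectation is the weighted centroid of its fillers' vectors, and an observed argument is scored by cosine similarity against that centroid. The vectors and fillers below are invented for illustration.

        # Toy sketch of role expectations from an event-knowledge graph.
        # Vectors and graph entries are made-up stand-ins, not the paper's data.
        import numpy as np

        # toy word vectors (in the paper these are corpus-trained embeddings)
        vectors = {
            "reader":   np.array([0.9, 0.1, 0.0]),
            "student":  np.array([0.8, 0.2, 0.1]),
            "novel":    np.array([0.1, 0.9, 0.2]),
            "magazine": np.array([0.2, 0.8, 0.1]),
            "laptop":   np.array([0.1, 0.2, 0.9]),
        }

        # toy graph: typical fillers of each role of "read", with association weights
        event_graph = {
            ("read", "agent"):   [("reader", 0.7), ("student", 0.3)],
            ("read", "patient"): [("novel", 0.6), ("magazine", 0.4)],
        }

        def role_prototype(verb, role):
            """Weighted centroid of the vectors of a role's typical fillers."""
            fillers = event_graph[(verb, role)]
            return sum(w * vectors[word] for word, w in fillers) / sum(w for _, w in fillers)

        def cosine(a, b):
            return float(a @ b / (np.linalg.norm(a) * np.linalg.norm(b)))

        # how well do candidate arguments fit the expected patient of "read"?
        expected_patient = role_prototype("read", "patient")
        for candidate in ("magazine", "laptop"):
            print(candidate, round(cosine(vectors[candidate], expected_patient), 3))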

    A structured distributional model of sentence meaning and processing

    No full text
    Most compositional distributional semantic models represent sentence meaning with a single vector. In this paper, we propose a structured distributional model (SDM) that combines word embeddings with formal semantics and is based on the assumption that sentences represent events and situations. The semantic representation of a sentence is a formal structure derived from discourse representation theory and containing distributional vectors. This structure is dynamically and incrementally built by integrating knowledge about events and their typical participants, as they are activated by lexical items. Event knowledge is modelled as a graph extracted from parsed corpora and encoding roles and relationships between participants that are represented as distributional vectors. SDM is grounded on extensive psycholinguistic research showing that generalized knowledge about events stored in semantic memory plays a key role in sentence comprehension. We evaluate SDM on two recently introduced compositionality data sets, and our results show that combining a simple compositional model with event knowledge consistently improves performance, even with different types of word embeddings.
