7 research outputs found
Don’t Invite BERT to Drink a Bottle: Modeling the Interpretation of Metonymies Using BERT and Distributional Representations
In this work, we carry out two experiments in order to assess the ability of BERT to capture the meaning shift associated with metonymic expressions. We test the model on a new dataset that is representative of the most common types of metonymy. We compare BERT with the Structured Distributional Model (SDM), a model for the representation of words in context which is based on the notion of Generalized Event Knowledge. The results reveal that, while BERT's ability to deal with metonymy is quite limited, SDM is good at predicting the meaning of metonymic expressions, providing support for an account of metonymy based on event knowledge.
Event knowledge in large language models: the gap between the impossible and the unlikely
Word co-occurrence patterns in language corpora contain a surprising amount
of conceptual knowledge. Large language models (LLMs), trained to predict words
in context, leverage these patterns to achieve impressive performance on
diverse semantic tasks requiring world knowledge. An important but understudied
question about LLMs' semantic abilities is whether they acquire generalized
knowledge of common events. Here, we test whether five pre-trained LLMs (from
2018's BERT to 2023's MPT) assign higher likelihood to plausible descriptions
of agent-patient interactions than to minimally different implausible versions
of the same event. Using three curated sets of minimal sentence pairs (total
n=1,215), we found that pre-trained LLMs possess substantial event knowledge,
outperforming other distributional language models. In particular, they almost
always assign higher likelihood to possible vs. impossible events (The teacher
bought the laptop vs. The laptop bought the teacher). However, LLMs show less
consistent preferences for likely vs. unlikely events (The nanny tutored the
boy vs. The boy tutored the nanny). In follow-up analyses, we show that (i) LLM
scores are driven by both plausibility and surface-level sentence features,
(ii) LLM scores generalize well across syntactic variants (active vs. passive
constructions) but less well across semantic variants (synonymous sentences),
(iii) some LLM errors mirror human judgment ambiguity, and (iv) sentence
plausibility serves as an organizing dimension in internal LLM representations.
Overall, our results show that important aspects of event knowledge naturally
emerge from distributional linguistic patterns, but also highlight a gap
between representations of possible/impossible and likely/unlikely events.
Comment: The two lead authors have contributed equally to this work.
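The minimal-pair paradigm described above can be sketched in a few lines: any model that assigns sentence probabilities can be compared on a plausible sentence versus its minimally different implausible variant. The sketch below is a hedged illustration only. It uses a tiny hand-built add-one-smoothed bigram model with invented counts, not the pre-trained LLMs evaluated in the paper, purely to show the scoring and comparison procedure.

```python
import math

# Toy bigram counts standing in for a trained language model.
# In the paper, scores come from pre-trained LLMs (BERT through MPT);
# these hand-picked counts just keep the example self-contained.
BIGRAM_COUNTS = {
    ("the", "teacher"): 50, ("teacher", "bought"): 20,
    ("bought", "the"): 40, ("the", "laptop"): 30,
    ("laptop", "bought"): 1,  # rare: laptops do not buy things
}
UNIGRAM_COUNTS = {"the": 120, "teacher": 50, "bought": 40, "laptop": 31}
VOCAB_SIZE = 1000  # assumed vocabulary size for add-one smoothing


def log_prob(sentence: str) -> float:
    """Add-one-smoothed bigram log-probability of a sentence."""
    tokens = sentence.lower().split()
    lp = 0.0
    for w1, w2 in zip(tokens, tokens[1:]):
        num = BIGRAM_COUNTS.get((w1, w2), 0) + 1
        den = UNIGRAM_COUNTS.get(w1, 0) + VOCAB_SIZE
        lp += math.log(num / den)
    return lp


possible = "the teacher bought the laptop"
impossible = "the laptop bought the teacher"
# A model with event knowledge should prefer the possible event.
print(log_prob(possible) > log_prob(impossible))  # → True
```

The two sentences share all their words, so surface frequency alone cannot separate them; the score difference comes entirely from the word-order-sensitive bigram statistics, which is the point of the minimal-pair design.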
A Structured Distributional Model of Sentence Meaning and Processing
Most compositional distributional semantic models represent sentence meaning with a single vector. In this paper, we propose a Structured Distributional Model (SDM) that combines word embeddings with formal semantics and is based on the assumption that sentences represent events and situations. The semantic representation of a sentence is a formal structure derived from Discourse Representation Theory and containing distributional vectors. This structure is dynamically and incrementally built by integrating knowledge about events and their typical participants, as they are activated by lexical items. Event knowledge is modeled as a graph extracted from parsed corpora and encoding roles and relationships between participants that are represented as distributional vectors. SDM is grounded on extensive psycholinguistic research showing that generalized knowledge about events stored in semantic memory plays a key role in sentence comprehension. We evaluate SDM on two recently introduced compositionality datasets, and our results show that combining a simple compositional model with event knowledge consistently improves performance, even with different types of word embeddings.

1 Sentence Meaning in Vector Spaces
While for decades sentence meaning has been represented in terms of complex formal structures, the most recent trend in computational semantics is to model semantic representations with dense distributional vectors (aka embeddings). As a matter of fact, distributional semantics has become one of the most influential approaches to lexical meaning, because of the important theoretical and computational advantages of representing words with continuous vectors, such as automatically learning lexical representations from natural language corpora and multimodal data, assessing semantic similarity in terms of the distance between the vectors, and dealing with the inherently gradient and fuzzy nature of meaning (Erk 2012, Lenci 2018a).
A structured distributional model of sentence meaning and processing
Most compositional distributional semantic models represent sentence meaning with a single vector. In this paper, we propose a structured distributional model (SDM) that combines word embeddings with formal semantics and is based on the assumption that sentences represent events and situations. The semantic representation of a sentence is a formal structure derived from discourse representation theory and containing distributional vectors. This structure is dynamically and incrementally built by integrating knowledge about events and their typical participants, as they are activated by lexical items. Event knowledge is modelled as a graph extracted from parsed corpora and encoding roles and relationships between participants that are represented as distributional vectors. SDM is grounded on extensive psycholinguistic research showing that generalized knowledge about events stored in semantic memory plays a key role in sentence comprehension. We evaluate SDM on two recently introduced compositionality data sets, and our results show that combining a simple compositional model with event knowledge consistently improves performance, even with different types of word embeddings.
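The core SDM idea, event knowledge stored as a graph of verbs, roles, and typical participants whose nodes carry distributional vectors, can be sketched schematically. Everything in this sketch is invented toy data: the 3-dimensional "embeddings", the single-verb event graph, and the `typicality` helper are hypothetical stand-ins, whereas the actual model extracts its graph and vectors from parsed corpora.

```python
import math

# Toy 3-d "embeddings" (the real model uses corpus-trained vectors).
VEC = {
    "chef": [0.9, 0.1, 0.2], "knife": [0.7, 0.3, 0.1],
    "onion": [0.6, 0.4, 0.0], "book": [0.0, 0.2, 0.9],
}

# Toy event-knowledge graph: verb -> role -> typical fillers.
# SDM extracts such role/filler relations from parsed corpora.
EVENT_GRAPH = {
    "cut": {"agent": ["chef"], "instrument": ["knife"], "patient": ["onion"]},
}


def centroid(words):
    """Average the vectors of a set of typical role fillers."""
    vs = [VEC[w] for w in words]
    return [sum(dim) / len(vs) for dim in zip(*vs)]


def cosine(a, b):
    dot = sum(x * y for x, y in zip(a, b))
    return dot / (math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(x * x for x in b)))


def typicality(verb, role, candidate):
    """Similarity of a candidate filler to the prototype vector
    activated for (verb, role) in the event graph."""
    proto = centroid(EVENT_GRAPH[verb][role])
    return cosine(VEC[candidate], proto)


# "onion" should be a more typical patient of "cut" than "book".
print(typicality("cut", "patient", "onion") > typicality("cut", "patient", "book"))  # → True
```

The graded `typicality` score mirrors the paper's claim that generalized event knowledge supports expectations about upcoming participants during incremental sentence comprehension.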