Are distributional representations ready for the real world? Evaluating word vectors for grounded perceptual meaning
Distributional word representation methods exploit word co-occurrences to
build compact vector encodings of words. While these representations enjoy
widespread use in modern natural language processing, it is unclear whether
they accurately encode all necessary facets of conceptual meaning. In this
paper, we evaluate how well these representations can predict perceptual and
conceptual features of concrete concepts, drawing on two semantic norm datasets
sourced from human participants. We find that several standard word
representations fail to encode many salient perceptual features of concepts,
and show that these deficits correlate with word-word similarity prediction
errors. Our analyses provide motivation for grounded and embodied language
learning approaches, which may help to remedy these deficits.
Comment: Accepted at RoboNLP 201
Event knowledge in large language models: the gap between the impossible and the unlikely
Word co-occurrence patterns in language corpora contain a surprising amount
of conceptual knowledge. Large language models (LLMs), trained to predict words
in context, leverage these patterns to achieve impressive performance on
diverse semantic tasks requiring world knowledge. An important but understudied
question about LLMs' semantic abilities is whether they acquire generalized
knowledge of common events. Here, we test whether five pre-trained LLMs (from
2018's BERT to 2023's MPT) assign higher likelihood to plausible descriptions
of agent-patient interactions than to minimally different implausible versions
of the same event. Using three curated sets of minimal sentence pairs (total
n=1,215), we found that pre-trained LLMs possess substantial event knowledge,
outperforming other distributional language models. In particular, they almost
always assign higher likelihood to possible vs. impossible events (The teacher
bought the laptop vs. The laptop bought the teacher). However, LLMs show less
consistent preferences for likely vs. unlikely events (The nanny tutored the
boy vs. The boy tutored the nanny). In follow-up analyses, we show that (i) LLM
scores are driven by both plausibility and surface-level sentence features,
(ii) LLM scores generalize well across syntactic variants (active vs. passive
constructions) but less well across semantic variants (synonymous sentences),
(iii) some LLM errors mirror human judgment ambiguity, and (iv) sentence
plausibility serves as an organizing dimension in internal LLM representations.
Overall, our results show that important aspects of event knowledge naturally
emerge from distributional linguistic patterns, but also highlight a gap
between representations of possible/impossible and likely/unlikely events.
Comment: The two lead authors have contributed equally to this work
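The minimal-pair comparison described in this abstract can be sketched as follows. This is an illustrative outline only, not the authors' evaluation code: the function name `minimal_pair_accuracy` is hypothetical, and the scores in `toy_scores` are made-up placeholder values standing in for sentence log-likelihoods from a language model. They are not results from the paper.

```python
from typing import Callable, Iterable, Tuple


def minimal_pair_accuracy(
    pairs: Iterable[Tuple[str, str]],
    score: Callable[[str], float],
) -> float:
    """Fraction of (plausible, implausible) pairs where the plausible
    sentence receives the strictly higher score."""
    pairs = list(pairs)
    if not pairs:
        raise ValueError("need at least one sentence pair")
    correct = sum(
        1 for plausible, implausible in pairs
        if score(plausible) > score(implausible)
    )
    return correct / len(pairs)


# Placeholder scores (assumption): in practice these would be sentence
# log-likelihoods computed by a pre-trained language model.
toy_scores = {
    "The teacher bought the laptop.": -20.0,
    "The laptop bought the teacher.": -35.0,
    "The nanny tutored the boy.": -24.0,
    "The boy tutored the nanny.": -23.5,
}

pairs = [
    ("The teacher bought the laptop.", "The laptop bought the teacher."),
    ("The nanny tutored the boy.", "The boy tutored the nanny."),
]

accuracy = minimal_pair_accuracy(pairs, toy_scores.__getitem__)
print(accuracy)
```

Passing the scorer as a callable keeps the comparison logic independent of any particular model, so the same function could be reused with any sentence-scoring method.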
Classifying Relations using Recurrent Neural Network with Ontological-Concept Embedding
Relation extraction and classification represent a fundamental and challenging aspect of Natural Language Processing (NLP) research, one that depends on other tasks such as entity detection and word sense disambiguation. Traditional relation extraction methods based on pattern matching, using regular expression grammars and lexico-syntactic pattern rules, suffer from several drawbacks, including the labor involved in handcrafting and maintaining a large number of rules that are difficult to reuse. Current research has focused on using neural networks, specifically a type of Recurrent Neural Network (RNN), to improve the accuracy of relation extraction tasks. A promising approach for relation classification uses an RNN that incorporates an ontology-based concept embedding layer in addition to word embeddings. This dissertation presents several improvements to this approach by addressing its main limitations. First, several different types of semantic relationships between concepts are incorporated into the model; prior work has only considered is-a hierarchical relationships. Second, a significantly larger vocabulary of concepts is used. Third, an improved method for concept matching was devised. Adding these improvements to two state-of-the-art baseline models improved accuracy on benchmark data used in prior studies.
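One way to picture a concept embedding layer used alongside word embeddings is shown below. This is a minimal sketch under stated assumptions, not the dissertation's model: the abstract does not say how the two embeddings are combined, so concatenation is assumed here, and the vocabulary, concept names, vector values, and function names are all hypothetical.

```python
from typing import Dict, List

WORD_DIM = 3      # assumed word embedding size
CONCEPT_DIM = 2   # assumed concept embedding size

# Hypothetical lookup tables; real ones would be learned or pre-trained.
word_vectors: Dict[str, List[float]] = {
    "aspirin": [0.1, 0.2, 0.3],
    "treats": [0.4, 0.5, 0.6],
    "headache": [0.7, 0.8, 0.9],
}
concept_vectors: Dict[str, List[float]] = {
    "Drug": [1.0, 0.0],
    "Symptom": [0.0, 1.0],
}
# Concept matching (assumption): a simple token-to-concept dictionary.
token_to_concept: Dict[str, str] = {
    "aspirin": "Drug",
    "headache": "Symptom",
}


def token_input(token: str) -> List[float]:
    """Concatenate the word vector with the matched concept vector.

    Unknown words and tokens without a matched concept get zero vectors.
    """
    word_part = word_vectors.get(token, [0.0] * WORD_DIM)
    concept = token_to_concept.get(token)
    if concept is None:
        concept_part = [0.0] * CONCEPT_DIM
    else:
        concept_part = concept_vectors[concept]
    return word_part + concept_part


def sentence_inputs(tokens: List[str]) -> List[List[float]]:
    """Build the per-token input sequence that an RNN would consume."""
    return [token_input(t) for t in tokens]


inputs = sentence_inputs(["aspirin", "treats", "headache"])
for vector in inputs:
    print(vector)
```

Each token's input has length `WORD_DIM + CONCEPT_DIM`, so tokens without an ontology match still produce a vector of the same shape.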