Search CORE

23 research outputs found

Recommended from our members

Testing for Grammatical Category Abstraction in Neural Language Models

Author: Kim Najoung
Smolensky Paul
Publication venue: ScholarWorks@UMass Amherst
Publication date: 01/01/2021
Field of study

We propose a new method inspired by human developmental studies to probe pretrained neural language models on their ability to make grammatical category (part-of-speech) abstraction and generalization to novel contexts. Our method does not require training a separate classifier, bypassing the methodological questions raised in the recent literature on the validity of using diagnostic classifiers as probes. The results of our experiment testing BERT-large suggests that it can make category-based generalizations to a degree, but this capacity is still limited in several aspects

ScholarWorks@UMass Amherst

Entity Tracking in Language Models

Author: Kim Najoung
Schuster Sebastian
Publication venue
Publication date: 08/09/2023
Field of study

Keeping track of how states of entities change as a text or dialog unfolds is a key prerequisite to discourse understanding. Yet, there have been few systematic investigations into the ability of large language models (LLMs) to track discourse entities. In this work, we present a task probing to what extent a language model can infer the final state of an entity given an English description of the initial state and a series of state-changing operations. We use this task to first investigate whether Flan-T5, GPT-3 and GPT-3.5 can track the state of entities, and find that only GPT-3.5 models, which have been pretrained on large amounts of code, exhibit this ability. We then investigate whether smaller models pretrained primarily on text can learn to track entities, through finetuning T5 on several training/evaluation splits. While performance degrades for more complex splits, we find that even when evaluated on a different set of entities from training or longer operation sequences, a finetuned model can perform non-trivial entity tracking. Taken together, these results suggest that language models can learn to track entities but pretraining on text corpora alone does not make this capacity surface.Comment: ACL 2023 Camera-read

arXiv.org e-Print Archive

Abstraction via exemplars? A representational case study on lexical category inference in BERT

Author: Kim Najoung
Misra Kanishka
Publication venue
Publication date: 03/11/2023
Field of study

Exemplar based accounts are often considered to be in direct opposition to pure linguistic abstraction in explaining language learners' ability to generalize to novel expressions. However, the recent success of neural network language models on linguistically sensitive tasks suggests that perhaps abstractions can arise via the encoding of exemplars. We provide empirical evidence for this claim by adapting an existing experiment that studies how an LM (BERT) generalizes the usage of novel tokens that belong to lexical categories such as Noun/Verb/Adjective/Adverb from exposure to only a single instance of their usage. We analyze the representational behavior of the novel tokens in these experiments, and find that BERT's capacity to generalize to unseen expressions involving the use of these novel tokens constitutes the movement of novel token representations towards regions of known category exemplars in two-dimensional space. Our results suggest that learners' encoding of exemplars can indeed give rise to abstraction like behavior.Comment: 2-page abstract, to appear in BUCLD4

arXiv.org e-Print Archive

Entity tracking in language models

Author: Kim Najoung
Schuster Sebastian
Publication venue: Association for Computational Linguistics
Publication date: 10/01/2024
Field of study

https://doi.org/10.18653/v1/2023.acl-long.213Published versio

Boston University Institutional Repository (OpenBU)

Compositional Linguistic Generalization in Artificial Neural Networks

Author: Kim Najoung
Publication venue: 'The Busan Gyeongnam Mathematical Society'
Publication date: 21/02/2022
Field of study

Compositionality---the principle that the meaning of a complex expression is built from the meanings of its parts---is considered a central property of human language. This dissertation focuses on compositional generalization, a key benefit of compositionality that enables the production and comprehension of novel expressions. Specifically, this dissertation develops a test for compositional generalization for sequence-to-sequence artificial neural networks (ANNs). Before doing so, I start by developing a test for grammatical category abstraction: an important precondition to compositional generalization, because category membership determines the applicability of compositional rules. Then, I construct a test for compositional generalization based on human generalization patterns discussed in existing linguistic and developmental studies. The test takes the form of semantic parsing (translation from natural language expressions to semantic representations) where the training and generalization sets have systematic gaps that can be filled by composing known parts. The generalization cases fall into two broad categories: lexical and structural, depending on whether generalization to novel combinations of known lexical items and known structures is required, or generalization to novel structures is required. The ANNs evaluated on this test exhibit limited degrees of compositional generalization, implying that the inductive biases of the ANNs and human learners differ substantially. An error analysis reveals that all ANNs tested frequently make generalizations that violate faithfulness constraints (e.g., Emma saw Lina ↝ see'(Emma', Audrey') instead of see'(Emma', Lina')). Adding a glossing task (word-by-word translation)---a task that requires maximally faithful input-output mappings---as an auxiliary objective to the Transformer model (Vaswani et al. 2017) greatly improves generalization, demonstrating that a faithfulness bias can be injected through the auxiliary training approach. However, the improvement is limited to lexical generalization; all models struggle with assigning appropriate semantic representations to novel structures regardless of auxiliary training. This difficulty of structural generalization leaves open questions for both ANN and human learners. I discuss promising directions for improving structural generalization in ANNs, and furthermore propose an artificial language learning study for human subjects analogous to the tests posed to ANNs, which will lead to more detailed characterization of the patterns of structural generalization in human learners

JScholarship

Inverse scaling can become U-shaped

Author: Kim Najoung
Le Quoc
Tay Yi
Wei Jason
Publication venue: Association for Computational Linguistics
Publication date: 10/01/2024
Field of study

https://doi.org/10.18653/v1/2023.emnlp-main.963Published versio

Boston University Institutional Repository (OpenBU)

Reconstruction probing

Author: Khilnani Jatin
Kim Najoung
Qaddoumi Abdelrahim
Warstadt Alex
Publication venue: Association for Computational Linguistics
Publication date: 10/01/2024
Field of study

https://doi.org/10.18653/v1/2023.findings-acl.523Published versio

Boston University Institutional Repository (OpenBU)

(QA)2: Question answering with questionable assumptions: question answering with questionable assumptions

Author: Bowman Samuel R.
Htut Phu Mon
Kim Najoung
Petty Jackson
Publication venue: Association for Computational Linguistics
Publication date: 10/01/2024
Field of study

https://doi.org/10.18653/v1/2023.acl-long.472Published versio

Boston University Institutional Repository (OpenBU)

SLOG: a structural generalization benchmark for semantic parsing

Author: Donatelli Lucia
Kim Najoung
Koller Alexander
Li Bingzhi
Linzen Tal
Yao Yuekun
Publication venue: Association for Computational Linguistics
Publication date: 10/01/2024
Field of study

http://10.0.72.221/v1/2023.emnlp-main.194Published versio

Boston University Institutional Repository (OpenBU)

LAMBADA: Backward Chaining for Automated Reasoning in Natural Language

Author: Bhatia Deepti
Kazemi Seyed Mehran
Kim Najoung
Ramachandran Deepak
Xu Xin
Publication venue
Publication date: 20/12/2022
Field of study

Remarkable progress has been made on automated reasoning with knowledge specified as unstructured, natural text, by using the power of large language models (LMs) coupled with methods such as Chain-of-Thought prompting and Selection-Inference. These techniques search for proofs in the forward direction from axioms to the conclusion, which suffers from a combinatorial explosion of the search space, and thus high failure rates for problems requiring longer chains of reasoning. The classical automated reasoning literature has shown that reasoning in the backward direction (i.e. from the intended conclusion to the set of axioms that support it) is significantly more efficient at proof-finding problems. We import this intuition into the LM setting and develop a Backward Chaining algorithm, which we call LAMBADA, that decomposes reasoning into four sub-modules, each of which can be simply implemented by few-shot prompted LM inference. We show that LAMBADA achieves massive accuracy boosts over state-of-the-art forward reasoning methods on two challenging logical reasoning datasets, particularly when deep and accurate proof chains are required.Comment: 16 page

arXiv.org e-Print Archive

Boston University Institutional Repository (OpenBU)