
31 research outputs found


Pragmatically Informative Color Generation by Grounding Contextual Modifiers

Author: Ong Desmond C.
Wu Zhengxuan
Publication venue: ScholarWorks@UMass Amherst
Publication date: 01/01/2021

Grounding language in contextual information is crucial for fine-grained natural language understanding. One important task that involves grounding contextual modifiers is color generation. Given a reference color "green" and a modifier "bluey", how does one generate a color that could represent "bluey green"? We propose a computational pragmatics model that formulates this color generation task as a recursive game between speakers and listeners. In our model, a pragmatic speaker reasons about the inferences that a listener would make, and thus generates a modified color that is maximally informative to help the listener recover the original referents. In this paper, we show that incorporating pragmatic information provides significant improvements in performance compared with other state-of-the-art deep learning models, which lack pragmatic inference and the flexibility to represent colors from a large continuous space.

ScholarWorks@UMass Amherst

Context-Guided BERT for Targeted Aspect-Based Sentiment Analysis

Author: Ong Desmond C.
Wu Zhengxuan
Publication date: 14/12/2020

Aspect-based sentiment analysis (ABSA) and Targeted ABSA (TABSA) allow finer-grained inferences about sentiment to be drawn from the same text, depending on context. For example, a given text can have different targets (e.g., neighborhoods) and different aspects (e.g., price or safety), with different sentiment associated with each target-aspect pair. In this paper, we investigate whether adding context to self-attention models improves performance on (T)ABSA. We propose two variants of Context-Guided BERT (CG-BERT) that learn to distribute attention under different contexts. We first adapt a context-aware Transformer to produce a CG-BERT that uses context-guided softmax-attention. Next, we propose an improved Quasi-Attention CG-BERT (QACG-BERT) model that learns a compositional attention that supports subtractive attention. We train both models with pretrained BERT on two (T)ABSA datasets: SentiHood and SemEval-2014 (Task 4). Both models achieve new state-of-the-art results, with our QACG-BERT model having the best performance. Furthermore, we provide analyses of the impact of context in our proposed models. Our work provides more evidence for the utility of adding context-dependencies to pretrained self-attention-based language models for context-based natural language tasks.

arXiv.org e-Print Archive

Association for the Advancement of Artificial Intelligence: AAAI Publications

ReCOGS: How Incidental Details of a Logical Form Overshadow an Evaluation of Semantic Interpretation

Author: Manning Christopher D.
Potts Christopher
Wu Zhengxuan
Publication date: 23/03/2023

Compositional generalization benchmarks seek to assess whether models can accurately compute meanings for novel sentences, but operationalize this in terms of logical form (LF) prediction. This raises the concern that semantically irrelevant details of the chosen LFs could shape model performance. We argue that this concern is realized for the COGS benchmark (Kim and Linzen, 2020). COGS poses generalization splits that appear impossible for present-day models, which could be taken as an indictment of those models. However, we show that the negative results trace to incidental features of COGS LFs. Converting these LFs to semantically equivalent ones and factoring out capabilities unrelated to semantic interpretation, we find that even baseline models get traction. A recent variable-free translation of COGS LFs suggests similar conclusions, but we observe this format is not semantically equivalent; it is incapable of accurately representing some COGS meanings. These findings inform our proposal for ReCOGS, a modified version of COGS that comes closer to assessing the target semantic capabilities while remaining very challenging. Overall, our results reaffirm the importance of compositional generalization and careful benchmark task design. Comment: 10 pages, 5 figures

arXiv.org e-Print Archive

MQuAKE: Assessing Knowledge Editing in Language Models via Multi-Hop Questions

Author: Chen Danqi
Manning Christopher D.
Potts Christopher
Wu Zhengxuan
Zhong Zexuan
Publication date: 24/05/2023

The information stored in large language models (LLMs) falls out of date quickly, and retraining from scratch is often not an option. This has recently given rise to a range of techniques for injecting new facts through updating model weights. Current evaluation paradigms are extremely limited, mainly validating the recall of edited facts, but changing one fact should cause rippling changes to the model's related beliefs. If we edit the UK Prime Minister to now be Rishi Sunak, then we should get a different answer to "Who is married to the British Prime Minister?" In this work, we present a benchmark, MQuAKE (Multi-hop Question Answering for Knowledge Editing), comprising multi-hop questions that assess whether edited models correctly answer questions where the answer should change as an entailed consequence of edited facts. While we find that current knowledge-editing approaches can recall edited facts accurately, they fail catastrophically on the constructed multi-hop questions. We thus propose a simple memory-based approach, MeLLo, which stores all edited facts externally while prompting the language model iteratively to generate answers that are consistent with the edited facts. While MQuAKE remains challenging, we show that MeLLo scales well with LLMs (up to 175B) and outperforms previous model editors by a large margin. Comment: Our code and datasets are available at https://github.com/princeton-nlp/MQuAK

arXiv.org e-Print Archive

Higher-order Topological and Nodal Superconductors MS (M = Nb and Ta) Transition-metal Sulfides

Author: An Yipeng
Chen Juncai
Liu Wuming
Ma Chunlan
Wang Jinfeng
Wang Tianxing
Wang Zhengxuan
Wu Ruqian
Yan Yong
Zhou Yinong
Publication date: 10/04/2023

Intrinsic topological superconducting materials are exotic and vital for developing next-generation topological superconducting devices, topological quantum calculations, and quantum information technologies. Here, we predict the topological and nodal superconductivity of MS (M = Nb and Ta) transition-metal sulfides by using density functional theory for superconductors combined with symmetry indicators. We reveal their higher-order topological nature with an index of Z4 = 2. These materials have a higher Tc than the Nb or Ta metal superconductors due to their flat-band and strong electron-phonon coupling nature. Electron doping and lighter isotopes can effectively enhance the Tc. Our findings show that the MS (M = Nb and Ta) systems can be new platforms to study exotic physics in higher-order topological superconductors, and provide theoretical support for utilizing them as topological superconducting devices in the field of advanced topological quantum calculations and information technologies. Comment: 5 pages, 3 figures

arXiv.org e-Print Archive

Dynabench: Rethinking Benchmarking in NLP

Author: Bansal Mohit
Bartolo Max
Geiger Atticus
Jia Robin
Kaushik Divyansh
Kiela Douwe
Ma Zhiyi
Nie Yixin
Potts Christopher
Prasad Grusha
Riedel Sebastian
Ringshia Pratik
Singh Amanpreet
Stenetorp Pontus
Thrush Tristan
Vidgen Bertie
Waseem Zeerak
Williams Adina
Wu Zhengxuan
Publication venue: Association for Computational Linguistics (ACL)
Publication date: 11/06/2021

We introduce Dynabench, an open-source platform for dynamic dataset creation and model benchmarking. Dynabench runs in a web browser and supports human-and-model-in-the-loop dataset creation: annotators seek to create examples that a target model will misclassify, but that another person will not. In this paper, we argue that Dynabench addresses a critical need in our community: contemporary models quickly achieve outstanding performance on benchmark tasks but nonetheless fail on simple challenge examples and falter in real-world scenarios. With Dynabench, dataset creation, model development, and model assessment can directly inform each other, leading to more robust and informative benchmarks. We report on four initial NLP tasks, illustrating these concepts and highlighting the promise of the platform, and address potential objections to dynamic benchmarking as a new standard for the field.

UCL Discovery