In Search of the Long-Tail: Systematic Generation of Long-Tail Knowledge via Logical Rule Guided Search
Since large language models have approached human-level performance on many
tasks, it has become increasingly hard for researchers to find tasks that are
still challenging for the models. Failure cases usually come from the long-tail
distribution: data to which an oracle language model would assign a probability
at the lower end of its distribution. Current methodologies such as prompt
engineering and crowdsourcing are insufficient for creating long-tail examples
because humans are constrained by cognitive biases. We propose a
Logic-Induced-Knowledge-Search (LINK) framework for systematically generating
long-tail knowledge statements. Grounded by a symbolic rule, we search for
long-tail values for each variable of the rule by first prompting an LLM, then
verifying the correctness of the values with a critic, and lastly pushing for
the long-tail distribution with a reranker. With this framework we construct a
dataset, Logic-Induced-Long-Tail (LINT), consisting of 200 symbolic rules and
50K knowledge statements spanning four domains. Human annotations find
that 84% of the statements in LINT are factually correct. In contrast, ChatGPT
and GPT4 struggle with directly generating long-tail statements under the
guidance of logic rules, getting only 56% and 78% of their statements
correct, respectively. Moreover, their "long-tail" generations in fact fall into the higher
likelihood range, and thus are not really long-tail. Our findings suggest that
LINK is effective for generating data in the long-tail distribution while
enforcing quality. LINT can be useful for systematically evaluating LLMs'
capabilities in the long-tail distribution. We challenge the models with a
simple entailment classification task using samples from LINT. We find that
ChatGPT's and GPT4's capability to identify incorrect knowledge drops by ~3% in
the long-tail distribution compared to the head distribution.
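To make the search procedure concrete, the sketch below implements a generic
generate-verify-rerank loop of the kind LINK describes. The helper names
(propose_values, critic_accepts, statement_logprob) are hypothetical stand-ins
for the paper's LLM, critic, and reranker calls, and using raw log-probability
as the long-tail signal is an assumption, not the paper's exact implementation.

```python
# Illustrative sketch of a LINK-style generate-verify-rerank loop.
# All three helpers are hypothetical stand-ins for LLM calls, stubbed here.

def propose_values(rule: str, variable: str, n: int = 20) -> list[str]:
    """Prompt an LLM for candidate groundings of one rule variable (stub)."""
    return [f"{variable}_candidate_{i}" for i in range(n)]

def critic_accepts(rule: str, binding: dict[str, str]) -> bool:
    """Ask a critic model whether the grounded statement is correct (stub)."""
    return True

def statement_logprob(statement: str) -> float:
    """Log-probability of the statement under a scoring LM (stub)."""
    return -0.1 * len(statement)  # placeholder signal

def link_search(rule: str, variables: list[str], keep: int = 5):
    bindings = [{}]
    for var in variables:
        expanded = []
        for b in bindings:
            for value in propose_values(rule, var):
                cand = {**b, var: value}
                if critic_accepts(rule, cand):   # verification step
                    expanded.append(cand)
        # Rerank toward the long tail: keep the lowest-likelihood bindings.
        expanded.sort(key=lambda b: statement_logprob(str(b)))
        bindings = expanded[:keep]
    return bindings

print(link_search("HasPart(x, y) -> UsedFor(x, z)", ["x", "y", "z"]))
```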
A Systematic Investigation of Commonsense Knowledge in Large Language Models
Language models (LMs) trained on large amounts of data have shown impressive
performance on many NLP tasks in zero-shot and few-shot settings. Here we
aim to better understand the extent to which such models learn commonsense
knowledge -- a critical component of many NLP applications. We conduct a
systematic and rigorous zero-shot and few-shot commonsense evaluation of large
pre-trained LMs, where we: (i) carefully control for the LMs' ability to
exploit potential surface cues and annotation artefacts, and (ii) account for
variations in performance that arise from factors that are not related to
commonsense knowledge. Our findings highlight the limitations of pre-trained
LMs in acquiring commonsense knowledge without task-specific supervision;
furthermore, neither using larger models nor few-shot evaluation is sufficient
to achieve human-level commonsense performance.
Comment: Accepted to EMNLP 202
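For illustration, a common zero-shot recipe behind such evaluations scores each
answer choice by its log-likelihood under the LM and picks the highest-scoring
one; averaging over tokens is one simple control for length-based surface cues.
The snippet below is a generic sketch using Hugging Face transformers and
GPT-2, not the paper's exact protocol.

```python
# Generic zero-shot multiple-choice scoring with an LM (illustrative recipe,
# not the paper's exact protocol).
import torch
from transformers import GPT2LMHeadModel, GPT2TokenizerFast

tok = GPT2TokenizerFast.from_pretrained("gpt2")
model = GPT2LMHeadModel.from_pretrained("gpt2").eval()

def choice_score(prompt: str, choice: str) -> float:
    """Length-normalized log-likelihood of `choice` given `prompt`."""
    prompt_len = tok(prompt, return_tensors="pt").input_ids.shape[1]
    full_ids = tok(prompt + choice, return_tensors="pt").input_ids
    with torch.no_grad():
        logits = model(full_ids).logits
    # Position i of the logits predicts token i + 1.
    logprobs = torch.log_softmax(logits[0, :-1], dim=-1)
    choice_ids = full_ids[0, prompt_len:]
    token_lps = logprobs[prompt_len - 1:, :].gather(
        1, choice_ids.unsqueeze(1)).squeeze(1)
    return token_lps.mean().item()  # mean, not sum, to control for length

question = "You can use a fork to"
choices = [" eat spaghetti.", " unlock a door."]
print(max(choices, key=lambda c: choice_score(question, c)))
```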
Critical Role of Leucine-Valine Change in Distinct Low pH Requirements for Membrane Fusion between Two Related Retrovirus Envelopes
Many viruses use a pH-dependent pathway for fusion with the host cell membrane, the mechanism of which is still poorly understood. Here we report that a subtle leucine (Leu)-valine (Val) change at position 501 in the envelope glycoproteins (Envs) of two related retroviruses, jaagsiekte sheep retrovirus (JSRV) and enzootic nasal tumor virus (ENTV), is responsible for their distinct low pH requirements for membrane fusion and infection. The Leu and Val residues are predicted to reside within the C-terminal heptad repeat (HR2) region of the JSRV and ENTV Envs, particularly proximal to the hairpin turn of the putative six-helix bundle (6HB). Substitution of the JSRV Leu with a Val blocked Env-mediated membrane fusion at pH 5.0, whereas replacement of the ENTV Val with a Leu rendered the ENTV Env capable of fusing at pH 5.0. The Leu-Val change has no apparent effect on the stability of the native Env but appears to stabilize an intermediate induced by receptor binding. These results are consistent with the existence of at least two metastable conformations of these viral glycoproteins: the native prefusion conformation and a receptor-induced metastable intermediate. Collectively, this work represents an interesting, perhaps unique, example whereby a simple Leu-Val change has a critical impact on pH-dependent virus fusion and entry.
Editing Commonsense Knowledge in GPT
Memory editing methods for updating encyclopedic knowledge in transformers
have received increasing attention for their efficacy, specificity, and
generalization advantages. However, it remains unclear if such methods can be
adapted for the more nuanced domain of commonsense knowledge. We propose
MEMIT_CSK, an adaptation of MEMIT to edit commonsense mistakes in GPT-2
Large and XL. We extend editing to various token locations and employ a robust
layer selection strategy. Models edited by MEMIT_CSK outperform the
fine-tuning baselines by 10.97% and 10.73% F1 scores on subsets of PEP3k and
20Q. We further propose a novel evaluation dataset, MEMIT-CSK-PROBE, that
contains unaffected neighborhood, affected neighborhood, affected paraphrase,
and affected reasoning challenges. MEMIT_CSK demonstrates favorable
semantic generalization, outperforming fine-tuning baselines by 13.72% and
5.57% overall scores on MEMIT-CSK-PROBE. These results suggest a compelling
future direction of incorporating context-specific user feedback concerning
commonsense in GPT by direct model editing, rectifying and customizing model
behaviors via human-in-the-loop systems.
Comment: Code and data are available at https://github.com/anshitag/memit_cs
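For intuition about how such editing works, methods in the ROME/MEMIT family
modify an MLP weight matrix with a low-rank update so that a chosen key vector
(the subject representation) maps to a new value vector (the edited fact). The
toy below applies a single rank-one edit assuming an identity key covariance;
the actual methods estimate this covariance from data, select layers causally,
and batch many edits, so treat it purely as a sketch.

```python
# Toy rank-one weight edit in the spirit of ROME/MEMIT (illustrative only:
# real methods use an estimated key covariance and edit multiple layers).
import numpy as np

rng = np.random.default_rng(0)
d_key, d_val = 8, 8
W = rng.normal(size=(d_val, d_key))   # an MLP projection inside the LM

k = rng.normal(size=d_key)            # key: representation of the edited subject
v_target = rng.normal(size=d_val)     # value: representation of the new fact

# Rank-one update (identity covariance): afterwards, W_new @ k == v_target.
W_new = W + np.outer(v_target - W @ k, k) / (k @ k)

print(np.allclose(W_new @ k, v_target))        # True: edited fact is stored
k_other = rng.normal(size=d_key)
# Effect on an unrelated key scales with its overlap with k.
print(np.linalg.norm((W_new - W) @ k_other))
```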
UNcommonsense Reasoning: Abductive Reasoning about Uncommon Situations
Language technologies that accurately model the dynamics of events must
perform commonsense reasoning. Existing work evaluating commonsense reasoning
focuses on making inferences about common, everyday situations. To instead
investigate the ability to model unusual, unexpected, and unlikely situations,
we explore the task of uncommonsense abductive reasoning. Given a piece of
context with an unexpected outcome, this task requires reasoning abductively to
generate a natural language explanation that makes the unexpected outcome more
likely in the context. To this end, we curate and release a new English
language corpus called UNcommonsense. We characterize the differences between
the performance of human explainers and the best performing large language
models, finding that model-enhanced human-written explanations achieve the
highest quality by trading off between specificity and diversity. Finally, we
experiment with several online imitation learning algorithms to train open and
accessible language models on this task. When compared with the vanilla
supervised fine-tuning approach, these methods consistently reduce lose rates
on both common and uncommonsense abductive reasoning, as judged by human
evaluators.
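To make the task format concrete, an uncommonsense abductive instance pairs a
context with an unlikely outcome and asks for an explanation bridging the two.
The template below is an illustrative assumption, not the corpus's actual
wording.

```python
# Illustrative prompt format for uncommonsense abductive reasoning
# (the wording is an assumption, not the UNcommonsense corpus's template).
TEMPLATE = (
    "Context: {context}\n"
    "Unexpected outcome: {outcome}\n"
    "Write an explanation that makes the outcome plausible given the context:\n"
)

example = TEMPLATE.format(
    context="Maria studied all week for her driving test.",
    outcome="She arrived at the test center and immediately drove home.",
)
print(example)
```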
Faith and Fate: Limits of Transformers on Compositionality
Transformer large language models (LLMs) have sparked admiration for their
exceptional performance on tasks that demand intricate multi-step reasoning.
Yet, these models simultaneously show failures on surprisingly trivial
problems. This raises the question: Are these errors incidental, or do they
signal more substantial limitations? In an attempt to demystify Transformers,
we investigate the limits of these models across three representative
compositional tasks -- multi-digit multiplication, logic grid puzzles, and a
classic dynamic programming problem. These tasks require breaking problems down
into sub-steps and synthesizing these steps into a precise answer. We formulate
compositional tasks as computation graphs to systematically quantify the level
of complexity, and break down reasoning steps into intermediate sub-procedures.
Our empirical findings suggest that Transformers solve compositional tasks by
reducing multi-step compositional reasoning into linearized subgraph matching,
without necessarily developing systematic problem-solving skills. To round off
our empirical study, we provide theoretical arguments on abstract multi-step
reasoning problems that highlight how Transformers' performance will rapidly
decay with increased task complexity.
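As an illustration of casting a compositional task as a computation graph, the
sketch below builds such a graph for grade-school multi-digit multiplication
(digit products, column sums, and carries as nodes) and reports node count and
depth as complexity measures. It is a generic reconstruction of the idea, not
the paper's code.

```python
# Sketch: multi-digit multiplication as a computation graph, with graph
# size and depth as complexity measures (generic reconstruction of the idea).
class Node:
    def __init__(self, value, parents=()):
        self.value = value
        self.depth = 1 + max((p.depth for p in parents), default=0)

def multiply_graph(x: int, y: int):
    xs = [Node(int(d)) for d in str(x)[::-1]]    # input digits, least first
    ys = [Node(int(d)) for d in str(y)[::-1]]
    nodes = list(xs) + list(ys)
    partials = {}                                # column index -> product nodes
    for i, a in enumerate(xs):
        for j, b in enumerate(ys):
            n = Node(a.value * b.value, (a, b))  # one digit product per pair
            nodes.append(n)
            partials.setdefault(i + j, []).append(n)
    carry = Node(0)
    for col in sorted(partials):
        s = Node(sum(n.value for n in partials[col]) + carry.value,
                 tuple(partials[col]) + (carry,))   # column sum + carry-in
        digit = Node(s.value % 10, (s,))
        carry = Node(s.value // 10, (s,))
        nodes += [s, digit, carry]
    # (final carry digits omitted for brevity)
    return nodes, max(n.depth for n in nodes)

nodes, depth = multiply_graph(37, 48)
print(f"graph size: {len(nodes)} nodes, depth: {depth}")
```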