Enhancing Logical Reasoning of Large Language Models through
  Logic-Driven Data Augmentation

Bao, Qiming; Chen, Yang; Deng, Zhenyun; Denny, Paul; Gendron, Gael; Liu, Jiamou; Peng, Alex Yuxuan; Pistotti, Timothy; Tan, Neset; Witbrock, Michael; Young, Nathan; Zhong, Wanjun; Zhu, Yonghua

Authors: Qiming Bao
Yang Chen
Zhenyun Deng
Paul Denny
Gael Gendron
Jiamou Liu
Alex Yuxuan Peng
Timothy Pistotti
Neset Tan
Michael Witbrock
Nathan Young
Wanjun Zhong
Yonghua Zhu
Publication date: 14 October 2023

Abstract

Combining large language models with logical reasoning enhances their capacity to address problems in a robust and reliable manner. Nevertheless, the intricate nature of logical reasoning makes it difficult to gather reliable data from the web for building comprehensive training datasets, which in turn affects performance on downstream tasks. To address this, we introduce a novel logic-driven data augmentation approach, AMR-LDA. AMR-LDA converts the original text into an Abstract Meaning Representation (AMR) graph, a structured semantic representation that encapsulates the logical structure of the sentence, upon which operations are performed to generate logically modified AMR graphs. The modified AMR graphs are subsequently converted back into text to create augmented data. Notably, our methodology is architecture-agnostic: it enhances generative large language models, such as GPT-3.5 and GPT-4, through prompt augmentation, and discriminative large language models through fine-tuning with contrastive learning on logic-driven augmented data. Empirical results show that our method improves performance across seven downstream tasks, including logical reasoning reading comprehension, textual entailment, and natural language inference. Furthermore, our method ranked first on the ReClor leaderboard (https://eval.ai/web/challenges/challenge-page/503/leaderboard/1347). The source code and data are publicly available at https://github.com/Strong-AI-Lab/Logical-Equivalence-driven-AMR-Data-Augmentation-for-Representation-Learning.

Comment: Accepted for oral presentation at the LLM@IJCAI 2023 non-archival symposium
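
The parse, modify, and regenerate idea described in the abstract can be illustrated with a toy sketch. This is not the authors' implementation and does not use an AMR parser: the `Clause` and `Conditional` classes are hypothetical stand-ins for an AMR graph, and contraposition is assumed here as one example of a logic-preserving operation. Pairing an equivalent rewrite with an inequivalent one for contrastive learning is likewise an assumption about how such data might be used.

```python
from dataclasses import dataclass, replace


@dataclass(frozen=True)
class Clause:
    """Toy proposition: subject, predicate, and a negation flag (hypothetical)."""
    subject: str
    predicate: str
    negated: bool = False

    def negate(self) -> "Clause":
        # Flip the negation flag; negating twice restores the original clause.
        return replace(self, negated=not self.negated)

    def to_text(self) -> str:
        verb = "is not" if self.negated else "is"
        return f"{self.subject} {verb} {self.predicate}"


@dataclass(frozen=True)
class Conditional:
    """Toy 'if premise then conclusion' structure standing in for an AMR graph."""
    premise: Clause
    conclusion: Clause

    def to_text(self) -> str:
        return f"If {self.premise.to_text()}, then {self.conclusion.to_text()}."


def contrapose(rule: Conditional) -> Conditional:
    """Logically equivalent rewrite: (A -> B) becomes (not B -> not A)."""
    return Conditional(rule.conclusion.negate(), rule.premise.negate())


def negate_conclusion(rule: Conditional) -> Conditional:
    """Logically inequivalent rewrite: (A -> B) becomes (A -> not B)."""
    return Conditional(rule.premise, rule.conclusion.negate())


original = Conditional(Clause("Alan", "kind"), Clause("Alan", "smart"))
positive = contrapose(original)         # same meaning, different surface form
negative = negate_conclusion(original)  # different meaning, similar surface form

print(original.to_text())
print(positive.to_text())
print(negative.to_text())
```

In this sketch, `positive` would serve as a logically equivalent sample and `negative` as a logically inequivalent one; a real pipeline would replace the toy classes with an AMR parser and an AMR-to-text generator.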

Available Versions

arXiv.org e-Print Archive

oai:arXiv.org:2305.12599

Last time updated on 24/05/2023