Combining large language models with logical reasoning enhances their capacity
to address problems in a robust and reliable manner. Nevertheless, the
intricate nature of logical reasoning poses challenges to gathering reliable
data from the web for building comprehensive training datasets, subsequently
affecting the performance on downstream tasks. To address this, we introduce a
novel logic-driven data augmentation approach, AMR-LDA. AMR-LDA converts the
original text into an Abstract Meaning Representation (AMR) graph, a structured
semantic representation that encapsulates the logic structure of the sentence,
upon which operations are performed to generate logically modified AMR graphs.
The modified AMR graphs are subsequently converted back into texts to create
augmented data. Notably, our methodology is architecture-agnostic: it enhances
generative large language models, such as GPT-3.5 and GPT-4, through prompt
augmentation, and discriminative large language models through fine-tuning with
contrastive learning on logic-driven augmented data. Empirical evidence
underscores the efficacy of our proposed method, with improved performance
across seven downstream tasks, including logical reasoning reading comprehension,
textual entailment, and natural language inference. Furthermore, our method
ranked first on the ReClor leaderboard at
\url{https://eval.ai/web/challenges/challenge-page/503/leaderboard/1347}. The
source code and data are publicly available at
\url{https://github.com/Strong-AI-Lab/Logical-Equivalence-driven-AMR-Data-Augmentation-for-Representation-Learning}.

Comment: Accepted for oral presentation at the LLM@IJCAI 2023 non-archival
symposium.
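
The pipeline described in the abstract (parse text into a structured form, apply a logic-preserving operation, render text again) can be illustrated with a toy sketch. It is not the authors' implementation and does not use real AMR graphs: `Prop`, `Conditional`, `contrapose`, and `augment` are hypothetical names, contraposition is assumed here as one example of a logic-preserving operation, and the way the negative sample is built is likewise an assumption made for illustration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Prop:
    """Toy atomic proposition; stands in for an AMR subgraph (assumption)."""
    text: str
    negated: bool = False

    def negate(self) -> "Prop":
        # Toggling the flag means negating twice returns the original
        # proposition, i.e. double negation is eliminated.
        return Prop(self.text, not self.negated)

    def render(self) -> str:
        if self.negated:
            return f"it is not the case that {self.text}"
        return self.text


@dataclass(frozen=True)
class Conditional:
    """Toy 'if A then B' structure; a real AMR graph is far richer."""
    antecedent: Prop
    consequent: Prop

    def render(self) -> str:
        return f"If {self.antecedent.render()}, then {self.consequent.render()}."


def contrapose(c: Conditional) -> Conditional:
    """Contraposition: (A -> B) is logically equivalent to (not B -> not A)."""
    return Conditional(c.consequent.negate(), c.antecedent.negate())


def augment(c: Conditional) -> dict:
    """Build one equivalent (positive) and one non-equivalent (negative) text.

    The negative sample negates only the consequent, which changes the
    meaning; this construction is an illustrative assumption.
    """
    negative = Conditional(c.antecedent, c.consequent.negate())
    return {
        "original": c.render(),
        "positive": contrapose(c).render(),
        "negative": negative.render(),
    }


if __name__ == "__main__":
    rule = Conditional(Prop("it rains"), Prop("the ground is wet"))
    for label, sentence in augment(rule).items():
        print(f"{label}: {sentence}")
```

In a contrastive setup, the original and positive sentences would be treated as a matching pair and the negative sentence as a mismatch; for prompt augmentation, the positive sentence could be appended to the original input.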