
154 research outputs found

Progressive-Hint Prompting Improves Reasoning in Large Language Models

Author: Li Yu
Li Zhenguo
Liu Zhengying
Xie Enze
Zheng Chuanyang
Publication date: 19/04/2023

The performance of Large Language Models (LLMs) in reasoning tasks depends heavily on prompt design, with Chain-of-Thought (CoT) and self-consistency being critical methods that enhance this ability. However, these methods do not fully exploit the answers generated by the LLM to guide subsequent responses. This paper proposes a new prompting method, named Progressive-Hint Prompting (PHP), that enables automatic multiple interactions between users and LLMs by using previously generated answers as hints to progressively guide the model toward the correct answer. PHP is orthogonal to CoT and self-consistency, making it easy to combine with state-of-the-art techniques to further improve performance. We conducted an extensive and comprehensive evaluation to demonstrate the effectiveness of the proposed method. Our experimental results on six benchmarks show that combining CoT and self-consistency with PHP significantly improves accuracy while remaining highly efficient. For instance, with text-davinci-003, we observed a 4.2% improvement on GSM8K with greedy decoding compared to Complex CoT, and a 46.17% reduction in sample paths with self-consistency. With GPT-4 and PHP, we achieve state-of-the-art performance on SVAMP (91.9%), GSM8K (95.5%) and AQuA (79.9%). Comment: Tech Report

arXiv.org e-Print Archive
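
The hint loop described in the Progressive-Hint Prompting abstract above can be sketched roughly as follows. This is a minimal illustration, not the authors' implementation: the function name `progressive_hint_prompting`, the `ask_model` callable, the hint wording, the `max_rounds` cap, and the stopping rule (stop once two consecutive answers agree) are all assumptions made for this sketch.

```python
from typing import Callable, List, Optional


def progressive_hint_prompting(
    question: str,
    ask_model: Callable[[str], str],  # hypothetical: maps a prompt to an answer string
    max_rounds: int = 5,              # assumed safety cap on the number of model calls
) -> Optional[str]:
    """Re-ask the question, feeding earlier answers back as hints.

    Assumed stopping rule: return as soon as the model gives the same
    answer twice in a row, or after max_rounds calls.
    """
    hints: List[str] = []
    previous: Optional[str] = None
    for _ in range(max_rounds):
        prompt = question
        if hints:
            # Assumed hint wording; any phrasing that exposes earlier answers would do.
            prompt += " (Hint: The answer is near to " + ", ".join(hints) + ".)"
        answer = ask_model(prompt).strip()
        if answer == previous:
            return answer  # two consecutive answers agree
        hints.append(answer)
        previous = answer
    return previous  # no agreement reached within the cap
```

Passing a stub for `ask_model` (for example, a function that replays canned answers) is enough to exercise the loop without calling a real model.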

Learning to Prove Trigonometric Identities

Author: Li Lin
Li Yujun
Li Zhenguo
Liu Zhengying
Liu Zhou
Publication date: 14/07/2022

Automatic theorem proving with deep learning methods has attracted attention recently. In this paper, we construct an automatic proof system for trigonometric identities. We define the normalized form of trigonometric identities, design a set of rules for the proof and put forward a method which can generate theoretically infinite trigonometric identities. Our goal is not only to complete the proof, but to complete the proof in as few steps as possible. For this reason, we design a model to learn proof data generated by random BFS (rBFS), and it is proved theoretically and experimentally that the model can outperform rBFS after a simple imitation learning. After further improvement through reinforcement learning, we get AutoTrig, which can give proofs for identities in almost as few steps as BFS (theoretically the shortest method), with a time cost of only one-thousandth. In addition, AutoTrig also beats SymPy, MATLAB and humans on the synthetic dataset, and performs well in many generalization tasks.

arXiv.org e-Print Archive

Backward Reasoning in Large Language Models for Verification

Author: Jiang Weisen
Kwok James T.
Li Zhenguo
Liu Zhengying
Shi Han
Yu Longhui
Zhang Yu
Publication date: 15/08/2023

Chain-of-Thought (CoT) prompting has shown promising performance in various reasoning tasks. Recently, Self-Consistency (Wang et al., 2023) proposes to sample a diverse set of reasoning chains which may lead to different answers while the answer that receives the most votes is selected. In this paper, we propose a novel method to use backward reasoning in verifying candidate answers. We mask a token in the question by x and ask the LLM to predict the masked token when a candidate answer is provided by a simple template, i.e., "If we know the answer of the above question is {a candidate answer}, what is the value of unknown variable x?" Intuitively, the LLM is expected to predict the masked token successfully if the provided candidate answer is correct. We further propose FOBAR to combine forward and backward reasoning for estimating the probability of candidate answers. We conduct extensive experiments on six data sets and three LLMs. Experimental results demonstrate that FOBAR achieves state-of-the-art performance on various reasoning benchmarks. Comment: Preprint

arXiv.org e-Print Archive
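
As a rough illustration of the forward/backward combination mentioned in the abstract above, the sketch below scores candidate answers from two sources: forward votes (self-consistency sampling) and backward checks (how often the masked value x was recovered when a candidate was assumed). The weighted geometric mean, the weight `alpha`, the smoothing term `eps`, and all function names are assumptions for this sketch; the listing does not state the exact formula.

```python
from collections import Counter
from typing import Dict, List


def forward_scores(sampled_answers: List[str]) -> Dict[str, float]:
    """Vote share of each candidate among sampled forward reasoning chains."""
    counts = Counter(sampled_answers)
    total = sum(counts.values())
    return {answer: count / total for answer, count in counts.items()}


def backward_scores(recoveries: Dict[str, int], eps: float = 1e-8) -> Dict[str, float]:
    """Share of successful masked-token recoveries per candidate.

    recoveries[a] is the (hypothetical) number of backward chains that
    predicted the masked value correctly when candidate a was assumed.
    eps is an assumed smoothing term so no candidate scores exactly zero.
    """
    total = sum(recoveries.values()) + eps * len(recoveries)
    return {answer: (count + eps) / total for answer, count in recoveries.items()}


def combine(
    forward: Dict[str, float],
    backward: Dict[str, float],
    alpha: float = 0.5,  # assumed weight between forward and backward evidence
) -> Dict[str, float]:
    """Normalized weighted geometric mean of forward and backward scores."""
    raw = {
        answer: forward[answer] ** alpha * backward.get(answer, 0.0) ** (1 - alpha)
        for answer in forward
    }
    norm = sum(raw.values())
    return {answer: score / norm for answer, score in raw.items()}
```

With made-up counts, forward votes of three for "18" and two for "20" favor "18", but if backward checks succeed once for "18" and four times for "20", the combined score favors "20".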

MetaMath: Bootstrap Your Own Mathematical Questions for Large Language Models

Author: Jiang Weisen
Kwok James T.
Li Zhenguo
Liu Weiyang
Liu Zhengying
Shi Han
Weller Adrian
Yu Jincheng
Yu Longhui
Zhang Yu
Publication date: 09/10/2023

Large language models (LLMs) have pushed the limits of natural language understanding and exhibited excellent problem-solving ability. Despite the great success, most existing open-source LLMs (e.g., LLaMA-2) are still far from satisfactory for solving mathematical problems due to the complex reasoning procedures. To bridge this gap, we propose MetaMath, a fine-tuned language model that specializes in mathematical reasoning. Specifically, we start by bootstrapping mathematical questions by rewriting the question from multiple perspectives without extra knowledge, which results in a new dataset called MetaMathQA. Then we fine-tune the LLaMA-2 models on MetaMathQA. Experimental results on two popular benchmarks (i.e., GSM8K and MATH) for mathematical reasoning demonstrate that MetaMath outperforms a suite of open-source LLMs by a significant margin. Our MetaMath-7B model achieves 66.4% on GSM8K and 19.4% on MATH, exceeding the state-of-the-art models of the same size by 11.5% and 8.7%. Particularly, MetaMath-70B achieves an accuracy of 82.3% on GSM8K, slightly better than GPT-3.5-Turbo. We release the full MetaMathQA dataset, the MetaMath models with different model sizes and the training code for public use. Comment: Technical Report, Work in Progress. Project Page: https://meta-math.github.io

arXiv.org e-Print Archive

Irreversible dual inhibitory mode: the novel Btk inhibitor PLS-123 demonstrates promising anti-tumor activity in human B-cell lymphoma.

Author: Ding Ning
Feng Lixia
Fu Kai
Li Xitao
Pan Zhengying
Ping Lingyan
Shi Yunfei
Song Yuqin
Wu Lina
Zheng Xiaohui
Zhu Jun
Publication venue: DigitalCommons@UNMC
Publication date: 14/04/2015

The B-cell receptor (BCR) signaling pathway has gained significant attention as a therapeutic target in B-cell malignancies. Recently, several drugs that target the BCR signaling pathway, especially the Btk inhibitor ibrutinib, have demonstrated notable therapeutic effects in relapsed/refractory patients, which indicates that pharmacological inhibition of the BCR pathway holds promise in B-cell lymphoma treatment. Here we present a novel covalent irreversible Btk inhibitor PLS-123 with more potent anti-proliferative activity compared with ibrutinib in multiple cellular and in vivo models through effective apoptosis induction and dual-action inhibitory mode of Btk activation. The phosphorylation of BCR downstream activating AKT/mTOR and MAPK signal pathways was also more significantly reduced after treatment with PLS-123 than ibrutinib. Gene expression profile analysis further suggested that the different selectivity profile of PLS-123 led to significant downregulation of oncogenic gene PTPN11 expression, which might also offer new opportunities beyond what ibrutinib has achieved. In addition, PLS-123 dose-dependently attenuated BCR- and chemokine-mediated lymphoma cell adhesion and migration. Taken together, Btk inhibitor PLS-123 suggested a new direction to pharmacologically modulate Btk function and develop novel therapeutic drugs for B-cell lymphoma treatment.

PubMed Central

University of Nebraska Medical Center Research: DigitalCommons@UNMC

LEGO-Prover: Neural Theorem Proving with Growing Libraries

Author: Cao Qingxing
Huang Yinya
Li Lin
Li Zhenguo
Liang Xiaodan
Liao Heng
Liu Zhengying
Shi Han
Wang Haiming
Xie Enze
Xin Huajian
Xiong Jing
Yin Jian
Zheng Chuanyang
Publication date: 27/10/2023

Despite the success of large language models (LLMs), the task of theorem proving remains one of the hardest reasoning tasks and is far from being fully solved. Prior methods using language models have demonstrated promising results, but they still struggle to prove even middle school level theorems. One common limitation of these methods is that they assume a fixed theorem library during the whole theorem proving process. However, as we all know, creating new useful theorems or even new theories is not only helpful but crucial and necessary for advancing mathematics and proving harder and deeper results. In this work, we present LEGO-Prover, which employs a growing skill library containing verified lemmas as skills to augment the capability of LLMs used in theorem proving. By constructing the proof modularly, LEGO-Prover enables LLMs to utilize existing skills retrieved from the library and to create new skills during the proving process. These skills are further evolved (by prompting an LLM) to enrich the library on another scale. Modular and reusable skills are constantly added to the library to enable tackling increasingly intricate mathematical problems. Moreover, the learned library further bridges the gap between human proofs and formal proofs by making it easier to impute missing steps. LEGO-Prover advances the state-of-the-art pass rate on miniF2F-valid (48.0% to 57.0%) and miniF2F-test (45.5% to 47.1%). During the proving process, LEGO-Prover also manages to generate over 20,000 skills (theorems/lemmas) and adds them to the growing library. Our ablation study indicates that these newly added skills are indeed helpful for proving theorems, resulting in an improvement from a success rate of 47.1% to 50.4%. We also release our code and all the generated skills.

arXiv.org e-Print Archive

FIMO: A Challenge Formal Dataset for Automated Theorem Proving

Author: Ju Wei
Li Lin
Liu Chengwu
Liu Qun
Liu Zhengying
Shen Jianhao
Wang Haiming
Xin Huajian
Yin Yichun
Yuan Ye
Zhang Ming
Zheng Chuanyang
Publication date: 08/09/2023

We present FIMO, an innovative dataset comprising formal mathematical problem statements sourced from the International Mathematical Olympiad (IMO) Shortlisted Problems. Designed to facilitate advanced automated theorem proving at the IMO level, FIMO is currently tailored for the Lean formal language. It comprises 149 formal problem statements, accompanied by both informal problem descriptions and their corresponding LaTeX-based informal proofs. Initial experiments with GPT-4 underscore the limitations of current methodologies, indicating that substantial work remains before IMO-level automated theorem proving reaches satisfactory results.

arXiv.org e-Print Archive

Analysis on Wheel-Ground Contact Load Characteristics of Unmanned Off-road Vehicles

Author: Boliang Liu
Feng Ren
Longhai Li
Tao Song
Xun Gong
Yaowu Shi
Zhengying Jiang
Publication venue: International Hellenic University
Publication date: 01/06/2017

Directory of Open Access Journals

TRIGO: Benchmarking Formal Mathematical Proof Reduction for Generative Language Models

Author: Cao Qingxing
Guo Zhijiang
Huang Yinya
Li Lin
Liang Xiaodan
Liu Qun
Liu Zhengying
Shen Jianhao
Wang Haiming
Xiong Jing
Yin Yichun
Yuan Ye
Zhang Ming
Zheng Chuanyang
Publication date: 24/10/2023

Automated theorem proving (ATP) has become an appealing domain for exploring the reasoning ability of the recent successful generative language models. However, current ATP benchmarks mainly focus on symbolic inference, but rarely involve the understanding of complex number combination reasoning. In this work, we propose TRIGO, an ATP benchmark that not only requires a model to reduce a trigonometric expression with step-by-step proofs but also evaluates a generative LM's reasoning ability on formulas and its capability to manipulate, group, and factor number terms. We gather trigonometric expressions and their reduced forms from the web, annotate the simplification process manually, and translate it into the Lean formal language system. We then automatically generate additional examples from the annotated samples to expand the dataset. Furthermore, we develop an automatic generator based on Lean-Gym to create dataset splits of varying difficulties and distributions in order to thoroughly analyze the model's generalization ability. Our extensive experiments show that TRIGO poses a new challenge for advanced generative LMs, including GPT-4, which is pre-trained on a considerable amount of open-source formal theorem-proving language data, and provides a new tool to study generative LMs' ability in both formal and mathematical reasoning. Comment: Accepted by EMNLP 2023. Code is available at https://github.com/menik1126/TRIG

arXiv.org e-Print Archive