KwaiYiiMath: Technical Report

Chen, Bin; Chen, Zhengzong; Fu, Jiayi; Gai, Kun; Gao, Xiaoyang; Li, Yan; Liao, Chao; Liao, Yiqiao; Lin, Lei; Lin, Zijia; Liu, Pengli; Liu, Yuliang; Song, Chengru; Wan, Junchen; Wang, Zhongyuan; Yang, Zhirui; Ye, Xucheng; Zhang, Di; Zhang, Fuzheng; Zhang, Shengnan; Zheng, Xue

Publication date: 19 October 2023

Abstract

Recent advancements in large language models (LLMs) have demonstrated remarkable abilities in handling a variety of natural language processing (NLP) downstream tasks, even mathematical tasks requiring multi-step reasoning. In this report, we introduce KwaiYiiMath, which enhances the mathematical reasoning abilities of KwaiYiiBase by applying Supervised Fine-Tuning (SFT) and Reinforcement Learning from Human Feedback (RLHF) on both English and Chinese mathematical tasks. We also construct a small-scale Chinese primary school mathematics test set (named KMath), consisting of 188 examples, to evaluate the correctness of the problem-solving process generated by the models. Empirical studies demonstrate that KwaiYiiMath achieves state-of-the-art (SOTA) performance on GSM8k, CMath, and KMath compared with models of similar size.

Comment: technical report. arXiv admin note: text overlap with arXiv:2306.16636 by other authors

Available Versions

arXiv.org e-Print Archive

oai:arXiv.org:2310.07488

Last time updated on 06/01/2024