AceGPT, Localizing Large Language Models in Arabic

Alharthi, Abdulmohsen; An, Bang; Chen, Junying; Chen, Zhihong; Cheng, Hao; Huang, Huang; Li, Haizhou; Li, Jianquan; Liu, Ziche; Song, Dingjie; Sun, Ruoyu; Sun, Xuening; Wan, Xiang; Wang, Benyou; Xu, Jinchao; Yu, Fei; Zhang, Lian; Zhang, Zhiyi; Zhu, Jianqing

Publication date: 22 September 2023

Abstract

This paper explores the imperative need and methodology for developing a localized Large Language Model (LLM) tailored for Arabic, a language with unique cultural characteristics that are not adequately addressed by current mainstream models like ChatGPT. Key concerns additionally arise when considering cultural sensitivity and local values. To this end, the paper outlines a packaged solution, including further pre-training with Arabic texts, supervised fine-tuning (SFT) using native Arabic instructions and GPT-4 responses in Arabic, and reinforcement learning with AI feedback (RLAIF) using a reward model that is sensitive to local culture and values. The objective is to train culturally aware and value-aligned Arabic LLMs that can serve the diverse application-specific needs of Arabic-speaking communities. Extensive evaluations demonstrate that the resulting LLM, called AceGPT, is the state-of-the-art (SOTA) open Arabic LLM on various benchmarks, including instruction-following benchmarks (Arabic Vicuna-80 and Arabic AlpacaEval), knowledge benchmarks (Arabic MMLU and EXAMs), and the newly proposed Arabic cultural and value alignment benchmark. Notably, AceGPT outperforms ChatGPT on the popular Vicuna-80 benchmark when evaluated with GPT-4, despite the benchmark's limited scale. Code, data, and models are available at https://github.com/FreedomIntelligence/AceGPT.

Available Versions

arXiv.org e-Print Archive

oai:arXiv.org:2309.12053

Last time updated on 12/10/2023