Everyone Deserves A Reward: Learning Customized Human Preferences

Bai, Ke; Cheng, Pengyu; Dai, Yong; Du, Nan; Xie, Jiawen

Authors: Ke Bai, Pengyu Cheng, Yong Dai, Nan Du, Jiawen Xie
Publication date: 6 September 2023

Abstract

Reward models (RMs) are crucial for aligning large language models (LLMs) with human preferences and thereby improving interaction quality. However, the real world is pluralistic, and human preferences diverge across religions, politics, cultures, and so on. Moreover, each individual can have unique preferences on various topics. Current LLM training processes neglect this diversity and use only a general reward model, which is unsatisfactory for customized or personalized application scenarios. To explore customized preference learning, we collect a domain-specific preference (DSP) dataset, which contains preferred responses to each given query from four practical domains. In addition, from the perspective of data efficiency, we propose a three-stage customized RM learning scheme, whose effectiveness we verify empirically on both general preference datasets and our DSP set. Furthermore, we test multiple training and data strategies across the three learning stages and find several ways to better preserve general preference ability while training customized RMs, notably general preference enrichment and customized preference imitation learning. The DSP dataset and code are available at https://github.com/Linear95/DSP

Available Versions

arXiv.org e-Print Archive

oai:arXiv.org:2309.03126

Last time updated on 12/09/2023