Large Language Models (LLMs) exhibit impressive reasoning and data
augmentation capabilities in various NLP tasks. However, what about small
models? In this work, we propose TeacherLM-7.1B, a model capable of annotating the
relevant fundamentals, chain of thought, and common mistakes for most NLP samples.
This makes each annotation more than just an answer, allowing other models to learn
"why" rather than merely "what". TeacherLM-7.1B achieved a zero-shot
score of 52.3 on MMLU, surpassing most models with over 100B parameters. Even
more remarkable is its data augmentation ability. Using TeacherLM-7.1B, we
augmented 58 NLP datasets and trained student models of various sizes from the
OPT and BLOOM series in a multi-task setting. The experimental results indicate
that the data augmentation provided by TeacherLM yields significant benefits. We
will release the TeacherLM series of models and the
augmented datasets as open source.