DeepSeek-Coder: When the Large Language Model Meets Programming -- The Rise of Code Intelligence

Bi, Xiao; Chen, Guanting; Dong, Kai; Guo, Daya; Li, Y. K.; Liang, Wenfeng; Luo, Fuli; Wu, Y.; Xie, Zhenda; Xiong, Yingfei; Yang, Dejian; Zhang, Wentao; Zhu, Qihao

Publication date: 26 January 2024

Abstract

The rapid development of large language models has revolutionized code intelligence in software development. However, the predominance of closed-source models has restricted extensive research and development. To address this, we introduce the DeepSeek-Coder series, a range of open-source code models with sizes from 1.3B to 33B, trained from scratch on 2 trillion tokens. These models are pre-trained on a high-quality project-level code corpus and employ a fill-in-the-blank task with a 16K window to enhance code generation and infilling. Our extensive evaluations demonstrate that DeepSeek-Coder not only achieves state-of-the-art performance among open-source code models across multiple benchmarks but also surpasses existing closed-source models like Codex and GPT-3.5. Furthermore, DeepSeek-Coder models are under a permissive license that allows for both research and unrestricted commercial use.

Available Versions

arXiv.org e-Print Archive

oai:arXiv.org:2401.14196

Last time updated on 24/08/2024