Emergent and Predictable Memorization in Large Language Models

Authors: Quentin Anthony, Stella Biderman, USVSN Sai Prashanth, Shivanshu Purohit, Edward Raff, Hailey Schoelkopf, Lintang Sutawika
Publication date: 21/04/2023

Memorization, or the tendency of large language models (LLMs) to output entire sequences from their training data verbatim, is a key concern for safely deploying language models. In particular, it is vital to minimize a model's memorization of sensitive datapoints such as those containing personally identifiable information (PII). The prevalence of such undesirable memorization can pose issues for model trainers, and may even require discarding an otherwise functional model. We therefore seek to predict which sequences will be memorized before a large model's full train-time by extrapolating the memorization behavior of lower-compute trial runs. We measure memorization of the Pythia model suite, and find that intermediate checkpoints are better predictors of a model's memorization behavior than smaller fully-trained models. We additionally provide further novel discoveries on the distribution of memorization scores across models and data.
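
The prediction setup described in the abstract can be illustrated with a minimal Python sketch. It is not the authors' code: the abstract does not state the exact metric, so the sketch assumes that a sequence counts as memorized when a model's generated continuation matches the training continuation token for token, and that a lower-compute run is scored by precision and recall against the full run. The function names and the toy token data are hypothetical.

```python
from typing import Dict, List, Sequence, Set, Tuple


def is_memorized(generated: Sequence[int], reference: Sequence[int]) -> bool:
    """Assumed criterion: the continuation reproduces the reference verbatim."""
    return len(reference) > 0 and list(generated) == list(reference)


def memorized_ids(
    references: Dict[str, List[int]], outputs: Dict[str, List[int]]
) -> Set[str]:
    """Collect ids of sequences whose output matches the training continuation."""
    return {
        seq_id
        for seq_id, ref in references.items()
        if is_memorized(outputs.get(seq_id, []), ref)
    }


def precision_recall(predicted: Set[str], actual: Set[str]) -> Tuple[float, float]:
    """Score a cheap run's memorized set as a predictor of the full run's set."""
    true_pos = len(predicted & actual)
    precision = true_pos / len(predicted) if predicted else 0.0
    recall = true_pos / len(actual) if actual else 0.0
    return precision, recall


# Hypothetical toy data: token ids of training continuations and of the
# continuations produced by a cheap trial run and by the fully trained model.
references = {"a": [1, 2, 3], "b": [4, 5, 6], "c": [7, 8, 9], "d": [1, 1, 1]}
cheap_run = {"a": [1, 2, 3], "b": [4, 5, 0], "c": [7, 8, 9], "d": [0, 0, 0]}
full_run = {"a": [1, 2, 3], "b": [4, 5, 6], "c": [7, 0, 9], "d": [0, 0, 0]}

cheap_memorized = memorized_ids(references, cheap_run)
full_memorized = memorized_ids(references, full_run)
precision, recall = precision_recall(cheap_memorized, full_memorized)
print(f"precision={precision:.2f} recall={recall:.2f}")
```

Under these assumptions, recall is the fraction of sequences memorized by the full model that the cheaper run flagged in advance, which is the quantity that matters when the goal is to catch sensitive sequences before committing to full training.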

arXiv.org e-Print Archive