Reducing Sequence Length by Predicting Edit Operations with Large Language Models

Kaneko, Masahiro; Okazaki, Naoaki

Publication date: 19 May 2023

Abstract

Large Language Models (LLMs) have demonstrated remarkable performance in various tasks and gained significant attention. LLMs are also used for local sequence transduction tasks, including grammatical error correction (GEC) and formality style transfer, where most tokens in a source text are kept unchanged. However, generating all target tokens is inefficient for two reasons: an error in predicting one target token may derail the prediction of subsequent tokens, and the computational cost grows quadratically with the target sequence length. This paper proposes predicting a set of edit operations on the source text for local sequence transduction tasks. By representing an edit operation as a span of the source text and the changed tokens, we can reduce the length of the target sequence and thus the computational cost of inference. We apply instruction tuning to LLMs on supervision data of edit operations. Experiments show that the proposed method achieves performance comparable to the baseline in four tasks (paraphrasing, formality style transfer, GEC, and text simplification) despite reducing the length of the target text by as little as 21%. Furthermore, we report that instruction tuning with the proposed method achieved state-of-the-art performance in the four tasks.

Comment: Work in progress
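To make the idea of "a span of the source text and changed tokens" concrete, here is a minimal sketch in Python. It is an illustration only, not the authors' implementation: the `(start, end, replacement)` tuple format, the function names `extract_edits` and `apply_edits`, and the use of `difflib` alignment to derive the edits are all assumptions introduced for this example. The abstract does not specify how edit operations are aligned or serialized.

```python
from difflib import SequenceMatcher


def extract_edits(source, target):
    """Derive edit operations that turn `source` tokens into `target` tokens.

    Each edit is a hypothetical (start, end, replacement) tuple: the span
    source[start:end] is replaced by `replacement`. An insertion has
    start == end; a deletion has an empty replacement.
    """
    # autojunk=False disables difflib's heuristic that skips very frequent tokens.
    matcher = SequenceMatcher(a=source, b=target, autojunk=False)
    edits = []
    for tag, i1, i2, j1, j2 in matcher.get_opcodes():
        if tag != "equal":  # unchanged spans need no edit operation
            edits.append((i1, i2, target[j1:j2]))
    return edits


def apply_edits(source, edits):
    """Reconstruct the target tokens from the source tokens and the edits."""
    result = list(source)
    # Apply from right to left so that earlier span indices stay valid.
    for start, end, replacement in sorted(edits, key=lambda e: (e[0], e[1]), reverse=True):
        result[start:end] = replacement
    return result


source = "He go to school yesterday .".split()
target = "He went to school yesterday .".split()

edits = extract_edits(source, target)
print(edits)  # -> [(1, 2, ['went'])]

# The edits are enough to recover the full target from the source.
assert apply_edits(source, edits) == target
```

In this toy example a model would need to emit one edit (two span indices and one token) instead of all six target tokens, which is the kind of length reduction the abstract describes for tasks where most source tokens are kept unchanged.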

Available Versions

arXiv.org e-Print Archive

oai:arXiv.org:2305.11862

Last time updated on 24/05/2023