We present REMARK-LLM, a novel efficient, and robust watermarking framework
designed for texts generated by large language models (LLMs). Synthesizing
human-like content using LLMs necessitates vast computational resources and
extensive datasets, encapsulating critical intellectual property (IP). However,
the generated content is prone to malicious exploitation, including spamming
and plagiarism. To address the challenges, REMARK-LLM proposes three new
components: (i) a learning-based message encoding module to infuse binary
signatures into LLM-generated texts; (ii) a reparameterization module to
transform the dense distributions from the message encoding to the sparse
distribution of the watermarked textual tokens; (iii) a decoding module
dedicated for signature extraction; Furthermore, we introduce an optimized beam
search algorithm to guarantee the coherence and consistency of the generated
content. REMARK-LLM is rigorously trained to encourage the preservation of
semantic integrity in watermarked content, while ensuring effective watermark
retrieval. Extensive evaluations on multiple unseen datasets highlight
REMARK-LLM proficiency and transferability in inserting 2 times more signature
bits into the same texts when compared to prior art, all while maintaining
semantic integrity. Furthermore, REMARK-LLM exhibits better resilience against
a spectrum of watermark detection and removal attacks