We consider the communication of natural language text from a source to a
destination over noiseless and character-erasure channels. We exploit
language's inherent correlations and predictability to constrain transmission
costs by allowing the destination to predict or complete words that may
differ from the source text. Concretely, our objective is to obtain
achievable $(\bar{c}, \bar{s})$ pairs, where $\bar{c}$ is the average
transmission cost at the source and $\bar{s}$ is the average semantic
similarity, measured via the cosine similarity between the vector embeddings
of words at the source and those predicted/completed at the destination.
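As a concrete illustration of the similarity metric, here is a minimal Python
sketch that computes per-word cosine similarity and the resulting average
$\bar{s}$, assuming words have already been mapped to dense embedding vectors
(the choice of embedding model is left open here):

```python
import numpy as np

def cosine_similarity(u: np.ndarray, v: np.ndarray) -> float:
    """Cosine similarity between two word-embedding vectors."""
    return float(np.dot(u, v) / (np.linalg.norm(u) * np.linalg.norm(v)))

def average_similarity(source_vecs, destination_vecs) -> float:
    """Average semantic similarity over aligned source/destination word pairs."""
    sims = [cosine_similarity(u, v) for u, v in zip(source_vecs, destination_vecs)]
    return float(np.mean(sims))
```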
We obtain $(\bar{c}, \bar{s})$ pairs when prediction uses either a neural or a
first-order Markov chain-based small language model (SLM), under two
scheduling policies: a threshold policy, which transmits a word only if its
cosine similarity with the word predicted/completed at the destination is
below a threshold, and a periodic policy, which transmits words at fixed
intervals and predicts/completes the words in between at the destination. We
adopt an SLM for word completion.
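The two policies can be sketched as follows, with `predict` (a
destination-side SLM that guesses the next word from the words fixed so far)
and `embed` (the word-embedding lookup) as hypothetical stand-ins, and cost
counted as one unit per transmitted word; this is an illustrative sketch, not
the paper's exact implementation:

```python
import numpy as np

def cosine(u, v):
    return float(np.dot(u, v) / (np.linalg.norm(u) * np.linalg.norm(v)))

def threshold_policy(source_words, predict, embed, threshold):
    """Transmit a word only when the destination's prediction is not
    similar enough, i.e., cosine similarity falls below `threshold`."""
    received, cost = [], 0
    for word in source_words:
        guess = predict(received)          # prediction from words fixed so far
        if cosine(embed(word), embed(guess)) < threshold:
            received.append(word)          # transmit the true word
            cost += 1
        else:
            received.append(guess)         # keep the destination's prediction
    return received, cost

def periodic_policy(source_words, predict, period):
    """Transmit every `period`-th word; predict the words in between."""
    received, cost = [], 0
    for i, word in enumerate(source_words):
        if i % period == 0:
            received.append(word)
            cost += 1
        else:
            received.append(predict(received))
    return received, cost
```

Note that the threshold rule requires the source to know what the destination
would predict, so in this sketch both ends are assumed to run the same
predictor on the same history.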
We demonstrate that, when communication occurs over a noiseless channel, the
threshold policy achieves a higher $\bar{s}$ for a given $\bar{c}$ than the
periodic policy, and that the $\bar{s}$ achieved with the neural SLM is
greater than or equal to that of the Markov chain-based SLM at the same
$\bar{c}$. This improved performance comes at the cost of higher time and
computational complexity. However, when communication occurs over a
character-erasure channel, all prediction algorithms and scheduling policies
perform poorly. Furthermore, if character-level Huffman coding is used, the
$\bar{c}$ required to achieve a given $\bar{s}$ is reduced, but the above
observations still apply.
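For the coding step, character-level Huffman coding assigns shorter codewords
to more frequent characters, which is what lowers $\bar{c}$. A minimal
self-contained sketch, assuming the code is built from the empirical character
frequencies of the text itself (the frequency model is an assumption here):

```python
import heapq
from collections import Counter

def huffman_code(text: str) -> dict:
    """Character-level Huffman code from empirical frequencies."""
    freq = Counter(text)
    # Heap entries: (weight, unique tiebreaker, {char: codeword-so-far}).
    heap = [(n, i, {ch: ""}) for i, (ch, n) in enumerate(freq.items())]
    heapq.heapify(heap)
    tie = len(heap)
    while len(heap) > 1:
        w1, _, c1 = heapq.heappop(heap)    # two least-frequent subtrees
        w2, _, c2 = heapq.heappop(heap)
        merged = {ch: "0" + cw for ch, cw in c1.items()}
        merged.update({ch: "1" + cw for ch, cw in c2.items()})
        heapq.heappush(heap, (w1 + w2, tie, merged))
        tie += 1
    return heap[0][2]

text = "the cat sat on the mat"
code = huffman_code(text)
bits = sum(len(code[ch]) for ch in text)   # total cost of encoding `text`
```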