BenLLMEval: A Comprehensive Evaluation into the Potentials and Pitfalls
  of Large Language Models on Bengali NLP

Bari, M Saiful; Hoque, Enamul; Islam, Mohammed Saidul; Kabir, Mohsinul; Laskar, Md Tahmid Rahman; Nayeem, Mir Tafseer

Authors: M Saiful Bari, Enamul Hoque, Mohammed Saidul Islam, Mohsinul Kabir, Md Tahmid Rahman Laskar, Mir Tafseer Nayeem
Publication date: 22 September 2023

Abstract

Large Language Models (LLMs) have emerged as one of the most important breakthroughs in natural language processing (NLP) for their impressive skills in language generation and other language-specific tasks. Though LLMs have been evaluated on various tasks, mostly in English, they have not yet undergone thorough evaluation in under-resourced languages such as Bengali (Bangla). In this paper, we evaluate the performance of LLMs for the low-resource Bangla language. We select various important and diverse Bangla NLP tasks, such as abstractive summarization, question answering, paraphrasing, natural language inference, text classification, and sentiment analysis, for zero-shot evaluation with ChatGPT, LLaMA-2, and Claude-2, and compare their performance with state-of-the-art fine-tuned models. Our experimental results demonstrate inferior performance of LLMs on different Bangla NLP tasks, calling for further effort to develop a better understanding of LLMs in low-resource languages like Bangla.

Comment: First two authors contributed equally

arXiv.org e-Print Archive

oai:arXiv.org:2309.13173

Last time updated on 12/10/2023