Masader Plus: A New Interface for Exploring +500 Arabic NLP Datasets
Masader (Alyafeai et al., 2021) created a metadata structure to be used for
cataloguing Arabic NLP datasets. However, developing an easy way to explore
such a catalogue is a challenging task. In order to give the optimal experience
for users and researchers exploring the catalogue, several design and user
experience challenges must be resolved. Furthermore, user interactions with the
website may provide an easy approach to improve the catalogue. In this paper,
we introduce Masader Plus, a web interface for users to browse Masader. We
demonstrate data exploration, filtration, and a simple API that allows users to
examine datasets from the backend. Masader Plus can be explored at
https://arbml.github.io/masader. A video recording explaining the interface is
available at https://www.youtube.com/watch?v=SEtdlSeqchk.
Taqyim: Evaluating Arabic NLP Tasks Using ChatGPT Models
Large language models (LLMs), including ChatGPT, a chat-based model built on
top of LLMs such as GPT-3.5 and GPT-4, have demonstrated impressive performance
on various downstream tasks without requiring fine-tuning. Although languages
other than English make up a smaller proportion of their training data, these
models also exhibit remarkable capabilities in those languages. In this study, we assess the
performance of GPT-3.5 and GPT-4 models on seven distinct Arabic NLP tasks:
sentiment analysis, translation, transliteration, paraphrasing, part of speech
tagging, summarization, and diacritization. Our findings reveal that GPT-4
outperforms GPT-3.5 on five out of the seven tasks. Furthermore, we conduct an
extensive analysis of the sentiment analysis task, providing insights into how
LLMs achieve exceptional results on a challenging dialectal dataset.
Additionally, we introduce a new Python interface, available at
https://github.com/ARBML/Taqyim, that makes these tasks easy to evaluate.