Search CORE

4 research outputs found

Comparative Analysis of Urdu Based Stemming Techniques

Author: Muhammad Hassaan Rafiq
Shiza Gul Niazi
Subaika Ali
Publication venue: Lahore Garrison University
Publication date: 28/09/2018

Stemming reduces the many variant forms of a word to its base, stem, or root, a step that many language processing applications require, including those for Urdu. Urdu is a morphologically rich and resourceful language. Multilingual Urdu words are very challenging to process because of their complex morphology. Research on Urdu stemming is about a decade old. The present work introduces research on Urdu stemmers that perform better than the existing Urdu stemmer.
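To make the idea of stemming concrete, here is a minimal suffix-stripping sketch. It is not the method compared in the paper: the suffix list (three common Urdu plural endings), the minimum stem length, and the function name `stem` are all assumptions chosen for illustration, and a real Urdu stemmer would also need prefix handling and exception lists.

```python
# Illustrative suffix-stripping stemmer. The suffix list and the
# minimum stem length are assumptions, not taken from the paper.
SUFFIXES = ["یں", "وں", "اں"]  # hypothetical list of plural endings
MIN_STEM_LEN = 2               # do not strip if the stem would be shorter


def stem(word: str) -> str:
    """Strip the first matching suffix, if the remaining stem is long enough."""
    for suffix in SUFFIXES:
        if word.endswith(suffix) and len(word) - len(suffix) >= MIN_STEM_LEN:
            return word[: -len(suffix)]
    return word  # no suffix matched: return the word unchanged


if __name__ == "__main__":
    for w in ["کتابیں", "کتابوں", "کتاب"]:
        print(w, "->", stem(w))
```

Under these assumptions, the two plural forms of "book" in the usage loop reduce to the same stem as the singular, which is the behavior an information retrieval system relies on when matching query terms to documents.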

Lahore Garrison University Research Journal of Computer Science and Information Technology

Building a Text Collection for Urdu Information Retrieval

Author: Banka Haider
Khan Hamaid M.
Rasheed Imran
Publication venue: Wiley
Publication date: 01/01/2021

Urdu is a widely spoken language in the Indian subcontinent, with over 300 million speakers worldwide. However, linguistic advancements in Urdu are rare compared to those in European and other Asian languages. Therefore, following Text Retrieval Conference standards, we attempted to construct an extensive text collection of 85 304 documents from diverse categories, covering over 52 topics, with relevance judgment sets built at a pool depth of 100. We also present several applications to demonstrate the effectiveness of our collection. Although this collection is primarily intended for text retrieval, it can also be used for named entity recognition, text summarization, and other linguistic applications with suitable modifications. Ours is the most extensive existing collection for the Urdu language, and it will be freely available for future research and academic education.
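The pool depth mentioned in the abstract refers to the pooling procedure commonly used in Text Retrieval Conference style evaluations: for each topic, the top-k documents from every participating system are merged, and only that merged set is sent to assessors for relevance judgment. The sketch below shows the idea for a single topic. The function name `build_pool`, the run names, and the document identifiers are hypothetical and do not come from the paper.

```python
from typing import Dict, List, Set


def build_pool(runs: Dict[str, List[str]], depth: int = 100) -> Set[str]:
    """Union the top-`depth` document IDs of every ranked run for one topic.

    `runs` maps a (hypothetical) system name to its ranked list of
    document IDs, best first. The returned set is what assessors judge.
    """
    pool: Set[str] = set()
    for ranking in runs.values():
        pool.update(ranking[:depth])  # keep only the top-`depth` documents
    return pool


if __name__ == "__main__":
    # Toy rankings from two hypothetical retrieval systems.
    runs = {
        "system_a": ["d1", "d2", "d3"],
        "system_b": ["d2", "d4", "d5"],
    }
    print(sorted(build_pool(runs, depth=2)))
```

Documents outside the pool are normally treated as not relevant, so a deeper pool costs more assessor effort but leaves fewer unjudged documents.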

DSpace@FSM Vakif University