Arabic text summarization using pre-processing methodologies and techniques

Abstract

The growing volume of information available on the web has increased the need for effective tools that summarize text automatically. For English and other European languages, intensive work has been done with high performance, interest in the field has been renewed in recent years with a growing number of studies and products, and research is now moving toward multi-document and multi-language summarization. Arabic, by contrast, has received little attention and limited research in this field. In this research we propose a model that automatically summarizes Arabic text by extraction. The approach involves several steps: preprocessing the text, extracting a set of features from each sentence, classifying sentences with a scoring method, ranking the sentences, and finally generating an extractive summary. The main difference between the proposed system and other Arabic summarization systems is that it considers semantics, entity objects such as names and places, and similarity factors. We adopt Recall-Oriented Understudy for Gisting Evaluation (ROUGE) as the evaluation measure to examine the proposed technique and compare it with state-of-the-art methods. Finally, an experiment on the Essex Arabic Summaries Corpus (EASC) using the ROUGE-1 and ROUGE-2 metrics showed promising results in comparison with existing methods.
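To make the pipeline outlined in the abstract concrete, the following is a minimal sketch of an extractive summarizer together with a recall-style ROUGE-N score. It is not the proposed system: the abstract does not specify its features, so the two used here (average term frequency and sentence position) are stand-in assumptions, and the semantic, entity, and similarity factors are omitted. The stop-word list and all function names are hypothetical and chosen only for illustration.

```python
import re
from collections import Counter

# Hypothetical, deliberately tiny stop-word list (illustration only).
STOP_WORDS = {"في", "من", "على", "إلى", "عن", "و"}


def normalize(text):
    # Preprocessing: strip Arabic diacritics (U+064B..U+0652) and tatweel (U+0640).
    return re.sub(r"[\u064B-\u0652\u0640]", "", text)


def split_sentences(text):
    # Split on Latin and Arabic sentence-ending punctuation.
    parts = re.split(r"[.!?؟]+", text)
    return [p.strip() for p in parts if p.strip()]


def tokenize(sentence):
    # \w is Unicode-aware in Python 3, so it matches Arabic letters.
    return [t for t in re.findall(r"\w+", sentence) if t not in STOP_WORDS]


def summarize(text, k=2):
    """Score, rank, and extract the k best sentences in document order."""
    sentences = split_sentences(normalize(text))
    tokens = [tokenize(s) for s in sentences]
    freq = Counter(t for toks in tokens for t in toks)

    scores = []
    for i, toks in enumerate(tokens):
        # Assumed feature 1: average document frequency of the sentence's terms.
        tf = sum(freq[t] for t in toks) / len(toks) if toks else 0.0
        # Assumed feature 2: earlier sentences get a larger position bonus.
        position = 1.0 / (i + 1)
        scores.append(tf + position)

    # Rank by score, keep the top k, then restore original order.
    top = sorted(range(len(sentences)), key=lambda i: scores[i], reverse=True)[:k]
    return [sentences[i] for i in sorted(top)]


def rouge_n_recall(candidate_tokens, reference_tokens, n=1):
    """Fraction of reference n-grams that also occur in the candidate (clipped)."""
    def ngrams(toks):
        return Counter(tuple(toks[i:i + n]) for i in range(len(toks) - n + 1))

    cand, ref = ngrams(candidate_tokens), ngrams(reference_tokens)
    total = sum(ref.values())
    if total == 0:
        return 0.0
    overlap = sum(min(count, cand[gram]) for gram, count in ref.items())
    return overlap / total
```

Calling `summarize(document, k=2)` returns the two highest-scoring sentences in their original order, and `rouge_n_recall(tokenize(summary), tokenize(reference), n=2)` gives a ROUGE-2 style recall against a single reference summary. Published ROUGE results, including those on EASC, are normally computed with a standard ROUGE toolkit and multiple references, so this function is only a simplified approximation of the metric.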
