2 research outputs found

    KALIMAT a multipurpose Arabic corpus

    Arabic topic detection using automatic text summarisation

    With the exponential growth of Arabic documents available online, classifying and processing large Arabic corpora has become a challenging task. The noisy information embedded in these documents makes it even more difficult to get accurate results when applying a Topic Detection (TD) process. To address this problem, a proper feature-selection approach is needed to enhance topic detection accuracy. In this paper, we explore the impact of using an automatic summarisation technique along with a feature-selection process to enhance Arabic Topic Detection. We show that automatic summarisation reduces noisy information and significantly enhances the topic detection process, thereby increasing the performance of our TD system. This was achieved through the ability of our summariser system to reduce document size and so speed up the detection process.