4 research outputs found

    Normalization of common noisy terms in Malaysian online media

    Get PDF
    This paper proposes a normalization technique of noisy terms that occur in Malaysian micro-texts.Noisy terms are common in online messages and influence the results of activities such as text classification and information retrieval.Even though many researchers have study methods to solve this problem, few had looked into the problems using a language other than English. In this study, about 5000 noisy texts were extracted from 15000 documents that were created by the Malaysian.Normalization process was executed using specific translation rules as part or preprocessing steps in opinion mining of movie reviews.The result shows up to 5% improvement in accuracy values of opinion mining

    Sentence-Level Novelty Detection in English and Malay

    No full text
    corecore