A new rule pruning text categorisation method

By Fadi Abdeljaber Thabtah, Wa’el Hadi, Hussein Abu-Mansour and T.L. McCluskey

Abstract

Associative classification integrates association rule mining and classification in data mining to build classifiers that are more accurate than those of traditional classification approaches such as greedy and decision tree methods. However, the classifiers produced by associative classification algorithms are usually large and contain insignificant rules. This may degrade classification accuracy and increase classification time; thus, pruning becomes an important task. In this paper, we investigate the problem of rule pruning in text categorisation and propose a new rule pruning technique called High Precedence (HP). Experimental results show that HP derives higher quality and more scalable classifiers than those produced by current pruning methods (lazy and database coverage). In addition, the number of rules generated by the developed pruning procedure is often smaller than that generated by lazy pruning.

Topics: Q1, QA75
Publisher: IEEE
Year: 2010
OAI identifier: oai:eprints.hud.ac.uk:9156

