Naive Bayes vs. Decision Trees vs. Neural Networks in the Classification of Training Web Pages

By Daniela XHEMALI, Christopher J. HINDE and Roger G. STONE

Abstract

Web classification has been attempted through many different technologies. In this study we concentrate on the comparison of Neural Network (NN), Naïve Bayes (NB) and Decision Tree (DT) classifiers for the automatic analysis and classification of attribute data from training course web pages. We introduce an enhanced NB classifier and run the same data sample through the DT and NN classifiers to determine the success rate of our classifier in the training courses domain. This research shows that our enhanced NB classifier not only outperforms the traditional NB classifier, but also performs as well as, if not better than, some more popular rival techniques. This paper also shows that, overall, our NB classifier is the best choice for the training courses domain, achieving an F-Measure value of over 97%, despite being trained with fewer samples than any of the classification systems we have encountered.
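
For readers unfamiliar with the metric, the F-Measure is conventionally the (weighted) harmonic mean of precision and recall. The abstract does not state which variant or averaging scheme the authors used, so what follows is only a minimal sketch assuming the balanced form (beta = 1); the function name `f_measure` and the example counts are hypothetical and are not taken from the paper.

```python
def f_measure(tp: int, fp: int, fn: int, beta: float = 1.0) -> float:
    """F-beta score computed from true positives, false positives and false negatives."""
    if tp == 0:
        # No correct positive predictions: precision and recall are both zero.
        return 0.0
    precision = tp / (tp + fp)  # share of predicted positives that are correct
    recall = tp / (tp + fn)     # share of actual positives that were found
    b2 = beta * beta
    return (1 + b2) * precision * recall / (b2 * precision + recall)


# Hypothetical counts for illustration only (not results from the paper).
score = f_measure(tp=90, fp=10, fn=10)
print(round(score, 3))
```

With beta = 1 the score weights precision and recall equally; a larger beta would weight recall more heavily.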

Topics: Neural Nets
Publisher: International Journal of Computer Science Issues, IJCSI
Year: 2009
OAI identifier: oai:cogprints.org:6708
