2,836 research outputs found

    Text document pre-processing using the Bayes formula for classification based on the vector space model

    This work uses the Bayes formula to vectorize a document as a probability distribution, derived from its keywords, over the categories the document may belong to. The Bayes formula yields one probability for each topic in a predetermined set of topics (categories). With this probability distribution as the vector representing the document, text classification algorithms based on the vector space model, such as the Support Vector Machine (SVM) and the Self-Organizing Map (SOM), can then classify the documents on a multi-dimensional level, improving on the results obtained by using only the highest probability to classify the document, as the naïve Bayes classifier does by itself. The effects of an inadvertent dimensionality reduction can be overcome using these algorithms. We compare the performance of these classifiers for high-dimensional data.
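
    The vectorization step described in this abstract can be sketched roughly as follows. This is a minimal illustration, not the authors' implementation: it assumes a multinomial naïve Bayes model with add-one smoothing, and the function names (`train`, `posterior_vector`) and the toy corpus are hypothetical. The output is the full posterior distribution over categories, which could then be passed to a vector-space classifier such as an SVM instead of keeping only the top category.

```python
import math
from collections import Counter


def train(docs):
    """Count documents per category and words per category.

    docs is a list of (tokens, category) pairs (toy data, assumed format).
    """
    cat_counts = Counter(cat for _, cat in docs)
    word_counts = {cat: Counter() for cat in cat_counts}
    vocab = set()
    for tokens, cat in docs:
        word_counts[cat].update(tokens)
        vocab.update(tokens)
    return cat_counts, word_counts, vocab


def posterior_vector(tokens, cat_counts, word_counts, vocab):
    """Return P(category | tokens) for every category, in sorted category order."""
    total_docs = sum(cat_counts.values())
    log_scores = {}
    for cat in sorted(cat_counts):
        total_words = sum(word_counts[cat].values())
        score = math.log(cat_counts[cat] / total_docs)  # log prior
        for tok in tokens:
            if tok in vocab:  # words never seen in training are ignored
                # Add-one (Laplace) smoothing: an assumption of this sketch.
                score += math.log(
                    (word_counts[cat][tok] + 1) / (total_words + len(vocab))
                )
        log_scores[cat] = score
    # Normalise in log space so the scores become a probability distribution.
    top = max(log_scores.values())
    exp_scores = {cat: math.exp(s - top) for cat, s in log_scores.items()}
    norm = sum(exp_scores.values())
    return [exp_scores[cat] / norm for cat in sorted(exp_scores)]


training = [
    (["goal", "match", "team"], "sport"),
    (["team", "win", "league"], "sport"),
    (["election", "vote", "party"], "politics"),
    (["party", "policy", "vote"], "politics"),
    (["stock", "market", "price"], "finance"),
    (["market", "bank", "rate"], "finance"),
]
cat_counts, word_counts, vocab = train(training)

# Vector order follows sorted category names: finance, politics, sport.
vec = posterior_vector(["team", "match", "vote"], cat_counts, word_counts, vocab)
print(vec)
```

    A naïve Bayes classifier on its own would report only the largest entry ("sport" here). The vector also records that "politics" has a sizeable probability, and that is the extra information a downstream classifier can use.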

    Application of Ensemble Stacking for Multi-Class Classification (Penerapan Ensemble Stacking untuk Klasifikasi Multi Kelas)

    Classification is one of the main topics in machine learning research. Earlier studies produced base classifiers that are still in use today. Many base classifiers perform well on binary classification, but their performance degrades when they are applied to multi-class classification. A previous study used a hybrid classifier for multi-class classification, and its results showed that the proposed hybrid classifier was more accurate than the base classifiers. In this study, the stacking ensemble method is applied, with decision tree and naïve Bayes as the base classifiers. The test results show that the stacking ensemble outperforms the hybrid classifier on only some of the datasets.
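
    A stacking ensemble with a decision tree and naïve Bayes as base classifiers might be set up as in the sketch below, using scikit-learn. Several choices here are assumptions rather than details from the abstract: the logistic-regression meta-learner, the Gaussian variant of naïve Bayes, the tree depth, the 5-fold internal cross-validation, and the Iris dataset, which stands in for the unnamed multi-class datasets.

```python
from sklearn.datasets import load_iris
from sklearn.ensemble import StackingClassifier
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import train_test_split
from sklearn.naive_bayes import GaussianNB
from sklearn.tree import DecisionTreeClassifier

# Iris is a stand-in three-class dataset (assumption, not from the abstract).
X, y = load_iris(return_X_y=True)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.3, random_state=0, stratify=y
)

# Base classifiers named in the abstract; hyperparameters are assumptions.
base_classifiers = [
    ("tree", DecisionTreeClassifier(max_depth=3, random_state=0)),
    ("nb", GaussianNB()),
]

# The meta-learner is trained on out-of-fold predictions of the base classifiers.
stack = StackingClassifier(
    estimators=base_classifiers,
    final_estimator=LogisticRegression(max_iter=1000),  # assumed meta-learner
    cv=5,
)
stack.fit(X_train, y_train)

stack_accuracy = stack.score(X_test, y_test)
print(f"stacking accuracy: {stack_accuracy:.3f}")
```

    Fitting each base classifier separately on the same split and comparing scores gives the kind of per-dataset comparison the abstract reports.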

    Evaluation of Classification Algorithms for Intrusion Detection System: A Review

    Intrusion detection is one of the most critical network security problems in the technology world. Machine learning techniques are being applied to improve the Intrusion Detection System (IDS). To enhance the performance of an IDS, different classification algorithms are used to detect various types of attacks. Choosing a suitable classification algorithm for building an IDS is not an easy task, and the best approach is to test the performance of the candidate algorithms. This paper presents the results of evaluating different classification algorithms for building an IDS model in terms of confusion matrix, accuracy, recall, precision, f-score, specificity and sensitivity. Most researchers, however, have focused on the confusion matrix and the accuracy metric as measures of classification performance. The paper also provides a detailed comparison of the datasets, data preprocessing, number of features selected, feature selection techniques, classification algorithms, and evaluation performance of the algorithms described for intrusion detection systems.
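
    The evaluation metrics listed in this abstract all follow from the four cells of a binary confusion matrix. The sketch below shows the standard definitions. The helper names and the ten toy labels (1 = attack, 0 = normal traffic) are hypothetical and not taken from the paper. Sensitivity is the same quantity as recall, so it is computed once.

```python
def confusion_counts(y_true, y_pred, positive=1):
    """Return (tp, fp, fn, tn) for a binary problem."""
    tp = fp = fn = tn = 0
    for truth, pred in zip(y_true, y_pred):
        if pred == positive and truth == positive:
            tp += 1
        elif pred == positive:
            fp += 1
        elif truth == positive:
            fn += 1
        else:
            tn += 1
    return tp, fp, fn, tn


def binary_metrics(tp, fp, fn, tn):
    """Standard metrics derived from the confusion-matrix cells."""
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0  # also called sensitivity
    specificity = tn / (tn + fp) if tn + fp else 0.0
    f_score = (
        2 * precision * recall / (precision + recall)
        if precision + recall
        else 0.0
    )
    return {
        "accuracy": (tp + tn) / (tp + fp + fn + tn),
        "precision": precision,
        "recall": recall,
        "specificity": specificity,
        "f_score": f_score,
    }


# Toy labels: 1 = attack, 0 = normal traffic (illustrative only).
y_true = [1, 1, 1, 1, 0, 0, 0, 0, 0, 0]
y_pred = [1, 1, 1, 0, 1, 0, 0, 0, 0, 0]

counts = confusion_counts(y_true, y_pred)
metrics = binary_metrics(*counts)
print(counts, metrics)
```

    On imbalanced traffic, where attacks are rare, accuracy alone can look high even when recall is poor, which is one reason to report the full set of metrics rather than accuracy only.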

    Machine Learning Approaches for Heart Disease Detection: A Comprehensive Review

    This paper presents a comprehensive review of the application of machine learning algorithms in the early detection of heart disease. Heart disease remains a leading global health concern, necessitating efficient and accurate diagnostic methods. Machine learning has emerged as a promising approach, offering the potential to enhance diagnostic accuracy and reduce the time required for assessments. This review begins by explaining the fundamentals of machine learning and briefly describes the algorithms most commonly used in heart disease detection. It then examines noteworthy research efforts that have applied machine learning techniques to heart disease diagnosis. A detailed tabular comparison of these studies is also presented, highlighting the strengths and weaknesses of various algorithms and methodologies. This survey underscores the significant strides made in applying machine learning to early heart disease detection and emphasizes the ongoing need for further research to enhance its clinical applicability and efficacy.