105,607 research outputs found

    Survey: Data Mining Techniques in Medical Data Field

    Much current research applies data mining techniques to medical data. Knowledge discovery and data mining have found numerous applications in business and scientific domains. Valuable knowledge can be discovered by applying data mining techniques in healthcare systems. In this study, we briefly examine the potential use of classification-based data mining techniques such as rule-based methods, decision trees, machine learning algorithms like Support Vector Machines and Principal Component Analysis, Rough Set Theory, and fuzzy logic. In particular, we consider a case study using classification techniques on a medical data set of diabetic patients.

    Analyzing the Predictive Power of Machine Learning Models for Autism Detection

    This study delves into the application of machine learning models for the early detection of Autism Spectrum Disorder (ASD). Early diagnosis and intervention are critical for improving the lives of individuals with ASD and their families. This research compares various machine learning models, including Decision Tree, Random Forest, Support Vector Machine, k-Nearest Neighbors, and more, assessing their performance based on key metrics such as F1-Score, accuracy, precision, and recall. The study reveals the Multi-layer Perceptron (MLP) as the top-performing model with an impressive F1-Score of 79.35%, demonstrating its potential for accurate ASD detection. The feature importance analysis highlights the significant roles of gender, genetic predisposition, age at diagnosis, and ethnicity-related features in predicting ASD. This study underscores the promise of machine learning in ASD detection and emphasizes the importance of early intervention and personalized approaches to diagnosis

    IMPLEMENTATION OF PARTICLE SWARM OPTIMIZATION BASED MACHINE LEARNING ALGORITHM FOR STUDENT PERFORMANCE PREDICTION

    Education plays an important role in the development of a country, and educational institutions in particular aim to deliver quality education that improves student performance. Research conducted over the last few decades indicates that the quality of education in Portugal has improved, but statistics show that the failure rate of students in Portugal remains high, especially in Mathematics and Portuguese. Machine learning, a branch of Artificial Intelligence, is considered helpful in education, for example in predicting student performance. However, measuring student performance is a challenge because it depends on several factors, one of which is the relationship between the variables used for prediction. This study aims to find out how machine learning algorithms based on particle swarm optimization can be applied to predict student performance. Using experimental research methods, the empirical results for each model, namely random forest, decision tree, support vector machine, and particle swarm optimization based neural network, indicate that the approach can improve the accuracy of student performance predictions.

    Comparison of supervised machine learning classification techniques in prediction of locoregional recurrences in early oral tongue cancer

    Background: The proper estimate of the risk of recurrences in early-stage oral tongue squamous cell carcinoma (OTSCC) is mandatory for individual treatment-decision making. However, this remains a challenge even for experienced multidisciplinary centers. Objectives: We compared the performance of four machine learning (ML) algorithms for predicting the risk of locoregional recurrences in patients with OTSCC. These algorithms were Support Vector Machine (SVM), Naive Bayes (NB), Boosted Decision Tree (BDT), and Decision Forest (DF). Materials and methods: The study cohort comprised 311 cases from the five university hospitals in Finland and A.C. Camargo Cancer Center, Sao Paulo, Brazil. For comparison of the algorithms, we used the F1 score (the harmonic mean of precision and recall), specificity, and accuracy. These algorithms and their corresponding permutation feature importance (PFI) with the input parameters were externally tested on 59 new cases. Furthermore, we compared the performance of the algorithm that showed the highest prediction accuracy with the prognostic significance of depth of invasion (DOI). Results: The results showed that the average specificity of all the algorithms was 71%. The SVM showed an accuracy of 68% and F1 score of 0.63, NB an accuracy of 70% and F1 score of 0.64, BDT an accuracy of 81% and F1 score of 0.78, and DF an accuracy of 78% and F1 score of 0.70. Additionally, these algorithms outperformed the DOI-based approach, which gave an accuracy of 63%. With PFI-analysis, there was no significant difference in the overall accuracies of three of the algorithms; PFI-BDT accuracy increased to 83.1%, PFI-DF increased to 80%, PFI-SVM decreased to 64.4%, while PFI-NB accuracy increased significantly to 81.4%. Conclusions: Our findings show that the best classification accuracy was achieved with the boosted decision tree algorithm. Additionally, these algorithms outperformed the DOI-based approach.
Furthermore, with the few parameters identified in the PFI analysis, the ML techniques still showed the ability to predict locoregional recurrence. The application of the boosted decision tree machine learning algorithm can stratify OTSCC patients and thus aid in their individual treatment planning.
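The three comparison metrics named in this abstract (F1 score, specificity, accuracy) can be illustrated with a minimal sketch. The function names and the confusion-matrix counts below are hypothetical and chosen only for illustration; they are not taken from the study.

```python
def f1_score(tp: int, fp: int, fn: int) -> float:
    """F1 score: harmonic mean of precision and recall."""
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    if precision + recall == 0.0:
        return 0.0
    return 2 * precision * recall / (precision + recall)


def specificity(tn: int, fp: int) -> float:
    """Fraction of true negatives among all actual negatives."""
    return tn / (tn + fp) if (tn + fp) else 0.0


def accuracy(tp: int, tn: int, fp: int, fn: int) -> float:
    """Fraction of all cases that were classified correctly."""
    total = tp + tn + fp + fn
    return (tp + tn) / total if total else 0.0


# Hypothetical counts, not data from the paper.
tp, tn, fp, fn = 8, 8, 2, 2
print(f1_score(tp, fp, fn), specificity(tn, fp), accuracy(tp, tn, fp, fn))
```

Because F1 ignores true negatives while accuracy counts them, the two can diverge on imbalanced cohorts, which is one reason to report both.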

    Demand Forecasting for Food Production Using Machine Learning Algorithms: A Case Study of University Refectory

    Accurate food demand forecasting is one of the critical aspects of successfully managing restaurants, cafeterias, canteens, and refectories. This paper aims to develop demand forecasting models for a university refectory. Our study focused on the development of Machine Learning-based forecasting models which take into account the calendar effect and meal ingredients to predict the heavy demand for food within a limited timeframe (e.g., lunch) and without pre-booking. We have developed eighteen prediction models gathered under five main techniques. Three Artificial Neural Network models (i.e., Feed Forward, Function Fitting, and Cascade Forward), four Gaussian Process Regression models (i.e., Rational Quadratic, Squared Exponential, Matern 5/2, and Exponential), six Support Vector Regression models (i.e., Linear, Quadratic, Cubic, Fine Gaussian, Medium Gaussian, and Coarse Gaussian), three Regression Tree models (i.e., Fine, Medium, and Coarse), two Ensemble Decision Tree (EDT) models (i.e., Boosted and Bagged), and one Linear Regression model were applied. When evaluated in terms of method diversity, prediction performance, and application area, to the best of our knowledge, this study offers a different contribution from previous studies. The EDT Boosted model obtained the best prediction performance (i.e., Mean Squared Error = 0.51, Mean Absolute Error = 0.50, and R = 0.96).
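The error measures reported in this abstract can be computed as in the following minimal sketch. It assumes that R denotes the Pearson correlation coefficient between observed and predicted demand, which the abstract does not state; the function names and sample values are hypothetical.

```python
import math


def mean_squared_error(y_true, y_pred):
    """Average of squared differences between observed and predicted values."""
    return sum((t - p) ** 2 for t, p in zip(y_true, y_pred)) / len(y_true)


def mean_absolute_error(y_true, y_pred):
    """Average of absolute differences between observed and predicted values."""
    return sum(abs(t - p) for t, p in zip(y_true, y_pred)) / len(y_true)


def pearson_r(y_true, y_pred):
    """Pearson correlation coefficient (assumed meaning of R here)."""
    n = len(y_true)
    mean_t = sum(y_true) / n
    mean_p = sum(y_pred) / n
    cov = sum((t - mean_t) * (p - mean_p) for t, p in zip(y_true, y_pred))
    var_t = sum((t - mean_t) ** 2 for t in y_true)
    var_p = sum((p - mean_p) ** 2 for p in y_pred)
    return cov / math.sqrt(var_t * var_p)


# Hypothetical observed and predicted values, not data from the paper.
observed = [1.0, 2.0, 3.0]
predicted = [1.0, 2.0, 4.0]
print(mean_squared_error(observed, predicted))
print(mean_absolute_error(observed, predicted))
print(pearson_r(observed, predicted))
```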

    Implementation of Supervised Machine Learning on Embedded Raspberry Pi System to Recognize Hand Motion as Preliminary Study for Smart Prosthetic Hand

    EMG signals have random, non-linear, and non-stationary characteristics that require the selection of a suitable feature extraction method and classifier for application to prosthetic hands based on EMG pattern recognition. This research aims to implement EMG pattern recognition on an embedded Raspberry Pi system to recognize hand motion as a preliminary study for a smart prosthetic hand. The contribution of this research is that the time domain feature extraction model and the classifier can be implemented on the Raspberry Pi embedded system. In addition, the machine learning training and evaluation process is carried out online on the Raspberry Pi system. The online training process is carried out by integrating EMG data acquisition hardware, time domain features, classifiers, and motor control in embedded machine learning using Python programming. This study involved ten respondents in good health. EMG signals are collected with two leads, on the flexor carpi radialis and extensor digitorum muscles. Features are extracted from the EMG signals using the time domain features (TDF) mean absolute value (MAV), root mean square (RMS), and variance (VAR), with a window length of 100 ms. The supervised machine learning classifiers decision tree (DT), support vector machine (SVM), and k-nearest neighbor (KNN) are chosen because they have a simple algorithm structure and low computational cost. Finally, the TDF and classifier are embedded in the Raspberry Pi 3 Model B+ microcomputer. Experimental results show that the highest accuracy is obtained in the open class, 97.03%. Furthermore, the additional datasets show a significant difference in accuracy (p-value < 0.05). Based on the evaluation results obtained, the embedded system can be implemented for prosthetic hands based on EMG pattern recognition.
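The windowed time domain features named in this abstract (MAV, RMS, VAR over 100 ms windows) can be sketched as follows. This is an illustration under stated assumptions, not the authors' implementation: the 1000 Hz sampling rate, the non-overlapping windows, the use of the sample variance with an n - 1 denominator, and all function names are hypothetical.

```python
import math


def window_features(samples):
    """Return (MAV, RMS, VAR) for one window of EMG samples."""
    n = len(samples)
    mav = sum(abs(x) for x in samples) / n
    rms = math.sqrt(sum(x * x for x in samples) / n)
    mean = sum(samples) / n
    # Sample variance (n - 1 denominator) is an assumption; the abstract
    # does not specify which variance definition was used.
    var = sum((x - mean) ** 2 for x in samples) / (n - 1) if n > 1 else 0.0
    return mav, rms, var


def sliding_features(signal, fs_hz, window_ms=100):
    """Split a signal into non-overlapping windows and extract features."""
    size = int(fs_hz * window_ms / 1000)
    if size < 1:
        raise ValueError("window is shorter than one sample")
    return [
        window_features(signal[start:start + size])
        for start in range(0, len(signal) - size + 1, size)
    ]


# Hypothetical signal: 200 samples at an assumed 1000 Hz gives two windows.
signal = [1.0 if i % 2 == 0 else -1.0 for i in range(200)]
features = sliding_features(signal, fs_hz=1000)
print(len(features), features[0])
```

Each resulting feature tuple would then be passed to a classifier such as DT, SVM, or KNN.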

    Multi-Class Prediction with Deep Learning Based on Graphics Processing Units

    In its early development, machine learning performed classification using two classes (bi-class), such as class -1 and class +1, 0 and 1, or categories such as true and false. Well-known methods are Artificial Neural Networks (ANN) and Support Vector Machine (SVM). The next development was problems with more than two classes, known as multi-class problems. For SVM, multiple classes are sometimes handled through a staged process similar to a decision tree (DT). Meanwhile, ANN has developed rapidly and is now built with a large number of layers and newer activation functions, such as the rectified linear unit (ReLU) and the probabilistic softmax, along with optimizer methods (adam, sgd, and others). The term then changed to Deep Learning (DL). This study aimed to compare two well-known methods (DL and SVM) in classifying multiple classes. The DL model had six layers with 128, 64, 32, 8, 4, and 3 neurons, while the SVM used a radial basis function kernel with gamma and C of 0.7 and 5, respectively. In addition, this study compares the use of the Graphics Processing Unit (GPU) available on Google Interactive Notebook (Google Colab), an online Python programming application. The results showed that DL slightly outperformed SVM in accuracy but required large computational resources, with accuracies of 99% and 98% for DL and SVM, respectively. However, the use of the GPU can overcome this problem and was shown to increase processing speed by a factor of 47. Keywords: Artificial Neural Networks, Graphics Processing Unit, Google Interactive Notebook, Rectified Linear Units, Support Vector Machine.

    Predicting occupational injury causal factors using text-based analytics : A systematic review

    Workplace accidents can cause catastrophic losses to a company, including human injuries and fatalities. Occupational injury reports may provide a detailed description of how the incidents occurred. Thus, the narrative is useful information for extracting, classifying, and analyzing occupational injuries. This study provides a systematic review of text mining and Natural Language Processing (NLP) applications for extracting text narratives from occupational injury reports. A systematic search was conducted through multiple databases including Scopus, PubMed, and Science Direct. Only original studies that examined the application of machine learning and deep learning-based NLP models for occupational injury analysis were included. A total of 27 out of 210 articles were reviewed by adopting the Preferred Reporting Items for Systematic Reviews and Meta-Analyses (PRISMA). This review highlighted that various machine learning and deep learning-based NLP models, such as K-means, Naïve Bayes, Support Vector Machine, Decision Tree, and K-Nearest Neighbors, were applied to predict occupational injury. In addition to these models, deep neural networks were used to classify the type of accident and identify the causal factors. However, there is a paucity of work using deep learning models to extract information from occupational injury reports. This is because these techniques are very recent and are only beginning to make inroads into decision-making in occupational safety and health as a whole. Despite that, this paper argues that there is large and promising potential in applying NLP and text-based analytics to occupational injury research. Therefore, improving data balancing techniques and developing an automated decision-making support system for occupational injury based on deep learning NLP models are the recommendations given for future research.