Search CORE

1,701 research outputs found

Performance of machine learning classifiers in classifying stunting among under-five children in Zambia

Author: Chilengi Roma
Chilengi Roma
Chilyabanyama Obvious Nchimunya
Chirwa Masuzyo
Chisenga Caroline C
Hamusonde Kalongo
Iqbal Najeeha Talat
Ngaruye Innocent
Saroj Rakesh Kumar
Simuyandi Michelo
Publication venue: eCommons@AKU
Publication date: 20/07/2022
Field of study

Stunting is a global public health issue. We sought to train and evaluate machine learning (ML) classification algorithms on the Zambia Demographic Health Survey (ZDHS) dataset to predict stunting among children under the age of five in Zambia. We applied Logistic regression (LR), Random Forest (RF), SV classification (SVC), XG Boost (XgB) and Naïve Bayes (NB) algorithms to predict the probability of stunting among children under five years of age, on the 2018 ZDHS dataset. We calibrated predicted probabilities and plotted the calibration curves to compare model performance. We computed accuracy, recall, precision and F1 for each machine learning algorithm. About 2327 (34.2%) children were stunted. Thirteen of fifty-eight features were selected for inclusion in the model using random forest. Calibrating the predicted probabilities improved the performance of machine learning algorithms when evaluated using calibration curves. RF was the most accurate algorithm, with an accuracy score of 79% in the testing and 61.6% in the training data while Naïve Bayesian was the worst performing algorithm for predicting stunting among children under five in Zambia using the 2018 ZDHS dataset. ML models aids quick diagnosis of stunting and the timely development of interventions aimed at preventing stunting

eCommons@AKU

PubMed Central

A Survey of Existing E-mail Spam Filtering Methods Considering Machine Learning Techniques

Author: Jinat Ara
Publication venue: Global Journals Inc. (US)
Publication date: 15/03/2018
Field of study

E-mail is one of the most secure medium for online communication and transferring data or messages through the web. An overgrowing increase in popularity, the number of unsolicited data has also increased rapidly. To filtering data, different approaches exist which automatically detect and remove these untenable messages. There are several numbers of email spam filtering technique such as Knowledge-based technique, Clustering techniques, Learningbased technique, Heuristic processes and so on. This paper illustrates a survey of different existing email spam filtering system regarding Machine Learning Technique (MLT) such as Naive Bayes, SVM, K-Nearest Neighbor, Bayes Additive Regression, KNN Tree, and rules. However, here we present the classification, evaluation and comparison of different email spam filtering system and summarize the overall scenario regarding accuracy rate of different existing approache

Global Journal of Computer Science and Technology (GJCST)

Machine Learning Techniques for Stellar Light Curve Classification

Author: Hinners Trisha
Tat Kevin
Thorp Rachel
Publication venue: 'American Astronomical Society'
Publication date: 26/04/2018
Field of study

We apply machine learning techniques in an attempt to predict and classify stellar properties from noisy and sparse time series data. We preprocessed over 94 GB of Kepler light curves from MAST to classify according to ten distinct physical properties using both representation learning and feature engineering approaches. Studies using machine learning in the field have been primarily done on simulated data, making our study one of the first to use real light curve data for machine learning approaches. We tuned our data using previous work with simulated data as a template and achieved mixed results between the two approaches. Representation learning using a Long Short-Term Memory (LSTM) Recurrent Neural Network (RNN) produced no successful predictions, but our work with feature engineering was successful for both classification and regression. In particular, we were able to achieve values for stellar density, stellar radius, and effective temperature with low error (~ 2 - 4%) and good accuracy (~ 75%) for classifying the number of transits for a given star. The results show promise for improvement for both approaches upon using larger datasets with a larger minority class. This work has the potential to provide a foundation for future tools and techniques to aid in the analysis of astrophysical data.Comment: Accepted to The Astronomical Journa

arXiv.org e-Print Archive

Caltech Authors

Recommended from our members

Entropy Based Feature Selection For Multi-Relational Naïve Bayesian Classifier

Author: Modi Nilesh K
Vaghela Vimalkumar B
Vandra Kalpesh H
Publication venue: CSUSB ScholarWorks
Publication date: 01/01/2014
Field of study

Current industries data’s are stored in relation structures. In usual approach to mine these data, we often use to join several relations to form a single relation using foreign key links, which is known as flatten. Flatten may cause troubles such as time consuming, data redundancy and statistical skew on data. Hence, the critical issues arise that how to mine data directly on numerous relations. The solution of the given issue is the approach called multi-relational data mining (MRDM). Other issues are irrelevant or redundant attributes in a relation may not make contribution to classification accuracy. Thus, feature selection is an essential data pre- processing step in multi-relational data mining. By filtering out irrelevant or redundant features from relations for data mining, we improve classification accuracy, achieve good time performance, and improve comprehensibility of the models. We had proposed the entropy based feature selection method for Multi-relational Naïve Bayesian Classifier. We have use method InfoDist and Pearson’s Correlation parameters, which will be used to filter out irrelevant and redundant features from the multi-relational database and will enhance classification accuracy. We analyzed our algorithm over PKDD financial dataset and achieved the better accuracy compare to the existing features selection methods

CSUSB ScholarWorks