Utilizing Multi-level Classification Techniques to Predict Adverse Drug Effects and Reactions
Multi-class classification models are used to predict categorical response variables with more than two possible outcomes. A collection of multi-class classification techniques, such as Multinomial Logistic Regression, Naïve Bayes, and Support Vector Machine, is used to predict patients' drug reactions and adverse drug effects based on patients' demographics and drug administration. The models are tested on the newly released 2018 data on drug reactions and adverse drug effects from the U.S. Food and Drug Administration. The applicability of model evaluation measures such as sensitivity, specificity and prediction accuracy in multi-class settings is also discussed.
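To illustrate how sensitivity and specificity carry over to a multi-class setting, a common approach is one-vs-rest: each class is treated in turn as the "positive" class and all others as "negative". The sketch below assumes that convention; the abstract does not say which one the authors use. The function names and the three outcome labels are hypothetical and invented for illustration only.

```python
from typing import Dict, Sequence


def one_vs_rest_metrics(
    y_true: Sequence[str], y_pred: Sequence[str]
) -> Dict[str, Dict[str, float]]:
    """Per-class sensitivity and specificity, treating each class as positive in turn."""
    metrics = {}
    for label in sorted(set(y_true) | set(y_pred)):
        pairs = list(zip(y_true, y_pred))
        tp = sum(t == label and p == label for t, p in pairs)
        fn = sum(t == label and p != label for t, p in pairs)
        fp = sum(t != label and p == label for t, p in pairs)
        tn = sum(t != label and p != label for t, p in pairs)
        metrics[label] = {
            # Share of true members of this class that were recovered.
            "sensitivity": tp / (tp + fn) if tp + fn else float("nan"),
            # Share of non-members that were correctly kept out of this class.
            "specificity": tn / (tn + fp) if tn + fp else float("nan"),
        }
    return metrics


def accuracy(y_true: Sequence[str], y_pred: Sequence[str]) -> float:
    """Overall fraction of correct predictions across all classes."""
    return sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)


# Hypothetical labels, not taken from the FDA data.
y_true = ["mild", "mild", "mild", "severe", "severe", "none", "none", "none"]
y_pred = ["mild", "mild", "severe", "severe", "none", "none", "none", "mild"]

metrics = one_vs_rest_metrics(y_true, y_pred)
overall = accuracy(y_true, y_pred)
for label, values in metrics.items():
    print(label, values)
print("accuracy", overall)
```

A single accuracy figure can hide large differences between classes, which is why the per-class values are reported separately here.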
Predicting regression test failures using genetic algorithm-selected dynamic performance analysis metrics
A novel framework for predicting regression test failures is proposed. The basic principle embodied in the framework is to use performance analysis tools to capture the runtime behaviour of a program as it executes each test in a regression suite. The performance information is then used to build a dynamically predictive model of test outcomes. Our framework is evaluated using a genetic algorithm for dynamic metric selection in combination with state-of-the-art machine learning classifiers. We show that if a program is modified and some tests subsequently fail, then it is possible to predict with considerable accuracy which of the remaining tests will also fail. These predictions can be used to help prioritise tests in time-constrained testing environments.
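The idea of selecting metrics with a genetic algorithm can be sketched as a search over bitmasks, where each bit says whether a metric is kept. This is a minimal sketch, not the authors' implementation. The fitness function here is a toy stand-in (a hypothetical per-metric usefulness score minus a per-metric cost); in a real setting it would presumably be the validation performance of a classifier trained on the selected metrics. All names and numbers are assumptions.

```python
import random
from typing import List, Sequence


def fitness(mask: Sequence[int], usefulness: Sequence[float], cost: float = 0.1) -> float:
    """Toy objective: summed usefulness of the chosen metrics minus a cost per metric."""
    return sum(u for bit, u in zip(mask, usefulness) if bit) - cost * sum(mask)


def select_metrics(
    usefulness: Sequence[float],
    pop_size: int = 20,
    generations: int = 30,
    mutation_rate: float = 0.05,
    seed: int = 0,
) -> List[int]:
    """Return the best bitmask found; requires at least two candidate metrics."""
    rng = random.Random(seed)
    n = len(usefulness)

    def score(mask: Sequence[int]) -> float:
        return fitness(mask, usefulness)

    population = [[rng.randint(0, 1) for _ in range(n)] for _ in range(pop_size)]
    for _ in range(generations):
        population.sort(key=score, reverse=True)
        parents = population[: pop_size // 2]  # truncation selection
        children = [parents[0][:]]  # elitism: carry the best mask over unchanged
        while len(children) < pop_size:
            a, b = rng.sample(parents, 2)
            cut = rng.randrange(1, n)  # one-point crossover
            child = a[:cut] + b[cut:]
            # Flip each bit independently with probability mutation_rate.
            child = [bit ^ 1 if rng.random() < mutation_rate else bit for bit in child]
            children.append(child)
        population = children
    return max(population, key=score)


# Hypothetical usefulness scores for six runtime metrics.
usefulness = [0.9, 0.05, 0.6, 0.02, 0.4, 0.01]
best = select_metrics(usefulness)
print(best, fitness(best, usefulness))
```

Elitism guarantees the best score never decreases from one generation to the next, but a genetic algorithm gives no guarantee of reaching the global optimum.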
A critical assessment of imbalanced class distribution problem: the case of predicting freshmen student attrition
Predicting student attrition is an intriguing yet challenging problem for any academic institution. Class-imbalanced data is common in the field of student retention, mainly because many students register but far fewer drop out. Classification techniques applied to imbalanced datasets can yield deceptively high prediction accuracy, because the overall accuracy is usually driven by the majority class at the expense of very poor performance on the crucial minority class. In this study, we compared different data-balancing techniques to improve the predictive accuracy for the minority class while maintaining satisfactory overall classification performance. Specifically, we tested three balancing techniques (oversampling, under-sampling and synthetic minority over-sampling (SMOTE)) along with four popular classification methods (logistic regression, decision trees, neural networks and support vector machines). We used a large, feature-rich institutional student dataset (covering the years 2005 to 2011) to assess the efficacy of both the balancing techniques and the prediction methods. The results indicated that the support vector machine combined with the SMOTE data-balancing technique achieved the best classification performance, with a 90.24% overall accuracy on the 10-fold holdout sample. All three data-balancing techniques improved the prediction accuracy for the minority class. Applying sensitivity analyses to the developed models, we also identified the most important variables for accurate prediction of student attrition. Application of these models has the potential to accurately predict at-risk students and help reduce student dropout rates.
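The accuracy problem described above can be reproduced with a few lines of code. The sketch below uses made-up data (90 retained students, 10 dropouts) and a trivial predictor that always outputs the majority class, then balances the classes by random oversampling, the simplest of the three techniques named in the abstract. It is an illustration under those assumptions, not the study's pipeline; SMOTE would instead synthesise new minority points by interpolating between neighbours.

```python
import random
from collections import Counter
from typing import List, Sequence, Tuple


def accuracy(y_true: Sequence[str], y_pred: Sequence[str]) -> float:
    """Overall fraction of correct predictions."""
    return sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)


def recall(y_true: Sequence[str], y_pred: Sequence[str], positive: str) -> float:
    """Fraction of true `positive` cases that were predicted as `positive`."""
    hits = [p == positive for t, p in zip(y_true, y_pred) if t == positive]
    return sum(hits) / len(hits)


def random_oversample(
    rows: Sequence[list], labels: Sequence[str], seed: int = 0
) -> Tuple[List[list], List[str]]:
    """Duplicate randomly chosen rows of smaller classes until all classes match the largest."""
    rng = random.Random(seed)
    counts = Counter(labels)
    target = max(counts.values())
    out_rows, out_labels = list(rows), list(labels)
    for label, count in counts.items():
        pool = [r for r, l in zip(rows, labels) if l == label]
        for _ in range(target - count):
            out_rows.append(rng.choice(pool))
            out_labels.append(label)
    return out_rows, out_labels


# Hypothetical data: one dummy feature per student.
labels = ["retained"] * 90 + ["dropout"] * 10
rows = [[i] for i in range(100)]

always_majority = ["retained"] * len(labels)
baseline_accuracy = accuracy(labels, always_majority)
dropout_recall = recall(labels, always_majority, "dropout")
print(baseline_accuracy, dropout_recall)

balanced_rows, balanced_labels = random_oversample(rows, labels)
print(Counter(balanced_labels))
```

The majority-only predictor scores high on accuracy while identifying no dropouts at all. When resampling is combined with cross-validation, it should be applied to the training folds only, so that duplicated or synthetic rows do not leak into the evaluation data.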
Early Recognition of Burn- and Trauma-Related Acute Kidney Injury: A Pilot Comparison of Machine Learning Techniques.
Severely burned and non-burned trauma patients are at risk for acute kidney injury (AKI). The study objective was to assess the theoretical performance of artificial intelligence (AI)/machine learning (ML) algorithms to augment AKI recognition using the novel biomarker neutrophil gelatinase-associated lipocalin (NGAL), combined with contemporary biomarkers such as N-terminal pro B-type natriuretic peptide (NT-proBNP), urine output (UOP), and plasma creatinine. Machine learning approaches including logistic regression (LR), k-nearest neighbor (k-NN), support vector machine (SVM), random forest (RF), and deep neural networks (DNN) were used in this study. The AI/ML algorithm helped predict AKI 61.8 (32.5) hours faster than the Kidney Disease: Improving Global Outcomes (KDIGO) criteria for burn and non-burned trauma patients. NGAL was analytically superior to traditional AKI biomarkers such as creatinine and UOP. With ML, the AKI predictive capability of NGAL was further enhanced when combined with NT-proBNP or creatinine. AI/ML could be employed with NGAL to accelerate detection of AKI in at-risk burn and non-burned trauma patients.
Incremental Predictive Process Monitoring: How to Deal with the Variability of Real Environments
A characteristic of existing predictive process monitoring techniques is to
first construct a predictive model based on past process executions, and then
use it to predict the future of new ongoing cases, without the possibility of
updating it with new cases when they complete their execution. This can make
predictive process monitoring too rigid to deal with the variability of
processes working in real environments that continuously evolve and/or exhibit
new variant behaviors over time. As a solution to this problem, we propose the
use of algorithms that allow the incremental construction of the predictive
model. These incremental learning algorithms update the model whenever new
cases become available so that the predictive model evolves over time to fit
the current circumstances. The algorithms have been implemented using different
case encoding strategies and evaluated on a number of real and synthetic
datasets. The results provide initial evidence of the potential of incremental
learning strategies for predictive process monitoring in real environments, and
of the impact of different case encoding strategies in this setting.
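The difference between a model built once and a model updated as cases complete can be sketched with a simple count-based predictor. This is a minimal sketch under stated assumptions, not one of the algorithms evaluated in the paper: the class name, the "last two activities" encoding, and the activity and outcome labels are all hypothetical.

```python
from collections import Counter, defaultdict
from typing import Dict, Optional, Sequence, Tuple


class IncrementalOutcomePredictor:
    """Keeps outcome counts per encoded prefix and updates them one case at a time."""

    def __init__(self) -> None:
        self._counts: Dict[Tuple[str, ...], Counter] = defaultdict(Counter)
        self._overall: Counter = Counter()

    @staticmethod
    def encode(prefix: Sequence[str]) -> Tuple[str, ...]:
        # Hypothetical case encoding: keep only the last two activities.
        return tuple(prefix[-2:])

    def update(self, trace: Sequence[str], outcome: str) -> None:
        """Fold one completed case into the model without retraining from scratch."""
        for end in range(1, len(trace) + 1):
            self._counts[self.encode(trace[:end])][outcome] += 1
        self._overall[outcome] += 1

    def predict(self, prefix: Sequence[str]) -> Optional[str]:
        """Most frequent outcome for this prefix, falling back to the overall majority."""
        counts = self._counts.get(self.encode(prefix)) or self._overall
        return counts.most_common(1)[0][0] if counts else None


model = IncrementalOutcomePredictor()
model.update(["register", "check", "approve"], "accepted")
model.update(["register", "check", "reject"], "rejected")
model.update(["register", "check", "approve"], "accepted")

# A new variant ("audit") appears that the model has never seen.
before_drift = model.predict(["register", "audit"])

# Two cases of the new variant complete and are folded in.
model.update(["register", "audit", "reject"], "rejected")
model.update(["register", "audit", "reject"], "rejected")
after_drift = model.predict(["register", "audit"])

print(before_drift, after_drift)
```

Before the new variant is observed, the predictor can only fall back on the overall majority outcome. Once completed cases of that variant are added, the prediction for the same prefix changes, which is the behaviour a model trained once on past executions cannot provide.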
A Multi Hidden Recurrent Neural Network with a Modified Grey Wolf Optimizer
Identifying university students' weaknesses results in better learning and
can function as an early warning system to enable students to improve. However,
the satisfaction level of existing systems is not promising. New and dynamic
hybrid systems are needed to imitate this mechanism. A hybrid system (a
modified Recurrent Neural Network with an adapted Grey Wolf Optimizer) is used
to forecast students' outcomes. This proposed system would improve instruction
by the faculty and enhance the students' learning experiences. The results show
that a modified recurrent neural network with an adapted Grey Wolf Optimizer
has the best accuracy when compared with other models.
Comment: 34 pages, published in PLoS ON
A Data-informed Public Health Policy-Makers Platform
Hearing loss is a disease exhibiting a growing trend due to a number of factors, including but not limited to everyday exposure to noise and an ever-increasing older population. In the framework of a public health policymaking process, data-based modeling of hearing loss is a key factor in alleviating the issues related to the disease and in issuing effective public health policies. First, the paper describes the steps of the data-driven policymaking process. Afterward, a scenario is presented, along with the part of the proposed platform responsible for supporting policymaking. To demonstrate the capabilities and usability of the platform for policy-makers, some initial results of preliminary analytics are presented in the framework of a policy-making process. Finally, the utility of the approach is validated through the results of a survey presented to health system policy-making professionals involved in the policy development process in Croatia.
Can the US Minimum Data Set Be Used for Predicting Admissions to Acute Care Facilities?
This paper is intended to give an overview of Knowledge Discovery in Large Datasets (KDD) and data mining applications in healthcare, particularly as related to the Minimum Data Set (MDS), a resident assessment tool which is used in US long-term care facilities. The US Health Care Financing Administration, which mandates the use of this tool, has accumulated massive warehouses of MDS data. The pressure in healthcare to increase efficiency and effectiveness while improving patient outcomes requires that we find new ways to harness these vast resources. The intent of this preliminary study design paper is to discuss the development of an approach which utilizes the MDS, in conjunction with KDD and classification algorithms, in an attempt to predict admission from a long-term care facility to an acute care facility. The use of acute care services by long-term care residents is a negative outcome that is potentially avoidable and expensive. The value of the MDS warehouse can be realized by using the stored data in ways that can improve patient outcomes and avoid the use of expensive acute care services. This study, when completed, will test whether the MDS warehouse can be used to describe patient outcomes and possibly be of predictive value.