430 research outputs found
Survey on Classification Algorithms for Data Mining: (Comparison and Evaluation)
Data mining is growing fast in popularity; it is a technology involving methods at the intersection of artificial intelligence, machine learning, statistics, and database systems. The main goal of the data mining process is to extract information from large data into a form that is understandable for further use. Some data mining algorithms are used to solve classification problems in databases. In this paper, a comparison among three classification algorithms is studied: the K-nearest neighbour classifier, the decision tree, and the Bayesian network. The paper demonstrates the strength and accuracy of each algorithm for classification in terms of performance efficiency and required time complexity. For model validation purposes, a twenty-four-month data analysis is conducted on a mock-up basis. Keywords: decision tree, Bayesian network, k-nearest neighbour classifier
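As an illustration of the comparison this abstract describes, the following is a minimal sketch using scikit-learn. The synthetic dataset and the use of Gaussian naive Bayes as a simple stand-in for the Bayesian network classifier are assumptions; the paper's data and exact implementations are not given.

```python
# Minimal sketch: compare three classifiers on accuracy and wall-clock time.
# GaussianNB stands in for the Bayesian network; the data is synthetic.
import time

from sklearn.datasets import make_classification
from sklearn.model_selection import cross_val_score
from sklearn.naive_bayes import GaussianNB
from sklearn.neighbors import KNeighborsClassifier
from sklearn.tree import DecisionTreeClassifier

X, y = make_classification(n_samples=2000, n_features=20, random_state=0)

classifiers = {
    "k-nearest neighbour": KNeighborsClassifier(n_neighbors=5),
    "decision tree": DecisionTreeClassifier(random_state=0),
    "naive Bayes (Bayesian-network stand-in)": GaussianNB(),
}

for name, clf in classifiers.items():
    start = time.perf_counter()
    scores = cross_val_score(clf, X, y, cv=5)   # 5-fold accuracy
    elapsed = time.perf_counter() - start
    print(f"{name}: accuracy {scores.mean():.3f} +/- {scores.std():.3f}, "
          f"{elapsed:.2f}s")
```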
Doctor of Philosophy dissertation
Although renal transplant is the preferred modality for end-stage renal disease, it brings with it a number of challenges, primarily associated with the lack of an individualized approach. The goals of the present project were: (1) to determine the most significant and clinically practical predictors of kidney transplant outcomes (patient survival, allograft survival, posttransplant complications) using United States Renal Data System (USRDS) data; (2) based on the selected predictors, to generate prediction models of renal transplant outcomes. Our initial study developed prediction models using logistic regression and tree-based algorithms derived from data provided by the United Network for Organ Sharing (UNOS). A series of follow-up projects, using data supplied by the USRDS, was performed. We were able to capture significant associations between donor, recipient, and transplant procedure variables (which could not be derived from UNOS data) and allograft and recipient survival. Among our important findings: compared to peritoneal dialysis (PD), hemodialysis is associated with an increased risk of graft failure and recipient death; preemptive retransplantation is associated with an increased risk of graft failure; increased time on dialysis between transplants is associated with a negative effect on graft and recipient survival in most patient subgroups; short-term (6 months or less) dialysis had no negative effect on graft survival compared to preemptive transplants; certain socioeconomic factors, such as higher education level, citizenship, and type of insurance coverage, influenced graft and recipient outcomes independent of racial differences; and one particular immunosuppressive medication regimen was superior to others in prolonging graft and recipient survival. Based on these results, we developed a more comprehensive prediction model of graft outcome from USRDS data using logistic regression and tree-based models. The new models included both deceased and living donor graft recipients, were based on a longer list of pertinent predictors while still being practical in the clinical setting, and addressed the probability of graft failure at five different time points (1-, 3-, 5-, 7-, and 10-year allograft survival). The models have been validated on an independent dataset and demonstrated performance suggesting suitability for implementation in a clinical decision support system
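The modelling approach the dissertation describes, logistic regression and tree-based models predicting graft failure at fixed horizons, could look roughly like the sketch below. The features, labels, and horizons here are synthetic placeholders, not the USRDS schema.

```python
# Hedged sketch: fit a logistic regression and a tree ensemble per horizon
# and report held-out AUROC. Data and labels are synthetic stand-ins.
import numpy as np
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import roc_auc_score
from sklearn.model_selection import train_test_split

rng = np.random.default_rng(0)
X, _ = make_classification(n_samples=5000, n_features=15, random_state=0)

for horizon in (1, 3, 5, 7, 10):                  # years post-transplant
    # Hypothetical label: graft failure within `horizon` years, tied to one
    # synthetic risk feature so the models have something to learn.
    y = (X[:, 0] + rng.normal(0, 1, len(X)) > 1.5 - 0.2 * horizon).astype(int)
    X_tr, X_te, y_tr, y_te = train_test_split(X, y, random_state=0)
    for model in (LogisticRegression(max_iter=1000),
                  RandomForestClassifier(n_estimators=100, random_state=0)):
        model.fit(X_tr, y_tr)
        auc = roc_auc_score(y_te, model.predict_proba(X_te)[:, 1])
        print(f"{horizon}-year failure, {type(model).__name__}: AUROC {auc:.3f}")
```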
Artificial Intelligence and Liver Transplant: Predicting Survival of Individual Grafts
The demand for liver transplantation far outstrips the supply of deceased donor organs, and so listing and allocation decisions aim to maximize utility. Most existing methods for predicting transplant outcomes use basic methods, such as regression modeling, but newer artificial intelligence (AI) techniques have the potential to improve predictive accuracy. The aim was to perform a systematic review of studies predicting graft outcomes following deceased donor liver transplantation using AI techniques and to compare these findings with linear regression and standard predictive models: the donor risk index (DRI), Model for End-Stage Liver Disease (MELD), and Survival Outcome Following Liver Transplantation (SOFT) scores. After reviewing available article databases, a total of 52 articles were reviewed for inclusion. Of these, 9 met the inclusion criteria, reporting outcomes from 18,771 liver transplants. Artificial neural networks (ANNs) were the most commonly used methodology, reported in 7 studies. Only 2 studies directly compared machine learning (ML) techniques with liver scoring modalities (i.e., DRI, SOFT, and balance of risk [BAR]). Both showed better prediction of individual organ survival with the optimal ANN model: one reported an area under the receiver operating characteristic curve (AUROC) of 0.82, compared with BAR (0.62) and SOFT (0.57); the other ANN model gave an AUROC of 0.84, compared with DRI (0.68) and SOFT (0.64). AI techniques can provide high accuracy in predicting graft survival based on donor and recipient variables. Compared with the standard techniques, AI methods are dynamic and can be trained and validated within every population. However, the high accuracy of AI may come at the cost of losing explainability (to patients and clinicians) of how the technology works
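The AUROC comparison reported in the reviewed studies amounts to scoring an ANN and a simple clinical score on the same held-out patients; a minimal sketch follows. The MLP, the few-variable logistic baseline (playing the role of a DRI/SOFT-style score), and the synthetic data are all assumptions for illustration.

```python
# Hedged sketch: compare an ANN's AUROC to a few-variable linear score.
from sklearn.datasets import make_classification
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import roc_auc_score
from sklearn.model_selection import train_test_split
from sklearn.neural_network import MLPClassifier

X, y = make_classification(n_samples=3000, n_features=25, n_informative=10,
                           random_state=1)
X_tr, X_te, y_tr, y_te = train_test_split(X, y, random_state=1)

ann = MLPClassifier(hidden_layer_sizes=(32, 16), max_iter=500,
                    random_state=1).fit(X_tr, y_tr)
# Baseline uses only the first three variables, mimicking a compact score.
baseline = LogisticRegression(max_iter=1000).fit(X_tr[:, :3], y_tr)

print("ANN AUROC:     ", roc_auc_score(y_te, ann.predict_proba(X_te)[:, 1]))
print("baseline AUROC:", roc_auc_score(y_te,
                                       baseline.predict_proba(X_te[:, :3])[:, 1]))
```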
Data Mining-based Survival Analysis and Simulation Modeling for Lung Transplant
The objective of this research is to develop a decision support methodology for the lung transplant procedure by investigating the UNOS nationwide dataset via data mining-based survival analysis and simulation-based optimization. Traditional statistical techniques have various limitations that hinder the exploration of the information hidden in voluminous data. The deployment of structural equation modeling integrated with decision trees provides more effective matching between the donor organ and the recipient. Such an integration, preceded by powerful data mining models that determine which variables to include in the survival analysis, is validated via simulation-based optimization. The suggested data mining-based survival analysis was superior to conventional statistical methods in predicting lung graft survivability and in determining the critical variables to include in organ matching and allocation. The proposed matching index, derived via structural equation model-based decision trees, was validated as a more effective priority-ranking mechanism than the current lung allocation scoring system. This validation was established by a simulation-based optimization model: with the novel matching index, a substantial improvement was achieved in the survival rate while causing only a short delay in the average waiting time of candidate patients on the list. Furthermore, via response surface methodology-based simulation optimization, the optimal weighting scheme for the components of the novel matching index was determined by jointly optimizing the lung transplant performance measures, namely the justice principle in terms of waiting time and the utility principle in terms of survival rate. The study is unique in that it provides a means to integrate data mining modeling and simulation optimization with survival analysis so that more of the useful information hidden in the large amount of data can be discovered. The developed methodology improves the modeling of the matching and allocation system in terms of both interpretability and predictability, which will be of great benefit to medical professionals
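The weighting-scheme optimization step can be pictured as below: a simulator maps candidate matching-index weights to (waiting time, survival rate), and an optimizer trades the two objectives off. The toy `simulate` function, its coefficients, and the scalarized objective are illustrative placeholders for the paper's discrete-event simulation and response surface procedure.

```python
# Hedged sketch: optimise matching-index weights against a toy simulator.
import numpy as np
from scipy.optimize import minimize


def simulate(weights):
    """Toy stand-in for the allocation simulation: returns mean waiting
    time (days) and survival rate for a normalised weight vector."""
    w = np.abs(weights) / np.abs(weights).sum()
    survival = 0.6 + 0.3 * w[0] - 0.1 * w[2]      # utility principle
    waiting = 120 + 80 * w[0] - 40 * w[1]         # justice principle
    return waiting, survival


def objective(weights, alpha=0.5):
    waiting, survival = simulate(weights)
    # Jointly optimise: maximise survival, penalise (scaled) waiting time.
    return -alpha * survival + (1 - alpha) * waiting / 365.0


result = minimize(objective, x0=np.ones(3) / 3, method="Nelder-Mead")
print("optimal matching-index weights:",
      np.abs(result.x) / np.abs(result.x).sum())
```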
A Comparative Analysis of Decision Tree and Bayesian Model for Network Intrusion Detection System
Denial of service (DoS) attacks are a major threat to computer networks. This paper presents two approaches (decision tree and Bayesian network) to building classifiers for DoS attacks. Selecting important attributes increases the classification accuracy of intrusion detection systems; the decision tree, which has the advantage of generating explainable rules, was used for the selection of relevant attributes in this research. A C4.5 decision tree dimensionality reduction algorithm was used to reduce the 41 attributes of the KDD'99 dataset to 29. Thereafter, a rule-based classification system (decision tree) was built, as well as a Bayesian network classification system, for DoS attacks based on the selected attributes. The classifiers were evaluated and compared using performance on the test dataset. Experimental results show that the decision tree is robust and gives a higher percentage of successful classification than the Bayesian network, which was found to be sensitive to the discretization technique. The results confirm that significant attribute selection is important in designing a real-world intrusion detection system (IDS). Keywords: Intrusion Detection System, Machine Learning, Decision Tree, Bayesian Network
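The attribute-selection pipeline described, rank 41 attributes with a tree and keep 29, then train both classifiers on the reduced set, could be sketched as follows. The synthetic 41-feature data stands in for KDD'99, and Gaussian naive Bayes again stands in for the Bayesian network; an entropy-split scikit-learn tree approximates C4.5.

```python
# Hedged sketch: tree-based attribute selection, then train both classifiers.
import numpy as np
from sklearn.datasets import make_classification
from sklearn.feature_selection import SelectFromModel
from sklearn.model_selection import cross_val_score
from sklearn.naive_bayes import GaussianNB
from sklearn.tree import DecisionTreeClassifier

X, y = make_classification(n_samples=4000, n_features=41, n_informative=12,
                           random_state=0)

# Rank attributes by tree importance (entropy splits) and keep the best 29.
selector = SelectFromModel(
    DecisionTreeClassifier(criterion="entropy", random_state=0),
    max_features=29, threshold=-np.inf).fit(X, y)
X_reduced = selector.transform(X)

for clf in (DecisionTreeClassifier(criterion="entropy", random_state=0),
            GaussianNB()):
    acc = cross_val_score(clf, X_reduced, y, cv=5).mean()
    print(f"{type(clf).__name__}: accuracy {acc:.3f} "
          f"on {X_reduced.shape[1]} attributes")
```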
Deep learning survival analysis for clinical decision support in deceased donor kidney transplantation
In deceased donor kidney transplantation, the decision to accept or decline an offer relies on a clinician's intuition and ability to digest complex information in order to maximize patient survival. Risks affecting patient survival after kidney transplantation (KT) must be balanced against the risks of remaining on the waitlist, which include mortality, graft failure, and becoming too sick to transplant. The allocation system today takes these risks into account by way of the KDPI and EPTS scores. While these scores are discriminative of patient survival, they were built with an assumption of independence between risks and with very few donor-recipient variables. Deep learning survival analysis can effectively handle competing risks and learn complex relationships among many more donor-recipient variables. We used DeepHit to assess the risk and benefit associated with accepting a kidney offer or remaining on the waitlist. Our models achieved comparable, if not better, performance in certain tasks relative to other high-performing models in the literature and revealed that decoupling competing risks led to increased clinical information gain. We show that comprehensively modeling competing risks using machine learning can achieve more granular, meaningful clinical risk analysis, enabling more effective decision making in deceased donor kidney transplantation
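The core idea of DeepHit is a shared network trunk with cause-specific heads whose outputs are jointly softmaxed over every (event, discrete time-bin) pair, so competing risks are modelled without an independence assumption. The PyTorch sketch below shows that architecture and the uncensored-patient likelihood term only; it omits censoring handling and DeepHit's ranking loss, and the dimensions and event types are illustrative.

```python
# Hedged sketch of the DeepHit architecture idea (simplified; not the full loss).
import torch
import torch.nn as nn

N_FEATURES, N_EVENTS, N_BINS = 40, 3, 24  # e.g. death, graft failure, delisting


class DeepHitSketch(nn.Module):
    def __init__(self):
        super().__init__()
        self.trunk = nn.Sequential(nn.Linear(N_FEATURES, 64), nn.ReLU(),
                                   nn.Linear(64, 64), nn.ReLU())
        self.heads = nn.ModuleList(
            [nn.Linear(64, N_BINS) for _ in range(N_EVENTS)])

    def forward(self, x):
        h = self.trunk(x)
        logits = torch.stack([head(h) for head in self.heads], dim=1)
        # Joint probability mass over all (event, time-bin) pairs.
        pmf = torch.softmax(logits.view(x.size(0), -1), dim=1)
        return pmf.view(x.size(0), N_EVENTS, N_BINS)


model = DeepHitSketch()
pmf = model(torch.randn(8, N_FEATURES))        # batch of 8 candidates
print(pmf.shape, pmf.sum(dim=(1, 2)))          # sums to 1 per patient

# Likelihood term for uncensored patients: probability mass assigned to the
# observed (event, time-bin) pair.
events = torch.randint(0, N_EVENTS, (8,))
bins = torch.randint(0, N_BINS, (8,))
nll = -torch.log(pmf[torch.arange(8), events, bins] + 1e-8).mean()
print("negative log-likelihood:", nll.item())
```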
Bayesian averaging over Decision Tree models for trauma severity scoring
Health care practitioners analyse the possible risks of misleading decisions and need to estimate and quantify the uncertainty in predictions. We have examined the "gold" standard of screening a patient's condition for predicting survival probability, based on logistic regression modelling, which is used in trauma care for clinical purposes and quality audit. This methodology rests on theoretical assumptions about the data and its uncertainties, and models induced within such an approach have exposed a number of problems, including unexplained fluctuation of predicted survival and low accuracy in estimating the uncertainty intervals within which predictions are made. The Bayesian method, which in theory is capable of providing accurate predictions and uncertainty estimates, has been adopted in our study using decision tree models. Our approach has been tested on a large set of patients registered in the US National Trauma Data Bank and has outperformed the standard method in terms of prediction accuracy, thereby providing practitioners with accurate estimates of the predictive posterior densities of interest that are required for making risk-aware decisions
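Averaging over decision tree models yields both a point prediction and a spread. True Bayesian model averaging samples trees from a posterior (typically via MCMC over tree space); the sketch below substitutes a bootstrap ensemble as a crude approximation so that per-patient uncertainty intervals can still be read off, and everything about the data is synthetic.

```python
# Hedged sketch: average survival probability over many trees and report
# a per-patient uncertainty interval from the spread of the ensemble.
import numpy as np
from sklearn.datasets import make_classification
from sklearn.tree import DecisionTreeClassifier

X, y = make_classification(n_samples=5000, n_features=12, random_state=0)
X_new = X[:5]                                   # patients to score

rng = np.random.default_rng(0)
preds = []
for _ in range(200):                            # 200 approximate "posterior" draws
    idx = rng.integers(0, len(X), len(X))       # bootstrap resample
    tree = DecisionTreeClassifier(max_depth=6).fit(X[idx], y[idx])
    preds.append(tree.predict_proba(X_new)[:, 1])
preds = np.array(preds)

mean = preds.mean(axis=0)
lo, hi = np.percentile(preds, [2.5, 97.5], axis=0)   # 95% interval
for m, a, b in zip(mean, lo, hi):
    print(f"survival probability {m:.2f} (95% interval {a:.2f}-{b:.2f})")
```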
Decision Aid to Determine the Necessity of Right Ventricular Support for Patients Receiving a Left Ventricular Assist Device
The purpose of this study was to improve the efficacy of ventricular assist device (VAD) therapy for patients intended for VAD insertion, focusing on the specific decision of whether an LVAD or a BiVAD is appropriate. A hierarchical decision model was constructed using an influence diagram of clinical risk factors derived through interviews with expert cardiologists and cardiac surgeons. Most of the variables are summarized by two independent criteria: risk of surgery and risk of right ventricular (RV) failure. These risks are computed from various patient demographics, tests, and hemodynamics using expert physician-selected weighted linear and weighted nonlinear relationships. The model was validated with retrospective data from patient records at the University of Pittsburgh Medical Center (UPMC) for patients implanted after 1990 and explanted before 2006. In total, 239 patients were implanted and explanted during this time, of whom 168 had sufficient information to be used in this analysis: 48 patients received biventricular assistance (BiVADs) and 119 received only left ventricular assistance (LVADs). Of these 119 LVAD patients, 19 subsequently received an RVAD due to unanticipated RV dysfunction. Pre-implant data were used as input to the model, with model parameters derived from two different physicians. The models based on the individual physicians' weightings predicted 63% (47%) of the patients who required an RVAD after implant; however, these decision models also recommended BiVAD implantation for 40% (43%) of patients who were treated successfully with an LVAD alone. A nonlinear numerical optimizer was then used to improve the model parameters and optimize agreement with eventual outcomes. The optimized model predicted 74% of the patients who required an RVAD post-implant and recommended the implantation of BiVADs in 21% of patients who were treated successfully with an LVAD alone. In conclusion, the decision model provided for more aggressive use of biventricular assistance, which retrospectively would have benefited patients who required an RVAD at a later date, but would have unnecessarily implanted RVADs in some patients who survived with an LVAD alone. The model also identified 48% of the patients who initially received BiVADs as candidates for an LVAD alone
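The two-criterion structure of the model, expert-weighted scores for surgery risk and RV-failure risk feeding a device recommendation, can be sketched as below. Every variable, weight, and threshold here is a hypothetical placeholder; the study's actual weights were physician-derived and then tuned by a nonlinear optimizer.

```python
# Hedged sketch: two expert-weighted linear risk scores drive an LVAD/BiVAD
# recommendation. All numbers are illustrative placeholders.
import numpy as np

# Hypothetical pre-implant variables: [age, creatinine, CVP, RV ejection frac.]
patients = np.array([[55, 1.2, 12, 0.45],
                     [63, 2.1, 18, 0.25]])

w_surgery = np.array([0.02, 0.30, 0.01, -0.50])   # expert-selected weights
w_rv_fail = np.array([0.01, 0.25, 0.04, -1.20])

surgery_risk = patients @ w_surgery
rv_failure_risk = patients @ w_rv_fail

THRESHOLD = 1.5   # in the study, tuned by the numerical optimizer
for s, r in zip(surgery_risk, rv_failure_risk):
    device = "BiVAD" if max(s, r) > THRESHOLD else "LVAD"
    print(f"surgery risk {s:.2f}, RV-failure risk {r:.2f} -> recommend {device}")
```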
- …