
    Predicting Pancreatic Cancer Using Support Vector Machine

    This report presents an approach to predicting pancreatic cancer using the Support Vector Machine classification algorithm. The research objective of this project is to predict pancreatic cancer from genomic data alone, from clinical data alone, and from a combination of genomic and clinical data. We used real genomic data with 22,763 samples and 154 features per sample. We also created synthetic clinical data with 400 samples and 7 features per sample in order to measure prediction accuracy on clinical data alone. To validate the hypothesis, we combined the synthetic clinical data with a subset of features from the real genomic data. In our results, prediction accuracy, precision, and recall with genomic data alone were 80.77%, 20%, and 4%, respectively. With synthetic clinical data alone they were 93.33%, 95%, and 30%, while for the combination of real genomic and synthetic clinical data they were 90.83%, 10%, and 5%. Combining real genomic data with synthetic clinical data decreased the accuracy relative to clinical data alone, since the genomic data is weakly correlated. We therefore conclude that combining genomic and clinical data does not improve pancreatic cancer prediction accuracy. A dataset with more significant genomic features might help to predict pancreatic cancer more accurately.
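    The abstract reports accuracy, precision, and recall side by side, and the three can diverge sharply when one class dominates. The sketch below shows how each metric follows from confusion-matrix counts. The function name and the counts are hypothetical and chosen only for illustration; they are not taken from the report's data.

```python
def classification_metrics(tp, fp, fn, tn):
    """Return (accuracy, precision, recall) from confusion-matrix counts.

    tp/fp/fn/tn are true positives, false positives, false negatives,
    and true negatives. A ratio with a zero denominator is reported as 0.0.
    """
    total = tp + fp + fn + tn
    accuracy = (tp + tn) / total if total else 0.0
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    return accuracy, precision, recall


# Hypothetical counts (NOT from the report): 100 samples, 20 of them positive.
# The classifier labels almost everything negative, so most predictions are
# correct overall even though nearly all positive cases are missed.
acc, prec, rec = classification_metrics(tp=1, fp=4, fn=19, tn=76)
print(f"accuracy={acc:.2%} precision={prec:.2%} recall={rec:.2%}")
```

    With counts like these, accuracy stays high because true negatives dominate, while recall is low because most positive cases go undetected. This is one reason accuracy alone can be a misleading summary on imbalanced data.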

    Regression Models For Readmission Prediction Using Electronic Medical Records

    Hospital readmissions are not only expensive but also potentially harmful and, most importantly, often preventable. Providing special care for a targeted group of patients at high risk of readmission can significantly improve the chances of avoiding rehospitalization. Despite the significance of this problem, few researchers have investigated it thoroughly, owing to the complexity of analyzing such hospitalization records and estimating their predictive power. In this thesis, we propose using support vector machines and survival analysis methods to analyze data collected from Electronic Medical Records (EMR). We define the notion of abnormal patients and examine how they affect the performance of classifiers. We use sparse methods with survival regression models to build clinical models that are suitable for such complex clinical data. These models are compared with existing readmission models such as ADHERE, TABAK, and logistic regression models. Finally, we provide inferences and conclusions on how to extend this work to build better regression models.