7,071 research outputs found

    Using machine learning to predict prescription opioid misuse in patients


    Risk Prediction of Renal Failure for Chronic Disease Population Based on Electronic Health Record Big Data

    Renal failure is a fatal disease of global concern. Previous risk models for renal failure mostly rely on a diagnosis of chronic kidney disease, which lacks obvious clinical symptoms and is therefore often undiagnosed, so many high-risk patients are missed. In this paper, we propose a framework to predict the risk of renal failure directly from a big data repository of a chronic disease population, without a prerequisite diagnosis of chronic kidney disease. The electronic health records of 42,256 patients with hypertension or diabetes in the Shenzhen Health Information Big Data Platform were collected; 398 of these patients suffered renal failure during a 3-year follow-up. Five state-of-the-art machine learning methods were used to build risk prediction models of renal failure for the chronic disease population. Extensive experimental results show that the proposed framework performs well. In particular, XGBoost obtains the best performance, with an area under the receiver operating characteristic curve (AUC) of 0.9139. By analyzing the effect of risk factors, we identified serum creatine, age, urine acid, systolic blood pressure, and blood urea nitrogen as the top five factors associated with renal failure risk. Compared with existing models, our model can be deployed in routine chronic disease management procedures and can enable earlier, broader screening of renal risk, which would in turn reduce the damage caused by the disease through timely intervention.
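    The AUC figure quoted in this abstract can be read as the probability that a randomly chosen patient who developed renal failure receives a higher predicted risk than a randomly chosen patient who did not. The following is a minimal sketch of that pairwise definition, not the authors' evaluation code; the function name `roc_auc` and the toy labels and scores are assumptions made purely for illustration.

```python
def roc_auc(labels, scores):
    """AUC as the fraction of (positive, negative) pairs ranked correctly.

    Ties between a positive and a negative score count as half a win.
    """
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative label")
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos) * len(neg))


# Hypothetical toy data: 1 = renal failure during follow-up, 0 = none.
labels = [0, 0, 1, 0, 1, 1]
scores = [0.10, 0.40, 0.35, 0.80, 0.70, 0.90]
auc = roc_auc(labels, scores)  # 6 of 9 pairs are ordered correctly
```

    The double loop is quadratic in the number of patients, which is fine for a sketch; a rank-based formulation would be preferable at the scale of tens of thousands of records.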

    Statistical analysis and data mining of Medicare patients with diabetes.

    The purpose of this dissertation is to find ways to decrease Medicare costs, to study the health outcomes of diabetes patients, and to investigate the influence of Medicare Part D since its introduction in 2006, using the CMS CCW (Chronic Condition Data Warehouse) data and the MEPS (Medical Expenditure Panel Survey) data. We introduce pattern recognition analysis into the study of the medical and demographic characteristics of inpatients who have a higher readmission risk. We also broaden the cost-effectiveness analysis by including medical resource usage when investigating the effects of Medicare Part D. In addition, we apply several statistical linear models, such as the generalized linear model, and data mining techniques, such as the neural network model, to study the costs and outcomes of both inpatients and outpatients with diabetes in Medicare. Descriptive methods such as kernel density estimation and survival analysis are also employed. One important conclusion from these analyses is that diseases and procedures, rather than age, are the key factors in inpatients' mortality rate. Another important finding is that, under the influence of Medicare Part D, insulin is the most efficient oral anti-diabetes drug treatment and that drug usage in 2006 is not as stable as in 2005. We also find that patients who are discharged to home or hospice are more likely to re-enter the hospital within 30 days of discharge. Two-way interaction effect analysis demonstrates that diabetes complications interact with each other, which makes healthcare costs and health outcomes differ between a case with one complication and a case with two complications. Accordingly, we propose some suggestions. For instance, to decrease Medicare payments for outpatients with diabetes, we suggest that patients monitor their blood glucose level often. We also recommend that inpatients with diabetes pay more attention to kidney disease and use prevention to avoid such diseases and so decrease costs.
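    The 30-day readmission finding depends on how a readmission is flagged in the claims data. The abstract does not give the rule used, so the following is only a sketch under stated assumptions: the function name `is_30_day_readmission` is hypothetical, the window is counted in calendar days from the discharge date, and day 30 itself is treated as inside the window.

```python
from datetime import date


def is_30_day_readmission(discharge, next_admission, window_days=30):
    """Return True if the next admission falls within the window after discharge.

    Assumption: the window is inclusive, so an admission exactly
    `window_days` days after discharge still counts.
    """
    if next_admission is None:  # patient was never readmitted
        return False
    gap = (next_admission - discharge).days
    return 0 <= gap <= window_days


# Hypothetical stays, for illustration only.
within = is_30_day_readmission(date(2006, 1, 1), date(2006, 1, 31))   # gap of 30 days
outside = is_30_day_readmission(date(2006, 1, 1), date(2006, 2, 1))   # gap of 31 days
never = is_30_day_readmission(date(2006, 1, 1), None)
```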

    Chronic Risk and Disease Management Model Using Structured Query Language and Predictive Analysis

    Individuals with chronic conditions use health care most frequently, more than 50% of the top ten causes of death in the United States are chronic diseases, and these members consistently have high health risk scores. In population health management, identifying high-risk members is very important for patient care, disease management, and cost management. A disease management program is a very effective way of monitoring and preventing chronic disease and health-related complications, while risk management allows physicians and healthcare companies to reduce patients' health risk, helps identify members for care and disease management, and helps manage financial risk. The main objective of this research is to introduce an efficient and accurate risk assessment model that maintains the accuracy of risk scores compared with the existing model and, based on the calculated risk scores, to identify members for disease management programs using structured query language. For the experiments we used data and information from sources such as CMS, NCQA, existing models, and the healthcare insurance industry. The approach takes its base principle from the Chronic Illness and Disability Payment System (CDPS); based on this model, a weight is defined for each chronic disease to calculate the risk of each patient. To keep the analysis focused, three chronic diseases were selected: breast cancer, diabetes, and congestive heart failure. Different sets of diagnoses, electronic medical records, and member pharmacy information are the key sources. An industry-standard database system was taken into consideration when implementing the logic, which makes the model more efficient to implement because the data are warehoused where the model is built. We obtained experimental results from a total of 4761 relevant medical records taken from Molina Healthcare's data warehouse. We tested the proposed model on risk score data from the State of Illinois using multiple linear regression and implemented the proposed logic on health plan data, from which we calculated the p-value and confidence level of our variables; as a second validation step, we ran the same data source through the original risk model. Next, we showed that members' risk scores contribute strongly to the member selection process for the disease management program. To validate the member selection criteria we used a fast-and-frugal decision tree algorithm, and the resulting confusion matrix was used to measure the performance of the member selection process. The results show that the proposed model achieved an overall risk assessment confidence level of 99%, with an R-squared value of 98%; for disease management member identification we achieved 99% sensitivity, 89% accuracy, and 74% specificity. The experimental results show that if the risk assessment model is taken one step further, it can not only determine a member's risk but also support disease management by identifying and prioritizing members for it. The resulting chronic risk and disease management method is very promising for patients, insurance companies, provider groups, claims processing organizations, and physician groups that want to manage member health risk more accurately and effectively and to enroll members in the required care management programs. The methods and design used in this research contribute to business analytics and to an overall member risk and disease management approach that uses predictive analytics based on members' medical diagnoses, pharmacy utilization, and demographics.
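    Sensitivity, specificity, and accuracy, the three measures reported for member identification, are all derived from the four cells of a confusion matrix. The following sketch shows the standard formulas; the function name `confusion_metrics` and the counts are hypothetical and are not taken from the study, whose underlying confusion matrix is not given in the abstract.

```python
def confusion_metrics(tp, fp, tn, fn):
    """Compute sensitivity, specificity, and accuracy from confusion-matrix counts.

    tp/fn: members who needed the program and were / were not selected.
    tn/fp: members who did not need it and were not / were selected.
    """
    total = tp + fp + tn + fn
    return {
        "sensitivity": tp / (tp + fn),  # share of true cases that were selected
        "specificity": tn / (tn + fp),  # share of non-cases correctly left out
        "accuracy": (tp + tn) / total,  # share of all decisions that were correct
    }


# Hypothetical counts, for illustration only.
metrics = confusion_metrics(tp=90, fp=20, tn=80, fn=10)
```

    Because accuracy weights both classes by their size, a selection rule can show very high sensitivity alongside noticeably lower specificity and accuracy, which is the pattern reported in the abstract.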

    Doctor of Philosophy

    In its report To Err is Human, the Institute of Medicine recommended the implementation of internal and external voluntary and mandatory automatic reporting systems to increase detection of adverse events. Knowledge Discovery in Databases (KDD) allows the detection of patterns and trends that would be hidden or less detectable if analyzed by conventional methods. The objective of this study was to examine novel KDD techniques used by other disciplines to create predictive models using healthcare data and to validate the results through clinical domain expertise and performance measures. Patient records for the present study were extracted from the enterprise data warehouse (EDW) of Intermountain Healthcare. Patients with reported adverse events were identified from ICD9 codes. A clinical classification of the ICD9 codes was developed, and the clinical categories were analyzed for risk factors for adverse events, including adverse drug events. Pharmacy data were categorized and used to detect drugs administered in temporal sequence with antidote drugs. Data sampling and data boosting algorithms were used as signal amplification techniques. Decision trees, Naïve Bayes, Canonical Correlation Analysis, and Sequence Analysis were used as machine learning algorithms. Performance measures of the classification algorithms demonstrated statistically significant improvement after the dataset was transformed through KDD techniques, data boosting, and sampling. Domain expertise was applied to validate the clinical significance of the results. KDD methodologies were applied successfully to a complex clinical dataset. The use of these methodologies was shown empirically to be effective on healthcare data through statistically significant measures and clinical validation. Although more research is required, we demonstrated the usefulness of KDD methodologies in knowledge extraction from complex clinical data.