19 research outputs found

    The use of clinical, behavioral, and social determinants of health to improve identification of patients in need of advanced care for depression

    Indiana University-Purdue University Indianapolis (IUPUI)
    Depression is the most commonly occurring mental illness worldwide. It poses a significant health and economic burden on individuals and communities. Not all occurrences of depression require the same level of treatment. However, identifying patients in need of advanced care has been challenging and presents a significant bottleneck in providing care. We developed a knowledge-driven depression taxonomy composed of features representing clinical, behavioral, and social determinants of health (SDH) that inform the onset, progression, and outcome of depression. We leveraged the depression taxonomy to build decision models that predicted need for referrals across: (a) the overall patient population and (b) various high-risk populations. Decision models were built using longitudinal, clinical, and behavioral data extracted from a population of 84,317 patients seeking care at Eskenazi Health of Indianapolis, Indiana. Each decision model yielded high predictive performance. However, models predicting need of treatment across high-risk populations (ROCs of 86.31% to 94.42%) outperformed models representing the overall patient population (ROC of 78.87%). Next, we assessed the value of adding SDH into each model. For each patient population under study, we built additional decision models that incorporated a wide range of patient and aggregate-level SDH and compared their performance against the original models. Models that incorporated SDH yielded high predictive performance. However, use of SDH did not yield statistically significant performance improvements. Our efforts present significant potential to identify patients in need of advanced care using a limited number of clinical and behavioral features. However, we found no benefit to incorporating additional SDH into these models. Our methods can also be applied across other datasets in response to a wide variety of healthcare challenges.

    Machine Learning Approaches to Identify Nicknames from A Statewide Health Information Exchange

    Patient matching is essential to minimize fragmentation of patient data. Existing patient matching efforts often do not account for nickname use. We sought to develop decision models that could identify true nicknames using features representing the phonetic and structural similarity of nickname pairs. We identified potential male and female name pairs from the Indiana Network for Patient Care (INPC), and developed a series of features that represented their phonetic and structural similarities. Next, we used the XGBoost classifier and hyperparameter tuning to build decision models to identify nicknames using these feature sets and a manually reviewed gold standard. Decision models reported high Precision/Positive Predictive Value and Accuracy scores for both male and female name pairs despite the low number of true nickname matches in the datasets under study. Ours is one of the first efforts to identify patient nicknames using machine learning approaches.
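To make the idea of phonetic and structural similarity features concrete, here is a minimal sketch using only the Python standard library. It is an illustration under stated assumptions, not the feature set from the study: the choice of a simplified Soundex key, the `difflib` similarity ratio, and the names `soundex` and `name_pair_features` are all hypothetical.

```python
from difflib import SequenceMatcher

# Consonant groups for a simplified Soundex key (an assumed phonetic encoding;
# the abstract does not say which phonetic algorithms were used).
_CODES = {}
for _letters, _digit in (("bfpv", "1"), ("cgjkqsxz", "2"), ("dt", "3"),
                         ("l", "4"), ("mn", "5"), ("r", "6")):
    for _ch in _letters:
        _CODES[_ch] = _digit


def soundex(name):
    """Return a 4-character Soundex-style key, or '' for an empty name."""
    letters = [ch for ch in name.lower() if ch.isalpha()]
    if not letters:
        return ""
    key = letters[0].upper()
    prev = _CODES.get(letters[0], "")
    for ch in letters[1:]:
        if ch in "hw":
            continue  # h and w do not separate repeated codes
        code = _CODES.get(ch, "")
        if code and code != prev:
            key += code
        prev = code  # vowels reset prev, so repeats across a vowel are kept
    return (key + "000")[:4]


def name_pair_features(a, b):
    """Hypothetical feature vector for one candidate (name, nickname) pair."""
    a_l, b_l = a.lower(), b.lower()
    prefix = 0
    for x, y in zip(a_l, b_l):
        if x != y:
            break
        prefix += 1
    return {
        "soundex_match": soundex(a) == soundex(b),              # phonetic
        "char_ratio": SequenceMatcher(None, a_l, b_l).ratio(),  # structural
        "shared_prefix_len": prefix,                            # structural
        "length_diff": abs(len(a_l) - len(b_l)),                # structural
    }
```

A dictionary like the one returned by `name_pair_features("John", "Jon")` could serve as one row of a feature matrix passed to a classifier such as XGBoost, alongside a manually reviewed label for each pair.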

    An Adversorial Approach to Enable Re-Use of Machine Learning Models and Collaborative Research Efforts Using Synthetic Unstructured Free-Text Medical Data

    We leverage Generative Adversarial Networks (GAN) to produce synthetic free-text medical data with low re-identification risk, and apply these to replicate machine learning solutions. We trained GAN models to generate free-text cancer pathology reports. Decision models trained using synthetic datasets reported performance metrics that were statistically similar to those of models trained using original test data. Our results further the use of GANs to generate synthetic data for collaborative research and re-use of machine learning models.

    Personalizing Longitudinal Care Coordination for Patients with Chronic Kidney Disease

    Chronic care coordination efforts often focus on the needs of the healthcare team and not on the individual needs of each patient. However, developing a personalized care plan for patients with Chronic Kidney Disease (CKD) requires individual patient engagement with the healthcare team. We describe the development of a CKD e-care plan that focuses on patient-specific needs and life goals, and can be personalized according to provider needs.

    Generative Adversarial Networks for Creating Synthetic Free-Text Medical Data: A Proposal for Collaborative Research and Re-use of Machine Learning Models

    Restrictions in sharing Patient Health Identifiers (PHI) limit cross-organizational re-use of free-text medical data. We leverage Generative Adversarial Networks (GAN) to produce synthetic unstructured free-text medical data with low re-identification risk, and assess the suitability of these datasets to replicate machine learning models. We trained GAN models using unstructured free-text laboratory messages pertaining to salmonella, and identified the most accurate models for creating synthetic datasets that reflect the informational characteristics of the original dataset. Natural Language Generation metrics comparing the real and synthetic datasets demonstrated high similarity. Decision models generated using these datasets reported high performance metrics. There was no statistically significant difference in performance measures reported by models trained using real and synthetic datasets. Our results inform the use of GAN models to generate synthetic unstructured free-text data with limited re-identification risk, and use of this data to enable collaborative research and re-use of machine learning models

    Identifying Biases in Clinical Decision Models Designed to Predict Need of Wraparound Services

    Investigation of systemic biases in AI models for the clinical domain has been limited. We re-created a series of models predicting need of wraparound services, and inspected them for biases across age, gender, and race using the AI Fairness 360 framework. AI models reported performance metrics which were comparable to original efforts. Investigation of biases using the AI Fairness 360 framework found low likelihood that patient age, gender, and race are introducing bias into our algorithms.

    Evaluation of a Parsimonious COVID-19 Outbreak Prediction Model: Heuristic Modeling Approach Using Publicly Available Data Sets

    Background: The COVID-19 pandemic has changed public health policies and human and community behaviors through lockdowns and mandates. Governments are rapidly evolving policies to increase hospital capacity and supply personal protective equipment and other equipment to mitigate disease spread in affected regions. Current models that predict COVID-19 case counts and spread are complex by nature and offer limited explainability and generalizability. This has highlighted the need for accurate and robust outbreak prediction models that balance model parsimony and performance. Objective: We sought to leverage readily accessible data sets extracted from multiple states to train and evaluate a parsimonious predictive model capable of identifying county-level risk of COVID-19 outbreaks on a day-to-day basis. Methods: Our modeling approach leveraged the following data inputs: COVID-19 case counts per county per day and county populations. We developed an outbreak gold standard across California, Indiana, and Iowa. The model utilized a per capita running 7-day sum of the case counts per county per day and the mean cumulative case count to develop baseline values. The model was trained with data recorded between March 1 and August 31, 2020, and tested on data recorded between September 1 and October 31, 2020. Results: The model reported sensitivities of 81%, 92%, and 90% for California, Indiana, and Iowa, respectively. The precision in each state was above 85% while specificity and accuracy scores were generally >95%. Conclusions: Our parsimonious model provides a generalizable and simple alternative approach to outbreak prediction. This methodology can be applied to diverse regions to help state officials and hospitals with resource allocation and to guide risk management, community education, and mitigation strategies
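The per capita running 7-day sum described in the Methods can be sketched as follows. This is a minimal illustration, not the authors' implementation: the function names, the per-100,000 scaling, and the fixed `threshold` rule are assumptions. The abstract states only that baseline values were derived from the running sum and the mean cumulative case count.

```python
from collections import deque


def per_capita_7day_sum(daily_cases, population, window=7, per=100_000):
    """Running sum of the last `window` daily counts, scaled per `per` residents.

    For the first days of the series the sum covers fewer than `window` days.
    """
    recent = deque(maxlen=window)  # old days fall out automatically
    result = []
    for count in daily_cases:
        recent.append(count)
        result.append(sum(recent) * per / population)
    return result


def flag_outbreak_days(daily_cases, population, threshold):
    """Hypothetical rule: flag days whose per capita 7-day sum exceeds `threshold`.

    The threshold stands in for the baseline values mentioned in the abstract,
    whose exact derivation is not given there.
    """
    return [value > threshold
            for value in per_capita_7day_sum(daily_cases, population)]
```

Applied to one county's daily case series and population, `flag_outbreak_days` returns one boolean per day, which is the shape of output a day-to-day, county-level risk model would need.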

    Toward better public health reporting using existing off the shelf approaches: The value of medical dictionaries in automated cancer detection using plaintext medical data

    Objectives: Existing approaches to derive decision models from plaintext clinical data frequently depend on medical dictionaries as the sources of potential features. Prior research suggests that decision models developed using non-dictionary based feature sourcing approaches and “off the shelf” tools could predict cancer with performance metrics between 80% and 90%. We sought to compare non-dictionary based models to models built using features derived from medical dictionaries. Materials and methods: We evaluated the detection of cancer cases from free text pathology reports using decision models built with combinations of dictionary or non-dictionary based feature sourcing approaches, 4 feature subset sizes, and 5 classification algorithms. Each decision model was evaluated using the following performance metrics: sensitivity, specificity, accuracy, positive predictive value, and area under the receiver operating characteristics (ROC) curve. Results: Decision models parameterized using dictionary and non-dictionary feature sourcing approaches produced performance metrics between 70% and 90%. The source of features and feature subset size had no impact on the performance of a decision model. Conclusion: Our study suggests there is little value in leveraging medical dictionaries for extracting features for decision model building. Decision models built using features extracted from the plaintext reports themselves achieve comparable results to those built using medical dictionaries. Overall, this suggests that existing “off the shelf” approaches can be leveraged to perform accurate cancer detection using less complex Named Entity Recognition (NER) based feature extraction, automated feature selection and modeling approaches.
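Four of the performance metrics named above follow directly from a binary confusion matrix. The sketch below shows the standard definitions. The function name `confusion_metrics` and the example counts are hypothetical, and the area under the ROC curve is omitted because it needs ranked prediction scores rather than counts.

```python
def _ratio(numerator, denominator):
    """Divide, returning NaN when the denominator is zero."""
    return numerator / denominator if denominator else float("nan")


def confusion_metrics(tp, fp, tn, fn):
    """Standard metrics from true/false positive and negative counts."""
    return {
        "sensitivity": _ratio(tp, tp + fn),   # true positive rate
        "specificity": _ratio(tn, tn + fp),   # true negative rate
        "accuracy": _ratio(tp + tn, tp + fp + tn + fn),
        "ppv": _ratio(tp, tp + fp),           # positive predictive value
    }


# Made-up counts, for illustration only (not data from the study).
example = confusion_metrics(tp=80, fp=10, tn=90, fn=20)
```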

    Integrating Data Science into T32 Training Programs at IUPUI

    Data science is critically important to the biomedical research enterprise. Many current and future research efforts will employ advanced computational techniques to analyze extremely large datasets in order to discover insights relevant to human health. Therefore, the next generation of biomedical scientists requires knowledge of and proficiency in data science. With support from the U.S. National Library of Medicine, a team of faculty from Indiana University-Purdue University Indianapolis (IUPUI) facilitated curricula enhancement for National Institutes of Health (NIH) T32 research training programs with respect to data science. In collaboration with the existing NIH T32 Program Directors at IUPUI and the IU School of Medicine, the interdisciplinary team of faculty drawn from multiple schools and departments examined the existing landscape of data science offerings on campus in parallel with an assessment of the competencies that future biomedical and clinician scientists will require to be comfortable using data science methods to advance their research. The IUPUI campus possesses a rich tapestry of data science education programs across multiple schools and departments. Furthermore, the campus is home to more than a dozen world-class T32 programs funded by the NIH to train biomedical and clinician scientists. However, existing training programs do not currently emphasize data science or provide specific curriculum designed to ensure T32 graduates possess basic competencies in data science. To position the campus for the future, robust T32 programs need to connect with the rapidly growing data science programs. This report summarizes the rationale for the importance of connection and the competencies that future biomedical and clinical scientists will require to be successful. The report further describes the curriculum mapping efforts to link competencies with available degree programs, courses and workshops on campus. The report further recommends next steps for campus leadership, including but not limited to T32 Program Directors, the Office of the Vice Chancellor for Research, the Executive Associate Dean for Research Affairs at the IU School of Medicine, and the President and CEO of the Regenstrief Institute. Together we can strengthen the IUPUI campus and help ensure its T32 graduates are successful in their research careers.
    National Library of Medicine