4,877 research outputs found

    Used-habitat calibration plots: a new procedure for validating species distribution, resource selection, and step-selection models

    Get PDF
    “Species distribution modeling” was recently ranked as one of the top five “research fronts” in ecology and the environmental sciences by ISI's Essential Science Indicators (Renner and Warton 2013), reflecting the importance of predicting how species distributions will respond to anthropogenic change. Unfortunately, species distribution models (SDMs) often perform poorly when applied to novel environments. Compounding on this problem is the shortage of methods for evaluating SDMs (hence, we may be getting our predictions wrong and not even know it). Traditional methods for validating SDMs quantify a model's ability to classify locations as used or unused. Instead, we propose to focus on how well SDMs can predict the characteristics of used locations. This subtle shift in viewpoint leads to a more natural and informative evaluation and validation of models across the entire spectrum of SDMs. Through a series of examples, we show how simple graphical methods can help with three fundamental challenges of habitat modeling: identifying missing covariates, non-linearity, and multicollinearity. Identifying habitat characteristics that are not well-predicted by the model can provide insights into variables affecting the distribution of species, suggest appropriate model modifications, and ultimately improve the reliability and generality of conservation and management recommendations

    Machine Learning for Diabetes and Mortality Risk Prediction From Electronic Health Records

    Get PDF
    Data science can provide invaluable tools to better exploit healthcare data to improve patient outcomes and increase cost-effectiveness. Today, electronic health records (EHR) systems provide a fascinating array of data that data science applications can use to revolutionise the healthcare industry. Utilising EHR data to improve the early diagnosis of a variety of medical conditions/events is a rapidly developing area that, if successful, can help to improve healthcare services across the board. Specifically, as Type-2 Diabetes Mellitus (T2DM) represents one of the most serious threats to health across the globe, analysing the huge volumes of data provided by EHR systems to investigate approaches for early accurately predicting the onset of T2DM, and medical events such as in-hospital mortality, are two of the most important challenges data science currently faces. The present thesis addresses these challenges by examining the research gaps in the existing literature, pinpointing the un-investigated areas, and proposing a novel machine learning modelling given the difficulties inherent in EHR data. To achieve these aims, the present thesis firstly introduces a unique and large EHR dataset collected from Saudi Arabia. Then we investigate the use of a state-of-the-art machine learning predictive models that exploits this dataset for diabetes diagnosis and the early identification of patients with pre-diabetes by predicting the blood levels of one of the main indicators of diabetes and pre-diabetes: elevated Glycated Haemoglobin (HbA1c) levels. A novel collaborative denoising autoencoder (Col-DAE) framework is adopted to predict the diabetes (high) HbA1c levels. We also employ several machine learning approaches (random forest, logistic regression, support vector machine, and multilayer perceptron) for the identification of patients with pre-diabetes (elevated HbA1c levels). The models employed demonstrate that a patient's risk of diabetes/pre-diabetes can be reliably predicted from EHR records. We then extend this work to include pioneering adoption of recent technologies to investigate the outcomes of the predictive models employed by using recent explainable methods. This work also investigates the effect of using longitudinal data and more of the features available in the EHR systems on the performance and features ranking of the employed machine learning models for predicting elevated HbA1c levels in non-diabetic patients. This work demonstrates that longitudinal data and available EHR features can improve the performance of the machine learning models and can affect the relative order of importance of the features. Secondly, we develop a machine learning model for the early and accurate prediction all in-hospital mortality events for such patients utilising EHR data. This work investigates a novel application of the Stacked Denoising Autoencoder (SDA) to predict in-hospital patient mortality risk. In doing so, we demonstrate how our approach uniquely overcomes the issues associated with imbalanced datasets to which existing solutions are subject. The proposed model –– using clinical patient data on a variety of health conditions and without intensive feature engineering –– is demonstrated to achieve robust and promising results using EHR patient data recorded during the first 24 hours after admission

    Artificial intelligence for clinical decision support for monitoring patients in cardiovascular ICUs: a systematic review

    Get PDF
    Background: Artificial intelligence (AI) and machine learning (ML) models continue to evolve the clinical decision support systems (CDSS). However, challenges arise when it comes to the integration of AI/ML into clinical scenarios. In this systematic review, we followed the Preferred Reporting Items for Systematic reviews and Meta-Analyses (PRISMA), the population, intervention, comparator, outcome, and study design (PICOS), and the medical AI life cycle guidelines to investigate studies and tools which address AI/ML-based approaches towards clinical decision support (CDS) for monitoring cardiovascular patients in intensive care units (ICUs). We further discuss recent advances, pitfalls, and future perspectives towards effective integration of AI into routine practices as were identified and elaborated over an extensive selection process for state-of-the-art manuscripts. Methods: Studies with available English full text from PubMed and Google Scholar in the period from January 2018 to August 2022 were considered. The manuscripts were fetched through a combination of the search keywords including AI, ML, reinforcement learning (RL), deep learning, clinical decision support, and cardiovascular critical care and patients monitoring. The manuscripts were analyzed and filtered based on qualitative and quantitative criteria such as target population, proper study design, cross-validation, and risk of bias. Results: More than 100 queries over two medical search engines and subjective literature research were developed which identified 89 studies. After extensive assessments of the studies both technically and medically, 21 studies were selected for the final qualitative assessment. Discussion: Clinical time series and electronic health records (EHR) data were the most common input modalities, while methods such as gradient boosting, recurrent neural networks (RNNs) and RL were mostly used for the analysis. Seventy-five percent of the selected papers lacked validation against external datasets highlighting the generalizability issue. Also, interpretability of the AI decisions was identified as a central issue towards effective integration of AI in healthcare
    • …
    corecore