
    Validation of Random Forest Machine Learning Models to Predict Dementia-Related Neuropsychiatric Symptoms in Real-World Data

    Background: Neuropsychiatric symptoms (NPS) are the leading cause of the social burden of dementia, but their role is underestimated. Objective: The objective of the study was to validate predictive models that separately identify psychotic and depressive symptoms in patients diagnosed with dementia, using clinical databases representing the whole population, to inform decision-makers. Methods: First, we searched the electronic health records of 4,003 patients with dementia to identify NPS. Second, machine learning (random forest) algorithms were applied to build separate predictive models for psychotic and depressive symptom clusters in the training set (N = 3,003). Third, calibration and discrimination were assessed in the test set (N = 1,000) to evaluate the performance of the models. Results: Neuropsychiatric symptoms were noted in the electronic health record of 58% of patients. The area under the receiver operating characteristic curve reached 0.80 for the psychotic cluster model and 0.74 for the depressive cluster model. The Kappa index and accuracy also showed better discrimination in the psychotic model. Calibration plots indicated that both models had lower predictive accuracy when the probability of neuropsychiatric symptoms was <25%. The most important variables in the psychotic cluster model were use of risperidone, level of sedation, use of quetiapine and haloperidol, and the number of antipsychotics prescribed. In the depressive cluster model, the most important variables were number of antidepressants prescribed, escitalopram use, level of sedation, and age. Conclusion: Given their relatively good performance, the predictive models can be used to estimate the prevalence of NPS in population databases.
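
The discrimination measures named in this abstract (area under the ROC curve, accuracy, Kappa index) can be illustrated with a minimal sketch in plain Python. The labels and scores below are made-up toy values, not study data, and the function names are hypothetical; the study's own tooling is not stated in the abstract.

```python
def roc_auc(y_true, scores):
    """AUC as the probability that a random positive outscores a random negative."""
    pos = [s for y, s in zip(y_true, scores) if y == 1]
    neg = [s for y, s in zip(y_true, scores) if y == 0]
    # Count pairwise wins; ties count as half a win.
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0 for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


def accuracy(y_true, y_pred):
    """Fraction of predictions that match the true labels."""
    return sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)


def cohen_kappa(y_true, y_pred):
    """Agreement between truth and prediction, corrected for chance agreement."""
    n = len(y_true)
    observed = accuracy(y_true, y_pred)
    labels = set(y_true) | set(y_pred)
    expected = sum((y_true.count(c) / n) * (y_pred.count(c) / n) for c in labels)
    return (observed - expected) / (1 - expected)


# Toy data (assumed for illustration only): 1 = symptom present, 0 = absent.
y_true = [1, 1, 1, 0, 0, 0]
scores = [0.9, 0.8, 0.4, 0.7, 0.3, 0.2]
y_pred = [1 if s >= 0.5 else 0 for s in scores]  # 0.5 threshold is an assumption

auc = roc_auc(y_true, scores)
acc = accuracy(y_true, y_pred)
kappa = cohen_kappa(y_true, y_pred)
```

AUC is computed from the scores alone, whereas accuracy and Kappa depend on the chosen classification threshold.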

    Prevalence of Child-Adolescent Chronic Pathologies in Cantabria (Spain)

    Objective: The study of child-adolescent chronic pathologies (CACPs) is relevant to public health, and this knowledge is necessary for prevention and control. Our objective was to analyze the prevalence and distribution of the main CACPs in the region of Cantabria (Spain). Methodology: The stratified data were obtained from the whole child and adolescent population treated by the Cantabria Health Service in the year 2019. The prevalence of the most frequent CACPs and their differences based on sex, urban or non-urban location, and levels of complexity and risk were assessed. The statistical analysis was performed with the SPSS program. Results: 90,163 people aged from birth to 17 years were analyzed (48.7% girls and 51.3% boys). At least one CACP was present in 29.1% of the population studied (26,234). CACPs were less prevalent in girls (25.5%) than in boys (32.6%) (χ², p < 0.001; OR: 0.708; 95% CI: 0.688-0.729). The two most common CACPs were asthma (16.7%) and attention deficit hyperactivity disorder (6.3%). Multimorbidity occurred in 1,978 girls (4.5%) and in 2,916 boys (6.3%). These chronic pathologies were less prevalent in urban areas (26.5%) than in non-urban areas (28.0%) (χ², p < 0.001; OR: 1.037; 95% CI: 1.023-1.052). The level of complexity of CACPs was slightly higher in girls and in non-urban locations. The level of risk was similar when sex and location were compared. Conclusion: CACPs were very common in the age group studied. Our data indicate a need not only for a greater number of descriptive studies, but also for greater accuracy in the clinical records of CACPs. These actions would allow the implementation of effective programs for prevention and early detection of CACPs, and better treatment and control once they are diagnosed.
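
As a quick arithmetic check, the odds ratio reported for girls versus boys can be approximated from the two prevalences quoted in the abstract. This is a sketch, not the authors' SPSS analysis: it uses the rounded percentages rather than the underlying counts, and the function name is hypothetical.

```python
def odds_ratio(p_group, p_reference):
    """Odds ratio of two prevalences, each given as a proportion in (0, 1)."""
    odds_group = p_group / (1 - p_group)
    odds_reference = p_reference / (1 - p_reference)
    return odds_group / odds_reference


# Prevalences quoted in the abstract: 25.5% in girls, 32.6% in boys.
or_girls_vs_boys = odds_ratio(0.255, 0.326)
print(round(or_girls_vs_boys, 3))  # 0.708
```

The result agrees with the reported OR of 0.708 to three decimal places.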

    On the relevance of preprocessing in predictive maintenance for dynamic systems

    The complexity involved in real-time, data-driven monitoring of dynamic systems for predictive maintenance is usually huge. To a greater or lesser extent, any data-driven approach is sensitive to data preprocessing, understood as any data treatment applied before the monitoring model, which is sometimes crucial for the final development of the monitoring technique employed. The aim of this work is to quantify exhaustively the sensitivity of data-driven predictive maintenance models in dynamic systems. We consider a couple of predictive maintenance scenarios, each defined by publicly available data. For each scenario, we consider its properties and apply several techniques for each of the successive preprocessing steps, e.g. data cleaning, missing values treatment, outlier detection, feature selection, or imbalance compensation. The pretreatment configurations, i.e. sequential combinations of techniques from different preprocessing steps, are considered together with different monitoring approaches in order to determine the relevance of data preprocessing for predictive maintenance in dynamic systems.
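
The idea of a pretreatment configuration as a sequential combination of one technique per preprocessing step can be sketched as a Cartesian product. The step names follow the abstract, but the technique names are placeholders assumed for illustration; the paper's actual choices are not listed here.

```python
from itertools import product

# Hypothetical techniques per preprocessing step (assumed, not from the paper).
steps = {
    "missing_values": ["drop_rows", "mean_imputation"],
    "outlier_detection": ["none", "z_score"],
    "feature_selection": ["none", "variance_threshold"],
    "imbalance_compensation": ["none", "oversampling"],
}

# Each configuration picks exactly one technique for every step, in step order.
configurations = [
    dict(zip(steps, combo)) for combo in product(*steps.values())
]

print(len(configurations))  # 16
```

Each configuration would then be paired with every monitoring approach under study, so the number of experiments grows multiplicatively with the options per step.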

    Interpreting Remaining Useful Life estimations combining Explainable Artificial Intelligence and domain knowledge in industrial machinery

    This paper presents the implementation and explanation of a remaining life estimator model based on machine learning, applied to industrial data. Concretely, the model has been applied to a bushings testbed, where fatigue life tests are performed to find more suitable bushing characteristics. Different regressors have been compared, using Environmental and Operational Condition and setting variables as input data to prognosticate the remaining life at each observation during fatigue tests; a Random Forest was chosen as the final model given its accuracy and explainability potential. The model creation, optimisation and interpretation have been guided by combining eXplainable Artificial Intelligence with domain knowledge. Specifically, the ELI5 and LIME explainability techniques have been used to produce local and global explanations. These were used to understand the relevance of predictor variables in individual and overall remaining life estimations. The results achieved are a gain in process knowledge and validation of expert knowledge, confirmation of the huge potential of data-driven models in industrial processes, and evidence of the need for collaboration between technicians with expert knowledge and eXplainable Artificial Intelligence techniques to understand advanced machine learning models.