
    Weighted metrics are required when evaluating the performance of prediction models in nested case–control studies

    Background: Nested case–control (NCC) designs are efficient for developing and validating prediction models that use expensive or difficult-to-obtain predictors, especially when the outcome is rare. Previous research has focused on how to develop prediction models in this sampling design, but little attention has been given to model validation in this context. We therefore aimed to systematically characterize the key elements for the correct evaluation of the performance of prediction models in NCC data. Methods: We proposed how to correctly evaluate prediction models in NCC data by adjusting performance metrics with sampling weights to account for the NCC sampling. We included in this study the C-index, threshold-based metrics, the observed-to-expected events ratio (O/E ratio), the calibration slope, and decision curve analysis. We illustrated the proposed metrics with a validation of the Breast and Ovarian Analysis of Disease Incidence and Carrier Estimation Algorithm (BOADICEA version 5) in data from the population-based Rotterdam study. We compared the metrics obtained in the full cohort with those obtained in NCC datasets sampled from the Rotterdam study, with and without a matched design. Results: Performance metrics without weight adjustment were biased: the unweighted C-index in NCC datasets was 0.61 (0.58–0.63) for the unmatched design, while the C-index in the full cohort and the weighted C-index in the NCC datasets were similar: 0.65 (0.62–0.69) and 0.65 (0.61–0.69), respectively. The unweighted O/E ratio was 18.38 (17.67–19.06) in the NCC datasets, while it was 1.69 (1.42–1.93) in the full cohort and its weighted version in the NCC datasets was 1.68 (1.53–1.84). Similarly, weighted adjustments of threshold-based metrics and of the net benefit for decision curves were unbiased estimates of the corresponding metrics in the full cohort, while the corresponding unweighted metrics were biased. In the matched design, the bias of the unweighted metrics was larger, but it could also be corrected by the weight adjustment. Conclusions: Nested case–control studies are an efficient solution for evaluating the performance of prediction models that use expensive or difficult-to-obtain biomarkers, especially when the outcome is rare, but the performance metrics need to be adjusted for the sampling procedure.
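
    To make the weighting idea concrete, the minimal sketch below (not the authors' code, and a binary-outcome simplification of their time-to-event setting) re-weights each sampled subject by the inverse of its probability of being drawn into the NCC set, so that cases and the oversampled controls again represent the full cohort. The function names and array inputs are illustrative assumptions.

```python
import numpy as np

def weighted_oe_ratio(events, predicted_risk, sampling_weights):
    """Observed-to-expected events ratio with inverse-probability-of-sampling weights.

    events           : 1/0 event indicator per subject in the NCC sample
    predicted_risk   : model-predicted event probability per subject
    sampling_weights : 1 / P(subject is sampled into the NCC set); typically 1
                       for cases and > 1 for controls
    """
    observed = np.sum(sampling_weights * events)          # weighted observed events
    expected = np.sum(sampling_weights * predicted_risk)  # weighted expected events
    return observed / expected

def weighted_c_index(events, predicted_risk, sampling_weights):
    """Weighted concordance: each (case, control) pair is weighted by the product
    of the two subjects' sampling weights."""
    cases = np.flatnonzero(events == 1)
    controls = np.flatnonzero(events == 0)
    numerator, denominator = 0.0, 0.0
    for i in cases:
        for j in controls:
            w = sampling_weights[i] * sampling_weights[j]
            denominator += w
            if predicted_risk[i] > predicted_risk[j]:
                numerator += w
            elif predicted_risk[i] == predicted_risk[j]:
                numerator += 0.5 * w
    return numerator / denominator
```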

    Quality of Life: Effects of Physical Activity in an Anthropometric, Cognitive and Psychosocial Background, and Variation of Odontoiatric Parameters

    The "Quality of Life" project arises from the need to observe and define the interconnection between a "healthy" lifestyle and the quality of life that people perceive. Starting from the W.H.O. (World Health Organization) definitions of quality of life and health, and drawing on studies that describe the benefits of both physical and mental activity, a three-month protocol of personalized diet and training was established. For this purpose, 20 healthy subjects from the first year of the Physiotherapy degree, none of whom had ever practiced sports at a competitive level, were recruited. All individuals underwent initial evaluations to define their psycho-physical wellbeing and were then divided into two groups: the first, experimental, group followed the protocol for the duration of the study, while the second group was only evaluated. At the end of the three-month period, the subjects of both groups were evaluated twice: first immediately after the last training session, and again one month later. The final evaluation aimed to verify a possible conservative effect on the changes in the experimental group's parameters once physical activity and nutrition were no longer supervised by us. The ultimate goal of the project was to analyze, after three months, the changes in memory and concentration, the stress experienced at work or during study, and the psycho-physical wellbeing perceived by the subjects. Data collected during the three evaluations showed that the protocol positively changed all of these quality-of-life parameters in the experimental group, but that these improvements are liable to fade when physical activity and proper nutrition are not maintained.

    A machine learning approach using endpoint adjudication committee labels for the identification of sepsis predictors at the emergency department

    Accurate sepsis diagnosis is paramount for treatment decisions, especially at the emergency department (ED). To improve diagnosis, clinical decision support (CDS) tools are being developed with machine learning (ML) algorithms, using a wide range of variable groups. ML models can find patterns in Electronic Health Record (EHR) data that are unseen by the human eye. A prerequisite for a good model is the use of high-quality labels. Sepsis gold-standard labels are hard to define due to a lack of reliable diagnostic tools for sepsis at the ED. Therefore, standard clinical tools, such as clinical prediction scores (e.g. modified early warning score and quick sequential organ failure assessment), and claims-based methods (e.g. ICD-10) are used to generate suboptimal labels. As a consequence, models trained with these “silver” labels are ill-trained. In this study, we trained ML models for sepsis diagnosis at the ED with labels for 375 ED visits assigned by an endpoint adjudication committee (EAC) that consisted of 18 independent experts. Our objective was to evaluate which routinely measured variables show diagnostic value for sepsis. We performed univariate testing and trained multiple ML models with 95 routinely measured variables from three variable groups: demographic and vital, laboratory, and advanced haematological variables. Apart from known diagnostic variables, we identified added diagnostic value for less conventional variables such as eosinophil count and platelet distribution width. In this explorative study, we show that the use of an EAC together with ML can identify new targets for future sepsis diagnosis research.
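
    As a rough illustration of the workflow described above (univariate screening of routinely measured variables against the EAC labels, followed by a multivariable model), the sketch below uses a hypothetical dataset and made-up column names; it is not the study's code, and the modelling choices (Mann–Whitney U screening, logistic regression, median imputation) are assumptions for the example.

```python
import pandas as pd
from scipy.stats import mannwhitneyu
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import cross_val_score

# Hypothetical dataset: one row per ED visit, an EAC-assigned sepsis label plus
# routinely measured demographic/vital, laboratory and haematological variables.
df = pd.read_csv("ed_visits_with_eac_labels.csv")
y = df["eac_sepsis_label"]                      # 1 = sepsis per the committee
features = [c for c in df.columns if c != "eac_sepsis_label"]

# Univariate testing: compare each variable between sepsis and non-sepsis visits.
for col in features:
    sepsis_values = df.loc[y == 1, col].dropna()
    other_values = df.loc[y == 0, col].dropna()
    _, p_value = mannwhitneyu(sepsis_values, other_values)
    if p_value < 0.05:
        print(f"{col}: p = {p_value:.3g}")

# Multivariable model on all routinely measured variables (median-imputed).
X = df[features].fillna(df[features].median())
auc_scores = cross_val_score(LogisticRegression(max_iter=1000), X, y,
                             scoring="roc_auc", cv=5)
print("Cross-validated AUC:", auc_scores.mean())
```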

    Ambiguous definitions for baseline serum creatinine affect acute kidney injury diagnosis at the emergency department

    Background: Acute kidney injury (AKI) incidence is increasing; however, AKI is often missed at the emergency department (ED). AKI diagnosis depends on changes in kidney function, assessed by comparing a serum creatinine (SCr) measurement to a baseline value. However, it remains unclear to what extent different baseline values may affect AKI diagnosis at the ED. Methods: Routine care data from ED visits between 2012 and 2019 were extracted from the Utrecht Patient Oriented Database. We evaluated baseline definitions with criteria from the RIFLE, AKIN and KDIGO guidelines. We evaluated four baseline SCr definitions (lowest, most recent, mean, median), as well as five different time windows (up to 365 days prior to the ED visit) to select a baseline (SCr-BL), and compared this to the first SCr measured at the ED (SCr-ED). As an outcome, we assessed AKI prevalence at the ED. Results: We included 47,373 ED visits with both SCr-ED and SCr-BL available. Of these, 46,100 visits had an SCr-BL from the −365/−7 days time window. Apart from the lowest value, AKI prevalence remained similar for the other definitions when varying the time window. The lowest value with the −365/−7 window resulted in the highest prevalence (21.4%). Importantly, applying the guidelines with all criteria resulted in major differences in prevalence, ranging from 5.9% to 24.0%. Conclusions: AKI prevalence varies with the use of different baseline definitions in ED patients. Clinicians, as well as researchers and developers of automatic diagnostic tools, should take these considerations into account when aiming to diagnose AKI in clinical and research settings.
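
    The compared baseline definitions can be illustrated with the sketch below. It is a simplified, hypothetical example: only the relative (≥1.5× baseline) criterion is shown, whereas the guidelines also include absolute-increase and staging criteria, and the function and column names are assumptions.

```python
from datetime import timedelta
import pandas as pd

def baseline_scr(history: pd.DataFrame, ed_time, definition="median",
                 window_days=(365, 7)):
    """Select a baseline SCr (SCr-BL) from prior measurements.

    history     : DataFrame with columns ['time', 'scr'] for one patient
    ed_time     : timestamp of the ED visit
    definition  : 'lowest', 'most_recent', 'mean' or 'median'
    window_days : look-back window, e.g. (365, 7) = -365/-7 days before the visit
    """
    start = ed_time - timedelta(days=window_days[0])
    end = ed_time - timedelta(days=window_days[1])
    prior = history[(history["time"] >= start) & (history["time"] <= end)]
    if prior.empty:
        return None
    if definition == "lowest":
        return prior["scr"].min()
    if definition == "most_recent":
        return prior.sort_values("time")["scr"].iloc[-1]
    if definition == "mean":
        return prior["scr"].mean()
    return prior["scr"].median()

def aki_at_ed(scr_ed, scr_bl):
    # Relative criterion only: first SCr at the ED (SCr-ED) >= 1.5x baseline.
    return scr_bl is not None and scr_ed >= 1.5 * scr_bl
```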

    Design, validation and implementation of an automated e-alert for acute kidney injury: 6-month pilot study shows increased awareness

    BACKGROUND: Acute kidney injury (AKI) is defined as a sudden episode of kidney failure and is known to be under-recognized by healthcare professionals. The Kidney Disease Improving Global Outcomes (KDIGO) guidelines have formulated criteria to facilitate AKI diagnosis by comparing changes in plasma creatinine (PCr) measurements. To improve AKI awareness, we implemented these criteria as an electronic alert (e-alert) in our electronic health record (EHR) system. METHODS: For every new PCr measurement in the University Medical Center Utrecht that triggered the e-alert, we provided the physician with actionable insights, in the form of a memo, to improve or stabilize kidney function. Since e-alerts qualify as software as a medical device (SaMD), we designed, implemented and validated the e-alert according to the European Union In Vitro Diagnostic Regulation (IVDR). RESULTS: We evaluated the impact of the e-alert using pilot data from six months before and after implementation. 2,053 e-alerts for 866 patients were triggered in the six months before implementation, and 1,970 e-alerts for 853 patients were triggered in the six months after implementation. We found improvements in AKI awareness as measured by (1) PCr follow-up within 2 days (from 56.6% to 65.8%, p = 0.003), and (2) discontinuation of nephrotoxic medication within 7 days of the e-alert (from 59.2% to 63.2%, p = 0.002). CONCLUSION: Here, we describe the design and implementation of the e-alert in line with the IVDR, leveraging a multidisciplinary team of physicians, clinical chemists, data managers and data scientists, and share our first results, which indicate improved awareness among treating physicians.
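
    A simplified sketch of such a trigger is shown below. The thresholds follow the published KDIGO change criteria (an absolute PCr rise of ≥26.5 µmol/L within 48 hours, or a rise to ≥1.5 times a prior value within 7 days); the hospital's actual e-alert logic, validation and memo content are not reproduced here, so treat this as an assumption-laden illustration.

```python
from datetime import timedelta

def triggers_aki_alert(new_pcr_umol_l, new_time, prior_measurements):
    """Return True if a new PCr value should raise the AKI e-alert.

    prior_measurements : list of (timestamp, pcr_umol_l) tuples for the patient
    """
    for measured_at, pcr in prior_measurements:
        within_48h = new_time - measured_at <= timedelta(hours=48)
        within_7d = new_time - measured_at <= timedelta(days=7)
        if within_48h and new_pcr_umol_l - pcr >= 26.5:   # absolute rise criterion
            return True
        if within_7d and new_pcr_umol_l >= 1.5 * pcr:     # relative rise criterion
            return True
    return False
```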

    Transportability and Implementation Challenges of Early Warning Scores for Septic Shock in the ICU: A Perspective on the TREWScore

    The increased use of electronic health records (EHRs) has improved the availability of routine care data for medical research. Combined with machine learning techniques, this has spurred the development of early warning scores (EWSs) in hospitals worldwide. EWSs are commonly used in the hospital where they have been developed, yet few have been transported to external settings and/or internationally. In this perspective, we describe our experiences in implementing the TREWScore, a septic shock EWS, and the transportability challenges regarding domain, predictors, and clinical outcome that we faced. We used data from 53,330 ICU stays from the Medical Information Mart for Intensive Care-III (MIMIC-III) and 18,013 ICU stays from the University Medical Center (UMC) Utrecht, including 17,023 (31.9%) and 2,557 (14.2%) cases of sepsis, respectively. The MIMIC-III and UMC populations differed significantly regarding length of stay (6.9 vs. 9.0 days) and hospital mortality (11.6% vs. 13.6%). We mapped all 54 TREWScore predictors to the UMC database: 31 were readily available, seven required unit conversion, 14 had to be engineered, one predictor required text mining, and one predictor could not be mapped. Lastly, we classified sepsis cases as septic shock using the sepsis-2 criteria. The septic shock populations (UMC 31.3% and MIMIC-III 23.3%) and the time to shock events showed significant differences between the two cohorts. In conclusion, we identified challenges to transportability and implementation regarding domain, predictors, and clinical outcome when transporting an EWS between hospitals across two continents. These challenges need to be systematically addressed to improve model transportability between centers and unlock the potential clinical utility of EWSs.
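
    The predictor-mapping step (renaming local variables to the TREWScore predictor set and converting units where needed) could look like the hypothetical sketch below; the variable names and the three conversions shown are illustrative examples, not the actual 54-predictor mapping used in the study.

```python
# Map local (UMC-style) variable names to TREWScore predictor names, converting
# units to those used in MIMIC-III where necessary. Entries are illustrative.
UNIT_MAPPINGS = {
    # local name          -> (TREWScore name, conversion to target unit)
    "glucose_mmol_l":        ("glucose",    lambda v: v * 18.016),    # mmol/L -> mg/dL
    "creatinine_umol_l":     ("creatinine", lambda v: v / 88.42),     # umol/L -> mg/dL
    "temperature_celsius":   ("temp_f",     lambda v: v * 9 / 5 + 32),
}

def map_predictor(local_name, value):
    """Return (TREWScore predictor name, converted value); pass through unmapped names."""
    if local_name in UNIT_MAPPINGS:
        target_name, convert = UNIT_MAPPINGS[local_name]
        return target_name, convert(value)
    return local_name, value
```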

    First-episode psychosis and migration in Italy (PEP-Ita migration): a study in the Italian mental health services

    BACKGROUND: A higher incidence of psychotic disorders in immigrants than in native populations has been frequently reported. There is, however, a lack of knowledge about the risk factors which may explain this phenomenon. A better understanding of the causes of psychosis among first-generation migrants is highly needed, particularly in Italy, a country that has recently experienced massive migration. METHODS/DESIGN: The "Italian study on first-episode psychosis and migration (PEP-Ita)" is a prospective observational study over a two-year period (1 January 2012–31 December 2013) which will be carried out in 11 Italian mental health centres. All participating centres will collect data on all new cases of migrants with first-episode psychosis. The general purpose ("core") of the PEP-Ita study is to explore the socio-demographic and clinical characteristics, and the pathways to care, of a population of first-episode psychosis migrants in Italy. Secondary aims of the study will be: 1) to understand risk and protective factors for the development of psychotic disorders in migrants; 2) to evaluate the correlations between the psychopathology of psychotic disorders in migrants and socio-demographic characteristics, migration history, and life experiences; 3) to evaluate the clinical and social outcomes of first-episode psychoses in migrants. DISCUSSION: The results of the PEP-Ita study will allow a better understanding of risk factors for psychosis in first-generation migrants in Italy. Moreover, our results will contribute to the development of prevention programmes for psychosis and to the improvement of early intervention treatments for the migrant population in Italy.

    A Systematic Review and International Web-Based Survey of Randomized Controlled Trials in the Perioperative and Critical Care Setting: Interventions Reducing Mortality

    The authors aimed to identify interventions documented by randomized controlled trials (RCTs) to reduce mortality in adult critically ill and perioperative patients, and then surveyed clinicians’ opinions and routine practices to understand their response to such evidence. The authors performed a comprehensive literature review to identify all topics reported to reduce mortality in perioperative and critical care settings according to at least two RCTs, a multicenter RCT, or a single-center RCT plus guidelines. The authors generated position statements that were voted on online by physicians worldwide for agreement, use, and willingness to include in international guidelines. From 262 RCT manuscripts reporting mortality differences in the perioperative and critically ill settings, the authors selected 27 drugs, techniques, and strategies (66 RCTs, most frequently published in the New England Journal of Medicine [13 papers], Lancet [7], and Journal of the American Medical Association [5]) with an agreement ≥67% from over 250 physicians (46 countries). Noninvasive ventilation was the intervention supported by the largest number of RCTs (n = 13). The concordance between agreement and use (a positive answer both to “do you agree” and “do you use”) showed differences between Western and other countries and between anesthesiologists and intensive care unit physicians. The authors identified 27 clinical interventions with randomized evidence of survival benefit and strong clinician support for their potential life-saving properties in perioperative and critically ill patients, with noninvasive ventilation having the highest level of support. However, clinician views appear to be affected by specialty and geographical location.