9 research outputs found

    Evaluation of SOFA-based models for predicting mortality in the ICU: A systematic review

    Introduction: To systematically review studies evaluating the performance of Sequential Organ Failure Assessment (SOFA)-based models for predicting mortality in patients in the intensive care unit (ICU). Methods: Medline, EMBASE and other databases were searched for English-language articles with the major objective of evaluating the prognostic performance of SOFA-based models in predicting mortality in surgical and/or medical ICU admissions. The quality of each study was assessed based on a quality framework for prognostic models. Results: Eighteen articles met all inclusion criteria. The studies differed widely in the SOFA derivatives used and in their methods of evaluation. Ten studies reported developing a probabilistic prognostic model, only five of which used an independent validation data set. The other studies used the SOFA-based score directly to discriminate between survivors and non-survivors without fitting a probabilistic model. In five of the six studies, admission-based models (Acute Physiology and Chronic Health Evaluation (APACHE) II/III) were reported to have a slightly better discrimination ability than SOFA-based models at admission (the area under the receiver operating characteristic curve (AUC) of SOFA-based models ranged between 0.61 and 0.88), and in one study a SOFA model had a higher AUC than the Simplified Acute Physiology Score (SAPS) II model. Four of these studies used Hosmer-Lemeshow tests for calibration, none of which reported a lack of fit for the SOFA models. Models based on sequential SOFA scores were described in 11 studies, including maximum SOFA scores and maximum sum of individual components of the SOFA score (AUC range: 0.69 to 0.92) and delta SOFA (AUC range: 0.51 to 0.83). Studies comparing SOFA with other organ failure scores did not consistently show superiority of one scoring system over another.
Four studies combined SOFA-based derivatives with admission severity of illness scores, and they all reported improved predictions for the combination. Quality of studies ranged from 11.5 to 19.5 points on a 20-point scale. Conclusions: Models based on SOFA scores at admission had only slightly worse performance than APACHE II/III and were competitive with SAPS II models in predicting mortality in patients in the general medical and/or surgical ICU. Models with sequential SOFA scores seem to have a comparable performance with other organ failure scores. The combination of sequential SOFA derivatives with APACHE II/III and SAPS II models clearly improved prognostic performance over either model alone. Due to the heterogeneity of the studies, it is impossible to draw general conclusions on the optimal mathematical model and optimal derivatives of SOFA scores. Future studies should use a standard evaluation methodology with a standard set of outcome measures covering discrimination, calibration and accuracy.

    Effect of changes over time in the performance of a customized SAPS-II model on the quality of care assessment

    Purpose: The aim of our study was to explore, using an innovative method, the effect of temporal changes in the mortality prediction performance of an existing model on the quality of care assessment. The prognostic model (rSAPS-II) was a recalibrated Simplified Acute Physiology Score-II model developed for very elderly Intensive Care Unit (ICU) patients. Methods: The study population comprised all 12,143 consecutive patients aged 80 years and older admitted between January 2004 and July 2009 to one of the ICUs of 21 Dutch hospitals. The prospective dataset was split into 30 equally sized consecutive subsets. Per subset, we measured the model's discrimination [area under the curve (AUC)], accuracy (Brier score), and standardized mortality ratio (SMR), both without and after repeated recalibration. All performance measures were considered to be stable if 1 without and after repeated recalibration for the year 2009. Results: For all subsets, the AUCs were stable, but the Brier scores and SMRs were not. The SMR was downtrending, achieving levels significantly below 1. Repeated recalibration rendered it stable again. The proportions of hospitals with SMR>1 and SMR<1 changed from 15 versus 85% to 35 versus 65%. Conclusions: Variability over time may markedly vary among different performance measures, and infrequent model recalibration can result in improper assessment of the quality of care in many hospitals. We stress the importance of the timely recalibration and repeated validation of prognostic models over time.
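The three per-subset measures named in this abstract (AUC, Brier score, SMR) can be illustrated with a minimal sketch. It assumes binary outcomes `y` (1 = died) and predicted mortality probabilities `p`; the function names and the helper for cutting a dataset into consecutive subsets are hypothetical and are not taken from the study.

```python
from typing import List, Sequence, Tuple


def auc(y: Sequence[int], p: Sequence[float]) -> float:
    """Area under the ROC curve via pairwise comparison (ties count 0.5)."""
    pos = [pi for yi, pi in zip(y, p) if yi == 1]
    neg = [pi for yi, pi in zip(y, p) if yi == 0]
    if not pos or not neg:
        raise ValueError("AUC needs at least one death and one survivor")
    wins = sum(1.0 if a > b else 0.5 if a == b else 0.0
               for a in pos for b in neg)
    return wins / (len(pos) * len(neg))


def brier(y: Sequence[int], p: Sequence[float]) -> float:
    """Mean squared difference between prediction and outcome."""
    return sum((pi - yi) ** 2 for yi, pi in zip(y, p)) / len(y)


def smr(y: Sequence[int], p: Sequence[float]) -> float:
    """Standardized mortality ratio: observed deaths / expected deaths."""
    return sum(y) / sum(p)


def consecutive_subsets(n_items: int, k: int) -> List[Tuple[int, int]]:
    """(start, stop) index pairs for k nearly equal consecutive subsets."""
    base, extra = divmod(n_items, k)
    bounds, start = [], 0
    for i in range(k):
        stop = start + base + (1 if i < extra else 0)
        bounds.append((start, stop))
        start = stop
    return bounds
```

With admissions sorted by date, each `(start, stop)` pair from `consecutive_subsets(len(y), 30)` would select one subset on which the three measures can be computed and then tracked over time.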

    Assessing and combining repeated prognosis of physicians and temporal models in the intensive care

    Recently, we devised a method to develop prognostic models incorporating patterns of sequential organ failure to predict the eventual hospital mortality at each day of intensive care unit (ICU) stay. In this study, we investigate, in a real-world setting, how these models perform compared to physicians, who are exposed to more information than the models. We developed prognostic models for days 2-7 of ICU stay by data-driven discovery of patterns of sequential qualitative organ failure (SOFA) scores and embedding the patterns as binary variables in three types of logistic regression models. Type A models include the severity of illness score at admission (SAPS-II) and the SOFA patterns. Type B models add to these covariates the mean, max and delta (increments) of SOFA scores. Type C models include, in addition, the mean, max and delta in expert opinion (i.e. the physicians' prediction of mortality). Physicians had a statistically significantly better discriminative ability compared to the models without subjective information (AUC range over days: 0.78-0.79 vs. 0.71-0.74) and comparable accuracy (Brier score range: 0.15-0.18 vs. 0.16-0.18). However, when we combined both sources of predictions in Type C models, we arrived at significantly better discrimination and accuracy than the objective and subjective models alone (AUC range: 0.80-0.83; Brier score range: 0.13-0.16). The models and the physicians draw on complementary information that can be best harnessed by combining both prediction sources. Extensive external validation and impact studies are imperative to further investigate the ability of the combined mode

    Consistency of nurses' daily predictions of survival in the intensive care

    In the Intensive Care Unit, clinicians are continuously faced with the difficult task of prognosis, but their predictions of patient survival status may not always be consistent. Specifically, very little is known about the consistency of predictions over time. The aim of this paper is to assess the consistency of nurses' daily predictions of survival in terms of inter-observer variance and variance of observers over time. We found a low consistency of these predictions between observers and over time, even when changes in the patients' condition are considered. Our findings have implications for the process of end-of-life decision-making, which pertains to withholding or withdrawing intensive care treatment.

    Characterizing Predictive Models of Mortality for Older Adults and Their Validation for Use in Clinical Practice

    OBJECTIVES: To systematically identify and characterize prognostic models of mortality for older adults, their reported potential use, and the actual level of their (external) validity. DESIGN: The Scopus database until January 2010 was searched for articles that developed and validated new models or validated existing prognostic models of mortality or survival in older adults. SETTING: All domains of health care. PARTICIPANTS: Adults aged 50 and older. MEASUREMENTS: Study and model characteristics were summarized, including the model's development method and degree of validation, data types used, and outcomes. RESULTS: One hundred three articles describing 193 models in 10 domains and mostly originating from the United States were included. These domains were mostly secondary or tertiary care settings (54%) such as intensive care (7%) or geriatric units (8%). Half of the studies (50%) were not disease specific. Heart failure-related diseases (9%) and pneumonia (9%) constituted the major disease-specific subgroups. Most studies (67%) reported support of clinical individual (treatment) decisions as use of prognostic models, but only 34% were externally validated, and only four models (2%) were validated in more than two studies. Most studies (68%) developed at least one new model, but they did not often go beyond addressing their apparent validation (49%). CONCLUSION: Although prognostic models are regularly developed to support clinical individual decisions and could be useful for this purpose, their use is premature. Because clinical credibility and evidence of external validity build trust in prognostic models, both require much more consideration to enhance model acceptance in the future. J Am Geriatr Soc 59:1110-1115, 201

    Prognostic models for predicting mortality in elderly ICU patients: a systematic review

    To systematically review prognostic research literature on development and/or validation of mortality predictive models in elderly patients. We searched the Scopus database until June 2010 for articles aimed at validating prognostic models for survival or mortality in elderly intensive care unit (ICU) patients. We assessed the models' fitness for their intended purpose on the basis of barriers for use reported in the literature, using the following categories: (1) clinical credibility, (2) methodological quality (based on an existing quality assessment framework), (3) external validity, (4) model performance, and (5) clinical effectiveness. Seven studies were identified which met our inclusion criteria, one of which was an external validation study. In total, 17 models were found, of which six were developed for the general adult ICU population and eleven specifically for elderly patients. Cohorts ranged from 148 to 12,993 patients and only smaller ones were obtained prospectively. The area under the receiver operating characteristic curve (AUC) was most commonly used to measure performance (range 0.71-0.88). The median number of criteria met for clinical credibility was 4.5 out of 7 (range 2.5-5.5) and 17 out of 20 for methodological quality (range 15-20). Although the models scored relatively well on methodological quality, none of them can currently be considered sufficiently credible or valid to be applicable in clinical practice for elderly patients. Future research should focus on external validation, addressing performance measures relevant for their intended use, and on clinical credibility including the incorporation of factors specific for the elderly population.

    Statistical process control for validating a classification tree model for predicting mortality – A novel approach towards temporal validation

    Prediction models are postulated as useful tools to support tasks such as clinical decision making and benchmarking. In particular, classification tree models have enjoyed much interest in the Biomedical Informatics literature. However, their prospective predictive performance over the course of time has not been investigated. In this paper we suggest and apply statistical process control methods to monitor, over more than 5 years, the prospective predictive performance of TM80+, one of the few classification-tree models published in the clinical literature. TM80+ is a model for predicting mortality among very elderly patients in the intensive care based on a multi-center dataset. We also inspect the predictive performance at the tree's leaves. This study provides important insights into patterns of (in)stability of the tree's performance and its "shelf life". The study underlines the importance of continuous validation of prognostic models over time using statistical tools and the timely recalibration of tree models. (C) 2011 Elsevier Inc. All rights reserved.
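As a rough illustration of monitoring a performance measure with statistical process control, the sketch below derives control limits from a baseline period and flags later values that fall outside them. The abstract does not say which chart type or limits were used, so the Shewhart-style mean ± k·SD limits, the default `k=3.0`, and both function names are assumptions made for illustration only.

```python
from statistics import mean, stdev
from typing import List, Sequence, Tuple


def control_limits(baseline: Sequence[float], k: float = 3.0) -> Tuple[float, float]:
    """Lower and upper limits as baseline mean -/+ k sample standard deviations."""
    m = mean(baseline)
    s = stdev(baseline)
    return m - k * s, m + k * s


def out_of_control(values: Sequence[float], limits: Tuple[float, float]) -> List[int]:
    """Indices of monitored values that fall outside the control limits."""
    lo, hi = limits
    return [i for i, v in enumerate(values) if v < lo or v > hi]
```

Here `baseline` would hold a performance measure (for example a Brier score per time window) from a period regarded as stable, and `values` the same measure for subsequent windows.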