Identifying the DEAD: Development and Validation of a Patient-Level Model to Predict Death Status in Population-Level Claims Data
Authors
J.M. Reps
P.R. (Peter) Rijnbeek
P.B. (Patrick) Ryan
Publication date
1 January 2019
Publisher
Springer Science and Business Media LLC
Abstract
Introduction: US claims data contain medical data on large heterogeneous populations and are excellent sources for medical research. Some claims data do not contain complete death records, limiting their use for mortality or mortality-related studies. A model to predict whether a patient died at the end of the follow-up time (referred to as the end of observation) is needed to enable mortality-related studies.
Objective: The objective of this study was to develop a patient-level model to predict whether the end of observation was due to death in US claims data.
Methods: We used a claims dataset with full death records, Optum© De-Identified Clinformatics® Data-Mart-Database—Date of Death mapped to the Observational Medical Outcome Partnership common data model, to develop a model that classifies the end of observations into death or non-death. A regularized logistic regression was trained using 88,514 predictors (recorded within the prior 365 or 30 days) and externally validated by applying the model to three US claims datasets.
Results: Approximately 25 in 1000 end of observations in Optum are due to death. The Discriminating End of observation into Alive and Dead (DEAD) model obtained an area under the receiver operating characteristic curve of 0.986. When defining death as a predicted risk of > 0.5, only 2% of the end of observations were predicted to be due to death and the model obtained a sensitivity of 62% and a positive predictive value of 74.8%. The external validation showed the model was transportable, with area under the receiver operating characteristic curves ranging between 0.951 and 0.995 across the US claims databases.
Conclusions: US claims data often lack complete death records. The DEAD model can be used to impute death at various sensitivity, specificity, or positive predictive values depending on the use of the model.
The DEAD model can be readily applied to any observational healthcare database mapped to the Observational Medical Outcome Partnership common data model and is available from https://github.com/OHDSI/StudyProtocolSandbox/tree/master/DeadModel
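The threshold figures reported in the Results can be cross-checked against one another with a little arithmetic. The sketch below is not taken from the paper or its repository. It assumes that the "approximately 25 in 1000" death rate is the prevalence in the data on which the sensitivity and positive predictive value were measured, and the function name `implied_flag_rate` is hypothetical. Because sensitivity = TP / deaths and positive predictive value = TP / flagged, the share of records flagged as death should be prevalence × sensitivity / PPV.

```python
def implied_flag_rate(prevalence: float, sensitivity: float, ppv: float) -> float:
    """Share of all end-of-observation records predicted as death.

    Hypothetical helper for a consistency check; not part of the DEAD model code.
    """
    # True positives as a share of all records: deaths that the model catches.
    true_positive_share = prevalence * sensitivity
    # PPV = TP / flagged, so flagged = TP / PPV.
    return true_positive_share / ppv


# Figures quoted in the abstract (risk threshold of 0.5).
prevalence = 25 / 1000   # assumed to apply to the evaluation data
sensitivity = 0.62
ppv = 0.748

flagged = implied_flag_rate(prevalence, sensitivity, ppv)
print(f"{flagged:.4f}")  # → 0.0207
```

Under these assumptions the implied share is about 2.1%, which agrees with the statement that only about 2% of end of observations were predicted to be due to death.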
Available Versions
EUR Research Repository
oai:pure.eur.nl:openaire/98a95...
Last time updated on 16/10/2025
Erasmus University Digital Repository
oai:repub.eur.nl:122306
Last time updated on 25/12/2019
EUR Research Repository
oai:pure.eur.nl:publications/9...
Last time updated on 29/05/2023
EUR Research Repository
oai:pure.eur.nl:openaire_cris_...
Last time updated on 29/05/2023