How to Simulate Realistic Survival Data? A Simulation Study to Compare Realistic Simulation Models
In statistics, it is important to have realistic data sets available for a
particular context to allow an appropriate and objective method comparison. For
many use cases, benchmark data sets for method comparison are already available
online. However, in most medical applications and especially for clinical
trials in oncology, there is a lack of adequate benchmark data sets, as patient
data can be sensitive and therefore cannot be published. Simulation studies are
a potential solution. However, it is not always clear which
simulation models are suitable for generating realistic data. A challenge is
that potentially unrealistic assumptions have to be made about the
distributions. Our approach is to use reconstructed benchmark data sets as a
basis for the simulations, which has the following advantages: the
actual properties are known and more realistic data can be simulated. There are
several possibilities to simulate realistic data from benchmark data sets. We
investigate simulation models based upon kernel density estimation, fitted
distributions, case resampling and conditional bootstrapping. In order to make
recommendations on which models are best suited for a specific survival
setting, we conducted a comparative simulation study. Since it is not possible
to provide recommendations for all possible survival settings in a single
paper, we focus on providing realistic simulation models for two-armed phase
III lung cancer studies. To this end, we reconstructed benchmark data sets from
recent studies. We used runtime and several accuracy measures (effect
sizes and p-values) as criteria for comparison.
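
To make two of the simulation models named above more concrete, the following is a minimal sketch of case resampling and of simulation from a fitted distribution for a single trial arm. It is not the authors' implementation. The toy data, the function names, the choice of an exponential distribution, and the administrative censoring at a fixed `censor_time` are all assumptions made for illustration.

```python
import random

def case_resample(arm, n, rng):
    """Case resampling: draw n (time, event) pairs with replacement from one arm."""
    return [rng.choice(arm) for _ in range(n)]

def fit_exponential_rate(arm):
    """Exponential hazard MLE under right censoring: events / total follow-up time."""
    events = sum(event for _, event in arm)
    total_time = sum(time for time, _ in arm)
    return events / total_time

def simulate_exponential(arm, n, rng, censor_time):
    """Fitted distribution: simulate n patients from the fitted exponential.

    Assumption: every patient is administratively censored at censor_time.
    """
    rate = fit_exponential_rate(arm)
    simulated = []
    for _ in range(n):
        t = rng.expovariate(rate)
        # Observed time is capped at censor_time; event = 1 only if it occurred before.
        simulated.append((min(t, censor_time), int(t <= censor_time)))
    return simulated

# Hypothetical toy arm: (time in months, event indicator; 1 = event, 0 = censored).
control = [(2.1, 1), (3.4, 1), (5.0, 0), (6.2, 1), (8.9, 1), (12.0, 0)]

rng = random.Random(2024)  # fixed seed so the sketch is reproducible
boot_control = case_resample(control, len(control), rng)
sim_control = simulate_exponential(control, 100, rng, censor_time=12.0)
```

In a two-armed setting, the same functions would be applied to each arm separately, so that the simulated data preserve the arm-specific survival behaviour of the reconstructed benchmark data.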