Improving adaptive seamless designs through Bayesian optimization
We propose to use Bayesian optimization (BO) to improve the efficiency of the design selection process in clinical trials. BO is a method for optimizing expensive black-box functions that uses a regression model as a surrogate to guide the search. In clinical trials, planning test procedures and sample sizes is a crucial task. A common goal is to maximize the test power, given a set of treatments, corresponding effect sizes, and a total number of samples. From a wide range of possible designs, we aim to select the best one in a short time to allow quick decisions. The standard approach of simulating the power of every single design can become too time-consuming. When the number of possible designs becomes very large, either large computational resources are required or an exhaustive exploration of all possible designs takes too long. Here, we propose to use BO to quickly find a clinical trial design with high power from a large number of candidate designs. We demonstrate the effectiveness of our approach by optimizing the power of adaptive seamless designs for different sets of treatment effect sizes. Comparing BO with an exhaustive evaluation of all candidate designs shows that BO finds competitive designs in a fraction of the time.
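The search loop described in this abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the one-dimensional design parameter, the function `toy_power` (a cheap stand-in for an expensive power simulation), the Gaussian process surrogate with fixed kernel settings, and the expected improvement criterion are all assumptions chosen for illustration.

```python
import math
import random

def rbf(a, b, length=0.2):
    """Squared-exponential kernel between two scalar design parameters."""
    return math.exp(-0.5 * ((a - b) / length) ** 2)

def cholesky(A):
    """Lower-triangular Cholesky factor of a symmetric positive-definite matrix."""
    n = len(A)
    L = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(i + 1):
            s = A[i][j] - sum(L[i][k] * L[j][k] for k in range(j))
            L[i][j] = math.sqrt(s) if i == j else s / L[j][j]
    return L

def solve_lower(L, b):
    """Forward substitution: solve L y = b."""
    y = []
    for i in range(len(b)):
        y.append((b[i] - sum(L[i][k] * y[k] for k in range(i))) / L[i][i])
    return y

def solve_upper_t(L, y):
    """Back substitution: solve L^T x = y."""
    n = len(y)
    x = [0.0] * n
    for i in reversed(range(n)):
        x[i] = (y[i] - sum(L[k][i] * x[k] for k in range(i + 1, n))) / L[i][i]
    return x

def gp_posterior(xs, ys, x_new, noise=1e-4):
    """Posterior mean and standard deviation of a GP surrogate at x_new."""
    mean_y = sum(ys) / len(ys)  # constant prior mean (assumption)
    K = [[rbf(a, b) + (noise if i == j else 0.0) for j, b in enumerate(xs)]
         for i, a in enumerate(xs)]
    L = cholesky(K)
    alpha = solve_upper_t(L, solve_lower(L, [y - mean_y for y in ys]))
    k_star = [rbf(a, x_new) for a in xs]
    mu = mean_y + sum(k * a for k, a in zip(k_star, alpha))
    v = solve_lower(L, k_star)
    var = max(rbf(x_new, x_new) - sum(t * t for t in v), 1e-12)
    return mu, math.sqrt(var)

def expected_improvement(mu, sigma, best):
    """Expected improvement over the best observed value (maximization)."""
    z = (mu - best) / sigma
    pdf = math.exp(-0.5 * z * z) / math.sqrt(2 * math.pi)
    cdf = 0.5 * (1 + math.erf(z / math.sqrt(2)))
    return (mu - best) * cdf + sigma * pdf

def bayes_opt(candidates, objective, n_init=3, n_iter=7, seed=0):
    """Evaluate only n_init + n_iter candidates instead of all of them."""
    rng = random.Random(seed)
    evaluated = {x: objective(x) for x in rng.sample(candidates, n_init)}
    for _ in range(n_iter):
        remaining = [x for x in candidates if x not in evaluated]
        if not remaining:
            break
        xs, ys = list(evaluated), list(evaluated.values())
        best = max(ys)
        # Pick the unevaluated candidate with the highest expected improvement.
        nxt = max(remaining,
                  key=lambda x: expected_improvement(*gp_posterior(xs, ys, x), best))
        evaluated[nxt] = objective(nxt)
    best_x = max(evaluated, key=evaluated.get)
    return best_x, evaluated[best_x], len(evaluated)

def toy_power(x):
    """Hypothetical stand-in for a simulated power; NOT a real trial simulation."""
    return 0.8 * math.exp(-((x - 0.6) / 0.25) ** 2)

candidates = [i / 50 for i in range(51)]  # hypothetical grid of candidate designs
best_x, best_power, n_evals = bayes_opt(candidates, toy_power)
print(best_x, round(best_power, 3), n_evals)
```

In this sketch only 10 of the 51 candidates are evaluated. In a realistic setting each evaluation would be a full power simulation, which is where the savings over exhaustive evaluation would come from.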
Combining heterogeneous subgroups with graph-structured variable selection priors for Cox regression
Important objectives in cancer research are the prediction of a patient's
risk based on molecular measurements such as gene expression data and the
identification of new prognostic biomarkers (e.g. genes). In clinical practice,
this is often challenging because patient cohorts are typically small and can
be heterogeneous. In classical subgroup analysis, a separate prediction model
is fitted using only the data of one specific cohort. However, this can lead to
a loss of power when the sample size is small. Simple pooling of all cohorts,
on the other hand, can lead to biased results, especially when the cohorts are
heterogeneous. For this situation, we propose a new Bayesian approach suitable
for continuous molecular measurements and survival outcome that identifies the
important predictors and provides a separate risk prediction model for each
cohort. It allows sharing information between cohorts to increase power by
assuming a graph linking predictors within and across different cohorts. The
graph helps to identify pathways of functionally related genes and genes that
are simultaneously prognostic in different cohorts. Results demonstrate that
our proposed approach is superior to the standard approaches in terms of
prediction performance and increased power in variable selection when the
sample size is small. Comment: under review, 19 pages, 10 figures
Extending model-based optimization with resource-aware parallelization and for dynamic optimization problems
This thesis contains two works on the topic of sequential model-based optimization (MBO).
In the first part, an extension of MBO towards resource-aware parallelization is presented, and
in the second part, MBO is adapted to dynamic optimization problems. Before the
newly developed methods are introduced, the reader is given a detailed introduction to various
aspects of MBO and related work. This covers thoughts on the choice of the initial design, the
surrogate model, the acquisition functions, and the final optimization result. As most methods
in this thesis rely on Gaussian process regression, it is covered in detail as well.
The chapter on “Parallel MBO” dives into the topic of making use of multiple workers that
can evaluate the black-box and especially focuses on the problem of heterogeneous runtimes.
Strategies that tackle this problem can be divided into synchronous and asynchronous methods.
Instead of proposing one configuration in an iterative fashion, as done by ordinary MBO,
synchronous methods usually propose as many configurations as there are workers available.
Previously proposed synchronous methods neglect the problem of heterogeneous runtimes, which
causes idling when evaluations end at different times. This work reviews current synchronous
and asynchronous methods for parallel MBO and presents the newly
proposed Resource-Aware Model-based Optimization (RAMBO) framework. It shows
that synchronous and asynchronous methods each have their advantages and disadvantages and
that RAMBO can outperform common synchronous MBO methods if the runtime is predictable,
while still obtaining comparable results in the worst case.
The chapter on “MBO with Concept Drift” (MBO-CD) explains the adaptations that have
been developed to allow optimization of black-box functions that change systematically over
time. Two approaches are described that enable MBO to handle black-box functions
where the relation between input and output changes over time, i.e., where a concept drift
occurs. The window approach trains the surrogate only on the most recent observations. The
time-as-covariate approach includes time as an additional input variable in the surrogate,
giving it the ability to learn the effect of time. For the latter, a special acquisition function,
the temporal expected improvement, is proposed.
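The difference between the two concept-drift approaches lies in how the training data for the surrogate is assembled. The following sketch shows only that data preparation step. The tuple layout `(t, x, y)` and both function names are hypothetical, and the temporal expected improvement itself is not reproduced here.

```python
from typing import List, Tuple

# One evaluation of the black box: (time t, configuration x, observed outcome y).
Observation = Tuple[float, List[float], float]

def window_training_set(history: List[Observation], window: int):
    """Window approach: keep only the `window` most recent observations."""
    if window < 1:
        raise ValueError("window must be at least 1")
    recent = sorted(history, key=lambda obs: obs[0])[-window:]
    inputs = [x for _, x, _ in recent]
    outputs = [y for _, _, y in recent]
    return inputs, outputs

def time_as_covariate_training_set(history: List[Observation]):
    """Time-as-covariate approach: append t to every input vector."""
    inputs = [x + [t] for t, x, _ in history]
    outputs = [y for _, _, y in history]
    return inputs, outputs

history = [(0.0, [1.0], 5.0), (1.0, [2.0], 4.0), (2.0, [3.0], 3.0)]
print(window_training_set(history, window=2))
print(time_as_covariate_training_set(history))
```

The window approach discards older observations and keeps the input dimension unchanged, whereas the time-as-covariate approach keeps all observations and adds one input dimension to the surrogate.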
Weighted Cox regression for the prediction of heterogeneous patient subgroups
An important task in clinical medicine is the construction of risk prediction
models for specific subgroups of patients based on high-dimensional molecular
measurements such as gene expression data. Major objectives in modeling
high-dimensional data are good prediction performance and feature selection to
find a subset of predictors that are truly associated with a clinical outcome
such as a time-to-event endpoint. In clinical practice, this task is
challenging since patient cohorts are typically small and can be heterogeneous
with regard to their relationship between predictors and outcome. When data of
several subgroups of patients with the same or similar disease are available,
it is tempting to combine them to increase sample size, such as in multicenter
studies. However, heterogeneity between subgroups can lead to biased results
and subgroup-specific effects may remain undetected. For this situation, we
propose a penalized Cox regression model with a weighted version of the Cox
partial likelihood that includes patients of all subgroups but assigns them
individual weights based on their subgroup affiliation. Patients who are likely
to belong to the subgroup of interest obtain higher weights in the
subgroup-specific model. Our proposed approach is evaluated through simulations
and application to real lung cancer cohorts. Simulation results demonstrate
that our model can achieve improved prediction and variable selection accuracy
over standard approaches. Comment: under review, 15 pages, 6 figures
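To make the idea of a weighted Cox partial likelihood concrete, the sketch below evaluates one commonly used weighted form, in which each patient's weight scales both the event contribution and the patient's share of the risk set. Whether this matches the exact formulation of the paper is an assumption. The function name and arguments are hypothetical, ties are handled Breslow-style, and the penalty term of the penalized model is omitted.

```python
import math

def weighted_cox_partial_loglik(beta, X, time, event, weights):
    """Weighted Cox partial log-likelihood (one common form, no penalty).

    beta    : regression coefficients
    X       : covariate rows, one per patient
    time    : observed times
    event   : 1 if the event was observed, 0 if censored
    weights : non-negative patient weights, e.g. derived from subgroup affiliation
    """
    lin = [sum(b * v for b, v in zip(beta, row)) for row in X]
    loglik = 0.0
    for i in range(len(time)):
        # Censored or zero-weight patients contribute no event term.
        if not event[i] or weights[i] == 0.0:
            continue
        # Weighted risk set: everyone still under observation at time[i].
        risk = sum(weights[j] * math.exp(lin[j])
                   for j in range(len(time)) if time[j] >= time[i])
        loglik += weights[i] * (lin[i] - math.log(risk))
    return loglik

X = [[0.5], [-1.0], [0.2]]
time = [1.0, 2.0, 3.0]
event = [1, 1, 0]
print(weighted_cox_partial_loglik([0.0], X, time, event, [1.0, 1.0, 1.0]))
print(weighted_cox_partial_loglik([0.0], X, time, event, [1.0, 0.0, 1.0]))
```

With all weights equal to 1 this reduces to the ordinary partial log-likelihood. Setting a patient's weight to 0 removes that patient from both the event terms and the risk sets.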
Model-Based Optimization of Subgroup Weights for Survival Analysis
Obtaining a reliable prediction model for a specific cancer subgroup or cohort is often difficult due to the limited number of samples and, in survival analysis, even more so due to potentially high censoring rates. Sometimes similar datasets are available for other patient subgroups with the same or a similar disease and treatment, e.g., from other clinical centers. Simple pooling of all subgroups can decrease the variance of the estimated parameters of the prediction models, but can also increase the bias due to potentially high heterogeneity between the cohorts.
A promising compromise is to identify which subgroups are similar enough to the specific subgroup of interest and then include only these for model building.
Similarity here refers to the relationship between input and output in the prediction model, and not necessarily to the distributions of the input and output variables themselves.
Here, we propose a subgroup-based weighted likelihood approach and evaluate it on a set of lung cancer cohorts. When a prediction model for a specific subgroup is of interest, an individual weight for every other subgroup determines the strength with which its observations enter into the likelihood-based optimization of the model parameters. A weight close to 0 indicates that a subgroup should be discarded, and a weight close to 1 indicates that the subgroup fully enters into the model building process.
MBO (model-based optimization) can be used to quickly find a good prediction model in the presence of a large number of hyperparameters to be tuned. Here, we use MBO to identify the best model for survival prediction in lung cancer subgroups, where, in addition to the parameters of a Cox model, the individual values of the subgroup weights are optimized. Interestingly, the models with the highest prediction quality are often obtained for a mixed weight structure, i.e., a combination of weights close to 0, weights close to 1, and medium weights is optimal, reflecting the similarity of the corresponding cancer subgroups.